Paginated github search response traversal does not work #5
Comments
Hi @theroys, can you review and double-check the approach below? If it is OK, I can make the query configurable by taking user input, and then send a pull request.
|
Hi Long, yes, that is the correct approach for this specific fix. This code also calls the classifier, which in turn calls queryopenhub; there is an API key from openhub with a limit of 1000 calls per day. Thanks for working on this :) |
Hi @longhn, I get a 403 while fetching your fix from the branch. |
Hi @theroys
The error I'm facing is located in another class. I think it should be a separate issue.
search for git-extras returned 3results
classifying iosched
Traceback (most recent call last):
File "/home/longhn/Project/OSSRank/gitdatacollection/SearchGitRepositories.py", line 78, in <module>
File "/home/longhn/Project/OSSRank/gitdatacollection/SearchGitRepositories.py", line 70, in main
File "/home/longhn/Project/OSSRank/gitdatacollection/SearchGitRepositories.py", line 26, in get_projects_metadata
File "/home/longhn/Project/OSSRank/gitdatacollection/ProjectClassifier.py", line 160, in classify_project
current_desc_words=get_desc_words(project_description)
File "/home/longhn/Project/OSSRank/gitdatacollection/ProjectClassifier.py", line 57, in get_desc_words
desc_words=set(wordpunct_tokenize(software_desc.replace('\n', '').lower()))
AttributeError: 'NoneType' object has no attribute 'replace'
>>> |
Hi @longhn, yes, this is an error in projectclassifier.py; I will fix it. It happens when the description is not present and Python returns a NoneType object. I need to change the logic there so that it does not categorize by the GitHub description when none is present. |
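The failure mode described here (a repository without a description yields `None`, which has no `.replace`) can be guarded against at the top of the tokenizing function. What follows is only a minimal sketch, not the project's actual fix. It assumes the function is meant to return a set of words and that `stopwords` are removed from that set, neither of which the thread confirms. It uses a regular expression as a stand-in for NLTK's `wordpunct_tokenize` so that it runs without NLTK, and it replaces newlines with a space rather than an empty string so adjacent words are not joined.

```python
import re

# Stand-in for nltk.tokenize.wordpunct_tokenize (an assumption made so that
# this sketch runs without NLTK installed).
_TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def get_desc_words(software_desc, stopwords=None):
    """Return the set of lowercase tokens in a description, minus stopwords."""
    if software_desc is None:
        # No description on the repository: return an empty set so callers
        # that expect a set of words keep working.
        return set()
    # Assumption: stopwords are meant to be filtered out of the result.
    stopwords = set(stopwords or ())
    tokens = _TOKEN_RE.findall(software_desc.replace("\n", " ").lower())
    return set(tokens) - stopwords
```

Returning an empty set keeps the return type consistent, so a later set intersection with category keywords simply finds no match. Using `stopwords=None` instead of a list default avoids sharing one mutable default between calls.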
@theroys, @fsiddiqi In the ProjectClassifier.get_desc_words() function, I added the check below and it seems to work:
def get_desc_words(software_desc, stopwords=[]):
    if (software_desc is None):
        return 'undefined description'
    ....
I'm now able to reach the daily access limit of openhub:
classifying codebox
best category match as per git desc ->Application Development -IDE
openhub api returned an error while searching for project codebox<error>This api_key has exceeded its daily access limit.</error>
file name is: github_search_dump_1200.txt
current_fetch_url is: https://api.github.com/search/repositories?q=stars:>100&per_page=100&page=11
Traceback (most recent call last):
File "/home/longhn/Project/worksapce/SearchGitRepos/src/gitdatacollection/SearchGitRepositories.py", line 78, in <module>
main()
File "/home/longhn/Project/worksapce/SearchGitRepos/src/gitdatacollection/SearchGitRepositories.py", line 70, in main
filtered_proj_data = get_projects_metadata(jsonRespData)
File "/home/longhn/Project/worksapce/SearchGitRepos/src/gitdatacollection/SearchGitRepositories.py", line 22, in get_projects_metadata
for item in jsonContent['items']:
KeyError: 'items' |
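The `KeyError: 'items'` in this traceback means the parsed response had no `items` key. One plausible cause, stated here as an assumption rather than a diagnosis, is that the logged URL asks for `page=11` at `per_page=100`, that is, results 1001 to 1100, while GitHub's search API only serves the first 1000 results and answers with an error body that carries a `message` field instead. A minimal sketch of a defensive lookup follows; `extract_items` is a hypothetical helper name, not something in the repository.

```python
def extract_items(json_content):
    """Return the repository list from a search response, or [] for an error body."""
    items = json_content.get("items")
    if items is None:
        # Error payloads carry a "message" field instead of "items".
        reason = json_content.get("message", "unknown error")
        print("search response has no items:", reason)
        return []
    return items
```

With this in place, the loop in `get_projects_metadata` could iterate over `extract_items(jsonContent)` and end cleanly when the API stops returning results.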
In SearchGitRepositories.py, paginated search result traversal does not work. Only the first page is traversed for projects. Although this has been committed as such, I am creating this as a first bug to fix.
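For illustration, page traversal can be sketched as a generator that keeps requesting the next page until a page comes back empty. This is not the code from the repository or from the proposed fix. `iter_search_results` and the `fetch_json` callable (takes a URL, returns the parsed JSON body) are hypothetical names, and the `max_results=1000` cap reflects the assumption that the search API serves at most the first 1000 results.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/repositories"


def iter_search_results(fetch_json, query="stars:>100", per_page=100, max_results=1000):
    """Yield repository items page by page until the results run out.

    fetch_json is a hypothetical callable: it takes a URL and returns the
    parsed JSON body as a dict.
    """
    yielded = 0
    page = 1
    while yielded < max_results:
        params = urlencode({"q": query, "per_page": per_page, "page": page})
        body = fetch_json(f"{SEARCH_URL}?{params}")
        # An error body has no "items" key; treat it like an empty page.
        items = body.get("items") or []
        if not items:
            break
        # Never yield more than max_results items in total.
        for item in items[: max_results - yielded]:
            yield item
            yielded += 1
        page += 1
```

Passing the fetcher in as an argument keeps the traversal logic testable without network access; in the real script it could wrap whatever HTTP call is already used to load `current_fetch_url`.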