scraper takes too long to run #6

Open
mrcnc opened this issue Sep 21, 2017 · 2 comments

Comments

@mrcnc
Member

mrcnc commented Sep 21, 2017

Currently, on my machine with a 2.2 GHz i7 processor, the scraper crawls ~600 pages/minute. With ~165,000 parcels, a full run takes ~4.6 hours (165,000 / 600 ≈ 275 minutes). That is too long, and we should find a way to speed it up.
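For reference, the runtime estimate can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the rough figures quoted above; the function and constant names are made up for illustration and are not part of the scraper.

```python
# Rough figures from this issue; both are approximate.
PAGES_PER_MINUTE = 600
TOTAL_PARCELS = 165_000


def estimated_hours(total_pages: int, pages_per_minute: float) -> float:
    """Return the expected crawl time in hours at a constant crawl rate."""
    minutes = total_pages / pages_per_minute
    return minutes / 60


def required_pages_per_minute(total_pages: int, target_hours: float) -> float:
    """Return the crawl rate needed to finish within target_hours."""
    return total_pages / (target_hours * 60)


hours = estimated_hours(TOTAL_PARCELS, PAGES_PER_MINUTE)
rate_for_one_hour = required_pages_per_minute(TOTAL_PARCELS, 1.0)

print(f"Estimated runtime: {hours:.2f} hours")
print(f"Rate needed to finish in 1 hour: {rate_for_one_hour:.0f} pages/minute")
```

At the current rate this works out to about 275 minutes, a little over four and a half hours. The second helper only shows how far the rate would have to rise for a given target; the one-hour target is an arbitrary example, not a goal anyone has agreed on.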

@bhelx
Member

bhelx commented Sep 21, 2017

Your mileage may vary, but 8 concurrent requests worked well for me, and their server showed no noticeable dip in latency.

@mrcnc
Member Author

mrcnc commented Sep 21, 2017

This will most likely require some profiling, because the default number of concurrent requests per domain is already 8.
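One quick sanity check before profiling, sketched below: if 8 requests really are in flight at all times and the crawl is limited only by the network, the observed rate implies an average time per request (concurrency divided by throughput). This assumes steady state and that concurrency is the only bottleneck, which is exactly what profiling would need to confirm; the function name is hypothetical.

```python
def implied_seconds_per_request(concurrency: int, pages_per_minute: float) -> float:
    """Average time each request would take if `concurrency` requests
    were always in flight and nothing else limited throughput."""
    pages_per_second = pages_per_minute / 60
    return concurrency / pages_per_second


# Figures from this thread: 8 concurrent requests, ~600 pages/minute.
latency = implied_seconds_per_request(8, 600)
print(f"Implied time per request: {latency:.2f} s")
```

If the measured response time from their server is well below this implied figure, the time is probably going somewhere other than waiting on the network (parsing or item processing, for example), and raising concurrency alone may not help much.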
