Distributed dataloader #14
This looks nice and straightforward to me. Could we add some tests here to verify that we're returning the right things? One easy way is to print out the current line being read by each process and make sure there is no overlap.
See the newly added test, which verifies that there is no mixing of yielded lines. Additionally, there is a check that no lines are missed or duplicated, which revealed that one line was actually being skipped between the processing of different files.
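For readers following along, here is a rough sketch of what such a check could look like. It is not the test from this PR: `iter_lines_for_rank`, `write_files`, and `check_partition` are hypothetical names, the round-robin assignment of lines to ranks is an assumption, and the ranks are simulated sequentially in one process rather than launched as real distributed workers. Note that the line counter is kept across file boundaries, which is the kind of place where a skipped line between files can creep in.

```python
import os
import tempfile
from typing import Iterator, List


def iter_lines_for_rank(paths: List[str], rank: int, world_size: int) -> Iterator[str]:
    """Hypothetical loader: yield the lines assigned to `rank` (round-robin)."""
    global_index = 0  # keeps counting across file boundaries
    for path in paths:
        with open(path, "r", encoding="utf-8") as handle:
            for line in handle:
                if global_index % world_size == rank:
                    yield line.rstrip("\n")
                global_index += 1


def write_files(directory: str, line_counts: List[int]) -> List[str]:
    """Hypothetical helper: write files whose lines are all unique."""
    paths = []
    for file_id, count in enumerate(line_counts):
        path = os.path.join(directory, f"part{file_id}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            for line_id in range(count):
                handle.write(f"file{file_id}-line{line_id}\n")
        paths.append(path)
    return paths


def check_partition(paths: List[str], world_size: int) -> List[List[str]]:
    """Assert that the ranks together see every line exactly once."""
    expected = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as handle:
            expected.extend(line.rstrip("\n") for line in handle)

    # Simulate each rank in turn (assumption: no real process group here).
    per_rank = [
        list(iter_lines_for_rank(paths, rank, world_size))
        for rank in range(world_size)
    ]
    seen = [line for lines in per_rank for line in lines]

    # No overlap between ranks (relies on the input lines being unique).
    assert len(seen) == len(set(seen)), "a line was yielded more than once"
    # Nothing missed or invented, including at file boundaries.
    assert sorted(seen) == sorted(expected), "lines were skipped or duplicated"
    return per_rank


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = write_files(tmp, [5, 7])
        for rank, lines in enumerate(check_partition(files, world_size=3)):
            print(rank, lines)
```

Using files of different lengths whose sizes are not multiples of the world size makes boundary bugs more likely to surface in a check like this.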
@ArmenAg Confirming that we're good to merge this now?
LGTM! I would clean up the comments that are sprinkled around the code here.
Implement line-by-line yielding distributed dataloader