Commit

Merge pull request #27 from YerevaNN/remove_dload_dup
Remove dataloader duplication via yield on each process
MenuaB authored May 14, 2024
2 parents 254e2b7 + ea2c700 commit 7be6d91
Showing 2 changed files with 7 additions and 7 deletions.
12 changes: 6 additions & 6 deletions chemlactica/jsonl_dataset.py
@@ -50,12 +50,12 @@ def samples_generator(
counter = 0
while line:
state["position"] = f.tell()
-# if should_yield_on_current_rank(
-# counter,
-# distributed_state.num_processes,
-# distributed_state.process_index,
-# ):
-# returned = True
+if should_yield_on_current_rank(
+counter,
+distributed_state.num_processes,
+distributed_state.process_index,
+):
+returned = True
ret = format_sample(line)
yield ret
counter = counter + 1
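
The re-enabled check is what keeps each process from yielding every line of the file. A minimal sketch of how such a per-rank filter could work is shown here, assuming a round-robin rule (`counter % num_processes == process_index`). The body of `should_yield_on_current_rank` is not part of this diff, so that rule is an assumption, and `shard_lines` is a hypothetical stand-in for the loop in `samples_generator`. The sketch also assumes the `yield` sits inside the `if`, which this view cannot confirm because indentation is not preserved.

```python
from typing import Iterable, Iterator


def should_yield_on_current_rank(
    counter: int, num_processes: int, process_index: int
) -> bool:
    # Assumed round-robin rule; the repository's actual helper may differ.
    return counter % num_processes == process_index


def shard_lines(
    lines: Iterable[str], num_processes: int, process_index: int
) -> Iterator[str]:
    # Hypothetical stand-in for the read loop: every process walks all
    # lines and advances the counter, but yields only its own share.
    counter = 0
    for line in lines:
        if should_yield_on_current_rank(counter, num_processes, process_index):
            yield line
        counter = counter + 1


# Two simulated processes reading the same six lines.
lines = [f"sample-{i}" for i in range(6)]
rank0 = list(shard_lines(lines, num_processes=2, process_index=0))
rank1 = list(shard_lines(lines, num_processes=2, process_index=1))
```

Under this assumed rule the two ranks receive disjoint subsets whose union is the full file, so no sample is loaded twice across processes.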
2 changes: 1 addition & 1 deletion test_status.yaml
@@ -1 +1 @@
-16b42bfd74750347376025e520905710f7b791d8: PASS
+33f0567f336bd041b6b687cb3258a855d948b6b8: PASS