Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the issue that len(indices) and num_samples might not be equal #1339

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sunjq1
Copy link
Contributor

@sunjq1 sunjq1 commented Nov 15, 2024

What changes were proposed in this pull request?

Modified the definition of total_size in the load_state_dict function and the definition of indices in the __iter__ function to ensure that assert len(indices) == self.num_samples.

Why are the changes needed?

In the previous ElasticDistributedSampler code, the issue where completed_num not being divisible by num_replicas could cause an AssertionError in the scenario of importing a checkpoint was not considered.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

UT and training test.

…en importing a checkpoint to resume training
Copy link
Collaborator

@BalaBalaYi BalaBalaYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add ut to cover the issue mentioned

@sunjq1
Copy link
Contributor Author

sunjq1 commented Nov 26, 2024

please add ut to cover the issue mentioned

since I adjusted the logic for splitting indices after loading the checkpoint, I corrected the num values in the UT file

@sunjq1 sunjq1 requested a review from BalaBalaYi November 26, 2024 07:42
Copy link

codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.82%. Comparing base (f03b769) to head (81a0fd2).
Report is 113 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1339      +/-   ##
==========================================
- Coverage   80.48%   79.82%   -0.66%     
==========================================
  Files         219      240      +21     
  Lines       20208    22578    +2370     
==========================================
+ Hits        16264    18023    +1759     
- Misses       3944     4555     +611     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sunjq1
Copy link
Contributor Author

sunjq1 commented Dec 6, 2024

@BalaBalaYi Hi,the submitted code format optimization has been completed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants