-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print step time for each step #361
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not sure that we want to print every step by default (since it'll introduce a significant amount of logs/summaries, and since avg step time is usually pretty stable after the first couple hundred steps)?
That makes sense. Should I hide it behind an option? For me it was important to troubleshoot a variable step time issue. Or would you rather totally leave this out of the code base. Note I'm fine with that too. |
I've added it as a config parameter and made it false by default |
I have been using this for 2 use cases:
Please let me know if there is any interest in merging it @markblee |
# Log the step time for each individual step | ||
log_step_time: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it'll be more general to do:
# Log the step time for each individual step | |
log_step_time: bool = False | |
# Interval to log average step time. If None, defaults to 100. | |
log_step_time_every_n_steps: Optional[int] = None |
Note that we default to None instead of 100 to avoid triggering a large number of golden config updates.
step_time = time.perf_counter() - step_start_time | ||
self._step_log("Step %s took %s seconds", self.step, step_time) | ||
self.summary_writer(self.step, {"step_time": step_time}) | ||
|
||
num_steps += 1 | ||
if num_steps % 100 == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if num_steps % 100 == 0: | |
if num_steps % (cfg.log_step_time_every_n_steps or 100) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will defer to @markblee for review.
This is helpful in cases where there is variable step time and looking at the logs would quickly allow you to identify such cases.