Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make add_artifact replace by name (#685) #925

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gkanwar
Copy link
Contributor

@gkanwar gkanwar commented Feb 17, 2024

This addresses the feature request in Issue #685. In our experiments we have found this pattern to be very useful, e.g. for model checkpoints or intermediate artifact checkpoints. With this PR, artifact checkpoints can be updated frequently so the latest checkpoint is up-to-date. At a much slower rate, artifacts can be checkpointed to distinct names to preserve history with much less disk usage.

If it's preferable to not make this the default behavior, I would be happy to update the PR to thread a flag through add_artifact to optionally enable this behavior.

@thequilo
Copy link
Collaborator

Hi @gkanwar! I'm happy to accept this feature! But the default behavior shouldn't change so that we don't break other people's code.

I prefer a flag added to the MongoObserver instead of the artifact event. Or is there a use case where you want to manually enable it for a specific artifact and have the others keep the current behavior?

@thequilo
Copy link
Collaborator

At a much slower rate, artifacts can be checkpointed to distinct names to preserve history with much less disk usage.

With the current version of the PR this means that the user has to do the rate-limiting by renaming the artifacts, right? Would it make sense to add an option for this? E.g., to overwrite an artifact for 10 minutes and then add a new one?

@gkanwar
Copy link
Contributor Author

gkanwar commented Feb 19, 2024

Hi @thequilo, thanks for the quick reply! A flag in the MongoObserver sounds good to me. I'll go ahead and work this through and update the PR.

Yes, the user could do something like frequently saving over the checkpoint.pt artifact and infrequently also calling add_artifact with a unique checkpoint name like checkpoint_iter1000.pt. At least this is how we are setting up our runs now.

I'm hesitant to add such an automatic timed backup feature, since for example users may want to organize these files differently. In our case, it's checkpointing based on gradient descent steps, which is of course not known to sacred explicitly. We set our filenames based on this.

@thequilo
Copy link
Collaborator

I'm hesitant to add such an automatic timed backup feature, since for example users may want to organize these files differently. In our case, it's checkpointing based on gradient descent steps, which is of course not known to sacred explicitly. We set our filenames based on this.

Agreed, it makes more sense to let the users do it themselves. There are too many different usecases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants