[Feature request] Saving model checkpoints during training #103
Comments
Hi @nicesetup! @alex-krull is currently very busy moving the center of his life to the UK, where he's starting his own group, so I'm sorry that our support is a bit slow at the moment. I have one random thought and a question for now:
Best,
Hi Florian, thank you very much for your response! I did some further research and think your random thought is quite probably correct: the optimizer is what causes this behaviour. To pause training and resume from that exact state later on, it would therefore be necessary to checkpoint not just the weights but the entire model, including the state of the optimizer. Do you think this is feasible to implement in your N2V code? Alternatively, saving weights-only checkpoints during training would already be helpful. Thank you very much and best regards.
Update:
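For reference, saving the full Keras model rather than only its weights also stores the optimizer's internal state, so reloading it should allow training to continue from the exact same point. A minimal sketch, assuming the underlying Keras model is reachable as keras_model on the N2V object (the attribute name, and whether custom_objects is needed at load time, may differ in the actual code base):

```python
from keras.models import load_model  # or tensorflow.keras, depending on the installed version

# Save architecture + weights + optimizer state to a single HDF5 file.
# `n2v_model` is assumed to be a (partially) trained N2V instance.
n2v_model.keras_model.save("n2v_full_checkpoint.h5")

# Resuming later: the restored model carries the optimizer's internal variables
# (e.g. Adam's moment estimates), so continued training should behave like
# training in one go. If N2V compiles a custom loss, it would additionally have
# to be passed via the `custom_objects` argument of load_model.
restored = load_model("n2v_full_checkpoint.h5")
```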
Hello everyone,
I am currently trying to use n2v 3D within the ZeroCostDL4Mic framework to denoise fluorescence images of calcium signals in order to make them usable for quantitative analysis. To better understand the training process, I tried to create a loop that pauses the training every couple of epochs and creates an intermediate prediction. Unfortunately, storing the weights and reloading them degrades the model's performance, and training that way is not equivalent to training in one go. (See this issue for details: HenriquesLab/ZeroCostDL4Mic#50 .)
I would therefore like to realize the suggestion Romain Laine made in that issue: saving model checkpoints with a Keras callback. That way, I could create "intermediate results" from the saved checkpoints later on, without having to interrupt the training process. The N2V class seems to offer a way of saving checkpoints already (code lines 280-284). Would it be sufficient to configure the callback such that not just weights_best is saved, but the current weights at the end of each epoch? Or would I have to adapt the code further to achieve such functionality? Furthermore, would it be possible to name those checkpoints adequately, for example by passing a format string like weights.{epoch:02d}-{val_loss:.2f}.h5 to the config object?
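As a rough sketch of that idea (not verified against the current N2V code, and assuming the training callbacks built in prepare_for_training are exposed as a callbacks list before train() is called), a plain Keras ModelCheckpoint with a format-string filepath would write the current weights at the end of every epoch:

```python
from keras.callbacks import ModelCheckpoint  # or tensorflow.keras.callbacks

# Write a weights file after every epoch, named by epoch number and val_loss.
checkpoint_cb = ModelCheckpoint(
    filepath="checkpoints/weights.{epoch:02d}-{val_loss:.2f}.h5",
    save_weights_only=True,   # set to False to also store the optimizer state
    save_best_only=False,     # keep one file per epoch, not only the best model
    verbose=1,
)

# Assumed hook point (attribute and method names follow the CSBDeep-style base
# class and may differ): build the default callbacks first, append the
# checkpoint callback, then train as usual.
# X_train / X_val stand in for the training and validation patch arrays.
n2v_model.prepare_for_training()
n2v_model.callbacks.append(checkpoint_cb)
n2v_model.train(X_train, X_val)
```

With save_best_only=False, one file accumulates per epoch, so the checkpoints directory needs to exist and have enough disk space.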
Any help implementing this would be greatly appreciated, since I am not really familiar with Keras.
Thank you very much and best regards!