How to handle datasets that require processing? #7
I compare this to regular software based on source code. To run it, we need to download it first, which is handled by the "run" section:

```json
"run": [
  {
    "name": "upscale",
    "command": "python upscale.py"
  },
  {
    "name": "train",
    "command": "python train.py"
  }
]
```

I'd have to add a stage tracker so each run does not repeat all stages from scratch but makes progress by picking up where the previously interrupted stage finished. At this point, I don't know how to cleanly introduce checkpointing of stages in case of interruptions due to, e.g., the allocation running out or hardware failing.
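A minimal sketch of what such a stage tracker could look like, assuming the stages from the "run" section above; the `.stages_done.json` marker file and the runner script itself are illustrative, not part of the project:

```python
import json
import subprocess
from pathlib import Path

# Hypothetical marker file recording which stages have already completed.
STATE_FILE = Path(".stages_done.json")

STAGES = [
    {"name": "upscale", "command": "python upscale.py"},
    {"name": "train", "command": "python train.py"},
]

def load_done() -> set:
    """Read the set of completed stage names, if any."""
    if STATE_FILE.exists():
        return set(json.loads(STATE_FILE.read_text()))
    return set()

def mark_done(done: set) -> None:
    """Persist the set of completed stage names."""
    STATE_FILE.write_text(json.dumps(sorted(done)))

def run_stages() -> None:
    done = load_done()
    for stage in STAGES:
        if stage["name"] in done:
            print(f"skipping {stage['name']} (already completed)")
            continue
        # If the allocation runs out or the node fails here, the marker file
        # still records the earlier stages, so the next run resumes from this one.
        subprocess.run(stage["command"], shell=True, check=True)
        done.add(stage["name"])
        mark_done(done)

if __name__ == "__main__":
    run_stages()
```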
In the past I was using the following mechanism to track stage completion:

```yaml
input:
  model:
    checksum: abcd
  data:
    checksum: xyz
  config:
    checksum: 1234
```

This way it is easy to check if the configs/processing/upstream data changed and the processed data is stale and needs to be regenerated.
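A rough sketch of how that staleness check could work; the file paths and the `.input_checksums.json` state file are assumptions for illustration only:

```python
import hashlib
import json
from pathlib import Path

STATE_FILE = Path(".input_checksums.json")  # hypothetical stored state

def checksum(path: str) -> str:
    """SHA-256 of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def is_stale(inputs: dict) -> bool:
    """Return True if any input (model/data/config) changed since the last run."""
    current = {name: checksum(path) for name, path in inputs.items()}
    if not STATE_FILE.exists():
        return True
    previous = json.loads(STATE_FILE.read_text())
    return current != previous

def record(inputs: dict) -> None:
    """Store the current checksums after a successful processing run."""
    STATE_FILE.write_text(json.dumps({n: checksum(p) for n, p in inputs.items()}))

# Example: regenerate the processed dataset only when something changed.
inputs = {"model": "model.pt", "data": "raw.h5", "config": "config.yaml"}
if is_stale(inputs):
    # ... rerun the processing stage here ...
    record(inputs)
```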
We also need some feature that tests a condition; maybe we can generalize it into a success: check. Note that since lots of stuff in sh is complex, supporting Python may also be cool to have.
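One way such a generalized success: check could allow either a shell command or a Python expression; the spec keys shown here are invented for illustration, not an agreed format:

```python
import subprocess
from pathlib import Path

def check_success(spec: dict) -> bool:
    """Evaluate a hypothetical 'success' spec from the run config.

    Illustrative forms:
      {"shell": "test -f output.npz"}                      -- succeed if the command exits 0
      {"python": "Path('output.npz').exists()"}            -- succeed if the expression is truthy
    """
    if "shell" in spec:
        return subprocess.run(spec["shell"], shell=True).returncode == 0
    if "python" in spec:
        return bool(eval(spec["python"], {"Path": Path}))
    raise ValueError("unknown success condition")

# Example usage:
print(check_success({"shell": "test -d ."}))
print(check_success({"python": "Path('.').exists()"}))
```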
Some of the datasets used for training require processing, such as upscaling. Example: https://github.com/YudongYao/AutoPhaseNN/blob/main/PyTorch/prepare_defectFree_data.ipynb
The procedure here is:
I see a few ways of handling this:
@luszczek @laszewsk What would be your preferred approach?