Multi gpu training #69
Conversation
Almost there. Apparently there is a problematic interplay between TF and Keras:
Done implementing multi-GPU training. I hope putting that into the constructor works out. I'll supply more extensive numbers later; my current estimate for training will follow.
I'll also provide 4-GPU numbers later. Note that this "improvement" is expected to scale sub-linearly, as Keras internally parallelizes the batches, so a given batch size is split across the GPUs. Would love to hear your feedback on this.
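To illustrate the sub-linear scaling point: Keras' `multi_gpu_model` splits each incoming batch across the replicas, so each GPU only processes a slice of the batch per step. A small sketch (the `per_gpu_batch` helper is hypothetical, not part of N2V or Keras):

```python
def per_gpu_batch(batch_size, n_gpus):
    """Number of samples each GPU sees per training step when a batch
    is split evenly across replicas.
    (Illustrative helper, not part of the N2V or Keras API.)"""
    if batch_size % n_gpus != 0:
        raise ValueError("batch_size should be divisible by the number of GPUs")
    return batch_size // n_gpus

# A batch of 128 on 4 GPUs gives each device only 32 samples per step,
# so per-GPU utilization drops unless the batch size is scaled up accordingly.
print(per_gpu_batch(128, 4))  # → 32
```

This is why the batch size usually needs to grow with the GPU count to keep each device busy, and why the wall-clock speed-up is not simply linear in the number of GPUs.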
Thank you for this PR! I have this on my to-do list, but wasn't able to get my hands on a multi-GPU system. I guess the cluster should work for testing. Although I am very confident that it just works, I would like to test it as well :)
Thanks for having a look. Last time I checked, all configurations with >=3 GPUs failed to run due to some problems with the Keras data augmentations. Maybe this can be alleviated by taking a closer look at the augmentation code.
Hi,
I have set CUDA_VISIBLE_DEVICES to 1,2 before running the training. I installed N2V with pip install n2v. My setup: TF-GPU 1.14.1, Keras 2.2.5, NumPy 1.19.1. The training still uses only 1 GPU. Please let me know what I am missing.
Hi @piby2, this functionality is not part of the official N2V release yet. If you would like to test it, you would have to clone the fork.
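As an aside on the `CUDA_VISIBLE_DEVICES` question above: the variable only takes effect if it is set in the environment before TensorFlow is imported, since TensorFlow enumerates devices at import time. A minimal sketch of the safe ordering (the device string "1,2" is just an example):

```python
import os

# Must be set BEFORE `import tensorflow` runs anywhere in the process;
# afterwards TensorFlow has already enumerated and claimed the devices.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"

# import tensorflow as tf  # imported only after the variable is set

print(os.environ["CUDA_VISIBLE_DEVICES"])  # → 1,2
```

Even with two GPUs visible, stock Keras training still runs on a single device; distributing the model across them is exactly what this PR adds.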
This needs a bit more testing, but I think going multi-GPU is fairly straightforward. Or did you try that already?