Training time is too long #24

Open
jasonbau opened this issue Mar 6, 2019 · 4 comments


@jasonbau

jasonbau commented Mar 6, 2019

I'm training on 17,000 pictures on GCP Compute Engine.
Compute Engine information:
CPU & Memory: n1-highmem-8 (8 vCPUs, 52 GB memory)
GPU: 8 x NVIDIA Tesla K80
Disk: 100 GB SSD

I started python3 watermarks.py --logdir=save/ and it has been running ever since,
but it has not completed yet. Is there any way to make it faster?
image

@arnaudoff

Same issue here. Compiling TensorFlow with SSE and AVX instructions enabled, as recommended, seems to have improved performance on my machine, but not significantly.

@kmrabhay

kmrabhay commented May 7, 2019

I fixed it by removing the while True: part in the evaluation section (the line after saving the model). It was sending the evaluation process into an infinite loop, so training never progressed beyond one checkpoint.

@kmrabhay

kmrabhay commented May 7, 2019

Fixed it by removing the while True: part in the evaluation section (the line after saving the model). It was sending the evaluation process into an infinite loop, so training never got past the first checkpoint after saving it.
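
For anyone unsure what this change looks like in practice, here is a minimal sketch of the pattern. It is not the actual code from watermarks.py: the function names (train, evaluate_bounded), the callbacks, and the eval_batches parameter are all hypothetical, and the real script may structure its loop differently. The idea is only that evaluation after a checkpoint should consume a bounded number of batches and then return control to the training loop.

```python
import itertools


def evaluate_bounded(batches, eval_step, max_batches):
    """Score at most max_batches batches, then return the mean score.

    The problematic pattern would instead look like:
        while True:
            eval_step(next(batches))
    which never returns, so the training loop below would stall
    right after the first checkpoint.
    """
    scores = [eval_step(batch) for batch in itertools.islice(batches, max_batches)]
    return sum(scores) / len(scores) if scores else float("nan")


def train(num_steps, checkpoint_every, train_step, save_checkpoint,
          batches, eval_step, eval_batches=10):
    """Hypothetical training loop: checkpoint, evaluate briefly, continue."""
    history = []
    for step in range(1, num_steps + 1):
        train_step(step)
        if step % checkpoint_every == 0:
            save_checkpoint(step)
            score = evaluate_bounded(batches, eval_step, eval_batches)
            history.append((step, score))
    return history
```

With a bounded evaluation, every checkpoint is followed by a short scoring pass and training resumes, instead of hanging after the first save.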

@arnaudoff

I haven't changed anything in the code, but training time decreased dramatically after I built TensorFlow myself again, this time with the -march=native flag. I guess it does a better job at optimization than manually choosing which instruction sets to use, so you might try that and see if it works for you as well.
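
If you want to know which instruction sets your machine actually has before rebuilding, a rough check is possible on x86 Linux by reading the flags line of /proc/cpuinfo. The sketch below is only an illustration: the helper name supported_simd_flags and the list of flags to look for are my own choices, not anything from TensorFlow, and other platforms expose this information differently. -march=native asks the compiler to target whatever the build machine's CPU supports, so these flags give a hint of what it can make use of.

```python
from pathlib import Path

# Flags of interest; this selection is an assumption, adjust as needed.
SIMD_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "fma", "avx512f")


def supported_simd_flags(cpuinfo_text, wanted=SIMD_FLAGS):
    """Return the entries of `wanted` listed on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in wanted if flag in present]
    return []  # no 'flags' line found (e.g. not x86 Linux)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(supported_simd_flags(cpuinfo.read_text()))
```

An empty list means the flags line was not found or none of the listed extensions are present, in which case a native build is unlikely to gain much from them.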
