Segmentation fault (core dumped) in Docker #7

Open
JensenLZX opened this issue Jul 31, 2022 · 2 comments
Comments

@JensenLZX

Segmentation fault (core dumped) in Docker

Device: NVIDIA A100 40GB PCIe GPU Accelerator

Method: Docker

Details:

I ran

python train.py --task=ShadowHandOver --algo=ppo

and

python train.py --task=ShadowHandOver --algo=happo

in ~/bi-dexhands.

In both cases, the model weights (xxx.pt) were saved correctly in ~/bi-dexhands/logs.

However, at the end of each run, the console shows the following error.

Output:

some episodes done, average rewards:  tensor(16.7454, device='cuda:0')
some episodes done, average rewards:  tensor(14.1145, device='cuda:0')
some episodes done, average rewards:  tensor(15.4696, device='cuda:0')
some episodes done, average rewards:  tensor(15.4252, device='cuda:0')
some episodes done, average rewards:  tensor(14.8325, device='cuda:0')
some episodes done, average rewards:  tensor(19.7192, device='cuda:0')
some episodes done, average rewards:  tensor(15.9727, device='cuda:0')

Algo happo Exp check updates 48825/48828 episodes, total num timesteps 49997824/50000000, FPS 1922.

some episodes done, average rewards:  tensor(14.0804, device='cuda:0')
some episodes done, average rewards:  tensor(17.5084, device='cuda:0')
some episodes done, average rewards:  tensor(18.6891, device='cuda:0')
Segmentation fault (core dumped)

Are there any suggestions for dealing with this error?

Thanks in advance!

@cypypccpy
Collaborator

Dear @RogerLZX,

I'm sorry, but because we rarely use Docker to run Isaac Gym, I don't know the cause of this bug. It seems to appear only at the end of the task, so maybe you can increase the number of episodes to achieve the same effect.

Isaac Gym is still in development, so bugs like this are inevitable. I recommend searching for or asking about this bug on the DevTalk Forum; NVIDIA developers usually answer questions there if they know the answer.
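When asking on the forum, a Python-level traceback from the moment of the crash may be useful. The following is only a minimal sketch, not something tested with Isaac Gym: it assumes a POSIX system and relies on CPython's standard `-X faulthandler` option, which dumps the Python traceback to stderr when the process receives SIGSEGV. The helper name `run_with_faulthandler` is hypothetical, and if the crash happens in native code during interpreter shutdown, the traceback may show very little.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(script_args):
    """Run a Python script with faulthandler enabled.

    Returns (returncode, segfaulted). Hypothetical helper for illustration.
    """
    # "-X faulthandler" asks CPython to print the Python traceback to stderr
    # if the process receives a fatal signal such as SIGSEGV.
    cmd = [sys.executable, "-X", "faulthandler", *script_args]
    proc = subprocess.run(cmd)
    # On POSIX, a return code of -N means the child was killed by signal N.
    segfaulted = proc.returncode == -signal.SIGSEGV
    return proc.returncode, segfaulted


# Example call, using the arguments from the report (run inside bi-dexhands):
# rc, segfaulted = run_with_faulthandler(
#     ["train.py", "--task=ShadowHandOver", "--algo=happo"]
# )
```

If `segfaulted` is true while the expected checkpoint files are present in the logs directory, that would at least confirm the crash happens after the weights are written.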

I hope this helps.

@JensenLZX
Author

@cypypccpy
Sorry, this issue is an accidental duplicate of issue #8.
Please delete it.
