[BUG] Error when using dp train for training #4123

Open
LLfbforever opened this issue Sep 13, 2024 · 0 comments

Bug summary

  1. Multi-card training with dp train works without problems.
  2. Single-card training (mpirun -np 1 dp train input.json) fails with an error:
    (error message attached as an image)
  3. After replacing deepmd-kit/deepmd/utils/neighbor_stat.py in v2.2.10 with the file from v2.2.9, single-card training runs normally.
    For an example, see: deepmd_source_dir/examples/water/se_e2_a/

DeePMD-kit Version

v2.2.10

Backend and its version

TensorFlow v2.11.0

How did you download the software?

Offline packages

Input Files, Running Commands, Error Log, etc.

Input file: deepmd_source_dir/examples/water/se_e2_a/input.json
Command: mpirun -np 1 dp train input.json

Steps to Reproduce

deepmd_source_dir/examples/water/se_e2_a/input.json
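
For convenience, the reproduction can be sketched as a small shell helper. This is only a sketch under assumptions: `DEEPMD_SOURCE_DIR`, `DRY_RUN`, `repro_cmd`, and `run_repro` are hypothetical names introduced here, and `deepmd_source_dir` stands for wherever the DeePMD-kit source tree is unpacked. The command itself is the one given in this report. By default the helper only prints what it would run.

```shell
#!/bin/sh
# Hypothetical helper: DEEPMD_SOURCE_DIR and DRY_RUN are assumed names,
# not DeePMD-kit settings. Point DEEPMD_SOURCE_DIR at the source tree.
DEEPMD_SOURCE_DIR="${DEEPMD_SOURCE_DIR:-deepmd_source_dir}"
EXAMPLE_DIR="$DEEPMD_SOURCE_DIR/examples/water/se_e2_a"

# The single-card command from this report.
repro_cmd() {
    printf '%s' "mpirun -np 1 dp train input.json"
}

# Print the reproduction step, or run it when DRY_RUN=0.
run_repro() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "cd $EXAMPLE_DIR && $(repro_cmd)"
    else
        (cd "$EXAMPLE_DIR" && mpirun -np 1 dp train input.json)
    fi
}

run_repro
```

Setting `DRY_RUN=0` runs the training in the example directory, which requires `mpirun` and `dp` to be on the `PATH`.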

Further Information, Files, and Links

No response
