Segmentation fault (core dumped) on import torch_geometric #4363

Closed
DanielPerezJensen opened this issue Mar 28, 2022 · 19 comments

@DanielPerezJensen

DanielPerezJensen commented Mar 28, 2022

🐛 Describe the bug

When I try to import torch_geometric I get a segmentation fault (core dumped) error.

I have tried reinstalling torch_geometric, and I have verified that PyTorch and the torch-scatter library work as intended (in the sense that importing them yields no errors).

I wish I could add more detail, but I am just trying to import torch_geometric to get started and as such haven't even used any functions yet, since every time I try to import I get this error.

import torch_geometric

I installed torch_geometric with the following command (from the getting started page):
pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.11.0+cu113.html

I also installed the CPU version to check whether it would avoid the error, but I still get the same segmentation fault (core dumped) error upon importing torch_geometric.

Environment

  • PyG version: latest
  • PyTorch version: 1.11.0+cu113
  • OS: Ubuntu 20.04.4 LTS x86_64
  • Python version: 3.8.10
  • CUDA/cuDNN version: 11.3
  • How you installed PyTorch and PyG (conda, pip, source): pip
  • Any other relevant information (e.g., version of torch-scatter):
      • torch_scatter: 2.0.9
      • torch_cluster: 1.6.0
      • torch_spline_conv: 1.2.1

UPDATE:

I added the output from gdb python -c "import torch_geometric", as I figured it might be useful:

warning: Currently logging to gdb.txt.  Turn the logging off and on to make the new setting effective.
Signal        Stop	Print	Pass to program	Description
SIG33         No	No	Yes		Real-time event 33
Starting program: /home/daniel/.local/share/virtualenvs/flow-forecasting-thesis-VrA_h4b9/bin/python -c import\ torch_geometric
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4943700 (LWP 11265)]
[New Thread 0x7ffff4142700 (LWP 11266)]
[New Thread 0x7fffef941700 (LWP 11267)]
[New Thread 0x7fffef140700 (LWP 11268)]
[New Thread 0x7fffea93f700 (LWP 11269)]
[New Thread 0x7fffe813e700 (LWP 11270)]
[New Thread 0x7fffe593d700 (LWP 11271)]
[New Thread 0x7fffe313c700 (LWP 11272)]
[New Thread 0x7fffe093b700 (LWP 11273)]
[New Thread 0x7fffde13a700 (LWP 11274)]
[New Thread 0x7fffdb939700 (LWP 11275)]
[New Thread 0x7fffd9138700 (LWP 11276)]
[New Thread 0x7fffd6937700 (LWP 11277)]
[New Thread 0x7fffd4136700 (LWP 11278)]
[New Thread 0x7fffd1935700 (LWP 11279)]
[Thread 0x7fffd1935700 (LWP 11279) exited]
[Thread 0x7fffd4136700 (LWP 11278) exited]
[Thread 0x7fffd6937700 (LWP 11277) exited]
[Thread 0x7fffd9138700 (LWP 11276) exited]
[Thread 0x7fffdb939700 (LWP 11275) exited]
[Thread 0x7fffde13a700 (LWP 11274) exited]
[Thread 0x7fffe093b700 (LWP 11273) exited]
[Thread 0x7fffe313c700 (LWP 11272) exited]
[Thread 0x7fffe593d700 (LWP 11271) exited]
[Thread 0x7fffe813e700 (LWP 11270) exited]
[Thread 0x7fffea93f700 (LWP 11269) exited]
[Thread 0x7fffef140700 (LWP 11268) exited]
[Thread 0x7fffef941700 (LWP 11267) exited]
[Thread 0x7ffff4142700 (LWP 11266) exited]
[Thread 0x7ffff4943700 (LWP 11265) exited]
[Detaching after fork from child process 11280]
[New Thread 0x7fffd1935700 (LWP 11284)]

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffefc21a1e1 in google::protobuf::internal::ReflectionOps::FindInitializationErrors(google::protobuf::Message const&, std::string const&, std::vector<std::string, std::allocator<std::string> >*) () from /home/daniel/.local/share/virtualenvs/flow-forecasting-thesis-VrA_h4b9/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
@DanielPerezJensen
Author

Another thing to add might be that I use Pipenv as a package manager. This has never caused problems in the past with any PyTorch library.

@rusty1s
Member

rusty1s commented Mar 28, 2022

Thanks for reporting. You mentioned that you can import torch_scatter. Does that also hold for torch_sparse and other libraries? Furthermore, can you confirm that the error is the same when uninstalling the extension packages and importing torch_geometric again?

I am not entirely sure how to interpret the google::protobuf error. Can you try to remove the google-protobuf dependency as well?

@DanielPerezJensen
Author

> Thanks for reporting. You mentioned that you can import torch_scatter. Does that also hold for torch_sparse and other libraries? Furthermore, can you confirm that the error is the same when uninstalling the extension packages and importing torch_geometric again?
>
> I am not entirely sure how to interpret the google::protobuf error. Can you try to remove the google-protobuf dependency as well?

Thanks for your speedy reply!

torch_geometric seems to be the only package that gives me this error. I am able to import torch_scatter, torch_sparse, torch_cluster, and torch_spline_conv without error.

When I run the command pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.11.0+cu113.html --force-reinstall and then try to import the packages, all of them work except for torch_geometric.

I am not sure exactly what you mean by removing the dependency. Could you clarify what you mean by that?
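One way to answer the "which imports work?" question without crashing the checking process itself is to try each import in a child interpreter. The sketch below is only an illustration: `probe_import` is a hypothetical helper name, and the interpretation of a negative return code as "killed by a signal" assumes a POSIX system such as the Ubuntu setup described above.

```python
import signal
import subprocess
import sys


def probe_import(module: str, timeout: float = 120.0) -> str:
    """Import `module` in a child interpreter and describe how the child exited."""
    # -X faulthandler makes the child print a Python traceback on a fatal signal.
    cmd = [sys.executable, "-X", "faulthandler", "-c", f"import {module}"]
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    # Ordinary Python failure (e.g. ModuleNotFoundError): keep the last stderr line.
    lines = proc.stderr.strip().splitlines()
    return f"python error: {lines[-1] if lines else 'unknown'}"


if __name__ == "__main__":
    for name in ["torch", "torch_scatter", "torch_sparse", "torch_cluster",
                 "torch_spline_conv", "torch_geometric"]:
        print(f"{name}: {probe_import(name)}")
```

Because each import runs in its own process, a segmentation fault in one package is reported as a result and does not stop the remaining checks.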

@rusty1s
Member

rusty1s commented Mar 29, 2022

I thought it would be interesting to know about the error of import torch_geometric when uninstalling torch-scatter torch-sparse torch-cluster torch-spline-conv first. Nonetheless, this issue seems indeed related to google::protobuf. Any chance you can confirm that PyG works when uninstalling this dependency/starting in a fresh environment?

@DanielPerezJensen
Author

> I thought it would be interesting to know about the error of import torch_geometric when uninstalling torch-scatter torch-sparse torch-cluster torch-spline-conv first. Nonetheless, this issue seems indeed related to google::protobuf. Any chance you can confirm that PyG works when uninstalling this dependency/starting in a fresh environment?

When I remove those from my current environment I get a ModuleNotFoundError for torch_sparse. Reinstalling that and then trying to import torch_geometric gives me another ModuleNotFoundError for torch_scatter. Reinstalling that and trying to import torch_geometric again gives the segmentation fault, with gdb again yielding something about protobuf:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffee5d191e1 in google::protobuf::internal::ReflectionOps::FindInitializationErrors(google::protobuf::Message const&, std::string const&, std::vector<std::string, std::allocator<std::string> >*) () from /home/daniel/.local/share/virtualenvs/flow-forecasting-thesis-VrA_h4b9/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so

When I try a clean environment and follow the installation instructions listed under Installation via Pip Wheels, I am able to import torch_geometric.

This is very nice, but of course there might still be a problem with protobuf for whatever reason.

I will try to make a clean environment, with all my other dependencies, and see if I can get it to work!

@DanielPerezJensen
Author

So I think I figured it out: it has to do with a version mismatch. For my first install I simply followed the instructions under Quick Start, which install the version for PyTorch 1.11.0, while my current work requires PyTorch 1.10.0.

So, my bad for not paying attention. Thanks for the help anyway!
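The install commands in this thread all point pip at an index of the form https://data.pyg.org/whl/torch-<version>+<build>.html, so the version in the URL has to agree with the PyTorch build in the environment. As a minimal sketch, the hypothetical helpers below (`wheel_index_url` and `pip_command` are names invented for illustration) derive the command from a version string such as the one printed by `torch.__version__`. Whether an index page exists for a given patch release is not guaranteed, so the resulting URL should be checked before use.

```python
EXTENSIONS = ["torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"]


def wheel_index_url(torch_version: str) -> str:
    """Build the wheel index URL for a version string such as '1.10.0+cu113'."""
    if "+" not in torch_version:
        # Without a build tag (+cu113, +cpu, ...) the index cannot be chosen.
        raise ValueError(f"no build tag in torch version {torch_version!r}")
    return f"https://data.pyg.org/whl/torch-{torch_version}.html"


def pip_command(torch_version: str) -> str:
    """Assemble a pip command that installs the extensions from the matching index."""
    packages = " ".join(EXTENSIONS)
    return f"pip install {packages} -f {wheel_index_url(torch_version)}"


if __name__ == "__main__":
    # Example with the PyTorch version this project needed.
    print(pip_command("1.10.0+cu113"))
```

In an environment where PyTorch imports cleanly, `pip_command(torch.__version__)` would use the installed build instead of a hard-coded string.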

@rusty1s
Member

rusty1s commented Mar 29, 2022

Super :) Is this issue resolved or is there anything to do from our side to close this? :)

@DanielPerezJensen
Author

DanielPerezJensen commented Mar 29, 2022 via email

@rusty1s rusty1s closed this as completed Mar 29, 2022
@Sehaba95

Sehaba95 commented Jan 9, 2023

I had the same issue and followed @DanielPerezJensen's approach: I uninstalled torch-scatter, torch-sparse, torch-cluster, and torch-spline-conv, checked my PyTorch and CUDA versions, and reinstalled using the Quick Start instructions.
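Checking versions does not require importing the packages that crash. The following sketch reads the installed distribution metadata through the standard library, which does not load any compiled extension. The package list is an assumption based on the names used in this thread, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distribution names as used in the pip commands in this thread (assumed list).
PACKAGES = ["torch", "torch-geometric", "torch-scatter", "torch-sparse",
            "torch-cluster", "torch-spline-conv", "pyg-lib"]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:20s} {version or 'not installed'}")
```

Recent extension wheels appear to carry a local version suffix naming the PyTorch and CUDA build they were compiled for (something like `+pt113cu117`); treat that as an assumption, since older wheels such as the 2.0.9 release listed above show no suffix. Where a suffix is present, comparing it with the `torch` line makes a mismatch visible at a glance.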

@hnisonoff

hnisonoff commented Jan 13, 2023

I am also having this issue. I installed using conda with PyTorch 1.13.1.

Update: I fixed this by installing from pip using 1.13.1 instead of 1.13.0. The website suggested 1.13.0 for some reason, so that may need to be fixed.

@rusty1s
Member

rusty1s commented Jan 14, 2023

Glad you got it working. Can you clarify what we need to fix?

@PabloVD

PabloVD commented Mar 30, 2023

Hi, I have a similar issue. After reinstalling drivers, cuda and pytorch, I installed torch_geometric, but when I do import torch_geometric I get a segmentation fault (core dumped) error.
My torch version is '2.0.0+cu117', so following the instructions from the PyG docs I installed the package with pip install torch_geometric. So I don't think it is a matter of mismatch between versions right?
I can safely import torch_scatter, torch_cluster and torch_sparse.

@g-yichen

g-yichen commented Apr 9, 2023

> Hi, I have a similar issue. After reinstalling drivers, cuda and pytorch, I installed torch_geometric, but when I do import torch_geometric I get a segmentation fault (core dumped) error. My torch version is '2.0.0+cu117', so following the instructions from the PyG docs I installed the package with pip install torch_geometric. So I don't think it is a matter of mismatch between versions right? I can safely import torch_scatter, torch_cluster and torch_sparse.

This is likely a version mismatch problem. I had exactly the same issue after following the "Quick Start" on the web page. As I remember, I had PyTorch 2.0.0 with CUDA 11.8, and I downloaded pyg_lib from the optional dependencies for the same PyTorch and CUDA versions. However, PyTorch was changed to 1.13.1+cu117 after installing PyG. I fixed the problem by uninstalling pyg_lib and reinstalling it for the correct PyTorch and CUDA versions (pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-1.13.0+cu117.html). Hope this helps.

@PabloVD

PabloVD commented Apr 13, 2023

Thanks for your answer. I managed to overcome the issue just by reinstalling torch and PyG in a new virtual environment, and everything works there.

@claCase

claCase commented Mar 30, 2024

I have the same problem. I've followed the step-by-step guide. I have the following installed on my WSL system with Python 3.11:

>>> import torch; print(torch.__version__)
2.2.1+cu121
>>> import torch; print(torch.version.cuda)
12.1

When first installing torch_geometric the import goes through without any problem. After installing:

pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.2.0+cu121.html

I've also updated the gcc version following this:

$ gcc --version
gcc (Ubuntu 13.1.0-8ubuntu1~22.04) 13.1.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I get a segmentation fault error when importing torch_geometric. Importing torch_scatter doesn't cause any problem, but importing torch_sparse gets a segmentation fault. I've tried installing different torch versions but the problem persists. Any help?

@rusty1s
Member

rusty1s commented Apr 2, 2024

So import torch_sparse breaks for you? Do you know where the segmentation fault occurs?

@claCase

claCase commented Apr 6, 2024

Hi @rusty1s,

by running the following script:

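import faulthandler
import sys

# Print the Python traceback to stderr if the process receives a fatal signal such as SIGSEGV.
faulthandler.enable(file=sys.stderr, all_threads=False)
try:
    import torch_sparse
except Exception as e:
    # Only ordinary Python exceptions reach this handler; a segfault kills the process first.
    faulthandler.dump_traceback_later(1)
    quit()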

I get the following:

Fatal Python error: Segmentation fault

Stack (most recent call first):
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_script.py", line 1399 in script
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 1003 in try_compile_fn
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_script.py", line 1395 in script
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 1003 in try_compile_fn
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 61 in _compile_and_register_class
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch/jit/_script.py", line 1375 in script
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch_sparse/storage.py", line 21 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/wslvenv/lib/python3.11/site-packages/torch_sparse/__init__.py", line 39 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/mnt/a/Users/Claudio/Documents/4 - GitHub/VGRNN/import_script.py", line 8 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special (total: 20)
Segmentation fault

If this log is not sufficient, could you please suggest a better way to debug the error? Thank you!

Edit: When I set a breakpoint at line 21 in torch_sparse/storage.py and step through with the debugger, I can't reproduce the segmentation fault.

@rusty1s
Member

rusty1s commented Apr 8, 2024

Looks like it breaks in the torch.jit.script decorator. Can you try to remove the @torch.jit.script decorator in torch_sparse/storage.py and torch_sparse/tensor.py and try again? Not totally sure why this breaks, but it should definitely fix your issue.

@claCase

claCase commented Apr 8, 2024

After commenting out the torch.jit.script decorator as you suggested, I can import the library successfully. Many thanks!
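A possible alternative to editing the installed files, not tried or confirmed anywhere in this thread: PyTorch documents a PYTORCH_JIT=0 environment variable that disables script and tracing annotations, which should make `@torch.jit.script` a no-op. The sketch below assumes that behavior also covers the decorators in torch_sparse, and that the variable is set before torch is first imported in the process.

```python
import os

# Assumption: this must be set before torch is imported for the first time.
os.environ["PYTORCH_JIT"] = "0"

try:
    import torch_sparse  # noqa: F401
    print("torch_sparse imported with TorchScript disabled")
except ImportError as exc:
    # Reached when torch_sparse (or torch) is not installed in this environment.
    print(f"torch_sparse is not importable here: {exc}")
```

Disabling TorchScript process-wide also affects any other code that relies on scripted or traced modules, so this is better suited to diagnosis than to a permanent setup.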
