[Installation]: I tried to compile for GFX1100 on WSL2 but it does not seem to work #780

Open
sorasoras opened this issue Oct 16, 2024 · 12 comments

@sorasoras

Your current environment

The output of `python env.py`

python env.py

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
    import torch
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
    from .functional import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder,
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
root@SORANET:/home/sora/aphrodite-engine# sudo python env.py

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
    import torch
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
    from .functional import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder,
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
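
The NumPy banner at the top of both runs says that a module compiled against NumPy 1.x is being loaded under NumPy 2.1.2. A minimal sketch of that check follows. `needs_numpy_downgrade` is a hypothetical helper name, and the rule it encodes (a major version of 2 or higher is a problem for a 1.x-built module) is taken only from the warning text, not from any Aphrodite or PyTorch tooling.

```python
def needs_numpy_downgrade(numpy_version: str) -> bool:
    """Return True if the installed NumPy is 2.x or newer.

    Hypothetical helper: it encodes only the rule stated in the warning,
    namely that a module compiled against NumPy 1.x may crash under 2.x.
    """
    major = int(numpy_version.split(".")[0])
    return major >= 2


# Version string taken from the pip listing in this report.
print(needs_numpy_downgrade("2.1.2"))
```

If this reports True, the warning's own suggestion applies: install a 1.x release, for example with `pip install "numpy<2"`, or upgrade the affected module.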

How did you install Aphrodite?

pip install aphrodite-engine

sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb

sudo amdgpu-install --list-usecase

If --usecase option is not present, the default selection is
"dkms,graphics,opencl,hip"
Available use cases:
dkms (to only install the kernel mode driver)
  • Kernel mode driver (included in all usecases)
graphics (for users of graphics applications)
  • Open source Mesa 3D graphics and multimedia libraries
multimedia (for users of open source multimedia)
  • Open source Mesa 3D multimedia libraries
multimediasdk (for developers of open source multimedia)
  • Open source Mesa 3D multimedia libraries
  • Development headers for multimedia libraries
workstation (for users of legacy WS applications)
  • Open source multimedia libraries
  • Closed source (legacy) OpenGL
rocm (for users and developers requiring full ROCm stack)
  • OpenCL (ROCr/KFD based) runtime
  • HIP runtimes
  • Machine learning framework
  • All ROCm libraries and applications
wsl (for using ROCm in a WSL context)
  • ROCr WSL runtime library (Ubuntu 22.04 only)
rocmdev (for developers requiring ROCm runtime and profiling/debugging tools)
  • HIP runtimes
  • OpenCL runtime
  • Profiler, Tracer and Debugger tools
rocmdevtools (for developers requiring ROCm profiling/debugging tools)
  • Profiler, Tracer and Debugger tools
amf (for users of AMF based multimedia)
  • AMF closed source multimedia library
lrt (for users of applications requiring ROCm runtime)
  • ROCm Compiler and device libraries
  • ROCr runtime and thunk
opencl (for users of applications requiring OpenCL on Vega or later products)
  • ROCr based OpenCL
  • ROCm Language runtime
openclsdk (for application developers requiring ROCr based OpenCL)
  • ROCr based OpenCL
  • ROCm Language runtime
  • development and SDK files for ROCr based OpenCL
hip (for users of HIP runtime on AMD products)
  • HIP runtimes
hiplibsdk (for application developers requiring HIP on AMD products)
  • HIP runtimes
  • ROCm math libraries
  • HIP development libraries
openmpsdk (for users of openmp/flang on AMD products)
  • OpenMP runtime and devel packages
mllib (for users executing machine learning workloads)
  • MIOpen hip/tensile libraries
  • Clang OpenCL
  • MIOpen kernels
mlsdk (for developers executing machine learning workloads)
  • MIOpen development libraries
  • Clang OpenCL development libraries
  • MIOpen kernels
asan (for users of ASAN enabled ROCm packages)
  • ASAN enabled OpenCL (ROCr/KFD based) runtime
  • ASAN enabled HIP runtimes
  • ASAN enabled Machine learning framework
  • ASAN enabled ROCm libraries

rocminfo

HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: NO

==========
HSA Agents


Agent 1


Name: CPU
Uuid: CPU-XX
Marketing Name: CPU
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 49137460(0x2edc734) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 49137460(0x2edc734) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:


Agent 2


Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 16(0x10)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2526
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 2280
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25086124(0x17ec8ac) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
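
The `rocminfo` dump above is where the GPU architecture string (`gfx1100`) comes from, the same value later exported as `PYTORCH_ROCM_ARCH`. As a rough sketch, the agent names can be pulled out of that text with a regular expression. `gfx_targets` is a hypothetical helper, and the pattern assumes each agent name sits on its own `Name:` line as it does in this output.

```python
import re


def gfx_targets(rocminfo_output: str) -> list[str]:
    """Collect the distinct gfx* agent names from rocminfo text, in order."""
    found = []
    # Match only lines of the form "Name: gfxNNNN"; "Marketing Name:" lines and
    # ISA names such as "amdgcn-amd-amdhsa--gfx1100" are deliberately skipped.
    pattern = re.compile(r"^\s*Name:\s*(gfx[0-9a-f]+)\s*$", re.MULTILINE)
    for match in pattern.finditer(rocminfo_output):
        name = match.group(1)
        if name not in found:
            found.append(name)
    return found


# Abbreviated sample copied from the output above.
sample = """\
Agent 1
Name: CPU
Agent 2
Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
"""
print(gfx_targets(sample))  # ['gfx1100']
```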

build log

rocm_gfx1100_wsl2.txt

@Naomiusearch
Contributor

Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

@sorasoras
Author

> Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

I did try to compile, but the patch doesn't seem to work.

@Naomiusearch
Contributor

Naomiusearch commented Oct 19, 2024

What error did you get?

EDIT: Sorry, I have the stupid and didn't notice you gave build logs

@Naomiusearch
Contributor

Naomiusearch commented Oct 19, 2024

Your pytorch is way too out of date. Could you update to 2.5?

@sorasoras
Author

> Your pytorch is way too out of date. Could you update to 2.5?

I updated to PyTorch 2.5, and now it shows:

~/aphrodite-engine$ ./amdpatch.sh
patching file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 397.
1 out of 1 hunk FAILED -- saving rejects to file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h.rej
patch: **** Can't reopen file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h : No such file or directory

@Naomiusearch
Contributor

Naomiusearch commented Oct 22, 2024

So first check if aphrodite compiles. If it fails with something like "__test is ambiguous", you will need to apply the patch yourself. It would also help if you could tell me where __clang_hip_cmath.h is located, since I can't test ROCm 6.1 myself (slow internet); then I could update the script.

@sorasoras
Author

> So first check if aphrodite compiles. If it fails like "__test is ambiguous" or something like that you will need to patch it yourself. It would be nice if you could send where __clang_hip_cmath.h is, since I can't test rocm 6.1 (slow internet), so I could edit the script.

I cannot compile it. When I check that exact folder:

root@x:/opt/rocm/lib/llvm/lib/clang/18/include# dir
__clang_hip_cmath.h.orig __clang_hip_cmath.h.rej

There is no __clang_hip_cmath.h in that folder.

@Naomiusearch
Contributor

Naomiusearch commented Oct 22, 2024

Is there a different version of clang? For example, under /opt/rocm/lib/llvm/lib/clang/

@sorasoras
Author

sorasoras commented Oct 23, 2024

> Is there different version of clang? Like under /opt/rocm/lib/llvm/lib/clang/

clang 17 should work. It looks like 17 is the only version that exists there.
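
The earlier patch attempt targeted `clang/18`, while the header on this install apparently lives under `clang/17`. A small sketch of how one might locate it without hard-coding the version is below. `find_hip_cmath_headers` is a hypothetical name and is not part of `amdpatch.sh`; the layout `<clang dir>/<version>/include/__clang_hip_cmath.h` is an assumption drawn from the paths quoted in this thread.

```python
from pathlib import Path


def find_hip_cmath_headers(llvm_clang_dir: str = "/opt/rocm/lib/llvm/lib/clang") -> list[Path]:
    """Return every <version>/include/__clang_hip_cmath.h under the clang dir.

    Leftover .orig/.rej files from a failed patch run do not match,
    because the glob requires the exact header file name.
    """
    root = Path(llvm_clang_dir)
    return sorted(root.glob("*/include/__clang_hip_cmath.h"))


# Prints one path per clang version that actually ships the header;
# prints nothing if the directory does not exist on this machine.
for header in find_hip_cmath_headers():
    print(header)
```

Whatever path this prints would be the first argument to `patch`, in place of the hard-coded `clang/18` path.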

@Naomiusearch
Contributor

Does this work?
sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

@sorasoras
Author

sorasoras commented Oct 23, 2024

> Does this work?
> sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

Not quite.

cd aphrodite-engine/

root@SORANET:~/aphrodite-engine# sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch
patching file /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 397 with fuzz 2.

I tried to run the build anyway:

export PYTORCH_ROCM_ARCH=gfx1100

python3 setup.py develop
running develop
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing aphrodite_engine.egg-info/PKG-INFO
writing dependency_links to aphrodite_engine.egg-info/dependency_links.txt
writing entry points to aphrodite_engine.egg-info/entry_points.txt
writing requirements to aphrodite_engine.egg-info/requires.txt
writing top-level names to aphrodite_engine.egg-info/top_level.txt
reading manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
running build_ext
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found python matching: /usr/bin/python3.
Building PyTorch for GPU arch: gfx1100
HIP VERSION: 6.1.40093-bd86f1708
-- Caffe2: Header version is: 6.1.3

***** ROCm version from rocm_version.h ****

ROCM_VERSION_DEV: 6.1.3
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 1
ROCM_VERSION_DEV_PATCH: 3
ROCM_VERSION_DEV_INT:   60103
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 1
TORCH_HIP_VERSION: 601

***** Library versions from dpkg *****

rocm-developer-tools VERSION: 6.1.3.60103-122~22.04
rocm-device-libs VERSION: 1.0.0.60103-122~22.04
hsakmt-roct-dev VERSION: 20240125.5.08.60103-122~22.04
hsa-rocr-dev VERSION: 1.13.0.60103-122~22.04

***** Library versions from cmake find_package *****

CMake Error at /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake:80 (message):
  The imported target "hsa-runtime64::hsa-runtime64" references the file

     "/opt/rocm/lib/libhsa-runtime64.so.1.13.60103"

  but this file does not exist.  Possible reasons include:

  * The file was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and contained

     "/opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake"

  but not all the files it references.

Call Stack (most recent call first):
  /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64-config.cmake:82 (include)
  /usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.30/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
  /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:108 (find_dependency)
  /opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:36 (find_package)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:152 (find_package_and_print_version)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:74 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:67 (find_package)


-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/root/aphrodite-engine/setup.py", line 460, in <module>
    setup(
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/root/aphrodite-engine/setup.py", line 210, in build_extensions
    self.configure(ext)
  File "/root/aphrodite-engine/setup.py", line 193, in configure
    subprocess.check_call(
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/root/aphrodite-engine', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/root/aphrodite-engine/build/lib.linux-x86_64-3.10/aphrodite', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-3.10', '-DAPHRODITE_TARGET_DEVICE=cuda', '-DAPHRODITE_PYTHON_EXECUTABLE=/usr/bin/python3', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=32']' returned non-zero exit status 1.
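
The CMake error above names a specific file, `/opt/rocm/lib/libhsa-runtime64.so.1.13.60103`, that the imported target expects but cannot find. A quick way to see what is actually present in that directory is sketched below. `check_cmake_target_file` is a hypothetical helper, and it only inspects the filesystem; it does not read or modify any CMake files.

```python
from pathlib import Path


def check_cmake_target_file(expected: str) -> tuple[bool, list[str]]:
    """Check a file that CMake expects and list its same-prefix siblings.

    Returns (exists, siblings): whether `expected` exists, and the sorted
    names of entries in the same directory sharing its name up to '.so'.
    """
    path = Path(expected)
    prefix = path.name.split(".so")[0] + ".so"  # e.g. 'libhsa-runtime64.so'
    siblings = sorted(p.name for p in path.parent.glob(prefix + "*"))
    return path.exists(), siblings


# Path copied verbatim from the CMake error message.
exists, siblings = check_cmake_target_file("/opt/rocm/lib/libhsa-runtime64.so.1.13.60103")
print("expected file present:", exists)
print("candidates in that directory:", siblings)
```

If the expected file is absent but other `libhsa-runtime64.so*` entries are listed, that would suggest the installed runtime library has a different version suffix than the one the CMake package files reference.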

@Naomiusearch
Contributor

Naomiusearch commented Oct 23, 2024

So the patch actually works, but you're missing a different file for some reason. I'm not sure if it's a WSL thing. You could try the following, though I have no idea if it will work.

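# Find the directory where pip installed torch
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd "${location}/torch/lib/"
# Replace torch's bundled HSA runtime with the one from the ROCm install
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so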

Honestly, I don't recommend using WSL; if you can, you should probably just dual-boot Linux for ROCm.
