[Installation]: I tried to compile for GFX1100 on WSL2 but it does not seem to work #780

sorasoras · 2024-10-16T17:23:38Z

Your current environment

The output of `python env.py`

python env.py

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/home/sora/aphrodite-engine/env.py", line 17, in <module>
    import torch
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
    from .functional import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import * # noqa: F403
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder,
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
PyTorch version: 2.1.2+rocm6.1.3
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.1.40093-bd86f1708

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 7900 XTXNoGCNArchNameOnOldPyTorch
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.1.40093
MIOpen runtime version: 3.1.0
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X3D 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
BogoMIPS: 8399.84
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm
Virtualization: AMD-V
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 512 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 96 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] pytorch-triton-rocm==2.1.0+rocm6.1.3.4d510c3a44
[pip3] torch==2.1.2+rocm6.1.3
[pip3] torchvision==0.16.1+rocm6.1.3
[conda] Could not collect
ROCM Version: 6.1.40093-bd86f1708
Neuron SDK Version: N/A
Aphrodite Version: N/A
Aphrodite Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
root@SORANET:/home/sora/aphrodite-engine# sudo python env.py
(The output is identical to the run above.)
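
The NumPy warning at the top of the output appears because a module in the torch import chain was compiled against NumPy 1.x while numpy 2.1.2 is installed. A minimal sketch of a pre-flight check follows; `needs_numpy_downgrade` is a hypothetical helper name (it is not part of env.py or Aphrodite), and it only encodes the warning's own suggestion of `numpy<2`.

```python
def needs_numpy_downgrade(numpy_version: str) -> bool:
    """Return True when the given NumPy version string is 2.x or newer.

    Hypothetical helper: it only mirrors the advice printed in the warning
    ("downgrade to 'numpy<2'") for modules compiled against NumPy 1.x.
    """
    major = int(numpy_version.split(".")[0])
    return major >= 2


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("numpy")
    except PackageNotFoundError:
        print("numpy is not installed")
    else:
        if needs_numpy_downgrade(installed):
            print(f"numpy {installed} found; consider: pip install 'numpy<2'")
        else:
            print(f"numpy {installed} is a 1.x release")
```

Run it with the same interpreter that imports torch, since `sudo python` and `python` may resolve to different site-packages.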

How did you install Aphrodite?

pip install aphrodite-engine

sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb

sudo amdgpu-install --list-usecase

If --usecase option is not present, the default selection is
"dkms,graphics,opencl,hip"
Available use cases:
dkms (to only install the kernel mode driver)
  - Kernel mode driver (included in all usecases)
graphics (for users of graphics applications)
  - Open source Mesa 3D graphics and multimedia libraries
multimedia (for users of open source multimedia)
  - Open source Mesa 3D multimedia libraries
multimediasdk (for developers of open source multimedia)
  - Open source Mesa 3D multimedia libraries
  - Development headers for multimedia libraries
workstation (for users of legacy WS applications)
  - Open source multimedia libraries
  - Closed source (legacy) OpenGL
rocm (for users and developers requiring full ROCm stack)
  - OpenCL (ROCr/KFD based) runtime
  - HIP runtimes
  - Machine learning framework
  - All ROCm libraries and applications
wsl (for using ROCm in a WSL context)
  - ROCr WSL runtime library (Ubuntu 22.04 only)
rocmdev (for developers requiring ROCm runtime and profiling/debugging tools)
  - HIP runtimes
  - OpenCL runtime
  - Profiler, Tracer and Debugger tools
rocmdevtools (for developers requiring ROCm profiling/debugging tools)
  - Profiler, Tracer and Debugger tools
amf (for users of AMF based multimedia)
  - AMF closed source multimedia library
lrt (for users of applications requiring ROCm runtime)
  - ROCm Compiler and device libraries
  - ROCr runtime and thunk
opencl (for users of applications requiring OpenCL on Vega or later products)
  - ROCr based OpenCL
  - ROCm Language runtime
openclsdk (for application developers requiring ROCr based OpenCL)
  - ROCr based OpenCL
  - ROCm Language runtime
  - development and SDK files for ROCr based OpenCL
hip (for users of HIP runtime on AMD products)
  - HIP runtimes
hiplibsdk (for application developers requiring HIP on AMD products)
  - HIP runtimes
  - ROCm math libraries
  - HIP development libraries
openmpsdk (for users of openmp/flang on AMD products)
  - OpenMP runtime and devel packages
mllib (for users executing machine learning workloads)
  - MIOpen hip/tensile libraries
  - Clang OpenCL
  - MIOpen kernels
mlsdk (for developers executing machine learning workloads)
  - MIOpen development libraries
  - Clang OpenCL development libraries
  - MIOpen kernels
asan (for users of ASAN enabled ROCm packages)
  - ASAN enabled OpenCL (ROCr/KFD based) runtime
  - ASAN enabled HIP runtimes
  - ASAN enabled Machine learning framework
  - ASAN enabled ROCm libraries

rocminfo

HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: NO

==========
HSA Agents

Agent 1

Name: CPU
Uuid: CPU-XX
Marketing Name: CPU
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 49137460(0x2edc734) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 49137460(0x2edc734) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:

Agent 2

Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 16(0x10)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2526
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 2280
SDMA engine uCode:: 21
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25086124(0x17ec8ac) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***

build log

rocm_gfx1100_wsl2.txt

Naomiusearch · 2024-10-18T05:31:57Z

Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

sorasoras · 2024-10-19T11:02:01Z

> Could you compile aphrodite from source? I don't think there's a package for rocm. Here's how to do it.

I did try to compile, but the patch doesn't seem to work.

Naomiusearch · 2024-10-19T11:44:40Z

What error did you get?

EDIT: Sorry, I have the stupid and didn't notice you gave build logs

Naomiusearch · 2024-10-19T12:39:39Z

Your pytorch is way too out of date. Could you update to 2.5?

sorasoras · 2024-10-19T18:41:05Z

> Your pytorch is way too out of date. Could you update to 2.5?

I updated to PyTorch 2.5, and now it shows:

~/aphrodite-engine$ ./amdpatch.sh
patching file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 397.
1 out of 1 hunk FAILED -- saving rejects to file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h.rej
patch: **** Can't reopen file /opt/rocm/lib/llvm/lib/clang/18/include/__clang_hip_cmath.h : No such file or directory

Naomiusearch · 2024-10-22T14:02:52Z

First, check whether aphrodite compiles. If it fails with an error like "__test is ambiguous", you will need to apply the patch yourself. It would also help if you could tell me where __clang_hip_cmath.h is located so that I can update the script; I can't test ROCm 6.1 myself (slow internet).
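
One way to answer the "where is __clang_hip_cmath.h" question is to search the ROCm clang directory for it. This is a rough sketch, not part of amdpatch.sh: `find_header` is a hypothetical helper, and the search root is simply the directory that appears in the patch output above.

```python
from pathlib import Path


def find_header(root: str, name: str = "__clang_hip_cmath.h") -> list[Path]:
    """Return every file named exactly `name` below `root`, sorted by path.

    Leftovers such as `name + ".orig"` or `name + ".rej"` are not matched,
    because the pattern has to match the whole file name.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob(name) if p.is_file())


if __name__ == "__main__":
    # Directory taken from the patch output in this thread; adjust if your
    # ROCm install lives somewhere else.
    matches = find_header("/opt/rocm/lib/llvm/lib/clang")
    if not matches:
        print("header not found")
    for path in matches:
        print(path)
```

The printed path also shows which clang version directory the install actually uses, which is the part the patch script has to get right.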

sorasoras · 2024-10-22T18:27:17Z

> So first check if aphrodite compiles. If it fails like "__test is ambiguous" or something like that you will need to patch it yourself. It would be nice if you could send where __clang_hip_cmath.h is, since I can't test rocm 6.1 (slow internet), so I could edit the script.

I cannot compile it. When I check that exact folder:

root@x:/opt/rocm/lib/llvm/lib/clang/18/include# dir
__clang_hip_cmath.h.orig __clang_hip_cmath.h.rej

There is no __clang_hip_cmath.h in that folder.

Naomiusearch · 2024-10-22T19:25:43Z

Is there a different version of clang, for example under /opt/rocm/lib/llvm/lib/clang/?

sorasoras · 2024-10-23T17:58:38Z

> Is there different version of clang? Like under /opt/rocm/lib/llvm/lib/clang/

clang 17 should work.
It looks like 17 is the only version that exists.

Naomiusearch · 2024-10-23T20:13:09Z

Does this work?
sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

sorasoras · 2024-10-23T20:22:57Z

> Does this work?
> sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch

Not quite.

cd aphrodite-engine/

root@SORANET:~/aphrodite-engine# sudo patch /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h ./patches/amd.patch
patching file /opt/rocm/lib/llvm/lib/clang/17/include/__clang_hip_cmath.h
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 397 with fuzz 2.

I tried to build it anyway:

export PYTORCH_ROCM_ARCH=gfx1100

python3 setup.py develop
running develop
/usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing aphrodite_engine.egg-info/PKG-INFO
writing dependency_links to aphrodite_engine.egg-info/dependency_links.txt
writing entry points to aphrodite_engine.egg-info/entry_points.txt
writing requirements to aphrodite_engine.egg-info/requires.txt
writing top-level names to aphrodite_engine.egg-info/top_level.txt
reading manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'aphrodite_engine.egg-info/SOURCES.txt'
running build_ext
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found python matching: /usr/bin/python3.
Building PyTorch for GPU arch: gfx1100
HIP VERSION: 6.1.40093-bd86f1708
-- Caffe2: Header version is: 6.1.3

***** ROCm version from rocm_version.h ****

ROCM_VERSION_DEV: 6.1.3
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 1
ROCM_VERSION_DEV_PATCH: 3
ROCM_VERSION_DEV_INT:   60103
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 1
TORCH_HIP_VERSION: 601

***** Library versions from dpkg *****

rocm-developer-tools VERSION: 6.1.3.60103-122~22.04
rocm-device-libs VERSION: 1.0.0.60103-122~22.04
hsakmt-roct-dev VERSION: 20240125.5.08.60103-122~22.04
hsa-rocr-dev VERSION: 1.13.0.60103-122~22.04

***** Library versions from cmake find_package *****

CMake Error at /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake:80 (message):
  The imported target "hsa-runtime64::hsa-runtime64" references the file

     "/opt/rocm/lib/libhsa-runtime64.so.1.13.60103"

  but this file does not exist.  Possible reasons include:

  * The file was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and contained

     "/opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64Targets.cmake"

  but not all the files it references.

Call Stack (most recent call first):
  /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64-config.cmake:82 (include)
  /usr/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.30/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
  /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:108 (find_dependency)
  /opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:36 (find_package)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/LoadHIP.cmake:152 (find_package_and_print_version)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:74 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:67 (find_package)


-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/root/aphrodite-engine/setup.py", line 460, in <module>
    setup(
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/usr/lib/python3/dist-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/root/aphrodite-engine/setup.py", line 210, in build_extensions
    self.configure(ext)
  File "/root/aphrodite-engine/setup.py", line 193, in configure
    subprocess.check_call(
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/root/aphrodite-engine', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/root/aphrodite-engine/build/lib.linux-x86_64-3.10/aphrodite', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-3.10', '-DAPHRODITE_TARGET_DEVICE=cuda', '-DAPHRODITE_PYTHON_EXECUTABLE=/usr/bin/python3', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=32']' returned non-zero exit status 1.

Naomiusearch · 2024-10-23T21:27:26Z

So the patch actually works, but you're missing a different file for some reason. I'm not sure if it's a WSL thing. You could try the following, though I have no idea if it will work.

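# Find the directory where pip installed torch
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd "${location}/torch/lib/"
# Replace the HSA runtime bundled with torch by the one from the ROCm install
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so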

Honestly, I don't recommend using WSL. If you can, you should probably just dual-boot Linux for ROCm.
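
Before deleting or copying anything, it may help to see which libhsa-runtime64 files are actually present, since the CMake error complains about a missing /opt/rocm/lib/libhsa-runtime64.so.1.13.60103. The sketch below is only an inspection aid; `list_hsa_runtime` is a hypothetical helper and the directory is the one named in the error message.

```python
from pathlib import Path


def list_hsa_runtime(lib_dir: str) -> dict[str, str]:
    """Map each libhsa-runtime64.so* entry in `lib_dir` to a short status."""
    status = {}
    for entry in sorted(Path(lib_dir).glob("libhsa-runtime64.so*")):
        if entry.is_symlink() and not entry.exists():
            # The link is there but its target is missing.
            status[entry.name] = "broken symlink"
        elif entry.is_symlink():
            status[entry.name] = f"symlink -> {entry.resolve().name}"
        else:
            status[entry.name] = "regular file"
    return status


if __name__ == "__main__":
    # Directory from the CMake error in this thread; pass torch's lib
    # directory instead to inspect the copy bundled with the wheel.
    for name, state in list_hsa_runtime("/opt/rocm/lib").items():
        print(f"{name}: {state}")
```

If the version CMake expects is not in the list, that would be consistent with the "file does not exist" error above.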
