setup issue #5

Open
acejim opened this issue May 21, 2019 · 4 comments

acejim commented May 21, 2019

Hi there,

I have GPGPU-Sim running successfully and followed the steps to run cutlass-test with it, but I get the following output:

GPGPU-Sim PTX: __cudaRegisterFatBinary, fat_cubin_handle = 2, filename=default
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memcpy_3d_deviceImLi1ELi1ELi1EEvPKhPhT_S3_S3_S3_S3_S3_S3_jjjjjjjjS3_S1_S2 : hostFun 0x0x4017f0, fat_cubin_handle = 2
GPGPU-Sim PTX: Parsing cutlass-test.1.sm_70.ptx
GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file cutlass-test.1.sm_70.ptx
GPGPU-Sim PTX: loading globals with explicit initializers...
GPGPU-Sim PTX: finished loading globals (0 bytes total).
GPGPU-Sim PTX: loading constants with explicit initializers... done.
GPGPU-Sim PTX: Loading PTXInfo from cutlass-test.1.sm_70.ptx
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memcpy_3d_deviceImLi1ELi1ELi1EEvPKhPhT_S3_S3_S3_S3_S3_S3_jjjjjjjjS3_S1_S2
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi1EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0 : hostFun 0x0x405dc0, fat_cubin_handle = 2
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi1EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi0EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0 : hostFun 0x0x405fe0, fat_cubin_handle = 2
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi0EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693160; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x6931a0; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x6931e0; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693220; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x692640; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 1992 bytes
GPGPU-Sim PTX registering global __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693140; deviceAddress = cudartErrorTable; deviceName = cudartErrorTable
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 8 bytes
GPGPU-Sim PTX registering global cudartErrorTable hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409000; deviceAddress = cudartErrorTableEntryCount; deviceName = cudartErrorTableEntryCount
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global cudartErrorTableEntryCount hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409020; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 104 bytes
GPGPU-Sim PTX registering global __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693148; deviceAddress = cudartErrorCnpMap; deviceName = cudartErrorCnpMap
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 8 bytes
GPGPU-Sim PTX registering global cudartErrorCnpMap hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409004; deviceAddress = cudartErrorCnpMapEntryCount; deviceName = cudartErrorCnpMapEntryCount
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global cudartErrorCnpMapEntryCount hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693150; deviceAddress = CNPRT_VERSION_NUMBER; deviceName = CNPRT_VERSION_NUMBER
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global CNPRT_VERSION_NUMBER hostVar to name mapping
GPGPU-Sim: *** exit detected ***

Could you tell me how to solve it? Thank you.

@damionfan

use CFLAGS

@sunlex0717

Hi, I also got this error. Have you fixed it?

use CFLAGS

sxzhang1993 commented Jun 3, 2021

Hello, have you resolved this issue?

jielahou commented Oct 29, 2024

First, use CUDA 9.0 rather than CUDA 11.0, and downgrade GCC and G++ (I simply used Ubuntu 16.04, whose default GCC version is 5.4.0). At first I thought this removed the "Warning: cannot find deviceFun" messages shown above. They are in fact still printed, but they do not seem to affect the simulation.
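A quick way to confirm which toolchain is actually on the PATH is sketched below. This is only an illustration: the helper names (parse_nvcc_release, parse_gcc_version, tool_output) are made up for this example, and the parsing assumes the usual "release X.Y" wording in `nvcc --version` output and a dotted version at the end of the first line of `gcc --version`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return (major, minor) from `nvcc --version` output, or None if not found."""
    m = re.search(r"release (\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_gcc_version(text):
    """Return (major, minor, patch) from the first line of `gcc --version`, or None."""
    first_line = text.splitlines()[0] if text else ""
    # The distro tag in parentheses can also contain dotted numbers,
    # so take the last x.y.z triple on the line.
    matches = re.findall(r"(\d+)\.(\d+)\.(\d+)", first_line)
    return tuple(int(part) for part in matches[-1]) if matches else None


def tool_output(cmd):
    """Run a command and return its stdout, or an empty string if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return ""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    print("nvcc release:", parse_nvcc_release(tool_output(["nvcc", "--version"])))
    print("gcc version:", parse_gcc_version(tool_output(["gcc", "--version"])))
```

On the setup described above, this would be expected to report release 9.0 for nvcc and 5.4.0 for gcc; anything else suggests a different toolchain is being picked up first.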

After changing to CUDA 9.0, running this repo's cutlass may produce an error like:

cutlass-test.1.sm_70.ptx:347 Syntax error:

   wmma.load.a.sync.col.m16n16k16.f16 {%r590, %r589, %r588, %r587, %r586, %r585, %r584, %r583}, [%rd66], %r374;

My solution is to modify src/cuda-sim/ptx.l at line 172, adding the lines marked with +:

<INITIAL,NOT_OPCODE,IN_INST,IN_FUNC_DECL>{
\.a\.sync\.aligned TC; yylval->int_value = LOAD_A; return WMMA_DIRECTIVE;
\.b\.sync\.aligned TC; yylval->int_value = LOAD_B; return WMMA_DIRECTIVE;
\.c\.sync\.aligned TC; yylval->int_value = LOAD_C; return WMMA_DIRECTIVE;
\.d\.sync\.aligned TC; yylval->int_value = STORE_D; return WMMA_DIRECTIVE;
+ \.a\.sync TC; yylval->int_value = LOAD_A; return WMMA_DIRECTIVE;
+ \.b\.sync TC; yylval->int_value = LOAD_B; return WMMA_DIRECTIVE;
+ \.c\.sync TC; yylval->int_value = LOAD_C; return WMMA_DIRECTIVE;
+ \.d\.sync TC; yylval->int_value = STORE_D; return WMMA_DIRECTIVE;
\.mma\.sync\.aligned TC;yylval->int_value=MMA; return WMMA_DIRECTIVE;
+ \.mma\.sync TC;yylval->int_value=MMA; return WMMA_DIRECTIVE;

Then it can run in my environment.
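To illustrate why the extra rules help, here is a rough Python emulation of how a flex-style scanner picks among those patterns (longest match wins, and the earlier rule wins a tie). It is a sketch, not the real GPGPU-Sim lexer: the patterns and token names are copied from the excerpt above, but longest_match and the idea of passing in only the text after the wmma.load opcode are simplifications made up for this example.

```python
import re

# Patterns and token values copied from the ptx.l excerpt above.
ORIGINAL_RULES = [
    (r"\.a\.sync\.aligned", "LOAD_A"),
    (r"\.b\.sync\.aligned", "LOAD_B"),
    (r"\.c\.sync\.aligned", "LOAD_C"),
    (r"\.d\.sync\.aligned", "STORE_D"),
    (r"\.mma\.sync\.aligned", "MMA"),
]

# The lines marked with + in the excerpt: same tokens, without ".aligned".
ADDED_RULES = [
    (r"\.a\.sync", "LOAD_A"),
    (r"\.b\.sync", "LOAD_B"),
    (r"\.c\.sync", "LOAD_C"),
    (r"\.d\.sync", "STORE_D"),
    (r"\.mma\.sync", "MMA"),
]


def longest_match(rules, suffix):
    """Return (token, matched_text) for the longest rule matching at the start
    of suffix, or None if no rule matches. Ties go to the earlier rule."""
    best = None
    for pattern, token in rules:
        m = re.match(pattern, suffix)
        if m and (best is None or len(m.group(0)) > len(best[1])):
            best = (token, m.group(0))
    return best


if __name__ == "__main__":
    # Text following "wmma.load" in the instruction from the syntax error above.
    suffix = ".a.sync.col.m16n16k16.f16"
    print(longest_match(ORIGINAL_RULES, suffix))
    print(longest_match(ORIGINAL_RULES + ADDED_RULES, suffix))
    # A spelling that includes ".aligned" still takes the longer rule.
    print(longest_match(ORIGINAL_RULES + ADDED_RULES, ".a.sync.aligned.col.m16n16k16.f16"))
```

With only the original rules, the qualifier in the failing instruction matches nothing in this sketch, which is presumably what surfaces as the syntax error. With the added rules it maps to LOAD_A, while instructions that do spell out .aligned keep matching the longer original rule.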
