ptx_parse() function doesn't return when executing different applications #287

Open
pipijing13 opened this issue Nov 20, 2023 · 3 comments

Comments

@pipijing13

pipijing13 commented Nov 20, 2023

I ran into the same problem described in #160 when executing applications with gpgpu-sim 4.0. Execution stops after this point:

######### cuobjdump parser ########
Adding new section PTX
Adding ptx filename: _cuobjdump_1.ptx
Adding arch: sm_13
Adding identifier: vectoradd.cu
Adding new section ELF
Adding arch: sm_13
Adding identifier: vectoradd.cu
Done parsing!!!
GPGPU-Sim PTX: __cudaRegisterFunction _Z6vecAddPdS_S_i : hostFun 0x0x400c40, fat_cubin_handle = 1//stopped here

After tracing the execution, I found that the ptx_parse() function in ptx_parser.cc never returns, which was also reported in the earlier issue. I tried several applications, including vectoradd and simple_add, and got the same result each time.

Then I ran the applications with gpgpu-sim 3.0, and the problem no longer occurred. A file called _1.ptx was generated, with the following content:

.version 1.4
.target sm_13

.file 1 ""
.file 2 "/tmp/tmpxft_00003cca_00000000-6_vectoradd.cudafe2.gpu"
.file 3 "/sciclone/home/ysun32/packages/gcc-4.5.1/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/include/stddef.h"
.file 4 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/crt/device_runtime.h"
.file 5 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/host_defines.h"
.file 6 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/builtin_types.h"
.file 7 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/device_types.h"
.file 8 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/driver_types.h"
.file 9 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/surface_types.h"
.file 10 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/texture_types.h"
.file 11 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/vector_types.h"
.file 12 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/device_launch_parameters.h"
.file 13 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/crt/storage_class.h"
.file 14 "vectoradd.cu"
.file 15 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/common_functions.h"
.file 16 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/math_functions.h"
.file 17 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/math_constants.h"
.file 18 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/device_functions.h"
.file 19 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_11_atomic_functions.h"
.file 20 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_12_atomic_functions.h"
.file 21 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_13_double_functions.h"
.file 22 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_20_atomic_functions.h"
.file 23 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_20_intrinsics.h"
.file 24 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/sm_30_intrinsics.h"
.file 25 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/surface_functions.h"
.file 26 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/texture_fetch_functions.h"
.file 27 "/sciclone/home/ysun32/packages/cuda4.2/cuda/bin/../include/math_functions_dbl_ptx3.h"

.entry _Z6vecAddPdS_S_i (
.param .u64 __cudaparm__Z6vecAddPdS_S_i_a,
.param .u64 __cudaparm__Z6vecAddPdS_S_i_b,
.param .u64 __cudaparm__Z6vecAddPdS_S_i_c,
.param .s32 __cudaparm__Z6vecAddPdS_S_i_n)
{
.reg .u16 %rh<4>;
.reg .u32 %r<6>;
.reg .u64 %rd<10>;
.reg .f64 %fd<5>;
.reg .pred %p<3>;
.loc 14 21 0
$LDWbegin__Z6vecAddPdS_S_i:
mov.u16 %rh1, %ctaid.x;
mov.u16 %rh2, %ntid.x;
mul.wide.u16 %r1, %rh1, %rh2;
cvt.u32.u16 %r2, %tid.x;
add.u32 %r3, %r2, %r1;
ld.param.s32 %r4, [__cudaparm__Z6vecAddPdS_S_i_n];
setp.le.s32 %p1, %r4, %r3;
@%p1 bra $Lt_0_1026;
.loc 14 26 0
cvt.s64.s32 %rd1, %r3;
mul.wide.s32 %rd2, %r3, 8;
ld.param.u64 %rd3, [__cudaparm__Z6vecAddPdS_S_i_a];
add.u64 %rd4, %rd3, %rd2;
ld.global.f64 %fd1, [%rd4+0];
ld.param.u64 %rd5, [__cudaparm__Z6vecAddPdS_S_i_b];
add.u64 %rd6, %rd5, %rd2;
ld.global.f64 %fd2, [%rd6+0];
add.f64 %fd3, %fd1, %fd2;
ld.param.u64 %rd7, [__cudaparm__Z6vecAddPdS_S_i_c];
add.u64 %rd8, %rd7, %rd2;
st.global.f64 [%rd8+0], %fd3;
$Lt_0_1026:
.loc 14 27 0
exit;
$LDWend__Z6vecAddPdS_S_i:
}
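
Read instruction by instruction, the listing above builds a global index from %ctaid.x, %ntid.x and %tid.x, skips the body when n <= index, and otherwise stores a[i] + b[i] into c[i]. The following is a minimal Python sketch of that behaviour, meant only as a reading aid for the PTX. It is not something GPGPU-Sim runs, and the function and parameter names are made up for illustration.

```python
def vec_add_thread(a, b, c, n, ctaid_x, ntid_x, tid_x):
    """Emulate one thread of the _Z6vecAddPdS_S_i listing (hypothetical helper)."""
    # mul.wide.u16 %r1, %rh1, %rh2  and  add.u32 %r3, %r2, %r1
    i = tid_x + ctaid_x * ntid_x
    # setp.le.s32 %p1, %r4, %r3  and  @%p1 bra $Lt_0_1026  (skip when n <= i)
    if n <= i:
        return
    # ld.global.f64, add.f64, st.global.f64
    c[i] = a[i] + b[i]


def vec_add_grid(a, b, n, grid_dim, block_dim):
    """Run every thread of a 1-D grid sequentially and return the output list."""
    c = [0.0] * n
    for block in range(grid_dim):
        for thread in range(block_dim):
            vec_add_thread(a, b, c, n, block, block_dim, thread)
    return c
```

For example, `vec_add_grid([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], 3, 2, 2)` launches four emulated threads, and the one with index 3 takes the early branch to the exit.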

Then I copied _1.ptx to the folder under gpgpu-sim 4.0, and the applications then ran successfully with gpgpu-sim 4.0 without getting stuck. So I suspect that 4.0 originally got stuck while generating _1.ptx, and that ptx_parse() does not hang when _1.ptx already exists. However, I am not sure whether it is acceptable to use the _1.ptx file generated by gpgpu-sim 3.0. Could anyone help me with this? The output of a run done this way is shown below.

######### cuobjdump parser ########
Adding new section PTX
Adding ptx filename: _cuobjdump_1.ptx
Adding arch: sm_13
Adding identifier: vectoradd.cu
Adding new section ELF
Adding arch: sm_13
Adding identifier: vectoradd.cu
Done parsing!!!
GPGPU-Sim PTX: __cudaRegisterFunction _Z6vecAddPdS_S_i : hostFun 0x0x400c40, fat_cubin_handle = 1//stopped here before
GPGPU-Sim PTX: instruction assembly for function '_Z6vecAddPdS_S_i'... done.
GPGPU-Sim PTX: Warning %rh0 was declared previous at _1.ptx:55 skipping new declaration
GPGPU-Sim PTX: Warning %rh1 was declared previous at _1.ptx:55 skipping new declaration
GPGPU-Sim PTX: Warning %rh2 was declared previous at _1.ptx:55 skipping new declaration
GPGPU-Sim PTX: Warning %rh3 was declared previous at _1.ptx:55 skipping new declaration
GPGPU-Sim PTX: Warning %r0 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %r1 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %r2 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %r3 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %r4 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %r5 was declared previous at _1.ptx:56 skipping new declaration
GPGPU-Sim PTX: Warning %rd0 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd1 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd2 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd3 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd4 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd5 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd6 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd7 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd8 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %rd9 was declared previous at _1.ptx:57 skipping new declaration
GPGPU-Sim PTX: Warning %fd0 was declared previous at _1.ptx:58 skipping new declaration
GPGPU-Sim PTX: Warning %fd1 was declared previous at _1.ptx:58 skipping new declaration
GPGPU-Sim PTX: Warning %fd2 was declared previous at _1.ptx:58 skipping new declaration
GPGPU-Sim PTX: Warning %fd3 was declared previous at _1.ptx:58 skipping new declaration
GPGPU-Sim PTX: Warning %fd4 was declared previous at _1.ptx:58 skipping new declaration
GPGPU-Sim PTX: Warning %p0 was declared previous at _1.ptx:59 skipping new declaration
GPGPU-Sim PTX: Warning %p1 was declared previous at _1.ptx:59 skipping new declaration
GPGPU-Sim PTX: Warning %p2 was declared previous at _1.ptx:59 skipping new declaration
GPGPU-Sim PTX: instruction assembly for function '_Z6vecAddPdS_S_i'... done.
GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file _1.ptx
self exe links to: vectoradd
GPGPU-Sim PTX: extracting embedded .ptx to temporary file "_ptx_xnYjaY"
Running: cat _ptx_xnYjaY | sed 's/.version 1.5/.version 1.4/' | sed 's/, texmode_independent//' | sed 's/(.extern .const[1] .b8 \w+)[]/\1[1]/' | sed 's/const[.]/const[0]/g' > _ptx2_TklIcI
GPGPU-Sim PTX: generating ptxinfo using "$CUDA_INSTALL_PATH/bin/ptxas --gpu-name=sm_13 -v _ptx2_TklIcI --output-file /dev/null 2> _ptx_xnYjaYinfo"
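
On the question of whether the _1.ptx from gpgpu-sim 3.0 can be trusted, one possible check is to compare it with the PTX that the 4.0 run extracts itself (the log names _cuobjdump_1.ptx, assuming that file is left on disk). The sketch below is only an illustration under that assumption: it drops blank lines, // comments and the .file/.loc debugging directives, then diffs what remains. The helper names are hypothetical, and an empty diff would show only that the two texts match, not that the simulator treats them identically.

```python
import difflib


def normalize_ptx(text):
    """Return PTX lines without blank lines, // comments, or .file/.loc directives."""
    kept = []
    for raw in text.splitlines():
        line = " ".join(raw.split("//", 1)[0].split())  # strip comment, collapse spaces
        if not line:
            continue
        if line.split()[0] in (".file", ".loc"):  # debug info only
            continue
        kept.append(line)
    return kept


def ptx_differences(text_a, text_b):
    """Return unified-diff lines between two normalized PTX texts ([] if equal)."""
    a = normalize_ptx(text_a)
    b = normalize_ptx(text_b)
    return list(difflib.unified_diff(a, b, "a", "b", lineterm=""))
```

A possible use is `ptx_differences(open("_1.ptx").read(), open("_cuobjdump_1.ptx").read())`, with both paths adjusted to wherever the files actually are.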

I also noticed that if I type "exit" while the run is stuck in ptx_parse(), it gets past the hang and the application continues executing, but _1.ptx is not generated. The output of a run where I typed "exit" is shown below. Again, I am not sure whether doing this is acceptable, so I would really appreciate it if anyone knows more about it.

######### cuobjdump parser ########
Adding new section PTX
Adding ptx filename: _cuobjdump_1.ptx
Adding arch: sm_13
Adding identifier: vectoradd.cu
Adding new section ELF
Adding arch: sm_13
Adding identifier: vectoradd.cu
Done parsing!!!
GPGPU-Sim PTX: __cudaRegisterFunction _Z6vecAddPdS_S_i : hostFun 0x0x400c40, fat_cubin_handle = 1//stopped here before
exit
_1.ptx:1 Syntax error:

^

GPGPU-Sim PTX: instruction assembly for function '_Z6vecAddPdS_S_i'... done.
GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file _1.ptx
self exe links to: vectoradd
GPGPU-Sim PTX: extracting embedded .ptx to temporary file "_ptx_w9yRta"
Running: cat _ptx_w9yRta | sed 's/.version 1.5/.version 1.4/' | sed 's/, texmode_independent//' | sed 's/(.extern .const[1] .b8 \w+)[]/\1[1]/' | sed 's/const[.]/const[0]/g' > _ptx2_XACqHc
GPGPU-Sim PTX: generating ptxinfo using "$CUDA_INSTALL_PATH/bin/ptxas --gpu-name=sm_13 -v _ptx2_XACqHc --output-file /dev/null 2> _ptx_w9yRtainfo"
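
The run resuming after text is typed, followed by a syntax error reported at _1.ptx:1, is consistent with the parser waiting on standard input, although nothing in this thread confirms that. One way to test the guess is to launch the application with stdin tied to the null device and a timeout: if the hang disappears and the syntax error shows up immediately, the parser was probably reading from the terminal. The following is a minimal sketch of such a launcher. The function name and the default timeout are illustrative assumptions.

```python
import subprocess


def run_with_closed_stdin(cmd, timeout_s=60):
    """Run cmd with stdin set to the null device.

    Returns (timed_out, returncode, output), where output is stdout and
    stderr combined. returncode is None when the timeout expired.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # any read from stdin sees EOF at once
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        if isinstance(out, bytes):      # partial output may arrive as bytes
            out = out.decode(errors="replace")
        return True, None, out
    return False, proc.returncode, proc.stdout
```

A call such as `run_with_closed_stdin(["./vectoradd"])` would be the intended use, where the path to the application binary is an assumption about the local setup.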

@quadpixels

I encountered this error too, and I found it was because I had an extra 0x00 at the end of the PTX byte file. Maybe you can check your file as well?
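
A quick way to check for that is to look at the last bytes of the file. The following is a minimal Python sketch that counts trailing 0x00 bytes and can rewrite the file without them. The helper names are illustrative, and which file is actually affected in a given setup is not established in this thread, so it is worth keeping a backup before rewriting anything.

```python
from pathlib import Path


def count_trailing_nuls(data: bytes) -> int:
    """Return how many 0x00 bytes sit at the very end of data."""
    return len(data) - len(data.rstrip(b"\x00"))


def strip_trailing_nuls(path: str) -> int:
    """Rewrite the file without trailing 0x00 bytes; return how many were removed."""
    p = Path(path)
    data = p.read_bytes()
    extra = count_trailing_nuls(data)
    if extra:
        p.write_bytes(data[: len(data) - extra])
    return extra
```

Calling `count_trailing_nuls(Path("some_file.ptx").read_bytes())` first shows whether there is anything to strip, where the file name is a placeholder.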

@pipijing13

> I encountered this error too, and I found it was because I had an extra 0x00 at the end of the PTX byte file. Maybe you can check your file as well?

Thank you so much! I will try it. Do you mean the file named "_cuobjdump_1.ptx" generated in the application folder?

@yao1yao1yao1

I just started using gpgpu-sim recently. How do I execute different applications on gpgpu-sim? Could you share the complete code for running ispass-2009 RAY and BFS at the same time?
