extend support for CUDA features #3

Open
wants to merge 30 commits into
base: master
Conversation

drossetti
Owner

This branch:

  • rebases dadofixes on top of the MLNX OFED 4.2 version of perftest
  • adds support for multiple CUDA memory allocators and memory hints (cudaMemAdvise)

drossetti and others added 30 commits April 12, 2018 09:58
add error checking for cuMemFree
…. use separate error string for the different pp_init_ functions.
drossetti pushed a commit that referenced this pull request May 12, 2021
The size of the 'struct ibv_qp' allocated by ibv_create_qp is 160 bytes.
If the memory holding the 'struct ibv_qp' was allocated near the
upper boundary of a memory page, casting it to 'struct verbs_qp',
whose size is 360 bytes, may cross the memory page boundary. Reading
the fields beyond that boundary triggers an invalid memory access to
the next memory page.

The issue can be reproduced with OPA and QIB HCAs.

For example, when running over OPA:
 Server Node: $ ib_read_bw -F -N -n 1000 -u 20 -q 257 -s 4194304
 Client Node: $ ib_read_bw -F -N -n 1000 -u 20 -q 257 -s 4194304 <server>

 Program received signal SIGSEGV, Segmentation fault.
 ibv_qp_to_qp_ex (qp=0x5555557a5f10) at libibverbs/verbs.c:624
 624             if (vqp->comp_mask & VERBS_QP_EX)
 (gdb) bt
 #0  ibv_qp_to_qp_ex (qp=0x5555557a5f10) at libibverbs/verbs.c:624
 #1  0x000055555556af4a in create_reg_qp_main (ctx=ctx@entry=0x7fffffffd500, user_param=user_param@entry=0x7fffffffd670, i=i@entry=21, num_of_qps=num_of_qps@entry=128)  at src/perftest_resources.c:1597
 #2  0x000055555556b6d7 in create_qp_main (num_of_qps=<optimized out>, i=21,  user_param=0x7fffffffd670, ctx=0x7fffffffd500) at src/perftest_resources.c:1613
 #3  ctx_init (ctx=0x7fffffffd500, user_param=0x7fffffffd670) at src/perftest_resources.c:1552
 #4  0x0000555555558e9c in main (argc=<optimized out>, argv=<optimized out>) at  src/read_bw.c:149

624             if (vqp->comp_mask & VERBS_QP_EX)
(gdb) p qp
$1 = (struct ibv_qp *) 0x5555557a5f10
(gdb) p vqp
$2 = (struct verbs_qp *) 0x5555557a5f10
(gdb) p *qp
$3 = {context = 0x55555578ad00, qp_context = 0x0, ....
(gdb) p *vqp
Cannot access memory at address 0x5555557a6000

Signed-off-by: Honggang Li <[email protected]>
2 participants