openmp allocators #4

lucaparisi91 · 2024-06-26T16:02:53Z

Show how to allocate mpi shared memory and pinned memory.

shared memory
pinned memory

Examples in custom_allocations

HPE seems to have some Cray-specific allocators, based on the last slide of https://www.openmp.org/wp-content/uploads/2022-04-29-ECP-OMP-Telecon-HPE-Compiler.pdf . There is also some material on allocators at https://doku.lrz.de/files/11497064/11497068/1/1684602040267/OpenMP+Workshop+Day+2.pdf .
Pinned memory is possible, as well as shared memory allocation.

lucaparisi91 · 2024-07-20T09:14:17Z

Custom allocators can be used on the device using the uses_allocators clause (OpenMP 5.1). This is supported in LLVM 18.1 but not in NVIDIA nvhpc 24.5. I could not find a way to allocate a vector using a custom allocator.

Shared memory:

Note that allocate directives that appear in a target region must specify an allocator clause, unless a requires directive with the dynamic_allocators clause is present in the same compilation unit.
One option is to use extensions to the standard.
In LLVM one can write:

int *shared_ptr =
      omp_alloc(N * sizeof(int), llvm_omp_target_shared_mem_alloc);

#pragma omp target is_device_ptr(shared_ptr)
  { ... }

omp_free(shared_ptr, llvm_omp_target_shared_mem_alloc);

Alternatively, use a predefined OpenMP allocator. The array needs to be allocated statically.

double c[BLOCK_SIZE];
#pragma omp allocate(c) allocator(omp_pteam_mem_alloc)

On some compilers (e.g. NVIDIA), statically allocated arrays might be placed in shared memory instead of global memory even without the allocate directive.
Predefined allocators did not seem to be supported on the target with nvhpc 24.5.
The Cray documentation instead suggests that omp_cgroup_mem_alloc should be used.

One can use allocators on the target in conjunction with the uses_allocators clause.

#pragma omp target teams num_teams(1) reduction(+:sum) shared(my_allocator) private(c) uses_allocators(omp_pteam_mem_alloc) allocate(omp_pteam_mem_alloc:c)
{
    ....
}
This is supported in LLVM 18.1 (clang) but not in NVIDIA nvhpc 24.5.
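
As a rough sketch of how the uses_allocators pattern could look in a complete function: the example below stages doubled values in a team-private scratch array placed with omp_pteam_mem_alloc and then sums them. The function name scaled_block_sum, the doubling step, the explicit map clauses and the value of BLOCK_SIZE are assumptions made for illustration, and the shared(my_allocator) clause from the snippet above is left out because my_allocator is not defined in this thread. Whether the scratch array really ends up in GPU shared memory depends on the compiler and runtime.

```c
#ifdef _OPENMP
#include <omp.h>   /* declares omp_pteam_mem_alloc */
#endif

#define BLOCK_SIZE 256   /* illustrative size, not taken from the thread */

/* Hypothetical helper: returns 2 * (a[0] + ... + a[n-1]) for n <= BLOCK_SIZE,
   or -1.0 if n does not fit in the scratch array. */
double scaled_block_sum(const double *a, int n)
{
    if (n < 0 || n > BLOCK_SIZE) {
        return -1.0;
    }

    double sum = 0.0;
    double c[BLOCK_SIZE];   /* statically sized, made private to the team below */

    /* private(c) gives the team its own copy of c; the allocate clause asks
       for that copy to come from the team-local predefined allocator. */
    #pragma omp target teams num_teams(1) map(to: a[0:n]) map(tofrom: sum) \
            reduction(+:sum) private(c) \
            uses_allocators(omp_pteam_mem_alloc) allocate(omp_pteam_mem_alloc: c)
    {
        /* Threads of the team fill the scratch array in parallel. */
        #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            c[i] = 2.0 * a[i];
        }

        /* The initial thread of the team accumulates the staged values. */
        for (int i = 0; i < n; ++i) {
            sum += c[i];
        }
    }
    return sum;
}
```

Calling scaled_block_sum(a, 4) on an array holding 1, 2, 3, 4 should give 20. When built without OpenMP the pragmas are ignored and the function runs serially on the host, which is a convenient way to check the logic before trying a given offload compiler.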

lucaparisi91 · 2024-07-22T08:58:06Z

Pinned Memory

One can define a host memory allocator to allocate pinned memory. This should map to a CUDA pinned-memory allocation. It is better to use the omp_alloc routine, as the allocate clause is still poorly supported across compilers.

omp_memspace_handle_t c_memspace = omp_default_mem_space;
/* request pinned memory with 128-byte alignment */
omp_alloctrait_t c_traits[2] = { { omp_atk_pinned, omp_atv_true }, { omp_atk_alignment, 128 } };
omp_allocator_handle_t c_alloc = omp_init_allocator(c_memspace, 2, c_traits);
c = (double *) omp_alloc(n * sizeof(double), c_alloc);

This seemed to compile but to be ignored by the NVIDIA nvc++ compiler (HPC SDK 24.5).

lucaparisi91 added the openmp-topics label Jun 26, 2024

lucaparisi91 changed the title ~~Shared memory with openmp~~ openmp allocators Jun 26, 2024

lucaparisi91 added the openmp-advanced label Jul 31, 2024

lucaparisi91 mentioned this issue Aug 15, 2024

Advanced Topics #20

Open
