McStas McXtrace 3 and GPU terminology (table)
Peter Willendrup edited this page Nov 30, 2022
Concept | Description |
---|---|
Compute section | Section of the instrument (containing one or more component codes) that performs calculations, either on CPU or in a GPU kernel. |
GPU kernel | A Compute section that runs on GPU, i.e. a section of the instrument (containing one or more component codes) performing calculations on GPU. In non-FUNNEL mode all components are put in one GPU kernel. In FUNNEL mode multiple GPU kernels / Compute sections run in succession, switching from GPU to CPU if needed. |
Calculation modes, parallelism | Description |
---|---|
non-FUNNEL mode (default) | All TRACE code from all components is assembled in a single Compute section / GPU kernel. Each neutron thread is calculated from start to end of the instrument. On CPU this happens in series, on GPU in parallel. |
FUNNEL mode | Calculations are performed for a bunch of neutrons (of size gpu_innerloop) in multiple Compute sections / GPU kernels. Component code is grouped into such sections until either a SPLIT occurs in the instrument or a transition from GPU to CPU (or vice versa) is needed, see CPU COMPONENT and NOACC. |
ncount | Total number of particle histories to simulate through your instrument. |
gpu_innerloop | Used on GPU only. Default and maximum: ~2e9, corresponding to MAX_INT, the maximum number of thread executions in a kernel running on your NVIDIA GPU. When ncount is higher than the gpu_innerloop setting, a series of calculations, each with gpu_innerloop threads, will run in your GPU kernel(s). |
numgangs | Used on GPU only. Default: 7813. Defines the number of thread gangs used for parallelisation in your kernel(s), see the explanation in the OpenACC course. |
vecsize | Used on GPU only. Default: 128. Defines the number of thread vectors used for parallelisation in your kernel(s), see the explanation in the OpenACC course. |
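
The relation between ncount, gpu_innerloop, numgangs and vecsize can be illustrated with a little arithmetic. The following is a minimal sketch, not McStas code: the helper names `n_batches` and `threads_per_launch` are hypothetical, and it assumes that the last batch may be smaller than gpu_innerloop and that a launch provides gangs times vector length threads (one worker per gang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of McStas): number of successive
 * calculations needed to cover ncount histories when each one handles
 * at most gpu_innerloop particles. Uses ceiling division. */
uint64_t n_batches(uint64_t ncount, uint64_t gpu_innerloop)
{
    assert(gpu_innerloop > 0);
    return (ncount + gpu_innerloop - 1) / gpu_innerloop;
}

/* Hypothetical helper: threads offered by one gang/vector configuration,
 * assuming one worker per gang (an assumption of this sketch). */
uint64_t threads_per_launch(uint64_t numgangs, uint64_t vecsize)
{
    return numgangs * vecsize;
}
```

Under these assumptions, an ncount of 1e10 with gpu_innerloop at 2e9 gives 5 successive calculations, and the default numgangs and vecsize (7813 and 128) multiply to 1000064, i.e. roughly one million threads.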
Particle-scope | Description |
---|---|
particle struct | Contains particle state variables and USERVARS as defined in the instrument (e.g. particle-dependent flags). |
x,y,z | Particle position variables. |
vx,vy,vz | Particle velocity variables. |
sx,sy,sz | Particle polarisation variables. |
t | Particle time variable. |
p | Particle weight/intensity variable. |
randstate | RNG state carried independently per particle. |
_uid | Particle "thread number". |
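
As a rough picture of the particle-scope variables above, they can be thought of as fields of one C struct. This is an illustrative sketch only: the type name `particle_sketch`, the size and type of `randstate`, the USERVARS entry `my_flag` and the seeding in `make_particle` are all assumptions of this example, and the struct generated by McStas may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout only; not the struct generated by McStas. */
typedef struct {
    double x, y, z;        /* position */
    double vx, vy, vz;     /* velocity */
    double sx, sy, sz;     /* polarisation */
    double t;              /* time */
    double p;              /* weight / intensity */
    uint32_t randstate[4]; /* per-particle RNG state (size is an assumption) */
    uint64_t _uid;         /* particle "thread number" */
    int my_flag;           /* hypothetical USERVARS entry */
} particle_sketch;

/* Hypothetical constructor: every particle gets its own RNG state derived
 * from its _uid. The seeding is a placeholder, not the McStas RNG. */
particle_sketch make_particle(uint64_t uid, double weight)
{
    assert(weight >= 0.0);
    particle_sketch n = {0};   /* zero position, velocity, spin, time */
    n.p = weight;
    n._uid = uid;
    for (int i = 0; i < 4; i++)
        n.randstate[i] = (uint32_t)(uid * 2654435761u + (uint64_t)i);
    return n;
}
```

Because the RNG state lives inside the particle, two particles never share random-number state, which is what makes each history independent of the order in which threads run.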
Instrument-scope | Description |
---|---|
USERVARS | A new McStas 3.x section of the instrument file. Allows you to define flags / particle-dependent variables, embedded in the particle state. These can be used with the user1 .. user3 options of Monitor_nD and read by other components by means of the particle_getvar() function. |
CPU COMPONENT | Keyword for the instrument-grammar, indicating that computations for this component should happen on CPU. Useful for components that work on both GPU and CPU, but where CPU computation is preferred. Automatically makes the instrument run in FUNNEL mode. |
SPLIT | Keyword for the instrument-grammar indicating that neutrons making it here should be "repeated", preserving total intensity. In non-FUNNEL mode, SPLIT 10 in the grammar means that per incoming particle, 10 identical particles of weight 1/10 will be computed from this point onwards. In FUNNEL mode the repetition is not an input; instead, within the buffer of size gpu_innerloop, any ABSORB'ed particles are replaced by still-active ones. Intensity is preserved. |
JUMP | An instrument-grammar keyword that lets a particle teleport from one component to another (beware, this is a GOTO). Not supported in FUNNEL mode. |
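
The intensity bookkeeping behind SPLIT in non-FUNNEL mode can be sketched in a few lines of C. This is an illustration of the weight arithmetic only, not the code McStas generates; the function name `split_weights` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration of SPLIT n: one incoming particle of weight
 * p_in is repeated n_split times, each copy carrying p_in / n_split.
 * Writes the copy weights to out[0..n_split-1] and returns their sum,
 * which equals p_in up to floating-point rounding. */
double split_weights(double p_in, size_t n_split, double *out)
{
    assert(n_split > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n_split; i++) {
        out[i] = p_in / (double)n_split;
        sum += out[i];
    }
    return sum;
}
```

With SPLIT 10 and an incoming weight of 1, each of the 10 copies carries weight 1/10, so the summed weight after the split is unchanged.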
Component-scope | Description |
---|---|
NOACC | Header-keyword for the component-grammar, indicating that the component cannot run on GPU. Useful for components that include algorithms that cannot be ported to GPU, e.g. those that use external (non-McStas and non-GPU) libraries / functions. Automatically makes the instrument run in FUNNEL mode. |
Component-struct | Data that represents a component instance. Two Guide elements of the same type will have independent structs. |
DECLARE | In components, the DECLARE section must contain single variable declarations, each on a separate line and without an assignment. |
SHARE | Used to define functions and data-types to be used within the component. Functions that are to be used in TRACE on GPU should receive a #pragma acc routine. |
INITIALIZE | Forms a function that fills in runtime parameters in the component struct. |
TRACE | Section in the component that performs calculations on an incoming particle, defining an outgoing particle. |
Thread-safe TRACE | As particles are calculated in a highly parallel fashion on GPU, quantities that depend on the particle state cannot be DECLARE / component variables, since these live in a common scope. You should instead declare these quantities as local variables within the component TRACE. A symptom of a non-thread-safe component is that GPU and CPU results differ, or that GPU results are not reproducible when repeated with the same seed. |
#pragma acc atomic [capture/write/update] | Specifies that the following line of code must be performed by only one GPU thread at a time. Use this e.g. to save data in Monitor histogram arrays and other multi-particle dependent data. |
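
The last three entries can be combined into one small example: a helper marked with `#pragma acc routine`, a per-particle quantity kept in a local variable, and an atomic update of a shared histogram. This is a minimal sketch in plain C with OpenACC pragmas, not an actual McStas component; the names `bin_index`, `fill_histogram` and `NBINS` are hypothetical, the `seq` clause on the routine directive is a choice made for this example, and out-of-range positions are clamped to the edge bins for simplicity. Compiled without OpenACC support, the pragmas are ignored and the loop runs serially with the same result.

```c
#include <assert.h>

#define NBINS 8

/* Helper that may be called from device code, hence the routine directive. */
#pragma acc routine seq
int bin_index(double x, double xmin, double xmax)
{
    int i = (int)((x - xmin) / (xmax - xmin) * NBINS);
    if (i < 0) i = 0;                 /* clamp below range (simplification) */
    if (i >= NBINS) i = NBINS - 1;    /* clamp above range (simplification) */
    return i;
}

/* Accumulate particle weights p[k] at positions x[k] into hist[NBINS]. */
void fill_histogram(const double *x, const double *p, int n, double *hist)
{
    assert(n >= 0);
    #pragma acc parallel loop copyin(x[0:n], p[0:n]) copy(hist[0:NBINS])
    for (int k = 0; k < n; k++) {
        /* Particle-dependent quantity: a local variable, private to the
         * thread, rather than a variable in a common scope. */
        int i = bin_index(x[k], 0.0, 1.0);

        /* Many threads may hit the same bin, so the update is atomic. */
        #pragma acc atomic update
        hist[i] += p[k];
    }
}
```

If `i` were instead a variable shared by all threads, different particles would overwrite each other's bin index, which is the kind of GPU/CPU discrepancy described under Thread-safe TRACE.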