Commit

[oneMKL, DFT] Suggested changes for oneMKL DFT APIs (typo fixes, corrections, revisions, and type-safety-motivated changes) (#593)

* [DFT] Suggested corrections and re-structuring for overall consistency of the oneMKL DFT specs.
Notes:
    - removed explicit reference to "periodic" sequences in intro
    - private conditional type 'real_scalar_t' added to the descriptor class template to alleviate ambiguity in declaration of workspace-related member functions;
    - fixed typo 'oneapi::mkl::dft::*precision*::{REAL,COMPLEX}' in _onemkl_dft_descriptor_template_parameters
    - revised "syntax" parts and _onemkl_dft_descriptor_member_table's description for the constructors
    - added case of "workspace not accessible to the device" in exceptions for the set_workspace member function
    - revised "syntax" parts for the scoped enumeration types
    - fixed typo in step 1 of _onemkl_dft_typical_usage_of_workspace_external
    - unified and fixed namespace ambiguities in illustrative code snippets
    - unified and generalized the use of inline literals where relevant (e.g., referring to types, enum, class, objects, args, etc.), throughout (several internal links removed or slightly rephrased to that end)
    - revised all parts referring to "at construction time" as they were ambiguous w.r.t. the copy and move constructors added in the meantime
    - moved WORKSPACE_EXTERNAL_BYTES to read-only items in config_param
    - completed specification for config_value in page dedicated to scoped enumeration types

* [DFT] Suggested changes to deprecate variadic member functions, clarify their behavior and introduce type-safe substitute overloads

* [DFT] slight rephrasing regarding commit step in introductory page
raphael-egan authored Oct 15, 2024
1 parent 590a1f0 commit 7687188
Showing 8 changed files with 1,366 additions and 984 deletions.
373 changes: 194 additions & 179 deletions source/elements/oneMKL/source/domains/dft/compute_backward.rst

Large diffs are not rendered by default.

364 changes: 189 additions & 175 deletions source/elements/oneMKL/source/domains/dft/compute_forward.rst

Large diffs are not rendered by default.

@@ -4,9 +4,12 @@
.. _onemkl_dft_config_data_layouts:

Configuration of Data Layouts
Configuration of data layouts
-----------------------------

The usage of prepended namespace specifiers ``oneapi::mkl::dft`` is
omitted below for conciseness.

The DFT interface provides the configuration parameters
``config_param::FWD_STRIDES`` (resp. ``config_param::BWD_STRIDES``)
to define the data layout locating entries of relevant data sequences in the
@@ -22,8 +25,8 @@ superscript :math:`\text{fwd}` (resp. :math:`\text{bwd}`) for data sequences
belonging to forward (resp. backward) domain, for any :math:`m` and multi-index
:math:`\left(k_1, k_2, \ldots, k_d\right)` within :ref:`valid
range<onemkl_dft_elementary_range_of_indices>`, the corresponding entry
:math:`\left(\cdot\right)^{m}_{k_{1}, k_{2}, \dots, k_d }` - or the real or
imaginary part thereof - of the relevant data sequence is located at index
:math:`\left(\cdot\right)^{m}_{k_{1}, k_{2}, \dots, k_d }` or the real or
imaginary part thereof of the relevant data sequence is located at index

.. math::
s^{\text{xwd}}_0 + k_1\ s^{\text{xwd}}_1 + k_2\ s^{\text{xwd}}_2 + \dots + k_d\ s^{\text{xwd}}_d + m\ l^{\text{xwd}}
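The index expression above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: ``layout_index`` is a hypothetical helper, not part of oneMKL, and it assumes that ``strides`` holds the :math:`d+1` values :math:`s_0, s_1, \ldots, s_d` while ``distance`` plays the role of :math:`l^{\text{xwd}}`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneMKL function): evaluates
//   s_0 + k_1*s_1 + k_2*s_2 + ... + k_d*s_d + m*l
// strides  : {s_0, s_1, ..., s_d}, i.e., d+1 values
// k        : multi-index {k_1, ..., k_d}
// m        : index of the data sequence within the batch
// distance : l, the distance between successive data sequences
std::int64_t layout_index(const std::vector<std::int64_t>& strides,
                          const std::vector<std::int64_t>& k,
                          std::int64_t m,
                          std::int64_t distance) {
    // A d-dimensional multi-index requires d+1 stride values.
    assert(strides.size() == k.size() + 1);
    std::int64_t index = strides[0] + m * distance;
    for (std::size_t j = 0; j < k.size(); ++j) {
        index += k[j] * strides[j + 1];
    }
    return index;
}
```

For instance, with strides ``{0, 12, 4, 1}`` and a distance of 24, entry :math:`(k_1, k_2, k_3) = (1, 2, 3)` of sequence :math:`m = 2` would be located at index :math:`12 + 8 + 3 + 48 = 71` under these assumptions.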
@@ -61,13 +64,13 @@ forward-domain (resp. backward-domain) data sequences and
.. rubric:: Implicitly-assumed elementary data type

When reading or writing an element at index :eq:`eq_idx_data_layout` of any
user-provided data container used at compute time, a
:ref:`descriptor<onemkl_dft_descriptor>` object may re-interpret the base data
type of that data container into an implicitly-assumed elementary data type.
user-provided data container used at compute time, a ``descriptor`` object may
re-interpret the base data type of that data container into an
implicitly-assumed elementary data type.
That implicitly-assumed data type depends on the object type, *i.e.*, on the
specialization values used for the template parameters when instantiating the
:ref:`descriptor<onemkl_dft_descriptor>` class, and, in case of complex
descriptors, on the configuration value set for its configuration parameter
``descriptor`` :ref:`class template<onemkl_dft_descriptor>`, and, in case of
complex descriptors, on the configuration value set for its configuration parameter
``config_param::COMPLEX_STORAGE``. The table below lists the implicitly-assumed
data type in either domain (last 2 columns) based on the object type and
its configuration value for ``config_param::COMPLEX_STORAGE`` (first 2 columns).
@@ -213,59 +216,59 @@ configuration parameter ``config_param::INPUT_STRIDES`` if
The values of :math:`s^{\text{i}}_{j}` and :math:`s^{\text{o}}_{j}` are to be
used and considered by oneMKL if and only if
:math:`s^{\text{fwd}}_{j} = s^{\text{bwd}}_{j} = 0, \forall j \in \lbrace 0, 1, \ldots, d\rbrace`.
(This will happen automatically if ``config_param::INPUT_STRIDES`` and ``config_param::OUTPUT_STRIDES``
are set and ``config_param::FWD_STRIDES`` and ``config_param::BWD_STRIDES`` are not. See note below.)
In such a case, :ref:`descriptor<onemkl_dft_descriptor>` objects must consider
the data layouts corresponding to the two compute directions separately. As
detailed above, relevant data sequence entries are accessed as elements of data
containers (``sycl::buffer`` objects or device-accessible USM allocations)
provided to the compute function, the base data type of which is (possibly
implicitly re-interpreted) as documented in :ref:`this
table<onemkl_dft_config_data_implicitly_assumed_elementary_data_type>`. If using
input and output strides, for any :math:`m` and multi-index
This will happen automatically if ``config_param::INPUT_STRIDES`` and
``config_param::OUTPUT_STRIDES`` are set and ``config_param::FWD_STRIDES`` and
``config_param::BWD_STRIDES`` are not (see note below).
In such a case, ``descriptor`` objects must consider the data layouts
corresponding to the two compute directions separately. As detailed above,
relevant data sequence entries are accessed as elements of data containers
(``sycl::buffer`` objects or device-accessible USM allocations) provided to the
compute function, the base data type of which is (possibly implicitly re-interpreted)
as documented in the above
:ref:`table<onemkl_dft_config_data_implicitly_assumed_elementary_data_type>`. If
using input and output strides, for any :math:`m` and multi-index
:math:`\left(k_1, k_2, \ldots, k_d\right)` within :ref:`valid
range<onemkl_dft_elementary_range_of_indices>`, the index to be used when
accessing a data sequence entry - or part thereof - in forward domain is
accessing a data sequence entry or part thereof in forward domain is

.. math::
s^{\text{x}}_0 + k_1\ s^{\text{x}}_1 + k_2\ s^{\text{x}}_2 + \dots + k_d\ s^{\text{x}}_d + m\ l^{\text{fwd}}
where :math:`\text{x} = \text{i}` (resp. :math:`\text{x} = \text{o}`) for
forward (resp. backward) DFT(s). Similarly, the index to be used when accessing
a data sequence entry - or part thereof - in backward domain is
a data sequence entry or part thereof in backward domain is

.. math::
s^{\text{x}}_0 + k_1\ s^{\text{x}}_1 + k_2\ s^{\text{x}}_2 + \dots + k_d\ s^{\text{x}}_d + m\ l^{\text{bwd}}
where :math:`\text{x} = \text{o}` (resp. :math:`\text{x} = \text{i}`) for
forward (resp. backward) DFT(s).
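
The selection rule described above (input and output strides are considered only if all forward and backward strides are zero, and their role depends on the compute direction) can be sketched as follows. The types ``stride_config``, ``direction`` and ``domain`` and the function ``effective_strides`` are hypothetical names introduced for illustration only; they are not part of the oneMKL interface, and an actual implementation may organize this logic differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using strides_t = std::vector<std::int64_t>;

// Hypothetical types, for illustration only (not oneMKL declarations).
enum class direction { forward, backward };  // compute direction
enum class domain { forward, backward };     // domain of the accessed data

// Hypothetical mirror of the four stride-related configuration values.
struct stride_config {
    strides_t fwd, bwd, input, output;
};

bool all_zero(const strides_t& s) {
    return std::all_of(s.begin(), s.end(),
                       [](std::int64_t v) { return v == 0; });
}

// Returns the strides locating entries of domain `dom` when computing in
// direction `dir`.
const strides_t& effective_strides(const stride_config& c, direction dir,
                                   domain dom) {
    assert(c.fwd.size() == c.bwd.size());
    assert(c.input.size() == c.output.size());
    // Input/output strides are considered only if fwd and bwd are all zero.
    if (!(all_zero(c.fwd) && all_zero(c.bwd))) {
        return dom == domain::forward ? c.fwd : c.bwd;
    }
    // A forward DFT reads forward-domain data (input) and writes
    // backward-domain data (output); a backward DFT does the opposite.
    const bool domain_is_input =
        (dir == direction::forward) == (dom == domain::forward);
    return domain_is_input ? c.input : c.output;
}
```

With such a rule, the same pair of input and output strides maps to different domains depending on the compute direction, which is the source of the direction dependence of these deprecated parameters.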

As a consequence, configuring :ref:`descriptor<onemkl_dft_descriptor>` objects
using these deprecated configuration parameters makes their configuration
direction-dependent when different stride values are used in
forward and backward domains. Since the intended compute direction is unknown
to the :ref:`descriptor<onemkl_dft_descriptor>` object when
As a consequence, configuring ``descriptor`` objects using these deprecated
configuration parameters makes their configuration direction-dependent when
different stride values are used in forward and backward domains. Since the
intended compute direction is unknown to the object when
:ref:`committing<onemkl_dft_descriptor_commit>` it, every direction that results
in a :ref:`consistent data layout<onemkl_dft_data_layout_requirements>` in
forward and backward domains must be supported by successfully committed
:ref:`descriptor<onemkl_dft_descriptor>` objects.
forward and backward domains must be supported by successfully-committed
``descriptor`` objects.

.. note::
For :ref:`descriptor<onemkl_dft_descriptor>` objects with strides configured
via these deprecated configuration parameters, the :ref:`consistency
requirements<onemkl_dft_data_layout_requirements>` may be satisfied for only
one of the two compute directions, *i.e.*, for only one of the forward or
backward DFT(s). Such a configuration should not cause an exception to be
thrown by the descriptor's :ref:`onemkl_dft_descriptor_commit` member
function but the behavior of oneMKL is undefined if using that object for
the compute direction that does not align with the :ref:`consistency
requirements<onemkl_dft_data_layout_requirements>`.
For ``descriptor`` objects with strides configured via these deprecated
configuration parameters, the
:ref:`consistency requirements<onemkl_dft_data_layout_requirements>` may be
satisfied for only one of the two compute directions, *i.e.*, for only one
of the forward or backward DFT(s). Such a configuration should not cause an
exception to be thrown by the descriptor's ``commit``
:ref:`member function<onemkl_dft_descriptor_commit>` but the behavior of
oneMKL is undefined if using that object for the compute direction that does
not align with the :ref:`consistency requirements<onemkl_dft_data_layout_requirements>`.

.. note::
Setting either of ``config_param::INPUT_STRIDES`` or
``config_param::OUTPUT_STRIDES`` triggers any default or previously-set
values for ``config_param::FWD_STRIDES`` and ``config_param::BWD_STRIDES``
to reset to ``std::vector<std::int64_t>(d+1, 0)`` values, and vice versa.
to reset to ``std::vector<std::int64_t>(d+1, 0)``, and vice versa.
This default behavior prevents mix-and-matching usage of either of
``config_param::INPUT_STRIDES`` or ``config_param::OUTPUT_STRIDES`` with
either of ``config_param::FWD_STRIDES`` or ``config_param::BWD_STRIDES``,
@@ -282,14 +285,15 @@ the reverse direction as shown below.

.. code-block:: cpp
namespace dft = oneapi::mkl::dft;
// ...
desc.set_value(config_param::INPUT_STRIDES, fwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, bwd_domain_strides);
desc.set_value(dft::config_param::INPUT_STRIDES, fwd_domain_strides);
desc.set_value(dft::config_param::OUTPUT_STRIDES, bwd_domain_strides);
desc.commit(queue);
compute_forward(desc, ...);
// ...
desc.set_value(config_param::INPUT_STRIDES, bwd_domain_strides);
desc.set_value(config_param::OUTPUT_STRIDES, fwd_domain_strides);
desc.set_value(dft::config_param::INPUT_STRIDES, bwd_domain_strides);
desc.set_value(dft::config_param::OUTPUT_STRIDES, fwd_domain_strides);
desc.commit(queue);
compute_backward(desc, ...);
@@ -7,10 +7,13 @@
Data storage
============

The data storage convention observed by a
:ref:`descriptor<onemkl_dft_descriptor>` object depends on whether it is a real
or complex descriptor and, in case of complex descriptors, on the configuration
value associated with configuration parameter ``config_param::COMPLEX_STORAGE``.
The usage of prepended namespace specifiers ``oneapi::mkl::dft`` is
omitted below for conciseness.

The data storage convention observed by a ``descriptor`` object depends on
whether it is a real or complex descriptor and, in case of complex descriptors,
on the configuration value associated with configuration parameter
``config_param::COMPLEX_STORAGE``.

.. _onemkl_dft_complex_storage:

@@ -24,14 +27,12 @@ associated with a configuration value ``config_value::COMPLEX_COMPLEX`` (default
behavior), those entries are accessed and stored as ``std::complex<float>``
(resp. ``std::complex<double>``) elements of a single data container
(device-accessible USM allocation or ``sycl::buffer`` object) if the
:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
double-precision) descriptor. If the configuration value
``config_value::REAL_REAL`` is used instead, the real and imaginary parts of
those entries are accessed and stored as ``float`` (resp. ``double``) elements
of two separate, non-overlapping data containers (device-accessible USM
allocations or ``sycl::buffer`` objects) if the
:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
double-precision) descriptor.
``descriptor`` object is a single-precision (resp. double-precision) descriptor.
If the configuration value ``config_value::REAL_REAL`` is used instead, the real
and imaginary parts of those entries are accessed and stored as ``float`` (resp.
``double``) elements of two separate, non-overlapping data containers
(device-accessible USM allocations or ``sycl::buffer`` objects) if the
``descriptor`` object is a single-precision (resp. double-precision) descriptor.

These two behaviors are further specified and illustrated below.
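
As a host-side illustration of the difference between the two storage conventions, the sketch below converts between a single container of ``std::complex<float>`` elements (as with ``config_value::COMPLEX_COMPLEX``) and two separate ``float`` containers holding real and imaginary parts at the same index (as with ``config_value::REAL_REAL``). The function names are hypothetical, and ``std::vector`` merely stands in for the USM allocations or ``sycl::buffer`` objects used in practice.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: splits a single container of complex elements into
// two containers holding the real and imaginary parts at the same index.
std::pair<std::vector<float>, std::vector<float>>
to_split_storage(const std::vector<std::complex<float>>& z) {
    std::vector<float> re(z.size());
    std::vector<float> im(z.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        re[i] = z[i].real();
        im[i] = z[i].imag();
    }
    return {std::move(re), std::move(im)};
}

// Hypothetical helper: the reverse conversion, from two real containers to
// a single container of complex elements.
std::vector<std::complex<float>>
to_interleaved_storage(const std::vector<float>& re,
                       const std::vector<float>& im) {
    assert(re.size() == im.size());
    std::vector<std::complex<float>> z(re.size());
    for (std::size_t i = 0; i < re.size(); ++i) {
        z[i] = std::complex<float>(re[i], im[i]);
    }
    return z;
}
```

In both conventions, a given entry is found at the same index value; only the number and the elementary data type of the containers differ.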

@@ -45,20 +46,19 @@ sequences must belong to a single data container (device-accessible USM
allocation or ``sycl::buffer`` object). Any relevant entry
:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` is accessed/stored from/in
a data container provided at compute time at the index value expressed in eq.
:eq:`eq_idx_data_layout` (from :ref:`this page<onemkl_dft_config_data_layouts>`)
:eq:`eq_idx_data_layout` (see the page dedicated to the
:ref:`configuration of data layout<onemkl_dft_config_data_layouts>`)
of that data container, whose elementary data type is (possibly implicitly
re-interpreted as) ``std::complex<float>`` (resp. ``std::complex<double>``) for
single-precision (resp. double-precision) descriptors.

The same unique data container is to be used for forward- and backward-domain
data sequences for in-place transforms (for
:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
``config_value::INPLACE`` for configuration parameter
data sequences for in-place transforms (for ``descriptor`` objects with
configuration value ``config_value::INPLACE`` for configuration parameter
``config_param::PLACEMENT``). Two separate data containers sharing no common
elements are to be used for out-of-place transforms (for
:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
``config_value::NOT_INPLACE`` for configuration parameter
``config_param::PLACEMENT``).
elements are to be used for out-of-place transforms (for ``descriptor`` objects
with configuration value ``config_value::NOT_INPLACE`` for configuration
parameter ``config_param::PLACEMENT``).

The following snippet illustrates the usage of ``config_value::COMPLEX_COMPLEX``
for configuration parameter ``config_param::COMPLEX_STORAGE``, in the
@@ -84,8 +84,8 @@ USM allocations.
// initialize forward-domain data such that entry {m;k1,k2,k3}
// = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
compute_forward(desc, Z); // complex-to-complex in-place DFT
// in backward domain: entry {m;k1,k2,k3}
auto ev = compute_forward(desc, Z); // complex-to-complex in-place DFT
// Upon completion of ev, in backward domain: entry {m;k1,k2,k3}
// = Z[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
.. _onemkl_dft_complex_storage_real_real:
@@ -98,21 +98,20 @@ read/stored from/in two different, non-overlapping data containers
(device-accessible USM allocations or ``sycl::buffer`` objects) encapsulating
the real and imaginary parts of the relevant entries separately. The real and
imaginary parts of any relevant complex entry
:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` are both stored at the index value
expressed in eq. :eq:`eq_idx_data_layout` (from :ref:`this
page<onemkl_dft_config_data_layouts>`) of their respective data containers, whose elementary
data type is (possibly implicitly re-interpreted as) ``float`` (resp.
``double``) for single-precision (resp. double-precision) descriptors.
:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` are both stored at the
index value expressed in eq. :eq:`eq_idx_data_layout` (see the page dedicated to
the :ref:`configuration of data layout<onemkl_dft_config_data_layouts>`) of
their respective data containers, whose elementary data type is (possibly
implicitly re-interpreted as) ``float`` (resp. ``double``) for single-precision
(resp. double-precision) descriptors.

The same two data containers are to be used for real and imaginary parts of
forward- and backward-domain data sequences for in-place transforms (for
:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
``config_value::INPLACE`` for configuration parameter
``config_param::PLACEMENT``). Four separate data containers sharing no common
elements are to be used for out-of-place transforms (for
:ref:`descriptor<onemkl_dft_descriptor>` objects with configuration value
``config_value::NOT_INPLACE`` for configuration parameter
``config_param::PLACEMENT``).
``descriptor`` objects with configuration value ``config_value::INPLACE`` for
configuration parameter ``config_param::PLACEMENT``). Four separate data
containers sharing no common elements are to be used for out-of-place transforms
(for ``descriptor`` objects with configuration value ``config_value::NOT_INPLACE``
for configuration parameter ``config_param::PLACEMENT``).

The following snippet illustrates the usage of ``config_value::REAL_REAL``
set for configuration parameter ``config_param::COMPLEX_STORAGE``, in the
@@ -141,8 +140,8 @@ USM allocations.
// = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
// = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
// in backward domain: the real part of entry {m;k1,k2,k3}
auto ev = compute_forward<decltype(desc), float>(desc, ZR, ZI); // complex-to-complex in-place DFT
// Upon completion of ev, in backward domain: the real part of entry {m;k1,k2,k3}
// = ZR[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
// and the imaginary part of entry {m;k1,k2,k3}
// = ZI[ strides[0] + k1*strides[1] + k2*strides[2] + k3*strides[3] + m*dist ]
@@ -156,14 +155,13 @@ Real descriptors observe only one type of data storage. Any relevant (real)
entry :math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data sequence
in forward domain is accessed and stored as a ``float`` (resp. ``double``)
element of a single data container (device-accessible USM allocation or
``sycl::buffer`` object) if the :ref:`descriptor<onemkl_dft_descriptor>` object
is a single-precision (resp. double-precision) descriptor. Any relevant
(complex) entry :math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data
sequence in backward domain is accessed and stored as a ``std::complex<float>``
(resp. ``std::complex<double>``) element of a single data container
(device-accessible USM allocation or ``sycl::buffer`` object) if the
:ref:`descriptor<onemkl_dft_descriptor>` object is a single-precision (resp.
double-precision) descriptor.
``sycl::buffer`` object) if the ``descriptor`` object is a single-precision
(resp. double-precision) descriptor. Any relevant (complex) entry
:math:`\left(\cdot\right)^{m}_{k_1, k_2,\dots ,k_d}` of a data sequence in
backward domain is accessed and stored as a ``std::complex<float>`` (resp.
``std::complex<double>``) element of a single data container (device-accessible
USM allocation or ``sycl::buffer`` object) if the
``descriptor`` object is a single-precision (resp. double-precision) descriptor.

The following snippet illustrates the usage of a real, single-precision
descriptor (and the corresponding data storage) for the in-place,
@@ -190,12 +188,13 @@ forward and backward domains, with USM allocations.
// initialize forward-domain data such that real entry {m;k1,k2,k3}
// = data[ fwd_strides[0] + k1*fwd_strides[1] + k2*fwd_strides[2] + k3*fwd_strides[3] + m*fwd_dist ]
compute_forward(desc, data); // real-to-complex in-place DFT
// in backward domain, the implicitly-assumed type is complex so, considering
auto ev = compute_forward(desc, data); // real-to-complex in-place DFT
// In backward domain, the implicitly-assumed type is complex so, consider
// std::complex<float>* complex_data = reinterpret_cast<std::complex<float>*>(data);
// we have entry {m;k1,k2,k3}
// upon completion of ev, the backward-domain entry {m;k1,k2,k3} is
// = complex_data[ bwd_strides[0] + k1*bwd_strides[1] + k2*bwd_strides[2] + k3*bwd_strides[3] + m*bwd_dist ]
// for 0 <= k3 <= n3/2.
// Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} = std::conj(entry {m;n1-k1,n2-k2,n3-k3})
// Note: if n3/2 < k3 < n3, entry {m;k1,k2,k3} is not stored explicitly
// since it is equal to std::conj(entry {m;n1-k1,n2-k2,n3-k3})
**Parent topic** :ref:`onemkl_dft_enums`
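
The note on entries that are not stored explicitly can be illustrated with a simplified one-dimensional sketch. The helper ``backward_entry`` below is hypothetical (it is not a oneMKL function) and assumes that ``half`` holds the entries :math:`k = 0, \ldots, \lfloor n/2 \rfloor` of the backward-domain sequence of a real DFT of length :math:`n`, stored contiguously; the remaining entries are recovered by complex conjugation. In the multi-dimensional case, the mirrored indices would presumably be taken modulo the corresponding lengths.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (1D simplification): returns backward-domain entry k
// of a length-n real DFT, given only the explicitly stored entries
// half[0], ..., half[n/2].
std::complex<float> backward_entry(const std::vector<std::complex<float>>& half,
                                   std::int64_t n, std::int64_t k) {
    assert(n > 0 && 0 <= k && k < n);
    assert(static_cast<std::int64_t>(half.size()) == n / 2 + 1);
    if (k <= n / 2) {
        return half[static_cast<std::size_t>(k)];  // stored explicitly
    }
    // Not stored: recovered from the conjugate-even symmetry.
    return std::conj(half[static_cast<std::size_t>(n - k)]);
}
```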