Merge pull request #527 from DiamonDinoia/perftests
Adding perftest.  This adds performance testers for devs, not part of CI. It adds a documentation page on perf changes from 2.2. to 2.3.
ahbarnett authored Sep 5, 2024
2 parents ed79798 + 97bfe85 commit f7b062b
Showing 41 changed files with 714 additions and 10 deletions.
11 changes: 5 additions & 6 deletions docs/index.rst
@@ -13,23 +13,24 @@ Flatiron Institute Nonuniform Fast Fourier Transform


Documentation contents
========================
========================

.. toctree::
:maxdepth: 3

install
install_gpu
performance
dirs
math
cex
cex
c
c_gpu
opts
error
trouble
tut
fortran
fortran
matlab
python
python_gpu
@@ -42,5 +43,3 @@ Documentation contents
users
ackn
refs


112 changes: 112 additions & 0 deletions docs/performance.rst
@@ -0,0 +1,112 @@
Performance
============

This page shows the performance of successive versions of FINUFFT, starting from version 2.2.0. The goal is to ensure that performance does not regress between releases.
The results also serve as a guide for choosing the best compile-time configuration (compiler, flags, FFT implementation) and the best runtime parameters (upsampling factor, number of threads).
Note that performance depends on many parameters, notably: number of dimensions, problem size, digits of accuracy requested, upsampling factor, CPU used, compiler flags, and SIMD instructions.
Because the number of combinations grows exponentially with the number of parameters, it is not possible to test every configuration.
Hence, we selected some use cases from this `GitHub discussion <https://github.com/flatironinstitute/finufft/discussions/398>`_.
If none of the following examples matches your use case, joining that discussion is the best way to reach us; the benchmarks may then be updated to cover what most users need.
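
To make the combinatorial growth concrete, here is a minimal Python sketch that enumerates one small grid of configurations. The file-name pattern is copied from the plot names used on this page; the helper name ``plot_names`` and the choice of grid are illustrative assumptions and are not taken from ``bench.py``.

```python
from itertools import product


def plot_names(size, types, upsampfacs, precisions, threads):
    """Yield one plot file name per configuration (hypothetical helper).

    The pattern mirrors the image names on this page, e.g.
    10000x1x1-type-1-upsamp2.00-precf-thread1.png
    """
    for t, u, p, n in product(types, upsampfacs, precisions, threads):
        yield f"{size}-type-{t}-upsamp{u:.2f}-prec{p}-thread{n}.png"


# Single-threaded 1D grid: 3 types x 2 upsampling factors x 2 precisions.
names_1d = list(plot_names("10000x1x1", (1, 2, 3), (1.25, 2.0), ("f", "d"), (1,)))
print(len(names_1d))  # 12
print(names_1d[0])
```

Every additional axis (problem size, tolerance, thread count, compiler) multiplies this count, which is why only a few representative cases are plotted.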

A separate `GitHub discussion <https://github.com/flatironinstitute/finufft/discussions/452>`__ shows the performance of the spreader/interpolator with different compilers and gives more insight into which one might be faster for a specific CPU.


The CPU used for all benchmarks is an Intel(R) Xeon(R) w5-3435X; the compiler is GCC 13.2.0.
The compiler flags are the defaults from the ``CMakeLists.txt`` of the version tested; we only impose a Release build and ``-march=native``.
The title of each image lists the parameters used:

- prec: precision (f = float, d = double)
- N(x): size along one axis
- M: number of non-uniform points
- type: transform type (1, 2 or 3)

The other parameters are the same as those in ``finufft_opts``.
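
The plot file names on this page appear to encode the same parameters as the titles. As a minimal sketch, assuming that naming pattern holds for every plot (it is inferred from the names listed here, not from ``bench.py``), a name can be split back into its parameters in Python. The function ``parse_plot_name`` and the rule that an axis of length 1 is unused are hypothetical.

```python
import re

# Pattern inferred from the plot file names on this page (an assumption).
_NAME = re.compile(
    r"(?P<n1>\d+)x(?P<n2>\d+)x(?P<n3>\d+)"
    r"-type-(?P<type>[123])"
    r"-upsamp(?P<upsampfac>\d+\.\d+)"
    r"-prec(?P<prec>[fd])"
    r"-thread(?P<threads>\d+)\.png"
)


def parse_plot_name(name):
    """Split a plot file name into its benchmark parameters."""
    m = _NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognised plot name: {name!r}")
    modes = tuple(int(m[k]) for k in ("n1", "n2", "n3"))
    return {
        "modes": modes,
        # Assumption: an axis of length 1 means that dimension is unused.
        "dim": sum(n > 1 for n in modes),
        "type": int(m["type"]),
        "upsampfac": float(m["upsampfac"]),
        "precision": "float" if m["prec"] == "f" else "double",
        "threads": int(m["threads"]),
    }


info = parse_plot_name("320x320x1-type-2-upsamp1.25-precf-thread32.png")
print(info)
```

This can help when sorting or filtering a directory of generated plots by dimension, precision or thread count.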

To generate the results, run ``bench.py``, found in the ``perftest`` directory, unmodified. It requires ``numpy``, ``pandas`` and ``matplotlib``.
The script assumes a bash-like shell and might not work on Windows.

.. warning::

   Do NOT run the script from inside the finufft git directory, as it will mess up the git directory and fail!
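
One way to guard against this mistake is to check the working directory before launching the script. The following is a minimal Python sketch, assuming a git checkout can be recognised by a ``.git`` entry in the directory or one of its parents; the function ``inside_git_checkout`` is hypothetical and is not part of ``bench.py``.

```python
from pathlib import Path


def inside_git_checkout(path="."):
    """Return True if `path` or any parent directory contains a `.git` entry."""
    start = Path(path).resolve()
    return any((d / ".git").exists() for d in (start, *start.parents))


# Report whether the current working directory looks like a git checkout.
print(inside_git_checkout())
```

If this prints ``True``, change to a directory outside any repository before running the benchmarks.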

1D transforms
---------------------------------------------

Type 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/10000x1x1-type-1-upsamp2.00-precf-thread1.png
.. image:: pics/10000x1x1-type-1-upsamp1.25-precf-thread1.png
.. image:: pics/10000x1x1-type-1-upsamp2.00-precd-thread1.png
.. image:: pics/10000x1x1-type-1-upsamp1.25-precd-thread1.png

Type 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/10000x1x1-type-2-upsamp1.25-precd-thread1.png
.. image:: pics/10000x1x1-type-2-upsamp1.25-precf-thread1.png
.. image:: pics/10000x1x1-type-2-upsamp2.00-precd-thread1.png
.. image:: pics/10000x1x1-type-2-upsamp2.00-precf-thread1.png

Type 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/10000x1x1-type-3-upsamp1.25-precd-thread1.png
.. image:: pics/10000x1x1-type-3-upsamp1.25-precf-thread1.png
.. image:: pics/10000x1x1-type-3-upsamp2.00-precd-thread1.png
.. image:: pics/10000x1x1-type-3-upsamp2.00-precf-thread1.png

2D transforms
---------------------------------------------
Type 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. image:: pics/320x320x1-type-1-upsamp1.25-precf-thread1.png
.. image:: pics/320x320x1-type-1-upsamp1.25-precd-thread1.png
.. image:: pics/320x320x1-type-1-upsamp2.00-precf-thread1.png
.. image:: pics/320x320x1-type-1-upsamp2.00-precd-thread1.png

Type 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/320x320x1-type-2-upsamp1.25-precf-thread1.png
.. image:: pics/320x320x1-type-2-upsamp1.25-precd-thread1.png
.. image:: pics/320x320x1-type-2-upsamp2.00-precf-thread1.png
.. image:: pics/320x320x1-type-2-upsamp2.00-precd-thread1.png

Type 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/320x320x1-type-3-upsamp1.25-precf-thread1.png
.. image:: pics/320x320x1-type-3-upsamp1.25-precd-thread1.png
.. image:: pics/320x320x1-type-3-upsamp2.00-precf-thread1.png
.. image:: pics/320x320x1-type-3-upsamp2.00-precd-thread1.png

2D transforms Multi-Threaded
---------------------------------------------

Type 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/320x320x1-type-1-upsamp1.25-precf-thread32.png
.. image:: pics/320x320x1-type-1-upsamp2.00-precf-thread32.png

Type 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/320x320x1-type-2-upsamp1.25-precf-thread32.png
.. image:: pics/320x320x1-type-2-upsamp2.00-precf-thread32.png

Type 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/320x320x1-type-3-upsamp1.25-precf-thread32.png
.. image:: pics/320x320x1-type-3-upsamp2.00-precf-thread32.png

3D transforms Multi-Threaded (float32)
---------------------------------------------

Type 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/192x192x128-type-1-upsamp1.25-precf-thread32.png
.. image:: pics/192x192x128-type-1-upsamp2.00-precf-thread32.png

Type 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/192x192x128-type-2-upsamp1.25-precf-thread32.png
.. image:: pics/192x192x128-type-2-upsamp2.00-precf-thread32.png

Type 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. image:: pics/192x192x128-type-3-upsamp1.25-precf-thread32.png
.. image:: pics/192x192x128-type-3-upsamp2.00-precf-thread32.png
18 changes: 14 additions & 4 deletions perftest/CMakeLists.txt
@@ -3,15 +3,25 @@ set(PERFTESTS guru_timing_test manysmallprobs spreadtestnd spreadtestndall)

 foreach(TEST ${PERFTESTS})
   add_executable(${TEST} ${TEST}.cpp)
-  if (FINUFFT_USE_DUCC0)
+  if(FINUFFT_USE_DUCC0)
     target_compile_definitions(${TEST} PRIVATE -DFINUFFT_USE_DUCC0)
-  endif ()
+  endif()
   finufft_link_test(${TEST})
 
   add_executable(${TEST}f ${TEST}.cpp)
   target_compile_definitions(${TEST}f PRIVATE -DSINGLE)
-  if (FINUFFT_USE_DUCC0)
+  if(FINUFFT_USE_DUCC0)
     target_compile_definitions(${TEST}f PRIVATE -DFINUFFT_USE_DUCC0)
-  endif ()
+  endif()
   finufft_link_test(${TEST}f)
 endforeach()
+
+include(CheckIncludeFile)
+check_include_file("getopt.h" HAVE_GETOPT_H)
+if(HAVE_GETOPT_H)
+  add_executable(perftest perftest.cpp)
+  if(FINUFFT_USE_DUCC0)
+    target_compile_definitions(perftest PRIVATE -DFINUFFT_USE_DUCC0)
+  endif()
+  finufft_link_test(perftest)
+endif()