
WIP: Experimental/corbett/jit #1333

Closed · wants to merge 10 commits

Conversation

corbett5
Contributor

This is a proof of concept for a dynamic library based JIT. It takes the constitutive and element dispatch out of regionBasedKernelApplication and instead compiles and links in the kernel at runtime. I confirmed that it is passing the SSLE_sedov_01 problem on Quartz with Clang debug and GCC release and on Lassen with Clang debug.

It should pass all the single-rank problems if I were to #include the appropriate headers for all the various kernels at the bottom of KernelBase.hpp, as I did for SolidMechanicsSmallStrainExplicitNewmarkKernel.hpp. As is, there are some compilation errors (in the main code, not the JIT) if I do this, but structuring things like that is a terrible idea, so I just proceeded with my hack.

Each run re-compiles the kernels it needs; this is wasteful but would be easily fixed.

Multiple-rank runs fail because each rank tries to compile the kernel; this is easily fixed with some synchronization.

Related to
GEOS-DEV/LvArray#225

@corbett5 corbett5 changed the title Experimental/corbett/jit WIP: Experimental/corbett/jit Feb 23, 2021
@corbett5 corbett5 marked this pull request as draft February 23, 2021 01:30
@corbett5
Contributor Author

@rrsettgast @klevzoff @francoishamon take a look.

@TotoGaz
Contributor

TotoGaz commented Mar 3, 2021

Hi @corbett5 ! Would this JIT approach mean that instead of deploying GEOSX we need to provide the end user with a complete GEOSX build environment? Will I still be able to work with models that I do not want to share (i.e. without accessing the sources)?

@corbett5
Contributor Author

corbett5 commented Mar 3, 2021

@TotoGaz yes, you will be able to pre-compile things.

#define KernelDispatch KernelDispatchJIT
#else
#define KernelDispatch KernelDispatchTemplate
#endif
Collaborator

Per the comments, this is the part of the current implementation I want to change the most. It should be possible, but it wound up eating hours while I was trying to get it implemented the way I would prefer.

Contributor

@klevzoff klevzoff Sep 10, 2021

It should be possible to make the two dispatcher classes similar by replacing the NAME and HEADER parameters in KernelDispatchJIT with just a KERNEL_TYPE parameter (i.e. a template template parameter, like in the non-JIT dispatcher).

NAME could then be extracted as (I'm trying to decide if this is 100% robust):

string const fullName = LvArray::system::demangleType< KERNEL_TYPE< SUBREGION_TYPE, FE_TYPE, CONSTITUTIVE_TYPE > >();
string const name = fullName.substr( 0, fullName.find( '<' ) );

HEADER could be replaced by adding inside each "leaf" kernel class something like

template< typename SUBREGION_TYPE, typename FE_TYPE, typename CONSTITUTIVE_TYPE >
class MyKernel
{
  static constexpr char const source_location[] = __FILE__;
  ...
};

// Sad part: this is required in C++14 to avoid linker errors
template< typename SUBREGION_TYPE, typename FE_TYPE, typename CONSTITUTIVE_TYPE >
constexpr char const MyKernel< SUBREGION_TYPE, FE_TYPE, CONSTITUTIVE_TYPE >::source_location[];

and accessing as

string const header = KERNEL_TYPE< SUBREGION_TYPE, FE_TYPE, CONSTITUTIVE_TYPE >::source_location;

The downside is having this extra stuff to remember to put in kernels. Maybe there's a better way to associate kernel class with source file name?

These ideas aren't too pretty, I just thought I'd mention them. They would allow us to get rid of the JITTI_DECL/JITTI_TPARAM macros entirely and unify the two dispatchers, possibly even making them specializations as the comment above suggests, and would allow for a per-kernel JITting decision.

Contributor Author

I think the end goal is to always use JITTI, although that doesn't mean that we need to always jit things at run-time. We need to have the capability to pre-jit everything at build-time, and I think that would suffice.

@wrtobin
Collaborator

wrtobin commented Sep 1, 2021

I'll need to merge from dev and revert the submodule change prior to merging, but this is functional and in a good place to review. I still need to check some things on Lassen, but I was able to get 100% of the regression tests passing on Quartz. It did require multiple passes, as it is possible for multiple tests to try to compile the same kernel simultaneously, resulting in file system errors that kill the runs.

This had to be avoided internally via some MPI checks, but when multiple executables are running simultaneously it might require adding sentinel files or something similar; I'm open to ideas to mitigate that issue.

Contributor

@klevzoff klevzoff left a comment

Looks great! Just have one suggestion (not necessarily a good one) re: unifying the two dispatchers.

)
#
# Specify all sources
#
set( finiteElement_sources
FiniteElementDiscretization.cpp
FiniteElementDiscretizationManager.cpp
kernelInterface/kernelJIT.cpp
Contributor

This file is not included in the PR, and I can't see it being generated anywhere either.

camp::tuple< ARGS ... > m_args;
};

jitti::CompilationInfo getKernelCompilationInfo( const string & header );
Contributor

I can't find the definition of this function anywhere. Is it in kernelJIT.cpp?

Comment on lines +49 to +56
add_custom_command( OUTPUT ${CMAKE_BINARY_DIR}/include/kernelJITCompileCommands.hpp
WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
COMMAND python ${CMAKE_CURRENT_LIST_DIR}/../LvArray/src/jitti/generateCompileCommandsHeader.py
${CMAKE_BINARY_DIR}/compile_commands.json
--cpp ${CMAKE_CURRENT_LIST_DIR}/kernelInterface/kernelJIT.cpp
--hpp ${CMAKE_BINARY_DIR}/include/kernelJITCompileCommands.hpp
--include ${CMAKE_BINARY_DIR}/include
--linker ${CMAKE_CXX_COMPILER} )
Contributor

Instead of relying on headers that you generate on the file system, would it be more robust to store the C++ pre-processing output as a string, alongside the compile/link commands, directly in the lib?
That would also be more "modern" JIT style imho.

Contributor Author

Yeah, I think you could do that, although it would require the user to list the functions they want to JIT at configuration time in CMake. Then for each of those functions we could pre-process jitti/templateSource.cpp, but only up to a point, since we don't know what template params the user will want. Then at run time, when we know the params, we could compile the pre-processed source with additional command-line definitions of JITTI_TEMPLATE_PARAMS and JITTI_TEMPLATE_PARAMS_STRING.

However, how to pass a string embedded in our library to the compiler without using the file system is beyond me. Not to mention that at the end of the day the compiler is going to spit out a library on the file system that we then have to open. So this frees us from having to have access to the same headers used for the build, but unless we do something really fancy I think file system access is a requirement.

Also at least with CUDA the cost of the compilation time itself will greatly outweigh the cost of opening and pre-processing the source.

Contributor

Also at least with CUDA the cost of the compilation time itself will greatly outweigh the cost of opening and pre-processing the source.

It's more a problem of having a consistent, self-contained GEOSX installation than a performance problem imho.
Embedding the pre-processed source files directly into GEOSX would help avoid relying on sources that may be modified behind the scenes.
I do not know all the JIT details, so maybe it's not a problem anyway: I do not want to solve problems that do not exist.

option( ENABLE_JITTI "Build all compute kernels just-in-time at runtime." OFF )

if ( ENABLE_JITTI )
message( "JITTI is ENABLED")
Contributor Author

I think the way we've been doing this is GEOSX_ENABLE_JITTI, or maybe it's GEOSX_USE_JITTI, I forget. But either way I like defined/not defined instead of 1/0.

constexpr bool compilerIsNVCC = false;
#endif

jitti::CompilationInfo getKernelCompilationInfo( const string & header )
Contributor Author

So my idea is that for each kernel header we'd have a source file that includes the header and has the method to get the jitti::CompilationInfo(s) for the kernel(s) in the header. That way, if the kernel (header) changes and the user rebuilds GEOSX, the compile time in the compilation info gets updated and things will be re-jitted.

I think with this implementation it won't re-jit if, say, the solid mechanics kernels are modified and GEOSX is rebuilt, only if some dependency of KernelBase.hpp is modified.

Collaborator

Yeah that is probably true in the current implementation.

This is the first thing I'll take a look at changing from the current implementation, and then I'll look at adding in the compile-time pre-jit.

Contributor Author

Sounds good.
