Version 4.0.0
Feature Additions:
- RISC-V support for the host target (either cross-compiled or native).
- Features can be enabled for host targets using the cmake option
  CA_HOST_TARGET_<ARCH>_FEATURES or the environment variable
  CA_HOST_TARGET_FEATURES. This uses a format similar to -mattr on tools like
  opt, e.g. "+v;-zfencei".
- The compiler-passes directory has been created to allow important compiler
  passes, such as the vectorizer vecz, to be made available externally.
- Support for SPIR 1.2 programs (cl_khr_spir) has been dropped.
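
The feature string accepted by CA_HOST_TARGET_<ARCH>_FEATURES and CA_HOST_TARGET_FEATURES can be illustrated with a small sketch. It assumes, based only on the "+v;-zfencei" example above, that entries are separated by semicolons and that a leading + enables a feature while a leading - disables it. The function name parse_features is hypothetical and not part of the project.

```python
from __future__ import annotations


def parse_features(spec: str) -> tuple[list[str], list[str]]:
    """Split a feature string such as "+v;-zfencei" into (enabled, disabled).

    Assumption: entries are ';'-separated and each starts with '+' or '-'.
    """
    enabled: list[str] = []
    disabled: list[str] = []
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing ';'
        sign, name = entry[0], entry[1:]
        if sign not in "+-" or not name:
            raise ValueError(f"malformed feature entry: {entry!r}")
        (enabled if sign == "+" else disabled).append(name)
    return enabled, disabled
```

Under these assumptions, parse_features("+v;-zfencei") returns (["v"], ["zfencei"]).
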
Upgrade guidance:
- Only LLVM 17 and 18 are supported in this release.
- Added support for a Remote HAL over TCP/IP, primarily for demoing new RISC-V
  targets quickly.
- The cmake variable CA_HOST_TARGET_CPU has been split into capitalized
  architecture variants of the form CA_HOST_TARGET_<ARCH>_CPU, e.g.
  CA_HOST_TARGET_X86_64_CPU, CA_HOST_TARGET_AARCH64_CPU and
  CA_HOST_TARGET_RISCV64_CPU. The environment variable CA_HOST_TARGET_CPU
  keeps the same name. Note that CA_HOST_TARGET_CPU_NATIVE is no longer
  supported, but the same effect can be achieved by using native as the value
  for the variants.
- The mux spec has been bumped:
  - 0.77.0: to loosen the requirements on the mux event type used by DMA
    builtins.
  - 0.78.0: to introduce mux builtins for sub-group, work-group, and
    vector-group operations.
  - 0.79.0: to introduce mux builtins for sub-group shuffle operations.
  - 0.80.0: to introduce support for 64-bit atomic operations.
- The compiler::ImageArgumentSubstitutionPass now replaces sampler-typed
  parameters in kernel functions with i32 parameters via a wrapper function.
  - As a consequence, the host target now passes samplers to kernels as 32-bit
    integer arguments, not as integer arguments disguised as pointer values.
- The compiler::utils::ReplaceBarriersPass has been replaced with the
  compiler::utils::LowerToMuxBuiltinsPass.
- The compiler::utils::HandleBarriersPass has been renamed to the
  compiler::utils::WorkItemLoopsPass.
- The compiler::utils::createLoop API has moved its list of IVs parameter
  into its compiler::utils::CreateLoopOpts parameter. It can now also set the
  IV names via a second CreateLoopOpts field.
- Support for LLVM versions is now limited to LLVM 17 and LLVM 18. Support for
  earlier LLVM versions has been removed.
- Support for FMA (fused multiply-add) is now required of the device. For the
  x86-64 host device, this means only x86-64-v3 and newer are supported. This
  roughly translates to CPUs from 2015 or newer, both for Intel and for AMD.
  - Although hardware support for FMA is available on all platforms we
    currently test, if you are using OCK on a platform we do not test and
    encounter issues, please let us know by opening an issue!
- The compiler-utils library has been split into compiler-pipeline and
  compiler-binary-metadata to allow use of the compiler pipeline utilities
  without the binary metadata requirements. Both will be needed for mux
  targets.
- The utility function addParamToAllFunctions has been moved to
  ReplaceLocalModuleScopeVariablesPass and renamed, as it is only used there.
- OpenCL-Headers is now fetched from the GitHub repo with tag v2024.05.08. This
  can be overridden using -DFETCHCONTENT_SOURCE_DIR_OPENCL_HEADERS to point to
  a different repo.
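
As a sketch of the CA_HOST_TARGET_CPU split described in the upgrade guidance above, the helper below builds the per-architecture cmake definition for a given architecture. The function name and the idea of generating -D arguments from a script are illustrative assumptions. Only the CA_HOST_TARGET_<ARCH>_CPU naming scheme and the native value come from these notes.

```python
def host_target_cpu_define(arch: str, cpu: str = "native") -> str:
    """Return a cmake -D argument for the per-architecture CPU variable.

    The architecture name is upper-cased to follow the
    CA_HOST_TARGET_<ARCH>_CPU naming scheme; 'native' replaces the removed
    CA_HOST_TARGET_CPU_NATIVE option.
    """
    if not arch:
        raise ValueError("architecture name must not be empty")
    return f"-DCA_HOST_TARGET_{arch.upper()}_CPU={cpu}"
```

For instance, host_target_cpu_define("x86_64") gives -DCA_HOST_TARGET_X86_64_CPU=native, which can be passed to cmake at configure time.
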