This sample is a simple LD_PRELOAD-based tool that collects GPU hardware metrics, such as execution unit (EU) active, stall and idle ratios, and attributes them to OpenCL(TM) kernels. It relies on the query-based metrics collection mode.
As a result, a table like the following is printed. For each kernel, it shows the call count, total time and metric values.
=== Device Metrics: ===
Total Execution Time (ns): 417558875
Total Kernel Time (ns): 168604332
Kernel, Calls, Time (ns), Time (%), Average (ns), EU Active (%), EU Stall (%), EU Idle (%)
GEMM, 4, 168604332, 100.00, 42151083, 65.01, 34.89, 0.10
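The report above is plain text, so it can be post-processed with standard tools. The following is a minimal sketch, not part of the sample itself: it assumes the comma-and-space separated layout shown above, embeds the sample report as a string, and uses a hypothetical threshold (`stall_threshold`) to list kernels whose EU stall ratio is high. Kernel names that themselves contain ", " would need a more careful parser.

```shell
#!/bin/sh
# Sample report copied from the table above; in practice this would be
# the captured output of the tool.
report='=== Device Metrics: ===
Total Execution Time (ns): 417558875
Total Kernel Time (ns): 168604332
Kernel, Calls, Time (ns), Time (%), Average (ns), EU Active (%), EU Stall (%), EU Idle (%)
GEMM, 4, 168604332, 100.00, 42151083, 65.01, 34.89, 0.10'

# Hypothetical threshold: list kernels whose EU stall ratio exceeds 30%.
stall_threshold=30

# Ignore everything up to and including the header row, then split the
# remaining rows on ", " and print the kernel name with its EU Stall value
# (7th column) when it is above the threshold.
stalled=$(printf '%s\n' "$report" | awk -F', ' -v limit="$stall_threshold" '
  header_seen && NF == 8 && $7 + 0 > limit { print $1 ": " $7 }
  /^Kernel, / { header_seen = 1 }
')
echo "$stalled"
```

For the sample report this prints `GEMM: 34.89`.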
To choose the target device and sub-device from which metrics are collected, set the PTI_DEVICE_ID and PTI_SUB_DEVICE_ID environment variables.
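As an illustration, the variables can be exported before launching the tool. The values below are assumptions chosen for the example (they are treated here as zero-based indices, which this document does not state explicitly), and the run is guarded so the snippet is harmless when the tool has not been built yet.

```shell
#!/bin/sh
# Assumed example values: first device, second sub-device.
export PTI_DEVICE_ID=0
export PTI_SUB_DEVICE_ID=1

# Launch the tool only if it exists in the current (build) directory.
if [ -x ./cl_gpu_query ]; then
  ./cl_gpu_query ../../cl_gemm/build/cl_gemm gpu
fi
```

For a one-off run, the same variables can instead be placed as a prefix on the command line, which avoids changing the environment of the current shell.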
- Linux
- Windows
- CMake (version 3.12 and above)
- Git (version 1.8 and above)
- Python (version 2.7 and above)
- OpenCL(TM) ICD Loader
- Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver
- Intel(R) Metrics Discovery Application Programming Interface
Run the following commands to build the sample:
cd <pti>/samples/cl_gpu_query
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Use this command line to run the tool:
./cl_gpu_query <target_application>
One may use cl_gemm as the target application:
./cl_gpu_query ../../cl_gemm/build/cl_gemm gpu
Since the Intel(R) Metrics Discovery Application Programming Interface library is loaded at runtime, one may need to set its path explicitly, e.g.:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib ./cl_gpu_query ../../cl_gemm/build/cl_gemm gpu
On Linux one may need to enable metrics collection for non-root users:
sudo sh -c 'echo 0 > /proc/sys/dev/i915/perf_stream_paranoid'
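Before changing the setting, it can be useful to check its current value. The sketch below is only an illustration: `check_paranoid` is a hypothetical helper name, and the optional path argument exists so the function can be tried against a regular file. A value of 0 means metrics collection is open to non-root users; any other value is reported as restricted.

```shell
#!/bin/sh
# Hypothetical helper: report the state of the perf_stream_paranoid setting.
# The path defaults to the i915 location used above.
check_paranoid() {
  path="${1:-/proc/sys/dev/i915/perf_stream_paranoid}"
  if [ ! -r "$path" ]; then
    # No readable entry, e.g. the i915 driver is not loaded.
    echo "not available"
    return 0
  fi
  value=$(cat "$path")
  if [ "$value" = "0" ]; then
    echo "enabled for non-root users"
  else
    echo "restricted to privileged users"
  fi
}

check_paranoid
```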
Use the Microsoft* Visual Studio x64 command prompt to run the following commands and build the sample:
cd <pti>\samples\cl_gpu_query
mkdir build
cd build
cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_LIBRARY_PATH=<opencl_icd_lib_path> ..
nmake
Use this command line to run the tool:
cl_gpu_query.exe <target_application>
One may use cl_gemm as the target application:
cl_gpu_query.exe ..\..\cl_gemm\build\cl_gemm.exe
Since the Intel(R) Metrics Discovery Application Programming Interface library is loaded at runtime, one may need to set its path explicitly (see the output of cmake), e.g.:
set PATH=%PATH%;C:\Windows\system32\DriverStore\FileRepository\igdlh64.inf_amd64_d59561bc9241aaf5
cl_gpu_query.exe ..\..\cl_gemm\build\cl_gemm.exe