New Sysman API for VF telemetry (#254)
Resolves: #248

Signed-off-by: Kumar, Sanil <[email protected]>
sanilkumar0 authored Feb 1, 2024
1 parent d00e2bb commit 05880d1
Showing 3 changed files with 361 additions and 1 deletion.
88 changes: 88 additions & 0 deletions scripts/sysman/EXT_Exp_VirtualFunctionManagement.rst
@@ -0,0 +1,88 @@
<%
import re
from templates import helper as th
%><%
OneApi=tags['$OneApi']
x=tags['$x']
X=x.upper()
s=tags['$s']
S=s.upper()
%>
:orphan:

.. _ZES_experimental_virtual_function_management:

========================================
Virtual Function Management Extension
========================================

API
----

* Functions

    * ${s}DeviceEnumActiveVFExp
    * ${s}VFManagementGetVFPropertiesExp
    * ${s}VFManagementGetVFMemoryUtilizationExp
    * ${s}VFManagementGetVFEngineUtilizationExp
    * ${s}VFManagementSetVFTelemetryModeExp
    * ${s}VFManagementSetVFTelemetrySamplingIntervalExp

* Enumerations

    * ${s}_vf_management_exp_version_t
    * ${s}_vf_info_mem_type_exp_flags_t
    * ${s}_vf_info_util_exp_flags_t

* Structures

    * ${s}_vf_exp_properties_t
    * ${s}_vf_util_mem_exp_t
    * ${s}_vf_util_engine_exp_t

Virtual Function Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~
This feature adds the ability to retrieve telemetry from the PF domain for monitoring per-VF memory and engine utilization.
This telemetry is used to determine whether a VM has oversubscribed GPU memory or to observe engine busyness for a targeted workload.
If a VF has no activity to report, the implementation shall reflect that in the ${s}_vf_util_engine_exp_t struct so that the percentage
calculation results in a value of 0.
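
The percentage calculation mentioned above is not spelled out in this extension. As a minimal sketch, assuming utilization is the change in the active counter divided by the change in the sampling counter between two snapshots, it could look like the following. The ``EngineSnapshot`` struct and the ``engineUtilizationPercent`` helper are hypothetical stand-ins introduced only for illustration; they mirror the ``activeCounterValue`` and ``samplingCounterValue`` members of ${s}_vf_util_engine_exp_t.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two counter members of the engine
// utilization struct; not part of the API.
struct EngineSnapshot {
    uint64_t activeCounterValue;    // active counter
    uint64_t samplingCounterValue;  // counter value when activeCounterValue was sampled
};

// Assumed formula (not stated by the extension):
//   utilization % = 100 * delta(active) / delta(sampling)
// Returns 0 when the sampling counter did not advance, and also when the
// active counter did not advance, i.e. the VF had no activity to report.
double engineUtilizationPercent(const EngineSnapshot &s0, const EngineSnapshot &s1) {
    const uint64_t deltaSampling = s1.samplingCounterValue - s0.samplingCounterValue;
    if (deltaSampling == 0) {
        return 0.0;  // avoid division by zero
    }
    const uint64_t deltaActive = s1.activeCounterValue - s0.activeCounterValue;
    return 100.0 * static_cast<double>(deltaActive) / static_cast<double>(deltaSampling);
}
```

Under this assumption, snapshots of (1000, 10000) and (1500, 12000) give 25%, and a VF whose active counter stays unchanged between snapshots gives 0%.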

The following pseudo-code demonstrates a sequence for enumerating all Virtual Functions from the Physical Function environment and obtaining the engine activity of one of them:

.. parsed-literal::

    // Gather count of VF handles
    uint32_t numVf = 0;
    ${s}DeviceEnumActiveVFExp(hDevice, &numVf, nullptr);

    // Allocate memory for VF handles and call back in to gather handles
    std::vector<${s}_vf_handle_t> vfs(numVf, nullptr);
    ${s}DeviceEnumActiveVFExp(hDevice, &numVf, vfs.data());

    // Gather VF properties
    std::vector<${s}_vf_exp_properties_t> vfProps(numVf);
    for (uint32_t i = 0; i < numVf; i++) {
        ${s}VFManagementGetVFPropertiesExp(vfs[i], &vfProps[i]);
    }

    // Detect the info types a particular VF supports
    // Using VF# 0 to demonstrate how to detect engine info type and query engine util info
    ${s}_vf_handle_t activeVf = vfs[0];
    uint32_t count = 1;
    // Declared outside the if-block so the snapshot can be reused by the getter call further down
    ${s}_vf_util_engine_exp_t engineUtil0 = {};
    ${s}_vf_util_engine_exp_t engineUtil1 = {};
    if (vfProps[0].flags & ${S}_VF_INFO_UTIL_EXP_FLAG_INFO_ENGINE) {
        ${s}VFManagementGetVFEngineUtilizationExp(activeVf, &count, &engineUtil0);
        sleep(1);
        ${s}VFManagementGetVFEngineUtilizationExp(activeVf, &count, &engineUtil1);
        // Use formula to calculate engine utilization % based on the 2 snapshots above
    }

    // Demonstrate using the setter to switch off engine telemetry for VF0, then check that the getter no longer succeeds
    ${s}VFManagementSetVFTelemetryModeExp(activeVf, ${S}_VF_INFO_UTIL_EXP_FLAG_INFO_ENGINE, false);
    ${x}_result_t res = ${s}VFManagementGetVFEngineUtilizationExp(activeVf, &count, &engineUtil0);
    if (res != ${X}_RESULT_SUCCESS) {
        printf("Engine utilization successfully disabled for VF");
    }
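
The count-then-fetch enumeration used in the pseudo-code follows the ``pCount`` rules given for ${s}DeviceEnumActiveVFExp: a count of zero returns the total, a count that is too large is corrected, and a smaller count limits how many handles are retrieved. The sketch below illustrates those rules against a mock driver so that it can be compiled without a device. ``MockVfHandle``, ``kActiveVfs``, ``mockEnumActiveVf`` and ``enumerateActiveVfs`` are hypothetical names that are not part of the API, and the real call also takes a device handle and returns a result code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the VF handle type and the set of active VFs.
using MockVfHandle = int;
static const std::vector<MockVfHandle> kActiveVfs = {11, 22, 33};

// Mock of the enumeration call, implementing only the documented pCount rules.
void mockEnumActiveVf(uint32_t *pCount, MockVfHandle *phVf) {
    assert(pCount != nullptr);
    const uint32_t available = static_cast<uint32_t>(kActiveVfs.size());
    if (*pCount == 0) {
        *pCount = available;  // count is zero: report the total number available
        return;
    }
    if (*pCount > available) {
        *pCount = available;  // count too large: correct it
    }
    if (phVf != nullptr) {
        // Retrieve only *pCount handles, even if more are available
        std::copy_n(kActiveVfs.begin(), *pCount, phVf);
    }
}

// Two-call pattern: query the count, size the buffer, then fetch the handles.
std::vector<MockVfHandle> enumerateActiveVfs() {
    uint32_t count = 0;
    mockEnumActiveVf(&count, nullptr);
    std::vector<MockVfHandle> handles(count);
    if (count > 0) {
        mockEnumActiveVf(&count, handles.data());
        handles.resize(count);  // the count may have been corrected downwards
    }
    return handles;
}
```

A caller that goes on to index the first handle, as the pseudo-code does with ``vfs[0]``, would want to check that the returned vector is not empty first.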
20 changes: 19 additions & 1 deletion scripts/sysman/common.yml
@@ -103,6 +103,12 @@ class: $sOverclock
name: "$s_overclock_handle_t"
version: "1.5"
--- #--------------------------------------------------------------------------
type: handle
desc: "Handle for a Sysman virtual function management domain"
class: $sVFManagement
name: "$s_vf_handle_t"
version: "1.9"
--- #--------------------------------------------------------------------------
type: enum
desc: "Defines structure types"
name: $s_structure_type_t
@@ -226,7 +232,19 @@ etors:
- name: SUB_DEVICE_EXP_PROPERTIES
value: "0x00020004"
desc: $s_subdevice_exp_properties_t
version: "1.9"
version: "1.9"
- name: VF_EXP_PROPERTIES
value: "0x00020005"
desc: $s_vf_exp_properties_t
version: "1.9"
- name: VF_UTIL_MEM_EXP
value: "0x00020006"
desc: $s_vf_util_mem_exp_t
version: "1.9"
- name: VF_UTIL_ENGINE_EXP
value: "0x00020007"
desc: $s_vf_util_engine_exp_t
version: "1.9"
--- #-------------------------------------------------------------------------
type: struct
desc: "Base for all properties types"
254 changes: 254 additions & 0 deletions scripts/sysman/virtualFunctionManagement.yml
@@ -0,0 +1,254 @@
#
# Copyright (C) 2024 Intel Corporation
#
# SPDX-License-Identifier: MIT
#
# See YaML.md for syntax definition
#
--- #--------------------------------------------------------------------------
type: header
desc: "Intel $OneApi Level-Zero Sysman Extension APIs for Virtual Function Management Properties"
version: "1.9"
--- #--------------------------------------------------------------------------
type: macro
desc: "Virtual Function Management Extension Name"
version: "1.9"
name: $S_VIRTUAL_FUNCTION_MANAGEMENT_EXP_NAME
value: '"$XS_experimental_virtual_function_management"'
--- #--------------------------------------------------------------------------
type: enum
desc: "Virtual Function Management Extension Version(s)"
version: "1.9"
name: $s_vf_management_exp_version_t
etors:
- name: "1_0"
value: "$X_MAKE_VERSION( 1, 0 )"
desc: "version 1.0"
--- #--------------------------------------------------------------------------
type: enum
desc: "Virtual function memory types"
version: "1.9"
class: $sVFManagement
name: $s_vf_info_mem_type_exp_flags_t
etors:
- name: MEM_TYPE_SYSTEM
desc: "System memory"
- name: MEM_TYPE_DEVICE
desc: "Device local memory"
--- #--------------------------------------------------------------------------
type: enum
desc: "Virtual function utilization flag bit fields"
version: "1.9"
class: $sVFManagement
name: $s_vf_info_util_exp_flags_t
etors:
- name: INFO_NONE
desc: "No info associated with virtual function"
- name: INFO_MEM_CPU
desc: "System memory utilization associated with virtual function"
- name: INFO_MEM_GPU
desc: "Device memory utilization associated with virtual function"
- name: INFO_ENGINE
desc: 'Engine utilization associated with virtual function'
--- #--------------------------------------------------------------------------
type: struct
desc: "Virtual function management properties"
version: "1.9"
class: $sVFManagement
name: $s_vf_exp_properties_t
base: $s_base_properties_t
members:
- type: $s_pci_address_t
name: "address"
desc: "[out] Virtual function BDF address"
- type: $s_uuid_t
name: uuid
desc: "[out] universally unique identifier of the device"
- type: $s_vf_info_util_exp_flags_t
name: "flags"
desc: "[out] utilization flags available. May be 0 or a valid combination of $s_vf_info_util_exp_flag_t."
--- #--------------------------------------------------------------------------
type: struct
desc: "Provides memory utilization values for a virtual function"
version: "1.9"
class: $sVFManagement
name: $s_vf_util_mem_exp_t
base: $s_base_state_t
members:
- type: $s_vf_info_mem_type_exp_flags_t
name: "memTypeFlags"
desc: "[out] Memory type flags."
- type: uint64_t
name: "free"
desc: "[out] Free memory size in bytes."
- type: uint64_t
name: "size"
desc: "[out] Total allocatable memory in bytes."
- type: uint64_t
name: "timestamp"
desc: "[out] Wall clock time from VF when value was sampled."
--- #--------------------------------------------------------------------------
type: struct
desc: "Provides engine utilization values for a virtual function"
version: "1.9"
class: $sVFManagement
name: $s_vf_util_engine_exp_t
base: $s_base_state_t
members:
- type: $s_engine_group_t
name: "type"
desc: "[out] The engine group."
- type: uint64_t
name: "activeCounterValue"
desc: "[out] Represents active counter."
- type: uint64_t
name: "samplingCounterValue"
desc: "[out] Represents counter value when activeCounterValue was sampled."
- type: uint64_t
name: "timestamp"
desc: "[out] Wall clock time when the activeCounterValue was sampled."
--- #--------------------------------------------------------------------------
type: function
desc: "Get handles of virtual function modules"
version: "1.9"
class: $sDevice
name: EnumActiveVFExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_device_handle_t
name: hDevice
desc: "[in] Sysman handle of the device."
- type: "uint32_t*"
name: pCount
desc: |
[in,out] pointer to the number of components of this type.
if count is zero, then the driver shall update the value with the total number of components of this type that are available.
if count is greater than the number of components of this type that are available, then the driver shall update the value with the correct number of components.
- type: "$s_vf_handle_t*"
name: phVFhandle
desc: |
[in,out][optional][range(0, *pCount)] array of handles of components of this type.
if count is less than the number of components of this type that are available, then the driver shall only retrieve that number of component handles.
--- #--------------------------------------------------------------------------
type: function
desc: "Get virtual function management properties"
version: "1.9"
class: $sVFManagement
name: GetVFPropertiesExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_vf_handle_t
name: hVFhandle
desc: "[in] Sysman handle for the VF component."
- type: $s_vf_exp_properties_t*
name: pProperties
desc: "[in,out] Will contain VF properties."
--- #--------------------------------------------------------------------------
type: function
desc: "Get memory activity stats for each available memory type associated with a Virtual Function (VF)"
version: "1.9"
class: $sVFManagement
name: GetVFMemoryUtilizationExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_vf_handle_t
name: hVFhandle
desc: "[in] Sysman handle for the component."
- type: "uint32_t*"
name: pCount
desc: |
[in,out] Pointer to the number of VF memory stats descriptors.
- if count is zero, the driver shall update the value with the total number of memory stats available.
- if count is greater than the total number of memory stats available, the driver shall update the value with the correct number of memory stats available.
- The count returned is the sum of the number of VF instances currently available and the PF instance.
- type: $s_vf_util_mem_exp_t*
name: pMemUtil
desc: |
[in,out][optional][range(0, *pCount)] array of memory group activity counters.
- if count is less than the total number of memory stats available, then driver shall only retrieve that number of stats.
- the implementation shall populate the vector with pCount-1 number of VF memory stats.
--- #--------------------------------------------------------------------------
type: function
desc: "Get engine activity stats for each available engine group associated with a Virtual Function (VF)"
version: "1.9"
class: $sVFManagement
name: GetVFEngineUtilizationExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_vf_handle_t
name: hVFhandle
desc: "[in] Sysman handle for the component."
- type: "uint32_t*"
name: pCount
desc: |
[in,out] Pointer to the number of VF engine stats descriptors.
- if count is zero, the driver shall update the value with the total number of engine stats available.
- if count is greater than the total number of engine stats available, the driver shall update the value with the correct number of engine stats available.
- The count returned is the sum of the number of VF instances currently available and the PF instance.
- type: $s_vf_util_engine_exp_t*
name: pEngineUtil
desc: |
[in,out][optional][range(0, *pCount)] array of engine group activity counters.
- if count is less than the total number of engine stats available, then driver shall only retrieve that number of stats.
- the implementation shall populate the vector with pCount-1 number of VF engine stats.
--- #--------------------------------------------------------------------------
type: function
desc: "Enable or disable utilization telemetry associated with a Virtual Function (VF)"
version: "1.9"
class: $sVFManagement
name: SetVFTelemetryModeExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_vf_handle_t
name: hVFhandle
desc: "[in] Sysman handle for the component."
- type: $s_vf_info_util_exp_flags_t
name: "flags"
desc: "[in] utilization flags to enable or disable. May be 0 or a valid combination of $s_vf_info_util_exp_flag_t."
- type: $x_bool_t
name: "enable"
desc: "[in] Enable utilization telemetry."
--- #--------------------------------------------------------------------------
type: function
desc: "Set the sampling interval for a particular utilization telemetry type associated with a Virtual Function (VF)"
version: "1.9"
class: $sVFManagement
name: SetVFTelemetrySamplingIntervalExp
details:
- "The application may call this function from simultaneous threads."
- "The implementation of this function should be lock-free."
params:
- type: $s_vf_handle_t
name: hVFhandle
desc: "[in] Sysman handle for the component."
- type: $s_vf_info_util_exp_flags_t
name: "flag"
desc: "[in] utilization flags to set sampling interval. May be 0 or a valid combination of $s_vf_info_util_exp_flag_t."
- type: uint64_t
name: "samplingInterval"
desc: "[in] Sampling interval value."
--- #--------------------------------------------------------------------------

type: class
desc: "C++ wrapper for a Sysman virtual function management group"
version: "1.9"
name: $sVFManagement
owner: $sDevice
members:
- type: $s_vf_handle_t
name: handle
desc: "[in] handle of Sysman virtual function object"
init: nullptr
- type: $sDevice*
name: pDevice
desc: "[in] pointer to owner object"
