Intel Software OneAPI Level Zero Sysman Overview

Introducing Intel Software OneAPI Level Zero Sysman, a powerful API that provides fine-grained control over accelerator devices, offering features such as monitoring power profiles, controlling device performance, and managing peer-to-peer interconnects. With a focus on security and flexibility, Level Zero Sysman aims to deliver a public API for efficient management of various compute device architectures.


Uploaded on Aug 14, 2024


Presentation Transcript


  1. Intel Software OneAPI Level Zero Sysman. SaiKishore Konda, Ravindra babu Ganapathi. XDC 2021, Sept 16, 2021.

  2. AGENDA: Introduction; Goals; Level Zero Sysman architecture; Sysman resources; Sysman init & resources flowgraph; Example usage; Python based Sysman CLI example and demo; Questions.

  3. Why ONE API

  4. oneAPI and Level-Zero Software Stack

  5. oneAPI and Level-Zero Software Stack. Level Zero APIs: Core, Tools, System Management.

  6. Level Zero API Objectives: A single API to cover vector, matrix and spatial compute. Primary low-level driver interface for language runtime libraries. Fine-grained control over accelerator capabilities. Low-latency, high-performance interface to the device. Open source and standardized; it can be implemented by any vendor for any type of accelerator. While heavily influenced by GPU architecture, the Level-Zero APIs are designed to be supportable across different compute device architectures, such as FPGAs and other accelerators.

  7. Level Zero Sysman Goals: Deliver a public API to manage accelerator devices: monitor and control device power profiles; monitor and control accelerator performance; access Reliability, Availability and Serviceability (RAS) features; monitor and control high-speed peer-to-peer interconnects; provide an event-driven model for device changes; perform device resets and updates.

  8. Security: By default, all operations that modify a device require admin privileges. These are imposed at the OS layer, not by Level0.

  9. Sysman System Diagram (overview). [Diagram] Users: end user, systems administrator, server management tools. Host system: resource monitoring, daemon, GUI, command-line tool. Control and telemetry pass through the Level0 Sysman API to the devices XPU1, XPU2, ... XPUn.

  10. Sysman Architecture Diagram. [Diagram] User space: Application, Level0 (C/C++ API), Level0 Sysman. OS kernel space: PCIe bus driver, Accelerator KMD, System KMD, Telemetry KMD. Hardware: XPU device with accelerator, system and telemetry device functions. Legend: customer tools, Level0 public package, Intel device drivers, OS drivers, "depends on" arrows.

  11. Sysman System Diagram (resources). [Diagram] An application calls the Level0 Sysman API. Management areas: reliability, workload, system, power, HW event and scale-out management. Resources: RAS, Events, Fabric, Frequency, Utilization, Device, Diagnostics, Power, Performance, Firmware, Fan, Scheduler, Standby, Temperature, Memory, PSU. Hardware: XPU device, system management, power unit, accelerators, fabric hardware.

  12. SYSMAN INIT CODE EXAMPLE: The environment variable ZES_ENABLE_SYSMAN {0, 1} enables driver initialization and dependencies for system management (Sysman).
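
The code shown on this slide is not part of the transcript. As a rough stand-in, the sketch below sets ZES_ENABLE_SYSMAN before calling zeInit through Python's ctypes. It assumes the Level Zero loader is installed under the library name "ze_loader" and that the variable must be set before initialization; the helper names enable_sysman and init_level_zero are hypothetical. If no loader is found, the sketch reports that instead of failing.

```python
import ctypes
import ctypes.util
import os

ZE_RESULT_SUCCESS = 0  # ze_result_t value for success


def enable_sysman(enabled=True):
    """Set ZES_ENABLE_SYSMAN to 1 or 0 for this process (hypothetical helper)."""
    os.environ["ZES_ENABLE_SYSMAN"] = "1" if enabled else "0"


def init_level_zero():
    """Call zeInit(0) if a Level Zero loader is available.

    Returns the ze_result_t code as an int, or None when no loader
    library could be located or loaded (assumption: it is named "ze_loader").
    """
    name = ctypes.util.find_library("ze_loader")
    if name is None:
        return None
    try:
        loader = ctypes.CDLL(name)
    except OSError:
        return None
    loader.zeInit.argtypes = [ctypes.c_uint32]
    loader.zeInit.restype = ctypes.c_int
    return loader.zeInit(0)


if __name__ == "__main__":
    enable_sysman()  # must happen before the driver is initialized
    result = init_level_zero()
    if result is None:
        print("Level Zero loader not found; nothing initialized")
    elif result == ZE_RESULT_SUCCESS:
        print("Level Zero initialized with Sysman enabled")
    else:
        print(f"zeInit failed with ze_result_t {result:#x}")
```

Setting the variable inside the process works here because os.environ also updates the C environment that the loader reads; exporting it in the shell before launching the application is an equivalent option.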

  13. Sysman Execution Diagram (initialization). [Diagram] Numbered steps 1 to 4 include: the C/C++ app initializes the Level0 loader; the driver Sysman library is loaded; an ioctl retrieves the device SKU properties; data structures are initialized for every XPU device and the manageable resources on the device. Components: Level0 loader, Level0 Sysman library, PCIe bus driver, accelerator/system/telemetry device drivers and device functions, Sysman device data structure, Sysman resource properties. Legend: application flow, Level0 flow, OS flow, driver flow, device flow; 3rd party driver, Intel driver, HW block.

  14. Sysman Execution Diagram (resources). [Diagram] Numbered steps 1 to 5 include: enumerate devices (Level0 Core); get resource handles; enumerate resources (Level0 Sysman library); get resource properties; return static resource properties. Components: Level0 Core, Level0 Sysman library, PCIe bus driver, accelerator/system/telemetry device drivers and device functions, Sysman device data structure, Sysman resource properties. Legend: application flow, Level0 flow, OS flow, driver flow, device flow; 3rd party driver, Intel driver, HW block.
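
The enumeration flow on this slide (devices, then resource handles) can be sketched in Python with ctypes. This is an illustration under assumptions, not the presenters' code: it assumes the loader is installed as "ze_loader", uses frequency domains as the example resource, and relies on the usual Level Zero two-call pattern (query the count, then fill an array of handles). The helpers enum_handles and list_frequency_domains are hypothetical names. Reading resource properties is left out because it needs the C struct layouts.

```python
import ctypes
import ctypes.util
import os


def enum_handles(func, *leading):
    """Two-call enumeration: ask for the count, then fetch that many handles."""
    count = ctypes.c_uint32(0)
    pcount = ctypes.pointer(count)
    if func(*leading, pcount, None) != 0 or count.value == 0:
        return []
    handles = (ctypes.c_void_p * count.value)()
    if func(*leading, pcount, handles) != 0:
        return []
    return [ctypes.c_void_p(h) for h in handles[:count.value]]


def list_frequency_domains():
    """Return [(device_handle, [frequency_handles])], or [] if unavailable."""
    os.environ.setdefault("ZES_ENABLE_SYSMAN", "1")
    name = ctypes.util.find_library("ze_loader")  # assumed library name
    if name is None:
        return []
    try:
        ze = ctypes.CDLL(name)
        if ze.zeInit(ctypes.c_uint32(0)) != 0:
            return []
        found = []
        for driver in enum_handles(ze.zeDriverGet):
            for device in enum_handles(ze.zeDeviceGet, driver):
                # With ZES_ENABLE_SYSMAN=1 the core device handle is
                # reused as the Sysman device handle.
                domains = enum_handles(ze.zesDeviceEnumFrequencyDomains, device)
                found.append((device, domains))
        return found
    except (OSError, AttributeError):
        return []


if __name__ == "__main__":
    for device, domains in list_frequency_domains():
        print(f"device {device.value:#x}: {len(domains)} frequency domain(s)")
```

On a machine without a Level Zero driver the function simply returns an empty list, so the sketch can be run anywhere.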

  15. Example usage

  16. Python Command Line Interface (CLI) Tool: An initial CLI tool was implemented to query and configure the system. Its commands map to the Sysman API, so it also serves as a validation tool. Initial versions of all four listing formats are implemented: simple stdout print, table, CSV and XML file. Support for driver selection was added for cases where there are multiple supported driver types. The tool is published as open source as a reference implementation.
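
To make the four listing formats concrete, here is a small Python sketch that renders the same records as plain text, a table, CSV and XML. It is not taken from zesysman: the function names, the record fields and the sample values are all hypothetical placeholders chosen for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical placeholder records; not real device readings.
RECORDS = [
    {"device": "0", "resource": "temperature", "value": "41"},
    {"device": "0", "resource": "power", "value": "12"},
]


def to_plain(records):
    """One 'key=value' line per record, as a simple stdout listing."""
    return "\n".join(", ".join(f"{k}={v}" for k, v in r.items()) for r in records)


def to_table(records):
    """Fixed-width table with a header row and a separator row."""
    headers = list(records[0])
    widths = [max(len(h), *(len(r[h]) for r in records)) for h in headers]

    def fmt(cells):
        return " | ".join(c.ljust(w) for c, w in zip(cells, widths))

    lines = [fmt(headers), "-+-".join("-" * w for w in widths)]
    lines += [fmt([r[h] for h in headers]) for r in records]
    return "\n".join(lines)


def to_csv(records):
    """CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_xml(records):
    """XML document with one <record> element per record."""
    root = ET.Element("records")
    for r in records:
        item = ET.SubElement(root, "record")
        for key, value in r.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for render in (to_plain, to_table, to_csv, to_xml):
        print(render(RECORDS))
        print()
```

Keeping each format behind its own function means a command only has to produce a list of records, and the output format becomes a single switch.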

  17. CLI Help

  18. Available Devices: Gen9 and DG1

  19. Zesysman usage demo on DG1: ./zesysman show-processes, ./zesysman show-telemetry

  20. Zesysman utilization demo on DG1: ./zesysman show-util

  21. Collectd tool

  22. Public Repositories: Level Zero Specification: https://spec.oneapi.com/versions/latest/elements/l0/source/index.html. Level Zero Loader (device/vendor independent): https://github.com/oneapi-src/level-zero. Level Zero Intel GPU Driver: https://github.com/intel/compute-runtime. Level Zero Tests: https://github.com/oneapi-src/level-zero-tests.

  23. Questions?
