Intel Software OneAPI Level Zero Sysman Overview

 
OneAPI Level Zero Sysman
 
Intel Software
 
SaiKishore Konda, Ravindra babu Ganapathi
 
XDC 2021
Sept 16, 2021
 
AGENDA

Introduction
Goals
Level Zero Sysman architecture
Sysman resources
Sysman init and resources flowgraph
Example usage
Python-based Sysman CLI example and demo
Questions
 
Why ONE API
 
oneAPI and Level-Zero Software Stack
Level Zero APIs: Core, Tools, System Management
 
Level Zero API Objectives
 
Single API to cover vector, matrix, and spatial compute
Primary low-level driver interface for language runtime libraries
Fine-grained control over accelerator capabilities
Low-latency, high-performance interface to the device
Open source and standardized; it can be implemented by any vendor for any type of accelerator
While heavily influenced by GPU architecture, the Level-Zero APIs are designed to be supportable across different compute device architectures, such as FPGAs and other accelerators.
 
 
Level Zero Sysman Goals
 
Deliver a public API to manage accelerator devices:
Monitor and control device power profiles.
Monitor and control accelerator performance.
Access Reliability, Availability and Serviceability (RAS) features.
Monitor and control high-speed peer-to-peer interconnects.
Provide an event-driven model for device changes.
Perform device resets and updates.
 
Security
 
All device modification operations require default admin privileges imposed at the OS
layer (not imposed by Level0).
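
Because the privilege check lives in the OS rather than in Level Zero, a tool may want to test its own privileges before attempting a control operation. The following is a minimal sketch under stated assumptions: a POSIX system where "admin" means effective UID 0. The function names are hypothetical and are not part of Level Zero.

```python
import os


def has_admin_privileges() -> bool:
    """Return True if the process runs with effective UID 0 (POSIX only).

    Assumption: "admin" means root. On platforms without os.geteuid
    (for example Windows) this sketch conservatively returns False.
    """
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def plan_operation(modifies_device: bool) -> str:
    """Decide whether a hypothetical Sysman operation should be attempted."""
    if modifies_device and not has_admin_privileges():
        return "skip: control operations need admin privileges from the OS"
    return "proceed"


if __name__ == "__main__":
    print(plan_operation(modifies_device=False))  # monitoring-style request
    print(plan_operation(modifies_device=True))   # control-style request
```

A pre-check like this only improves the error message; the OS still makes the final decision when the driver is called.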
 
Sysman – System Diagram (overview)
[Diagram. Users: end user, systems administrator, server management tools. Host tools: system resource monitoring daemon, command-line tool, GUI. Interface: Level0 Sysman API (telemetry, control). Devices: XPU1, XPU2, ..., XPUn.]
 
Sysman – Architecture Diagram
[Diagram. User space: application, Level0 (C/C++ API), Level0 Sysman. OS kernel space: accelerator KMD, system KMD, telemetry KMD, PCIe bus driver. Hardware: XPU device with accelerator, system, and telemetry device functions. Other labels and legend entries: customer tools, Level0 public package, Intel device drivers, OS drivers, "depends on".]
 
Sysman – System Diagram (resources)
[Diagram. An application calls the Level0 Sysman API to reach an XPU device. Resources: Frequency, Power, Standby, Temperature, Memory, Utilization, Fan, PSU, Scheduler, RAS, Performance, Firmware, Events, Device, Fabric, Diagnostics. Management areas: power, workload, system, reliability, HW event, and scale-out management. Hardware blocks: power unit, accelerators, system management, fabric hardware.]
 
SYSMAN INIT CODE EXAMPLE
Enables driver initialization and dependencies for system management (Sysman):
ZES_ENABLE_SYSMAN {0, 1}
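
Level Zero reads the ZES_ENABLE_SYSMAN environment variable (0 or 1) when the driver initializes, so it has to be set before the process first calls zeInit. The sketch below shows two ways to arrange that from Python. The helper name `sysman_env` and the child command are illustrative assumptions; the child only echoes the variable to show that it was inherited.

```python
import os
import subprocess
import sys


def sysman_env(enabled: bool = True) -> dict:
    """Return a copy of the current environment with ZES_ENABLE_SYSMAN set.

    The variable accepts 0 or 1. It is read at driver initialization,
    so it must be in place before the process first calls zeInit.
    """
    env = dict(os.environ)
    env["ZES_ENABLE_SYSMAN"] = "1" if enabled else "0"
    return env


if __name__ == "__main__":
    # Option 1: set it for the current process before loading Level Zero.
    os.environ["ZES_ENABLE_SYSMAN"] = "1"

    # Option 2: pass it to a child process. The child here only echoes the
    # variable; a real tool would launch a Level Zero application instead.
    child = [sys.executable, "-c",
             "import os; print(os.environ['ZES_ENABLE_SYSMAN'])"]
    result = subprocess.run(child, env=sysman_env(True),
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```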
 
Sysman – Execution Diagram (initialization)
[Diagram. Steps, numbered 1 to 4 in the original figure: the C/C++ app initializes the Level0 Loader; the driver Sysman library is loaded; an ioctl gets the device SKU properties; data structures are initialized for every XPU device and the manageable resources on the device. Components: Level0 Sysman library with Sysman device data structure and Sysman resource properties; accelerator, system, and telemetry device drivers; PCIe bus driver; XPU device with accelerator, system, and telemetry device functions. Legend: application flow, Level0 flow, OS flow, driver flow, device flow; Level0, 3rd party driver, Intel driver, HW block.]
 
Sysman – Execution Diagram (resources)
[Diagram. Steps, numbered 1 to 5 in the original figure: the C/C++ app enumerates devices through Level0 Core; enumerates resources; gets resource handles; gets resource properties; the Level0 Sysman library returns static resource properties. Components: Level0 Sysman library with Sysman device data structure and Sysman resource properties; accelerator, system, and telemetry device drivers; PCIe bus driver; XPU device with accelerator, system, and telemetry device functions. Legend: application flow, Level0 flow, OS flow, driver flow, device flow; Level0, 3rd party driver, Intel driver, HW block.]
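
The enumerate-devices and enumerate-resources steps can be sketched from Python with `ctypes`, using the Level Zero entry points `zeInit`, `zeDriverGet`, `zeDeviceGet`, and three of the `zesDeviceEnum*` calls. This is a rough sketch, not the presenters' code. It assumes the loader can be found under the library name `ze_loader`, and it stops at counting handles, because the get-properties step needs the C struct layouts from the headers. The function names `enum_handles` and `count_sysman_resources` are illustrative.

```python
import ctypes
import ctypes.util
import os

ZE_RESULT_SUCCESS = 0


def enum_handles(func, *parents):
    """Two-call pattern: query the count first, then fill a handle array."""
    count = ctypes.c_uint32(0)
    if func(*parents, ctypes.byref(count), None) != ZE_RESULT_SUCCESS:
        return []
    if count.value == 0:
        return []
    handles = (ctypes.c_void_p * count.value)()
    if func(*parents, ctypes.byref(count), handles) != ZE_RESULT_SUCCESS:
        return []
    # Re-wrap as c_void_p so handles are passed back at full pointer width.
    return [ctypes.c_void_p(h) for h in handles[:count.value]]


def count_sysman_resources():
    """Walk drivers -> devices -> resource handles and count what is found.

    Returns None when the loader is missing or fails to initialize,
    otherwise a list with one dict per device.
    """
    # Sysman must be enabled before zeInit is called.
    os.environ.setdefault("ZES_ENABLE_SYSMAN", "1")

    path = ctypes.util.find_library("ze_loader")  # assumed library name
    if path is None:
        return None
    try:
        ze = ctypes.CDLL(path)
    except OSError:
        return None
    if ze.zeInit(ctypes.c_uint32(0)) != ZE_RESULT_SUCCESS:
        return None

    report = []
    for driver in enum_handles(ze.zeDriverGet):
        for device in enum_handles(ze.zeDeviceGet, driver):
            # With ZES_ENABLE_SYSMAN=1 the core device handle is also
            # usable as the Sysman device handle.
            report.append({
                "power_domains": len(
                    enum_handles(ze.zesDeviceEnumPowerDomains, device)),
                "frequency_domains": len(
                    enum_handles(ze.zesDeviceEnumFrequencyDomains, device)),
                "temperature_sensors": len(
                    enum_handles(ze.zesDeviceEnumTemperatureSensors, device)),
            })
    return report


if __name__ == "__main__":
    print(count_sysman_resources())
```

On a machine without a Level Zero loader the function returns None instead of raising, which keeps the sketch safe to run anywhere.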
 
Example usage:
 
Python Command Line Interface (CLI) Tool
 
Initial CLI tool implemented to query and configure the system
Commands map to the Sysman API, so the tool also serves as a validation tool
Implemented initial versions of all four listing formats: simple stdout print, table, CSV, and XML file
Added support for driver selection when multiple supported driver types are present
Released the tool as an open-source reference implementation
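
To make the four listing formats concrete, here is a small sketch that renders the same readings as plain text, a table, CSV, and XML using only the Python standard library. It does not reproduce zesysman's actual output layout; the sample readings, field names, and XML element names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample readings; real values would come from the Sysman API.
FIELDS = ["device", "resource", "value", "unit"]
ROWS = [
    {"device": "0", "resource": "temperature", "value": "54", "unit": "C"},
    {"device": "0", "resource": "power", "value": "17", "unit": "W"},
]


def as_plain(rows):
    """Simple stdout-style listing: one 'key: value' group per reading."""
    groups = ["\n".join(f"{k}: {r[k]}" for k in FIELDS) for r in rows]
    return "\n\n".join(groups)


def as_table(rows):
    """Fixed-width table with a header row."""
    widths = {k: max([len(k)] + [len(r[k]) for r in rows]) for k in FIELDS}
    lines = ["  ".join(k.ljust(widths[k]) for k in FIELDS)]
    lines += ["  ".join(r[k].ljust(widths[k]) for k in FIELDS) for r in rows]
    return "\n".join(lines)


def as_csv(rows):
    """CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def as_xml(rows):
    """XML text with one <reading> element per row (element names assumed)."""
    root = ET.Element("sysman")
    for r in rows:
        ET.SubElement(root, "reading", attrib=r)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for render in (as_plain, as_table, as_csv, as_xml):
        print(render(ROWS))
        print()
```

Keeping every renderer behind the same "list of rows in, string out" signature is one way a CLI could switch formats with a single option.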
 
CLI Help
 
Available Devices
Gen9 and DG1
 
Zesysman usage demo on DG1
./zesysman --show-processes
./zesysman --show-telemetry
 
Zesysman utilization demo on DG1
./zesysman --show-util
 
Collectd tool
 
 
Public Repositories
 
Level Zero Specification
https://spec.oneapi.com/versions/latest/elements/l0/source/index.html
Level Zero Loader (Device/Vendor independent)
https://github.com/oneapi-src/level-zero
Level Zero Intel GPU Driver
https://github.com/intel/compute-runtime
Level Zero Tests
https://github.com/oneapi-src/level-zero-tests
 
 
 
 
Questions?
 

Introducing Intel Software OneAPI Level Zero Sysman, a powerful API that provides fine-grained control over accelerator devices, offering features such as monitoring power profiles, controlling device performance, and managing peer-to-peer interconnects. With a focus on security and flexibility, Level Zero Sysman aims to deliver a public API for efficient management of various compute device architectures.

  • Intel
  • Software
  • OneAPI
  • Level Zero
  • Sysman

Uploaded on Aug 14, 2024



Presentation Transcript


  1. Intel Software: OneAPI Level Zero Sysman. SaiKishore Konda, Ravindra babu Ganapathi. XDC 2021, Sept 16, 2021

  2. AGENDA: Introduction; Goals; Level Zero Sysman architecture; Sysman resources; Sysman init and resources flowgraph; Example usage; Python-based Sysman CLI example and demo; Questions

  3. Why ONE API

  4. oneAPI and Level-Zero Software Stack

  5. oneAPI and Level-Zero Software Stack. Level Zero APIs: Core, Tools, System Management

  6. Level Zero API Objectives: Single API to cover vector, matrix, and spatial compute. Primary low-level driver interface for language runtime libraries. Fine-grained control over accelerator capabilities. Low-latency, high-performance interface to the device. Open source and standardized; it can be implemented by any vendor for any type of accelerator. While heavily influenced by GPU architecture, the Level-Zero APIs are designed to be supportable across different compute device architectures, such as FPGAs and other accelerators.

  7. Level Zero Sysman Goals: Deliver a public API to manage accelerator devices. Monitor and control device power profiles. Monitor and control accelerator performance. Access Reliability, Availability and Serviceability (RAS) features. Monitor and control high-speed peer-to-peer interconnects. Provide an event-driven model for device changes. Perform device resets and updates.

  8. Security: All device modification operations require default admin privileges imposed at the OS layer (not imposed by Level0).

  9. Sysman System Diagram (overview): End user, systems administrator, server management tools. Host tools: system resource monitoring daemon, GUI, command-line tool. Level0 Sysman API (telemetry, control). XPU1, XPU2, ..., XPUn

  10. Sysman Architecture Diagram: User space: application, Level0 (C/C++ API), Level0 Sysman. OS kernel space: PCIe bus driver, accelerator KMD, system KMD, telemetry KMD. Hardware: XPU device with accelerator, system, and telemetry device functions. Other labels and legend entries: customer tools, Level0 public package, Intel device drivers, OS drivers, "depends on"

  11. Sysman System Diagram (resources): Application, Level0 Sysman API. Management areas: reliability, workload, system, power, HW event, scale-out. Resources: RAS, Events, Fabric, Frequency, Utilization, Device, Diagnostics, Power, Performance, Firmware, Fan, Scheduler, Standby, Temperature, Memory, PSU. XPU device hardware blocks: system management, power unit, accelerators, fabric hardware

  12. SYSMAN INIT CODE EXAMPLE: Enables driver initialization and dependencies for system management (Sysman): ZES_ENABLE_SYSMAN {0, 1}

  13. Sysman Execution Diagram (initialization): Steps 1 to 4: the C/C++ app initializes the Level0 Loader; the driver Sysman library is loaded; an ioctl gets the device SKU properties; data structures are initialized for every XPU device and the manageable resources on the device. Components: Level0 Sysman library (Sysman device data structure, Sysman resource properties); accelerator, system, and telemetry device drivers and device functions; PCIe bus driver; XPU device. Legend: application, Level0, OS, driver, and device flows; Level0, 3rd party driver, Intel driver, HW block

  14. Sysman Execution Diagram (resources): Steps 1 to 5: the C/C++ app enumerates devices through Level0 Core; enumerates resources; gets resource handles; gets resource properties; the Level0 Sysman library returns static resource properties. Components: Level0 Sysman library (Sysman device data structure, Sysman resource properties); accelerator, system, and telemetry device drivers and device functions; PCIe bus driver; XPU device. Legend: application, Level0, OS, driver, and device flows; Level0, 3rd party driver, Intel driver, HW block

  15. Example usage:

  16. Python Command Line Interface (CLI) Tool: Initial CLI tool implemented to query and configure the system. Commands map to the Sysman API, so the tool also serves as a validation tool. Implemented initial versions of all four listing formats: simple stdout print, table, CSV, and XML file. Added support for driver selection when multiple supported driver types are present. Released the tool as an open-source reference implementation.

  17. CLI Help

  18. Available Devices: Gen9 and DG1

  19. Zesysman usage demo on DG1: ./zesysman --show-processes and ./zesysman --show-telemetry

  20. Zesysman utilization demo on DG1: ./zesysman --show-util

  21. Collectd tool

  22. Public Repositories: Level Zero Specification https://spec.oneapi.com/versions/latest/elements/l0/source/index.html; Level Zero Loader (Device/Vendor independent) https://github.com/oneapi-src/level-zero; Level Zero Intel GPU Driver https://github.com/intel/compute-runtime; Level Zero Tests https://github.com/oneapi-src/level-zero-tests

  23. Questions?
