Arrakis: The Operating System is the Control Plane

Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, et al.
presented by Jimmy You
Background
Today’s hardware is fast!
Typical commodity server (Dell PowerEdge R520, ~$1000):
6-core CPU
10G NIC: ~2 µs per 1 KB packet
RAID controller w/ 1 GB cache: ~25 µs per 1 KB write
Background
But data center applications are not as fast as the hardware.
[Figure: % of processing time for Redis, split across hardware, kernel, and app; a read takes 8.7 µs, a write takes 163 µs]
Background
What is dragging performance down?
System calls are slow:
epoll: 27% of read time
recv: 11% of read time
send: 37% of read time
fsync: 84% of write time
Motivation
Design Goals:
Skip kernel for data-plane operations (low overhead)
Retain classical server OS features (transparency)
Appropriate OS/hardware abstractions (virtualization)
Hardware I/O virtualization
Already the de facto standard on NICs
Multiplexing: SR-IOV splits the NIC into virtual NICs, each with its own queues, registers, etc.
Protection: an IOMMU (e.g. Intel VT-d) lets devices use the applications' virtual memory
Packet filters: control which traffic is delivered to which queue
I/O scheduling: rate limiting, packet scheduling
Traditional OS
[Diagram: apps call into libs and the kernel API; the kernel handles naming, access control, multiplexing, resource limits, I/O scheduling, protection, and I/O processing between the apps and the hardware]
Skipping the kernel
[Diagram, built up over two slides: the same layered picture, with the data-path functions (multiplexing, I/O scheduling, protection, I/O processing) pushed out of the kernel so that only control functions (naming, access control, resource limits) remain there]
Skipping the kernel
[Diagram: apps link against a libos and reach the hardware through virtual interfaces; data-plane traffic flows directly between user space and the hardware, while the kernel keeps only the control plane, with control paths to both the apps and the hardware]
Hardware Model
NICs (Multiplexing, Protection, Scheduling)
Storage:
VSIC (Virtual Storage Interface Controller): each with its own command queues, etc.
VSA (Virtual Storage Area): mapped to physical devices and associated with VSICs
VSAs and VSICs map many-to-many
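To make the storage model concrete, here is a minimal C sketch of how a libos might represent these objects; the struct names and fields are illustrative assumptions, not the actual Arrakis definitions.

    /* Illustrative only: hypothetical structures mirroring the VSIC/VSA
     * model described above, not the real Arrakis definitions. */
    #include <stddef.h>
    #include <stdint.h>

    #define MAX_VSAS_PER_VSIC 8

    struct cmd_queue;                /* hardware command queue (opaque here) */

    /* A Virtual Storage Area: a contiguous region mapped onto physical
     * devices and handed to an application. */
    struct vsa {
        uint64_t base_lba;           /* first block on the backing device */
        uint64_t nblocks;            /* size of the area in blocks        */
        int      device_id;          /* which physical device backs it    */
    };

    /* A Virtual Storage Interface Controller: per-application queues plus
     * the VSAs it may reach. VSAs and VSICs map many-to-many, so the same
     * struct vsa can appear in several VSICs' tables. */
    struct vsic {
        struct cmd_queue *sq;        /* submission queue  */
        struct cmd_queue *cq;        /* completion queue  */
        struct vsa *vsas[MAX_VSAS_PER_VSIC];
        size_t      nvsas;
    };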
Control Plane Interface
VIC (Virtual Interface Card): apps can create/delete VICs and associate them with doorbells
Doorbells (analogous to interrupts): associated with events on a VIC
Filter creation, e.g. create_filter(rx, *, tcp.port == 80)
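As a rough illustration of how an application might drive this interface at setup time, here is a hedged C sketch; create_filter comes from the slide, but its exact signature and every other name below are assumptions.

    /* Hypothetical control-plane setup for a web server listening on
     * TCP port 80. Only invoked at setup time; the data path bypasses
     * the kernel entirely. All declarations are illustrative. */
    typedef struct vic vic_t;            /* virtual interface card      */
    typedef struct doorbell doorbell_t;  /* event notification endpoint */

    enum filter_dir { RX, TX };

    vic_t      *vic_create(void);
    doorbell_t *doorbell_create(vic_t *vic);
    int         create_filter(enum filter_dir dir, const char *addr,
                              const char *predicate);

    int setup_http_vic(void)
    {
        vic_t *vic = vic_create();           /* allocate a VIC              */
        if (vic == NULL)
            return -1;

        if (doorbell_create(vic) == NULL)    /* get notified of VIC events */
            return -1;

        /* Steer all inbound traffic for TCP port 80 to this app's queues,
         * as in the slide's example: create_filter(rx, *, tcp.port == 80) */
        return create_filter(RX, "*", "tcp.port == 80");
    }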
Control Plane Features
Access control: enforced by filters; invoked infrequently (at setup time, etc.)
Resource limiting: the control plane sends commands to hardware I/O schedulers
Naming: the VFS stays in the kernel; actual storage is implemented by the apps
Network Data Interface
Apps send and receive directly through sets of queues
Filters are applied for multiplexing
Doorbells provide asynchronous notification (e.g. packet arrival)
Both a native (zero-copy) interface and a POSIX-compatible interface are provided
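A minimal sketch, in C, of what a doorbell-driven receive loop on the native interface could look like; the queue, packet, and doorbell functions are assumptions made for illustration, not the library's actual API.

    /* Hypothetical zero-copy receive/echo loop over a VIC's queue pair.
     * Buffers are passed by descriptor, never copied; recycling of sent
     * buffers (on transmit completion) is elided. All names are
     * illustrative. */
    #include <stddef.h>

    typedef struct vic vic_t;
    typedef struct doorbell doorbell_t;
    typedef struct pkt { void *buf; size_t len; } pkt_t;

    void doorbell_wait(doorbell_t *db);          /* block until an event       */
    int  queue_recv(vic_t *vic, pkt_t *p);       /* 0 if a packet was dequeued */
    int  queue_send(vic_t *vic, const pkt_t *p); /* post a buffer for transmit */

    void echo_loop(vic_t *vic, doorbell_t *db)
    {
        pkt_t p;

        for (;;) {
            doorbell_wait(db);                   /* async arrival notification */
            while (queue_recv(vic, &p) == 0)     /* drain the receive queue    */
                queue_send(vic, &p);             /* echo the same buffer back  */
        }
    }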
Storage Data Interface
VSAs support read, write, and flush
Persistent data structures (log, queue):
modified Redis by only 109 LOC
operations are immediately persistent on disk
eliminate marshaling (the in-memory layout is the on-disk layout)
data-structure-specific caching and early allocation
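To illustrate how keeping the in-memory layout identical to the on-disk layout removes marshaling, here is a hedged C sketch of appending to a persistent log on a VSA; the slide only says a VSA supports read, write, and flush, so the exact call signatures are assumptions.

    /* Hypothetical persistent-log append on a VSA. The record is written
     * byte-for-byte as it sits in memory (no serialization step) and is
     * durable before the call returns. VSA call signatures are assumed. */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct vsa vsa_t;
    int vsa_write(vsa_t *vsa, uint64_t offset, const void *buf, size_t len);
    int vsa_flush(vsa_t *vsa);

    struct log_record {
        uint64_t seq;                /* sequence number                  */
        uint32_t len;                /* valid payload bytes              */
        char     payload[244];       /* fixed-size record for simplicity */
    };

    int log_append(vsa_t *vsa, uint64_t *tail, uint64_t seq,
                   const void *data, uint32_t len)
    {
        struct log_record rec = { .seq = seq, .len = len };

        if (len > sizeof rec.payload)
            return -1;
        memcpy(rec.payload, data, len);

        /* Write the record at the log tail and flush: the operation is
         * immediately persistent, as on the slide. */
        if (vsa_write(vsa, *tail, &rec, sizeof rec) < 0 || vsa_flush(vsa) < 0)
            return -1;

        *tail += sizeof rec;
        return 0;
    }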
Evaluation
1. UDP echo server
2. Memcached key-value store
3. Redis NoSQL store
4. HTTP load balancer (haproxy)
5. IP-layer middlebox
6. Performance isolation (rate limiting)
Case 1: UDP echo
Case 2: Memcached
Case 3: Redis NoSQL
Case 3: Redis NoSQL cont’d
[Figure, adapted from the original OSDI '14 presentation: in-memory GET latency drops from 9 µs on Linux to 4 µs on Arrakis (65% reduction); persistent SET latency drops from 163 µs to 31 µs (81% reduction)]
Case 4: HTTP load balancer (haproxy)
Case 5: IP-layer middlebox
Case 6: Performance Isolation
Conclusion
Pros:
much better raw performance (for I/O-intensive data center apps)
Redis: up to 9x throughput and an 81% reduction in write latency
Memcached: throughput scales to 3x
Cons:
some features require hardware functionality that is not yet available
requires modifying applications
the storage abstractions are not clearly worked out
behavior inside the hardware is hard to observe and debug
Discussion
Related work (IX, Exokernel, Multikernel, etc.)
Is Arrakis trading “OS features” for raw performance? How will new techniques change this trade-off? (SDN, NetFPGA)
And of course, how much does raw performance matter?
Security concerns
Related Work
1990s library OSes: Exokernel, SPIN, Nemesis
Kernel bypass: U-Net, InfiniBand, Netmap, Moneta-D
High-performance I/O stacks: mTCP, OpenOnload, Sandstorm, Aerie
IX, Dune; Barrelfish (Multikernel)
IX, Arrakis, Exokernel, Multikernel
Arrakis is like Exokernel built on Barrelfish (Multikernel)
Reducing syscall overhead: IX uses adaptive batching and run-to-completion; Arrakis has no syscalls on the data plane
Hardware virtualization: IX needs no IOMMU or SR-IOV; Arrakis expects more than today's hardware provides
Enforcing network I/O policy: IX keeps it under software control; Arrakis relies on hardware
Adapted from the original presentation at OSDI '14
Raw performance vs. (everything else)
Two potential (and maybe diverging) directions:
Be hardware-dependent (NetFPGA, etc.)
Be software-controllable (SDN, etc.)
[Images: a 1960s switchboard operator contrasted with modern operating systems]
Security concerns
Will bypassing the kernel be safe?