Data Center Storage and Networking

Hakim Weatherspoon
Assistant Professor, Dept. of Computer Science
CS 5413: High Performance Systems and Networking
December 1, 2014
 
Slides from the ACM SOSP 2013 presentation on “IOFlow: A Software-Defined Storage Architecture.” Eno Thereska, Hitesh Ballani, Greg O'Shea, Thomas Karagiannis, Antony Rowstron, Tom Talpey, and Timothy Zhu. In SOSP'13, Farmington, PA, USA, November 3-6, 2013.
 
Where are we in the semester?
Overview and Basics
Data Center Networks
  Basic switching technologies
  Data Center Network Topologies (today and Monday)
  Software Routers (e.g., Click, RouteBricks, NetMap, NetSlice)
  Alternative Switching Technologies
  Data Center Transport
Data Center Software Networking
  Software-Defined Networking (overview, control plane, data plane, NetFPGA)
  Data Center Traffic and Measurements
  Virtualizing Networks
  Middleboxes
Advanced Topics
Goals for Today
IOFlow: a software-defined storage architecture
E. Thereska, H. Ballani, G. O'Shea, T. Karagiannis, A. Rowstron, T. Talpey, R. Black, T. Zhu. ACM Symposium on Operating Systems Principles (SOSP), October 2013, pages 182-196.
Background: Enterprise data centers
General purpose applications
Application runs on several VMs
Separate networks for VM-to-VM traffic and VM-to-Storage traffic
Storage is virtualized
Resources are shared
Motivation
Want: predictable application behaviour and performance
Need a system to provide end-to-end SLAs, e.g.,
  Guaranteed storage bandwidth B
  Guaranteed high IOPS and priority
  Per-application control over decisions along IOs’ path
It is hard to provide such SLAs today
Example: guarantee aggregate bandwidth B for Red tenant
[Diagram: the IO path from application VMs through the hypervisor (malware scan, compression, file system, caching, scheduling, IO manager, drivers), across the network switches, to the storage server (file system, deduplication, caching, scheduling, drivers)]
Deep IO path with 18+ different layers that are configured and operate independently and do not understand SLAs
Challenges in enforcing end-to-end SLAs
No storage control plane
No enforcement mechanism along the storage data plane
Aggregate performance SLAs
  - across VMs, files and storage operations
Want non-performance SLAs: control over IOs’ path
Want to support unmodified applications and VMs
IOFlow architecture
Decouples the data plane (enforcement) from the control plane (policy logic)
[Diagram: client-side and server-side IO stacks with per-layer queues, programmed through the IOFlow API by a centralized controller that takes high-level SLAs as input]
Contributions
Defined and built a storage control plane
Controllable queues in the data plane
Interface between control and data plane (IOFlow API)
Built centralized control applications that demonstrate the power of the architecture
SDS: Storage-specific challenges
[Table: compares old networks, SDN, storage today, and SDS in terms of low-level primitives: end-to-end identifiers, data plane queues, and a control plane]
Storage flows
A storage “Flow” refers to all IO requests to which an SLA applies:

<{VMs}, {File Operations}, {Files}, {Shares}> ---> SLA
(source set: {VMs}; destination sets: {File Operations}, {Files}, {Shares})

Aggregate, per-operation and per-file SLAs, e.g.,
  <{VM 1-100}, write, *, \\share\db-log> ---> high priority
  <{VM 1-100}, *, *, \\share\db-data> ---> min 100,000 IOPS
Non-performance SLAs, e.g., path routing
  <VM 1, *, *, \\share\dataset> ---> bypass malware scanner
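As a concrete, non-normative illustration, a flow spec is just a tuple of source and destination sets mapped to an SLA. The sketch below renders the three example policies in Python; the class and field names are assumptions for this sketch, not IOFlow's actual types.

```python
from dataclasses import dataclass

# Illustrative flow-spec and SLA types; names are assumptions, not IOFlow's API.
@dataclass(frozen=True)
class FlowSpec:
    vms: frozenset         # source set, e.g. VMs 1-100; {"*"} matches any VM
    operations: frozenset  # destination sets: file operations ...
    files: frozenset       # ... files ...
    shares: frozenset      # ... and shares

@dataclass
class SLA:
    priority: str = ""          # e.g. "high"
    min_iops: int = 0           # e.g. 100_000
    min_bandwidth_mbs: int = 0  # guaranteed bandwidth in MB/s
    bypass: tuple = ()          # stages to skip, e.g. ("malware scanner",)

ANY = frozenset({"*"})
VMS_1_100 = frozenset(f"VM {i}" for i in range(1, 101))

# The three example policies from this slide:
policies = [
    (FlowSpec(VMS_1_100, frozenset({"write"}), ANY, frozenset({r"\\share\db-log"})),
     SLA(priority="high")),
    (FlowSpec(VMS_1_100, ANY, ANY, frozenset({r"\\share\db-data"})),
     SLA(min_iops=100_000)),
    (FlowSpec(frozenset({"VM 1"}), ANY, ANY, frozenset({r"\\share\dataset"})),
     SLA(bypass=("malware scanner",))),
]
```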
IOFlow API: programming data plane queues
1. Classification [IO Header -> Queue]
2. Queue servicing [Queue -> <token rate, priority, queue size>]
3. Routing [Queue -> Next-hop] (e.g., to a malware scanner)
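To make the three calls concrete, here is a minimal sketch of one data-plane stage that exposes them. All names and signatures are illustrative assumptions; the real IOFlow data plane lives inside Windows filter drivers, not Python.

```python
# Sketch of one programmable data-plane stage; names are illustrative only.
class QueueStage:
    def __init__(self):
        self.rules = []                      # ordered (predicate, queue) pairs
        self.queues = {"default": {"token_rate": None, "priority": 0,
                                   "queue_size": 64, "next_hop": "next layer"}}

    # 1. Classification [IO Header -> Queue]
    def create_queue_rule(self, predicate, queue):
        self.queues.setdefault(queue, dict(self.queues["default"]))
        self.rules.append((predicate, queue))

    def classify(self, io_header):
        for predicate, queue in self.rules:
            if predicate(io_header):
                return queue
        return "default"

    # 2. Queue servicing [Queue -> <token rate, priority, queue size>]
    def configure_queue_service(self, queue, token_rate, priority, queue_size):
        self.queues[queue].update(token_rate=token_rate, priority=priority,
                                  queue_size=queue_size)

    # 3. Routing [Queue -> Next-hop], e.g. divert a queue through the malware scanner
    def configure_queue_routing(self, queue, next_hop):
        self.queues[queue]["next_hop"] = next_hop
```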
Lack of common IO Header for storage traffic
SLA: <VM 4, *, *, \\share\dataset> --> Bandwidth B
The same IO is named differently at each layer along the path:
  Block device: Z: (/device/scsi1)
  Server and VHD: \\serverX\AB79.vhd
  Volume and file: H:\AB79.vhd
  Block device: /device/ssd5
Flow name resolution through controller
SLA: {VM 4, *, *, //share/dataset} --> Bandwidth B
SMBc exposes the IO Header it understands: <VM_SID, //server/file.vhd>
The controller translates the SLA into a queuing rule (per-file handle):
  <VM4_SID, //serverX/AB79.vhd> --> Q1
  Q1.token rate --> B
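Continuing the QueueStage sketch above, the controller's job here amounts to a name lookup plus two API calls. The resolution table and function name below are invented for illustration; in IOFlow the controller learns the low-level names from the layers themselves.

```python
# Hypothetical mapping from the SLA's high-level names to the IO Header SMBc exposes.
NAME_RESOLUTION = {
    ("VM 4", "//share/dataset"): ("VM4_SID", "//serverX/AB79.vhd"),
}

def install_bandwidth_sla(smbc_stage, vm, share, token_rate_B):
    """Turn {VM 4, *, *, //share/dataset} --> Bandwidth B into a per-file-handle rule."""
    vm_sid, vhd = NAME_RESOLUTION[(vm, share)]
    # Queuing rule: IOs whose header carries this (SID, VHD) pair go to Q1.
    smbc_stage.create_queue_rule(lambda hdr: hdr == (vm_sid, vhd), "Q1")
    # Q1.token rate --> B
    smbc_stage.configure_queue_service("Q1", token_rate=token_rate_B,
                                       priority=0, queue_size=64)

# e.g. install_bandwidth_sla(QueueStage(), "VM 4", "//share/dataset", token_rate_B=B)
```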
Rate limiting for congestion control
Queue servicing [Queue -> <token rate, priority, queue size>]
Important for performance SLAs
Today: no storage congestion control
Challenging for storage: e.g., how do you rate limit two VMs, one reading and one writing, so they get equal storage bandwidth?
Rate limiting on payload bytes does not work
[Diagram: two VMs, one issuing 8KB writes and one issuing 8KB reads, sharing a storage server]
Rate limiting on bytes does not work
[Diagram: two VMs, one issuing 8KB writes and one issuing 8KB reads, sharing a storage server]
Rate limiting on IOPS does not work
[Diagram: two VMs, one issuing 8KB writes and one issuing 64KB reads, sharing a storage server]
Need to rate limit based on cost
Rate limiting based on cost
Controller constructs empirical cost models based on device type and workload characteristics
  RAM, SSDs, disks: read/write ratio, request size
Cost models are assigned to each queue
  ConfigureTokenBucket [Queue -> cost model]
Large requests are split so they can be pre-empted
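A minimal sketch of a token bucket that charges each IO by a modeled cost rather than by bytes or IOPS. The cost numbers below are invented placeholders; IOFlow's controller fits the models empirically per device type and workload.

```python
import time

# Illustrative cost model: cost per IO as a function of operation and request size.
# The constants are invented; the real controller derives them from measurements.
def io_cost(op, size_bytes):
    base = 1.0 if op == "read" else 1.5         # e.g. writes cost more on this device
    return base * max(size_bytes, 4096) / 4096  # small IOs pay a minimum cost

class CostTokenBucket:
    def __init__(self, token_rate, burst):
        self.rate = token_rate      # tokens added per second
        self.tokens = burst
        self.burst = burst
        self.last = time.monotonic()

    def admit(self, op, size_bytes):
        """Return True if the IO may be issued now, charging its modeled cost."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        cost = io_cost(op, size_bytes)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False    # caller keeps the IO queued and retries later
```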
Recap: Programmable queues on data plane
Classification [IO Header -> Queue]
  Per-layer metadata exposed to controller
  Controller out of critical path
Queue servicing [Queue -> <token rate, priority, queue size>]
  Congestion control based on operation cost
Routing [Queue -> Next-hop]
How does the controller enforce an SLA?
Distributed, dynamic enforcement
<{Red VMs 1-4}, *, *, //share/dataset> --> Bandwidth 40 Gbps
The SLA needs per-VM enforcement
Need to control the aggregate rate of VMs 1-4, which reside on different physical machines
Static partitioning of bandwidth is sub-optimal
Work-conserving solution
VMs with traffic demand should be able to send it as long as the aggregate rate does not exceed 40 Gbps
Solution: max-min fair sharing
Max-min fair sharing
A well-studied problem in networks
Existing solutions are distributed
  Each VM varies its rate based on congestion
  Rates converge to a max-min sharing
  Drawbacks: complex and requires a congestion signal
But we have a centralized controller
  The problem converts to a simple algorithm at the controller
Controller-based max-min fair sharing
What does the controller do?
  Infers VM demands
  Uses centralized max-min within a tenant and across tenants
  Sets VM token rates
  Chooses the best place to enforce
INPUT: per-VM demands
OUTPUT: per-VM allocated token rate
t = control interval, s = stats sampling interval
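The allocation step itself is the classic water-filling computation. Below is a minimal single-resource sketch, assuming the controller already has per-VM demand estimates; the real controller applies this hierarchically, within and across tenants, and also decides where to enforce the resulting rates.

```python
def max_min_allocate(demands, capacity):
    """Water-filling max-min fair allocation of `capacity` across VM demands.

    demands: dict of VM name -> estimated demand (e.g. Gbps)
    Returns a dict of VM name -> allocated token rate.
    """
    allocation = {vm: 0.0 for vm in demands}
    remaining = dict(demands)
    spare = capacity
    while remaining and spare > 1e-9:
        fair_share = spare / len(remaining)
        satisfied = [vm for vm, d in remaining.items() if d <= fair_share]
        if satisfied:
            # VMs whose demand fits under the fair share get exactly their demand;
            # leftover capacity is redistributed among the rest (work conservation).
            for vm in satisfied:
                allocation[vm] = remaining.pop(vm)
                spare -= allocation[vm]
        else:
            # Everyone is bottlenecked: split the spare capacity equally.
            for vm in remaining:
                allocation[vm] = fair_share
            spare = 0.0
            remaining.clear()
    return allocation

# Example: 40 Gbps shared by four VMs with unequal demand.
print(max_min_allocate({"VM1": 25, "VM2": 5, "VM3": 8, "VM4": 30}, 40))
# VM2 and VM3 get their full demand (5, 8); VM1 and VM4 split the rest (13.5 each).
```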
Controller decides where to enforce
SLA constraints
  Queues placed where resources are shared
  Bandwidth enforced close to the source
  Priority enforced end-to-end
Efficiency considerations
  Overhead in the data plane ~ # queues, which matters at 40+ Gbps
  Minimize the number of times an IO is queued and distribute the rate-limiting load
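A toy rendering of these placement rules, with invented stage names; the real controller makes this choice jointly with the rate allocation and per SLA instance.

```python
# Stages along one VM-to-storage path, ordered from source to destination.
# The stage names are illustrative only.
PATH = ["guest-os", "hypervisor-smbc", "network", "storage-smbs", "disk-driver"]

def choose_enforcement_stages(sla_kind):
    """Pick where to install queues for an SLA, following the constraints above."""
    if sla_kind == "bandwidth":
        # Bandwidth is enforced close to the source: a per-VM queue at the
        # hypervisor stops excess IOs before they load the shared server.
        return ["hypervisor-smbc"]
    if sla_kind == "priority":
        # Priority must hold end-to-end, so every stage needs a queue;
        # this is the expensive case the efficiency considerations try to limit.
        return list(PATH)
    return ["storage-smbs"]   # default: enforce where the resource is shared

print(choose_enforcement_stages("bandwidth"))   # ['hypervisor-smbc']
```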
Centralized vs. decentralized control
A centralized controller in SDS allows for simple algorithms that focus on SLA enforcement rather than on distributed-systems challenges
Analogous to the benefits of centralized control in software-defined networking (SDN)
IOFlow implementation
2 key layers for VM-to-Storage performance SLAs
4 other layers:
  - Scanner driver (routing)
  - User-level (routing)
  - Network driver
  - Guest OS file system
Implemented as filter drivers on top of the existing layers
Evaluation map
IOFlow’s ability to enforce end-to-end SLAs
  Aggregate bandwidth SLAs
  Priority SLAs and routing application (in the paper)
Performance of the data and control planes
Evaluation setup
Clients: 10 hypervisor servers, 12 VMs each
  4 tenants (Red, Green, Yellow, Blue)
  30 VMs/tenant, 3 VMs/tenant/server
Storage network: Mellanox 40Gbps RDMA RoCE, full-duplex
1 storage server: 16 CPUs at 2.4GHz (Dell R720)
  SMB 3.0 file server protocol
  3 types of backend: RAM, SSDs, disks
Controller: 1 separate server
  1 sec control interval (configurable)
Workloads
4 Hotmail tenants {Index, Data, Message, Log}
  Used for trace replay on SSDs (see paper)
IoMeter is parameterized with Hotmail tenant characteristics (read/write ratio, request size)
Enforcing bandwidth SLAs
4 tenants with different storage bandwidth SLAs:
  Red    {VM 1-30}   -> Min 800 MB/s
  Green  {VM 31-60}  -> Min 800 MB/s
  Yellow {VM 61-90}  -> Min 2500 MB/s
  Blue   {VM 91-120} -> Min 1500 MB/s
Tenants have different workloads
  The Red tenant is aggressive: it generates more requests/second
Things to look for
Distributed enforcement across 4 competing tenants
  Aggressive tenant(s) kept under control
Dynamic inter-tenant work conservation
  Bandwidth released by an idle tenant is given to active tenants
Dynamic intra-tenant work conservation
  Bandwidth of a tenant’s idle VMs is given to its active VMs
Results
[Timeline plot: tenants’ SLAs are enforced with 120 queues configured; the controller notices the Red tenant’s performance; inter-tenant work conservation; intra-tenant work conservation]
Data plane overheads at 40Gbps RDMA
Negligible in the previous experiment. To bring out the worst case, IO sizes were varied from 512 bytes to 64KB.
Reasonable overheads for enforcing SLAs
Control plane overheads: network and CPU
The controller configures queue rules, receives statistics and updates token rates every control interval
[Chart: controller network overheads in MB; <0.3% CPU overhead at the controller]
Before Next time
Final Project Presentation/Demo
  Due Friday, December 12
  Presentation and Demo
  Written submission required:
    Report
    Website: index.html that points to report, presentation, and project (e.g., code)
Required review and reading for Wednesday, December 3
  Plug into the Supercloud, D. Williams, H. Jamjoom, H. Weatherspoon. IEEE Internet Computing, Vol. 17, No. 2, March/April 2013, pp. 28-34.
  http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6365162
Check piazza: http://piazza.com/cornell/fall2014/cs5413
Check website for updated schedule