Scalable Rule Management for Data Centers
Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan
4/3/2013
Introduction: Definitions
2
Datacenters use rules to implement management policies:
Access control
Rate limiting
Traffic measurement
Traffic engineering
A rule is an action on a set of ranges on flow fields
Examples of actions:
Deny
Accept
Enqueue
Flow fields examples:
Src IP / Dst IP
Protocol
Src Port / Dst Port
Introduction: Definitions
3
(Figure: two overlapping rules in the Src IP / Dst IP space)
R1: Accept (SrcIP: 12.0.0.0/7, DstIP: 10.0.0.0/8)
R2: Deny (SrcIP: 12.0.0.0/8, DstIP: 8.0.0.0/6)
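To make the definition concrete, here is a toy sketch (not the authors' code) of a rule as "an action on a set of ranges on flow fields", using the two rules above. First-match semantics and putting R2 ahead of R1 are assumptions; the slides give no priority order.

```python
from ipaddress import ip_address, ip_network

# Toy rule table: each rule pairs an action with ranges (IP prefixes here)
# on flow fields such as Src IP and Dst IP.
RULES = [
    {"name": "R2", "action": "Deny",
     "src": ip_network("12.0.0.0/8"), "dst": ip_network("8.0.0.0/6")},
    {"name": "R1", "action": "Accept",
     "src": ip_network("12.0.0.0/7"), "dst": ip_network("10.0.0.0/8")},
]

def classify(src_ip, dst_ip, default="Accept"):
    """Return the action of the first rule whose field ranges contain the flow."""
    src, dst = ip_address(src_ip), ip_address(dst_ip)
    for rule in RULES:
        if src in rule["src"] and dst in rule["dst"]:
            return rule["action"]
    return default

print(classify("12.1.2.3", "8.0.0.1"))    # matches R2 -> Deny
print(classify("13.0.0.1", "10.0.0.1"))   # matches only R1 -> Accept
```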
Current practice
4
Rules are saved on predefined fixed machines:
On hypervisors
On switches
Machines have limited resources
5
Top-of-Rack switch
Network Interface Card
Software switches on servers
Future datacenters will have many fine-grained rules
6
Regulating VM-pair communication (access control, bandwidth allocation): 1B – 20B rules
Per-flow decisions (flow measurement for traffic engineering): 10M – 100M rules
VLAN per server (traffic management): 1M rules
Rule location trade-off (resource vs. bandwidth usage)
7
Storing rules at the hypervisor incurs CPU overhead
Rule location trade-off (resource vs. bandwidth usage)
8
Move the rule to the ToR switch and forward the traffic to it
Rule location trade-off: Offload to servers
9
(Figure: offloading rule R1 from a switch to the servers)
Challenges: Concrete example
10
(Figure: fat-tree with Agg1/Agg2, ToR1–ToR3, and servers S1–S6 hosting VM0–VM7; rules R0–R6 over the Src IP / Dst IP space)
Challenges: Overlapping rules
11
Source Placement: saving rules on the source machine means minimum overhead
(Figure: rules R0–R6 in the Src IP / Dst IP space; VM2 and VM6 highlighted)
Challenges: Overlapping rules
12
If Source Placement is not feasible
(Figure: rules R0–R4 in the Src IP / Dst IP space; VM2 and VM6 highlighted)
Challenges
13
Preserve the semantics of overlapping rules
Respect resource constraints (heterogeneous devices)
Minimize traffic overhead
Handle dynamics: traffic changes, rule changes, VM migration
Contribution: vCRIB, a Virtual Cloud Rule Information Base
14
A proactive rule placement abstraction layer
Optimizes traffic given resource constraints and changes
(Figure: operators give rules to vCRIB, which places them on hypervisors and switches)
vCRIB design
15
Inputs: Rules, Topology & Routing
Source Partitioning with Replication (handles overlapping rules) produces Partitions
Goal: Minimum Traffic Feasible Placement
Partitioning with replication
16
Smaller partitions have more flexibility
Cutting causes rule inflation
(Figure: the rule space cut into partitions P1, P2, P3; rules crossing a cut are replicated)
Partitioning with replication
17
Introduce the concept of similarity to mitigate inflation
similarity(P2, P3) = |P2 ∩ P3| = |{R0, R1, R3}| = 3
P1 (5 rules), P2 (5 rules), P3 (5 rules)
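As an illustration, a minimal sketch of this similarity measure: the number of rules two partitions share. The memberships below are hypothetical except for the shared rules {R0, R1, R3} taken from the slide.

```python
# Partition similarity = size of the rule intersection.
def similarity(p_i, p_j):
    """Number of rules replicated in both partitions."""
    return len(set(p_i) & set(p_j))

P2 = {"R0", "R1", "R2", "R3", "R8"}
P3 = {"R0", "R1", "R3", "R5", "R6"}

assert similarity(P2, P3) == 3   # = |{R0, R1, R3}|
```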
Per-source partitions
18
Limited resources for forwarding
No need for replication to approximate source placement
Closer partitions are more similar
(Figure: per-source partitions over the Src IP / Dst IP rule space)
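A rough sketch of the per-source idea, under the assumption that each source VM's partition holds exactly the rules whose Src IP range covers that VM; the rule names and Src IP ranges below are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Hypothetical Src IP ranges for two rules.
RULE_SRC_RANGES = {
    "R0": ip_network("0.0.0.0/1"),
    "R4": ip_network("128.0.0.0/2"),
}

def per_source_partition(vm_src_ip, rules=RULE_SRC_RANGES):
    """Return the set of rules that can match traffic sent by this VM."""
    ip = ip_address(vm_src_ip)
    return {name for name, src_range in rules.items() if ip in src_range}

# Example: 10.0.0.5 falls inside 0.0.0.0/1 only, so its partition is {"R0"}.
print(per_source_partition("10.0.0.5"))
```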
vCRIB design: Placement
19
Inputs: Rules, Topology & Routing
Source Partitioning with Replication produces Partitions
Placement uses the Resource Constraints and the Traffic Overhead to find a Minimum Traffic Feasible Placement
vCRIB design: Placement
20
Inputs: Rules, Topology & Routing
Source Partitioning with Replication produces Partitions
Resource-Aware Placement uses the Resource Constraints to find a Feasible Placement
Traffic-Aware Refinement uses the Traffic Overhead to reach a Minimum Traffic Feasible Placement
FFDS (First Fit Decreasing Similarity)
21
1. Put a random partition on an empty device
2. Add the most similar partitions to the initial partition until the device is full (a sketch of this loop follows below)
Find a lower bound on the optimal solution
Prove the algorithm is a 2-approximation of this lower bound
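A compact sketch of FFDS under simplifying assumptions: device capacity is a rule count, similarity is measured against the seed partition, and rules shared by co-located partitions are stored once. Illustrative only, not the authors' implementation.

```python
def ffds(partitions, capacity):
    """partitions: name -> set of rule IDs. Returns one list of partition names per device."""
    unplaced = dict(partitions)
    devices = []
    while unplaced:
        # 1. Seed an empty device with an arbitrary unplaced partition.
        seed = next(iter(unplaced))
        device_rules = set(unplaced.pop(seed))
        chosen = [seed]
        # 2. Visit the rest in decreasing similarity to the seed and add each
        #    partition whose merged rule set still fits on the device.
        for name in sorted(unplaced, key=lambda p: len(unplaced[p] & device_rules), reverse=True):
            merged = device_rules | unplaced[name]
            if len(merged) <= capacity:
                device_rules = merged
                chosen.append(name)
                del unplaced[name]
        devices.append(chosen)
    return devices

# P1 and P2 share three rules, so they fit together on a 7-rule device.
print(ffds({"P1": {"R0", "R1", "R3", "R5", "R6"},
            "P2": {"R0", "R1", "R2", "R3", "R8"},
            "P3": {"R4", "R7"}}, capacity=7))   # [['P1', 'P2'], ['P3']]
```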
vCRIB design: Heterogeneous resources
22
Inputs: Rules, Topology & Routing
Source Partitioning with Replication produces Partitions
Resource-Aware Placement handles Resource Heterogeneity through a Resource Usage Function to find a Feasible Placement
Traffic-Aware Refinement reaches a Minimum Traffic Feasible Placement
vCRIB design: Traffic-Aware Refinement
23
Inputs: Rules, Topology & Routing
Source Partitioning with Replication produces Partitions
Resource-Aware Placement (Resource Usage Function) finds a Feasible Placement
Traffic-Aware Refinement uses the Traffic Overhead to reach a Minimum Traffic Feasible Placement
Traffic-aware refinement
Overhead greedy approach
1. Pick the partition with the maximum traffic overhead
2. Place it where it minimizes the overhead while maintaining feasibility (see the sketch below)
24
(Figure: placing partitions P2 and P4 relative to VM2 and VM4)
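A simplified sketch of one overhead-greedy move. The helpers overhead(p, d) and fits(p, d, placement) are hypothetical stand-ins for vCRIB's traffic-overhead estimate and resource-feasibility check.

```python
def overhead_greedy_step(placement, devices, overhead, fits):
    """placement: partition -> device (mutated in place). Returns the placement."""
    # 1. Pick the partition currently incurring the maximum traffic overhead.
    worst = max(placement, key=lambda p: overhead(p, placement[p]))
    # 2. Among devices where it still fits, choose the one with minimum overhead.
    candidates = [d for d in devices if fits(worst, d, placement)]
    best = min(candidates, key=lambda d: overhead(worst, d),
               default=placement[worst])
    if overhead(worst, best) < overhead(worst, placement[worst]):
        placement[worst] = best
    return placement
```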
Traffic-aware refinement
 
Overhead greedy approach
1. Pick the partition with the maximum traffic overhead
2. Place it where it minimizes the overhead while maintaining feasibility
Problem: local minima
Our approach: Benefit Greedy
25
(Figure: placing partitions P2 and P4 relative to VM2 and VM4)
vCRIB design: Dynamics
26
Inputs: Rules, Topology & Routing
Source Partitioning with Replication produces Partitions
Resource-Aware Placement (Resource Usage Function) finds a Feasible Placement
Traffic-Aware Refinement reaches a Minimum Traffic Feasible Placement
Dynamics: Rule/VM Dynamics and Major Traffic Changes trigger updates
vCRIB design
27
Inputs: Rules, Topology & Routing
Source Partitioning with Replication (overlapping rules) produces Partitions
Resource-Aware Placement (Resource Constraints, Resource Heterogeneity via a Resource Usage Function) finds a Feasible Placement
Traffic-Aware Refinement (Traffic Overhead) reaches a Minimum Traffic Feasible Placement
Dynamics: Rule/VM Dynamics and Major Traffic Changes
Evaluation
28
Comparing vCRIB vs. Source-Placement
Parameter sensitivity analysis:
Rules in partitions
Traffic locality
VMs per server
Different memory sizes
Where is the traffic overhead added?
Traffic-aware refinement for online scenarios
Heterogeneous resource constraints
Switch-only scenarios
Simulation setup
1k servers with 20 VMs per server in a fat-tree network
200k rules generated by ClassBench with random actions
IPs are assigned in two ways: Random and Range
Flow sizes follow a long-tail distribution
Local traffic matrix (0.5 same rack, 0.3 same pod, 0.2 inter-pod)
29
Comparing vCRIB vs. Source-Placement
30
Maximum load is 5K rules; capacity is 4K rules (Random: average load is 4.2K)
vCRIB finds a low-traffic feasible solution
Range is better because similar partitions come from the same source
Adding more resources helps vCRIB reduce the traffic overhead
Parameter sensitivity analysis: Rules in partitions
31
(Figure: average partition similarity vs. average partition size; regions where Source Placement and vCRIB are feasible within the total space)
The feasible region is defined by the maximum load on a server
Parameter sensitivity analysis: Rules in partitions
32
(Figure: regions where vCRIB and Source Placement add less than 10% traffic overhead)
Lower traffic overhead for smaller and more similar partitions
Conclusion and future work
33
Conclusion: vCRIB lets operators and users specify rules, and manages their placement so that resource constraints are respected and traffic overhead is minimized
Future work:
Support reactive placement by adding the controller to the loop
Break up a partition when a VM has a large number of rules
Test other rulesets
Scalable Rule Management for Data Centers
Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan
4/3/2013
Related work
35
DIFANE
vCRIB in HotCloud'12
vCRIB
Where to place rules? Hypervisors vs. Switches
 
36
Hypervisors: limited CPU budget; software, slow; complex rules; close to the VMs
Switches: limited TCAM entries; hardware, fast; OpenFlow rules; see external and aggregate traffic
Challenges: Dynamics
 
Traffic changes
Rule changes
VM migration
37
Partitioning with split
38
Finding balanced, minimum-inflation partitions is NP-hard
Partitioning with split
39
Redirection rules in Partitioning with replication
40
Per-source partitions
41
Rules are contiguous blocks crossing only neighboring partitions
If i < j < k, any rule common to P_i and P_k is also in P_j
Divide the problem into two phases
42
Generalized Assignment Problem
(Figure: traffic overhead T_ij of assigning partition i to device j)
Different partition sizes
Different device capacities
Different traffic overhead for each partition location
Bin-packing for similar partitions
43
(Figure: packing partitions onto ToR1, S1, S2 with and without considering similarity)
The bin-packing algorithm must be similarity-aware
FFDS algorithm and per-source partitions
44
Devices have 5 spaces
Approximation bound proof
Use the Sindelar et al. algorithm [1] on a binary tree:
Partitions are leaves
Create the tree by joining the most similar nodes
Count the number of devices the fractional placement needs
Run a greedy algorithm on the tree
The fractional placement is a lower bound on any placement if the tree captures 100% of the rule similarity
For per-source partitions and prefix-wildcard rules, no similarity is lost
FFDS is always better than the Sindelar greedy algorithm
45
[1] M. Sindelar, R. K. Sitaram, and P. Shenoy. Sharing-Aware Algorithms for Virtual Machine Colocation. In SPAA, 2011.
Extend to CPU model
46
Traffic-aware refinement
47
Traffic monitoring
No need for VM migration or rule changes
Sources may not know the rules, so we can only get flow statistics
Monitoring implementation choices:
Open vSwitch already has an active-flow cache
MicroTE adds a kernel module
Only report large changes (see the sketch after this slide)
The controller sets a threshold in response to each report
Set thresholds for:
Flows for each rule per source VM and destination machine
Flows per source VM and destination machine
48
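A minimal sketch of the "only report large changes" idea, assuming the hypervisor remembers the last byte count it reported per (source VM, destination machine) aggregate and the controller adjusts the threshold after each report; the class and field names are hypothetical.

```python
class TrafficReporter:
    def __init__(self, threshold_bytes):
        self.threshold = threshold_bytes   # updated by the controller after each report
        self.last_reported = {}            # (src_vm, dst_machine) -> bytes

    def observe(self, key, byte_count):
        """Return a report if this aggregate changed by more than the threshold, else None."""
        previous = self.last_reported.get(key, 0)
        if abs(byte_count - previous) > self.threshold:
            self.last_reported[key] = byte_count
            return {"key": key, "bytes": byte_count}
        return None

# Example: only the second observation crosses the 1 MB threshold and is reported.
reporter = TrafficReporter(threshold_bytes=1_000_000)
print(reporter.observe(("VM2", "S4"), 300_000))    # None
print(reporter.observe(("VM2", "S4"), 1_500_000))  # reported
```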
Traffic overhead
49
Traffic on core links increased by 2.5%
70% of traffic overhead is inside the racks
CPU model and heterogeneous case evaluation
50
Maximum is 48% (68%)
Range is not as good as before because the similarity function changed
FFDS algorithm performance
 
51
Movement Definition
 
52
(Figure: ruleset movements in the average partition size / average partition similarity space)
Moves: add micro rules, remove micro rules, halve rules, extend rules
Always make sure the maximum load on machines does not change, so there is a bound in each direction:
Top: extending a rule reduces the load on the maximum-load machine
Right: all machines have the same maximum load
Left: no micro rule remains on any non-maximum-load machine
Down: halving any rule would increase the load beyond the current maximum
Effect of similarity on traffic overhead
Change similarity by selecting IP addresses that are closer to each other
53
Other results
54
Online scenarios
 
55
Benefit Greedy can escape the local minima of Overhead Greedy
The benefit is not limited to a single special scenario
FAQ
1. What kinds of policies does vCRIB not support?
2. How do you get traffic changes?
3. What is the bandwidth overhead of migrating partitions?
4. How do you ensure consistency during migration?
5. What is the proof of the approximation bound?
6. How do you change the ruleset to explore the space?
7. How does partitioning with replication work with overlapping rules?
56
Slide Note

Today, I talk about how to enforce management policies in a data center with limited resources.

Presentation Transcript


  1. Scalable Rule Management for Data Centers Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan 4/3/2013

  2. Introduction: Definitions Datacenters use rules to implement management policies: access control, rate limiting, traffic measurement, traffic engineering. A rule is an action on a set of ranges on flow fields. Examples of actions: Deny, Accept, Enqueue. Flow fields examples: Src IP / Dst IP, Protocol, Src Port / Dst Port. 2

  3. Introduction: Definitions Datacenters use rules to implement management policies. A rule is an action on a set of ranges on flow fields. Example (Src IP / Dst IP space): R1: Accept (SrcIP: 12.0.0.0/7, DstIP: 10.0.0.0/8); R2: Deny (SrcIP: 12.0.0.0/8, DstIP: 8.0.0.0/6). 3

  4. Current practice Rules are saved on predefined fixed machines, either on hypervisors or on switches (Figure: rules R0, R3, R4 replicated across Agg1/Agg2, ToR1–ToR3, or the servers). 4

  5. Machines have limited resources Top-of-Rack switch Network Interface Card Software switches on servers Motivation Introduction Motivation Design Evaluation 5

  6. Future datacenters will have many fine-grained rules VLAN per server, traffic management (NetLord, SPAIN): 1M rules. Per-flow decisions, flow measurement for traffic engineering (MicroTE, Hedera): 10M – 100M rules. Regulating VM pair communication, access control (CloudPolice), bandwidth allocation (Seawall): 1B – 20B rules. 6

  7. Rule location trade-off (resource vs. bandwidth usage) Storing rules at the hypervisor incurs CPU overhead. 7

  8. Rule location trade-off (resource vs. bandwidth usage) Storing rules at the hypervisor incurs CPU overhead; alternatively, move the rule to the ToR switch and forward the traffic to it. 8

  9. Rule location trade-off: Offload to servers (Figure: offloading rule R1 from a switch to the servers). 9

  10. Challenges: Concrete example (Figure: fat-tree with Agg1/Agg2, ToR1–ToR3, and servers S1–S6 hosting VM0–VM7; rules R0–R6 over the Src IP / Dst IP space). 10

  11. Challenges: Overlapping rules Source Placement: saving rules on the source machine means minimum overhead (Figure: rules R0–R6 in the Src IP / Dst IP space; VM2 and VM6 highlighted). 11

  12. Challenges: Overlapping rules If Source Placement is not feasible (Figure: rules R0–R4 in the Src IP / Dst IP space; VM2 and VM6 highlighted). 12

  13. Challenges Preserve the semantics of overlapping rules Respect resource constraints Heterogeneous devices Minimize traffic overhead Handle Dynamics Traffic changes Rule changes VM Migration Architecture Motivation Introduction Design Evaluation 13

  14. Contribution: vCRIB, a Virtual Cloud Rule Information Base Proactive rule placement abstraction layer. Optimize traffic given resource constraints & changes (Figure: vCRIB takes rules and network state and places R1–R4 across switches and hypervisors). 14

  15. vCRIB design Topology & Routing Source Overlapping Rules Partitioning with Replication Rules Partitions Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 15

  16. Partitioning with replication Smaller partitions have more flexibility. Cutting causes rule inflation (Figure: the rule space with rules R0–R8 cut into partitions P1, P2, P3; rules crossing a cut appear in both partitions). 16

  17. Partitioning with replication Introduce the concept of similarity to mitigate inflation: similarity(P2, P3) = |P2 ∩ P3| = |{R0, R1, R3}| = 3. In the example, P1, P2 and P3 end up with 5 rules each (P2 originally has 7 rules). 17

  18. Per-source partitions Limited resources for forwarding. No need for replication to approximate source placement. Closer partitions are more similar (Figure: rules R0–R6 over the Src IP / Dst IP space). 18

  19. vCRIB design: Placement Topology & Routing Source Partitioning with Replication Rules Partitions Placement T11 Resource Constraints ToR1 T21 T22 T23 T32 T33 Traffic Overhead Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 19

  20. vCRIB design: Placement Topology & Routing Source Partitioning with Replication Rules Partitions Resource-Aware Placement Resource Constraints Feasible Placement Traffic Overhead Traffic Overhead Traffic-Aware Refinement Refinement Traffic-Aware Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 20

  21. FFDS (First Fit Decreasing Similarity) 1. Put a random partition on an empty device 2. Add the most similar partitions to the initial partition until the device is full Find the lower bound for optimal solution for rules Prove the algorithm is a 2-approximation of the lower bound Design Introduction Motivation Evaluation 21

  22. vCRIB design: Heterogeneous resources Topology & Routing Source Partitioning with Replication Rules Partitions Resource-Aware Placement Resource Heterogeneity Resource Usage Function Feasible Placement Traffic-Aware Refinement Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 22

  23. vCRIB design: Traffic-Aware Refinement Topology & Routing Source Partitioning with Replication Rules Partitions Resource-Aware Placement Resource Usage Function Feasible Placement Traffic-Aware Refinement Traffic Overhead Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 23

  24. Traffic-aware refinement Overhead greedy approach 1. Pick the partition with the maximum traffic overhead 2. Place it where it minimizes the overhead while maintaining feasibility (Figure: partitions P2 and P4 placed in the fat-tree relative to VM2 and VM4). 24

  25. Traffic-aware refinement Overhead greedy approach 1. Pick the partition with the maximum traffic overhead 2. Place it where it minimizes the overhead while maintaining feasibility. Problem: local minima. Our approach: Benefit Greedy (Figure: partitions P2 and P4 placed in the fat-tree relative to VM2 and VM4). 25

  26. vCRIB design: Dynamics Topology & Routing Source Partitioning with Replication Rules Partitions Rule/VM Dynamics Resource-Aware Placement Resource Usage Function Feasible Placement Traffic-Aware Refinement Dynamics Major Traffic Changes Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 26

  27. vCRIB design Topology & Routing Source Overlapping Rules Partitioning with Replication Rules Partitions Resource Constraints Rule/VM Dynamics Resource-Aware Placement Resource Heterogeneity Resource Usage Function Feasible Placement Traffic-Aware Refinement Traffic Overhead Major Traffic Changes Dynamics Minimum Traffic Feasible Placement Design Introduction Motivation Evaluation 27

  28. Evaluation Comparing vCRIB vs. Source-Placement Parameter sensitivity analysis Rules in partitions Traffic locality VMs per server Different memory sizes Where is the traffic overhead added? Traffic-aware refinement for online scenarios Heterogeneous resource constraints Switch-only scenarios Evaluation Introduction Motivation Design 28

  29. Simulation setup 1k servers with 20 VMs per server in a fat-tree network. 200k rules generated by ClassBench with random actions. IPs are assigned in two ways: Random and Range. Flow sizes follow a long-tail distribution. Local traffic matrix (0.5 same rack, 0.3 same pod, 0.2 inter-pod). 29

  30. Comparing vCRIB vs. Source-Placement Maximum load is 5K rules and capacity is 4K rules (Random: average load is 4.2K). (Figure: traffic overhead ratio for Range and Random rulesets across server/switch memory configurations 4k_0, 4k_4k, 4k_6k.) vCRIB finds a low-traffic feasible solution. Range is better because similar partitions come from the same source. Adding more resources helps vCRIB reduce the traffic overhead. 30

  31. Parameter sensitivity analysis: Rules in partitions (Figure: average similarity vs. average partition size, both up to 2000; regions where Source Placement and vCRIB are feasible, bounded by the total space.) The feasible region is defined by the maximum load on a server. 31

  32. Parameter sensitivity analysis: Rules in partitions (Figure: the same space, marking where vCRIB and Source Placement add less than 10% traffic overhead.) Lower traffic overhead for smaller partitions and more similar ones. 32

  33. Conclusion and future work Conclusion vCRIB allows operators and users to specify rules, and manages their placement in a way that respects resource constraints and minimizes traffic overhead. Future work Support reactive placement by adding the controller in the loop Break a partition for large number of rules per VM Test for other rulesets Evaluation Introduction Motivation Design 33

  34. Scalable Rule Management for Data Centers Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan 4/3/2013
