Cloud Load Balancing Overview and Requirements

Ananta: Cloud Scale Load Balancing
Presenter: Donghwi Kim
Background: Datacenter
Each server has a hypervisor and VMs.
Each VM is assigned a Direct IP (DIP).
Each service has zero or more external end-points.
Each service is assigned one Virtual IP (VIP).
Background: Datacenter
Each datacenter hosts many services.
A service may work with:
another service in the same datacenter,
another service in a different datacenter,
or a client over the internet.
Background: Load-balancer
The load balancer is the entrance to a server pool.
It distributes the workload across the worker servers.
It hides the server pool from clients with a network address translator (NAT).
Inbound VIP Communication
The load balancer does destination address translation (DNAT):
a packet from the Internet arrives as (src: Client, dst: VIP) and is forwarded to one of the front-end VMs as (src: Client, dst: DIP).
Outbound VIP Communication
The load balancer does source address translation (SNAT):
when a VM of one service connects to another service's VIP, its source address is rewritten from its DIP to its own service's VIP (e.g. src: DIP2, dst: VIP2 becomes src: VIP1, dst: VIP2).
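As a rough illustration of the two rewrites above (not Ananta's actual code), the NAT step boils down to swapping one address field per direction. The mappings and field names below are made-up placeholders for the sketch.

```python
# Minimal sketch of the DNAT/SNAT rewrites described above (illustrative only,
# not Ananta's implementation). A packet is modeled as a dict of addresses.
import random

VIP_TO_DIPS = {"VIP1": ["DIP1", "DIP2", "DIP3"]}   # hypothetical VIP config
DIP_TO_VIP = {"DIP1": "VIP1", "DIP2": "VIP1", "DIP3": "VIP1"}

def dnat_inbound(pkt):
    """Inbound: rewrite dst from the service VIP to one of its DIPs."""
    pkt = dict(pkt)
    pkt["dst"] = random.choice(VIP_TO_DIPS[pkt["dst"]])
    return pkt

def snat_outbound(pkt):
    """Outbound: rewrite src from the VM's DIP to its service's VIP."""
    pkt = dict(pkt)
    pkt["src"] = DIP_TO_VIP[pkt["src"]]
    return pkt

print(dnat_inbound({"src": "Client", "dst": "VIP1"}))  # dst becomes a DIP
print(snat_outbound({"src": "DIP2", "dst": "VIP2"}))   # src becomes VIP1
```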
State of the Art
A load balancer is a hardware device:
expensive, slow failover, no scalability.
Cloud Requirements
Scale: the requirement is ~40 Tbps of throughput using 400 servers and 100 Gbps for a single VIP; the state of the art offers 20 Gbps for $80,000 and up to 20 Gbps per VIP.
Reliability: the requirement is N+1 redundancy and quick failover; the state of the art offers 1+1 redundancy or slow failover.
Cloud Requirements
Any service anywhere: servers and LB/NAT should be placeable across L2 boundaries; with the state of the art, NAT is supported only within the same L2 domain.
Tenant isolation: an overloaded or abusive tenant must not affect other tenants; with the state of the art, excessive SNAT from one tenant causes a complete outage.
Ananta
SDN
SDN: managing a flexible data plane (switches) via a centralized control plane (controller).
Breaking down the load-balancer's functionality
Control plane: VIP configuration, monitoring.
Data plane: destination/source selection, address translation.
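The VIP configuration mentioned above essentially maps each service VIP to the set of DIPs behind it. A minimal sketch of such a record follows; the field names are assumptions for illustration, not Ananta's schema.

```python
# Hypothetical sketch of a VIP configuration record pushed by the control
# plane to the data plane. Field names are assumed, not Ananta's schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class VipConfig:
    vip: str                                        # the service's Virtual IP
    dips: List[str] = field(default_factory=list)   # Direct IPs of its VMs
    protocol: str = "tcp"
    port: int = 80

config = VipConfig(vip="10.0.0.1", dips=["192.168.1.10", "192.168.1.11"])
```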
Design
Ananta Manager: source selection; not scalable (like an SDN controller).
Multiplexer (Mux): destination selection.
Host Agent: address translation; resides in each server's hypervisor.
Data plane
1st tier (Router): packet-level load spreading via ECMP.
2nd tier (Multiplexer): connection-level load spreading, destination selection.
3rd tier (Host Agent): stateful NAT.
(Figure: routers spread packets destined to VIP1/VIP2 across the Muxes, which forward them toward DIP1/DIP2/DIP3 on the hosts.)
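To make "connection-level load spreading" concrete, here is a minimal sketch (my own illustration, not Ananta's code) of a Mux hashing a connection's 5-tuple to pick a DIP, so every packet of the same connection reaches the same VM; the 1st-tier ECMP routers spread packets across Muxes by hashing the header in the same spirit.

```python
# Illustrative sketch of 2nd-tier behavior: pick a DIP per *connection* by
# hashing the 5-tuple, so all packets of a flow land on the same VM.
# (Not Ananta's implementation; the VIP-to-DIP mapping below is made up.)
import hashlib

VIP_TO_DIPS = {"VIP1": ["DIP1", "DIP2", "DIP3"]}

def select_dip(vip, five_tuple):
    """five_tuple = (src_ip, src_port, dst_ip, dst_port, proto)."""
    dips = VIP_TO_DIPS[vip]
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return dips[int.from_bytes(digest[:4], "big") % len(dips)]

# The same connection always maps to the same DIP:
flow = ("1.2.3.4", 51000, "VIP1", 80, "tcp")
assert select_dip("VIP1", flow) == select_dip("VIP1", flow)
```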
Inbound connections
A client packet (src: CLI, dst: VIP) is spread by the 1st-tier routers to one of the Muxes via ECMP.
The Mux selects a DIP and tunnels the packet to that host (outer header src: MUX, dst: DIP).
The Host Agent decapsulates it and delivers it to the VM as (src: CLI, dst: DIP).
The return packet (src: DIP, dst: CLI) is rewritten by the Host Agent to (src: VIP, dst: CLI) and sent back to the client without passing through a Mux.
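A rough sketch of the Host Agent's side of this exchange (illustrative only, with made-up field names): it remembers, per flow, which VIP the connection arrived on so the reply can be rewritten back.

```python
# Illustrative sketch of the Host Agent's stateful NAT for inbound VIP
# traffic (not Ananta's code). State is keyed by the connection's addresses.

nat_state = {}  # (client, client_port, dip, server_port) -> vip

def inbound(pkt, vip):
    """Decapsulated packet from a Mux: record the flow, deliver to the VM."""
    key = (pkt["src"], pkt["sport"], pkt["dst_dip"], pkt["dport"])
    nat_state[key] = vip
    return {**pkt, "dst": pkt["dst_dip"]}          # dst is now the DIP

def outbound_reply(pkt):
    """Reply from the VM: rewrite src DIP back to the VIP it was reached on."""
    key = (pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"])
    vip = nat_state.get(key)
    return {**pkt, "src": vip} if vip else pkt     # src is now the VIP

pkt_in = {"src": "CLI", "sport": 40000, "dst_dip": "DIP1", "dport": 80}
inbound(pkt_in, vip="VIP1")
reply = {"src": "DIP1", "sport": 80, "dst": "CLI", "dport": 40000}
print(outbound_reply(reply))                       # src rewritten to VIP1
```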
Outbound (SNAT) connections
A VM sends a packet (src: DIP:555, dst: SVR:80). The Host Agent has no VIP port for this flow yet ("Port??"), so it asks Ananta Manager for one.
Ananta Manager allocates port 777 and programs the mapping "VIP:777 to DIP" into the Muxes.
The packet then leaves as (src: VIP:777, dst: SVR:80).
The return packet (src: SVR:80, dst: VIP:777) reaches a Mux, which forwards it to the DIP, and the Host Agent rewrites it to (src: SVR:80, dst: DIP:555) for the VM.
Reducing Load of Ananta Manager
Optimizations:
Batching: allocate 8 ports at a time instead of one.
Pre-allocation: 160 ports per VM.
Demand prediction: consider recent request history.
As a result, less than 1% of outbound connections ever hit Ananta Manager, and SNAT request latency is reduced.
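A minimal sketch of how batching and pre-allocation cut round trips to the manager; the constants 8 and 160 come from the slide, everything else (names, port range) is assumed for illustration.

```python
# Illustrative sketch of SNAT port handling with the optimizations above
# (batch of 8, 160 pre-allocated ports per VM). Not Ananta's implementation.

BATCH = 8
PREALLOC = 160

class ManagerStub:
    """Stands in for Ananta Manager: hands out contiguous port batches."""
    def __init__(self):
        self.next_port = 1024
    def allocate(self, n):
        ports = list(range(self.next_port, self.next_port + n))
        self.next_port += n
        return ports

class HostAgentPorts:
    def __init__(self, manager):
        self.manager = manager
        self.free = manager.allocate(PREALLOC)   # pre-allocation at VM start
    def get_port(self):
        if not self.free:                        # only then hit the manager
            self.free = self.manager.allocate(BATCH)
        return self.free.pop()

agent = HostAgentPorts(ManagerStub())
ports = [agent.get_port() for _ in range(3)]     # served locally, no manager call
```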
VIP traffic in a datacenter
A large portion of the traffic that goes through the load balancer is intra-DC: roughly 70% of VIP traffic stays inside the datacenter, 16% is inter-DC, and 14% goes to the Internet (VIP traffic is about 44% of total traffic, DIP traffic 56%).
The next four steps show Fastpath, Ananta's optimization for intra-DC VIP traffic (a sketch follows the list).
Step 1: Forward Traffic
The source VM (DIP1, behind VIP1) sends data packets to VIP2; a Mux forwards them to the destination DIP2.
Step 2: Return Traffic
Return packets from DIP2 likewise travel back through a Mux for VIP1.
Step 3: Redirect Messages
The Muxes send redirect messages to both host agents, revealing the DIPs behind the two VIPs.
Step 4: Direct Connection
Subsequent data packets flow directly between DIP1 and DIP2, bypassing the Muxes.
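A toy sketch of the redirect idea (my illustration, not Ananta's protocol): once a host agent learns the peer's DIP from a redirect, it sends later packets directly instead of via a Mux.

```python
# Toy sketch of Fastpath redirection at a host agent (illustrative only).
# After a redirect arrives, traffic to that VIP is sent straight to the DIP.

direct_routes = {}   # vip -> dip learned from Mux redirect messages

def on_redirect(vip, dip):
    """Mux tells the host agent which DIP is behind this VIP connection."""
    direct_routes[vip] = dip

def next_hop(dst_vip):
    """Before the redirect: via a Mux. After: directly to the peer DIP."""
    return direct_routes.get(dst_vip, "MUX")

assert next_hop("VIP2") == "MUX"      # Steps 1-2: traffic goes through a Mux
on_redirect("VIP2", "DIP2")           # Step 3: redirect message
assert next_hop("VIP2") == "DIP2"     # Step 4: direct DIP-to-DIP connection
```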
SNAT Fairness
Ananta Manager is not scalable: more VMs mean more SNAT requests and more resources.
To keep one tenant from monopolizing it, pending SNAT requests are held per DIP (at most one per DIP), feed per-VIP queues, and a global queue dequeues from the VIP queues round-robin into a thread pool (see the sketch below).
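An illustrative sketch of that per-VIP round-robin dequeue (not Ananta's code): each VIP has its own queue of pending SNAT requests, and the manager serves the VIP queues one request per round.

```python
# Round-robin dequeue of pending SNAT requests across per-VIP queues
# (illustrative sketch only).
from collections import deque

vip_queues = {"VIP1": deque(), "VIP2": deque()}

def enqueue(vip, dip_request):
    vip_queues[vip].append(dip_request)

def round_robin_dequeue():
    """Yield requests fairly across VIPs, one per VIP per round."""
    while any(vip_queues.values()):
        for vip, q in vip_queues.items():
            if q:
                yield vip, q.popleft()

enqueue("VIP1", "DIP1"); enqueue("VIP1", "DIP2"); enqueue("VIP2", "DIP3")
print(list(round_robin_dequeue()))
# [('VIP1', 'DIP1'), ('VIP2', 'DIP3'), ('VIP1', 'DIP2')]
```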
Packet Rate Fairness
Each Mux keeps track of its top-talkers (the VIPs with the highest packet rates).
When packet drops happen, Ananta Manager withdraws the topmost top-talker from all Muxes.
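An illustrative sketch of top-talker handling (not Ananta's code): count packets per VIP, and when drops occur withdraw the biggest talker so other VIPs are unaffected.

```python
# Sketch of top-talker tracking on a Mux (illustrative only).
from collections import Counter

packet_counts = Counter()
withdrawn = set()

def on_packet(vip):
    if vip in withdrawn:
        return False                  # VIP has been withdrawn from this Mux
    packet_counts[vip] += 1
    return True

def on_packet_drop():
    """Ananta Manager's reaction: withdraw the topmost top-talker."""
    if packet_counts:
        top_vip, _ = packet_counts.most_common(1)[0]
        withdrawn.add(top_vip)
        del packet_counts[top_vip]

for _ in range(1000): on_packet("VIP_heavy")
for _ in range(10):   on_packet("VIP_normal")
on_packet_drop()                      # withdraws VIP_heavy; VIP_normal unaffected
```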
Reliability
When Ananta Manager fails: Paxos provides fault tolerance by replication (typically 5 replicas).
When a Mux fails: the 1st-tier routers detect the failure via BGP and stop sending traffic to that Mux.
Evaluation
Impact of Fastpath
Experiment:
One 20-VM tenant as the server.
Two 10-VM tenants as clients.
Each VM sets up 10 connections and uploads 1 MB of data.
With Fastpath enabled, CPU utilization drops substantially on both the hosts and the Muxes.
Ananta Manager's SNAT latency
Ananta Manager's port allocation latency over a 24-hour observation period.
SNAT Fairness
Normal users (N) make 150 outbound connections per minute.
A heavy user (H) keeps increasing its outbound connection rate.
Observing SYN retransmits and SNAT latency shows that normal users are not affected by the heavy user.
Overall Availability
Average availability over a month: 99.95%.
Summary
How Ananta meets the cloud requirements:
Scale: the Mux relies on ECMP; Host Agents scale out naturally.
Reliability: Ananta Manager uses Paxos; Muxes rely on BGP.
Any service anywhere: Ananta works at layer 4 (the transport layer).
Tenant isolation: SNAT fairness and packet rate fairness.
Discussion
Ananta may lose some connections when it recovers from a Mux failure, because there is no way to copy a Mux's internal state (its per-flow 5-tuple to DIP mappings): a replacement Mux behind the 1st-tier routers starts with an empty table for existing TCP flows.
Discussion
Detection of a Mux failure takes up to 30 seconds (the BGP hold timer). Why not use additional health monitoring?
Fastpath does not preserve the order of packets.
Passing through a software component (the Mux) may increase connection-establishment latency,* and Fastpath does not relieve this.
The scale of the evaluation is small (e.g. 2.5 Gbps of bandwidth, not Tbps). Another paper argues that Ananta would require about 8,000 Muxes to cover a mid-size datacenter.*
*DUET: Cloud Scale Load Balancing with Hardware and Software, SIGCOMM '14
Thanks! Any questions?
Lessons learnt
Centralized controllers work:
there are significant challenges in doing per-flow processing (e.g. SNAT), but they provide an overall more reliable and easier-to-manage system.
Co-location of the control plane and data plane provides faster local recovery:
fate sharing eliminates the need for a separate, highly available management channel.
Protocol semantics are violated on the Internet:
bugs in external code forced us to change the network MTU.
Owning our own software has been a key enabler for:
faster turn-around on bugs, DoS detection, flexibility to design new features, and better monitoring and management.
Backup: ECMP
Equal-Cost Multi-Path routing: hash the packet header and choose one of the equal-cost paths.
 
Backup: SEDA
Backup: SNAT
VIP traffic in a data center
CPU usage of Mux
CPU usage over a typical 24-hour period by 14 Muxes in a single Ananta instance.
Remarkable Points
Ananta is the first middlebox architecture that moves part of its functionality to the hosts.
It has been deployed in and serving Microsoft datacenters for more than 2 years.

