XFabric: a Reconfigurable In-Rack Network for Rack-Scale Computers

Sergey Legtchenko, Nicholas Chen, Daniel Cletheroe, Antony Rowstron, Hugh Williams, Xiaohan Zhao
Increasing Performance per $ in Data Centers

  • Hardware designed for data centers
  • Racks as units of deployment & operation
  • Systems on Chip (SoCs)
In-rack Networks for Rack-Scale Computers

Challenge: reducing in-rack network cost

  • ToR switch? 900 ports at full bisection bandwidth (9 Tbps), cost: $$$$$
  • Multi-tiered? >900 ports, high power draw/cost
  • Direct connect topology (e.g. mesh): SoCs with packet switches, low cost but oversubscribed
Oversubscription in Direct-Connect Topologies

  • Multi-hop routing: path length impacts performance
  • Higher, less predictable latency; lower goodput
  • Example: 3D Torus with 512 SoCs, average hop count = 6, i.e. 6x oversubscription

Path length is low if the topology is adapted to the traffic.
XFabric: a Reconfigurable Topology

  • Adapting the topology to traffic: lower path length, reduced oversubscription
  • Circuit-switched fabric: electrical signal forwarding over physical circuits
  • No queuing, no packet inspection
XFabric Architecture

  • Uplinks to the data center aggregation switch
  • Control plane: periodic topology reconfiguration
  • Dynamic uplink placement
Circuit-Switching Fabric Cost

Challenge: high port count
  • Too high a port count for one ASIC (e.g. 300 SoCs, 6 ports/SoC: 1,800 ports)
  • Folded Clos of 30 x 300-port ASICs, total cost: $27K
Reducing Circuit-Switching Fabric Cost

Trading off reconfigurability for cost
  • Full reconfigurability: any 2 ports can be connected
  • Partial reconfigurability: a port can be connected only to a subset of ports (e.g. to port 0 on all SoCs)
XFabric Performance at Rack Scale

Flow-based simulation, 343 SoCs (7x7x7), 6 ports/SoC
  • Varying traffic skew, from skewed to uniform
  • Production cluster workload: traffic matrix from a TCP flow trace
  • Metric: path length (#hops), lower is better
XFabric Prototype Performance

XFabric prototype
  • SoC emulated by server, 27 servers
  • Unmodified TCP/IP applications
  • 6 circuit switches, custom PCB design
  • Gen1: 32 ports @ 1Gbps; Gen2: 160 ports @ 10Gbps
  • Up to 23% completion-time improvement over a 3D Torus, depending on the reconfiguration period
Conclusion

  • Rack-Scale Computers: higher performance per $, up to hundreds of SoCs/rack
  • XFabric: in-rack network with reconfigurable topology
  • Dynamic adaptation to traffic demand, low cost

XFabric is a reconfigurable in-rack network designed to enhance performance and reduce cost in rack-scale computers. It adapts the network topology to the traffic, resulting in lower path lengths and reduced oversubscription. Its circuit-switched fabric forwards electrical signals over physical circuits, eliminating queuing and packet inspection.

  • Reconfigurable Network
  • Rack-Scale Computers
  • XFabric Project
  • Circuit-Switched Fabric
  • Performance Optimization

Uploaded on Sep 22, 2024



Presentation Transcript


  1. XFabric: a Reconfigurable In-Rack Network for Rack-Scale Computers Sergey Legtchenko, Nicholas Chen, Daniel Cletheroe, Antony Rowstron, Hugh Williams, Xiaohan Zhao

  2. Increasing Performance per $ in Data Centers Hardware is increasingly designed for data centers (e.g. Google Jupiter, a data center fabric; Pelican cold storage), with racks as the units of deployment & operation. Systems on Chip (SoCs) integrate a CPU, a NIC/packet switch, IO and memory controllers, and d network ports, enabling in-rack consolidation: a standard rack holds 40 servers (80 CPUs) at $$$/server, an Open CloudServer (OCS) rack holds 80 servers (160 CPUs), and a rack-scale computer like the Boston Viridis (server = Calxeda SoC) packs 900 (wimpy) CPUs at $/server.

  3. In-rack Networks for Rack-Scale Computers Challenge: reducing in-rack network cost for a rack-scale computer (e.g. Boston Viridis, server = Calxeda SoC, 900 wimpy CPUs, d ports/SoC). A ToR switch with 900 ports and full bisection bandwidth (9 Tbps) costs $$$$$; a multi-tiered network needs >900 ports and has high power draw/cost; a direct-connect topology (e.g. mesh) of SoCs with packet switches is low cost but oversubscribed.

  4. Oversubscription in Direct-Connect Topologies Multi-hop routing means path length impacts performance: higher, less predictable latency and lower goodput. [Diagram: a packet from SoC A to SoC D crosses the packet switches of intermediate SoCs.] Example: a 3D Torus with 512 SoCs has an average hop count of 6, i.e. 6x oversubscription. Path length is low if the topology is adapted to the traffic.
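The 6-hop average for a 512-SoC 3D torus can be sanity-checked with a short script. This is an illustration, not from the presentation, assuming shortest-path routing on an 8x8x8 torus:

```python
# Average shortest-path hop count in an n x n x n 3D torus.
# In a ring of size n, the hop distance to the node k steps away is
# min(k, n - k); the torus distance is the sum of the per-dimension
# ring distances.
def ring_dist(k, n):
    return min(k, n - k)

def avg_hops(n):
    total, pairs = 0, 0
    for dx in range(n):
        for dy in range(n):
            for dz in range(n):
                if dx == dy == dz == 0:
                    continue  # skip self-pairs
                total += ring_dist(dx, n) + ring_dist(dy, n) + ring_dist(dz, n)
                pairs += 1
    return total / pairs

print(round(avg_hops(8), 2))  # 8x8x8 torus = 512 SoCs
```

With n = 8 per dimension the mean ring distance is about 2 hops, so the three-dimensional mean comes out to roughly 6 hops, matching the slide.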

  5-8. XFabric: a Reconfigurable Topology (build slides) Adapting the topology to traffic gives lower path length and reduced oversubscription. The circuit-switched fabric forwards electrical signals over physical circuits: no queuing, no packet inspection. [Diagram: SoCs A-D in a rack; the fabric sets up a physical circuit so the A->D traffic travels over a direct circuit.]

  9. XFabric Architecture Uplinks connect the rack to the data center aggregation switch, with dynamic uplink placement. A controller (a process on one SoC in the rack) runs the control plane and periodically reconfigures the topology: it estimates demand, generates a topology that minimizes path length, and configures the data plane (assigns circuits, updates SoC routing).
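The controller's topology-generation step could be sketched as follows. This is a hypothetical greedy illustration, not the paper's actual algorithm: the `generate_topology` function, the `demand` format, and the port-budget handling are all assumptions made for the example.

```python
# Hypothetical sketch of topology generation (NOT the paper's algorithm):
# greedily give direct links to the highest-demand SoC pairs, subject to
# each SoC's port budget.
def generate_topology(demand, num_socs, ports_per_soc):
    # demand: dict mapping (src, dst) pairs to estimated traffic volume
    free_ports = {s: ports_per_soc for s in range(num_socs)}
    links = []
    # Fold directed demand into unordered pairs by combined volume.
    pairs = {}
    for (a, b), v in demand.items():
        key = (min(a, b), max(a, b))
        pairs[key] = pairs.get(key, 0) + v
    # Largest combined demand first; link a pair if both have a free port.
    for (a, b), _ in sorted(pairs.items(), key=lambda kv: -kv[1]):
        if free_ports[a] > 0 and free_ports[b] > 0:
            links.append((a, b))
            free_ports[a] -= 1
            free_ports[b] -= 1
    return links

# Example: 4 SoCs, 2 ports each; heavy A->D traffic gets a direct circuit.
demand = {(0, 3): 100, (0, 1): 10, (1, 2): 5, (2, 3): 1}
print(generate_topology(demand, num_socs=4, ports_per_soc=2))
# prints [(0, 3), (0, 1), (1, 2), (2, 3)]
```

A real controller would also need to fill remaining free ports so the topology stays connected for low-demand pairs, and to respect which port pairs the partially reconfigurable fabric can actually connect.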

  10-11. Circuit-Switching Fabric Cost Commodity ASICs offer 160 ports @ 10 Gbps (max size ~350 ports) at a cost of ~$3/port. The challenge is port count, too high for one ASIC: e.g. 300 SoCs with 6 ports/SoC need 1,800 ports. A fully reconfigurable folded Clos requires 30 x 300-port ASICs, for a total cost of $27K.

  12-15. Reducing Circuit-Switching Fabric Cost XFabric trades off reconfigurability for cost. Full reconfigurability: any 2 ports can be connected. Partial reconfigurability: a port can be connected only to a subset of ports, e.g. port 0 (respectively port N) on a SoC can reach port 0 (respectively port N) on all other SoCs. This needs only 6 x 300-port ASICs, a 5x lower cost compared to full reconfigurability: $5.4K.
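The cost figures on slides 10-15 follow from simple arithmetic; a minimal check, assuming the slides' $3/port price and 300-port ASICs:

```python
# Back-of-the-envelope check of the fabric costs from the slides.
PORT_COST = 3          # dollars per ASIC port (slide 10)
ASIC_PORTS = 300
socs, ports_per_soc = 300, 6

fabric_ports = socs * ports_per_soc                    # 1,800 ports

# Fully reconfigurable folded Clos: 30 x 300-port ASICs (slide 11).
full_asics = 30
full_cost = full_asics * ASIC_PORTS * PORT_COST        # $27,000

# Partially reconfigurable: one ASIC per SoC port index (slide 15).
partial_asics = ports_per_soc                          # 6 ASICs
partial_cost = partial_asics * ASIC_PORTS * PORT_COST  # $5,400

print(fabric_ports, full_cost, partial_cost, full_cost // partial_cost)
```

The ASIC counts (30 for the folded Clos, 6 for the partial design) are taken from the slides; at $3/port the partial fabric comes out 5x cheaper.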

  16. XFabric Performance at Rack Scale Flow-based simulation, 343 SoCs, 6 ports/SoC

  17. XFabric Performance at Rack Scale Flow-based simulation, 343 SoCs (7x7x7), 6 ports/SoC, with varying traffic skew. [Chart: path length (#hops, 0-6) vs. skew (cluster size 2 to 128, from skewed to uniform) for 3D Torus, Random, XFabric, and Fully reconfigurable.]

  18. XFabric Performance at Rack Scale Flow-based simulation, 343 SoCs (7x7x7), 6 ports/SoC, with varying traffic skew and a production cluster workload (traffic matrix from a TCP flow trace). [Charts: path length (#hops, lower is better) under varying skew and under the production workload, for 3D Torus, Random, XFabric, and Fully reconfigurable.]

  19. XFabric Prototype Performance The XFabric prototype emulates each SoC with a server: 27 servers, each with 6 NICs, a software packet switch, and a filter driver beneath unmodified TCP/IP applications; 6 circuit switches on a custom PCB design. Gen1: 32 ports @ 1Gbps; Gen2: 160 ports @ 10Gbps.

  20. XFabric Prototype Performance [Chart: completion time, normalized to 3D Torus, vs. reconfiguration period (0.1 to 1000 sec); XFabric achieves up to a 23% improvement over the 3D Torus.]

  21. Conclusion Rack-scale computers offer higher performance per $ with up to hundreds of SoCs per rack. XFabric is an in-rack network with a reconfigurable topology: dynamic adaptation to traffic demand at low cost. Deploying new circuit-switch hardware: electrical circuit switching, 160 ports @ 10Gbps.
