High Performance User-Level Sockets over Gigabit Ethernet

Pavan Balaji
Ohio State University
balaji@cis.ohio-state.edu

Piyush Shivam
Ohio State University
shivam@cis.ohio-state.edu

D.K. Panda
Ohio State University
panda@cis.ohio-state.edu

Pete Wyckoff
Ohio Supercomputer Center
pw@osc.edu
Presentation Overview
 Background and Motivation
 Design Challenges
 Performance Enhancement Techniques
 Performance Results
 Conclusions
Background and Motivation
Sockets
Frequently used API
Traditional Kernel-Based Implementation
Unable to exploit High Performance Networks
Earlier Solutions
Interrupt Coalescing
Checksum Offload
Insufficient
It gets worse with 10 Gigabit Networks
Can we do better?
User-level support
Kernel Based Implementation of
Sockets
(Figure: protocol stack, Application or Library → Sockets → TCP → IP → NIC, spanning user space, kernel, and hardware)
 Pros
 High Compatibility
 Cons
 Kernel Context Switches
 Multiple Copies
 CPU Resources
Alternative Implementations of
Sockets (GigaNet cLAN)
(Figure: protocol stack, Application or Library → Sockets → TCP → IP → IP-to-VI layer → “VI aware” NIC, spanning user space, kernel, and hardware)
 Pros
 High Compatibility
 Cons
 Kernel Context Switches
 Multiple Copies
 CPU Resources
Sockets over User-Level Protocols
Sockets is a generalized protocol
Sockets over VIA
Developed by Intel Corporation [shah98] and ET Research
Institute [sovia01]
GigaNet cLAN platform
Most networks in the world are Ethernet
Gigabit Ethernet
Backward compatible
Gigabit Network over the existing installation base
MVIA: Version of VIA on Gigabit Ethernet
Kernel Based
A need for a High Performance Sockets layer over
Gigabit Ethernet
User-Level Protocol over Gigabit
Ethernet
Ethernet Message Passing (EMP)
Protocol
Zero-Copy OS-Bypass NIC-driven User-Level
protocol over Gigabit Ethernet
Developed over the Dual-processor Alteon NICs
Complete Offload of message passing functionality to
the NIC
 Piyush Shivam, Pete Wyckoff, D.K. Panda, “EMP: Zero-Copy OS-bypass NIC-driven Gigabit Ethernet Message Passing”, Supercomputing, November ’01
 Piyush Shivam, Pete Wyckoff, D.K. Panda, “Can User-Level Protocols take advantage of Multi-CPU NICs?”, IPDPS, April ’02
EMP: Latency
A base latency of 28 µs compared to ~120 µs for TCP for 4-byte messages
EMP: Bandwidth
Saturated the Gigabit Ethernet network with a peak bandwidth of 964Mbps
Proposed Solution
(Figure: Application or Library → Sockets over EMP → EMP Library → Gigabit Ethernet NIC, with an OS Agent in the kernel, spanning user space, kernel, and hardware)
 Kernel Context Switches
 Multiple Copies
 CPU Resources
 High Performance
Presentation Overview
 Background and Motivation
 Design Challenges
 Performance Enhancement Techniques
 Performance Results
 Conclusions
Design Challenges
Functionality Mismatches
Connection Management
Message Passing
Resource Management
UNIX Sockets
Functionality Mismatches and
Connection Management
Functionality Mismatches
No API for buffer advertising in TCP
Connection Management
Data Message Exchange
Descriptors required for connection
management
Message Passing
Data Streaming
Parts of the same message can potentially be read into different buffers
Unexpected Message Arrivals
Separate Communication Thread
Keeps track of used descriptors and re-posts
Polling Threads have high Synchronization cost
Sleeping Threads involve OS scheduling granularity
Rendezvous Approach
Eager with Flow Control
Rendezvous Approach
(Figure: sender and receiver send/receive queues; send() posts a Request, the receiver's receive() returns an ACK advertising its buffer, and the Data follows)
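The rendezvous exchange reduces to three steps: the sender advertises only the message length, the receiver acknowledges once the application has posted its buffer, and only then does the data move, landing directly in that buffer. The minimal C sketch below simulates the three steps in one process with toy one-slot channels; every name in it (struct msg, send_request, recv_post_and_ack, and so on) is illustrative and is not the actual sockets-over-EMP code.

```c
#include <stdio.h>
#include <string.h>

/* Toy single-slot "channels" standing in for the NIC send/receive queues. */
enum msg_type { REQUEST, ACK, DATA };
struct msg { enum msg_type type; size_t len; char payload[64]; };
static struct msg to_receiver, to_sender;

/* send(), step 1: advertise only the length (the Request). */
static void send_request(size_t len)
{
    to_receiver = (struct msg){ REQUEST, len, "" };
}

/* receive(), step 1: the application has posted its buffer, so the
 * receiver acknowledges that it is ready to take the data. */
static size_t recv_post_and_ack(void)
{
    size_t len = to_receiver.len;               /* Request seen  */
    to_sender = (struct msg){ ACK, len, "" };   /* ACK goes back */
    return len;
}

/* send(), step 2: the ACK arrived, so the Data can go out; with real
 * descriptors it lands directly in the user buffer, no staging copy. */
static void send_data(const char *buf, size_t len)
{
    if (to_sender.type != ACK)
        return;                                 /* still waiting for the ACK */
    struct msg d = { DATA, len, "" };
    memcpy(d.payload, buf, len);
    to_receiver = d;
}

/* receive(), step 2: the data is delivered into the user buffer. */
static size_t recv_data(char *buf)
{
    memcpy(buf, to_receiver.payload, to_receiver.len);
    return to_receiver.len;
}

int main(void)
{
    const char *text = "rendezvous payload";
    char user_buf[64];

    send_request(strlen(text) + 1);             /* sender: Request  */
    size_t len = recv_post_and_ack();           /* receiver: ACK    */
    send_data(text, len);                       /* sender: Data     */
    recv_data(user_buf);                        /* receiver: done   */
    printf("received: %s\n", user_buf);
    return 0;
}
```

The cost of this approach is the extra control round trip on every send, which is what motivates the eager scheme on the next slide.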
Eager with Flow Control
(Figure: sender and receiver send/receive queues; send() pushes Data eagerly into pre-posted receive descriptors, ACKs flow back to the sender, and receive() picks the Data up)
Resource Management and UNIX
Sockets
Resource Management
Clean up unused descriptors (connection
management)
Free registered memory
UNIX Sockets
Function Overriding
Application Changes
File Descriptor Tracking
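One standard way to realize the "Function Overriding" item without application changes is a preloaded shim that intercepts the libc socket calls and redirects only the descriptors the library owns. The talk does not spell out the mechanism, so the sketch below is a hedged illustration: fd_is_emp() and emp_stream_send() are hypothetical stand-ins for the library's descriptor tracking and user-level send path, stubbed here so the file builds on its own.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>

/* Stub bookkeeping: a real library would mark the fds it promoted to EMP. */
static int fd_is_emp(int fd) { (void)fd; return 0; }
static ssize_t emp_stream_send(int fd, const void *buf, size_t len)
{ (void)fd; (void)buf; return (ssize_t)len; }

/* Overridden write(): EMP-backed descriptors take the user-level path;
 * files, pipes, and ordinary sockets fall through to the real libc call. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);
    if (!real_write)
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");  /* next definition = libc */

    if (fd_is_emp(fd))
        return emp_stream_send(fd, buf, count);      /* no kernel involvement  */
    return real_write(fd, buf, count);
}
```

Built as a shared object (for example, gcc -shared -fPIC -o override.so override.c -ldl, with the file name purely illustrative) and activated with LD_PRELOAD, a shim of this kind needs no application changes, which is exactly why per-descriptor tracking appears as its own design item.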
Presentation Overview
 Background and Motivation
 Design Challenges
 Performance Enhancement Techniques
 Performance Results
 Conclusions
Performance Enhancement
Techniques
Credit Based Flow Control
Disabling Data Streaming
Delayed Acknowledgments
EMP Unexpected Queue
Credit Based Flow Control
(Figure: the sender spends one credit per eager message, Credits Left going 4, 3, 2, 1, 0, then back to 4 when the receiver's ACK returns them)
 Multiple Outstanding Credits
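The credit accounting in the figure fits in a few lines: each eager send spends one credit against the receiver's pre-posted buffers, a send with zero credits must stall (or queue), and an acknowledgment from the receiver hands the spent credits back. The toy C program below replays the 4-3-2-1-0-4 sequence from the slide; it is a self-contained simulation of the idea, not the library's actual code.

```c
#include <stdio.h>

#define CREDITS 4

static int credits = CREDITS;   /* sender's view of free receiver buffers */
static int pending_acks;        /* messages consumed but not yet acked    */

/* Sender: an eager send needs a credit; with none left it must stall
 * (a real implementation would queue the message or switch strategy). */
static int eager_send(int msg_id)
{
    if (credits == 0) {
        printf("msg %d: no credits left, sender stalls\n", msg_id);
        return 0;
    }
    credits--;
    printf("msg %d sent eagerly, Credits Left: %d\n", msg_id, credits);
    return 1;
}

/* Receiver: message delivered to the user, descriptor re-posted. */
static void receiver_consume(void) { pending_acks++; }

/* Receiver's ACK: all pending credits travel back in one message. */
static void receiver_ack(void)
{
    credits += pending_acks;
    printf("ACK returns %d credits, Credits Left: %d\n", pending_acks, credits);
    pending_acks = 0;
}

int main(void)
{
    for (int i = 1; i <= 5; i++)        /* the fifth send finds 0 credits */
        if (eager_send(i))
            receiver_consume();
    receiver_ack();                      /* Credits Left back to 4        */
    return 0;
}
```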
Non-Data Streaming and
Delayed Acknowledgments
Disabling Data Streaming
Intermediate copy required for Data Streaming
Place data directly into user buffer
Delayed Acknowledgments
Increase in Bandwidth
Less Network Traffic
NIC has less work to do
Decrease in Latency
Fewer descriptors posted
Less Tag Matching at the NIC
550 ns per descriptor
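Delaying acknowledgments means the receiver does not return credits after every message but batches them, so fewer ACK descriptors are posted and matched at the NIC. The exact trigger is not stated in the talk; the sketch below assumes, purely for illustration, one ACK once half of the credit window has been consumed.

```c
#include <stdio.h>

#define CREDITS       4
#define ACK_THRESHOLD (CREDITS / 2)   /* assumption: one ACK per 2 messages */

static int unacked;                    /* messages delivered but not acked  */

/* Called by the receiver each time a message reaches the application. */
static void maybe_ack(void)
{
    if (++unacked < ACK_THRESHOLD)
        return;                        /* stay off the wire: less traffic,
                                          fewer descriptors for the NIC     */
    printf("ACK carrying %d credits\n", unacked);
    unacked = 0;
}

int main(void)
{
    for (int i = 0; i < 6; i++)        /* 6 messages produce only 3 ACKs    */
        maybe_ack();
    return 0;
}
```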
EMP Unexpected Queue
EMP features an unexpected message queue
Advantage: Last to be checked
Disadvantage: Data Copy
Acknowledgments in the Unexpected
Queue
No copy, since acknowledgments carry no data
Acknowledgments pushed out of the critical
path
Presentation Overview
 Background and Motivation
 Design Challenges
 Performance Enhancement Techniques
 Performance Results
 Conclusions
Performance Results
Micro-benchmarks
Latency (ping-pong)
Bandwidth
FTP Application
Web Server
HTTP/1.0 Specifications
HTTP/1.1 Specifications
Experimental Test-bed
Four Pentium III 700 MHz Quads
1GB Main Memory
Alteon NICs
Packet Engine Switch
Linux version 2.4.18
Micro-benchmarks: Latency
 Up to 4 times improvement compared to TCP
 Overhead of 0.5 µs compared to EMP
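For reference, a ping-pong latency micro-benchmark of this kind simply bounces a small message back and forth and reports half the averaged round-trip time. The self-contained C sketch below shows the measurement loop; it uses a local socketpair only so it can run anywhere, whereas the numbers in the talk are for 4-byte messages between two machines over Gigabit Ethernet.

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/wait.h>

#define ITERS 10000
#define MSG   4                                /* 4-byte messages */

int main(void)
{
    int sv[2];
    char buf[MSG] = "ping";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return 1;
    }

    if (fork() == 0) {                         /* child: echo server */
        close(sv[0]);
        for (int i = 0; i < ITERS; i++) {
            if (read(sv[1], buf, MSG) != MSG) _exit(1);
            if (write(sv[1], buf, MSG) != MSG) _exit(1);
        }
        _exit(0);
    }

    close(sv[1]);
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {          /* parent: ping, wait for pong */
        if (write(sv[0], buf, MSG) != MSG) return 1;
        if (read(sv[0], buf, MSG) != MSG) return 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    wait(NULL);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("average one-way latency: %.2f us\n", us / (2.0 * ITERS));
    return 0;
}
```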
Micro-benchmarks: Bandwidth
 An improvement of 53% compared to enhanced TCP
FTP Application
 Up to 2 times improvement compared to TCP
Web Server (HTTP/1.0)
 Up to 6 times improvement compared to TCP
Web Server (HTTP/1.1)
 Up to 3 times improvement compared to TCP
Conclusions
Developed a High Performance User-Level
Sockets implementation over Gigabit Ethernet
Latency close to base EMP (28 µs)
28.5 µs for Non-Data Streaming
37 µs for Data Streaming sockets
4 times improvement in latency compared to TCP
Peak Bandwidth of 840Mbps
550Mbps obtained by TCP with increased Registered space
for the kernel (up to 2MB)
Default case is 340Mbps with 32KB
Improvement of 53%
Conclusions (contd.)
FTP Application shows an improvement of
nearly 2 times
Web Server shows tremendous
performance improvement
HTTP/1.0 shows an improvement of up to 6 times
HTTP/1.1 shows an improvement of up to 3 times
Future Work
Dynamic Credit Allocation
NIC: The trusted component
Integrated QoS
Currently on Myrinet Clusters
Commercial applications in the Data
Center environment
Extend the idea to next generation
interconnects
InfiniBand
10 Gigabit Ethernet
Thank You
For more information, please visit the NBC home page:
http://nowlab.cis.ohio-state.edu
Network Based Computing Laboratory, The Ohio State University
Slide Note

Good Morning! I’m Pavan Balaji from The Ohio State University. I’ll be presenting our group’s recent work titled “High Performance User-Level Sockets over Gigabit Ethernet”. This work has been done in collaboration with the Ohio Supercomputer Center.
