Asynchronous Zero-copy Communication for Synchronous Sockets in the Sockets Direct Protocol over InfiniBand

P. Balaji, S. Bhagvat, H.-W. Jin and D. K. Panda
Network Based Computing Laboratory (NBCL)
Computer Science and Engineering
Ohio State University

04/25/06
Pavan Balaji (The Ohio State University)

InfiniBand Overview

An emerging industry standard
High Performance
  Low latency (about 2 us)
  High throughput (8 Gbps, 16 Gbps and higher)
Advanced Features
  Hardware-offloaded protocol stack
  Kernel bypass: direct access to the network for applications
  RDMA operations: direct access to remote memory

Sockets Direct Protocol (SDP)

High-performance alternative to TCP/IP sockets for IB, etc.
Hijack and redirect socket calls
  Application transparent
  Binary compatibility (most cases)
Utilizes IB capabilities
  Offloaded protocol
  RDMA operations
  Kernel bypass

[Figure: Apps #1 to #N sit on the Sockets Interface, which is served either by Traditional Sockets (TCP, IP, Device Driver) or by the Sockets Direct Protocol (Offloaded Protocol, Advanced Features), both over the High-speed Network]

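Sockets APIs Supported by SDP

                           Synchronous Sockets   Asynchronous Sockets   Extended Sockets (OSU Specific)*
Communication              Synchronous           Asynchronous           Asynchronous
Operations Outstanding     At most one           More than one          More than one
SDP Implementations        BSDP, ZSDP, AZ-SDP    BSDP, ZSDP             BSDP, ZSDP
Existing Applications      Most                  Few                    Very few
Potential for Performance  Limited               High                   High

(Portions of this table have been borrowed from Mellanox Technologies)
* RAIT05: "Supporting iWARP compatibility and features for regular network adapters". P. Balaji, H.-W. Jin, K. Vaidyanathan and D. K. Panda. RAIT Workshop; in conjunction with Cluster '05.
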
Presentation Layout

§ Introduction and Background
§ Understanding Asynchronous Zero-copy SDP
§ Design Issues in AZ-SDP
§ Performance Evaluation
§ Conclusions and Future Work

Buffer-copy SDP (BSDP)

Several buffer-copy based implementations of SDP exist
  OSU, Mellanox, Voltaire
HCA offloads transport and network layers
Copy overhead still present

[Figure: at the Data Source, data is copied from the App Buffer into SDP Buffers and sent as SDP Data Messages; at the Data Sink, it is copied from SDP Buffers into the App Buffer]

ISPASS04: "Sockets Direct Protocol over InfiniBand in Clusters: Is it Beneficial?". P. Balaji, S. Narravula, K. Vaidyanathan, S. Krishnamoorthy and D. K. Panda. IEEE International Conference on Performance Analysis of Systems and Software (ISPASS), 2004.

Zero-copy SDP (ZSDP)

Implemented by Mellanox
RDMA Read based design
Benefits of zero-copy
Limited by the API of synchronous sockets
  At most one outstanding communication request
  Control message latency (50% of the time for a 16K message)
  Intolerant to skew

[Figure: for each send(), the Data Source sends SRC AVAIL and the application blocks; the Data Sink reads the App Buffer and returns GET COMPLETE, after which the send completes]

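The cost of allowing only one outstanding request can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the presentation. It assumes that each synchronous zero-copy send pays a fixed control-message latency on top of its wire time, and that sends kept in flight together overlap perfectly until the link rate caps them. The function names and the example numbers are hypothetical.

```python
def wire_time(msg_bytes, link_bps):
    """Seconds needed to move msg_bytes over a link of link_bps bits/s."""
    return msg_bytes * 8 / link_bps


def sync_throughput(msg_bytes, link_bps, ctrl_latency_s):
    """Throughput (bits/s) with at most one outstanding send.

    Every message pays the control-message latency (SRC AVAIL plus
    GET COMPLETE) in addition to its wire time.
    """
    total = wire_time(msg_bytes, link_bps) + ctrl_latency_s
    return msg_bytes * 8 / total


def windowed_throughput(msg_bytes, link_bps, ctrl_latency_s, outstanding):
    """Throughput (bits/s) with `outstanding` sends in flight.

    Assumption: in-flight sends overlap perfectly, so throughput scales
    with the window until the link rate caps it.
    """
    one = sync_throughput(msg_bytes, link_bps, ctrl_latency_s)
    return min(link_bps, outstanding * one)


if __name__ == "__main__":
    link = 8e9                    # hypothetical 8 Gbps link
    size = 16 * 1024              # 16K message
    ctrl = wire_time(size, link)  # control latency equal to wire time (the 50% case)
    print(sync_throughput(size, link, ctrl) / 1e9)         # Gbps, one outstanding
    print(windowed_throughput(size, link, ctrl, 4) / 1e9)  # Gbps, four outstanding
```

Under these assumptions, a message whose control latency equals its wire time reaches only half the link rate when sent synchronously, which is the situation the 16K bullet describes.
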
Asynchronous Zero-copy SDP (AZ-SDP)

Basic zero-copy communication is synchronous
  Data communication accompanied by control messages
  Communication will be latency bound
Asynchronous Zero-copy SDP
  Utilize the benefits of asynchronous communication (more than one outstanding communication operation)
  Maintain the semantics of synchronous sockets (application can assume that it is using synchronous sockets)
Objectives: Correctness, Transparency and Performance
Key Idea: Memory protect buffers
 
AZ-SDP Functionality

Send returns as soon as communication is initiated
Application "thinks" communication is synchronous
Memory unprotected after communication completes
If application touches buffer
  Communication complete: Great!
  Else PAGE FAULT generated

[Figure: two send() calls on App Buffer1 and App Buffer2 each memory-protect their buffer and send SRC AVAIL; the Data Sink gets the data and returns GET COMPLETE, after which the buffers are unprotected]

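The protect-on-send and unprotect-on-completion bookkeeping can be sketched as a small model. This is an illustrative simulation only: it performs no real memory protection, the 4096-byte page size is an assumption, and the class and method names are hypothetical rather than part of any SDP implementation.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def pages_of(addr, length):
    """Return the set of page numbers touched by [addr, addr + length)."""
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    return set(range(first, last + 1))


class AzSdpSenderModel:
    """Tracks which pages are protected while sends are in flight."""

    def __init__(self):
        self.in_flight = {}  # send id -> set of protected pages
        self.next_id = 0

    def send(self, addr, length):
        """Protect the buffer, post SRC AVAIL, and return immediately."""
        send_id = self.next_id
        self.next_id += 1
        self.in_flight[send_id] = pages_of(addr, length)
        return send_id

    def get_complete(self, send_id):
        """GET COMPLETE arrived: unprotect the buffer of this send."""
        del self.in_flight[send_id]

    def touch(self, addr):
        """Return True if touching addr would raise a page fault."""
        page = addr // PAGE_SIZE
        return any(page in pages for pages in self.in_flight.values())


if __name__ == "__main__":
    model = AzSdpSenderModel()
    first = model.send(0, 8192)      # Buffer1, returns at once
    second = model.send(8192, 4096)  # Buffer2, a second outstanding send
    print(model.touch(100))          # Buffer1 still in flight
    model.get_complete(first)
    print(model.touch(100))          # Buffer1 released
```

Because send() returns without waiting, more than one operation can be outstanding even though the application only sees synchronous calls.
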
Design Issues in AZ-SDP

Handling a Page Fault
  Block-on-Write: Wait until the communication has finished
  Copy-on-Write: Copy data to an internal buffer and carry on communication
Handling Buffer Sharing
  Buffers shared through mmap()
Handling Unaligned Buffers
  Memory protection is only at the granularity of a page
  Malloc hook overheads

Handling a Page Fault

Memory protection needed to disallow the application from accessing an occupied communication buffer
  Page fault generated on access
  Number of page faults generated is application dependent
Two approaches for handling the page fault
  Block-on-Write
  Copy-on-Write

Block-on-Write

Optimistic approach to avoid blocking for communication
  ZSDP blocks during the communication call
  AZ-SDP delays blocking
Advantage:
  Zero-copy communication
  SDP specification compliant
Disadvantage:
  Not skew tolerant

[Figure: send() memory-protects App Buffer1 and sends SRC AVAIL; if the application touches the buffer, a PAGE FAULT is generated and the application blocks until the Data Sink gets the data and returns GET COMPLETE, at which point the buffer is unprotected]

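The delayed blocking of Block-on-Write can be illustrated with a small threaded model. This is a sketch under stated assumptions: the page-fault handler is modelled by an ordinary method call, no real memory protection is involved, and the class name and its methods are hypothetical.

```python
import threading


class BlockOnWriteBuffer:
    """Model of one send buffer under the Block-on-Write policy."""

    def __init__(self):
        self._idle = threading.Event()
        self._idle.set()  # no transfer in flight, buffer is writable

    def send(self):
        """Start a transfer: 'protect' the buffer and return at once."""
        self._idle.clear()

    def get_complete(self):
        """Transfer finished: 'unprotect' the buffer."""
        self._idle.set()

    def write(self, timeout=None):
        """Touch the buffer.

        Returns (faulted, completed): faulted says whether the access hit
        a protected buffer, completed whether the transfer finished
        before the timeout expired.
        """
        faulted = not self._idle.is_set()
        completed = self._idle.wait(timeout)
        return faulted, completed


if __name__ == "__main__":
    buf = BlockOnWriteBuffer()
    buf.send()
    # Simulate GET COMPLETE arriving 50 ms later on another thread.
    threading.Timer(0.05, buf.get_complete).start()
    print(buf.write(timeout=2.0))  # blocks until the timer fires
    print(buf.write())             # buffer idle again, no fault
```

The application only pays for blocking if it touches the buffer before the transfer completes, which is why the slide calls the approach optimistic.
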
Copy-on-Write

Enhances the functionality of Block-on-Write
  Does not blindly block
Advantage:
  Zero-copy communication when possible
  Skew tolerant when receiver is not ready
Disadvantage:
  Not SDP specification compliant

[Figure: send() memory-protects App Buffer1 and sends SRC AVAIL; when the application touches the buffer and a PAGE FAULT is generated, the sender attempts an atomic lock. If the lock fails, it blocks until GET COMPLETE; if the lock succeeds, it copies the data to a temporary buffer, sends SRC UPDATE and unprotects the buffer]

Experimental Test-Bed

4-node cluster
  Dual 3.6 GHz Intel Xeon EM64T processors (2 MB L2 cache), 512 MB of 333 MHz DDR SDRAM
  Mellanox MT25208 InfiniHost III DDR PCI-Express adapters (capable of a link rate of 16 Gbps)
  Mellanox MTS-2400, 24-port fully non-blocking DDR switch

Throughput and Comp./Comm. Overlap

[Figures: Throughput (Mbps) vs. Message Size (Bytes), and Comp./Comm. Overlap throughput (Mbps) vs. Delay (usec), for BSDP, ZSDP and AZ-SDP]

30% improvement in throughput
Up to 2X improvement in computation/communication overlap tests

Impact of Page-faults

[Figures: Effect of Page Faults for 64KB and 1MB messages, Throughput (Mbps) vs. Window Size, for BSDP, ZSDP and AZ-SDP]

When the application touches the communication buffer very frequently, PAGE FAULT overheads degrade AZ-SDP's performance

Conclusions and Future Work

Current zero-copy SDP approaches: very restrictive
AZ-SDP brings the benefits of asynchronous sockets to synchronous sockets in a TRANSPARENT manner
  30% better throughput and 2X improvement in computation-communication overlap tests

Analysis with applications and large-scale clusters
Integration with OpenIB/Gen2

Acknowledgements

Our research is supported by the following organizations
Current funding support by
Current equipment support by

Web Pointers

Network Based Computing Laboratory (NBCL)
Website: http://www.cse.ohio-state.edu/~balaji
Group Homepage: http://nowlab.cse.ohio-state.edu
Email: balaji@cse.ohio-state.edu

Backup Slides

Sockets Programming Model

Several high-speed networks available today
  E.g., InfiniBand (IB), Myrinet, 10-Gigabit Ethernet
Common programming models
  E.g., Sockets, MPI, Shared Memory Models
  Network-independent parallel and distributed applications
Sockets programming model is of particular interest
  Scientific apps, file/storage systems, commercial apps
  Traditionally built over TCP/IP (and others)
  Performance of such implementations is not the best

Limitations of TCP/IP Sockets for High-speed Networks

Network/Transport layers processed by the host
  Limited performance
  Excessive resource usage (CPU, memory traffic)
Generic optimizations for TCP/IP sockets
  Cannot sustain the performance of high-speed networks
  Performance on IB (16 Gbps) adapters limited to 2 Gbps
Sockets Direct Protocol (SDP) proposed
  Alternative to TCP/IP sockets

Zero-Copy Mechanisms in SDP

[Figure: two zero-copy schemes between Sender and Receiver. SOURCE-AVAIL: the sender registers its buffer and sends SRC Available, the receiver issues an RDMA Read of the data and replies GET Complete. SINK-AVAIL: the receiver registers its buffer and sends SINK Available, the sender issues an RDMA Write of the data and sends PUT Complete]

Prior Research

Prior research on high-performance sockets spanning various networks (Giganet CLAN, VIA, GbE, Myrinet)
SDP over IBA: buffer-copy based implementation
Recent research on zero-copy SDP [Goldenberg05]
Zero-copy schemes to optimize TCP and UDP stacks
  Mostly for asynchronous sockets
  May require kernel/NIC firmware modifications

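Latency and Throughput

[Figures: Ping-pong Latency (usec) and Unidirectional Throughput (Mbps) vs. Message Size (Bytes), for BSDP, ZSDP and AZ-SDP]

Computation/Communication Overlap

[Figures: Throughput (Mbps) vs. Computation (us) for 64K and 1M messages, for BSDP, ZSDP and AZ-SDP]

Multi-connection Tests

[Figures: Multi-Stream Throughput (64K) and Multi-Client Throughput, in Mbps, for BSDP, ZSDP and AZ-SDP]

Hot-spot Latency Test

[Figure: Hot-Spot Latency (us) vs. Message Size (Bytes), for BSDP, ZSDP and AZ-SDP]
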
Buffer Sharing

Memory-protect B1 and disallow all access to it
Override the mmap() call (libc) with a new mmap() call
New mmap() call contains mapping of all memory-mapped buffers

[Figure: B1 and B2 are memory mapped to each other; Send() is issued on B1 while Write() targets B2]

Managing Un-aligned Buffers

Two approaches
  Malloc Hook
  Hybrid approach with Buffered SDP

[Figure: a Physical Page shared between a VAPI Control Buffer and the Application Buffer (Shared Page)]

Malloc Hook

Approach overrides the malloc() and free() library calls
New malloc() allocates physical page boundary-aligned N + PAGE_SIZE bytes, when N bytes are requested

Advantage:
  Simple approach
Disadvantage:
  Very small buffer requests may result in buffer wastage
  Time to malloc anything from a few bytes up to the physical page size is the same

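The N + PAGE_SIZE rule and the wastage it causes for small requests can be checked with a few lines of arithmetic. The sketch below is one plausible reading of the slide, not the authors' code: it assumes a 4096-byte page and that the extra PAGE_SIZE bytes exist so that a page-aligned start address can always be found inside the raw block. All function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def hooked_alloc_size(n):
    """Bytes reserved when n bytes are requested (N + PAGE_SIZE)."""
    return n + PAGE_SIZE


def align_up(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE


def place_buffer(raw_addr, n):
    """Return (aligned_start, extra_bytes) for a raw block at raw_addr.

    raw_addr is where the underlying allocator placed the
    hooked_alloc_size(n)-byte block; the caller's n bytes start at the
    first page boundary inside it.
    """
    start = align_up(raw_addr)
    # The aligned buffer always fits inside the over-allocated block.
    assert start + n <= raw_addr + hooked_alloc_size(n)
    return start, hooked_alloc_size(n) - n


if __name__ == "__main__":
    start, extra = place_buffer(10000, 16)
    print(start % PAGE_SIZE == 0)  # start is page aligned
    print(extra / 16)              # overhead relative to the 16-byte request
```

For a 16-byte request the reserved block is 4112 bytes, which is the buffer wastage the slide lists as a disadvantage.
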
Hybrid Approach with Buffered SDP

Hybrid mechanism between BSDP and AZ-SDP
A single communication might be carried out in multiple operations (up to three)
5-10% better performance than the malloc-hook based scheme

[Figure: the Application Buffer shares Physical Pages with a VAPI Control Buffer; the middle of the buffer uses AZ-SDP communication and the two ends use BSDP comm.]

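The "up to three operations" split can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: it assumes a 4096-byte page, that only pages fully covered by the application buffer are sent with AZ-SDP, and that partially covered pages at either end fall back to BSDP. The function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def split_hybrid(addr, length):
    """Split [addr, addr + length) into up to three operations.

    Returns a list of (kind, start, nbytes) tuples, where kind is "BSDP"
    for a partially covered page at either end and "AZ-SDP" for the run
    of fully covered pages in the middle.
    """
    end = addr + length
    first_full = (addr + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
    last_full = end // PAGE_SIZE * PAGE_SIZE
    if first_full >= last_full:
        # No fully covered page: everything goes through buffer copy.
        return [("BSDP", addr, length)] if length else []
    ops = []
    if addr < first_full:
        ops.append(("BSDP", addr, first_full - addr))
    ops.append(("AZ-SDP", first_full, last_full - first_full))
    if last_full < end:
        ops.append(("BSDP", last_full, end - last_full))
    return ops


if __name__ == "__main__":
    for op in split_hybrid(100, 3 * PAGE_SIZE):
        print(op)
```

A buffer that is already page aligned at both ends yields a single AZ-SDP operation, while a buffer smaller than a page is sent entirely through BSDP.
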
Copy-on-Write

Control maintained via locks at the receiver end by the AZ-SDP layer
  Receiver obtains the lock if recv() is called first
  Sender can obtain the lock on generation of a page fault and can perform a copy-on-write operation
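
The race between the receiver and the faulting sender can be sketched with a non-blocking lock. This is a local, single-process model under stated assumptions: in the slides the lock lives at the receiver end, whereas here a threading.Lock stands in for it. The action labels, in particular what the receiver does after losing the race, are assumptions, and all names are hypothetical.

```python
import threading


class TransferLock:
    """One lock per transfer, taken by whichever side acts first."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def try_acquire(self, who):
        """Atomically try to take the lock; never blocks."""
        if self._lock.acquire(blocking=False):
            self.owner = who
            return True
        return False


def receiver_calls_recv(lock):
    """Receiver side: winning the lock means it reads the original buffer.

    Assumption: if the sender won first, the receiver uses the copied data.
    """
    return "rdma-read" if lock.try_acquire("receiver") else "use-copied-data"


def sender_page_fault(lock, app_buffer):
    """Sender side, run when the application touches a protected buffer.

    Returns (action, temp_copy). If the lock is won, the data is copied
    to a temporary buffer so the application may proceed (the slides
    then show SRC UPDATE); otherwise the sender blocks as in
    Block-on-Write.
    """
    if lock.try_acquire("sender"):
        return "copy-on-write", bytes(app_buffer)
    return "block", None


if __name__ == "__main__":
    lock = TransferLock()
    print(sender_page_fault(lock, bytearray(b"payload")))  # sender acts first
    print(receiver_calls_recv(lock))                       # receiver arrives late
```

Because the lock attempt never blocks, the sender falls back to Block-on-Write only when the receiver has already started reading the buffer, which is how the scheme stays zero-copy when possible.
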
