Innovations in Software-Defined Network Computing

Software Packet Processing
- The Click modular router
- netmap: A novel framework for fast packet I/O
 
Presented by 
Shinae Woo
 
EE807 Software-defined Network Computing
The Click modular router
 
 
EDDIE KOHLER, ROBERT MORRIS, BENJIE CHEN, JOHN JANNOTTI,
and M. FRANS KAASHOEK
Router’s functionality
 
Routing + Forwarding
Routers do much more than routing and forwarding:
 
 
 
 
 
 
Firewall
 
Packet filtering
 
Packet tunneling
 
Traffic prioritizing
 
Network address translation
Routers in the early 2000s
Cisco ASR 1013
Cisco NCS 6008 
Single Chassis System
Juniper E120 
Broadband Services Router
 
HW routers
Specialized HW + proprietary SW
Monolithic, closed, static and inflexible
Difficult to add/delete functionality
 
SW routers
Commodity hardware + shipped with kernel
Hard to extend:
Need to modify monolithic kernel code
 
 
 
x-kernel
 
Scout
 
Streams
 
Netgraph
 
Q. How do we design an extensible SW router platform?
Requirements
 
No such solution existed in the early 2000s
1. Flexible and configurable 
router design
 
2. Extensible router design
 
3. Clearly defined interfaces
between router functionalities
Click Modular Router
 
Divide: break the router down into individual
router functionalities
 
Conquer: link individual functionalities
to complete a router design
Click architecture
A directed graph with
Elements
A single router function
Connections
Possible packet path between
two elements
Elements
 
A single router function
Input and output ports
Configuration strings
Per-element state, tuning behavior
Connections
 
Possible paths for packet handoff
 
Building a routing configuration
1. Choose a collection of elements
2. Connect them into a directed graph
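 
The two steps above can be sketched in a few lines of Python. This is only an illustration of "elements plus connections form a directed graph"; the `Configuration` class and its methods are hypothetical, and the element class names are used purely as labels.

```python
from collections import defaultdict


class Configuration:
    """A router configuration: named elements plus directed connections."""

    def __init__(self):
        self.elements = {}              # element name -> element class label
        self.edges = defaultdict(list)  # element name -> downstream element names

    def declare(self, name, kind):
        # Step 1: choose an element.
        self.elements[name] = kind

    def connect(self, src, dst):
        # Step 2: add a directed edge (a possible packet path).
        if src not in self.elements or dst not in self.elements:
            raise KeyError("connect() needs two declared elements")
        self.edges[src].append(dst)

    def downstream(self, name):
        return list(self.edges[name])


config = Configuration()
config.declare("src", "FromDevice")
config.declare("cls", "Classifier")
config.declare("q", "Queue")
config.declare("sink", "ToDevice")
config.connect("src", "cls")
config.connect("cls", "q")
config.connect("q", "sink")
```

Nothing here processes packets; the graph only records which handoffs are possible.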
 
Queue
 
Start with
packet arrival events
 
Start with
available packet transmission
and scheduling events
 
1. Implicit queue in elements
 
2. Explicit queue outside elements
 
VS.
 
Queue: Unit for scheduling  (single task or single thread)
 
Click’s choice: explicit queues (option 2)
Push and Pull Connections (1)
Queue
Destination
Source
 
receive
packet p
 
ready to
transmit
 
Push
 
Pull
 
enqueue
packet p
 
dequeue
packet p
 
Single scheduling unit
 
Single scheduling unit
 
push (p)
 
pull (p)
 
return
 
return
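 
A minimal Python sketch of the handoff in this diagram, assuming a bounded FIFO that drops on overflow. The class and method names (`Source.receive`, `Destination.ready_to_transmit`) are hypothetical stand-ins for the boxes on the slide, not Click's API.

```python
from collections import deque


class Queue:
    """Explicit packet storage: push input on one side, pull output on the other."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.packets = deque()
        self.drops = 0

    def push(self, packet):
        # Called by the upstream element; control returns to the caller afterwards.
        if len(self.packets) >= self.capacity:
            self.drops += 1
            return
        self.packets.append(packet)

    def pull(self):
        # Called by the downstream element; returns None when the queue is empty.
        return self.packets.popleft() if self.packets else None


class Source:
    def __init__(self, downstream):
        self.downstream = downstream

    def receive(self, packet):
        # Packet arrival starts a push: the source drives the transfer.
        self.downstream.push(packet)


class Destination:
    def __init__(self, upstream):
        self.upstream = upstream
        self.sent = []

    def ready_to_transmit(self):
        # Transmit readiness starts a pull: the destination drives the transfer.
        packet = self.upstream.pull()
        if packet is not None:
            self.sent.append(packet)
        return packet


queue = Queue(capacity=2)
source = Source(queue)
destination = Destination(queue)
for p in ["p1", "p2", "p3"]:
    source.receive(p)
first = destination.ready_to_transmit()
```

The queue is the only place where a push chain ends and a pull chain begins, which is why it marks the boundary between scheduling units.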
Push and Pull Connections (2)
Single scheduling unit
Single scheduling unit
 
Push connection (▶■)
 
Pull connection (▷□)
 
Agnostic connection (double outline)
Becomes push or pull depending on peer
Either push or pull, not mixed
Invalid connections between elements
 
1. A push output cannot connect to a pull input
 
2. A push output cannot have more than one connection
 
3. A pull input cannot have more than one connection
 
4. Agnostic elements cannot have mixed push/pull context
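 
Rules 1 to 3 can be checked mechanically on the configuration graph. The sketch below is an assumption about how such a check might look, not Click's implementation: ports are `(element, port_number)` tuples, agnostic ports (rule 4) are left out, and any push/pull mismatch is reported, which is slightly broader than rule 1 as worded.

```python
from collections import Counter

PUSH, PULL = "push", "pull"


def check_connections(output_kind, input_kind, connections):
    """Return a list of error strings for a set of connections.

    output_kind / input_kind map (element, port) -> PUSH or PULL.
    connections is a list of ((src, out_port), (dst, in_port)) pairs.
    """
    errors = []
    out_use = Counter(src for src, _ in connections)
    in_use = Counter(dst for _, dst in connections)
    for src, dst in connections:
        # Rule 1: both ends of a connection must agree on push vs. pull.
        if output_kind[src] != input_kind[dst]:
            errors.append(f"{src} is {output_kind[src]} but {dst} is {input_kind[dst]}")
    for port, n in out_use.items():
        # Rule 2: a push output may be used by only one connection.
        if output_kind[port] == PUSH and n > 1:
            errors.append(f"push output {port} has {n} connections")
    for port, n in in_use.items():
        # Rule 3: a pull input may be used by only one connection.
        if input_kind[port] == PULL and n > 1:
            errors.append(f"pull input {port} has {n} connections")
    return errors


output_kind = {("FromDevice", 0): PUSH, ("Counter", 0): PUSH, ("Queue", 0): PULL}
input_kind = {("Counter", 0): PUSH, ("Queue", 0): PUSH, ("ToDevice", 0): PULL}
good = [(("FromDevice", 0), ("Queue", 0)), (("Queue", 0), ("ToDevice", 0))]
bad = [(("FromDevice", 0), ("ToDevice", 0))]
fan_out = [(("FromDevice", 0), ("Queue", 0)), (("FromDevice", 0), ("Counter", 0))]
```

Here `good` passes, `bad` violates rule 1, and `fan_out` violates rule 2.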
Click implementation
 
Two versions
Linux in-kernel driver: Good for production
User-level driver: Good for profiling and debugging
 
Element
C++ class
20 virtual functions; most elements need to override 6 or fewer
push, pull, run_scheduled
 
Connection
Virtual function calls between elements
 
Passing configuration file to the driver
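 
The idea of "a base class with overridable hooks, connected by method calls" can be sketched in Python as follows. This mirrors the slide only loosely: the real elements are C++ classes, and the signatures below (`push(port, packet)`, `pull(port)`, `connect`) as well as the `CountPackets` and `Collect` elements are assumptions made for illustration.

```python
class Element:
    """Base class: subclasses override only the hooks they need."""

    def __init__(self):
        self.outputs = []  # downstream elements, indexed by output port

    def connect(self, port, element):
        while len(self.outputs) <= port:
            self.outputs.append(None)
        self.outputs[port] = element

    def push(self, port, packet):
        # Default behaviour: hand the packet unchanged to output 0.
        self.outputs[0].push(0, packet)

    def pull(self, port):
        return None

    def run_scheduled(self):
        pass


class CountPackets(Element):
    """Overrides push only: counts, then forwards."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, port, packet):
        self.count += 1
        super().push(port, packet)


class Collect(Element):
    """Hypothetical sink that remembers what it received."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def push(self, port, packet):
        self.seen.append(packet)


counter = CountPackets()
sink = Collect()
counter.connect(0, sink)
counter.push(0, "a")
counter.push(0, "b")
```

Each handoff is one dynamically dispatched call, which is the per-connection cost measured later in the talk.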
 
 
 
 
 
 
Simple language
for declarations and connections
Fully declarative
Declaration
Connection
Says what is connected, not how to process packets
 
Compound element
IP Router
 
16 elements in push path
1 element in pull path
 
Packets carry local information in their own headers
e.g., TTL
Packets also carry annotated information
One element creates the information
Another element uses the information
e.g., Destination IP address,
Paint (marking a packet with a color)
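 
A small Python sketch of the annotation idea: one function sets a value that travels with the packet, and another reads it later. The `Packet` class, the `"paint"` key, and both function names are hypothetical; they only illustrate the producer/consumer relationship described above.

```python
class Packet:
    def __init__(self, data):
        self.data = data
        self.annotations = {}  # side information that travels with the packet


def paint(packet, color):
    """Producer: mark the packet with a color."""
    packet.annotations["paint"] = color
    return packet


def check_paint(packet, color):
    """Consumer: test whether the packet carries the given color."""
    return packet.annotations.get("paint") == color


marked = paint(Packet(b"payload"), 1)
unmarked = Packet(b"payload")
```

If the consumer runs on a packet that never passed through the producer, as with `unmarked`, the check simply fails; nothing in the graph itself records that one element depends on the other.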
Extensibility
We can easily add additional functionality to the IP router
(1) Scheduling
(2) Dropping Policy
(3) Differentiated Services
(4) Ethernet Switch
 
Covered in this talk
Extensibility (1) Scheduling
Stochastic Fairness Queueing (SFQ)
Providing isolation between competing flows
Distributing packets into multiple queues
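 
A compact Python sketch of the SFQ idea: hash each flow to one of several queues, then serve the queues round-robin. The class name, the CRC32 hash, and the drop-on-full policy are assumptions chosen for the example; the slide does not specify them.

```python
from collections import deque
import zlib


class StochasticFairQueue:
    def __init__(self, n_queues=4, capacity=8):
        self.queues = [deque() for _ in range(n_queues)]
        self.capacity = capacity
        self.next_queue = 0  # where the round-robin scan starts

    def _bucket(self, flow_id):
        # Deterministic hash of the flow identifier onto a queue index.
        return zlib.crc32(repr(flow_id).encode()) % len(self.queues)

    def push(self, flow_id, packet):
        q = self.queues[self._bucket(flow_id)]
        if len(q) >= self.capacity:
            return False  # this flow's queue is full; other flows are unaffected
        q.append(packet)
        return True

    def pull(self):
        # Round-robin over the queues, skipping empty ones.
        n = len(self.queues)
        for i in range(n):
            index = (self.next_queue + i) % n
            if self.queues[index]:
                self.next_queue = (index + 1) % n
                return self.queues[index].popleft()
        return None


sfq = StochasticFairQueue(n_queues=4, capacity=2)
accepted = [sfq.push("flow-A", p) for p in ("a1", "a2", "a3")]
pulled = [sfq.pull(), sfq.pull(), sfq.pull()]
```

Isolation is only stochastic: two flows that hash to the same queue still compete with each other.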
Extensibility (2) Dropping policy
 
Weighted Random Early Detection (weighted RED)
RED: drops packets with a probability that grows with congestion
 
 
 
 
 
 
 
RED element needs to know ‘nearest downstream queue’
S1. Manual configuration to give the information
S2. Flow-based router context
 
What is your queue length?
My queue length is 10
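 
A simplified Python sketch of the drop decision, assuming the textbook RED shape: no drops below a minimum threshold, certain drop above a maximum threshold, and a linear ramp in between, applied to a moving average of the queue length. The parameter names and defaults are illustrative, and refinements of real RED (such as spacing drops by packet count) are omitted.

```python
import random


def red_drop_probability(avg_queue_len, min_thresh, max_thresh, max_p):
    """Piecewise-linear drop probability as a function of average queue length."""
    if avg_queue_len < min_thresh:
        return 0.0
    if avg_queue_len >= max_thresh:
        return 1.0
    return max_p * (avg_queue_len - min_thresh) / (max_thresh - min_thresh)


class Red:
    def __init__(self, queue_length, min_thresh=5, max_thresh=15,
                 max_p=0.1, weight=0.002, rng=None):
        self.queue_length = queue_length  # callable: asks the downstream queue
        self.min_thresh = min_thresh
        self.max_thresh = max_thresh
        self.max_p = max_p
        self.weight = weight              # smoothing factor for the average
        self.rng = rng or random.Random()
        self.avg = 0.0

    def should_drop(self):
        # Exponentially weighted moving average of the queue length.
        self.avg += self.weight * (self.queue_length() - self.avg)
        p = red_drop_probability(self.avg, self.min_thresh,
                                 self.max_thresh, self.max_p)
        return self.rng.random() < p
```

The `queue_length` callable is the "What is your queue length?" question in the diagram; how the RED element finds the right queue to ask is the subject of the next slide.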
Flow-based router context
Answers questions such as:
If I were to emit a packet on my second output,
where might it go?
Which queues might it encounter?
Uses a simple data-flow algorithm on the configuration graph
 
RED: Where is my nearest downstream queue?
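 
One way to picture the search is a breadth-first walk over the configuration graph that stops at the first queue on each path. This Python sketch is an assumption about the general shape of such an algorithm, not the one Click uses; the graph, the element names, and the `is_queue` predicate are made up for the example.

```python
from collections import deque


def nearest_downstream_queues(edges, is_queue, start):
    """Breadth-first search from `start`, stopping at each queue found."""
    found = []
    seen = {start}
    frontier = deque(edges.get(start, []))
    while frontier:
        node = frontier.popleft()
        if node in seen:
            continue
        seen.add(node)
        if is_queue(node):
            found.append(node)  # do not look past a queue
            continue
        frontier.extend(edges.get(node, []))
    return found


edges = {
    "red": ["classifier"],
    "classifier": ["q_high", "strip"],
    "strip": ["q_low"],
    "q_high": ["sched"],
    "q_low": ["sched"],
    "sched": ["to_device"],
}
queues = nearest_downstream_queues(edges, lambda name: name.startswith("q_"), "red")
```

With this graph the search returns both queues, since a packet leaving `red` could end up in either one.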
Evaluation environment
Source: 733 MHz Intel Pentium
Click IP Router: 700 MHz Intel Pentium
Destination: 200 MHz Intel Pentium Pro
Links: 100 Mbit/s on each side of the router
 
UDP packets
147,900 64B PPS
 
Creates a bottleneck
Simple forwarding rate
 
Click keeps the baseline performance of Linux
 
Peak: 360,000 PPS
Loss-free: 333,000 PPS
 
Linux routing table algorithm
is slower than Click’s
 
Receive livelock
 
Click’s polling driver
avoids interrupts and does efficient I/O
Forwarding rate with richer functionality
 
Comparison with a real network
Does not involve: fragmentation, IP options, ICMP errors
Smaller number of ARP entries
Click’s performance drops 
only gradually
Overhead of modularity
 
Overhead of passing packets between elements
Single virtual function call: 70 ns
16 elements in IP router:
70 ns * 16 = 1.12 us, about 1 us / packet
Combine multiple elements into one
16 elements → 8 elements
Number of virtual function calls decreases
Push path latency decreases
1.57 us → 1.03 us
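 
The arithmetic on this slide, worked through in Python using only the figures quoted above. The variable names are ours.

```python
# Cost of passing a packet between elements, from the figures on the slide.
call_ns = 70           # one virtual function call
elements = 16          # elements on the IP router push path
per_packet_ns = call_ns * elements   # 1120 ns, i.e. roughly 1 us per packet

# Effect of combining elements (16 -> 8) on push path latency.
before_us = 1.57
after_us = 1.03
saved_us = before_us - after_us          # about 0.54 us
saved_fraction = saved_us / before_us    # about one third of the push path latency
```

Halving the element count removes about a third of the push path latency rather than half, which suggests the function calls are only part of the per-packet cost.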
Overhead of modularity
 
Unnecessarily general element code
Implemented more generality than necessary
Classifier
Required for IP router: ARP, IP packets
Implemented:
a small data structure for finding which packet data to examine
Special classifier
24% fewer CPU cycles than the general classifier
But only a 4% reduction in total cost
Conclusion
 
Click Modular Router
An open, extensible, and configurable router framework
 
Building a complex IP router by connecting small, modular elements
Modularity does not harm the performance of the base Linux system
Easily adding extended functionality
 
 
 
http://www.read.cs.ucla.edu/click/
netmap:
a novel framework for fast
packet I/O
 
Luigi Rizzo
netmap:
a novel framework for fast packet I/O
 
Possible number of packets
@ a 10G link
14.88 Mpps 
(64B packets)
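 
Where 14.88 Mpps comes from, as a short Python calculation. It assumes standard Ethernet framing, in which every frame also occupies 20 bytes on the wire for the preamble, start-of-frame delimiter, and inter-frame gap; the variable names are ours.

```python
LINK_BPS = 10_000_000_000      # 10 Gbit/s link
FRAME_BYTES = 64               # minimum Ethernet frame
OVERHEAD_BYTES = 7 + 1 + 12    # preamble + start-of-frame delimiter + inter-frame gap

bits_on_wire = (FRAME_BYTES + OVERHEAD_BYTES) * 8   # 672 bits per packet
pps = LINK_BPS / bits_on_wire                       # about 14.88 million packets/s
ns_per_packet = 1e9 / pps                           # about 67.2 ns per packet
```

The per-packet time budget of about 67 ns is what any software path must fit into to sustain line rate with minimum-size packets.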
Packet processing overhead in
commodity OS
Per-packet dynamic memory
allocation
System call overhead
Memory copies
netmap
60-65 cycles/packet (67 ns)
sendto() execution path and time in FreeBSD
 
system call
 
About 1 us / packet
1.05 Mpps
netmap architecture
 
netmap’s approaches
Per-packet dynamic memory allocation → Preallocated resources
System call overhead → Large batching
Memory copies → Shared buffers between kernel and userspace
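 
The three ideas can be illustrated together with a toy transmit ring in Python. This is a simulation only: the `TxRing` class, its fields, and the `sync` method are hypothetical and do not correspond to netmap's actual data structures or system calls.

```python
class TxRing:
    """Toy transmit ring: buffers allocated once, filled in place, sent in batches."""

    def __init__(self, num_slots=8, buf_size=2048):
        # Preallocated resources: no per-packet allocation after this point.
        self.buffers = [bytearray(buf_size) for _ in range(num_slots)]
        self.lengths = [0] * num_slots
        self.head = 0      # next slot the application fills
        self.tail = 0      # next slot the "kernel" side sends
        self.pending = 0   # slots filled but not yet handed over
        self.syncs = 0     # number of simulated system calls
        self.sent = []

    def space(self):
        return len(self.buffers) - self.pending

    def put(self, payload):
        if self.space() == 0:
            return False
        buf = self.buffers[self.head]
        if len(payload) > len(buf):
            raise ValueError("payload larger than slot buffer")
        # Shared buffer: the application writes the packet where it will be sent from.
        buf[:len(payload)] = payload
        self.lengths[self.head] = len(payload)
        self.head = (self.head + 1) % len(self.buffers)
        self.pending += 1
        return True

    def sync(self):
        # Batching: one call hands over every pending slot.
        self.syncs += 1
        while self.pending:
            n = self.lengths[self.tail]
            # Copying into `sent` is only bookkeeping for this example.
            self.sent.append(bytes(self.buffers[self.tail][:n]))
            self.tail = (self.tail + 1) % len(self.buffers)
            self.pending -= 1


ring = TxRing(num_slots=4)
filled = [ring.put(b"pkt%d" % i) for i in range(4)]
overflow = ring.put(b"x")
ring.sync()
```

Four packets cross the user/kernel boundary with a single `sync`, so the fixed cost of that call is spread across the whole batch.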
Evaluation
 
Test equipment
i7-870, 4 cores @ 2.93 GHz
Intel 82599 10G NIC
 
 
 
 
 
 
 
 
64B packet Tx
netmap: 60-65 cycles/packet
 
Line rate
 
8x
Discussion
Click
Implicit dependencies from annotations
 
Annotations create implicit dependencies between elements
There are 16 annotation types in the paper
(+ custom types now)
How can those dependencies be resolved?
 
 
 
They are not considered at the configuration step
 
Paint
 
CheckPaint
 
Set paint annotation
 
Use paint annotation
Click
Extending to support L4/stateful functionality
 
How to support TCP functionality?
Can be built as a single element
 
 
Cannot support modularity at the L4 layer
How to support IP defragmentation?
It has to hold packets until the remaining fragments arrive
 
 
 
 
Click elements are required to store packets
 
 
 
 
 
TCP
Click
Tradeoff between modularity and efficiency
 
There is a fundamental tradeoff between modularity and
efficiency
 
 
 
 
 
 
VS.
 
How can we get both benefits?
Give a configuration
in a modular way
 
 
Optimize the binary into
integrated functionality
IPPush
netmap
Comparison with other approaches
 
There is much similar work in this area
PSIO, PF_RING, DPDK
 
Similar approaches to solve the same challenges
Benefits of netmap compared with other approaches
Integrated with FreeBSD
Does not depend on specific hardware
 

The presentation explores the evolution of routers in the early 2000s, the limitations of hardware routers, and the need for more flexible and extensible software routers. It discusses the concept of the Click modular router and its architecture, emphasizing the benefits of a modular design approach in packet processing. Key elements of the Click architecture are highlighted, showcasing how it enables the creation of customizable router functionalities.

  • Software-Defined Network
  • Computing Systems
  • Click Modular Router
  • Packet Processing
  • Network Architecture

Uploaded on Sep 30, 2024 | 0 Views



Presentation Transcript


  1. EE807 Software-defined Network Computing Software Packet Processing -The Click modular router - netmap: A novel framework for fast packet I/O Presented by Shinae Woo NETWORKED & DISTRIBUTED COMPUTING SYSTEMS LAB

