Enhancing TCP Fairness Using P4-Programmable Data Planes
This study presents a solution to improve TCP fairness in non-programmable networks by utilizing P4-programmable data planes. It addresses unfair bandwidth distribution issues in TCP traffic and proposes a system that leverages P4 switches for passive traffic monitoring and RTT computation.
Improving TCP Fairness in Non-Programmable Networks using P4-Programmable Data Planes Jose Gomez*, Elie Kfoury*, Jorge Crichigno* (Presenter), Gautam Srivastava^ *University of South Carolina, USA ^Brandon University, Canada IEEE International Black Sea Conference on Communications and Networking June 25, 2024 Tbilisi, Georgia 1
Agenda. Background on Congestion Control Algorithms; Unfair Bandwidth Distribution of TCP Traffic; Proposed Solution to Improve Fairness; Evaluation Results; Conclusion. 2
Traditional Congestion Control Algorithms. The Transmission Control Protocol (TCP) enables reliable end-to-end communication over the Internet. Within TCP, the congestion control algorithm (CCA) dictates the dynamics of the sending rate. Traditional CCAs implement Additive-Increase Multiplicative-Decrease (AIMD). The increase in the sending rate is a function of the round-trip time (RTT). [Figure: sending rate over time, with additive increase followed by multiplicative decrease on packet loss] 3
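The AIMD behavior described on this slide can be illustrated with a minimal sketch. The function names, the increase of one segment per RTT, the decrease factor of 0.5, and the rule that a loss occurs whenever the window exceeds a fixed capacity are all assumptions chosen for illustration; they are not taken from the presentation.

```python
def aimd_step(cwnd, loss, alpha=1.0, beta=0.5, min_cwnd=1.0):
    """Return the congestion window (in segments) after one RTT.

    alpha and beta are illustrative values, not taken from the slides.
    """
    if loss:
        # Multiplicative decrease on packet loss.
        return max(min_cwnd, cwnd * beta)
    # Additive increase: one step per RTT.
    return cwnd + alpha


def simulate(rtts, capacity, cwnd=1.0):
    """Trace cwnd over `rtts` round trips.

    Assumption: a loss happens whenever cwnd exceeds `capacity`.
    """
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        cwnd = aimd_step(cwnd, loss=cwnd > capacity)
    return trace


trace = simulate(rtts=6, capacity=3)
```

Because the window grows once per round trip, a flow with a shorter RTT takes more increase steps per second than a flow with a longer RTT, which is the dependence on RTT that the slide points out.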
BBR: Rate-based Congestion Control. TCP Bottleneck Bandwidth and RTT (BBR) is a rate-based CCA [1]. BBR proposed a new approach to overcome the limitations of traditional CCAs: it is not governed by the AIMD control law, and it does not use packet loss as a signal of congestion. The data in transit is limited to one bandwidth-delay product (BDP), calculated as the product of the RTT and the bottleneck bandwidth (Btlbw), i.e., BDP = RTT × Btlbw. [Figure: a sender and a receiver connected through a router whose output port buffer feeds the bottleneck link (Btlbw); the sending rate over time alternates between probe and drain phases around Btlbw (axis marked at 75, 100, and 125) in cycles of 8 RTTs] [1] N. Cardwell et al., "BBR: Congestion-based congestion control: Measuring bottleneck bandwidth and round-trip propagation time," ACM Queue, 2016. 4
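As a worked example of the BDP formula on this slide, the sketch below computes the BDP in bytes. The function name, the choice of units, and the sample values (50 ms RTT, 1 Gbps bottleneck) are illustrative assumptions, not figures from the presentation.

```python
def bdp_bytes(rtt_ms, btlbw_mbps):
    """Bandwidth-delay product in bytes: BDP = RTT x Btlbw.

    rtt_ms / 1000 seconds times btlbw_mbps * 10**6 bits per second,
    divided by 8 bits per byte, simplifies to rtt_ms * btlbw_mbps * 125.
    """
    return rtt_ms * btlbw_mbps * 125


# Hypothetical path: 50 ms RTT over a 1 Gbps (1000 Mbps) bottleneck.
example_bdp = bdp_bytes(50, 1000)
```

Under these assumed values the BDP is 6,250,000 bytes, which is the amount of data BBR would aim to keep in flight on such a path.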
Unfair Bandwidth Distribution in TCP Traffic. RTT unfairness occurs when two senders situated at different distances from their receivers share a common bottleneck link. Queue imbalance occurs when TCP flows do not fully utilize their allocated bandwidth. [Figures: RTT unfairness scenario; throughput distribution] 5
Proposed Solution: Overview. The system addresses RTT unfairness. It uses P4 switches as measurement tools to passively capture traffic. The system computes the RTT of individual TCP flows and applies a classification algorithm to enforce flow separation based on RTT values. Footnote: Optical taps operate at the physical layer by splitting the light traveling in the fiber. 6
P4 Programmable Switches. P4 [1] switches enable programmers to define the data plane behavior: describe and parse new protocols, measure events with high precision (nanosecond resolution), and run custom applications at line rate. P4 switches can enhance TCP performance by providing visibility into network events. [1] P4 stands for Programming Protocol-independent Packet Processors. 7
P4 Programmable Switches (continued). [Figure: evolution of packet forwarding speeds [1]] [1] Reproduced from N. McKeown, "Creating an End-to-End Programming Model for Packet Forwarding." Available: https://www.youtube.com/watch?v=fiBuao6YZl0&t=634s 8
Proposed Solution. 1. Passive taps collect traffic from the data link of a non-programmable router. 2. This traffic is processed by the data plane of a P4 switch. 3. Flows are identified, and their RTTs and throughput are calculated at line rate. 4. A classification algorithm performs the queue assignment. 5. Control rules are created and installed in the non-programmable router. 6. Flows are assigned to their corresponding queues. 7. The buffer size of each queue is adjusted. 9
RTT Calculation. The method to calculate the RTT comprises the following steps [1]: 1. The system computes the flow identifier (FID) of each outgoing packet by applying a hash function to the 5-tuple. It also calculates the expected acknowledgement number (eACK). 2. The current packet's timestamp is stored in a table, indexed by the combination of FID and eACK. 3. Upon receiving an incoming TCP packet, the system searches the table using the FID and ACK number. If an entry is found, it calculates the timestamp difference to generate an RTT sample. [1] X. Chen, H. Kim, J. Aman, W. Chang, M. Lee, and J. Rexford, "Measuring TCP round-trip time in the data plane," in Proceedings of the Workshop on Secure Programmable Network Infrastructure, 2020. 10
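The three steps above can be sketched in software as follows. This is only an illustration of the matching logic, not the P4 data-plane implementation: the class and method names are hypothetical, CRC32 stands in for whatever hash the switch uses, a Python dictionary stands in for the switch table, and the eACK is assumed to be the sequence number plus the payload length (SYN and FIN flags, which also consume a sequence number, are ignored).

```python
import zlib


class RttTracker:
    """Software sketch of FID/eACK matching; all names are hypothetical."""

    def __init__(self):
        # Maps (FID, eACK) to the timestamp of the outgoing packet.
        self.pending = {}

    @staticmethod
    def flow_id(src_ip, dst_ip, src_port, dst_port, proto=6):
        # Step 1: hash the 5-tuple (CRC32 is an assumed stand-in hash).
        key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
        return zlib.crc32(key)

    def on_outgoing(self, five_tuple, seq, payload_len, now):
        fid = self.flow_id(*five_tuple)
        # Assumed eACK: next byte the receiver should acknowledge,
        # wrapped to the 32-bit sequence space.
        eack = (seq + payload_len) % 2**32
        # Step 2: store the timestamp, indexed by (FID, eACK).
        self.pending[(fid, eack)] = now

    def on_incoming(self, five_tuple, ack, now):
        src_ip, dst_ip, src_port, dst_port, proto = five_tuple
        # The ACK travels in the reverse direction, so swap the endpoints
        # to obtain the FID of the original data flow.
        fid = self.flow_id(dst_ip, src_ip, dst_port, src_port, proto)
        # Step 3: look up the entry and, if found, emit an RTT sample.
        sent = self.pending.pop((fid, ack), None)
        return None if sent is None else now - sent


tracker = RttTracker()
data_pkt = ("10.0.0.1", "10.0.0.2", 5000, 5201, 6)
ack_pkt = ("10.0.0.2", "10.0.0.1", 5201, 5000, 6)
tracker.on_outgoing(data_pkt, seq=1000, payload_len=1460, now=0.100)
rtt_sample = tracker.on_incoming(ack_pkt, ack=2460, now=0.150)
```

In this example the ACK number 2460 matches the stored eACK, so the sample is the difference between the two timestamps (about 50 ms). An ACK with no matching entry yields no sample.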
Classification Algorithm. The control plane implements the Jenks natural breaks algorithm [1]. This algorithm calculates the lower and upper limits of a set of K queues according to the RTTs. These limits are used to place flows with similar RTTs in separate queues. [1] G. Jenks, "The data model concept in statistical mapping," International Yearbook of Cartography, 1967. 11
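To make the classification step concrete, here is a minimal sketch of one-dimensional natural-breaks classification, computed with dynamic programming that minimizes the within-class sum of squared deviations. It is not the authors' control-plane code: the function names `jenks_limits` and `queue_for`, the sample RTT values, and the rule for mapping an unseen RTT to a queue are all assumptions made for illustration.

```python
def jenks_limits(values, k):
    """Return k (lower, upper) limits minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    # Prefix sums give the squared deviation of any slice in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(data):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def ssd(i, j):
        # Sum of squared deviations from the mean of data[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting data[:j] into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + ssd(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    cut[c][j] = i
    # Walk the cut table backwards to recover each class's range.
    limits = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        limits.append((data[i], data[j - 1]))
        j = i
    return limits[::-1]


def queue_for(rtt, limits):
    """Hypothetical rule: first queue whose upper limit covers the RTT."""
    for q, (_, upper) in enumerate(limits):
        if rtt <= upper:
            return q
    return len(limits) - 1


# Made-up RTT samples in milliseconds, split into K = 3 queues.
limits = jenks_limits([10, 12, 11, 50, 52, 100, 98], k=3)
```

With these sample values the three groups are the flows near 10 ms, near 50 ms, and near 100 ms, so `queue_for(51, limits)` selects the middle queue. The cubic-time dynamic program is adequate for a sketch; a production control plane would likely need a faster variant for large flow counts.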
Evaluation Topology. Hosts are created as Linux network namespaces via Mininet on physical servers. iPerf3 is used to generate data transfers. Optical taps in both directions copy traffic to the P4 switch. The P4 switch sends the control rules through the management port. 12
Test 1: TCP Flow Separation and Fairness Analysis. In this experiment, flows are separated and the available bandwidth is distributed evenly as a function of the number of flows. The system can identify and separate the flows into different queues and achieve a fair bandwidth share. [Figures: CUBIC without separation; CUBIC with separation] 13
Test 1: TCP Flow Separation and Fairness Analysis. BBR flows with longer RTTs are dominant over flows with shorter RTTs. Implementing flow separation ensures that BBR flows evenly share the bandwidth. The system can enforce fairness independently of the CCA. [Figures: BBR without separation; BBR with separation] 14
Test 2: Analyzing the Impact of RTT Disparities. CUBIC maintains a bandwidth share of around 60% for RTT values below 50 ms. The flow with the lower RTT (i.e., Flow 1) obtains the largest bandwidth share. [Figures: CUBIC without separation; CUBIC with separation] 15
Test 2: Analyzing the Impact of RTT Disparities. BBR becomes more aggressive with a slight increase in the RTT. This experiment demonstrates that the proposed system can enhance the fairness of competing flows, regardless of the design principles of the CCA being used. [Figures: BBR without separation; BBR with separation] 16
Test 3: FCT of Competing Flows. This test analyzes a scenario with 1000 flows, each transferring 150 MB of data. RTTs are randomly assigned between 1 ms and 100 ms. Increasing the number of queues reduces the flow completion time (FCT) of long flows. [Figures: 100% CUBIC; 75% CUBIC, 25% BBR; 50% CUBIC, 50% BBR; 25% CUBIC, 75% BBR; 100% BBR] 17
Test 4: Dynamic Conditions. In a scenario with dynamic conditions, the proposed system can achieve a link utilization between 85% and 95%. 18
Conclusion. The proposed system segregates TCP flows based on their RTT. The impact of RTT unfairness and queue imbalance is reduced. The system can enforce fairness independently of the TCP CCA. Results show an improvement in the fairness of competing flows, the FCT, and the RTT of individual flows. Future work will validate the system's performance and scalability using real-world traces. 19
Acknowledgement. This work was supported by the U.S. National Science Foundation (NSF), under grant number 2118311, and by the Office of Naval Research (ONR), under grant number N00014-23-1-2245. 20
For additional information, please refer to https://research.cec.sc.edu/cyberinfra/ Email: {gomezgaj, ekfoury}@email.sc.edu, jcrichigno@cec.sc.edu, srivastavag@brandonu.ca 21
Additional Test: Rebalancing the Queue. Queue imbalance occurs when TCP flows underutilize the allocated bandwidth. The system identifies the link that is underutilized and redistributes the bandwidth. [Figures: CUBIC without separation; CUBIC with separation] 22
Additional Test: Rebalancing the Queue. This mechanism works independently of the CCA used by the end hosts. The system can effectively rebalance the queue. [Figures: BBR without separation; BBR with separation] 23
Proposed Solution: System Overview. The system uses passive taps to collect the traffic and send a copy to a P4 switch. The data plane of the P4 switch identifies the flows and calculates the RTT and throughput of individual flows. The control plane implements a classification algorithm and creates the rules that separate the flows in the legacy router. 24