Traffic Localization at the Edge: Implications for User Performance
This paper examines how major content providers localize traffic at the network edge, the methodology used to measure traffic locality, and its impact on user performance. Using data from the University of Oregon, the study identifies top content providers such as Netflix and Akamai and investigates the stability of these providers across different snapshots.
Presentation Transcript
A View From the Edge: A Stub-AS Perspective of Traffic Localization and its Implications
Bahador Yeganeh*, Reza Rejaie*, Walter Willinger+
*University of Oregon, +NIKSUN Inc.
Research supported by NSF Award NeTS-1320977
Introduction
In today's Internet, content providers (CPs) often serve their content from front-end servers at datacenters, content distribution networks (CDNs), and cache servers. Cache servers may be placed in other networks.
This localization of traffic at the edge of the network achieves two goals:
- Decreasing upstream costs for the CP
- Improving end-user performance (delay & throughput)
Two basic questions about this well-known strategy:
- What is the observed level of locality for traffic delivered to an edge network?
- What is the resulting performance improvement for users?
Related Work
Prior measurement studies on content delivery mechanisms fall into two broad categories:
- Uncovering the global infrastructure of a given CP [Bottger16, Calder13, Adhikari12, Fan15, Torres11, Triukose11]
- Evaluating user-perceived performance and efficiency of a CP or specific content type (e.g. video, web, file-sharing) [Gehlen12, Ager11, Bermudez13, Casas14, Jiang13, Akhtar16]
What is missing: capturing and characterizing the degree of locality for content delivered from all CPs to an edge network, and its actual effect on user performance.
This Paper Presents
- A methodology to measure the locality of content delivered from major content providers to an edge network
- A characterization of several aspects of traffic locality
- The performance implications of traffic locality
Target Network & Dataset
- University of Oregon (UO) as the target edge network, serving about 24K students and 4K staff members
- Utilizing unsampled Netflow data for incoming flows
- Using RouteViews to map the source IP of each flow to its ASN, i.e. mapping each flow to its corresponding CP (illustrated below)
- 10 daily snapshots (Tue & Wed of 5 consecutive weeks) starting in October 2016
Average characteristics of a daily snapshot of Netflow data:
  Bytes: 8.8 TB | Flows: 202M | ASes: 39K | IPs: 3.7M
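As an illustration of the flow-to-CP mapping step, the sketch below performs a longest-prefix match of a flow's source IP against prefix-to-ASN entries such as those derived from a RouteViews RIB dump. The prefix table and flow records are illustrative placeholders, not the authors' actual pipeline.

```python
import ipaddress

# Hypothetical prefix-to-ASN table, e.g. parsed from a RouteViews RIB dump.
# In practice this would hold hundreds of thousands of entries.
PREFIX_TO_ASN = {
    ipaddress.ip_network("23.246.0.0/18"): 2906,   # Netflix (example prefix)
    ipaddress.ip_network("23.32.0.0/11"): 16625,   # Akamai (example prefix)
}

def ip_to_asn(src_ip: str):
    """Longest-prefix match of a flow's source IP against the RIB snapshot."""
    addr = ipaddress.ip_address(src_ip)
    best = None
    for prefix, asn in PREFIX_TO_ASN.items():
        if addr in prefix and (best is None or prefix.prefixlen > best[0].prefixlen):
            best = (prefix, asn)
    return best[1] if best else None

# Aggregate delivered bytes per CP (ASN) from (src_ip, bytes) flow records.
flows = [("23.246.2.10", 1_500_000), ("23.40.1.5", 80_000)]
bytes_per_asn = {}
for src_ip, nbytes in flows:
    asn = ip_to_asn(src_ip)
    if asn is not None:
        bytes_per_asn[asn] = bytes_per_asn.get(asn, 0) + nbytes
print(bytes_per_asn)
```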
Identifying Top CPs
- Distribution of delivered bytes per CP is very skewed: roughly 20 CPs deliver 90% of incoming bytes ("top CPs"); see the sketch below for how such a cutoff can be computed
- Top CPs are known players: Netflix, Akamai, Twitter, ...
- We later discuss why Google is not among the top CPs!
- Are the top CPs stable across different snapshots?
Figure: Volume of incoming traffic by top 21 CPs on 10/04/16
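A minimal sketch of the top-CP cutoff, assuming a per-CP byte count like the one built above; the 90% threshold follows the slide, while the names and numbers are purely illustrative.

```python
def top_cps(bytes_per_cp: dict, coverage: float = 0.90) -> list:
    """Smallest set of CPs that together deliver `coverage` of all incoming bytes."""
    total = sum(bytes_per_cp.values())
    ranked = sorted(bytes_per_cp.items(), key=lambda kv: kv[1], reverse=True)
    picked, cum = [], 0
    for cp, nbytes in ranked:
        picked.append(cp)
        cum += nbytes
        if cum >= coverage * total:
            break
    return picked

# Toy example of a heavily skewed byte distribution across CPs.
example = {"Netflix": 900, "Akamai": 500, "Twitter": 200, "OtherA": 30, "OtherB": 20}
print(top_cps(example))  # ['Netflix', 'Akamai', 'Twitter']
```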
Top CPs - Prevalence
- Measuring the persistence of top CPs across snapshots: 21 CPs consistently appear among the top CPs
- Our analysis focuses on these 21 top CPs
Figure: Prevalence of top CPs among all snapshots along with their rank distribution
Top CPs - Target Source IP Addresses
- Distribution of delivered bytes across individual source IPs per CP is very skewed
- A very small fraction of all source IPs delivers 90% of the bytes from each top CP ("top IPs")
- ~50K unique top IPs for all CPs across all snapshots
Figure: Distribution of top IPs for target CPs along with the total number of top IPs (blue line) in comparison to all IPs observed per CP (red line)
Top CPs - Distance Measurement
- Delay, geographic distance, and hop count could each serve as a distance metric to measure the locality of each source IP; geographic and hop distance are either unreliable or too coarse
- Measuring RTT toward target IP addresses using traceroute; RTT is representative of the delivery path's delay
- 10 rounds of paris-traceroute measurements in one day
- Minimum observed RTT (minRTT) per top IP is used as the server's distance; minRTT is largely invariant to congestion and RTT fluctuations, hence representative of the actual server's distance
- Outcome: the amount of delivered bytes & minRTT for each IP (see the sketch below)
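A rough sketch of the minRTT step, under the assumption that each measurement round yields one RTT sample (in ms) per destination IP; the sample data is a placeholder, not output from the authors' paris-traceroute runs.

```python
from collections import defaultdict

def min_rtt_per_ip(rounds: list) -> dict:
    """Collapse repeated measurement rounds into one minRTT (ms) per destination IP."""
    samples = defaultdict(list)
    for rtts in rounds:                 # one {ip: rtt_ms} dict per measurement round
        for ip, rtt_ms in rtts.items():
            samples[ip].append(rtt_ms)
    return {ip: min(vals) for ip, vals in samples.items()}

# Toy data: 3 of the 10 daily rounds, RTTs in milliseconds.
rounds = [
    {"23.246.2.10": 4.1, "23.40.1.5": 19.7},
    {"23.246.2.10": 3.8, "23.40.1.5": 22.3},
    {"23.246.2.10": 5.0, "23.40.1.5": 18.9},
]
print(min_rtt_per_ip(rounds))  # {'23.246.2.10': 3.8, '23.40.1.5': 18.9}
```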
Characterizing Traffic Locality for Top CPs to UO
- Traffic locality of each CP can be presented as a CDF of the fraction of bytes delivered from RTT distance x
- Utilizing a radar plot to visualize the overall traffic locality of all major CPs
- Each radius presents the RTT distance for the 50th, 75th, and 90th percentile of bytes delivered from each CP to UO (a sketch of this computation follows)
- The radar plot presents the footprint of content delivered to UO from the Internet
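A minimal sketch of the byte-weighted percentile radii such a radar plot would encode, assuming per-IP (bytes, minRTT) pairs as produced above; the percentile choices follow the slide, everything else is illustrative.

```python
def weighted_rtt_percentile(ip_stats: list, pct: float) -> float:
    """RTT (ms) within which `pct` fraction of a CP's bytes are delivered.

    ip_stats: list of (delivered_bytes, min_rtt_ms) tuples, one per top IP of the CP.
    """
    total = sum(b for b, _ in ip_stats)
    cum = 0.0
    for nbytes, rtt in sorted(ip_stats, key=lambda x: x[1]):  # nearest servers first
        cum += nbytes
        if cum >= pct * total:
            return rtt
    return float("nan")

cp = [(6e9, 4.0), (3e9, 8.0), (1e9, 45.0)]   # toy (bytes, minRTT) pairs for one CP
for p in (0.50, 0.75, 0.90):
    print(p, weighted_rtt_percentile(cp, p))
```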
Traffic Locality for CPs - Radar Plots
Traffic locality varies among the top CPs:
- 90% of traffic for 13 CPs is delivered from within a 60 ms radius
- For 9 of the top CPs the radius shrinks to 20 ms
Bimodal mode of delivery for some CPs, e.g. CenturyLink:
- Larger flows are delivered from close-by servers
- Smaller flows are delivered from servers that are farther away
Figure: Radius of delivered traffic for the 50th, 75th, and 90th percentile of traffic volume per CP
Traffic Locality for CPs - Infrastructure Utilization
- How well does a CP utilize the visible portion of its infrastructure?
- Normalized Weighted Locality (NWL): the byte-weighted RTT of a CP's delivered traffic, normalized by the CP's minRTT
- NWL = 1 indicates the maximum level of traffic locality for a CP
- NWL measures how well a CP utilizes its infrastructure to localize the traffic it delivers to UO
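One consistent way to write this definition (an interpretation of the slide's wording, not a formula quoted from the paper) is as the byte-weighted mean RTT normalized by the RTT of the CP's closest observed server:

```latex
\mathrm{NWL}_{\mathrm{CP}} \;=\;
\frac{\sum_{i \in \mathrm{IPs}_{\mathrm{CP}}} b_i \cdot \mathrm{minRTT}_i}
     {\left(\sum_{i \in \mathrm{IPs}_{\mathrm{CP}}} b_i\right)\cdot
      \min_{i \in \mathrm{IPs}_{\mathrm{CP}}} \mathrm{minRTT}_i}
```

where b_i is the number of bytes delivered from server IP i; under this reading, NWL equals 1 exactly when all bytes come from the CP's closest observed server.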
Traffic Locality for CPs - Infrastructure Utilization (cont.)
Examining the distribution of NWL across snapshots for each CP:
- Tight distribution for the majority of CPs => delivery pattern is consistent
- Most CPs have NWL < 2 => they utilize the locations closest to UO for content delivery
- NWL for transit CPs is larger and exhibits wider variations across snapshots
Figure: Distribution of the NWL measure for target CPs in all snapshots (top figure) along with minRTT for each CP (bottom figure)
Guest Servers
- Guest servers: servers that a CP deploys in other (host) networks, e.g. Akamai deploys more than 200K servers in 1500 unique host networks (such as Comcast)
- Our Netflow-based approach does not detect guest servers and maps their flows to the host network
- Detecting guest servers presents a more accurate picture of content delivery for each CP
Guest Server - Detection Methodology
A methodology to identify Akamai's guest servers, which could be extended to other CPs. Key steps of the methodology (sketched below):
- Identifying a URL for a small and common object served by the target CP ("reference object")
- Constructing a custom HTTP request with matching header fields to request the reference object
- Probing (port 80 of) all target IPs using the constructed HTTP request; a single positive response indicates a guest server
- Comparing the first 100 bytes of the response payload to avoid false positives
Validating our approach against Akamai's own servers: accuracy of 99% for content-serving Akamai servers
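A simplified sketch of this probing step using the `requests` library; the reference-object path, the Host header value, the expected payload prefix, and the candidate IPs are hypothetical placeholders, not the values used in the study.

```python
import requests

# Hypothetical reference object: a small, widely cached file served by the target CP.
REF_HOST = "example-object.akamaized.net"     # placeholder Host header
REF_PATH = "/static/pixel.gif"                # placeholder object path
EXPECTED_PREFIX = b"GIF89a"                   # first bytes of the known-good payload

def is_guest_server(ip: str, timeout: float = 3.0) -> bool:
    """Probe port 80 of `ip` for the reference object and compare payload prefixes."""
    try:
        resp = requests.get(
            f"http://{ip}{REF_PATH}",
            headers={"Host": REF_HOST},       # steer the request to the CP's vhost
            timeout=timeout,
            allow_redirects=False,
        )
    except requests.RequestException:
        return False
    # A 200 response whose leading bytes match the reference object counts as a hit.
    return resp.status_code == 200 and resp.content[:100].startswith(EXPECTED_PREFIX)

candidates = ["203.0.113.10", "198.51.100.7"]  # example target IPs
print([ip for ip in candidates if is_guest_server(ip)])
```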
Guest Server - Identification
- Probing all of the top 50K IP addresses of the 21 target CPs
- 658 unique guest servers in the following 7 ASes were identified across all snapshots: NTT, CenturyLink, OVH, Cogent, Comcast, Dropbox and Amazon
- Guest servers are responsible for 121-259 GB (9-20%) of daily traffic from Akamai to UO
- Akamai's own servers (34-103 of them) deliver about 12 times more traffic than guest servers
- Guest servers are mainly used for load balancing and cache misses
Guest Server - Locality
- Akamai's own servers offer better locality: 75% of traffic from within 4 ms, 90% of traffic from within 8 ms
- Guest servers in CenturyLink, NTT and OVH reside at 8, 15, and 20 ms radii, respectively
Figure: Radius of delivered traffic for the 50th, 75th, and 90th percentile of traffic volume (left figure) and number of flows (right figure) per CP
Implications of Traffic Locality - Throughput Distribution
- Does higher locality lead to a higher average throughput?
- 90% of all flows from all CPs (except Level3) achieve a low average throughput (< 0.5 MB/s), even though the majority of CPs serve content from very localized locations (< 20 ms radius)
- Why do we observe subpar performance?
Figure: Distribution of average throughput for incoming flows of top CPs in all snapshots
Implications of Traffic Locality - Throughput Bottlenecks
Candidate bottlenecks:
- Content: the flow does not carry enough content to fill the pipe
- Receiver: the receiver's access link is the bottleneck
- Network: network bandwidth is limited due to cross traffic (and the resulting loss rate)
- Server rate limit: the CP's server limits the transmission rate, either implicitly due to limited capacity or explicitly due to the bandwidth requirements of the content (Netflix video doesn't require more than 0.6 MB/s)
Implications of Traffic Locality - Throughput Bottlenecks (cont.)
To identify the underlying cause(s) of subpar throughput, we focus on elephant flows (> 1 MB) that can fully utilize the available bandwidth:
- 510-570K flows per CP, i.e. 3-4% of flows per CP
- Average packet size of 1400 B
- These flows are delivered to various types of UO users (wired, wireless, and residential) => the client access link shouldn't be the bottleneck
- Either the network or the server is limiting the throughput of elephant flows
Implications of Traffic Locality - Throughput Bottlenecks (cont.)
- Maximum Achievable Throughput (MAT) represents the 95th percentile of throughput per RTT distance
- The closest loss-rate curve to a CP's MAT depicts the inferred loss rate for that CP if bandwidth were the only limiting factor, using the TCP throughput model T ≈ MSS / (RTT · √p)
- Inferred loss rate for the majority of CPs is 10^-3 or above
- Actual measured throughput is orders of magnitude smaller => the network is not the bottleneck
- Diminishing returns for content locality
Figure: MAT vs minRTT for top CPs along with estimated TCP throughput for various loss rates
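A small sketch of this style of inference using the Mathis TCP throughput model (the constant factor is dropped); treating this as the exact model behind the slide's curves is an assumption, and the MSS, RTT, and throughput values below are illustrative.

```python
def inferred_loss_rate(throughput_bps: float, rtt_s: float, mss_bytes: int = 1400) -> float:
    """Loss rate p implied by the Mathis model T ≈ MSS / (RTT * sqrt(p))."""
    mss_bits = mss_bytes * 8
    return (mss_bits / (rtt_s * throughput_bps)) ** 2

# Toy example: a MAT of ~5 MB/s (40 Mb/s) observed over a 10 ms path.
p = inferred_loss_rate(throughput_bps=4e7, rtt_s=0.010)
print(f"inferred loss rate ~ {p:.1e}")   # on the order of 10^-3 for these inputs
```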
Conclusions
- Captured and presented the locality of flows delivered from major CPs to UO users: most of the content of major CPs is delivered from local servers (within a 20 ms RTT)
- Devised a new approach to detect guest (Akamai) servers and characterize their contributions
- Evaluated the performance benefits of content locality: the average throughput of content delivered from major CPs is limited by the front-end servers and thus is not improved by the locality of those servers
Thank You!