Enhancing Internet Telephony Quality Through Predictive Relay Selection
Examining the quality of Internet telephony in relation to network performance, this research explores the use of Managed Overlay to improve call quality for services like Skype. Analysis of 430 million Skype calls reveals that a significant portion experience poor network performance, emphasizing the importance of effective relay selection. The study suggests that a data-driven approach to relay selection can significantly enhance call quality, presenting VIA as a solution for optimal relay selection. Managed Overlay networks are highlighted as a promising means to address network performance challenges in Internet telephony. With the goal of alleviating poor network performance, this research offers insights into the potential benefits of Managed Overlays in enhancing the reliability and quality of Internet telephony services.
Presentation Transcript
VIA: Improving Internet Telephony Quality Using Predictive Relay Selection. Junchen Jiang, Rajdeep Das, Ganesh Ananthanarayanan, Philip A. Chou, Venkata N. Padmanabhan, Vyas Sekar, Esbjorn Dominique, Marcin Goliszewski, Dalibor Kukoleca, Renat Vafin, Hui Zhang
Key takeaways in one minute. We studied 430 million Skype calls and found: (1) One fifth of calls use paths with poor network performance. (2) Managed Overlay could alleviate over half of the calls on these paths. (3) VIA: Data-driven relaying can realize most of Managed Overlay's potential.
Internet telephony is everywhere! More apps are focusing on Internet telephony, with rapid growth over the last decade. [Figure: peak number of concurrent online Skype users (millions), 2004 to 2013.] Sources: https://www.statista.com/chart/1417/skype-usage/ and https://blogs.skype.com/2013/04/03/thanks-for-making-skype-a-part-of-your-daily-lives-2-billion-minutes-a-day/
Call quality is sensitive to bad network performance. Datasets: average RTT, loss rate, and jitter for each of 430 million Skype calls; a small fraction have user-provided quality scores. [Figure: normalized poor call rate versus RTT (ms), loss rate (%), and jitter (ms).] Thresholds of poor network performance: RTT 320 ms, loss rate 1.2%, jitter 12 ms.
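The thresholds on this slide can be applied to per-call averages with a few lines of code. The following is a minimal sketch, not the authors' tooling: the function and constant names are illustrative assumptions, and the strict "greater than" comparison follows the "RTT > 320ms; Loss rate > 1.2%; Jitter > 12ms" wording used later in the talk.

```python
# Thresholds are taken from the slides; all names here are illustrative.
RTT_MS_MAX = 320.0
LOSS_PCT_MAX = 1.2
JITTER_MS_MAX = 12.0


def poor_metrics(rtt_ms, loss_pct, jitter_ms):
    """Return the names of the metrics that exceed their threshold."""
    bad = []
    if rtt_ms > RTT_MS_MAX:
        bad.append("rtt")
    if loss_pct > LOSS_PCT_MAX:
        bad.append("loss")
    if jitter_ms > JITTER_MS_MAX:
        bad.append("jitter")
    return bad


def is_poor(rtt_ms, loss_pct, jitter_ms):
    """A call counts as poor if at least one metric is bad."""
    return bool(poor_metrics(rtt_ms, loss_pct, jitter_ms))
```

For example, `poor_metrics(400, 0.5, 20)` flags RTT and jitter but not loss, which corresponds to the "at least one bad" category in the later charts.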
Many calls have poor network performance. [Figure labels: 17%, 17%, 12%.] One fifth of calls have poor network performance. Our goal: alleviate poor network performance for Skype.
Outline. Problem: Network performance of Skype is bad. Opportunity: Managed Overlay has huge potential. Solution: VIA for optimal relay selection. Evaluation: VIA is close to optimal.
Revisiting Overlay Networks by Managed Overlay. Managed Overlay has new benefits: worldwide distributed DCs as relays; well connected; deployed by many providers; single administrative entity. How much can Internet telephony benefit from it?
Selecting the best relay option. [Figure: direct path, one-relay hop, multi-relay hops.] The key is to select the best relay option (direct, one-relay, or multi-relay). Q1: Does picking the best relay option have a significant impact? Q2: If so, how do we pick the best relay option?
Managed Overlay has huge potential benefit. Consider an oracle that picks the best relay option for each src-dst AS pair in 24 hours. Poor performance: RTT > 320 ms; loss rate > 1.2%; jitter > 12 ms. [Figure: % of calls (legend: Bad, Good) for RTT, Loss, Jitter, and At least one bad.] A substantial fraction of bad-performance calls could be alleviated.
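The oracle described here can be sketched as a lookup over one day of call records. This is a hedged illustration only: the record layout, the function name, and the choice of "lowest mean of a single metric" as the definition of "best" are assumptions, since the slide does not say how the oracle ranks options.

```python
from collections import defaultdict
from statistics import mean


def oracle_best_options(calls):
    """Pick, per (src AS, dst AS) pair, the relay option with the lowest mean metric.

    calls: iterable of (src_as, dst_as, relay_option, metric_value) tuples
    covering one 24-hour window; lower metric values are better (assumption).
    """
    samples = defaultdict(lambda: defaultdict(list))
    for src_as, dst_as, option, value in calls:
        samples[(src_as, dst_as)][option].append(value)

    best = {}
    for pair, by_option in samples.items():
        # The oracle sees every option's observed performance in hindsight.
        best[pair] = min(by_option, key=lambda opt: mean(by_option[opt]))
    return best
```

Because the oracle uses hindsight, it gives an upper bound on what any online relay selector can achieve, which is how the talk uses it.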
VIA: Realizing the benefit of Managed Overlay using centralized predictive control. The VIA control algorithm predicts the best relay option based on other calls' performance. [Diagram labels: quality of existing calls; relay selection.]
Strawman 1: Pure prediction-based. Use long-term history to predict performance. [Diagram labels: relay options; a new call; call history; quality prediction.] Problem: Call performance has great inherent variance. For example, predicting the next day using the last week leads to over 30% error on latency.
Strawman 2: Pure exploration-based. In a short time window, explore relay options, then exploit the best one. [Diagram labels: relay options; calls of one AS pair per day.] Problem: Call distribution is highly skewed. The condition "# of actual calls >> # of relay options" is NOT true for most AS pairs.
Key idea: Guided exploration. [Diagram labels: Strawmen; VIA; XOR; Prediction-based; Exploration-based.] [Figure: probability of the best relay being in the top k, for k from 0 to 9.] Predicting the top 1 is too hard. The top k can be more easily predicted and can be more efficiently explored! VIA's idea: Guided Exploration. Rough prediction can still identify top-k candidates, which can be explored efficiently.
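The "probability of the best relay being in top k" curve can be estimated from paired predictions and observations. The sketch below assumes latency as the metric (lower is better) and a hypothetical input layout of per-AS-pair dictionaries; neither is specified on the slide.

```python
def top_k_hit_rate(cases, k):
    """Fraction of cases where the truly best relay is among the k best predicted.

    cases: list of (predicted, actual) pairs; each is a dict mapping a relay
    option to a latency in ms (lower is better). Layout is an assumption.
    """
    if not cases:
        raise ValueError("need at least one case")
    hits = 0
    for predicted, actual in cases:
        top_k = sorted(predicted, key=predicted.get)[:k]
        best_actual = min(actual, key=actual.get)
        hits += best_actual in top_k
    return hits / len(cases)
```

Evaluating this for increasing k shows the effect the slide describes: a prediction that often misses the single best option can still place it inside a small candidate set.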
Step 1: Prediction-based pruning. Focus on relay options whose confidence intervals are better than those of others. [Figure: confidence intervals of predicted latency per relay option, computed from call history. The upper bounds of blue and yellow are better than the lower bounds of green and red. Label: top-k candidates.]
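One way to read the pruning rule is: drop any relay option whose best case (lower bound on latency) is still worse than another option's worst case (upper bound). The sketch below follows that reading. The normal-approximation interval (mean plus or minus z times the standard error, z = 1.96) and all names are assumptions; the slides do not state how VIA computes its intervals.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval(samples, z=1.96):
    """Normal-approximation interval for the mean latency (needs >= 2 samples)."""
    m = mean(samples)
    half_width = z * stdev(samples) / sqrt(len(samples))
    return m - half_width, m + half_width


def prune(history, z=1.96):
    """Keep options whose lower bound does not exceed the best upper bound.

    history: dict mapping relay option -> list of observed latencies (ms).
    """
    intervals = {opt: confidence_interval(s, z) for opt, s in history.items()}
    best_upper = min(upper for _, upper in intervals.values())
    return sorted(opt for opt, (lower, _) in intervals.items() if lower <= best_upper)
```

With the colors from the figure as hypothetical option names, `prune({"blue": [100, 110, 105, 95], "yellow": [108, 112, 104, 116], "green": [200, 210, 190, 205], "red": [300, 310, 290, 305]})` keeps blue and yellow and drops green and red.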
Step 2: Exploring top-k candidates. Multi-armed bandit (MAB) process: how to maximize rewards? Upper Confidence Bounds (UCB1): always pick the one with the highest UCB. Our problem looks like MAB: how to minimize latency? UCB1 with domain-specific twists.
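For reference, textbook UCB1 can be sketched as follows. This is the standard algorithm, not VIA's modified variant: the slides do not spell out the domain-specific twists. The `latency_reward` helper, which maps latency to a reward in [0, 1] with a 1000 ms cap, is a purely illustrative assumption for turning "minimize latency" into "maximize reward".

```python
import math


class UCB1:
    """Textbook UCB1 over a fixed set of arms with rewards in [0, 1]."""

    def __init__(self, arms):
        self.counts = {arm: 0 for arm in arms}
        self.totals = {arm: 0.0 for arm in arms}

    def select(self):
        # Play every arm once before using the confidence bound.
        for arm, n in self.counts.items():
            if n == 0:
                return arm
        t = sum(self.counts.values())

        def ucb(arm):
            n = self.counts[arm]
            return self.totals[arm] / n + math.sqrt(2 * math.log(t) / n)

        return max(self.counts, key=ucb)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def latency_reward(latency_ms, cap_ms=1000.0):
    """Hypothetical mapping: lower latency gives a higher reward in [0, 1]."""
    return 1.0 - min(latency_ms, cap_ms) / cap_ms
```

In a per-call loop, the controller would call `select()` to choose a relay option among the top-k candidates, observe the call's latency, and feed `latency_reward(latency)` back through `update()`.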
Putting them together: Guided Exploration in action. Prediction-based pruning: predictive selection of the top-k candidates from all relay options, using call history; updates every T hours. Tomography-based coverage expansion: to explore more relays. Real-time relay selection: modified UCB1 on the top-k candidates per AS pair; runs per call. [Diagram label: performance measurements.]
More in our paper: budgeted relaying; network tomography; granularity of prediction; international vs. domestic calls.
VIA achieves close-to-optimal performance. Poor performance: RTT > 320 ms; loss rate > 1.2%; jitter > 12 ms. [Figure: % of calls (legend: Bad, Good) for RTT, Loss, Jitter, and At least one bad, comparing Oracle, Prediction-based, VIA, and Exploration-based.]
Benefit varies across ASes. [Figure: % of calls (legend: Bad, Good) for Source AS1, Source AS2, and Source AS3, comparing Oracle and VIA. Annotations: substantial improvement; room for further improvement; limited room for improvement.]
Conclusion. Problem: One fifth of calls have bad network performance. Opportunity: Managed Overlay could significantly reduce bad-performance calls. Challenge: optimal relay selection. Pure prediction and pure exploration won't work! Solution: VIA can realize most of the benefit of Managed Overlay. Key idea: Guided exploration, that is, predictive pruning plus exploration over the top-k candidates. Takeaway 1: Managed Overlay has huge potential for Skype. Takeaway 2: Data-driven predictive relaying is the key enabler.
Thank you!
Related work. Overlay routing (e.g., RON, Detour): last-mile bottlenecks, small-scale deployment; MO: large-scale global infrastructure, compelling use case. VoIP measurement (IMC12, etc.): active measurements; first large-scale measurement on Skype. Server selection (e.g., DONAR): server in proximity to client; focus on end-to-end performance. Moving supernodes to data centers (e.g., Hangout, Skype): it is a trend, and we are in line with it; we focus on connectivity as well as performance; middle part cuts through the MO backbone. Internet performance prediction: active probing; existing data on performance.
Real-world controlled deployment: 14 modified Skype clients across Singapore, India, USA, UK, and Sri Lanka. Pair-wise calls, each with 9-20 different relay options.
Q&A. What about Google? We only talk about Skype. What about supernodes? P2P -> Cloud. Connectivity -> Performance. Through cloud. Future directions: talked to Skype team.