Cloud-Scale VM Deflation for Running Interactive Applications on Transient Servers

Slide Note

This research explores the concept of deflatable virtual machines to run interactive applications on transient cloud servers without facing unexpected preemption. By reclaiming resources from low-priority VMs and allowing forward progress with some performance degradation, the method aims to provide cost-effective solutions without downtime. Benefits include increased cluster utilization and better availability for cloud providers, with a tradeoff of possible performance degradation due to resource reduction.

Uploaded on Oct 09, 2024

Presentation Transcript

Cloud-scale VM Deflation for Running Interactive Applications On Transient Servers
Alexander Fuerst, Ahmed Ali-Eldin, Prashant Shenoy, and Prateek Sharma

Transient Computing
1. Surplus cloud resources are sold at discounted rates as low-priority Virtual Machines (VMs): AWS Spot Instances, Azure Batch VMs, Google Preemptible VMs.
2. Current transient cloud VMs are subject to unexpected revocation (preemption). Preemption leads to application failure and downtime, so these VMs are not ideal for interactive and long-running applications.
How can we provide such low-cost, low-priority VMs without preemption?
INDIANA UNIVERSITY BLOOMINGTON Cloud-scale Deflation

Deflatable Virtual Machines
1. Fractionally reclaim resources (CPU, memory, network, and disk bandwidth) from low-priority VMs, avoiding preemption.
2. Allow deflated VMs to make forward progress, albeit with some performance degradation.
3. Arrival of new VMs causes deflation of low-priority VMs.
4. Classic, high-priority On-Demand VMs are unaffected.
[Figure: a physical server hosting two On-Demand VMs and two Deflatable VMs, with an incoming VM]

Deflation Benefits & Tradeoff
For applications: black-box solution that supports all application classes; no need for fault tolerance handling or downtime.
For cloud providers: increase cluster utilization; utilize surplus resources; better availability.
Tradeoff: possible performance degradation because of reduced resources.
1. How feasible is deflation on typical cloud VMs?
2. How can we incorporate deflation in clouds?

Agenda
1. Transient computing
2. Deflatable Virtual Machines
3. Trace-based Deflatability Analysis
4. Server-Level Deflation
5. Cluster Deflation Management
6. Application Deflation Experiments
7. Related & Future work

Performance impact of deflation on typical Cloud VMs
1. Deflation causes underallocation.
2. For what fraction of time is VM utilization above the deflated allocation (performance degradation)?
3. If the deflated allocation is above the VM's utilization, there is no performance degradation (slack).
4. Analyze traces from Azure and Alibaba.

How much can CPU resources be deflated by?
1. Use time series of resource (CPU) utilization of Azure VMs.
2. Measure the fraction of time that CPU usage is higher than different deflation targets.
3. Interactive VMs will be impacted less than 20% of the time at 50% deflation.
4. Most batch VMs are impacted under 30% of the time.
5. Interactive applications are over-provisioned to handle peak load.
[Figure: Fraction of time different VM types face underallocation, for different deflation levels]
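The underallocation metric on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name `underallocation_fraction`, the normalized utilization samples, and the example trace are all assumptions made for this example.

```python
def underallocation_fraction(cpu_util, deflation):
    """Fraction of samples in which CPU usage exceeds the deflated allocation.

    cpu_util:  utilization samples, each a fraction (0..1) of the VM's
               full, undeflated allocation (hypothetical input format).
    deflation: fraction of the allocation that is reclaimed,
               e.g. 0.5 for 50% deflation.
    """
    if not cpu_util:
        raise ValueError("need at least one utilization sample")
    if not 0.0 <= deflation <= 1.0:
        raise ValueError("deflation must be between 0 and 1")
    deflated_allocation = 1.0 - deflation
    # A sample is "impacted" when the VM wanted more than it would be given.
    impacted = sum(1 for u in cpu_util if u > deflated_allocation)
    return impacted / len(cpu_util)


# Made-up trace for illustration only; not taken from the Azure data.
trace = [0.10, 0.20, 0.15, 0.60, 0.30, 0.05, 0.70, 0.25, 0.20, 0.10]
fraction_at_half = underallocation_fraction(trace, 0.5)
```

Samples at or below the deflated allocation count as slack, so a mostly idle VM can be deflated heavily while being impacted only rarely.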

Memory and I/O Deflatability
1. The Alibaba dataset has memory and I/O data from containers.
2. Containers have high memory utilization, but they are primarily JVM apps: large heap but small working set, and low memory bandwidth usage (avg ~1%).
3. Very little disk and network usage (avg < 1%).
4. Cloud applications have significant deflatable slack.

How is deflation achieved? Deflation Mechanisms

Deflation Mechanisms: Explicit Guest-OS Deflation
1. The OS hot-unplugs memory and CPU, returning them to the hypervisor. This is explicit to the OS and the application, is done in coarse-grained units (whole vCPUs), and has a safety threshold that limits its effectiveness.
2. NICs and disks are unsafe to hotplug, so the hypervisor must handle them.
[Figure: VM allocation of CPUs, memory, NIC bandwidth, and disk bandwidth before and after hot unplug; the VM sees 2 vCPUs, 3 Mem, full bandwidth after unplug]

Deflation Mechanisms: Transparent Deflation via Hypervisor
1. Use traditional VM overcommitment mechanisms. CPU: multiplex vCPUs onto fewer physical CPUs. Memory: swap VM pages.
2. Transparent deflation adds overhead.
3. Combine mechanisms into hybrid deflation: first reclaim via hotplug up to the safety limit, then reclaim the remainder via the hypervisor.
[Figure: VM allocation of CPUs, memory, NIC bandwidth, and disk bandwidth under transparent deflation; the VM still sees 2 vCPUs, 3 Mem, full bandwidth]
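The hybrid split described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the function `hybrid_deflate`, its parameters, and the idea of expressing the hotplug safety limit as a single number are hypothetical and not taken from the presentation.

```python
def hybrid_deflate(target, hotplug_safe_limit, unit=1.0):
    """Split a reclamation target between guest hot-unplug and the hypervisor.

    target:             amount of a resource to reclaim (e.g. vCPUs).
    hotplug_safe_limit: largest amount that may be hot-unplugged safely
                        (assumed to be known; how it is chosen is not shown).
    unit:               hotplug granularity, e.g. 1.0 for whole vCPUs.
    Returns (via_hotplug, via_hypervisor).
    """
    if target < 0 or hotplug_safe_limit < 0 or unit <= 0:
        raise ValueError("target and limit must be >= 0, unit must be > 0")
    # Step 1: reclaim explicitly via hot-unplug, up to the safety limit,
    # rounded down to whole hotplug units.
    via_hotplug = (min(target, hotplug_safe_limit) // unit) * unit
    # Step 2: reclaim whatever is left transparently via the hypervisor
    # (vCPU multiplexing, swapping of VM pages).
    via_hypervisor = target - via_hotplug
    return via_hotplug, via_hypervisor


# Reclaim 2.5 vCPUs when at most 2 whole vCPUs can be unplugged safely.
hot, hyp = hybrid_deflate(2.5, 2)
```

Preferring hot-unplug first keeps the amount handled by the higher-overhead transparent path as small as the safety limit allows.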

How much to deflate? Cluster Deflation Policies

Server Deflation Policies
1. All server resources are allocated.
2. A VM needs to be created on the server.
3. We can't preempt to reclaim resources.
By how much do we deflate each VM?

Server Deflation Policies: Proportional Deflation
1. A simple solution is to reclaim shares in proportion to size.
2. Deflate each VM in proportion to its maximum resources ($M_i$).
3. The amount to reclaim ($R$) determines what percentage will be taken ($\alpha$): $\alpha = R / \sum_{i=1}^{n} M_i$.
4. Compute the deflation amount ($x_i$) for each VM: $x_i = \alpha M_i$, so that $\sum_{i=1}^{n} x_i = R$.
5. The incoming VM is included if it is deflatable.
[Figure: incoming VM of size $R = 1$; deflation amounts of .25, .5, and .25]
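A minimal Python sketch of proportional deflation, assuming the policy takes the same fraction of every VM's maximum allocation so that the reclaimed amounts sum to the requested total. The function name and the VM sizes in the example are assumptions; the sizes were chosen so that the result matches the .25, .5, .25 values shown on the slide for an incoming size of 1.

```python
def proportional_deflation(max_alloc, reclaim):
    """Deflation amount per VM when reclaiming `reclaim` proportionally.

    max_alloc: maximum (undeflated) allocation of each deflatable VM.
    reclaim:   total amount of the resource to reclaim.
    """
    total = sum(max_alloc)
    if total <= 0:
        raise ValueError("no deflatable capacity")
    if reclaim < 0 or reclaim > total:
        raise ValueError("reclaim must be between 0 and the total allocation")
    alpha = reclaim / total  # fraction taken from every VM
    return [alpha * m for m in max_alloc]


# Hypothetical VM sizes 1, 2, 1 with an incoming VM of size 1.
amounts = proportional_deflation([1, 2, 1], 1)  # [0.25, 0.5, 0.25]
```

Larger VMs give up more in absolute terms, but every VM loses the same percentage of its allocation.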

Server Deflation Policies: Priority Deflation
1. VMs are assigned a priority ($\pi_i$) at creation time.
2. Lower-priority VMs are deflated more.
3. Priority determines a VM's minimum allocation.
4. VMs cannot be deflated below their minimum allocation.
With $M_i$ the maximum allocation of VM $i$ and $R$ the amount to reclaim: $x_i = R \cdot \frac{M_i - \pi_i M_i}{\sum_{j=1}^{n} (M_j - \pi_j M_j)}$, so that $\sum_{i=1}^{n} x_i = R$.
[Figure: physical server hosting four Deflatable VMs with priorities $\pi$ = 0.8, 0.6, 0.4, and 0.4, and an incoming VM]
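One way to read the priority policy is sketched below in Python. It assumes that a VM's minimum allocation is its priority times its maximum allocation, and that the amount to reclaim is divided in proportion to each VM's headroom above that minimum. Both the reading and the names (`priority_deflation`, `headroom`) are assumptions for illustration; the priorities in the example are the ones shown on the slide, while the VM sizes are made up.

```python
def priority_deflation(max_alloc, priority, reclaim):
    """Deflation amount per VM, weighted by headroom above the minimum.

    max_alloc: maximum allocation of each deflatable VM.
    priority:  priority of each VM in [0, 1]; the minimum allocation is
               assumed to be priority * max_alloc.
    reclaim:   total amount of the resource to reclaim.
    """
    if len(max_alloc) != len(priority):
        raise ValueError("max_alloc and priority must have the same length")
    # Headroom: how much each VM can give up before hitting its minimum.
    headroom = [m - p * m for m, p in zip(max_alloc, priority)]
    total = sum(headroom)
    if reclaim < 0 or reclaim > total + 1e-12:
        raise ValueError("cannot reclaim this much without going below "
                         "a VM's minimum allocation")
    if total == 0:
        return [0.0 for _ in max_alloc]
    return [reclaim * h / total for h in headroom]


# Four equally sized (hypothetical) VMs with the slide's priorities.
amounts = priority_deflation([4, 4, 4, 4], [0.8, 0.6, 0.4, 0.4], 3.6)
```

Because no VM ever gives up more than its headroom, a request that fits within the total headroom never pushes a VM below its minimum allocation.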

Server Deflation Policies: Reinflation
1. As VMs exit or are removed, their resources are reclaimed.
2. These can then be returned to deflated VMs.
3. Reinflation is easily computed as the inverse of the deflation calculation, with $R = -R_{free}$.
How do we decide where to place VMs?
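Reinflation can be sketched as running a deflation calculation with a negative reclaim amount. The Python below assumes the proportional policy and clamps each VM's deflation at zero so that no VM grows past its maximum allocation; the function name, the clamping, and the example numbers are assumptions rather than details from the slides.

```python
def reinflate(max_alloc, deflated_by, freed):
    """New per-VM deflation amounts after `freed` resources are returned.

    max_alloc:   maximum allocation of each deflatable VM.
    deflated_by: current deflation amount of each VM.
    freed:       resources released by VMs that exited.
    """
    total = sum(max_alloc)
    if total <= 0 or freed < 0:
        raise ValueError("need positive capacity and non-negative freed amount")
    # Inverse of proportional deflation: reclaim a negative amount.
    alpha = -freed / total
    # Clamp at zero: a VM is never inflated beyond its maximum allocation.
    return [max(0.0, d + alpha * m) for d, m in zip(deflated_by, max_alloc)]


# Hypothetical VMs of size 1, 2, 1, currently deflated by 0.25, 0.5, 0.25.
after_partial = reinflate([1, 2, 1], [0.25, 0.5, 0.25], 0.4)
after_full = reinflate([1, 2, 1], [0.25, 0.5, 0.25], 1)
```

With clamping, any freed capacity beyond what the deflated VMs can absorb simply remains free on the server.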

Cluster-Wide Deflation Policies
1. VM placement naturally affects deflation.
2. Track overcommitment of all VMs in the cluster.
3. Deflation-aware bin-packing intelligently chooses the host.
4. Physical servers can be partitioned, with deflatable VMs assigned to partitions by priority.

Cluster-Wide Deflation Policies: Deflation-Aware Bin Packing
1. No server has free resources for the incoming VM.
2. Quantify the deflatable resources of low-priority VMs.
3. Must account for existing server overcommitment.
Which server should we choose?

Cluster-Wide Deflation Policies: Deflation-Aware Bin Packing
1. Each server $j$ has a resource availability vector ($A_j$): $A_j = \text{free}_j + \left( \frac{\text{deflatable}_j}{\text{overcommitted}_j} \right)$.
2. Compute the cosine similarity fitness of each server based on the incoming VM size ($D$) and availability: $\text{fitness}(D, A_j) = \frac{A_j \cdot D}{|A_j| \, |D|}$.
3. Best fit: choose the server with the highest fitness.
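The cosine-similarity fitness can be sketched in Python as below. The availability vectors are taken as given inputs here; how free and deflatable resources are combined into them is not shown. The function names, the server names, and the two-dimensional (CPU, memory) vectors are assumptions for illustration.

```python
import math


def cosine_fitness(demand, availability):
    """Cosine similarity between a VM's demand and a server's availability."""
    if len(demand) != len(availability):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(d * a for d, a in zip(demand, availability))
    norm = (math.sqrt(sum(d * d for d in demand))
            * math.sqrt(sum(a * a for a in availability)))
    # A zero vector has no direction; treat it as the worst possible fit.
    return dot / norm if norm > 0 else 0.0


def best_fit_server(demand, servers):
    """Name of the server whose availability best matches the demand."""
    return max(servers, key=lambda name: cosine_fitness(demand, servers[name]))


# Hypothetical (CPU cores, memory GB) vectors.
servers = {"s1": (4, 8), "s2": (8, 2)}
chosen = best_fit_server((2, 4), servers)  # "s1": same CPU-to-memory ratio
```

Cosine similarity compares the shape of the demand with the shape of the availability rather than their magnitudes, so it favors servers whose spare resources come in the same ratio as the request.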

Let's do some science: Experimental Evaluation

Evaluation Questions
1. How does deflation affect interactive application throughput? Deflate highly interactive web apps and services: Wikipedia, MySQL, Memcached, Apache HTTP.
2. Does deflation reduce the chance of preemptions in overcommitted clusters? Evaluated with a cluster simulator (~2000 Python LOC) and real VM utilization data from Azure.
https://en.wikipedia.org/wiki/Wikipedia_logo

Wikipedia CPU Deflation
1. Mean response time is still less than 0.1 sec at 70% deflation.
2. No requests are dropped until 80% CPU deflation.

Azure Simulation Throughput Loss
1. Use Azure traces to estimate throughput loss.
2. Throughput is reduced by only 4% even at an extreme 80% cluster overcommitment.
3. Priority deflation keeps the loss under 1%, with priority assigned based on p95 CPU.
4. Loss is lower than overcommitment because of unused VM resources (slack).
5. Shows the effectiveness of the bin-packing and proportional deflation policies.

Preemption Reduction
1. VMs are preempted if they are deflated below their minimum threshold.
2. Negligible preemptions even at extreme cluster overcommitment (50%).
3. Interactive applications can thus be safely deflated without risk of downtime.

Future
1. GPU deflation
2. Increase in performance interference due to deflation?
3. Impact of deflatable VMs on end-users

Related Work
1. Resource Deflation: [EuroSys 19]
2. VM overcommitment: [Ballooning]
3. Burstable VMs: Inverse of deflatable VMs.
4. Transient computing & fault tolerance: [SpotWeb, SpotCheck]
5. Virtualized cluster management: [VMWare DRS]

Conclusions
1. Deflatable VMs: an alternative to preemptible VMs for transient computing.
2. Main idea: fractionally reclaim resources instead of outright preemption.
3. Real-world resource utilization (Azure and Alibaba traces) indicates that most applications are amenable to deflation.
4. Hybrid deflation uses hypervisor-level and OS-level overcommitment.
5. Proportional deflation and bin-packing policies can increase cluster overcommitment by 50%, resulting in < 5% decrease in application throughput.