CloudScale: Elastic Resource Scaling for Multi-Tenant Cloud Systems

Slide Note

CloudScale is an automatic resource scaling system designed to meet Service Level Objective (SLO) requirements with minimal resource and energy cost. The architecture comprises resource demand prediction, prediction error correction, scaling conflict handling, and predictive frequency/voltage scaling. Module 1 focuses on resource demand prediction using signature-driven and state-driven approaches, with techniques such as pattern window sizing and frequency domain analysis.

haar618 Follow

Uploaded on Sep 29, 2024 | 0 Views


Presentation Transcript

CloudScale: Elastic Resource Scaling for Multi-Tenant Cloud Systems
Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu, John Wilkes

CloudScale: Background
Background and Motivation
Infrastructure as a Service (IaaS) providers like Amazon EC2 use virtualization to provide isolation among users.
Service Level Objective (SLO): an agreement between a service provider and a customer about the performance of the service provider in terms of measurable characteristics (e.g. 90% of the requests are fulfilled within 100 ms).
[Figure: VM resource usage (0% to 100%) versus time (1 to 10) for VM1 and VM2 on one physical host, showing resource demand against the resource caps Cap 1 and Cap 2.]

CloudScale
Automatic resource scaling system.
Goal: meet the Service Level Objective (SLO) requirements of the applications with minimum resource and energy cost.

The CloudScale System Architecture
[Figure: a host running the Xen hypervisor with Dom 0 and several virtual machines. CloudScale consists of four modules: resource demand prediction, prediction error correction, scaling conflict handling, and predictive frequency/voltage scaling. Graph adapted from the original paper.]

Module 1: Resource Demand Prediction
Goal: predict future resource demand based on past resource usage.
Signature-driven resource demand prediction
[Figure: a resource usage time series divided into pattern windows P1, P2, P3, P4, P5.]
If all pairs of pattern windows Pi and Pj are similar (as determined by Pearson correlation), CloudScale uses the average values over all pattern windows to make its prediction.
$$\text{Pearson correlation}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
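The signature check on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the non-overlapping split into windows, and the similarity threshold of 0.85 are all assumptions, since the slide does not state them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: cov(X, Y) / (std(X) * std(Y))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a flat window counts as uncorrelated
    return cov / (sx * sy)


def signature_prediction(samples, window, threshold=0.85):
    """Return the averaged pattern window, or None if no signature is found.

    threshold is a hypothetical similarity cut-off, not a value from the slides.
    """
    # Split the history into consecutive, non-overlapping pattern windows.
    windows = [samples[i:i + window]
               for i in range(0, len(samples) - window + 1, window)]
    if len(windows) < 2:
        return None
    # Every pair of windows must be similar for a signature to exist.
    for i in range(len(windows)):
        for j in range(i + 1, len(windows)):
            if pearson(windows[i], windows[j]) < threshold:
                return None
    # Predict the next window as the per-position average of all windows.
    return [sum(col) / len(windows) for col in zip(*windows)]
```

When the function returns None, no signature was found, which is the case the state-driven approach is meant to cover.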

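Module 1: Resource Demand Prediction
How to determine the size of the pattern window?
Apply a Fast Fourier Transform (FFT) to the resource usage time series.
f_d = the frequency of the frequency component with the most signal power.
Signal power of f, where x(t) denotes the signal component at frequency f:
$$\lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} |x(t)|^2 \, dt$$
$$\text{Pattern window size} = \frac{r}{f_d}, \quad \text{where } r \text{ is the sampling rate}$$
[Figure: the same signal shown in the time domain (resource usage versus time) and in the frequency domain.] Graphs from http://en.wikipedia.org/wiki/Frequency_domain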
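A small sketch of the window-size calculation follows. To stay dependency-free it uses a naive discrete Fourier transform instead of a real FFT (same spectrum, slower), and it measures power as the squared magnitude of each coefficient. The function names and the choice to return the window size in samples are assumptions made for illustration.

```python
import cmath
import math


def dominant_frequency(samples, sampling_rate):
    """Frequency (in Hz) of the non-constant component with the most power."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # drop the zero-frequency component
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient for frequency bin k (an FFT gives the same value).
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sampling_rate / n


def pattern_window_size(samples, sampling_rate):
    """Pattern window size in samples: r / f_d."""
    fd = dominant_frequency(samples, sampling_rate)
    return round(sampling_rate / fd)


# Usage: a signal that repeats every 8 samples, sampled once per second.
usage = [50 + 20 * math.sin(2 * math.pi * t / 8) for t in range(64)]
window = pattern_window_size(usage, sampling_rate=1.0)
```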

Module 1: Resource Demand Prediction
State-driven resource demand prediction: used when no signature is found.
States: State 1 = CPU [0%, 30%), State 2 = CPU [30%, 60%), State 3 = CPU [60%, 100%).
State-transition matrix Pij (row = current state i, column = next state j):
          State 1   State 2   State 3
State 1   0.5       0.3       0.2
State 2   0.2       0.6       0.2
State 3   0.3       0.4       0.3
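The example matrix on this slide can be used directly as a discrete-time Markov chain. The sketch below reuses the slide's three CPU states and transition probabilities; the function names, the handling of exactly 100% CPU, and the idea of returning the most likely next state are illustrative assumptions.

```python
# Transition probabilities from the slide: P[i][j] = Pr(next = j | current = i).
P = [
    [0.5, 0.3, 0.2],  # from State 1, CPU [0%, 30%)
    [0.2, 0.6, 0.2],  # from State 2, CPU [30%, 60%)
    [0.3, 0.4, 0.3],  # from State 3, CPU [60%, 100%)
]
UPPER_BOUNDS = [30, 60, 100]


def state_of(cpu_percent):
    """Map a CPU usage percentage to a state index (0-based)."""
    for index, upper in enumerate(UPPER_BOUNDS):
        if cpu_percent < upper:
            return index
    return len(UPPER_BOUNDS) - 1  # assumption: 100% falls into the last state


def predict_distribution(current_state, steps=1):
    """Probability of each state after the given number of transitions."""
    dist = [0.0] * len(P)
    dist[current_state] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P)))
                for j in range(len(P))]
    return dist


def most_likely_state(cpu_percent, steps=1):
    """Index of the most probable state after the given number of steps."""
    dist = predict_distribution(state_of(cpu_percent), steps)
    return max(range(len(dist)), key=lambda j: dist[j])
```

For example, starting from 10% CPU (State 1), one step ahead the chain most likely stays in State 1, while two steps ahead State 2 becomes the most likely state.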

The CloudScale System Architecture
[Figure: the architecture diagram again, now showing resource demand prediction passing predicted resource demands to prediction error correction. Graph adapted from the original paper.]

Module 2: Prediction Error Correction
Why? To avoid under-estimation.
Proactive approach: burst-based padding.
Take the top k frequencies in the frequency spectrum and apply an inverse Fast Fourier Transform to them to synthesize the burst pattern.
[Figure: frequency-domain plot of amplitude versus frequency.] Graphs from http://en.wikipedia.org/wiki/Frequency_domain

Module 2: Prediction Error Correction
[Figure: the burst pattern plotted on a 0% to 100% axis.]
Burst density: the percentage of positive values in the burst pattern.
If burst density > 50%, CloudScale uses the maximum of all burst values as the padding value.
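The burst-density rule can be sketched as below. The slide only specifies the high-density case (use the maximum); the low-density fallback to a smaller percentile of the positive burst values, and the default of 80, are assumptions added so the function always returns a value.

```python
import math


def burst_padding(burst_pattern, low_density_percentile=80):
    """Padding value derived from a burst pattern.

    low_density_percentile is a hypothetical parameter: the slide does not
    say what happens when burst density is 50% or lower.
    """
    positives = [v for v in burst_pattern if v > 0]
    if not positives:
        return 0.0  # no bursts, so no burst-based padding
    density = len(positives) / len(burst_pattern)
    if density > 0.5:
        # Rule from the slide: dense bursts, pad with the maximum burst value.
        return max(positives)
    # Assumed fallback: a smaller percentile of the positive burst values.
    ranked = sorted(positives)
    index = max(0, math.ceil(low_density_percentile / 100 * len(ranked)) - 1)
    return ranked[index]
```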

Module 2: Prediction Error Correction
Proactive approach: remedial padding.
[Figure: VM resource usage (0% to 100%) versus time (1 to 10), comparing real resource demand with predicted resource demand.]
Prediction errors (e1, e2, ...) = real resource demand - predicted resource demand
Remedial padding value = weighted moving average of |(e1, e2, ...)|
Padding value = max(burst-based padding value, remedial padding value)
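A minimal sketch of the padding computation on this slide. The slide does not give the weights of the moving average, so linearly increasing weights (the most recent error counts most) are assumed here, and all names are illustrative.

```python
def weighted_moving_average(values):
    """Weighted moving average with linearly increasing weights (assumed)."""
    if not values:
        return 0.0
    weights = range(1, len(values) + 1)  # oldest error gets weight 1
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def padding_value(real_demand, predicted_demand, burst_based_value):
    """Padding = max(burst-based padding value, remedial padding value)."""
    errors = [real - predicted
              for real, predicted in zip(real_demand, predicted_demand)]
    remedial = weighted_moving_average([abs(e) for e in errors])
    return max(burst_based_value, remedial)


# Usage: errors are 5, -2, 9, so the remedial value is (1*5 + 2*2 + 3*9) / 6.
pad = padding_value([50, 60, 70], [45, 62, 61], burst_based_value=4)
```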

Module 2: Prediction Error Correction
Reactive approach: fast under-estimation correction.
Challenge: the real resource demand is unknown during under-provisioning.
[Figure: resource cap versus time in seconds (t, t+1, t+2, t+3) on a 0% to 100% axis; the cap rises from x to x * α, x * α^2, and x * α^3 toward the real resource demand.]

Module 2: Prediction Error Correction
When to trigger reactive error correction?
Let P = resource usage / resource cap, i.e. the resource pressure.
Trigger under-estimation correction when P ≥ P_under (e.g. 90%).
How to determine α (the resource scale-up ratio)?
$$\alpha = \frac{P - P_{under}}{1 - P_{under}} \cdot (\alpha_{max} - \alpha_{min}) + \alpha_{min}$$
α_max and α_min are pre-defined parameters for the maximum and minimum of the scale-up ratio.
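The trigger and the scale-up ratio can be written out as a short function. The 90% threshold comes from the slide; the default values for alpha_min and alpha_max are placeholders chosen for the example, not values from the presentation.

```python
def scale_up_ratio(usage, cap, p_under=0.9, alpha_min=1.2, alpha_max=2.0):
    """Return the scale-up ratio alpha, or None if no correction is triggered.

    alpha_min and alpha_max are hypothetical defaults; the slide only says
    they are pre-defined parameters.
    """
    pressure = usage / cap  # resource pressure P
    if pressure < p_under:
        return None  # below the threshold, no under-estimation correction
    pressure = min(pressure, 1.0)  # usage cannot meaningfully exceed the cap
    fraction = (pressure - p_under) / (1.0 - p_under)
    return fraction * (alpha_max - alpha_min) + alpha_min


def corrected_cap(usage, cap, **params):
    """Multiply the cap by alpha when correction is triggered."""
    alpha = scale_up_ratio(usage, cap, **params)
    return cap if alpha is None else cap * alpha
```

At P = P_under the ratio equals alpha_min, and at P = 1 it equals alpha_max, so higher pressure leads to a more aggressive increase of the cap.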

The CloudScale System Architecture
[Figure: the architecture diagram again, now showing prediction error correction passing initial resource caps to scaling conflict handling. Graph adapted from the original paper.]

Module 3: Scaling conflict handling
Conflict prediction
[Figure: predicted resource usage of VM 1, VM 2, and the host (Host = VM1 + VM2) on a 0% to 140% axis over time t to t+9, annotated with the conflict duration and the conflict degree.]
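Conflict prediction as drawn in the figure can be sketched by summing the per-VM predictions and looking for the first stretch where the total exceeds the host capacity. The function name, the returned fields, and the definition of conflict degree as the peak overshoot are assumptions for illustration.

```python
def predict_conflict(vm_predictions, host_capacity=100.0, interval_seconds=1.0):
    """Find the first predicted conflict on a host.

    vm_predictions holds one list of predicted demands (in % of the host)
    per VM, all sampled at the same time steps.
    """
    totals = [sum(step) for step in zip(*vm_predictions)]  # Host = VM1 + VM2 + ...
    start, length, degree = None, 0, 0.0
    for index, total in enumerate(totals):
        if total > host_capacity:
            if start is None:
                start = index
            length += 1
            # Assumed definition: degree is the largest overshoot in the conflict.
            degree = max(degree, total - host_capacity)
        elif start is not None:
            break  # the first conflict has ended
    if start is None:
        return None
    return {"start": start,
            "duration": length * interval_seconds,
            "degree": degree}


# Usage: the combined demand exceeds 100% at steps 2 and 3.
conflict = predict_conflict([[30, 40, 60, 70, 50, 30],
                             [40, 50, 60, 50, 40, 30]])
```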

Module 3: Scaling conflict handling
Two approaches:
Local conflict handling: used when the conflict duration is short and the conflict degree is small.
Migration conflict handling: used when the conflict duration is long and the conflict degree is large.

Module 3: Scaling conflict handling
Local conflict handling
Uniform scheme: set the resource cap for each application in proportion to its resource demand.
Differentiated scheme: satisfy the resource demand of high-priority applications first.
Resource Under-provisioning Penalty (RP): the SLO penalty for an application VM caused by one unit of resource under-provisioning.
Total Resource Under-provisioning Penalty (QRP): QRP = RP * total units of resource under-provisioning over the duration of the conflict, summed over all VMs.
[Figure: resource usage (0 to 1) versus time (1 to 10), showing the predicted resource demand and the allocated resource cap.]
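The two local schemes can be sketched as cap-allocation functions. Function names, the numeric priority values (larger means more important), and the 100% host capacity default are assumptions made for this example.

```python
def uniform_caps(demands, capacity=100.0):
    """Uniform scheme: caps proportional to each application's demand."""
    total = sum(demands)
    if total <= capacity:
        return list(demands)  # no conflict, every demand fits
    return [capacity * demand / total for demand in demands]


def differentiated_caps(demands, priorities, capacity=100.0):
    """Differentiated scheme: serve high-priority applications first."""
    caps = [0.0] * len(demands)
    remaining = capacity
    by_priority = sorted(range(len(demands)),
                         key=lambda i: priorities[i], reverse=True)
    for i in by_priority:
        caps[i] = min(demands[i], remaining)
        remaining -= caps[i]
    return caps


# Usage: two VMs demand 80% and 40% of a host that can only supply 100%.
uniform = uniform_caps([80, 40])
differentiated = differentiated_caps([80, 40], priorities=[1, 2])
```

Under the uniform scheme both VMs are under-provisioned by the same fraction, while under the differentiated scheme the whole shortfall lands on the lower-priority VM.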

Module 3: Scaling conflict handling
Migration-based conflict handling
Key observation: triggering migration after the conflict has already happened is too late, since the host is already overloaded.
Solution: trigger migration before the conflict. Leverage conflict prediction to trigger migration T seconds before the predicted conflict happens.
To avoid migration for transient conflicts, only migrate if the conflict duration is larger than K seconds.
Migration Penalty (MP): the SLO penalty during migration per time unit. Migration penalty for VMi = MPi * migration time.
Total Migration Penalty (QM): the aggregate of the migration penalties over all VMs.

Module 3: Scaling conflict handling
Total Under-provisioning Penalty (QRP) vs Total Migration Penalty (QM)
If QRP > QM, trigger migration conflict handling.
If QM > QRP, trigger local conflict handling.
[Flowchart: QM > QRP? Yes: local conflict handling. No: migration.]
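The penalty comparison can be sketched as follows. The input shapes and function names are assumptions; the tie case (QM equal to QRP) follows the flowchart, where anything other than QM > QRP leads to migration.

```python
def total_underprovisioning_penalty(vms):
    """QRP: sum over VMs of RP * units of under-provisioning during the conflict.

    vms is a list of (rp_per_unit, shortfall_per_interval) pairs.
    """
    return sum(rp * sum(shortfalls) for rp, shortfalls in vms)


def total_migration_penalty(migrating_vms):
    """QM: sum over migrated VMs of MP * migration time.

    migrating_vms is a list of (mp_per_time_unit, migration_time) pairs.
    """
    return sum(mp * migration_time for mp, migration_time in migrating_vms)


def choose_handling(qrp, qm):
    """Follow the flowchart: local handling only when QM > QRP."""
    return "local" if qm > qrp else "migration"


# Usage with made-up penalties: under-provisioning costs more than migrating.
qrp = total_underprovisioning_penalty([(2.0, [5, 10, 5]), (1.0, [0, 4, 0])])
qm = total_migration_penalty([(1.5, 20)])
decision = choose_handling(qrp, qm)
```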

The CloudScale System Architecture
[Figure: the architecture diagram again, now showing scaling conflict handling passing adjusted resource caps to predictive frequency/voltage scaling. Graph adapted from the original paper.]

Module 4: Predictive Frequency/Voltage Scaling
Goal: save energy without affecting application SLOs.
Adjust the resource caps for VMs accordingly.
[Figure: CPU resource demand versus CPU frequency, with frequency levels (frequency 1, frequency 2, frequency i, frequency j) below the max CPU frequency and the real demand marked.]
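One way to read this slide is sketched below: pick the lowest available CPU frequency that still covers the predicted demand, then enlarge the VM caps so each VM keeps the same absolute amount of CPU. This selection rule, the function names, and the example frequencies are assumptions; the slide itself only states the goal and that caps are adjusted.

```python
def choose_frequency(frequencies_ghz, demand_fraction):
    """Lowest frequency that covers the demand (a fraction of max-frequency capacity)."""
    max_freq = max(frequencies_ghz)
    needed = demand_fraction * max_freq
    candidates = [f for f in frequencies_ghz if f >= needed]
    return min(candidates) if candidates else max_freq


def adjust_caps(caps_percent, chosen_freq, max_freq):
    """Scale caps up so each VM keeps the same absolute CPU at the lower frequency."""
    scale = max_freq / chosen_freq
    return [min(100.0, cap * scale) for cap in caps_percent]


# Usage: predicted demand is 40% of the host at full speed.
freq = choose_frequency([1.0, 2.0], demand_fraction=0.4)
caps = adjust_caps([10, 20], chosen_freq=freq, max_freq=2.0)
```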

Evaluation: CPU prediction error
World Cup and EPA are two 6-hour workloads. EPA has more fluctuations than World Cup.
For World Cup, fewer than 5% of CloudScale's predictions are significant prediction errors (|e| > 10%).
For EPA, fewer than 10% of CloudScale's predictions are significant prediction errors (|e| > 10%).

Evaluation: Migration Prediction Accuracy
Lead time: how early CloudScale triggers the migration (e.g. migrating T seconds before the conflict happens).

Evaluation: Conflict Resolving Schemes vs SLO violation rate
RUBiS: an online auction benchmark.
Settings: two RUBiS web server VMs on the same host, maintaining 75% resource pressure.
Memory size: VM1 = 1 GB, VM2 = 2 GB.
Schemes:
Local conflict resolving: uniform scheme.
Reactive migration: always migrate VM2.
VM selection: selects the VM with less migration penalty (VM1).
CloudScale: predictive migration 70 s before the conflict.

Discussion
1. In the related work section, the authors claim that, compared to previous work, CloudScale does not require ANY offline tuning. But the current CloudScale does need a migration lead time, the conflict duration that triggers migration, and a resource pressure threshold.
2. Coordinated multi-metric resource scaling (CPU, memory, network, disk, etc.)
3. Coordinated multi-tier resource scaling (host-level, etc.)
4. Prioritize resources to applications which have not violated their SLO yet, assuming SLO violation is binary.
5. Fairness: how to ensure applications are providing real SLO feedback and migration penalty?

Backup Slides

Average Delay vs Average CPU cap
Schemes:
Correction: the scaling system performs resource-pressure-triggered prediction error correction only.
Dynamic padding: dynamic padding only.
CloudScale RP: both dynamic padding and scaling error correction; scaling error correction is triggered by resource pressure (90%) only.
CloudScale RP + SLO: scaling error correction is also triggered by SLO feedback (5%).

Energy Saving
Total energy saving is 8-10%, and idle energy consumption is dominating.

Evaluation: two workloads

Evaluation: 90% resource pressure

Evaluation: 75% resource pressure

SLO Violation for Different Scaling Schemes
Schemes:
Local conflict resolving: uniform scheme.
Reactive migration: always migrate VM2.
VM selection: selects the VM with less migration penalty (VM1).
CloudScale: predictive migration 70 s before the conflict.