Energy-Efficient Computing: Scaling Strategies and Power Optimization

CSE 591: Energy-Efficient Computing
Lecture 5
SCALING: stateless vs. stateful

Anshul Gandhi
347, CS building
anshul@cs.stonybrook.edu
 
autoscale paper

Data Centers
Facebook data center in Oregon
Collection of thousands of servers
Stores data and serves user requests
Power is expensive
Annual US data centers: 100 billion kWh = $7.4 billion
As much CO2 as all of Argentina
Google is investing in power plants
Most power is actually wasted!
[energystar.gov, McKinsey & Co., Gartner]
A lot of power is actually wasted
Servers are only busy 30% of the time on average, but they're often left on, wasting power.
Setup cost: 260 s, 200 W (+ more)
BUSY server: 200 Watts
IDLE server: 140 Watts
OFF server: 0 Watts
(Measured on an Intel Xeon E5520, dual quad-core, 2.27 GHz)
[Figure: demand over time, with capacity statically provisioned for peak]
Problem statement
Given unpredictable demand, how do we provision capacity to minimize power consumption without violating response time guarantees (95th percentile)?
1. Turn servers off: save power
2. Release VMs: save rental cost
3. Repurpose: additional work done
Experimental setup
28 servers
7 servers (key-value store)
1 server (500 GB)
Response time: time taken to complete the request
A single request: 120 ms, 3000 KV pairs
Goal: Provision capacity to minimize power consumption without violating the response time SLA
SLA: T95 < 400-500 ms
AlwaysOn
Static provisioning policy
Knows the maximum request rate into the entire data center (rmax = 800 req/s)
What request rate can each server handle? 60 req/s at the 400 ms T95 target
[Figure: 95th-percentile response time vs. arrival rate for 1 server]
Number of servers provisioned: rmax / 60 ≈ 14, kept on at all times
Result: T95 = 291 ms, Pavg = 2,323 W
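The AlwaysOn sizing above reduces to one ceiling division; a minimal sketch in Python (the function name is ours; the 800 req/s peak and 60 req/s per-server rate are from the slides):

```python
import math

def alwayson_servers(r_max: float, per_server_rate: float) -> int:
    """Static provisioning: enough servers to absorb the peak request rate."""
    return math.ceil(r_max / per_server_rate)

# Peak rate 800 req/s; each server handles 60 req/s at the 400 ms T95 target.
print(alwayson_servers(800, 60))  # 14 servers, left on at all times
```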
Reactive
Provision for the current request rate: kreqd(t) = rcurrent(t) / 60
Result: T95 = 11,003 ms, Pavg = 1,281 W
With x% extra capacity: kreqd(t) = (1 + x) · rcurrent(t) / 60
With x = 100%: T95 = 487 ms, Pavg = 2,218 W
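The Reactive rule can be sketched as follows, assuming the "(1 + x)" reading of the extra-capacity knob (names are ours):

```python
import math

def reactive_capacity(r_current: float, per_server_rate: float = 60.0,
                      x: float = 0.0) -> int:
    """Servers required for the current rate, plus an x fraction of headroom."""
    return math.ceil((1.0 + x) * r_current / per_server_rate)

print(reactive_capacity(300))          # plain Reactive: 5 servers
print(reactive_capacity(300, x=1.0))   # x = 100% extra capacity: 10 servers
```

The trade-off on the slide falls out directly: headroom (x = 100%) buys acceptable T95 at nearly double the power.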
Predictive
Use a window of observed request rates to predict the request rate at time (t + 260) seconds; turn servers on/off based on this prediction.
Linear Regression: T95 = 2,544 ms, Pavg = 2,161 W
Moving Window Average: T95 = 7,740 ms, Pavg = 1,276 W
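A minimal sketch of the Moving Window Average variant (the window size and sampling cadence are assumptions, not from the slides):

```python
from collections import deque

class MovingWindowPredictor:
    """Predict the request rate one setup time (260 s) ahead as the mean
    of the last `window` observed rates."""

    def __init__(self, window: int = 10):
        self.rates = deque(maxlen=window)  # old samples fall off automatically

    def observe(self, rate: float) -> None:
        self.rates.append(rate)

    def predict(self) -> float:
        return sum(self.rates) / len(self.rates) if self.rates else 0.0

p = MovingWindowPredictor(window=3)
for r in (100, 200, 300):
    p.observe(r)
print(p.predict())  # 200.0 -> provision ceil(200/60) = 4 servers for t + 260 s
```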
AutoScale
Predictive and Reactive are too quick to turn servers off.
If the request rate rises again, they must wait out the full setup time (260 s).
Two new ideas:
1. Wait for some time (twait) before turning idle servers off.
   Heuristic: Energy(wait) = Energy(setup), i.e., Pidle · twait = Pmax · tsetup
2. "Un-balance" load: instead of balancing evenly, pack jobs on as few servers as possible (about 10 jobs/server) without violating SLAs.
[Figure: 95th-percentile response time vs. number of jobs at a server]
[Gandhi et al., Allerton Conference on Communication, Control, and Computing, 2011]
[Gandhi et al., Open Cirrus Summit, 2011]

Results
[Figures: Reactive vs. AutoScale]
[Gandhi et al., International Green Computing Conference, 2012]
[Gandhi et al., HotPower, 2011]
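Plugging the slide's numbers (Pidle = 140 W, Pmax = 200 W, tsetup = 260 s) into the twait heuristic gives a concrete wait time:

```python
# Values from the slides: idle power, peak power, server setup time.
P_idle, P_max, t_setup = 140.0, 200.0, 260.0   # W, W, s

# Energy(wait) = Energy(setup)  =>  P_idle * t_wait = P_max * t_setup
t_wait = P_max * t_setup / P_idle
print(round(t_wait, 1))  # ~371.4 s: keep an idle server on ~6 min before shutdown
```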
 
cachescale paper

Application in the Cloud
[Diagram: Load Balancer -> Application Tier -> Caching Tier -> Database;
λ req/sec arrive, λDB req/sec reach the DB (λDB << λ)]
Why have a caching tier?
1. Reduce database (DB) load
Application in the Cloud
Why have a caching tier?
1. Reduce database (DB) load (λDB << λ)
2. Reduce latency [Ousterhout`10]
The caching tier is > 1/3 of the cost [Krioukov`10] [Chen`08]
Idea: Shrink your cache during low load
Will cache misses overwhelm the DB?
[Diagram: Load Balancer -> Application Tier -> Caching Tier -> Database]
With hit rate p, the cache serves λp req/sec and the misses pass through to the DB: λ(1-p) = λDB req/sec
Goal: Keep λDB = λ(1-p) low
If λ drops, (1-p) can be higher, so p can be lower: SAVE $$$
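Rearranging λDB = λ(1-p) shows how the required hit rate falls with load; a sketch with an assumed DB capacity of 100 req/s (the numbers are illustrative, not from the slides):

```python
def required_hit_rate(lam: float, lam_db_max: float) -> float:
    """Smallest hit rate p such that the miss stream lam*(1-p)
    stays within what the DB can absorb (lam_db_max)."""
    return max(0.0, 1.0 - lam_db_max / lam)

# If the DB can absorb 100 req/s:
print(required_hit_rate(1000, 100))  # high load: need p >= 0.9
print(required_hit_rate(500, 100))   # load halves: p >= 0.8 suffices -> shrink cache
```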
 
Are the savings significant?
It depends on the popularity distribution.
[Figure: hit rate p vs. % of data cached, for Uniform and Zipf popularity]
Under Zipf, a large decrease in caching tier size causes only a small decrease in hit rate; under Uniform, the hit rate falls in proportion to the cache size.
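The Uniform-vs-Zipf contrast can be checked with a small simulation (the catalog size and Zipf exponent here are illustrative):

```python
def hit_rate(popularity, frac_cached):
    """Hit rate when the top frac_cached of items (by popularity) is cached."""
    ranked = sorted(popularity, reverse=True)
    k = int(len(ranked) * frac_cached)
    return sum(ranked[:k]) / sum(ranked)

n = 10_000
zipf = [1.0 / (i + 1) for i in range(n)]  # Zipf(1) popularity weights
uniform = [1.0] * n

for frac in (0.5, 0.1):
    print(f"{frac:.0%} cached: Zipf p = {hit_rate(zipf, frac):.2f}, "
          f"Uniform p = {hit_rate(uniform, frac):.2f}")
```

Caching only 10% of a Zipf catalog still captures well over half the requests, while a uniform catalog yields exactly a 10% hit rate, which is why shrinking is cheap under Zipf.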
Is there a problem?
Performance can temporarily suffer if we lose a lot of hot data.
[Figure: mean response time (ms) vs. time (min); response time spikes when the cache is shrunk, then stabilizes]
What can we do about the hot data?
Start state: the full Caching Tier. End state: a smaller Caching Tier.
[Diagram: two shrinking options; in the transfer option, the retiring servers' hot data moves to the primary (surviving) servers]
We need to transfer the hot data before shrinking the cache.
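A toy sketch of that hot-data transfer, assuming each cache shard tracks per-key hit counts (all names and the 10% "hot" threshold are ours, not from the slides):

```python
def shrink_cache(cache_shards, keep, hot_fraction=0.1):
    """Before retiring the shards beyond `keep`, copy their hottest keys
    into the surviving (primary) shards. Each shard maps key -> (value, hits).
    A sketch; a real system would stream these transfers in the background."""
    primary, retiring = cache_shards[:keep], cache_shards[keep:]
    for shard in retiring:
        hot = sorted(shard.items(), key=lambda kv: kv[1][1], reverse=True)
        n_hot = max(1, int(len(hot) * hot_fraction))
        for i, (key, val) in enumerate(hot[:n_hot]):
            primary[i % keep][key] = val   # round-robin onto surviving shards
    return primary

shards = [{"a": (1, 90), "b": (2, 5)}, {"c": (3, 80), "d": (4, 1)}]
print(shrink_cache(shards, keep=1))  # "c" (hot) survives; "d" (cold) is dropped
```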