Innovations in Cold Storage Rack Design

 
Flamingo: Enabling Evolvable HDD-based Near-Line Storage
Sergey Legtchenko, Xiaozhou Li, Antony Rowstron, Austin Donnelly, Richard Black
Storing Cold Data in the Cloud
Cold data: rarely accessed data
Challenge: storing cold data at low cost
 
Benefits
Lower capital cost
Capped resource consumption
Higher storage density
Designing Cold Storage Racks is Hard
Resources are constrained in the rack
 
Experience from building Pelican
Design complexity
Storage stack is brittle to design changes
Impact of resource provisioning on end performance?
 
Software: co-designed, constraint-aware
Data Layout
IO Scheduler
Pelican: 8% HDDs active
Flamingo: a Tool to Help Cold Storage Rack Design
Input
 
In the rest of the talk
Rack Description
Resource Domain: set of HDDs sharing a limited resource
HDDs: D1, D2, D3, D4, D5, D6, D7, D8
Constraints can be hard or soft
Storage Stack Configuration
Rack description
Constraint Solver
Data layout
Groups of HDDs that concurrently transition state
Minimize inter-group conflicts
Power budget: 40
Vibration budget: 1
HDDs: D1, D2, D3, D4, D5, D6, D7, D8
Resource Provisioning Exploration
N resource types: N-dimensional space of rack descriptions
Figure axes: Resource 1 (e.g. power) and Resource 2 (e.g. vibration), each ranging from 0 to fully provisioned
Discrete surface in the N-dimensional space
For Pelican: 747 rack descriptions
Evaluation - Pelican
Pelican Simulator – Poisson workload, 1GB reads.
Execution Time - Pelican
 
9 minutes
 
Execution Time for Different Racks
 
3 hours
9 minutes
Conclusion
Cold storage racks: low cost but hard to (re)design
Co-design: resource-constrained hardware + constraint-aware software
 
Flamingo simplifies design of cold storage racks
Synthesizes Data Layout and IO Scheduler parameters
Explores impact of resource provisioning on end performance
Redesign in days vs months manually

Flamingo is a tool that helps design cold storage racks, which store rarely accessed data at low cost by keeping only a fraction of their HDDs active at a time. Such racks pair resource-constrained hardware with co-designed, constraint-aware software, which makes them hard to design and redesign. Given a rack description, Flamingo synthesizes the data layout and IO scheduler parameters and explores how resource provisioning affects end performance.

  • Cold Storage
  • Rack Design
  • Data Layout
  • Resource Optimization
  • Innovative Solutions

Uploaded on Sep 11, 2024 | 0 Views



Presentation Transcript


  1. Flamingo: Enabling Evolvable HDD-based Near-Line Storage
     • Sergey Legtchenko, Xiaozhou Li, Antony Rowstron, Austin Donnelly, Richard Black

  2. Storing Cold Data in the Cloud
     • Cold data: rarely accessed data
     • Challenge: storing cold data at low cost
     • Custom racks: trading latency for cost
       • Only a fraction of HDDs concurrently active
       • Reduced number of servers (1 or 2 per rack)
     • Benefits
       • Lower capital cost
       • Capped resource consumption
       • Higher storage density
     • Open Compute Cold Storage rack (Facebook): 6% HDDs active, 240 HDDs/server
     • Pelican rack (Microsoft, [OSDI 14]): 8% HDDs active, 576 HDDs/server

  3. Designing Cold Storage Racks is Hard
     • Resources are constrained in the rack
       • Pelican: 8% HDDs active
       • Resource constraints: cooling (1 HDD / cooling column), 2 HDDs / tray, vibration, bandwidth
     • Software: co-designed, constraint-aware
       • Data Layout
       • IO Scheduler
     • Experience from building Pelican
       • Design complexity
       • Storage stack is brittle to design changes
       • Impact of resource provisioning on end performance?

  4. Flamingo: a Tool to Help Cold Storage Rack Design
     • Figure: overview of the Flamingo pipeline, with the following labeled components
     • Input
       • Rack description: constraints, hardware properties
       • Resource provisioning specification
       • Performance goals
     • Resource provisioning exploration
       • Set of rack descriptions: same topology, varying resource provisioning
     • Storage stack configuration (Constraint Solver)
       • Data layout and IO scheduler parameters
     • Perf. analysis (simulator)
     • Online: Generic Storage Stack (data layout, IO scheduler) running on the rack
     • In the rest of the talk

  5. Rack Description
     • Resource Domain: set of HDDs sharing a limited resource
       • Domain A: type: power, budget: 40W
       • Domain B: type: vibration, budget: 1 HDD
     • Example rack with 8 HDDs (D1 to D8)
       • Power domains: {D1,D2,D3,D4}: 40, {D5,D6,D7,D8}: 40
       • Vibration domains: {D1,D5}: 1, {D2,D6}: 1, {D3,D7}: 1, {D4,D8}: 1
     • HDD: operating states + resource consumption
       • Spin up: power: 20W, vibration: 1
       • Active (IO-capable state): power: 10W, vibration: 1
       • Standby: power: 2W, vibration: 0
     • Expresses constraints: e.g. only 1 HDD can be spinning up in A
       • 2 + 2 + 2 + 20 = 26W, within the 40W budget
       • 2 + 2 + 20 + 20 = 44W, exceeds the 40W budget
     • Constraints can be hard or soft
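The power arithmetic on the Rack Description slide can be reproduced with a few lines of Python. This is only a minimal sketch: the `ResourceDomain` class, the `STATE_COST` table layout, and the function names are hypothetical and are not Flamingo's actual rack description format. The state costs, the 40W budget, and the HDD names come from the slide.

```python
from dataclasses import dataclass

# Per-state resource consumption from the slide (power in W, vibration in units).
STATE_COST = {
    "spin_up": {"power": 20, "vibration": 1},
    "active": {"power": 10, "vibration": 1},
    "standby": {"power": 2, "vibration": 0},
}


@dataclass(frozen=True)
class ResourceDomain:
    """A set of HDDs sharing one limited resource (hypothetical representation)."""
    resource: str      # "power" or "vibration"
    hdds: frozenset    # names of the member HDDs
    budget: int        # maximum total consumption allowed in the domain


def domain_usage(domain, hdd_states):
    """Sum the domain's resource over the current state of each member HDD."""
    return sum(STATE_COST[hdd_states[h]][domain.resource] for h in domain.hdds)


def violated_domains(domains, hdd_states):
    """Return the domains whose usage exceeds their budget."""
    return [d for d in domains if domain_usage(d, hdd_states) > d.budget]


# Domain A from the slide: four HDDs sharing a 40W power budget.
domain_a = ResourceDomain("power", frozenset({"D1", "D2", "D3", "D4"}), 40)

one_spin_up = {"D1": "spin_up", "D2": "standby", "D3": "standby", "D4": "standby"}
two_spin_ups = {"D1": "spin_up", "D2": "spin_up", "D3": "standby", "D4": "standby"}

print(domain_usage(domain_a, one_spin_up))   # 26
print(domain_usage(domain_a, two_spin_ups))  # 44
```

With one HDD spinning up the domain uses 26W and stays within budget; with two it needs 44W, so `violated_domains` reports domain A.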

  6. Storage Stack Configuration
     • Rack description is the input to the Constraint Solver
     • Data layout
       • Groups of HDDs that concurrently transition state
       • Minimize inter-group conflicts

  7. Storage Stack Configuration
     • Data layout
       • Groups of HDDs that concurrently transition state
       • Minimize inter-group conflicts
     • Rack description (input to the Constraint Solver)
       • Vibration, budget: 1: {D1,D5}: 1, {D2,D6}: 1, {D3,D7}: 1, {D4,D8}: 1
       • Power, budget: 40: {D1,D2,D3,D4}: 40, {D5,D6,D7,D8}: 40
       • HDD states (power, vibration): Su (20, 1), A (10, 1), Sd (2, 0)
     • Group definition: G1: {D3, D8}, G2: {D4, D7}, G3: {D1, D6}, G4: {D2, D5}
     • Figure annotations: "Each group conflicts with 2", "Conflicts minimized"
     • Inter-group constraints: {G1,G2}: 1, {G3,G4}: 1, {G1,G2,G3,G4}: 40
     • Generic Storage Stack
       • Blob-store API
       • Data Layout: blob to group of HDDs
       • IO Scheduler: conflicts between groups; spin ups/downs, IOs
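The inter-group conflicts listed on this slide can be checked with a short Python sketch. It assumes that two groups conflict when activating both would place more HDDs in a shared vibration domain than its budget allows. That definition, and the names `groups_conflict` and `conflicting_pairs`, are illustrative assumptions rather than the constraint solver's actual formulation, and only the vibration domains are modeled.

```python
from itertools import combinations

# Vibration domains from the rack description: budget of 1 HDD per domain.
VIBRATION_DOMAINS = [{"D1", "D5"}, {"D2", "D6"}, {"D3", "D7"}, {"D4", "D8"}]
VIBRATION_BUDGET = 1

# Group definition from the slide.
GROUPS = {
    "G1": {"D3", "D8"},
    "G2": {"D4", "D7"},
    "G3": {"D1", "D6"},
    "G4": {"D2", "D5"},
}


def groups_conflict(a, b, domains=VIBRATION_DOMAINS, budget=VIBRATION_BUDGET):
    """True if activating both groups puts more than `budget` HDDs in some
    domain (assumption: each active HDD consumes 1 unit of the resource)."""
    active = a | b
    return any(len(active & d) > budget for d in domains)


def conflicting_pairs(groups):
    """List every pair of group names that cannot be active together."""
    return [
        (x, y)
        for x, y in combinations(sorted(groups), 2)
        if groups_conflict(groups[x], groups[y])
    ]


print(conflicting_pairs(GROUPS))  # [('G1', 'G2'), ('G3', 'G4')]
```

Under this assumption the only conflicting pairs are G1 with G2 and G3 with G4, which matches the inter-group constraints {G1,G2}: 1 and {G3,G4}: 1 on the slide.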

  8. Resource Provisioning Exploration
     • N resource types: N-dimensional space of rack descriptions
     • Figure: axes Resource 1 (e.g. power) and Resource 2 (e.g. vibration), each ranging from 0 to fully provisioned; the example rack (D1 to D8) is shown with power budget 40 and vibration budget 1

  9. Resource Provisioning Exploration
     • N resource types: N-dimensional space of rack descriptions
     • Fully-provisioned rack (JBOD), per domain: all HDDs in most resource-consuming state
       • Example rack: power budget: 80, vibration budget: 2
     • Least-provisioned rack, per domain: 1 HDD in IO-capable state, n-1 in lowest resource-consuming state
       • Example rack: power budget: 26, vibration budget: 1
     • HDD states (power, vibration): Su (20, 1), A (10, 1), Sd (2, 0)
     • Figure: axes Resource 1 (e.g. power) and Resource 2 (e.g. vibration), each ranging from 0 to fully provisioned
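The two extreme budgets on this slide can be derived from the per-state costs. The sketch below is an illustration under one stated assumption: the least-provisioned budget charges one HDD at its most resource-consuming state (spin up, which it must pass through to become IO-capable) and the other n-1 HDDs at the lowest-consuming state. That assumption reproduces the 26 and 1 shown on the slide; the function names are hypothetical.

```python
# Per-state resource consumption from the slide (power in W, vibration in units).
STATE_COST = {
    "spin_up": {"power": 20, "vibration": 1},
    "active": {"power": 10, "vibration": 1},
    "standby": {"power": 2, "vibration": 0},
}


def fully_provisioned_budget(n_hdds, resource):
    """Budget that lets every HDD in the domain sit in its costliest state."""
    peak = max(cost[resource] for cost in STATE_COST.values())
    return n_hdds * peak


def least_provisioned_budget(n_hdds, resource):
    """Budget for one HDD at peak cost and n-1 HDDs at the lowest cost
    (assumption: reaching the IO-capable state requires a spin up)."""
    peak = max(cost[resource] for cost in STATE_COST.values())
    floor = min(cost[resource] for cost in STATE_COST.values())
    return peak + (n_hdds - 1) * floor


# Example rack: power domains hold 4 HDDs, vibration domains hold 2 HDDs.
print(fully_provisioned_budget(4, "power"))      # 80
print(fully_provisioned_budget(2, "vibration"))  # 2
print(least_provisioned_budget(4, "power"))      # 26
print(least_provisioned_budget(2, "vibration"))  # 1
```

Every rack description explored between these two extremes keeps the same topology and varies only the per-domain budgets.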

  10. Resource Provisioning Exploration
     • N resource types: N-dimensional space of rack descriptions
     • Fully-provisioned rack (JBOD), per domain: all HDDs in most resource-consuming state
     • Discrete surface in the N-dimensional space
     • For Pelican: 747 rack descriptions
     • Figure: axes Resource 1 (e.g. power) and Resource 2 (e.g. vibration), each ranging from 0 to fully provisioned; regions labeled "Bottleneck resource: vibration" and "Bottleneck resource: power"

  11. Evaluation - Pelican
     • Pelican Simulator; Poisson workload, 1GB reads
     • Figure: time to first byte (normalized to worst case) vs. resource provisioning (normalized to full)
     • Figure: time to first byte (s) vs. workload rate (requests/sec), from 0.0625 to 4
     • Series labels: Pelican, Flamingo - Pelican
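A Poisson workload of 1GB reads, as named on this slide, can be generated by drawing exponentially distributed inter-arrival times. The sketch below is a generic illustration and not the Pelican simulator's workload generator; the function name, the seed, the request count, and the decimal interpretation of 1GB are assumptions.

```python
import random

READ_SIZE_BYTES = 10**9  # 1GB reads; decimal gigabytes assumed


def poisson_arrivals(rate_per_sec, n_requests, seed=0):
    """Return arrival times (in seconds) of a Poisson process with the given
    rate, built from exponentially distributed inter-arrival gaps."""
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    for _ in range(n_requests):
        t += rng.expovariate(rate_per_sec)
        arrivals.append(t)
    return arrivals


# 0.25 requests/sec is one of the workload rates on the plot's x axis.
requests = [(t, READ_SIZE_BYTES) for t in poisson_arrivals(0.25, 1000)]
print(len(requests), requests[0])
```

Each tuple is an (arrival time, read size) pair that a simulator could replay at the chosen workload rate.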

  12. Execution Time - Pelican
     • 9 minutes
     • Figure: CDF of rack descriptions vs. storage stack configuration time (s), x axis from 1 to 3600

  13. Execution Time - Pelican
     • 9 minutes
     • Figure: CDF of rack descriptions vs. storage stack configuration time (s), x axis from 1 to 3600
     • Number of rack descriptions per rack: OCP: 1921, Pelican: 747, Rack_A: 1421, Rack_B: 1152, Rack_C: 973, Rack_D: 649, Rack_E: 683

  14. Execution Time for Different Racks
     • 3 hours
     • 9 minutes
     • Figure: CDF of rack descriptions vs. storage stack configuration time (s), x axis from 1 to 3600, one curve each for OCP, Pelican, Rack_A, Rack_B, Rack_C, Rack_D, Rack_E
     • Number of rack descriptions per rack: OCP: 1921, Pelican: 747, Rack_A: 1421, Rack_B: 1152, Rack_C: 973, Rack_D: 649, Rack_E: 683

  15. Conclusion
     • Cold storage racks: low cost but hard to (re)design
       • Co-design: resource-constrained hardware + constraint-aware software
     • Flamingo simplifies design of cold storage racks
       • Synthesizes Data Layout and IO Scheduler parameters
       • Explores impact of resource provisioning on end performance
       • Redesign in days vs months manually
