Elastic Parity Logging for SSD RAID Arrays

Yongkun Li*, Helen Chan#, Patrick P. C. Lee#, Yinlong Xu*
*University of Science and Technology of China
#The Chinese University of Hong Kong
DSN 2016

SSD Storage
- Solid-state drives (SSDs) are widely deployed in desktops and data centers
- Extensive field studies by Facebook [Meza, Sigmetrics'15] and Google [Schroeder, FAST'16]
- Better performance, shock resistance, and power efficiency than hard disks

How SSDs Work?
- Basic operations:
  - Read and write: per-page basis (e.g., 4KB, 8KB)
  - Erase: per-block basis (e.g., 64 or 128 pages)
- Out-of-place write for updates:
  - Write to a clean page and mark it as valid
  - Mark the original page as stale
- Garbage collection (GC) reclaims stale pages (see the sketch below):
  - Erase blocks and relocate any valid pages
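
A minimal Python sketch (not from the paper; the class and its names are hypothetical) of the out-of-place write and GC behavior described above:

```python
PAGES_PER_BLOCK = 4          # real SSDs use e.g. 64 or 128 pages per block

class TinyFTL:
    """Toy flash translation layer: out-of-place writes plus block-level GC."""
    def __init__(self, num_blocks):
        self.state = [["clean"] * PAGES_PER_BLOCK for _ in range(num_blocks)]
        self.mapping = {}        # logical page number -> (block, page)
        self.data = {}           # (block, page) -> payload

    def _clean_page(self):
        for b, block in enumerate(self.state):
            for p, s in enumerate(block):
                if s == "clean":
                    return b, p
        raise RuntimeError("no clean page left; run gc() on a victim block")

    def write(self, lpn, payload):
        # Out-of-place update: the old copy becomes stale, the new copy goes to a clean page.
        if lpn in self.mapping:
            ob, op = self.mapping[lpn]
            self.state[ob][op] = "stale"
        b, p = self._clean_page()
        self.state[b][p] = "valid"
        self.mapping[lpn] = (b, p)
        self.data[(b, p)] = payload

    def gc(self, victim):
        # Relocate the victim block's valid pages, then erase the whole block.
        live = []
        for p, s in enumerate(self.state[victim]):
            if s == "valid":
                lpn = next(l for l, loc in self.mapping.items() if loc == (victim, p))
                live.append((lpn, self.data.pop((victim, p))))
                del self.mapping[lpn]
            self.state[victim][p] = "stale"       # block unusable until erased
        for lpn, payload in live:
            self.write(lpn, payload)              # extra internal writes = GC overhead
        self.state[victim] = ["clean"] * PAGES_PER_BLOCK   # erase
```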

Challenges
- Reliability
  - Flash errors are commonplace
  - Error-correcting codes aren't bullet-proof
- Endurance
  - Blocks allow only limited P/E cycles
- Performance
  - Poor random write performance
  - GC overhead

SSD RAID
- RAID provides fault tolerance
- Each stripe contains data chunks and parity chunks
- Figure: data chunks m1-m9 and parity chunks c1-c3 striped across four SSDs,
  where c1 = m1 + m2 + m3 and '+' means XOR (a small worked example follows):

      SSD1  SSD2  SSD3  SSD4
      m1    m2    m3    c1
      m4    m5    c2    m6
      m7    c3    m8    m9

- Challenge: parity updates aggravate small writes → degrade performance and endurance
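
A minimal Python illustration of the parity relation in the figure (the byte values are made up); single-parity XOR is used here for simplicity, while EPLog itself supports general erasure coding:

```python
def xor_chunks(*chunks):
    """Bytewise XOR of equal-sized chunks (the '+' in the figure)."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

m1, m2, m3 = b"\x01\x02", b"\x10\x20", b"\x0f\x0f"
c1 = xor_chunks(m1, m2, m3)              # parity chunk: c1 = m1 + m2 + m3
assert xor_chunks(c1, m2, m3) == m1      # losing the SSD holding m1 is recoverable
```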

Our Contributions
- EPLog: a new RAID design for SSD RAID arrays via elastic parity logging
  - Redirects writes to separate log devices
  - Constructs "elastic" stripes from new writes only
  - Maintains high reliability, endurance, and performance
- Prototype implementation
  - General fault tolerance with erasure coding
  - Deployable on commodity hardware
- Reliability analysis + testbed experiments

Related Work
- Improving parity updates for SSD RAID:
  - Parity caching: requires non-volatile memory [HotDep'09, TC'11, SAC'11]
  - Elastic striping: incurs RAID-level GC [DSN'13, DSN'15]
  - Parity logging: needs pre-reads and per-stripe computation [ISCA'93]
- EPLog extends parity logging with two optimizations

Parity Logging
- Original parity logging [ISCA'93], illustrated in the slides with an example request sequence (figure omitted)
- Drawbacks (see the sketch below):
  - Pre-read: extra reads
  - Per-stripe basis: extra log chunks; only partial parallelism
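
A rough Python sketch of the classical parity-logging update path as suggested by the drawbacks above (not code from the paper; read_old_chunk and append_log are hypothetical callables):

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def log_update_with_preread(read_old_chunk, append_log, stripe_id, new_chunk):
    """Original parity logging: every update needs the old chunk to form a parity delta."""
    old_chunk = read_old_chunk(stripe_id)      # pre-read: an extra SSD read per update
    delta = xor_bytes(old_chunk, new_chunk)    # parity delta tied to this one stripe
    append_log(stripe_id, delta)               # one log chunk per stripe touched
```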

EPLog
- Performs out-of-place updates at the system level
- Computes a log chunk based on new writes
- Elastic parity logging (sketched below):
  - No pre-reads of existing chunks
  - A log chunk may span across stripes
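
A minimal sketch of the elastic variant, again assuming single-parity (XOR) coding for readability; the general k'-of-n' coding is covered on the write-processing slide:

```python
def elastic_log_chunk(new_chunks):
    """Combine a batch of newly written chunks into one log chunk, with no pre-reads."""
    log = bytearray(len(new_chunks[0]))
    for chunk in new_chunks:          # the chunks may belong to different stripes
        for i, byte in enumerate(chunk):
            log[i] ^= byte
    return bytes(log)

# One appended log chunk covers updates that span several stripes.
log_chunk = elastic_log_chunk([b"\xaa\xbb", b"\x01\x02", b"\x10\x10"])
```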

EPLog Architecture
- User-level block device layer
- SSD main array + hard-disk log devices
- Parity commit (detailed below)

Design Issues
- Limitations:
  - Extra footprint on the log devices
  - Extra space for multiple versions of chunks
  - Slow recovery before parity commit
- Issues addressed:
  - Write processing
  - Parity commit
  - Caching
  - Metadata management

Write Processing
- Full-stripe new writes:
  - Write data/parity chunks directly to the SSD main array
  - Use k-of-n erasure coding (e.g., RAID-5: k = n - 1)
- Partial-stripe writes or updates:
  - Write data chunks to the SSD main array
  - Write log chunks to the log devices in append-only mode
- Log chunks: parity chunks for elastic stripes (see the sketch after this list)
  - Formed by k'-of-n' erasure coding
  - k' = number of data chunks in an elastic stripe
  - n' - k' = number of tolerable failed devices
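
A structural sketch of the two write paths above; encode, write_main, and append_log are hypothetical placeholders, and setting n' - k' = n - k (same fault tolerance as the main array) is an assumption made for the example:

```python
K, N = 6, 8    # e.g. the (6+2)-RAID-6 configuration used in the experiments

def process_write(data_chunks, encode, write_main, append_log):
    """Route a batch of chunk writes along the two paths described above."""
    if len(data_chunks) == K:
        # Full-stripe new write: k-of-n encode and go straight to the SSD main array.
        write_main(data_chunks + encode(data_chunks, num_parity=N - K))
    else:
        # Partial-stripe write or update: data chunks go to the main array; the
        # elastic stripe has k' = len(data_chunks) data chunks and n' - k' parity
        # (log) chunks, which are appended to the log devices.
        write_main(data_chunks)
        append_log(encode(data_chunks, num_parity=N - K))
```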

Parity Commit
- Commit the latest updates to the main array (see the sketch below):
  - Identify data stripes
  - Read the latest versions of data chunks from the SSDs
  - Re-compute parity chunks
  - Write back to the SSDs
  - Release space
- Performed regularly or during idle time
- No need to access the log devices
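
A sketch of the commit loop above; the I/O callables and the (stripe_id, k) representation of dirty stripes are hypothetical:

```python
def parity_commit(dirty_stripes, read_chunk, encode_parity, write_parity, release_log):
    """Re-materialize parity on the main array; the log devices are never read."""
    for stripe_id, k in dirty_stripes:                        # identify data stripes
        data = [read_chunk(stripe_id, i) for i in range(k)]   # latest chunk versions, read from the SSDs
        parity = encode_parity(data)                          # re-compute parity chunks
        write_parity(stripe_id, parity)                       # write back to the SSD main array
    release_log(dirty_stripes)                                # release the committed log space
```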

Caching
- EPLog offers an optional caching feature
  - Stripe buffer: new writes
  - Device buffers: updates

Implementation
- Persistent metadata management (see the sketch below):
  - Full checkpoint: flushes all metadata
  - Incremental checkpoint: flushes metadata modified since the last checkpoint
- Persistent metadata storage on SSDs:
  - Separate data and metadata on the SSDs
  - RAID-10 for the metadata partitions
- Multi-threaded writes for performance gains
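
A small sketch of the two checkpoint modes; the in-memory dictionary and the flush callable are hypothetical stand-ins for EPLog's metadata store:

```python
class MetadataCheckpointer:
    """Track which metadata entries changed so incremental checkpoints stay small."""
    def __init__(self):
        self.metadata = {}     # key -> value
        self.dirty = set()     # keys modified since the last checkpoint

    def update(self, key, value):
        self.metadata[key] = value
        self.dirty.add(key)

    def full_checkpoint(self, flush):
        flush(dict(self.metadata))                          # flush all metadata
        self.dirty.clear()

    def incremental_checkpoint(self, flush):
        flush({k: self.metadata[k] for k in self.dirty})    # only the modified entries
        self.dirty.clear()
```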

Reliability Analysis
- Question: does EPLog improve or degrade overall reliability?
  - Reduced writes to SSDs slow down wearing → improves reliability
  - Extra hard-disk log devices → degrade reliability
- Compare EPLog and conventional SSD RAID via Markov MTTDL analysis
- Key finding: EPLog improves reliability in common settings
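
For context, a textbook MTTDL expression for a single-fault-tolerant array is shown below; this is only an illustrative closed form, not the paper's Markov model, which additionally accounts for SSD wear and the extra hard-disk log devices:

```latex
% Illustrative two-state Markov result for an n-device, single-fault-tolerant array
% with per-device failure rate \lambda = 1/\mathrm{MTTF} and repair rate
% \mu = 1/\mathrm{MTTR} (\mu \gg \lambda); not the paper's exact model.
\mathrm{MTTDL} \approx \frac{\mu}{n(n-1)\lambda^{2}}
             = \frac{\mathrm{MTTF}^{2}}{n(n-1)\,\mathrm{MTTR}}
```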

Experiments
- Testbed:
  - Linux Ubuntu 14.04 LTS with kernel 3.13
  - Plextor M5 Pro 128GB SSDs
  - Seagate ST1000DM003 7200RPM 1TB SATA HDDs
- Four public block-level traces:
  - Write-intensive; dominated by small random writes
- Three schemes compared:
  - Linux software RAID mdadm (MD)
  - Original parity logging (PL)
  - EPLog

Write Size to SSDs
- Setting: (6+2)-RAID-6
- EPLog issues ~50% fewer writes to the SSDs than MD

GC Overhead
- Setting: (6+2)-RAID-6
- EPLog has 77% fewer GC requests than MD
- Slightly fewer GCs than PL due to better sequentiality

I/O Performance
- Setting: (6+2)-RAID-6
- EPLog outperforms MD by 30-120% and PL by 190-300%
  - No pre-reads
  - Fewer log chunks

Caching
- Setting: (6+2)-RAID-6
- Caching reduces the write size to SSDs even with a small cache

Parity Commit
- Setting: (6+2)-RAID-6
- Parity commit has limited overhead when performed over groups of writes

Conclusions
- EPLog is a new SSD RAID design with reliability, endurance, and performance in mind
  - Elastic parity logging
- The EPLog design is backed by a prototype implementation, reliability analysis, and extensive experiments
- Source code: http://ansrlab.cse.cuhk.edu.hk/software/eplog