Facebook's Haystack: Optimizing Photo Storage for Popular Content Distribution

CSE 486/586 Distributed Systems
Web Content Distribution---2
Case Study: Facebook Haystack
Steve Ko
Computer Sciences and Engineering
University at Buffalo
Recap
Engineering principle
Make the common case fast, and rare cases correct
(From Patterson & Hennessy books)
This principle cuts through generations of systems.
Example?
CPU Cache
Knowing common cases == understanding your workload
E.g., read dominated? Write dominated? Mixed?
Recap
“Hot” vs. “very warm” vs. “warm” photos
Hot: Popular, a lot of views
Very warm: Somewhat popular, still a lot of views
Warm: Unpopular, but still a lot of views in aggregate
[Figure: Zipf-like popularity curve; y-axis: popularity, x-axis: items sorted by popularity]
Very warm and warm Photos
Hot photos are served by a CDN.
Warm photo characteristics
Not very popular
Not entirely “cold,” i.e., occasional views
A lot in aggregate
Not worth caching everything in the CDN due to diminishing returns
Facebook stats (in their 2010 paper)
260 billion images (~20 PB)
1 billion new photos per week (~60 TB)
One million image views per second at peak
Approximately 10% not served by CDN, but still a lot
Popularity Comes with Age
Facebook Photo Storage
Three generations of photo storage
NFS-based (today)
Haystack (today): Very warm photos
f4 (next time): Warm photos
Characteristics
After-CDN storage
Each generation solves a particular problem observed from the previous generation.
1st Generation: NFS-Based
1st Generation: NFS-Based
Each photo → a single file
Observed problem
Thousands of files in each directory
Extremely inefficient due to metadata management
10 disk operations for a single image: chained filesystem inode reads for its directories and the file itself, plus the final data read (a rough accounting is sketched below)
In fact, a well-known problem with many files in a directory
Be aware when you do this.
The filesystem can run out of inodes (each inode is 128 or 256 bytes).
A lot of operations are necessary for metadata retrieval.
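To make the 10-operation claim concrete, here is a rough back-of-the-envelope model in Python (an illustrative sketch with a hypothetical path, not a real filesystem trace; real filesystems also cache metadata aggressively): resolving a path costs roughly one inode read per component plus a directory-entry lookup in each parent, all before the photo bytes themselves are read.

def count_metadata_reads(path: str) -> int:
    # One inode read per path component (each directory plus the file),
    # and one directory-entry lookup in each parent to find the child.
    components = [c for c in path.split("/") if c]
    inode_reads = len(components)
    dirent_lookups = len(components) - 1
    return inode_reads + dirent_lookups

# Hypothetical NFS-era layout: deep directories, thousands of files each.
print(count_metadata_reads("/vol/photos/user123/album45/photo678.jpg"))
# -> 9 metadata operations; add the final data read for ~10 disk ops total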
CSE 486/586 Administrivia
Final: 5/18/2017, Thursday, 6 pm – 8 pm, Knox 110
2nd Generation: Haystack
Custom-designed photo storage
What would you try? (Hint: too many files!)
Starting point: One big file with many photos
Reduces the number of disk operations required per photo read to one (see the sketch after this slide)
All metadata management done in memory
Design focus
Simplicity
Something buildable within a few months
Three components
Directory
Cache
Store
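The gist, as a minimal Python sketch (my illustration of the idea only, not Facebook's code; all names are made up): one big append-only file per volume plus an in-memory map from photo ID to (offset, size), so a read costs one memory lookup and a single disk read.

import os

class BigFileStore:
    """One large append-only file plus an in-memory index (idea sketch)."""

    def __init__(self, path: str):
        self.f = open(path, "ab+")  # one big volume file, append + read
        self.index = {}             # photo_id -> (offset, size), in memory

    def write(self, photo_id: str, data: bytes) -> None:
        self.f.seek(0, os.SEEK_END)
        offset = self.f.tell()
        self.f.write(data)          # always appended, never overwritten
        self.f.flush()
        self.index[photo_id] = (offset, len(data))

    def read(self, photo_id: str) -> bytes:
        offset, size = self.index[photo_id]  # metadata lookup: memory only
        self.f.seek(offset)
        return self.f.read(size)             # exactly one disk read

store = BigFileStore("/tmp/volume0.dat")
store.write("photo-42", b"...jpeg bytes...")
assert store.read("photo-42") == b"...jpeg bytes..."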
Haystack Architecture
Haystack Directory
Helps construct the URL for an image:
http://⟨CDN⟩/⟨Cache⟩/⟨Machine id⟩/⟨Logical volume, Photo⟩
Staged lookup
The CDN strips out its portion.
The Cache strips out its portion.
The machine strips out its portion.
Logical & physical volumes
A logical volume is replicated as multiple physical volumes.
Physical volumes are stored on Store machines.
Each volume contains multiple photos.
The Directory maintains this mapping (sketched below).
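A minimal sketch of the Directory's two jobs in Python (structure and names are my assumptions for illustration, not Facebook's API): maintain the photo-to-logical-volume and logical-to-physical mappings, and build the staged URL that each tier strips as the request moves inward.

import random

class Directory:
    """Idea sketch: volume mappings plus staged URL construction."""

    def __init__(self):
        # logical volume id -> Store machines holding a physical replica
        self.logical_to_physical = {7: ["store-12", "store-31", "store-58"]}
        # photo id -> logical volume id
        self.photo_to_volume = {42: 7}

    def url_for(self, photo_id: int,
                cdn: str = "cdn.example.com",
                cache: str = "cache.example.com") -> str:
        vol = self.photo_to_volume[photo_id]
        machine = random.choice(self.logical_to_physical[vol])  # any replica
        # CDN, then Cache, then the Store machine each strip their component.
        return f"http://{cdn}/{cache}/{machine}/{vol}/{photo_id}"

print(Directory().url_for(42))
# e.g. http://cdn.example.com/cache.example.com/store-31/7/42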
Haystack Cache
A Facebook-operated CDN organized as a DHT (sketched below)
Photo IDs as the key
Further reduces traffic to the Store
Mainly caches newly-uploaded photos
High cache hit rate (due to caching new photos)
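As a rough illustration of the DHT idea (a sketch under my assumptions; the slides do not give the Cache's actual protocol), photo IDs can be hashed onto a ring of cache nodes so every caller independently agrees on which node may hold a given photo:

import bisect
import hashlib

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class CacheRing:
    """Consistent-hash ring keyed by photo ID (illustrative sketch)."""

    def __init__(self, nodes):
        self.ring = sorted((ring_hash(n), n) for n in nodes)
        self.keys = [k for k, _ in self.ring]

    def node_for(self, photo_id: str) -> str:
        i = bisect.bisect(self.keys, ring_hash(photo_id)) % len(self.keys)
        return self.ring[i][1]  # first node clockwise from the photo's hash

ring = CacheRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("photo-42"))  # same answer for every caller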
Haystack Store
Maintains physical volumes
One volume is a single large file (~100 GB) containing many photos (needles)
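A minimal sketch of the needle layout (simplified and assumed for illustration; the real on-disk format carries more fields, e.g., cookies and checksums): each photo is a small fixed header followed by the photo bytes, appended to the volume file and readable in one sequential read once its offset is known.

import struct

# header: key, alternate key, flags, data size (simplified, little-endian)
NEEDLE_HEADER = struct.Struct("<QQBI")

def append_needle(volume, key: int, alt_key: int, data: bytes, flags: int = 0) -> int:
    volume.seek(0, 2)   # append-only: always write at the end of the volume
    offset = volume.tell()
    volume.write(NEEDLE_HEADER.pack(key, alt_key, flags, len(data)))
    volume.write(data)
    return offset       # the caller records this in the in-memory index

def read_needle(volume, offset: int):
    volume.seek(offset)
    key, alt_key, flags, size = NEEDLE_HEADER.unpack(
        volume.read(NEEDLE_HEADER.size))
    return key, alt_key, flags, volume.read(size)  # one sequential read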
Haystack Store
Metadata managed in memory
(key, alternate key) → (flags, size, volume offset)
Quick lookup for both read and write
Disk operation only required for the actual image read
Write/delete
Append-only
A delete is only marked (a tombstone); the space is garbage-collected later.
Indexing
Index files allow fast in-memory metadata construction, e.g., after a restart (sketched below).
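A sketch of how such an index file could work (the record layout and flag bit are my assumptions): a compact log of per-needle records that a restarting Store replays to rebuild its in-memory map without scanning the ~100 GB volume file.

import struct

INDEX_REC = struct.Struct("<QQBIQ")  # key, alt key, flags, size, offset
DELETED = 0x1                        # hypothetical "deleted" flag bit

def rebuild_index(index_file):
    """Replay the index log to reconstruct the in-memory mapping."""
    mapping = {}
    while True:
        raw = index_file.read(INDEX_REC.size)
        if len(raw) < INDEX_REC.size:
            break  # end of log (or a truncated trailing record)
        key, alt, flags, size, offset = INDEX_REC.unpack(raw)
        if flags & DELETED:
            mapping.pop((key, alt), None)  # tombstone until GC reclaims it
        else:
            mapping[(key, alt)] = (flags, size, offset)
    return mapping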
Daily Stats with Haystack
Photos uploaded: ~120 M
Haystack photos written: ~1.44 B (~12× the uploads: each photo is written in multiple sizes and to multiple replicas)
Photos viewed: 80 – 100 B
Thumbnails: 10.2%
Small: 84.4%
Medium: 0.2%
Large: 5.2%
Haystack photos read: 10 B
Summary
Two different types of workload for a social networking Web service
Posts: read/write
Photos: write-once, read-many
Photo workload
Zipf distribution
“Hot” photos can be handled by CDN
“Warm” photos have diminishing returns.
Haystack: Facebook’s 2nd generation photo storage
Goal: reducing disk I/O for warm photos
One large file with many photos
Metadata stored in memory
Internal CDN