Data Storage Strategies for Modern Distributed Systems

This content discusses the challenges of storing vast amounts of digital data in distributed systems, exploring strategies like duplication, Reed-Solomon codes, and degraded reads for fault tolerance and quick recovery in the face of disk failures. It highlights the importance of efficient data storage in the era of the data deluge and increasing demand for reliable, accessible, and cost-effective storage solutions.


Uploaded on Jul 17, 2024



Presentation Transcript


  1. Coding for Modern Distributed Storage Systems. Part 1: Locally Repairable Codes. Parikshit Gopalan, Windows Azure Storage, Microsoft.

  2. The data deluge Problem statement: we generate insane amounts of digital data, and we expect it to be stored reliably and accessible anytime, anywhere. And for free! The total data in the cloud is of the order of a few hundred exabytes. A terabyte hard drive costs $100, so even storing the raw data costs hundreds of millions of dollars. Hardware is no longer cheap. Currently, data centers consume up to 3 percent of all global electricity production while producing 200 million metric tons of carbon dioxide.

  3. Data Storage: the basics Goal: tolerate one disk failure. Duplication: 2X overhead, quick recovery. Simple XOR [RAID5]: treat each disk as a bit vector; 1.2X overhead, slower recovery.
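The XOR scheme above can be sketched in a few lines of Python (a toy layout with three data disks and one parity disk; the disk contents are small integers purely for illustration):

```python
# RAID5-style XOR parity sketch: the parity disk is the bitwise XOR of the
# data disks, so any ONE lost disk can be rebuilt by XOR-ing the survivors.

def xor_blocks(blocks):
    """XOR corresponding entries of several equal-length blocks."""
    out = list(blocks[0])
    for b in blocks[1:]:
        out = [x ^ y for x, y in zip(out, b)]
    return out

data_disks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
parity = xor_blocks(data_disks)          # 1.33X overhead for 3 data disks

# Disk 1 fails: rebuild it from the two surviving data disks plus parity.
recovered = xor_blocks([data_disks[0], data_disks[2], parity])
assert recovered == data_disks[1]
```

Recovery reads every surviving disk, which is why the slides call XOR recovery slower than plain duplication.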

  4. Data Storage: the basics Goal: tolerate two disk failures. Triplication: 3X overhead, quick recovery. [6,4] Reed-Solomon code [RAID6]: 1.5X overhead, slower recovery; needs a larger field, with each disk a byte vector.

  5. Data storage: the basics Reed-Solomon codes: k data symbols, n − k parity checks. Field size O(n). Any k symbols suffice for full data recovery (MDS). How many parity checks do you need? o(k) redundancy seems to be sufficient [G.-Huang-Jenkins-Yekhanin 13]. The failure rate p is tiny (assume pn < 0.5). The goal is (only) to be as reliable as 3-way replication. We should be getting overheads close to 0!

  6. Data storage: the basics Reed-Solomon codes: k data symbols, n − k parity checks. Field size O(n). Any k symbols suffice for full data recovery (MDS). How many parity checks do you need? o(k) redundancy seems to be sufficient. We should be getting overheads close to 0! But the recovery cost would be prohibitively high: we need to read k other disks (MDS). This limits us to small values of k (< 25).
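The MDS property and its repair cost can be illustrated with a toy Reed-Solomon sketch. This is a hypothetical example over the prime field GF(257) chosen so the arithmetic stays readable; production systems use GF(2^8) byte arithmetic instead, and the data values and evaluation points below are illustrative:

```python
# Toy Reed-Solomon code: the k data symbols are the coefficients of a
# degree-(k-1) polynomial, and n evaluations of it are stored. ANY k
# surviving evaluations recover all the data (the MDS property) -- which is
# exactly why repairing even a single lost disk requires k reads.

P = 257  # prime modulus (an assumption of this sketch)

def rs_encode(data, n):
    """Evaluate the data polynomial at the points x = 1..n."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
            for x in range(1, n + 1)]

def poly_mul(a, b):
    """Multiply two polynomials (coefficient lists, low degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def rs_decode(points, k):
    """Lagrange-interpolate any k (x, y) pairs back to the k data symbols."""
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(points):
        basis, denom = [1], 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                basis = poly_mul(basis, [(-xm) % P, 1])   # factor (x - xm)
                denom = denom * (xj - xm) % P
        scale = yj * pow(denom, P - 2, P) % P             # Fermat inverse
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % P
    return coeffs

data = [10, 20, 30, 40]                  # k = 4 data symbols
stored = rs_encode(data, 6)              # [6, 4] code: 1.5X overhead
# Disks at x = 1 and x = 4 fail; any 4 survivors suffice for full recovery.
survivors = [(x, stored[x - 1]) for x in (2, 3, 5, 6)]
assert rs_decode(survivors, 4) == data
```

Note that even when only one disk is lost, `rs_decode` still touches k survivors, which is the repair-cost problem the rest of the tutorial addresses.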

  7. Degraded Reads Typical failure scenario: a single disk fails or is prohibitively slow.

  8. Degraded Reads Typical failure scenario: a single disk fails or is prohibitively slow. Reed-Solomon: need to read k disks; any k suffice.

  9. Disk Failure Typical failure scenario: a single disk fails and needs to be replaced. Reed-Solomon: need to read k disks; any k suffice.

  10. Can we do better? Regenerating Codes [Dimakis-Godfrey-Wu-Wainwright-Ramachandran 10] Metric: Network bandwidth. Optimize the amount of data communicated to repair a single lost node. Locally Repairable Codes [G.-Huang-Simitci-Yekhanin 12] Metric: Number of disk reads. Optimize the number of disk reads needed to repair a single lost node.

  11. Part 1 of this Tutorial: LRCs Part 1.1: Locality 1. Locality of codeword symbols. 2. Rate-distance-locality tradeoffs. Part 1.2: Reliability 1. Beyond minimum distance: Maximum recoverability. 2. Constructions of Maximally Recoverable LRCs.

  12. Locality [Chen-Huang-Li 07, Oggier-Datta 11, G.-Simitci-Huang-Yekhanin 12, Papailiopoulos-Luo-Dimakis-Huang-Li 12] [G.-Simitci-Huang-Yekhanin 12]: A coordinate in a linear code has locality r if it can be expressed as a linear combination of r other coordinates. If a symbol c_i is lost, it can be recovered by reading just r other symbols. Data locality r: all data symbols have locality r. All-symbol locality r: all symbols have locality r. This decouples the typical decoding complexity r from the length n: only r reads for single failures and degraded reads. No guarantees for larger numbers of worst-case failures.
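A minimal sketch of locality, assuming a hypothetical layout with k = 6 data symbols split into groups of r = 3, each protected by a local XOR parity:

```python
# Locality r = 3: each data symbol can be rebuilt from the r other symbols
# in its local group (group-mates plus the group's XOR parity), instead of
# reading k = 6 symbols as a [n, 6] MDS code would require.

from functools import reduce
from operator import xor

r = 3
data = [11, 22, 33, 44, 55, 66]
groups = [data[i:i + r] for i in range(0, len(data), r)]
local_parities = [reduce(xor, g) for g in groups]

def repair(group_idx, lost_idx):
    """Recover one lost symbol by reading only its r group-mates."""
    survivors = [s for i, s in enumerate(groups[group_idx]) if i != lost_idx]
    return reduce(xor, survivors + [local_parities[group_idx]])

assert repair(0, 1) == 22   # 3 reads instead of 6
```

This sketch only shows single-failure repair; as the slide notes, locality by itself gives no guarantee against multiple worst-case failures, which is where the global parities of the constructions below come in.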

  13. Locally Decodable/Testable Codes Locally Decodable Codes [Goldreich-Levin 89, Katz-Trevisan 00, Yekhanin 12]. Implicit in early work on majority-logic decoding [Reed 52]. Aims for locality up to the minimum distance. Super-linear length lower bounds are known for locality r = O(1) and constant relative distance [Katz-Trevisan 00]. Such codes would have given fault-tolerant storage with unlimited scalability. The best constructions are super-polynomial [Yekhanin 07, Efremenko 09]. One can get high rate with larger locality [Kopparty-Saraf-Yekhanin 11, Meir 14]. Locally Testable Codes [Blum-Luby-Rubinfeld 90, Rubinfeld-Sudan 92].

  14. Codes with data locality Def: An [n, k, d] code has data locality r if each information symbol can be expressed as a linear combination of r other coordinates. Pyramid Codes [Chen-Huang-Li 07]: take a [k + d − 1, k, d] Reed-Solomon code and split the first parity into k/r local parities. This gives n = k(1 + 1/r) + d − 2. Is the linear overhead necessary?
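The Pyramid-code length formula can be checked numerically. The parameter values in this sketch are illustrative, not taken from the slides:

```python
# Parameter arithmetic for the Pyramid construction: start from a
# [k + d - 1, k, d] Reed-Solomon code and split its first parity into
# ceil(k/r) local parities, one per group of r information symbols.
# That leaves d - 2 of the original parities as global parities.

from math import ceil

def pyramid_length(k, r, d):
    """Length n = k + ceil(k/r) + d - 2 of the resulting code."""
    return k + ceil(k / r) + d - 2

# Example: k = 12 data symbols, locality r = 4, distance d = 4:
# 12 data + 3 local parities + 2 global parities.
assert pyramid_length(12, 4, 4) == 17
```

When r divides k, `k + ceil(k/r) + d - 2` equals the slide's expression k(1 + 1/r) + d − 2.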

  15. Codes with all-symbol locality Def: An [n, k, d] linear code has locality r if each coordinate can be expressed as a linear combination of r other coordinates. Add a local parity to every group of r symbols, including the parity symbols. This gives n = (k + d − 2)(1 + 1/r).

  16. Rate-distance-locality tradeoffs What are the tradeoffs between n, k, d, r? [G.-Huang-Simitci-Yekhanin 12]: In any linear code with information locality r, d ≤ n − k − ⌈k/r⌉ + 2.

  17. Rate-distance-locality tradeoffs What are the tradeoffs between n, k, d, r? [G.-Huang-Simitci-Yekhanin 12]: In any linear code with information locality r, d ≤ n − k − ⌈k/r⌉ + 2.

  18. Rate-distance-locality tradeoffs What are the tradeoffs between n, k, d, r? [G.-Huang-Simitci-Yekhanin 12]: In any linear code with information locality r, d ≤ n − k − ⌈k/r⌉ + 2. Moreover, any code achieving this bound for r | k and small d looks like disjoint local groups of size r + 1 (r information symbols plus one local parity) together with global parities.

  19. Rate-distance-locality tradeoffs What are the tradeoffs between n, k, d, r? [G.-Huang-Simitci-Yekhanin 12]: In any linear code with all-symbol locality r, d ≤ n − k − ⌈k/r⌉ + 2. Let r | k. For equality, (r + 1) | n. Codes that achieve equality could look like disjoint local groups of size r + 1 plus global parities. Such codes do exist for large field sizes (non-explicit construction).

  20. Explicit codes with all-symbol locality. [Tamo-Papailiopoulos-Dimakis 13] Optimal length codes with all-symbol locality for field size q = exp(n). Construction based on RS codes, analysis via matroid theory. [Silberstein-Rawat-Koyluoglu-Vishwanath 13] Optimal length codes with all-symbol locality for q = 2^n. Construction based on Gabidulin codes (a.k.a. linearized RS codes). [Barg-Tamo 14] Optimal length codes with all-symbol locality for q = O(n). Construction based on Reed-Solomon codes.

  21. Rate-distance-locality tradeoffs What are the tradeoffs between n, k, d, r? [G.-Huang-Simitci-Yekhanin 12]: In any linear code with all-symbol locality r, d ≤ n − k − ⌈k/r⌉ + 2. For equality, (r + 1) | n. Algorithmic proof using linear algebra. [Papailiopoulos-Dimakis 12] Replace linear algebra with information theory. [Prakash-Lalitha-Kamath-Kumar 12] Generalized Hamming weights. [Barg-Tamo 13] Graph-theoretic proof.
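The bound is easy to evaluate for concrete parameters. The example values below are illustrative, chosen in the style of a two-local-group LRC rather than taken from the slides:

```python
# Numeric check of the [G.-Huang-Simitci-Yekhanin 12] bound
#     d <= n - k - ceil(k/r) + 2
# for a hypothetical LRC with k = 14 data symbols in two local groups of
# r = 7 and total length n = 18 (14 data + 2 local + 2 global parities).

from math import ceil

def distance_bound(n, k, r):
    """Largest minimum distance allowed by the locality bound."""
    return n - k - ceil(k / r) + 2

# d can be at most 18 - 14 - 2 + 2 = 4: any 3 erasures are correctable,
# but no such code can guarantee recovery from arbitrary 4 erasures.
assert distance_bound(18, 14, 7) == 4
```

Setting r = k recovers the classical Singleton bound d ≤ n − k + 1, so the ⌈k/r⌉ term is exactly the price paid for locality.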

  22. Generalizations Non-linear codes [Papailiopoulos-Dimakis, Forbes-Yekhanin]. Vector codes [Papailiopoulos-Dimakis, Silberstein-Rawat-Koyluoglu-Vishwanath, Kamath-Prakash-Lalitha-Kumar]. Codes over bounded alphabets [Cadambe-Mazumdar]. Codes with short local MDS codes [Prakash-Lalitha-Kamath-Kumar, Silberstein-Rawat-Koyluoglu-Vishwanath]. Codes with local regeneration [Silberstein-Rawat-Koyluoglu-Vishwanath, Kamath-Prakash-Lalitha-Kumar].

  23. Towards locally decodable codes Codes with short local MDS codes [Prakash-Lalitha-Kamath-Kumar, Silberstein-Rawat-Koyluoglu-Vishwanath]: avoid the slowest-node bottleneck [Shah-Lee-Ramchandran]. Sequential local recovery [Prakash-Lalitha-Kumar]. Multiple disjoint local parities [Wang-Zhang, Barg-Tamo]: can serve multiple read requests in parallel. Problem: Consider an [n, k] linear code where even after t arbitrary failures, every (information) symbol has locality r. How large does n need to be? The [Barg-Tamo 14] bound might be a good starting point.
