Understanding Durability and Integrity in Data Storage Systems


Durability and integrity are distinct requirements of data storage systems. Durability means data survives faults such as crashes and power loss; integrity means data remains correct in the face of faults. On-disk data is durable because disk contents survive power loss, and it has integrity because the software interface to disk is explicit and complex. In-memory data is fast to access but is not durable and may lack integrity. NVRAM makes memory fast and durable but does not ensure integrity, because it keeps the same simple interface as volatile RAM and random stores can corrupt it. Ensuring the integrity of in-memory data therefore becomes a protection problem, addressed by mechanisms such as language-level guarantees or a file system interface in front of NVRAM.


Uploaded on Sep 19, 2024



Presentation Transcript


  1. Free Transactions with Rio Vista. Landon Cox, April 2, 2018.

  2. Basic assumptions. On-disk data is durable and has integrity, but is slow to access. In-memory data is fast to access, but is not durable and may lack integrity. What is the difference between durability and integrity? Durability: data survives faults (crashes and power loss). Integrity: data is correct in the face of faults.

  3. Basic assumption. Why does disk data have durability and integrity? Disk content survives power loss. Disk data has integrity because of software interfaces: the interface to disk is explicit and complex, it requires interacting with disk drivers, a faulty program is unlikely to randomly duplicate driver functionality, and driver calls are checked for errors. Is the interface to memory explicit and complex? No. Any store instruction modifies the state of memory, and any store instruction can modify any writable memory address.

  4. Enter: battery-backed memory. NVRAM: non-volatile RAM. NVRAM is fast. NVRAM makes memory durable. NVRAM does not ensure data integrity. It has the same simple interface as volatile RAM, so random stores can corrupt in-memory data. Question: how to ensure integrity of in-memory data? This becomes a protection question.

  5. Protection and NVRAM. Previously in protection: language-level guarantees (Java), instrumented code (Speculative execution), virtual memory (Micro-kernels, etc.). Disadvantages of languages and instrumentation? Languages constrain programmer choice. Languages do not support existing code in other languages. Instrumentation can be slow. Instrumentation requires interposing on all accesses.

  6. Rio file cache. A file system interface in front of NVRAM. Allows warm reboot: the cache persists across reboots, so its content can be inspected and synced with disk. No need to write synchronously: no need to maintain dependencies, although a journal may still be desirable. Only flush when needed (no timers).

  7. Rio file cache. A file system interface in front of NVRAM. Can apps corrupt cache? They are unlikely to randomly generate a write call, but they can randomly store to an mmap region. Do we care about bad mmap stores? No, apps can corrupt their own data; they take that risk when using mmap. What about kernel stores? A failing kernel can still corrupt the cache. How to protect cache? Mark pages read-only unless accessed by FS. Corruption must then occur while the cache is writable.
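The protection idea on this slide can be modeled in a few lines. The following is a minimal sketch in Python, not Rio's implementation: the class `ProtectedCache`, the method `fs_write_window`, and the boolean flag standing in for the page-protection bit are all hypothetical names chosen for illustration. It only shows the control flow: stores succeed inside a short window opened by the file system path and are rejected everywhere else.

```python
from contextlib import contextmanager


class ProtectedCache:
    """Toy model of a file cache whose pages are read-only by default."""

    def __init__(self, size):
        self._pages = bytearray(size)
        self._writable = False  # stands in for the page-protection bit

    @contextmanager
    def fs_write_window(self):
        # Only the file system path opens this window; it is closed again
        # on exit, even if the body raises.
        self._writable = True
        try:
            yield
        finally:
            self._writable = False

    def store(self, offset, value):
        # A stray store outside the window is rejected, much as a store to
        # a read-only page would fault.
        if not self._writable:
            raise PermissionError("cache is read-only")
        self._pages[offset] = value

    def load(self, offset):
        return self._pages[offset]


cache = ProtectedCache(16)

with cache.fs_write_window():
    cache.store(0, 42)  # legitimate file system write

try:
    cache.store(1, 99)  # stray store while the cache is protected
    stray_store_blocked = False
except PermissionError:
    stray_store_blocked = True
```

In this model, corruption is only possible while the window is open, which mirrors the slide's last point: the shorter the writable window, the smaller the exposure.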

  8. Kinds of kernel failures. Random bit flips in kernel address space: to simulate, randomly flip memory bits. Faulty instructions in kernel text: to simulate, change src/dst registers of instructions. Programming errors: delete initialization code, corrupt pointer variables, randomly free allocated data, overwrite data structures.

  9. Methodology. Run benchmarks. Randomly inject errors. Wait for crash. Check to see if data has been corrupted.
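The inject-then-check loop can be illustrated with a toy simulation. This sketch is an assumption-laden stand-in for the real experiment: it models memory as a single byte array split into a hypothetical "kernel state" region and a "file cache" region, injects only the bit-flip fault type, skips the crash step, and uses a CRC32 checksum (via `zlib.crc32`) as the corruption check. The region sizes and the function name `run_trial` are illustrative.

```python
import random
import zlib

KERNEL_BYTES = 64  # hypothetical size of the "kernel state" region
CACHE_BYTES = 64   # hypothetical size of the "file cache" region


def run_trial(rng):
    """Inject one random bit flip; return (flip_hit_cache, cache_corrupted)."""
    memory = bytearray(
        rng.randrange(256) for _ in range(KERNEL_BYTES + CACHE_BYTES)
    )
    cache_checksum = zlib.crc32(memory[KERNEL_BYTES:])

    # Fault injection: flip one randomly chosen bit anywhere in memory.
    byte_index = rng.randrange(len(memory))
    memory[byte_index] ^= 1 << rng.randrange(8)

    # Post-fault check: does the cache region still match its checksum?
    cache_corrupted = zlib.crc32(memory[KERNEL_BYTES:]) != cache_checksum
    return byte_index >= KERNEL_BYTES, cache_corrupted


rng = random.Random(0)  # fixed seed so runs are repeatable
results = [run_trial(rng) for _ in range(1000)]
corrupted = sum(1 for _, bad in results if bad)
print(f"{corrupted} of {len(results)} trials corrupted the cache region")
```

The toy has no protection mechanism, so every flip that lands in the cache region corrupts it; the point of the real experiment is to measure how often kernel faults reach file data with and without protection.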

  10. Rio results (figure not reproduced in the transcript). Slide annotations: "Protections remove the risk." and "What we were afraid of."

  11. Rio file cache. How else can we use Rio? What about transactions? Transactions are great, but they are rarely used outside of databases, synchronous writes are slow, and it can be hard to reason about aborts. Rio can help make transactions fast: 2,000 times faster!

  12. Recoverable memory. RVM: CMU library for recoverable memory. Diagram labels: "Copy of updated memory region." and "Copy of initial memory region." In whose address space is the recoverable memory? In the application's.

  13. Recoverable memory. RVM: CMU library for recoverable memory. How many times is data copied? 3: to undo log, to redo log, to database.

  14. Recoverable memory. RVM: CMU library for recoverable memory. What is the undo log used for? User-initiated aborts.

  15. Recoverable memory. RVM: CMU library for recoverable memory. What action commits the transaction? Writing the commit record to the redo log.

  16. Recoverable memory. RVM: CMU library for recoverable memory. Which ACID properties does this provide? Durability and atomicity.
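The three copies and the commit point described on the RVM slides can be sketched as a toy model. This is a hedged illustration in Python, not the real RVM library or its API: the class `RvmSketch` and its method names are hypothetical, and plain Python lists and byte arrays stand in for the on-disk redo log and database.

```python
class RvmSketch:
    """Toy model of RVM-style transactions (illustrative, not the RVM API)."""

    def __init__(self, size):
        self.memory = bytearray(size)    # recoverable region in the app's address space
        self.database = bytearray(size)  # stands in for the on-disk database
        self.redo_log = []               # stands in for the on-disk redo log
        self.undo_log = []               # in-memory, needed only for aborts
        self.pending = []                # ranges touched by the current transaction

    def set_range(self, offset, length):
        # Copy 1: save the initial contents to the undo log before they change.
        old = bytes(self.memory[offset:offset + length])
        self.undo_log.append((offset, old))
        self.pending.append((offset, length))

    def abort(self):
        # User-initiated abort: restore the saved initial contents.
        for offset, old in reversed(self.undo_log):
            self.memory[offset:offset + len(old)] = old
        self.undo_log.clear()
        self.pending.clear()

    def commit(self):
        # Copy 2: write the updated contents to the redo log.
        for offset, length in self.pending:
            new = bytes(self.memory[offset:offset + length])
            self.redo_log.append(("update", offset, new))
        # Appending the commit record is the action that commits.
        self.redo_log.append(("commit",))
        self.undo_log.clear()
        self.pending.clear()

    def truncate(self):
        # Copy 3: apply logged updates to the database, then trim the log.
        # (In this toy, commit() is the only writer, so every record in the
        # log belongs to a committed transaction.)
        for record in self.redo_log:
            if record[0] == "update":
                _, offset, new = record
                self.database[offset:offset + len(new)] = new
        self.redo_log.clear()


rvm = RvmSketch(8)

rvm.set_range(0, 2)
rvm.memory[0:2] = b"hi"
rvm.commit()             # committed: survives in the redo log

rvm.set_range(2, 2)
rvm.memory[2:4] = b"no"
rvm.abort()              # aborted: rolled back from the undo log

rvm.truncate()           # committed update reaches the database
```

The sketch makes the cost visible: each committed update is copied to the undo log, to the redo log, and finally to the database.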

  17. Vista recoverable memory. Vista: library for recoverable memory on Rio. Why don't we need the redo log? Can just use the persistent undo log to recover.
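Why a persistent undo log is enough can be shown with a small model. This is a minimal sketch under stated assumptions, not Vista's actual interface: the class `VistaSketch` and its methods are hypothetical, and the sketch simply assumes that both the data and the undo log live in memory that survives a crash. Updates are made in place, so there is nothing left to redo at commit; recovery only has to undo whatever an unfinished transaction changed.

```python
class VistaSketch:
    """Toy model of undo-log-only transactions on persistent memory."""

    def __init__(self, size):
        # Assumption: both structures survive a crash.
        self.memory = bytearray(size)
        self.undo_log = []

    def set_range(self, offset, length):
        # Save the initial contents before the application updates in place.
        old = bytes(self.memory[offset:offset + length])
        self.undo_log.append((offset, old))

    def commit(self):
        # The updates are already in persistent memory, so committing only
        # discards the undo records.
        self.undo_log.clear()

    def abort(self):
        for offset, old in reversed(self.undo_log):
            self.memory[offset:offset + len(old)] = old
        self.undo_log.clear()

    def recover(self):
        # After a crash, any surviving undo records belong to an uncommitted
        # transaction, so recovery is the same as an abort.
        self.abort()


vista = VistaSketch(8)

vista.set_range(0, 2)
vista.memory[0:2] = b"ok"
vista.commit()

vista.set_range(0, 2)
vista.memory[0:2] = b"xx"
# Simulated crash here: memory and undo log are assumed to persist.
vista.recover()
```

Compared with the three copies in the RVM model, each update here is copied once, into the undo log. A real implementation would also need the discard in `commit` to take effect atomically, which this toy does not model.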

  18. Vista recoverable memory. Vista: library for recoverable memory on Rio. Interface to Vista is a malloc-like heap manager.

  19. Vista recoverable memory. Vista: library for recoverable memory on Rio. What needs to be protected? Heap management, undo log.

  20. Vista recoverable memory. Vista: library for recoverable memory on Rio. Why aren't Rio protections sufficient? Data lives in app address space. Syscalls to alter protections are slow.

  21. Vista recoverable memory. Vista: library for recoverable memory on Rio. How is Vista protected? Create a moat around important data.

  22. Protecting Vista

  23. Evaluation. Why the drop off here?

  24. Evaluation. Why the drop off here?
