Understanding Persistent Memory Use with WHISPER by University of Wisconsin-Madison & Hewlett-Packard Labs


This presentation analyzes how applications use persistent memory (PM), based on the WHISPER benchmark suite. It covers the durability and consistency guarantees applications need, how ordering and epochs achieve consistency, the PM systems that were studied, and the measured behaviors that suggest how system support for PM could be improved.


Uploaded on Sep 20, 2024 | 1 Views



Presentation Transcript


  1. An Analysis of Persistent Memory Use with WHISPER. Sanketh Nalli, Swapnil Haria, Michael M. Swift, Mark D. Hill, Haris Volos*, Kimberly Keeton*. University of Wisconsin-Madison & *Hewlett-Packard Labs (HPL)

  2. WHISPER: Wisconsin-HP Labs Suite for Persistence. Goal: facilitate better system support for Persistent Memory (PM). Discovered behaviors: 4% of accesses go to PM, 96% to DRAM; 5-50 epochs/tx, contributed by memory allocation & logging; re-referencing PM cachelines is rare across threads, common within a thread. WHISPER: research.cs.wisc.edu/multifacet/whisper

  3. Outline: WHISPER (Wisconsin-HP Labs Suite for Persistence); Analysis; Results

  4. Persistent Memory is coming soon. Persistent Memory = NVM attached to the CPU on the memory bus. Offers low-latency reads and persistent writes. Allows user-level, byte-addressable loads and stores.

  5. What guarantees do applications need? Durability = content is available to the user after failure. Consistency = content is recoverable and usable after failure. Example (diagram: data and pointer in the CACHE and in PM): 1. Data update followed by pointer update in cache. 2. Pointer is evicted from cache to PM. 3. Data lost on failure; dangling pointer persists.

  6. Achieving consistency (diagram: data and pointer flushed from cache to PM): 1. Store data update in cache. 2. Flush data update to PM. 3. Store pointer update in cache. 4. Flush pointer update to PM. Ordering = useful building block of consistency mechanisms. Epoch = set of writes to PM guaranteed to be durable before ANY writes to PM in following epochs become durable. Ordering primitives: sfence, mfence on x86-64.
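
The difference between the unordered update on slide 5 and the ordered update on slide 6 can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyPM`, its methods, and the `"data"`/`"ptr"` names are hypothetical and are not part of WHISPER or of any PM library. The cache is modeled as a dictionary that is lost on failure, and PM as a dictionary that survives.

```python
# Toy model (hypothetical, not from WHISPER): a volatile cache in front of PM.
class ToyPM:
    def __init__(self):
        self.cache = {}  # volatile: contents are lost on failure
        self.pm = {}     # persistent: contents survive failure

    def store(self, addr, value):
        # A store only reaches the cache.
        self.cache[addr] = value

    def flush(self, addr):
        # Write one cached entry back to PM, like a cacheline flush or eviction.
        if addr in self.cache:
            self.pm[addr] = self.cache.pop(addr)

    def crash(self):
        # Failure: the cache vanishes, PM contents remain.
        self.cache.clear()
        return dict(self.pm)


def unordered_update():
    m = ToyPM()
    m.store("data", 42)
    m.store("ptr", "data")
    m.flush("ptr")   # models the cache evicting the pointer before the data
    return m.crash()


def ordered_update():
    m = ToyPM()
    m.store("data", 42)
    m.flush("data")  # epoch 1: data is durable first
    m.store("ptr", "data")
    m.flush("ptr")   # epoch 2: pointer becomes durable afterwards
    return m.crash()


def is_consistent(pm):
    # Consistent if there is no pointer, or the pointer's target is in PM.
    return "ptr" not in pm or pm["ptr"] in pm
```

In this model `unordered_update()` leaves a dangling pointer in PM, while `ordered_update()` leaves both the data and the pointer. A real implementation would rely on cacheline flushes and fences rather than dictionary operations.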

  7. PM systems for consistency (diagram: three ways for an application to reach Persistent Memory (PM)). Native: load/store, with application-specific optimizations. Persistent heaps (NVML, Mnemosyne): TX and load/store, with atomic allocations and type safety. PM-aware filesystems (ext4-DAX, PMFS): read/write through the VFS, POSIX interface.

  8. What's the problem? Lack of standard workloads slows research. Micro-benchmarks may not be representative. Partial understanding of how applications use PM.

  9. WHISPER benchmarks (* = adapted to PM):
     Benchmark | Type | Brief description
     N-store* | Database | H-store-like DB. Undo logs for consistency
     Echo* | KV store | Scalable, multi-version key-value store
     Memcached* | Mnemosyne | Distributed key-value store
     Vacation* | Mnemosyne | Online travel reservation system
     Redis | NVML | REmote DIctionary Server
     C-tree | NVML | Microbenchmarks for simulations
     Hashmap | NVML | Microbenchmarks for simulations
     NFS | PMFS | Linux server/client for remote file access
     Exim | PMFS | Mail server; stores mail in per-user files
     MySQL | PMFS | Widely used RDBMS for OLTP

  10. Outline: WHISPER (Wisconsin-HP Labs Suite for Persistence); Analysis; Results

  11. Identify writes to PM (pipeline diagram: IDENTIFY, INSTRUMENT, EXECUTE, ANALYZE; labels: PM Application, PIN/mmiotrace, PM Writes, Runtime, Trace, PM Stats). PIN for userspace, mmiotrace for the kernel. On average, 101 lines in the applications update PM; 67 lines in the kernel update PM.

  12. Instrument writes to PM (pipeline diagram: IDENTIFY, INSTRUMENT, EXECUTE, ANALYZE). C macros capture all modes of updating PM, as well as PM transaction start/end, cacheline flushes, and fences. Example: update and persist the size of a filesystem journal. Original code, each line followed by its instrumented form: log.size = size; is instrumented as PM_SET(log.size, size); flush_buffer(log.size); as PM_FLUSH(log.size, 8); asm("sfence"); as PM_FENCE();

  13. Execute and Analyze (pipeline diagram: IDENTIFY, INSTRUMENT, EXECUTE, ANALYZE). Python analyzer and dependency-checker. Analyze the trace for several statistics: number of epochs/tx, epoch dependencies, epoch sizes.
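
As a rough idea of what such a trace analysis might compute, the sketch below counts epochs per transaction and epoch sizes from a single-thread trace. The trace format (a list of `(op, addr)` tuples with the op names `TX_BEGIN`, `TX_END`, `STORE`, `FLUSH`, `FENCE`) is an assumption made for this example and is not WHISPER's actual trace format; the function `analyze` is likewise hypothetical.

```python
CACHELINE = 64  # bytes per cacheline, matching the 64B lines used in the talk


def analyze(trace):
    """Return (epochs_per_tx, epoch_sizes) for a single-thread trace.

    trace is a list of (op, addr) tuples in a hypothetical format;
    addr is a byte address for STORE/FLUSH and None otherwise.
    """
    epochs_per_tx = []  # one entry per completed transaction
    epoch_sizes = []    # distinct cachelines written in each epoch
    lines = set()       # cachelines written since the last fence
    in_tx = False
    tx_epochs = 0
    for op, addr in trace:
        if op == "TX_BEGIN":
            in_tx, tx_epochs = True, 0
        elif op == "STORE":
            lines.add(addr // CACHELINE)
        elif op == "FENCE":
            # A fence closes an epoch only if PM was written since the last one.
            if lines:
                epoch_sizes.append(len(lines))
                if in_tx:
                    tx_epochs += 1
                lines = set()
        elif op == "TX_END":
            epochs_per_tx.append(tx_epochs)
            in_tx = False
        # FLUSH events do not change epoch boundaries in this sketch.
    return epochs_per_tx, epoch_sizes


example_trace = [
    ("TX_BEGIN", None),
    ("STORE", 0), ("STORE", 8), ("FLUSH", 0), ("FENCE", None),
    ("STORE", 64), ("STORE", 128), ("FLUSH", 64), ("FLUSH", 128), ("FENCE", None),
    ("STORE", 256), ("FLUSH", 256), ("FENCE", None),
    ("TX_END", None),
]
```

For `example_trace`, `analyze` reports one transaction with three epochs whose sizes are 1, 2 and 1 cachelines. Stores that are never followed by a fence are not counted as an epoch in this sketch.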

  14. Outline: WHISPER (Wisconsin-HP Labs Suite for Persistence); Analysis; Results

  15. How many accesses to PM? (Chart: total number of accesses in a WHISPER application.) Accesses to PM: 4%. Accesses to DRAM: 96%. Suggestion: do not impede volatile accesses.

  16. How many epochs/transaction? Enforcing durability after every epoch impedes execution. Assumption: 3 epochs/TX = log + data + commit. Reality: 5 to 50 epochs/TX. Highest rate of epochs: native & TM libraries. Suggestion: enforce durability only at the end of a transaction.

  17. How large are epochs typically? Epoch size determines the amount of state buffered per epoch. (Chart: fraction of epochs, 0% to 100%, by number of 64B cachelines: 1, 2, 3, 4, 5, 6-63, >=64.) Small epochs are abundant: 75% update a single cacheline. Large epochs occur in PMFS. Suggestion: consider optimizing for small epochs.
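
The bucketing used on this chart's x-axis can be reproduced with a few lines of Python. This is a sketch, assuming epoch sizes are already available as a list of cacheline counts (for instance from a trace analysis); the function names and the sample list are made up for illustration and are not measured data.

```python
from collections import Counter

# Size buckets taken from the chart's x-axis (number of 64B cachelines).
BUCKETS = ["1", "2", "3", "4", "5", "6-63", ">=64"]


def bucket(size):
    """Map an epoch size (in cachelines, >= 1) to its chart bucket."""
    if size < 1:
        raise ValueError("an epoch writes at least one cacheline")
    if size <= 5:
        return str(size)
    return "6-63" if size < 64 else ">=64"


def size_distribution(epoch_sizes):
    """Return the fraction of epochs falling in each bucket."""
    if not epoch_sizes:
        raise ValueError("no epochs to summarize")
    counts = Counter(bucket(s) for s in epoch_sizes)
    total = len(epoch_sizes)
    return {b: counts[b] / total for b in BUCKETS}


# Made-up sample, only to show the call; not WHISPER measurements.
sample_sizes = [1, 1, 1, 2, 70, 10, 1, 5]
distribution = size_distribution(sample_sizes)
```

For the made-up sample, half of the epochs land in the single-cacheline bucket and the remaining buckets share the rest.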

  18. What contributes to epochs? Log entries: undo log = alternating epochs of log and data; redo log = 1 log epoch + 1 data epoch. Persistent memory allocation: 1 to 5 epochs. Suggestion: use redo logs and reduce epochs from the memory allocator.
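
To make the undo-versus-redo comparison concrete, the sketch below lists the epochs each logging scheme would generate for a transaction with several updates. It is a simplified model under assumptions: commit records and allocator epochs are left out, and the function names and the `(kind, addresses)` epoch representation are hypothetical.

```python
def undo_log_epochs(updates):
    """Undo logging: each update needs its old value logged durably
    before the in-place write, so log and data epochs alternate."""
    epochs = []
    for addr, _new_value in updates:
        epochs.append(("log", [addr]))   # persist the undo entry first
        epochs.append(("data", [addr]))  # then update the data in place
    return epochs


def redo_log_epochs(updates):
    """Redo logging: all new values are logged in one epoch,
    then all data is written in a second epoch."""
    addrs = [addr for addr, _new_value in updates]
    if not addrs:
        return []
    return [("log", addrs), ("data", addrs)]


# Three updates in one transaction (illustrative addresses and values).
updates = [("a", 1), ("b", 2), ("c", 3)]
undo = undo_log_epochs(updates)
redo = redo_log_epochs(updates)
```

With three updates this model yields six epochs for the undo log and two for the redo log, which is the gap the slide's suggestion refers to. Real runtimes add further epochs, for example for commit records and memory allocation.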

  19. What are epoch dependencies? Two kinds: self-dependency and cross-dependency (diagram: numbered epochs with labels A-D on Thread 1 and Thread 2). Why do they matter? A dependency can stall execution. Dependencies were measured in a 50 us window.
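
A dependency check along these lines could look like the sketch below. It assumes, following the summary slide's wording about re-referencing PM cachelines, that an epoch has a self-dependency when it writes a cacheline last written by an earlier epoch of the same thread, and a cross-dependency when that earlier epoch belongs to another thread, both within the 50 us window. The input format and the function name `classify` are hypothetical.

```python
WINDOW_US = 50  # measurement window from the slide, in microseconds


def classify(epochs):
    """Classify each epoch as having a self- and/or cross-dependency.

    epochs: list of (time_us, thread_id, set_of_cachelines), sorted by time
    (a hypothetical format). Returns a list of (self_dep, cross_dep) booleans.
    """
    last_write = {}  # cacheline -> (time_us, thread_id) of the latest epoch writing it
    result = []
    for t, tid, lines in epochs:
        self_dep = cross_dep = False
        for line in lines:
            prev = last_write.get(line)
            if prev is not None and t - prev[0] <= WINDOW_US:
                if prev[1] == tid:
                    self_dep = True   # same thread re-wrote the line
                else:
                    cross_dep = True  # another thread wrote the line last
        for line in lines:
            last_write[line] = (t, tid)
        result.append((self_dep, cross_dep))
    return result


# Illustrative input: thread 1 rewrites line 10, thread 2 then writes line 11,
# and a much later write to line 10 falls outside the window.
example_epochs = [
    (0, 1, {10}),
    (5, 1, {10, 11}),
    (20, 2, {11}),
    (200, 2, {10}),
]
deps = classify(example_epochs)
```

In the example, the second epoch has a self-dependency, the third has a cross-dependency, and the last has none because its predecessor is more than 50 us old.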

  20. How common are dependencies? Epoch dependencies as a percentage of total epochs:
      Benchmark | % cross-dep | % self-dep
      echo | 0.01 | 54.5
      nstore-ycsb | 0.003 | 40.2
      nstore-tpcc | 0.03 | 27.18
      redis | 0 | 82.5
      ctree | 0 | 79
      hashmap | 0 | 81
      vacation | 0.01 | 40
      memcached | 0.2 | 63.5
      nfs | 5 | 55
      exim | 1.16 | 45.27
      mysql | 0.04 | 17.89
      Suggestion: design multi-versioned caches OR avoid updating the same cacheline across epochs.

  21. Summary. WHISPER: Wisconsin-HP Labs Suite for Persistence. 4% of accesses go to PM, 96% to DRAM. 5-50 epochs/TX, primarily small in size. Memory allocation and logging introduce extra epochs. Cross-dependencies are rare, self-dependencies common. More results in the ASPLOS '17 paper and code at: research.cs.wisc.edu/multifacet/whisper/

  22. Extra

  23. A Simple Transaction using Epochs. Source code: TM_BEGIN(); pobj.data = 42; pobj.init = True; TM_END(); Resulting epochs: transaction_begin; Epoch 1 (log entries stored & persisted): log[pobj.init] = True; log[pobj.data] = 42; write_back(log); wait_for_write_back(); Epoch 2 (variables stored & persisted): pobj.init = True; pobj.data = 42; write_back(pobj); wait_for_write_back(); transaction_end;

  24. Runtimes cause write amplification. Notes from the slide: logs every PM write; clears log; auxiliary structures; < 5% writes to PM. Non-temporal writes: Mnemosyne logs, PMFS user-data. (Chart: write amplification, in percent, for PMFS, Mnemosyne and NVML.)
