Enhanced Virtual Memory Framework for Fine-grained Memory Management

Page Overlays
An Enhanced Virtual Memory Framework to
Enable Fine-grained Memory Management
Vivek Seshadri
Gennady Pekhimenko, Olatunji Ruwase,
Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch,
 Todd C. Mowry, Trishul Chilimbi
@CMU
Executive Summary
 
Sub-page memory management has several applications
More efficient capacity management, protection, metadata, …
Page-granularity virtual memory → inefficient implementations
Low performance and high memory redundancy
Page Overlays: New Virtual Memory Framework
Virtual Page → (physical page, overlay)
Overlay contains new versions of subset of cache lines
Efficiently store pages with mostly similar data
Largely retains existing virtual memory structure
Low cost implementation over existing frameworks
Powerful access semantics – Enables many applications
E.g., overlay-on-write, efficient sparse data structure representation
Improves performance and reduces memory redundancy
Existing Virtual Memory Systems
Virtual memory enables many OS functionalities
Flexible capacity management
Inter-process data protection, sharing
Copy-on-write, page flipping
[Figure: the page tables map a 4KB virtual page to a 4KB physical page.]
Case Study: Copy-on-Write
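[Figure: page tables map a virtual page in the virtual address space to a physical page in the physical address space. On a write to a copy-on-write page, the system (1) allocates a new page, (2) copies the entire page, and (3) changes the mapping.]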
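A minimal sketch of the three copy-on-write steps (allocate a new page, copy the entire page, change the mapping), assuming a toy page table held in a Python dict. The class and field names (AddressSpace, bytes_copied) are hypothetical and only illustrate the cost of a page-granularity copy; they are not taken from the slides.

```python
PAGE_SIZE = 4096  # 4KB pages, as on the slides


class AddressSpace:
    """Toy page table: virtual page number -> physical page number."""

    def __init__(self, memory):
        self.memory = memory      # physical page number -> bytearray
        self.page_table = {}      # virtual page number -> physical page number
        self.bytes_copied = 0     # bookkeeping for this illustration only

    def copy_on_write(self, vpn, offset, value):
        old_ppn = self.page_table[vpn]
        new_ppn = max(self.memory) + 1                           # 1. allocate a new page
        self.memory[new_ppn] = bytearray(self.memory[old_ppn])   # 2. copy the entire page
        self.bytes_copied += PAGE_SIZE
        self.page_table[vpn] = new_ppn                           # 3. change the mapping
        self.memory[new_ppn][offset] = value                     # then perform the write
        return new_ppn


memory = {0: bytearray(PAGE_SIZE)}
child = AddressSpace(memory)
child.page_table[7] = 0            # virtual page 7 initially shares physical page 0
child.copy_on_write(7, 100, 0xFF)  # a one-byte write copies all 4096 bytes
```

Even though only one byte changes, the sketch copies the whole page, which is the redundancy and latency the following slides point out.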
Shortcomings of Page-granularity Management
[Figure: the copy-on-write diagram, annotated "High memory redundancy".]
Shortcomings of Page-granularity Management
[Figure: the copy-on-write diagram, annotated "4KB copy: High Latency".]
Shortcomings of Page-granularity Management
[Figure: the copy-on-write diagram, annotated "TLB Shootdown: High Latency".]
Wouldn’t it be nice to map pages at a
finer granularity (smaller than 4KB)?
Fine-grained Memory Management
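Higher performance (e.g., more efficient copy-on-write)
Fine-grained data protection (simpler programs)
Fine-grained metadata management (better security, efficient software debugging)
More efficient capacity management (avoid internal fragmentation, deduplication)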
Goal: Efficient Fine-grained Memory Management
Existing Virtual Memory Framework: low performance, high memory redundancy
New Virtual Memory Framework: enable efficient fine-grained management at low implementation cost
Outline
Shortcomings of Existing Framework
Page Overlays – Overview
Implementation
Challenges and solutions
Applications and Evaluation
Conclusion
The Page Overlay Framework
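Overlay maintains the newer version of a subset of cache lines from the virtual page
Access Semantics: Only cache lines not present in the overlay are accessed from the physical page
[Figure: a virtual page with cache lines C0 to C5; the overlay holds newer versions of C1 and C5, and the remaining lines come from the physical page.]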
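The access semantics can be sketched as a lookup that prefers the overlay and falls back to the physical page. This is an illustration only: the function name read_line, the dict-based overlay, and the primed labels for newer versions are assumptions, not the hardware mechanism described in the talk.

```python
def read_line(line_index, physical_page, overlay):
    """Return the overlay's version of a cache line if present, else the physical page's."""
    if line_index in overlay:
        return overlay[line_index]
    return physical_page[line_index]


# Six cache lines, as in the slide's figure; the overlay holds newer C1 and C5.
physical_page = ["C0", "C1", "C2", "C3", "C4", "C5"]
overlay = {1: "C1'", 5: "C5'"}

virtual_page = [read_line(i, physical_page, overlay) for i in range(len(physical_page))]
print(virtual_page)
```

The virtual page seen by the program is therefore a merge: overlay lines where they exist, physical-page lines everywhere else.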
Overlay-on-Write: An Efficient Copy-on-Write
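Overlay contains only modified cache lines
Does not require full page copy
[Figure: page tables map a virtual page to a physical page; a write to the copy-on-write page places the modified cache line in an overlay.]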
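A rough software sketch of overlay-on-write, assuming 64-byte cache lines (the line size is an assumption; the slides only state 4KB pages). The class OverlayPage and its methods are hypothetical names used to show that a write adds one cache line to the overlay instead of copying the page.

```python
PAGE_SIZE = 4096
LINE_SIZE = 64  # assumed cache line size in bytes


class OverlayPage:
    """A shared physical page plus a per-process overlay of modified cache lines."""

    def __init__(self, physical_page):
        self.physical_page = physical_page  # shared and never modified here
        self.overlay = {}                   # cache line index -> bytearray(LINE_SIZE)

    def write(self, offset, value):
        line = offset // LINE_SIZE
        if line not in self.overlay:
            # First write to this line: move only this line into the overlay.
            start = line * LINE_SIZE
            self.overlay[line] = bytearray(self.physical_page[start:start + LINE_SIZE])
        self.overlay[line][offset % LINE_SIZE] = value

    def read(self, offset):
        line = offset // LINE_SIZE
        if line in self.overlay:
            return self.overlay[line][offset % LINE_SIZE]
        return self.physical_page[offset]

    def extra_bytes(self):
        return len(self.overlay) * LINE_SIZE


shared = bytes(PAGE_SIZE)  # page shared between parent and child
page = OverlayPage(shared)
page.write(130, 0xAB)      # touches cache line 2 only
```

Under these assumptions a single write costs 64 bytes of extra memory rather than a 4096-byte page copy, and the shared page is left untouched.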
Outline
Shortcomings of Existing Framework
Page Overlays – Overview
Implementation
Challenges and solutions
Applications and Evaluation
Conclusion
Implementation Overview
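[Figure: virtual page V in the virtual address space maps to a regular physical page P and an overlay O in main memory, which holds regular physical pages and overlays. The figure carries the annotation "No changes!".]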
Three challenges
Implementation Challenges
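1. Does the cache line belong to the overlay?
2. What is the address/tag of the overlay cache line?
3. How to keep the TLBs coherent?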
Identifying Overlay Cache Lines: Overlay Bit Vector
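Overlay Bit Vector: indicates which cache lines belong to the overlay
[Figure: for a page with cache lines C0 to C5 whose overlay holds C2 and C5, the overlay bit vector is 0 0 1 0 0 1.]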
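The overlay bit vector can be illustrated with one bit per cache line. The sketch below assumes bit i corresponds to cache line Ci and uses hypothetical helper names (set_line, in_overlay); it reproduces the six-line example from the slide, where C2 and C5 are in the overlay.

```python
def set_line(obitvector, line):
    """Mark cache line `line` as present in the overlay."""
    return obitvector | (1 << line)


def in_overlay(obitvector, line):
    """Answer: does this cache line belong to the overlay?"""
    return (obitvector >> line) & 1 == 1


obv = 0
for line in (2, 5):  # C2 and C5 are in the overlay
    obv = set_line(obv, line)

bits = [int(in_overlay(obv, i)) for i in range(6)]  # C0 .. C5
print(bits)
```

With 4KB pages and an assumed 64-byte cache line, a full page would need 64 such bits.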
Addressing Overlay Cache Lines: Naïve Approach
Use the location of the overlay in main memory to tag overlay cache lines
1. Processor must compute the address
2. Does not work with virtually-indexed caches
3. Complicates overlay cache line insertion
Addressing Overlay Cache Lines: Dual Address Design
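[Figure: virtual page V maps through the page tables to physical page P in the physical address space. The overlay O is addressed in the overlay cache address space, which occupies unused physical address space; the figure carries a "same size" annotation. Main memory holds both P and O.]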
Virtual-to-Overlay Mappings
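[Figure: virtual page V maps to physical page P through the page tables and to overlay page O in the overlay cache address space through a direct mapping. The Overlay Mapping Table (OMT), maintained by the memory controller, connects the overlay cache address space to the overlays stored in main memory.]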
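One way to picture the two mappings is sketched below. Everything concrete here is an assumption made for illustration: the position of the bit that marks the overlay address space (OVERLAY_BIT), the way the virtual page number is embedded, and the OverlayMappingTable interface. The slides only state that the virtual-to-overlay mapping is direct and that the OMT is maintained by the memory controller.

```python
PAGE_SHIFT = 12        # 4KB pages
OVERLAY_BIT = 1 << 47  # hypothetical: selects the otherwise unused physical address range


def overlay_page_address(vpn):
    """Direct mapping: the overlay address follows from the virtual page number, no lookup."""
    return OVERLAY_BIT | (vpn << PAGE_SHIFT)


class OverlayMappingTable:
    """Toy OMT: overlay page address -> where the overlay is stored in main memory."""

    def __init__(self):
        self.entries = {}

    def install(self, overlay_addr, memory_location):
        self.entries[overlay_addr] = memory_location

    def lookup(self, overlay_addr):
        return self.entries.get(overlay_addr)  # None if the page has no overlay


omt = OverlayMappingTable()
o_addr = overlay_page_address(0x1234)
omt.install(o_addr, 0x9000)  # 0x9000 is an arbitrary example location
```

In this sketch the processor side only needs the direct mapping to tag overlay cache lines, while the table lookup happens at the memory side.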
Keeping TLBs Coherent
[Figure: a virtual page and a physical page with cache lines C0 to C5, and an overlay holding C2 and C5; cache line C3 is highlighted.]
How to keep the TLBs coherent?
Use the cache coherence protocol to keep TLBs
coherent!
Final Implementation
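[Figure: block diagram with the CPU, TLB, L1 cache, last-level cache, memory controller, OMT cache, OMT, regular physical pages, and overlays. Three numbered markers (1, 2, 3) highlight the overlay bit vectors, the OMT and OMT cache, and the overlays.]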
Other Details in the Paper
Virtual-to-overlay mapping
TLB and cache coherence
OMT management (by the memory controller)
Hardware cost: 94.5 KB of storage
OS Support
Outline
Shortcomings of existing frameworks
Page Overlays – Overview
Implementation
Challenges and solutions
Applications and Evaluation
Conclusion
Methodology
Memsim memory system simulator [Seshadri+ PACT 2012]
2.67 GHz, single core, out-of-order, 64-entry instruction window
64-entry L1 TLB, 1024-entry L2 TLB
64KB L1 cache, 512KB L2 cache, 2MB L3 cache
Multi-entry Stream Prefetcher [Srinath+ HPCA 2007]
Open row, FR-FCFS, 64-entry write buffer, drain when full
64-entry OMT cache
DDR3 1066 MHz, 1 channel, 1 rank, 8 banks
Overlay-on-Write
 
[Figure: copy-on-write and overlay-on-write shown side by side.]
Compared with copy-on-write, overlay-on-write has lower memory redundancy and lower latency
Fork Benchmark
 
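[Figure: timeline. The parent process forks (child idles) and then runs for 300 million instructions.]
Applications from SPEC CPU 2006 (varying write working sets)
Metrics: additional memory consumption, performance (cycles per instruction)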
Overlay-on-Write vs. Copy-on-Write on Fork
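[Figure: two bar charts comparing copy-on-write and overlay-on-write by write working set (Small, Dense, Sparse, Mean): additional memory (MBs) and cycles per instruction. The charts carry the annotations 53% and 15%.]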
Conclusion
 
Sub-page memory management has several applications
More efficient capacity management, protection, metadata, …
Page-granularity virtual memory → inefficient implementations
Low performance and high memory redundancy
Page Overlays: New Virtual Memory Framework
Virtual Page → (physical page, overlay)
Overlay contains new versions of subset of cache lines
Efficiently store pages with mostly similar data
Largely retains existing virtual memory structure
Low cost implementation over existing frameworks
Powerful access semantics – Enables many applications
E.g., overlay-on-write, efficient sparse data structure representation
Improves performance and reduces memory redundancy

This study introduces Page Overlays, a new virtual memory framework designed to enable fine-grained memory management. By efficiently storing pages with similar data and providing powerful access semantics, Page Overlays improve performance and reduce memory redundancy compared to existing virtual memory systems. The framework retains the existing virtual memory structure while offering low-cost implementation and supporting various applications such as overlay-on-write and sparse data structure representation.

  • Memory Management
  • Virtual Memory
  • Fine-grained
  • Page Overlays
  • Performance

Uploaded on Oct 03, 2024 | 0 Views



Presentation Transcript


  1. Page Overlays An Enhanced Virtual Memory Framework to Enable Fine-grained Memory Management Vivek Seshadri Gennady Pekhimenko, Olatunji Ruwase, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry, Trishul Chilimbi @CMU

  2. Executive Summary: Sub-page memory management has several applications (more efficient capacity management, protection, metadata, …). Page-granularity virtual memory leads to inefficient implementations: low performance and high memory redundancy. Page Overlays is a new virtual memory framework: Virtual Page → (physical page, overlay). The overlay contains new versions of a subset of cache lines and efficiently stores pages with mostly similar data. The framework largely retains the existing virtual memory structure (low-cost implementation over existing frameworks) and has powerful access semantics that enable many applications, e.g., overlay-on-write and efficient sparse data structure representation. It improves performance and reduces memory redundancy.

  3. Existing Virtual Memory Systems Virtual memory enables many OS functionalities Flexible capacity management Inter-process data protection, sharing Copy-on-write, page flipping Page Tables Virtual page Physical Page 4KB 4KB 3

  4. Case Study: Copy-on-Write. On a write to a copy-on-write page: (1) allocate a new page, (2) copy the entire page, (3) change the mapping in the page tables.

  5. Shortcomings of Page-granularity Management: copying the entire page causes high memory redundancy.

  6. Shortcomings of Page-granularity Management: 4KB copy, high latency.

  7. Shortcomings of Page-granularity Management: TLB shootdown, high latency.

  8. Fine-grained Memory Management enables: higher performance (e.g., more efficient copy-on-write); fine-grained data protection (simpler programs); fine-grained metadata management (better security, efficient software debugging); more efficient capacity management (avoid internal fragmentation, deduplication).

  9. Goal: Efficient Fine-grained Memory Management. Existing virtual memory framework: low performance, high memory redundancy. New virtual memory framework: enable efficient fine-grained management at low implementation cost.

  10. Outline Shortcomings of Existing Framework Page Overlays Overview Implementation Challenges and solutions Applications and Evaluation Conclusion 10

  11. The Page Overlay Framework. The overlay contains only a subset of cache lines from the virtual page. Access Semantics: only cache lines not present in the overlay are accessed from the physical page.

  12. Overlay-on-Write: An Efficient Copy-on-Write Page Tables Virtual Address Space Physical Address Space Virtual page Physical Page Overlay contains only modified cache lines Copy-on- Write Overlay Write Does not require full page copy 12

  13. Outline Shortcomings of Existing Framework Page Overlays Overview Implementation Challenges and solutions Applications and Evaluation Conclusion 13

  14. Implementation Overview Virtual Main Memory Address Space Regular Physical Pages P V O Overlays Three challenges 14

  15. Implementation Challenges: (1) Does the cache line belong to the overlay? (2) What is the address/tag of the overlay cache line? (3) How to keep the TLBs coherent?

  16. Identifying Overlay Cache Lines: Overlay Bit Vector. Indicates which cache lines belong to the overlay. Example: for cache lines C0 to C5 with C2 and C5 in the overlay, the overlay bit vector is 0 0 1 0 0 1.

  17. Addressing Overlay Cache Lines: Naïve Approach. Use the location of the overlay in main memory to tag overlay cache lines. 1. Processor must compute the address. 2. Does not work with virtually-indexed caches. 3. Complicates overlay cache line insertion.

  18. Addressing Overlay Cache Lines: Dual Address Design Physical Address Space Virtual Main Memory Address Space P P V O O same size Unused physical address space address space Overlay cache 18

  19. Virtual-to-Overlay Mappings Physical Address Space Virtual Main Memory Address Space Overlay Mapping Table (OMT) (maintained by memory controller) P P V O O Overlay cache address space Direct Mapping 19

  20. Keeping TLBs Coherent Physical Page Virtual Page C0 C1 C2 C3 C4 C5 C0 C1 C2 C3 C4 C5 C5 3How to keep the TLBs coherent? C3 Overlay C2 C5 C5 Use the cache coherence protocol to keep TLBs coherent! 20

  21. Final Implementation Overlay Bit Vectors 3 1 OMT 2 TLB Overlays OMT Cache Last Level Cache Regular Physical Pages L1 Memory Controller CPU Cache 21

  22. Other Details in the Paper Virtual-to-overlay mapping TLB and cache coherence OMT management (by the memory controller) Hardware cost 94.5 KB of storage OS Support 22

  23. Outline Shortcomings of existing frameworks Page Overlays Overview Implementation Challenges and solutions Applications and Evaluation Conclusion 23

  24. Methodology Memsim memory system simulator [Seshadri+ PACT 2012] 2.67 GHz, single core, out-of-order, 64 entry instruction window 64-entry L1 TLB, 1024-entry L2 TLB 64KB L1 cache, 512KB L2 cache, 2MB L3 cache Multi-entry Stream Prefetcher [Srinath+ HPCA 2007] Open row, FR-FCFS, 64 entry write buffer, drain when full 64-entry OMT cache DDR3 1066 MHz, 1 channel, 1 rank, 8 banks 24

  25. Overlay-on-Write. Compared with copy-on-write, overlay-on-write has lower memory redundancy and lower latency.

  26. Fork Benchmark. The parent process forks (child idles) and then runs for 300 million instructions, under copy-on-write and under overlay-on-write. Applications from SPEC CPU 2006 (varying write working sets). Metrics: additional memory consumption and performance (cycles per instruction).

  27. Overlay-on-Write vs. Copy-on-Write on Fork. [Figure: two bar charts comparing copy-on-write and overlay-on-write by write working set (Small, Dense, Sparse, Mean): additional memory (MBs) and cycles per instruction. The charts carry the annotations 53% and 15%.]

  28. Conclusion: Sub-page memory management has several applications (more efficient capacity management, protection, metadata, …). Page-granularity virtual memory leads to inefficient implementations: low performance and high memory redundancy. Page Overlays is a new virtual memory framework: Virtual Page → (physical page, overlay). The overlay contains new versions of a subset of cache lines and efficiently stores pages with mostly similar data. The framework largely retains the existing virtual memory structure (low-cost implementation over existing frameworks) and has powerful access semantics that enable many applications, e.g., overlay-on-write and efficient sparse data structure representation. It improves performance and reduces memory redundancy.

