Understanding Memory Virtualization in Operating Systems

Memory Virtualization
Brian Kocoloski, Marion Sudvarg, Chris Gill
CSE 522S – Advanced Operating Systems
Washington University in St. Louis
St. Louis, MO 63130
Memory Virtualization
We’re used to virtual addresses, which get mapped to physical addresses
But what about guest addresses? How do they map to host addresses?
Virtualization adds another level of indirection that the hypervisor has to deal with
(x86) Supervisor Mode
Last time we discussed virtualization extensions: hardware updates in recent processors to make virtualization more efficient
Without extensions, we need to run guest code in Ring 3 to prevent it from corrupting shared hardware
What if the guest ran in Ring 0? It could update its page tables to map any physical memory: a major security issue
(x86) Virtualization Extensions
Virtualization extensions allow us to run guest code directly in Ring 0, but to do so safely thanks to a new orthogonal privilege mode, supervisor mode
[Figure: privilege levels above the hardware]
Ring 3, Guest Mode: P1
Ring 0, Guest Mode: Guest OS
Ring 3, Supervisor Mode: P2
Ring 0, Supervisor Mode: Hypervisor + Host OS
Hardware
(x86) Rings and Supervisor Modes
Rings and supervisor mode are architecture specific (ARM is quite different from x86)
Intel/AMD (x86): VMX root (host mode) and VMX non-root (guest mode)
Everything in this discussion pertains to x86, though ARM does have some similarities
The basic idea of the virtualization extensions is to duplicate some of the hardware to add a second level of virtualization
(x86) Nested Page Tables
Without VMX support, we can’t let the guest directly update its page tables
mov cr3, 0x…
Such an instruction needs to be trapped and emulated by something like QEMU (we’ll talk about how in a bit)
But with VMX support for nested page tables, we can actually let the guest modify cr3 (the control register pointing to the page table) directly
Review: Paging Example
Example 0x0000  cd00  2240  6c90
Binary (low 48 bits): 1100 1101 0000 0000 0010 0010 0100 0000 0110 1100 1001 0000
Hex digits:           c    d    0    0    2    2    4    0    6    c    9    0
Split into 9 + 9 + 9 + 9 + 12 bits: 1100 1101 0 | 000 0000 00 | 10 0010 010 | 0 0000 0110 | 1100 1001 0000
Top level page number (9 bits): 1 1001 1010 : 410
2nd level page number (9 bits): 0 0000 0000 : 0
3rd level page number (9 bits): 1 0001 0010 : 274
4th level page number (9 bits): 0 0000 0110 : 6
Page offset (12 bits): 1100 1001 0000 : 3216
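The split above can be reproduced mechanically. The following is a minimal sketch, assuming the 4-level, 9-bits-per-level, 12-bit-offset layout from the slide; the function name `split_virtual_address` and its parameters are made up for illustration.

```python
def split_virtual_address(vaddr, levels=4, index_bits=9, offset_bits=12):
    """Split a virtual address into per-level table indices and a page offset."""
    offset = vaddr & ((1 << offset_bits) - 1)
    indices = []
    for level in range(levels):
        # Level 0 is the top-level table, which uses the most significant index bits.
        shift = offset_bits + index_bits * (levels - 1 - level)
        indices.append((vaddr >> shift) & ((1 << index_bits) - 1))
    return indices, offset


if __name__ == "__main__":
    # The example address from the slide.
    indices, offset = split_virtual_address(0x0000_CD00_2240_6C90)
    print(indices, offset)
```

Running it on other addresses is a quick way to check a hand-worked split.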
Paging Review: Use of Control Registers
CPU instruction: WR R1, 0x0000  cd00  2240  6c90
Step 1: Get top-level page table address
The MMU receives WR 0x0000  cd00  2240  6c90 and reads the control register (x86: CR3; ARM: CP15 c2) to get the top level page table address (a)
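Example 0x0000  cd00  2240  6c90
1100 1101 0 | 000 0000 00 | 10 0010 010 | 0 0000 0110 | 1100 1001 0000
(410 | 0 | 274 | 6 | 3216)
[Figure: four-level page table walk. Entry 410 of top-level table a points to table b; entry 0 of b points to table c; entry 274 of c points to table d; entry 6 of d points to physical page e (4KB). Each table has entries 0 to 511.]
Grab the byte at offset 3216 in this page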
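The walk in the figure can be imitated in software. This is only a sketch: the tables are plain Python dictionaries, the table names a to d follow the figure, and the address and the frame base 0x7000 are made-up values chosen so the indices are easy to read (1, 2, 3, 4, offset 5).

```python
ENTRIES_PER_TABLE = 512  # 9 index bits per level
PAGE_SIZE = 4096         # 12 offset bits


def walk(tables, root, vaddr):
    """Follow four levels of tables starting at `root`, then add the page offset.

    `tables` maps a table name to {index: next table name}; the last level maps
    an index to the base address of a physical page. A missing entry raises
    KeyError, standing in for a page fault.
    """
    node = root
    for level in range(4):
        shift = 12 + 9 * (3 - level)
        index = (vaddr >> shift) & (ENTRIES_PER_TABLE - 1)
        node = tables[node][index]
    # After the fourth lookup, `node` is the base address of the 4KB page.
    return node + (vaddr & (PAGE_SIZE - 1))


# Hypothetical address with indices 1, 2, 3, 4 and offset 5.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
tables = {"a": {1: "b"}, "b": {2: "c"}, "c": {3: "d"}, "d": {4: 0x7000}}
print(hex(walk(tables, "a", vaddr)))
```

A real MMU reads physical memory at each step rather than dictionaries, but the sequence of lookups is the same.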
Address Translation
[Figure: a Virtual Address Space mapped onto the Physical Address Space]
Address Translation
[Figure: VM 1 Virtual Address Space and VM 2 Virtual Address Space both mapped onto one Physical Address Space]
Can’t VMs corrupt each other’s memory?
Address Translation
Solution is to add another level of indirection
Second level of virtualization
Virtual -> Intermediate Physical -> Physical
More commonly thought of as:
Guest virtual -> Guest Physical -> Host Physical
The guest “thinks” it is mapping physical memory, but it is really mapping just another type of virtual memory
Address Translation
[Figure: the VM 1 and VM 2 Virtual Address Spaces map to their own Guest Physical Address Spaces, which in turn map to the Host Physical Address Space]
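To make the two levels concrete, here is a small sketch of Guest virtual -> Guest Physical -> Host Physical translation for two VMs. The `VM` class, the single-level page maps, and the page numbers are all assumptions for illustration; real tables are multi-level.

```python
PAGE_SIZE = 4096


def translate(page_map, addr):
    """Translate `addr` through a {page number: page number} map, keeping the offset."""
    page, offset = divmod(addr, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset


class VM:
    def __init__(self, guest_page_table, gpa_to_hpa):
        self.guest_page_table = guest_page_table  # GVA page -> GPA page, owned by the guest OS
        self.gpa_to_hpa = gpa_to_hpa              # GPA page -> HPA page, owned by the hypervisor

    def access(self, gva):
        gpa = translate(self.guest_page_table, gva)  # guest virtual -> guest physical
        return translate(self.gpa_to_hpa, gpa)       # guest physical -> host physical


# Both guests map virtual page 0 to guest-physical page 0 ...
vm1 = VM({0: 0}, {0: 10})
vm2 = VM({0: 0}, {0: 20})
# ... but the hypervisor backs them with different host pages.
print(hex(vm1.access(0x123)), hex(vm2.access(0x123)))
```

Because only the hypervisor controls the second map, a guest that rewrites its own page table can still reach only the host pages it was given.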
Virtualizing Memory in Software
Three abstractions of memory:
Virtual Address Spaces (0 to 4GB): Current Guest Process, Guest OS
Guest Physical Address Spaces (0 to 4GB): Virtual RAM, Virtual ROM, Virtual Devices, Virtual Frame Buffer
Host Physical Address Space (0 to 4GB): RAM, ROM, Devices, Frame Buffer
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
Shadow Page Tables
VMM maintains shadow page tables that map guest virtual pages directly to host physical pages.
Guest modifications to V->GP tables are synced to the VMM’s V->HP shadow page tables.
Guest OS page tables are marked as read-only.
Modifications of page tables by the guest OS are trapped to the VMM.
Shadow page tables are synced to the guest OS tables
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
[Figure: the Virtual CR3 points to the Guest Page Tables; the Real CR3 points to the Shadow Page Tables]
1. Guest OS sets CR3
   a) Typically to support a context switch among guest applications
   b) Virtual CR3 points to the page table for the newly active guest application
2. Host OS “shadows” the update
   a) Instruction traps to host
   b) Host allows update of virtual CR3
   c) Host updates physical CR3 to point to shadow page table
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
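The two steps above might look roughly like this in a hypervisor's trap handler. This is a toy model, not how any particular hypervisor is written: the class, method names, and addresses are hypothetical, and the "registers" are ordinary attributes.

```python
class ShadowPagingHypervisor:
    def __init__(self):
        self.virtual_cr3 = None  # value the guest believes CR3 holds
        self.real_cr3 = None     # value actually loaded into the hardware CR3
        self.shadow_roots = {}   # guest page table root -> shadow page table root
        self.vm_exits = 0

    def register_shadow(self, guest_root, shadow_root):
        """Record which shadow table mirrors a given guest page table."""
        self.shadow_roots[guest_root] = shadow_root

    def on_guest_cr3_write(self, guest_root):
        """Handle the trap caused by the guest writing CR3 (step 2a)."""
        self.vm_exits += 1
        self.virtual_cr3 = guest_root                  # step 2b: update the virtual CR3
        self.real_cr3 = self.shadow_roots[guest_root]  # step 2c: point hardware at the shadow table


hv = ShadowPagingHypervisor()
hv.register_shadow(guest_root=0x1000, shadow_root=0x9000)  # made-up addresses
hv.on_guest_cr3_write(0x1000)  # guest context switch
print(hex(hv.virtual_cr3), hex(hv.real_cr3), hv.vm_exits)
```

Every guest context switch goes through `on_guest_cr3_write`, which is why the exit counter grows with the guest's switch rate.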
Shadow Page Tables
Shadow paging relies on support for trapping updates to either (1) cr3, or (2) any of the page tables pointed to by cr3
Hypervisor is invoked when the guest tries to modify cr3 or the page tables
This is called a VM-exit
Hypervisor then “shadows” the update, and can also perform a memory translation from GPA to HPA if the guest is mapping a new memory region
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
[Figure: the guest maps new GPA P in a Guest Page Table (reached through the Virtual CR3); the hypervisor maps new HPA P’ in the matching Shadow Page Table (reached through the Real CR3)]
1. Guest adds new page mapping
   a) Guest application requires a new virtual page
   b) Guest kernel maps new Guest Physical Address P
2. Hypervisor maps PTE to HPA
   a) Guest PTE is marked read-only
   b) Write causes VM-exit, trapped by hypervisor
   c) Hypervisor maps the shadow Page Table Entry to Host Physical Address P’
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
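A sketch of how the hypervisor could keep the shadow table in sync when the guest's page table write traps. The names here (`ShadowPageTable`, `on_guest_pte_write`) are invented for illustration, tables are flattened to one level, and the page numbers are arbitrary.

```python
class ShadowPageTable:
    def __init__(self, gpa_to_hpa):
        self.gpa_to_hpa = gpa_to_hpa  # hypervisor's GPA page -> HPA page map
        self.guest_table = {}         # GVA page -> GPA page (what the guest wrote)
        self.shadow_table = {}        # GVA page -> HPA page (what the hardware walks)
        self.vm_exits = 0

    def on_guest_pte_write(self, gva_page, gpa_page):
        """Called when a guest write to its read-only page table traps (step 2b)."""
        self.vm_exits += 1
        self.guest_table[gva_page] = gpa_page                    # emulate the guest's write of P
        self.shadow_table[gva_page] = self.gpa_to_hpa[gpa_page]  # step 2c: shadow entry gets P'


spt = ShadowPageTable(gpa_to_hpa={7: 42})  # GPA page 7 is backed by HPA page 42
spt.on_guest_pte_write(gva_page=3, gpa_page=7)
print(spt.guest_table, spt.shadow_table, spt.vm_exits)
```

The guest only ever sees `guest_table`; the hardware only ever walks `shadow_table`, and each update to the former costs one exit.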
Drawbacks: Shadow Page Tables
Need to handle a trap (often called a VM exit) on all page table updates (and context switches)
Processor moves from guest mode to host mode
Similar to a CPU context switch, but actually more expensive
If the guest has frequent switches or page table updates, it requires frequent traps to maintain consistency between guest page tables and shadow page tables
Loss of performance due to TLB flush on every “world-switch”
Memory overhead due to shadow copying of guest page tables
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
Nested / Extended Page Tables
Extended page-table mechanism (EPT) used to support the virtualization of physical memory.
Guest-physical addresses are translated by traversing a set of EPT paging structures to produce physical addresses that are used to access memory.
The hardware gives us a 2nd set of page tables to do the translation without needing VMM intervention
Of course, the VMM is still responsible for setting up the EPT, but this generally only needs to be done once at guest boot time
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
Figure source: https://www.exploit-db.com/docs/45546
Advantages: EPT
Simplified VMM design (no need to maintain any “shadow” state or complex software MMU structures)
Guest page table modifications need not be trapped, hence VM exits are reduced.
Reduced memory footprint compared to shadow page table algorithms.
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
Disadvantages: EPT
A TLB miss is very costly, since translating a guest-physical address to a machine address needs an extra EPT walk for each stage of guest-virtual address translation.
Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
Nested Paging Lookup
A guest-virtual to host-physical address translation must occur
Assume a TLB miss
In this case, traversal is nested
Consider x86-64
4 page table references grow to 24
Source: https://research.cs.wisc.edu/multifacet/papers/isca16_agile_paging.pdf
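One way to see where 24 comes from is to count memory references step by step. The sketch below assumes 4 guest levels and 4 nested levels with no caching of intermediate translations; the function name is made up.

```python
def nested_walk_references(guest_levels=4, nested_levels=4):
    """Count memory references for one nested page walk after a TLB miss."""
    refs = 0
    for _ in range(guest_levels):
        refs += nested_levels  # translate the guest-physical address of this guest table
        refs += 1              # read the guest page table entry itself
    refs += nested_levels      # translate the guest-physical address of the final data page
    return refs


print(nested_walk_references())  # 4 * (4 + 1) + 4 = 24
```

The same count can be written as (n + 1)(m + 1) - 1 for n guest levels and m nested levels.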
Today’s Studio
Experience using QEMU emulator and KVM hypervisor on the Raspberry Pi
Measure and compare performance of:
   Virtualization vs Emulation
   Memory bound workloads in native and virtualized environments
Think about what types of workloads would benefit from shadow paging and what type would benefit from nested paging
Today’s Readings
LKD pages 231-233 & 320-322: A quick review of paging, page tables, and the TLB.
Paul Barham, Boris Dragovic et al. 2003. “Xen and the art of virtualization.” In Proceedings of the nineteenth ACM symposium on Operating systems principles (SOSP '03).
Fabrice Bellard. 2005. “QEMU, a fast and portable dynamic translator.” In Proceedings of the annual conference on USENIX Annual Technical Conference (ATEC '05).

Memory virtualization adds a level of indirection, managed by the hypervisor, between guest addresses and host addresses. x86 virtualization extensions let guest code run safely in Ring 0 by adding a privilege mode (host versus guest) that is orthogonal to the rings. Shadow page tables implement the guest-virtual to host-physical mapping in software by trapping guest updates to cr3 and to the page tables, while nested (extended) page tables let the guest modify cr3 and its own page tables directly, at the cost of more expensive TLB misses.


Uploaded on Jul 17, 2024 | 1 Views



