Understanding Memory Virtualization in Operating Systems

Memory virtualization adds a level of indirection to address translation: guest virtual addresses map to guest physical addresses, which the hypervisor in turn maps to host physical addresses. Virtualization extensions in x86 processors let guest code run directly in Ring 0, safely, by adding an orthogonal privilege mode (VMX root for the host, VMX non-root for the guest). Without hardware support for memory virtualization, the hypervisor maintains shadow page tables and must trap every guest update to cr3 or to the page tables. With nested (extended) page tables, the hardware performs the second translation itself, so the guest can modify cr3 and its page tables directly, at the cost of more expensive TLB misses.


Uploaded on Jul 17, 2024 | 1 Views


Presentation Transcript


  1. Memory Virtualization. Brian Kocoloski, Marion Sudvarg, Chris Gill. CSE 522S Advanced Operating Systems, Washington University in St. Louis, St. Louis, MO 63130.

  2. Memory Virtualization: We're used to virtual addresses, which get mapped to physical addresses. But what about guest addresses? How do they map to host addresses? Virtualization adds another level of indirection that the hypervisor has to deal with.

  3. (x86) Supervisor Mode: Last time we discussed virtualization extensions: hardware updates in recent processors that make virtualization more efficient. Without extensions, we need to run guest code in Ring 3 to prevent it from corrupting shared hardware. What if the guest ran in Ring 0? It could update its page tables to map any physical memory, which is a major security issue.

  4. (x86) Virtualization Extensions: Virtualization extensions allow us to run guest code directly in Ring 0, but to do so safely, thanks to a new orthogonal privilege mode, supervisor mode. [Diagram: guest processes P1 and P2 run in Ring 3, guest mode; the guest OS runs in Ring 0, guest mode; Ring 3, supervisor mode sits below them; the hypervisor and host OS run in Ring 0, supervisor mode, directly above the hardware.]

  5. (x86) Rings and Supervisor Modes: Rings and supervisor mode are architecture specific (ARM is a lot different from x86). On Intel/AMD (x86) the two modes are VMX root (host mode) and VMX non-root (guest mode). Everything in this discussion pertains to x86, though ARM does have some similarities. The basic idea of the virtualization extensions is to duplicate some of the hardware to add a second level of virtualization.

  6. (x86) Nested Page Tables: Without VMX support, we can't let the guest directly update its page tables with an instruction such as "mov cr3, 0x". That instruction needs to be trapped and emulated by something like QEMU (we'll talk about how in a bit). But with VMX support, we can actually let the guest modify cr3 (the control register pointing to the page table) directly.

  7. Review: Paging Example. Example address: 0x0000 ab00 2240 6a90. Its low 48 bits are 1010 1011 0000 0000 0010 0010 0100 0000 0110 1010 1001 0000 (hex digits a b 0 0 2 2 4 0 6 a 9 0), which split as follows:
     Top-level page number (9 bits): 1 0101 0110 = 342
     2nd-level page number (9 bits): 0 0000 0000 = 0
     3rd-level page number (9 bits): 1 0001 0010 = 274
     4th-level page number (9 bits): 0 0000 0110 = 6
     Page offset (12 bits): 1010 1001 0000 = 2704
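The field split in this slide can be checked mechanically. The following is a minimal sketch, assuming the 9/9/9/9/12 layout listed above; the function name and the constant are illustrative, and the input is the 48-bit pattern printed on the slide.

```python
def split_virtual_address(va: int) -> tuple:
    """Split a 48-bit virtual address into four 9-bit table indices and a 12-bit offset."""
    offset = va & 0xFFF             # low 12 bits: byte offset within a 4KB page
    level4 = (va >> 12) & 0x1FF     # next 9 bits: 4th-level index
    level3 = (va >> 21) & 0x1FF     # next 9 bits: 3rd-level index
    level2 = (va >> 30) & 0x1FF     # next 9 bits: 2nd-level index
    level1 = (va >> 39) & 0x1FF     # top 9 bits: top-level index
    return level1, level2, level3, level4, offset


# The bit pattern shown on the slide, written as a binary literal.
EXAMPLE_BITS = 0b1010_1011_0000_0000_0010_0010_0100_0000_0110_1010_1001_0000

print(split_virtual_address(EXAMPLE_BITS))  # (342, 0, 274, 6, 2704)
```

Masking with 0x1FF keeps exactly 9 bits, which is why each table can hold 512 entries (indices 0 through 511).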

  8. Paging Review: Use of Control Registers. CPU instruction: WR R1, 0x0000 ab00 2240 6a90. Step 1: Get the top-level page table address. The MMU reads it from a control register (x86: CR3; ARM: CP15 c2), which yields the top-level page table address (a).

  9. Example: 0x0000 ab00 2240 6a90, bits 1010 1011 0000 0000 0010 0010 0100 0000 0110 1010 1001 0000, indices (342, 0, 274, 6, 2704). [Diagram: four index pages, each with entries 0 through 511. Entry 342 of the top-level table at address a points to table b; entry 0 of b points to table c; entry 274 of c points to table d; entry 6 of d points to the 4KB physical page e. Grab the 2704th byte in this page.]
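The walk in this slide can be simulated in a few lines. This is a sketch only: the tables are plain dictionaries keyed by the slide's labels a through e, and the base address given to page e is made up for illustration.

```python
# Hypothetical page tables, keyed by the labels used on the slide.
# Each table maps a 9-bit index to the label of the next-level table (or page).
TABLES = {
    "a": {342: "b"},   # top-level table, found through the control register
    "b": {0: "c"},
    "c": {274: "d"},
    "d": {6: "e"},     # the last level points at the physical page
}

# Hypothetical base address of physical page "e" (an assumption, not from the slides).
PHYSICAL_PAGE_BASE = {"e": 0x7000}


def walk(top_level, indices, offset):
    """Follow one index per level, then add the page offset to the page base."""
    current = top_level
    for index in indices:
        entry = TABLES[current].get(index)
        if entry is None:
            raise LookupError(f"page fault: table {current} has no entry {index}")
        current = entry
    return PHYSICAL_PAGE_BASE[current] + offset


physical_address = walk("a", (342, 0, 274, 6), 2704)
print(hex(physical_address))
```

A real MMU reads each entry from memory, so a four-level walk costs four memory references before the data access itself.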

  10. Address Translation. [Diagram: a virtual address space mapped onto the physical address space.]

  11. Address Translation. [Diagram: the virtual address spaces of VM 1 and VM 2, both mapped onto a single physical address space.] Can't VMs corrupt each other's memory?

  12. Address Translation: The solution is to add another level of indirection, a second level of virtualization: Virtual -> Intermediate Physical -> Physical. This is more commonly thought of as: Guest Virtual -> Guest Physical -> Host Physical. The guest thinks it is mapping physical memory, but it is really mapping just another type of virtual memory.
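A minimal sketch of the two-stage translation follows, assuming 4KB pages and single-level dictionaries in place of real multi-level tables. All page numbers are invented for illustration.

```python
PAGE_SIZE = 4096


def translate(page_map, address):
    """Map an address through a page-number table; a missing entry raises KeyError (a fault)."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset


# Stage 1 is owned by the guest OS: guest virtual page -> guest physical page.
guest_page_table = {0x10: 0x2}

# Stage 2 is owned by the hypervisor, one table per VM:
# guest physical page -> host physical page.
vm1_gpa_to_hpa = {0x2: 0x80}
vm2_gpa_to_hpa = {0x2: 0x95}


def gva_to_hpa(gva, guest_pt, gpa_to_hpa):
    """Guest Virtual -> Guest Physical -> Host Physical."""
    gpa = translate(guest_pt, gva)
    return translate(gpa_to_hpa, gpa)


# Both VMs use guest physical page 0x2, yet land on different host pages.
print(hex(gva_to_hpa(0x10123, guest_page_table, vm1_gpa_to_hpa)))  # 0x80123
print(hex(gva_to_hpa(0x10123, guest_page_table, vm2_gpa_to_hpa)))  # 0x95123
```

Because only the hypervisor writes the second-stage tables, a guest cannot name a host page that was not assigned to it, whatever it puts in its own page tables.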

  13. Address Translation. [Diagram: the virtual address spaces of VM 1 and VM 2 map to the VM 1 and VM 2 guest physical address spaces, which in turn map to the host physical address space.]

  14. Virtualizing Memory in Software. Three abstractions of memory, each drawn as a 0 to 4GB range: virtual address spaces (the current guest process and the guest OS); guest physical address spaces (virtual RAM, virtual ROM, virtual devices, virtual frame buffer); and the host physical address space (RAM, ROM, devices, frame buffer). Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  15. Shadow Page Tables: The VMM maintains shadow page tables that map guest virtual pages directly to host physical pages. Guest modifications to the V->GP tables are synced to the VMM's V->HP shadow page tables. Guest OS page tables are marked read-only. Modifications of page tables by the guest OS trap to the VMM. Shadow page tables are then synced to the guest OS tables. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
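Conceptually, a shadow page table is the composition of the guest's V->GP table with the VMM's GP->HP mapping. A small sketch under that reading, with single-level dictionaries and invented page numbers:

```python
def build_shadow_table(guest_v_to_gp, vmm_gp_to_hp):
    """Compose V->GP with GP->HP into the V->HP table the hardware actually walks.

    Guest entries whose guest physical page has no host backing are left out,
    so touching them would fault into the VMM.
    """
    return {
        virtual_page: vmm_gp_to_hp[guest_physical_page]
        for virtual_page, guest_physical_page in guest_v_to_gp.items()
        if guest_physical_page in vmm_gp_to_hp
    }


# Hypothetical page numbers, for illustration only.
guest_v_to_gp = {0x10: 0x2, 0x11: 0x3, 0x12: 0x9}
vmm_gp_to_hp = {0x2: 0x80, 0x3: 0x81}

shadow = build_shadow_table(guest_v_to_gp, vmm_gp_to_hp)
print(shadow)  # {16: 128, 17: 129}
```

The composed table gives a single-stage walk at run time, which is why every guest change to V->GP has to be trapped and replayed into the shadow copy.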

  16. Step 1: Guest OS sets CR3. (a) Typically to support a context switch among guest applications. (b) Virtual CR3 points to the page table for the newly active guest application. [Diagram: virtual CR3 selects one of three guest page tables; real CR3 selects one of three shadow page tables.] Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  17. Step 1: Guest OS sets CR3. (a) Typically to support a context switch among guest applications. (b) Virtual CR3 points to the page table for the newly active guest application. Step 2: Host OS shadows the update. (a) The instruction traps to the host. (b) The host allows the update of virtual CR3. (c) The host updates physical CR3 to point to the shadow page table. [Diagram: virtual CR3 selects one of three guest page tables; real CR3 selects the matching one of three shadow page tables.] Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  18. Shadow Page Tables: Shadow paging relies on support for trapping updates to either (1) cr3, or (2) any of the page tables pointed to by cr3. The hypervisor is invoked when the guest tries to modify cr3 or the page tables. This is called a VM-exit. The hypervisor then shadows the update, and can also perform a memory translation from GPA to HPA if the guest is mapping a new memory region. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  19. Step 1: Guest adds a new page mapping. (a) A guest application requires a new virtual page. (b) The guest kernel maps a new Guest Physical Address P. [Diagram: "Map new GPA P" is written into the guest page table selected by virtual CR3; real CR3 selects the corresponding shadow page table.] Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  20. Step 1: Guest adds a new page mapping. (a) A guest application requires a new virtual page. (b) The guest kernel maps a new Guest Physical Address P. Step 2: Hypervisor maps the PTE to an HPA. (a) The PTE is marked read-only. (b) The write causes a VM-exit, trapped by the hypervisor. (c) The hypervisor maps the Page Table Entry to Host Physical Address P. [Diagram: "Map new GPA P" is written into the guest page table selected by virtual CR3; "Map new HPA P" is written into the shadow page table selected by real CR3.] Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx
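The two trapped operations in these slides (setting CR3 and writing a page table entry) can be modeled as a toy VMM. This is a sketch under simplifying assumptions: the class and method names are hypothetical, page tables are single-level dictionaries, and a "VM exit" is just a counter increment.

```python
class ShadowPagingVMM:
    """Toy model of shadow paging: every guest CR3 or PTE write traps to the VMM."""

    def __init__(self, gp_to_hp):
        self.gp_to_hp = gp_to_hp   # guest physical page -> host physical page (VMM-owned)
        self.guest_tables = {}     # guest page table id -> {virtual page: guest physical page}
        self.shadow_tables = {}    # guest page table id -> {virtual page: host physical page}
        self.virtual_cr3 = None    # what the guest believes CR3 holds
        self.real_cr3 = None       # which shadow table the hardware walks
        self.vm_exits = 0

    def guest_sets_cr3(self, table_id):
        """Guest context switch: trap, update virtual CR3, point real CR3 at the shadow."""
        self.vm_exits += 1
        self.guest_tables.setdefault(table_id, {})
        self.shadow_tables.setdefault(table_id, {})
        self.virtual_cr3 = table_id
        self.real_cr3 = ("shadow", table_id)

    def guest_writes_pte(self, table_id, virtual_page, guest_physical_page):
        """Guest page tables are read-only, so this write traps and is replayed by the VMM."""
        self.vm_exits += 1
        self.guest_tables.setdefault(table_id, {})[virtual_page] = guest_physical_page
        host_page = self.gp_to_hp[guest_physical_page]   # GPA -> HPA translation
        self.shadow_tables.setdefault(table_id, {})[virtual_page] = host_page


vmm = ShadowPagingVMM(gp_to_hp={0x2: 0x80})
vmm.guest_sets_cr3("process_A")
vmm.guest_writes_pte("process_A", virtual_page=0x10, guest_physical_page=0x2)
print(vmm.shadow_tables["process_A"], vmm.vm_exits)
```

Even in this toy, the exit counter grows with every context switch and every mapping change, which is the cost the next slide lists as the main drawback.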

  21. Drawbacks: Shadow Page Tables. Need to handle a trap (often called a VM exit) on all page table updates (and context switches). The processor moves from guest mode to host mode, which is similar to a CPU context switch but actually more expensive. If the guest has frequent switches or page table updates, frequent traps are required to maintain consistency between the guest page tables and the shadow page tables. Loss of performance due to a TLB flush on every world-switch. Memory overhead due to shadow copying of guest page tables. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  22. Nested / Extended Page Tables: The extended page-table mechanism (EPT) is used to support the virtualization of physical memory. Guest-physical addresses are translated by traversing a set of EPT paging structures to produce physical addresses that are used to access memory. The hardware gives us a 2nd set of page tables to do the translation without needing VMM intervention. Of course, the VMM is still responsible for setting up the EPT, but this generally only needs to be done once at guest boot time. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  23. [Figure only.] Source: https://www.exploit-db.com/docs/45546

  24. Advantages: EPT. Simplified VMM design (no need to maintain any shadow state or complex software MMU structures). Guest page table modifications need not be trapped, hence VM exits are reduced. Reduced memory footprint compared to shadow page table algorithms. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  25. Disadvantages: EPT. A TLB miss is very costly, since the guest-physical address to machine address translation needs an extra EPT walk for each stage of the guest-virtual address translation. Source: http://www.cs.cmu.edu/~412/lectures/L04_VTx.pptx

  26. Nested Paging Lookup: A guest-virtual to host-physical address translation must occur. Assume a TLB miss. In this case, traversal is nested. Consider x86-64: 4 page table references grow to 24. Source: https://research.cs.wisc.edu/multifacet/papers/isca16_agile_paging.pdf
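The figure of 24 follows from counting memory references in a nested walk. A short sketch of that arithmetic, assuming four guest levels and four nested levels and no caching of intermediate translations (the function name is illustrative):

```python
def nested_walk_references(guest_levels: int, nested_levels: int) -> int:
    """Count page table references for one guest-virtual to host-physical translation."""
    # Every guest table pointer is a guest-physical address, so reaching each guest
    # level costs a full nested walk plus one read of the guest entry itself.
    per_guest_level = nested_levels + 1
    # The final guest-physical address of the data needs one more nested walk.
    final_translation = nested_levels
    return guest_levels * per_guest_level + final_translation


print(nested_walk_references(4, 4))  # 24
```

The same count can be written as (g + 1)(n + 1) - 1 for g guest levels and n nested levels, which gives 5 * 5 - 1 = 24 for x86-64, against 4 references for a native walk.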

  27. Today's Studio: Experience using the QEMU emulator and the KVM hypervisor on the Raspberry Pi. Measure and compare the performance of: virtualization vs. emulation; memory-bound workloads in native and virtualized environments. Think about what types of workloads would benefit from shadow paging and what types would benefit from nested paging.

  28. Today's Readings: LKD pages 231-233 and 320-322: a quick review of paging, page tables, and the TLB. Paul Barham, Boris Dragovic, et al. 2003. Xen and the art of virtualization. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP '03). Fabrice Bellard. 2005. QEMU, a fast and portable dynamic translator. In Proceedings of the USENIX Annual Technical Conference (ATEC '05).
