Understanding Memory Management in Programming

Slide Note
Embed
Share

Memory management is a critical aspect of programming to prevent issues like memory leaks, use-after-free bugs, and memory corruption. Proper memory allocation and deallocation techniques can improve performance and avoid security vulnerabilities. Learn about heap storage, garbage collection, smart pointers, and more to handle memory effectively.


Uploaded on Oct 01, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. CSE P501 Compiler Construction Conventional Heap Storage Garbage Collection Spring 2014 Jim Hogg - UW - CSE P501 W-1

  2. Conventional Heap Storage C ... char* s = (char*) malloc(50); ... free(s); In Use C Runtime Heap Memory Developer must remember to free memory when no longer required Eventual fragmentation => slow to malloc, slow to free Spring 2014 Jim Hogg - UW - CSE P501 W-2

  3. Heap Storage Fragmentation In Use C Runtime Heap Memory malloc: walk the freelist to find a slot big enough for current request free: adjust freelist; collapse contiguous freespace fragmentation: plenty free chunks but none big enough for request cannot compact the used space - may contain pointers; may be pointed-at Spring 2014 Jim Hogg - UW - CSE P501 W-3

  4. Bugs Forget to free => eventually run out of memory called a "memory leak" Call free, but continue to use! called "use-after-free", or "dangling pointer" memory corruption - wrong answers; crash if lucky! major source of security issues detect via "pool poisoning" 2 pointers free via 1 free malloc; corruption! Spring 2014 Jim Hogg - UW - CSE P501 W-4

  5. Solving Memory Leaks C++ int f() { ... Car c = Car("Chevy"); ... } object c is destructed [calls ~Car() ]automatically when c it falls out of scope called RAII ("Resource Acquisition Is Initialization") more general case still requires Developer to delete c when no longer needed "smart pointers" can help Spring 2014 Jim Hogg - UW - CSE P501 W-5

  6. Solving Leaks and Use-After-Free public static int Main(String[] args) { ... if (...) { ArrayList cars = new ArrayList(); for (int i = 0; i < 1000; i++) { cars.Add(new Car()); } ... } ... Boat b = new Boat(); ... } At point A I have 1000 Car objects allocated on the heap At some later time I ask for more heap memory. May trigger a "garbage collection" - automatic, and silent Compiler realizes that cars is not referenced later by the program; so it's "garbage"; GC recycles it for use by other objects A B Magically solves leaks and use-after-free! W-6

  7. Garbage Collection next Allocate an object; fast! next Allocate more objects; and one more, please? next Spring 2014 Jim Hogg - UW - CSE P501 W-7

  8. Garbage Collection, 1 Allocate another object next "roots" Trace reachable objects next "roots" Compact unreachables; update all pointers next GC does not find garbage: it finds live objects and ignores all other memory Spring 2014 Jim Hogg - UW - CSE P501 W-8

  9. Roots? "Roots" are locations that hold a pointer to any object in the heap: Method-local variables Method Arguments Global variables Class-static fields Registers Spring 2014 Jim Hogg - UW - CSE P501 W-9

  10. GC Start root root Spring 2014 Jim Hogg - UW - CSE P501 W-10

  11. GC Mark Phase root root Unreachable Reachable Spring 2014 Jim Hogg - UW - CSE P501 W-11

  12. GC Sweep Phase root root Reachable With memory free, now allocate space for object that provoked the GC Spring 2014 Jim Hogg - UW - CSE P501 W-12

  13. No Bugs GC removes control over free'ing memory from the hands of the Developer GC will find all live objects and reclaim all other memory each time it is invoked. So no "memory leak"s An object is garbage-collected only if it was unreachable by the program; being unreachable, GC removes "use-after-free" bugs So what's a "memory leak" in C# or Java? - holding on to objects that really could be freed - eg, by setting the object-ref to null. Particularly troublesome in a Generational Garbage Collector Forget to free => eventually run out of memory called a "memory leak" Call free, but continue to use! called "use-after-free" Spring 2014 Jim Hogg - UW - CSE P501 W-13

  14. C# Finalizers class C { private StreamWriter sw; ... ~C() { sw.Close(); } ... } // flush output buffer Class C above defines a "Finalize" method. Syntax is ~C() Syntactic sugar for: protected override void Finalize() As each object of class C is created, the CLR links it onto a Finalization Queue When the object is about to be reclaimed, it is moved onto the Freachable Queue; a Finalizer background thread will run that Finalize method later Spring 2014 Jim Hogg - UW - CSE P501 W-14

  15. Finalization root Finalization Queue root Finalization Queue Freachable Queue Not live; not dead; zombies Spring 2014 Jim Hogg - UW - CSE P501 W-15

  16. Threads Compiler needs to create GC info - at each point in program, where are current roots? - variables, registers GC info is bulky - cannot store for every instruction! So store GC info for selected points in the program - "GC-safepoints" Typical GC-safepoint is a function return Need to stop all threads to carry out a GC For each thread stack-frame, stomp its return address; when it reaches there, the thread will return, not to its caller, but into GC code Spring 2014 Jim Hogg - UW - CSE P501 W-16

  17. Hijacking a Thread Thread continues to run Eventually returns and hits the hijack Save & Stomp retaddr Top of Stack retaddr gc gc gc GC GC GC caller Spring 2014 Jim Hogg - UW - CSE P501 W-17

  18. Threads Run to Their Safepoints Start GC After each thread has hit its safepoint, GC can start its sweep phase Spring 2014 Jim Hogg - UW - CSE P501 W-18

  19. Object Age Distribution Performing GC over large heap consumes lots of cpu time (and trashes the caches!) Experiment reveals: Many objects live a very short time (high "infant mortality") Some object live a very long time Bi-modal number Object Ages age Spring 2014 Jim Hogg - UW - CSE P501 W-19

  20. Generational Garbage Collection Gen0 Gen1 Gen2 'Nursery' Divide heap into 3 areas Allocate new objects from Gen0 At next GC move all Gen0 survivors into Gen1 move all Gen1 survivors into Gen2 Spring 2014 Jim Hogg - UW - CSE P501 W-20

  21. Migration thru Generations Gen0 Gen1 Gen2 Survivors Gen0 Gen1 Gen2 Gen0 Gen1 Gen2 Gen0 0 Gen1 Gen2 Spring 2014 Jim Hogg - UW - CSE P501 W-21

  22. Generational GC Garbage-collecting entire heap is expensive A process of diminishing returns Instead, Garbage-Collect only the nursery (Gen0) Smaller, so faster Yields lots of free space (objects die young) Maximum 'bang for the buck' Sometime collect Gen0+Gen1 When the going gets tough, collect Gen0+Gen1+Gen2 ("full" collection) Some old objects may become unreachable - just ignore them in a Gen0 collect Above picture assumes no writes to old objects kept Gen0 objects reachable Spring 2014 Jim Hogg - UW - CSE P501 W-22

  23. Card Table In practice, older objects are written-to between GCs Might update a pointer that results in keeping young object alive; missing this is a disaster Gen0 Gen1 Gen2 Gen1 Card Table Gen2 Card Table Card Tables: bitmap, with 1 bit per 128 Bytes of Gen2 heap Create a "write barrier" - any write into Gen1 or Gen2 heap sets corresponding bit in Card Table During GC sweep, check Card Tables for any "keep alive" pointers In practice, Card Tables contain few 1 bits Spring 2014 Jim Hogg - UW - CSE P501 W-23

  24. Debugging the GC What if a GC goes wrong? Fails to free some memory that is unreachable => memory leak Frees some memory that is still reachable - impending disaster Frees some non-objects - disaster Colloquially called a GC 'hole' Causes Bug in GC - mark, sweep, compaction Bug in GC info Symptoms Wrong answers Process Crash If we're 'lucky' failure comes soon after error Otherwise, failure may occur seconds, minutes or hours later Testing GC-Stress eg: maximally randomize the heap; force frequent GCs; check results Spring 2014 Jim Hogg - UW - CSE P501 W-24

  25. Further Design Complexities Objects with Finalizers: take 2 GCs to die extend the lifetime of other connected objects run at some later time => "non-deterministic" finalization no guarantee on order that finalizers runs uncaught exceptions escape and disappear! Strong refs Weak refs - "short" and "long" "Dispose" pattern Resurrection Large Object Heap (objects > 85 kB in size) Concurrent Garbage Collection Per-processor heaps; processor affinity Fully-interruptible GC Safe-point on back-edge of loop Spring 2014 Jim Hogg - UW - CSE P501 W-25

  26. References for CLR GC Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework by Jeffrey Richter Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework - by Jeffrey Richter Garbage Collector Basics and Performance Hints by Rico Mariani Using GC Efficiently Part 1 by Maoni Using GC Efficiently Part 2 by Maoni Using GC Efficiently Part 3 by Maoni Using GC Efficiently Part 4 by Maoni GC Performance Counters - by Maoni Tools that help diagnose managed memory related issues by Maoni Clearing up some confusion over finalization and other areas in GC by Maoni Spring 2014 Jim Hogg - UW - CSE P501 W-26

  27. And a bit of perspective Automatic GC has been around since LISP I in 1958 Ubiquitous in functional and object-oriented programming communities for decades Mainstream since Java(?) (mid-90s) Now conventional? - nope! Specialized patterns of allocate/free are still better coded by- hand - eg "Arena" storage inside compiler optimization phases Spring 2014 Jim Hogg - UW - CSE P501 W-27

Related