Memory Management in Programming

Spring 2014
Jim Hogg - UW - CSE P501
W-1
CSE P501 – Compiler Construction
Conventional Heap Storage
Garbage Collection
Spring 2014
Jim Hogg - UW - CSE P501
W-2
...
char* s = (char*) 
malloc(50
);
...
free(s
);
C
C Runtime Heap Memory
Developer must remember to free memory when no longer required
Eventual fragmentation => slow to 
malloc
, slow to 
free
In Use
Conventional Heap Storage
Spring 2014
Jim Hogg - UW - CSE P501
W-3
Heap Storage Fragmentation
C Runtime Heap Memory
In Use
malloc
: walk the freelist to find a slot big enough for current request
free
: adjust freelist; collapse contiguous freespace
fragmentation: plenty free chunks but none big enough for request
cannot compact the used space - may contain pointers; may be pointed-at
Spring 2014
Jim Hogg - UW - CSE P501
W-4
Bugs
Forget to 
free
 => eventually run out of memory
called a "memory leak"
Call 
free
, but continue to use!
called "use-after-free", or "dangling pointer"
memory corruption - wrong answers; crash if lucky!
major source of security issues
detect via "pool poisoning"
2 pointers
free via 1
free malloc;
corruption!
Spring 2014
Jim Hogg - UW - CSE P501
W-5
Solving Memory Leaks
int f() {
   ...
   Car c = Car("Chevy");
   ...
}
C++
object 
c
 is 
destructed
 [calls 
~Car()
 ]automatically when 
c
 it falls out of scope
called RAII ("Resource Acquisition Is Initialization")
more general case still requires Developer to 
delete c
 when no longer needed
"smart pointers" can help
W-6
Solving Leaks and Use-After-Free
public static int Main(String[] args) {
   ...
   if (...) {
      ArrayList cars = new ArrayList();
      for (int i = 0; i < 1000; i++) {
         cars.Add(new Car());
      }
      ...
   }
   ...
   Boat b = new Boat();
   ...
}
A
At point 
A
 I have 1000 
Car
 objects allocated on the heap
At some later time I ask for more heap memory.  May trigger a
"garbage collection" - automatic, and silent
Compiler realizes that 
cars
 is not referenced later by the program;
so it's "garbage"; GC recycles it for use by other objects
Magically solves leaks 
and 
use-after-free!
B
Spring 2014
Jim Hogg - UW - CSE P501
W-7
next
next
Allocate an object; fast!
next
Allocate more objects;
and one more, please?
Garbage Collection
Spring 2014
Jim Hogg - UW - CSE P501
W-8
Garbage Collection, 1
Allocate another object
next
next
"roots"
Trace reachable objects
next
"roots"
Compact unreachables;
update 
all
 pointers
GC does not find garbage: it finds live objects and ignores all other memory
Spring 2014
Jim Hogg - UW - CSE P501
W-9
Roots?
"Roots" are locations that hold a pointer to
any object in the heap:
Method-local variables
Method Arguments
Global variables
Class-static fields
Registers
Spring 2014
Jim Hogg - UW - CSE P501
W-10
GC Start
root
root
Spring 2014
Jim Hogg - UW - CSE P501
W-11
GC Mark Phase
root
root
Unreachable
Reachable
Spring 2014
Jim Hogg - UW - CSE P501
W-12
GC Sweep Phase
root
root
Reachable
With memory free, now allocate space for object that provoked the GC
Spring 2014
Jim Hogg - UW - CSE P501
W-13
No Bugs
Forget to 
free
 => eventually run out of memory
called a "memory leak"
Call 
free
, but continue to use!
called "use-after-free"
GC removes control over free'ing memory from the hands of the Developer
GC will find all live objects and reclaim all other memory each time it is
invoked.  So no "memory leak"s
An object is garbage-collected only if it was unreachable by the program;
being unreachable, GC removes "use-after-free" bugs
So what's a "memory leak" in C# or Java? - holding on to objects that really
could be freed - eg, by setting the object-ref to null.  Particularly troublesome
in a 
Generational
 Garbage Collector
Spring 2014
Jim Hogg - UW - CSE P501
W-14
C# Finalizers
Class 
C
 above defines a "Finalize" method.  Syntax is 
~C()
Syntactic sugar for: 
protected override void Finalize()
As each object of class 
C
 is created, the CLR links it onto a 
Finalization
 Queue
When the object is about to be reclaimed, it is moved onto the 
Freachable
Queue; a Finalizer background thread will run that 
Finalize
 method later
class C {
   private StreamWriter sw;
   ...
   ~C() { sw.Close(); }
 
// flush output buffer
   ...
}
Spring 2014
Jim Hogg - UW - CSE P501
W-15
Finalization
root
Finalization
Queue
root
Finalization
Queue
Freachable
Queue
Not live; not dead; zombies
Spring 2014
Jim Hogg - UW - CSE P501
W-16
Threads
Compiler needs to create GC info - at each point in program, where are
current roots? - variables, registers
GC info is bulky - cannot store for every instruction!
So store GC info for selected points in the program - "GC-safepoints"
Typical GC-safepoint is a function return
Need to stop all threads to carry out a GC
For each thread stack-frame, stomp its return address; when it reaches
there, the thread will return, not to its caller, but into GC code
Spring 2014
Jim Hogg - UW - CSE P501
W-17
Hijacking a Thread
retaddr
gc
caller
GC
gc
GC
gc
GC
Top of Stack
Save & Stomp
retaddr
Thread continues
to run
Eventually returns
and hits the
hijack
Spring 2014
Jim Hogg - UW - CSE P501
W-18
Threads Run to Their Safepoints
Start GC
After each thread has hit its safepoint, GC can start its sweep phase
Spring 2014
Jim Hogg - UW - CSE P501
W-19
Object Age Distribution
Performing GC over large heap consumes lots of cpu time (and trashes
the caches!)
Experiment reveals:
Many objects live a very short time (high "infant mortality")
Some object live a very long time
Bi-modal
number
age
Object Ages
Spring 2014
Jim Hogg - UW - CSE P501
W-20
Generational Garbage Collection
Gen0
Gen2
Gen1
'Nursery'
Divide heap into 3 areas
Allocate new objects from Gen0
At next GC
move all Gen0 survivors into Gen1
move all Gen1 survivors into Gen2
Spring 2014
Jim Hogg - UW - CSE P501
W-21
Migration thru Generations
Gen0
Gen2
Gen1
Gen0
Gen2
Gen1
Gen0 
0
Gen2
Gen1
Survivors
Gen0
Gen2
Gen1
Spring 2014
Jim Hogg - UW - CSE P501
W-22
Generational GC
Garbage-collecting entire heap is expensive
A process of diminishing returns
Instead, Garbage-Collect only the nursery (Gen0)
Smaller, so faster
Yields lots of free space (objects die young)
Maximum 'bang for the buck'
Sometime collect Gen0+Gen1
When the going gets tough, collect Gen0+Gen1+Gen2
("full" collection)
Some old objects may become unreachable - just ignore them in a Gen0 collect
Above picture 
assumes
 no writes to old objects kept Gen0 objects reachable
Spring 2014
Jim Hogg - UW - CSE P501
W-23
Card Table
In practice, older objects are written-to between GCs
Might update a pointer that results in keeping young object alive; missing this is
a disaster
Gen0
Gen2
Gen1
Card Tables: bitmap, with  1 bit per 128 Bytes of Gen2 heap
Create a "write barrier" - any write into Gen1 or Gen2 heap sets corresponding
bit in Card Table
During GC sweep, check Card Tables for any "keep alive" pointers
In practice, Card Tables contain few 1 bits
Gen2 Card Table
Gen1 Card Table
Spring 2014
Jim Hogg - UW - CSE P501
W-24
Debugging the GC
What if a GC goes wrong?
Fails to free some memory that is unreachable => memory leak
Frees some memory that is still reachable - impending disaster
Frees some non-objects - disaster
Colloquially called a GC 'hole'
Causes
Bug in GC - mark, sweep, compaction
Bug in GC info
Symptoms
Wrong answers
Process Crash
If we're 'lucky' failure comes soon after error
Otherwise, failure may occur seconds, minutes or hours later
Testing
GC-Stress
eg: maximally randomize the heap; force frequent GCs; check results
Spring 2014
Jim Hogg - UW - CSE P501
W-25
Further Design Complexities
Objects with Finalizers:
take 2 GCs to die
extend the lifetime of other connected objects
run at some later time => "non-deterministic" finalization
no guarantee on order that finalizers runs
uncaught exceptions escape and disappear!
Strong refs
Weak refs - "short" and "long"
"Dispose" pattern
Resurrection
Large Object Heap (objects > 85 kB in size)
Concurrent Garbage Collection
Per-processor heaps; processor affinity
Fully-interruptible GC
Safe-point on back-edge of loop
Spring 2014
Jim Hogg - UW - CSE P501
W-26
References for CLR GC
Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework – by Jeffrey Richter
Garbage Collection—Part 2: Automatic Memory Management in the Microsoft .NET Framework - by Jeffrey
Richter
Garbage Collector Basics and Performance Hints – by Rico Mariani
Using GC Efficiently – Part 1 by Maoni
Using GC Efficiently – Part 2 by Maoni
Using GC Efficiently – Part 3 by Maoni
Using GC Efficiently – Part 4 by Maoni
GC Performance Counters - by Maoni
Tools that help diagnose managed memory related issues – by Maoni
Clearing up some confusion over finalization and other areas in GC – by Maoni
And a bit of perspective…
Automatic GC has been around since LISP I in 1958
Ubiquitous in functional and object-oriented programming
communities for decades
Mainstream since Java(?) (mid-90s)
Now conventional? - nope!
Specialized patterns of allocate/free are still better coded by-
hand - eg "Arena" storage inside compiler optimization phases
Spring 2014
Jim Hogg - UW - CSE P501
W-27
Slide Note
Embed
Share

Memory management is a critical aspect of programming to prevent issues like memory leaks, use-after-free bugs, and memory corruption. Proper memory allocation and deallocation techniques can improve performance and avoid security vulnerabilities. Learn about heap storage, garbage collection, smart pointers, and more to handle memory effectively.

  • Memory Management
  • Programming
  • Heap Storage
  • Garbage Collection
  • Smart Pointers

Uploaded on Oct 01, 2024 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. CSE P501 Compiler Construction Conventional Heap Storage Garbage Collection Spring 2014 Jim Hogg - UW - CSE P501 W-1

  2. Conventional Heap Storage C ... char* s = (char*) malloc(50); ... free(s); In Use C Runtime Heap Memory Developer must remember to free memory when no longer required Eventual fragmentation => slow to malloc, slow to free Spring 2014 Jim Hogg - UW - CSE P501 W-2

  3. Heap Storage Fragmentation In Use C Runtime Heap Memory malloc: walk the freelist to find a slot big enough for current request free: adjust freelist; collapse contiguous freespace fragmentation: plenty free chunks but none big enough for request cannot compact the used space - may contain pointers; may be pointed-at Spring 2014 Jim Hogg - UW - CSE P501 W-3

  4. Bugs Forget to free => eventually run out of memory called a "memory leak" Call free, but continue to use! called "use-after-free", or "dangling pointer" memory corruption - wrong answers; crash if lucky! major source of security issues detect via "pool poisoning" 2 pointers free via 1 free malloc; corruption! Spring 2014 Jim Hogg - UW - CSE P501 W-4

  5. Solving Memory Leaks C++ int f() { ... Car c = Car("Chevy"); ... } object c is destructed [calls ~Car() ]automatically when c it falls out of scope called RAII ("Resource Acquisition Is Initialization") more general case still requires Developer to delete c when no longer needed "smart pointers" can help Spring 2014 Jim Hogg - UW - CSE P501 W-5

  6. Solving Leaks and Use-After-Free public static int Main(String[] args) { ... if (...) { ArrayList cars = new ArrayList(); for (int i = 0; i < 1000; i++) { cars.Add(new Car()); } ... } ... Boat b = new Boat(); ... } At point A I have 1000 Car objects allocated on the heap At some later time I ask for more heap memory. May trigger a "garbage collection" - automatic, and silent Compiler realizes that cars is not referenced later by the program; so it's "garbage"; GC recycles it for use by other objects A B Magically solves leaks and use-after-free! W-6

  7. Garbage Collection next Allocate an object; fast! next Allocate more objects; and one more, please? next Spring 2014 Jim Hogg - UW - CSE P501 W-7

  8. Garbage Collection, 1 Allocate another object next "roots" Trace reachable objects next "roots" Compact unreachables; update all pointers next GC does not find garbage: it finds live objects and ignores all other memory Spring 2014 Jim Hogg - UW - CSE P501 W-8

  9. Roots? "Roots" are locations that hold a pointer to any object in the heap: Method-local variables Method Arguments Global variables Class-static fields Registers Spring 2014 Jim Hogg - UW - CSE P501 W-9

  10. GC Start root root Spring 2014 Jim Hogg - UW - CSE P501 W-10

  11. GC Mark Phase root root Unreachable Reachable Spring 2014 Jim Hogg - UW - CSE P501 W-11

  12. GC Sweep Phase root root Reachable With memory free, now allocate space for object that provoked the GC Spring 2014 Jim Hogg - UW - CSE P501 W-12

  13. No Bugs GC removes control over free'ing memory from the hands of the Developer GC will find all live objects and reclaim all other memory each time it is invoked. So no "memory leak"s An object is garbage-collected only if it was unreachable by the program; being unreachable, GC removes "use-after-free" bugs So what's a "memory leak" in C# or Java? - holding on to objects that really could be freed - eg, by setting the object-ref to null. Particularly troublesome in a Generational Garbage Collector Forget to free => eventually run out of memory called a "memory leak" Call free, but continue to use! called "use-after-free" Spring 2014 Jim Hogg - UW - CSE P501 W-13

  14. C# Finalizers class C { private StreamWriter sw; ... ~C() { sw.Close(); } ... } // flush output buffer Class C above defines a "Finalize" method. Syntax is ~C() Syntactic sugar for: protected override void Finalize() As each object of class C is created, the CLR links it onto a Finalization Queue When the object is about to be reclaimed, it is moved onto the Freachable Queue; a Finalizer background thread will run that Finalize method later Spring 2014 Jim Hogg - UW - CSE P501 W-14

  15. Finalization root Finalization Queue root Finalization Queue Freachable Queue Not live; not dead; zombies Spring 2014 Jim Hogg - UW - CSE P501 W-15

  16. Threads Compiler needs to create GC info - at each point in program, where are current roots? - variables, registers GC info is bulky - cannot store for every instruction! So store GC info for selected points in the program - "GC-safepoints" Typical GC-safepoint is a function return Need to stop all threads to carry out a GC For each thread stack-frame, stomp its return address; when it reaches there, the thread will return, not to its caller, but into GC code Spring 2014 Jim Hogg - UW - CSE P501 W-16

  17. Hijacking a Thread Thread continues to run Eventually returns and hits the hijack Save & Stomp retaddr Top of Stack retaddr gc gc gc GC GC GC caller Spring 2014 Jim Hogg - UW - CSE P501 W-17

  18. Threads Run to Their Safepoints Start GC After each thread has hit its safepoint, GC can start its sweep phase Spring 2014 Jim Hogg - UW - CSE P501 W-18

  19. Object Age Distribution Performing GC over large heap consumes lots of cpu time (and trashes the caches!) Experiment reveals: Many objects live a very short time (high "infant mortality") Some object live a very long time Bi-modal number Object Ages age Spring 2014 Jim Hogg - UW - CSE P501 W-19

  20. Generational Garbage Collection Gen0 Gen1 Gen2 'Nursery' Divide heap into 3 areas Allocate new objects from Gen0 At next GC move all Gen0 survivors into Gen1 move all Gen1 survivors into Gen2 Spring 2014 Jim Hogg - UW - CSE P501 W-20

  21. Migration thru Generations Gen0 Gen1 Gen2 Survivors Gen0 Gen1 Gen2 Gen0 Gen1 Gen2 Gen0 0 Gen1 Gen2 Spring 2014 Jim Hogg - UW - CSE P501 W-21

  22. Generational GC Garbage-collecting entire heap is expensive A process of diminishing returns Instead, Garbage-Collect only the nursery (Gen0) Smaller, so faster Yields lots of free space (objects die young) Maximum 'bang for the buck' Sometime collect Gen0+Gen1 When the going gets tough, collect Gen0+Gen1+Gen2 ("full" collection) Some old objects may become unreachable - just ignore them in a Gen0 collect Above picture assumes no writes to old objects kept Gen0 objects reachable Spring 2014 Jim Hogg - UW - CSE P501 W-22

  23. Card Table In practice, older objects are written-to between GCs Might update a pointer that results in keeping young object alive; missing this is a disaster Gen0 Gen1 Gen2 Gen1 Card Table Gen2 Card Table Card Tables: bitmap, with 1 bit per 128 Bytes of Gen2 heap Create a "write barrier" - any write into Gen1 or Gen2 heap sets corresponding bit in Card Table During GC sweep, check Card Tables for any "keep alive" pointers In practice, Card Tables contain few 1 bits Spring 2014 Jim Hogg - UW - CSE P501 W-23

  24. Debugging the GC What if a GC goes wrong? Fails to free some memory that is unreachable => memory leak Frees some memory that is still reachable - impending disaster Frees some non-objects - disaster Colloquially called a GC 'hole' Causes Bug in GC - mark, sweep, compaction Bug in GC info Symptoms Wrong answers Process Crash If we're 'lucky' failure comes soon after error Otherwise, failure may occur seconds, minutes or hours later Testing GC-Stress eg: maximally randomize the heap; force frequent GCs; check results Spring 2014 Jim Hogg - UW - CSE P501 W-24

  25. Further Design Complexities Objects with Finalizers: take 2 GCs to die extend the lifetime of other connected objects run at some later time => "non-deterministic" finalization no guarantee on order that finalizers runs uncaught exceptions escape and disappear! Strong refs Weak refs - "short" and "long" "Dispose" pattern Resurrection Large Object Heap (objects > 85 kB in size) Concurrent Garbage Collection Per-processor heaps; processor affinity Fully-interruptible GC Safe-point on back-edge of loop Spring 2014 Jim Hogg - UW - CSE P501 W-25

  26. References for CLR GC Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework by Jeffrey Richter Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework - by Jeffrey Richter Garbage Collector Basics and Performance Hints by Rico Mariani Using GC Efficiently Part 1 by Maoni Using GC Efficiently Part 2 by Maoni Using GC Efficiently Part 3 by Maoni Using GC Efficiently Part 4 by Maoni GC Performance Counters - by Maoni Tools that help diagnose managed memory related issues by Maoni Clearing up some confusion over finalization and other areas in GC by Maoni Spring 2014 Jim Hogg - UW - CSE P501 W-26

  27. And a bit of perspective Automatic GC has been around since LISP I in 1958 Ubiquitous in functional and object-oriented programming communities for decades Mainstream since Java(?) (mid-90s) Now conventional? - nope! Specialized patterns of allocate/free are still better coded by- hand - eg "Arena" storage inside compiler optimization phases Spring 2014 Jim Hogg - UW - CSE P501 W-27

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#