Reviving Reference Counting: A Comprehensive Analysis

Slide Note
Embed
Share

Background garbage collection techniques like tracing and reference counting are crucial in managing memory in different settings. This article delves into the historical context, advantages, disadvantages, and challenges of reference counting in garbage collection. It presents an in-depth analysis of key design choices and optimizations, aiming to eliminate performance overhead and reintroduce reference counting in high-performance environments.


Uploaded on Sep 25, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Down for the Count: Getting Reference Counting Back in the Ring Rifat Shahriyar, Stephen M Blackburn, Daniel Frampton The Australian National University

  2. 52 years ago 2

  3. 3

  4. Today 4

  5. Background Garbage collection is ubiquitous Two ideas underpin large literature: - Tracing [McCarthy60] - Reference Counting [Collins60] However Tracing used in all high performance GCs Reference counting only in non-performance critical settings Why? 5

  6. Background Tracing [McCarthy1960] B A D C E F 6

  7. Background Reference Counting [Collins 1960] 1 1 1 2 1 0 1 2 1 7

  8. Background Reference Counting vs. Tracing Advantages Immediate Incremental Reclaim as-you-go Object-local Overhead distributed Very simple Trivial implementation for na ve RC Disadvantages Maintain count Time and space overheads Cycles Can t be collected Complex High performance implementation about as complex as tracing 8

  9. The Challenge One of the two fundamental GC algorithms Many advantages Neglected by all performance-conscious VMs So how much slower is it? 30% Can we get RC back in the ring? 9

  10. Background Contributions In-depth analysis of key design choices for RC implementations Optimizations that completely eliminate the performance overhead Detailed performance study with other high performance collectors 10

  11. Understanding Reference Counting 11

  12. Understanding Reference Counting Design Space Storing the count A word per object Some available bits Maintaining the count Naive Deferred Coalescing Generational and Age-Oriented Cycles Backup tracing Trial deletion 12

  13. Understanding Reference Counting 1. Storing the Count Space Dedicated word (32 bits) per object Steal bits from each object s header How many bits we actually need? 13

  14. Understanding Reference Counting Maximum Reference Count Distribution 100 compress db mpegaudio jack bloat eclipse hsqldb lusearch pmd xalan jess javac mtrt avrora chart fop luindex pjbb2005 sunflow mean 90 80 percentage of objects 70 60 50 40 30 20 10 0 0 1 2 3 4 5 6 7 8-15 16-31 32-63 > 63 maximum reference count Vast majority of objects have max counts of 7 or less 14

  15. Understanding Reference Counting Overflow 100 95 compress db mpegaudio jack bloat eclipse hsqldb lusearch pmd xalan jess javac mtrt avrora chart fop luindex pjbb2005 sunflow mean 90 85 percentage of overflowed objects 80 75 70 65 60 55 50 45 40 35 30 25 20 15 10 5 0 1 2 3 4 5 number of bits used . But these attract 18% of incs and decs! 4 bits: 0.11% overflow 15

  16. Understanding Reference Counting 2. Maintaining the Count Naive RC requires inc and dec on every pointer mutation Easy to implement only write barriers, no GC maps Deferred RCignores changes to stacks and register Temporary increment used to fake liveness of roots Normal incs and decs deferred until collection time Much faster than naive but requires GC maps Coalescing RC ignores many changes to heap Coalesce pointer mutations record just first and last in chain of changes 16

  17. Understanding Reference Counting Sources of Incs and Decs New objects Mutations to: non-new scalar objects non-new array objects Temporary operations due to root reachability Cycle collection-generated decrements 17

  18. Understanding Reference Counting Increments due to young objects 71% increments for newly allocated objects 18

  19. Understanding Reference Counting Decrements due to young objects 71% decrements for newly allocated objects 19

  20. Understanding Reference Counting Survival ratio Only about 10% survive 20

  21. Improving Reference Counting 21

  22. Improving Reference Counting 1. Storing the Count Findings: Max count < 8 in most cases Very few overflows The percentage of stuck objects is very low Design: Handling stuck objects Auxiliary data structure to store count of the overflowedobjects Ignore them let backup tracing collect stuck objects Restore them let backup tracing restore stuck counts 22

  23. Improving Reference Counting Limited bit strategies heap size = 2x the minimum heap size 5% faster 23

  24. Improving Reference Counting 2. Maintaining the Count Findings: New objects are the source of most incs and decs Survival ratio is low New objects a fruitful focus [Azatchi & Petrank 03] [Blackburn & McKinley 03] 24

  25. Improving Reference Counting Existing Handling of New Objects Implicitly dirty Marked as dirty and enqueued Inc applied to each referent at next collection Implicitly live Initial count of one Dec enqueued, processed at next collection 25

  26. Improving Reference Counting New: Handling of New Objects Implicitly clean Lazily dirty at collection time only if (transitively) inc d by old object or roots Non-surviving objects never processed Implicitly dead Lazily increment at collection time only if (transitively) inc d by old object or roots Available to free list unless they are incremented Minor change to existing RC, not hybrid of two collectors 26

  27. Improving Reference Counting Implicitly Clean & Implicitly Dead heap size = 2x the minimum heap size 16% &19% faster respectively 27

  28. Improving Reference Counting Standard vs. Improved RC heap size = 2x the minimum heap size 24% faster (i.e. standard is 30% slower) 28

  29. Back in the Ring 29

  30. Back in the Ring RC vs. MS heap size = 2x the minimum heap size 30

  31. Back in the Ring RC vs. MS heap size = 2x the minimum heap size New RC MS 31

  32. Back in the Ring RC vs. URC vs. Immix heap size = 2x the minimum heap size 2% slower than URC, 3% slower than Immix 32

  33. Back in the Ring RC vs. Sticky Collectors heap size = 2x the minimum heap size Same as StickyMS and 10% slower than StickyImmix 33

  34. Summary and Conclusion RC is neglected despite many advantages RC 30% slower than tracing when using the same allocator Three optimizations 4 bit count Implicitly clean Implicitly dead Eliminate overhead entirely RC is back in the ring! 34

  35. Questions? 35

Related