Extensible Shape Analysis: Designing with the User in Mind
The article discusses the challenges in data structure analysis and proposes a user-centric approach for designing program analyses. By using developer-written testing code as specifications, the method aims to enhance precision and efficiency in inferring data structure properties. Various contributions and examples of shape analysis by removing duplicates are also highlighted.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Extensible Shape Analysis by Designing with the User in Mind Bor-Yuh Evan Chang, Xavier Rival, and George Necula University of California, Berkeley OSQ Retreat May 16, 2008
Motivation Analyses find many kinds of bugs For example, read( ); acquire( ); Reading from a closed file: Reacquiring a locked lock: But often struggle when objects are put into data structures Shape Analysis is Data Structure Analysis Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 2
Whats hard about data structures? Abstraction too coarse or not precise enough (e.g., lost x is always unlocked) code // x now points to an unlocked lock in a linked list ideal analysis state analysis state ? or or or x x x x acquire(x); code For decidability, must abstract (e.g., merge objects) mislabels good code as buggy Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 3
To address the precision challenge Traditional program analysis mentality: Why can t developers write more specifications for our analysis? Then, we could verify so much more. Since developers won t write specifications, we will use default abstractions (perhaps coarse) that work hopefully most of the time. Our approach: Can we design program analyses around the user? Developers write testing code. Can we adapt the analysis to use those as specifications? Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 4
Overview of contributions Precise inference of data structure properties Able to check, for instance, the locking example Targeted to software developers Uses data structure checking code for guidance Turns testing code into a specification for static analysis Efficient Builds abstraction out of developer-supplied checking code Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 5
Shape analysis by example: Removing duplicates Example/Testing Code Review/Static Analysis 2 2 4 4 l sorted dl list l // l is a sorted doubly-linked list for each node cur inlist l { remove cur if duplicate; program-specific intermediate state more complicated 2 4 4 l segment with no duplicates sorted dl list l } assert l is sorted, doubly-linked with no duplicates; 2 4 cur cur l no duplicates l Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 6
Shape analysis is not yet practical Choosing the heap abstraction difficult for precision Traditional approaches: Parametric in low-level, analyzer-oriented predicates + Very general and expressive - Hard for non-expert 89 TVLA [Sagiv et al.] Built-in high-level predicates - Hard to extend + No additional user effort (if precise enough) Space Invader [Distefano et al.] Our approach: Parametric in high-level, developer-oriented predicates + Extensible + Targeted to developers Xisa Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 7
Key insight for being developer-friendly and efficient Utilize run-time checking code as specification for static analysis. Contribution: Build the abstraction for analysis out of developer-specified checking code dll(h, p) = if (h =null) then true else h!prev = p and dll(h!next, h) checker p specifies where prev should point assert(sorted_dll(l, )); l for each node cur inlist l { remove cur if duplicate; Contribution: Automatically generalize checkers for complicated intermediate states l cur } assert(sorted_dll_nodup(l, )); l Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 8
Our framework is An automated shape analysis with a precise memory abstraction based around invariant checkers. dll(h, p) = if (h =null) then true else h!prev = prev and dll(h!next, h) checkers shape analyzer Extensible and targeted for developers Parametric in developer-supplied checkers Precise yet compact abstraction for efficiency Data structure-specific based on properties of interest to the developer Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 9
Shape analysis is an abstract interpretation on memory descriptions with Splitting of summaries (materialization) l l cur cur To reflect updates precisely (strong updates) l l cur cur And summarizing for termination (widening) l l cur cur Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 10
Outline Learn information about the checker to use it as an abstraction 1 splitting and interpreting update 2 type pre-analysis on checker definitions dll(h, p) = if (h =null) then true else h!prev = prev and dll(h!next, h) summarizing checkers abstract interpretation shape analyzer Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 11
Overview: Split summaries to interpret updates precisely Want abstract update to be exact , that is, to update one concrete memory cell . The example at a high-level: iterate using cur changing the doubly-linked list from purple to red. l Challenge: How does the analysis split summaries and know where to split ? split at cur cur l update cur purple to red cur l l cur cur Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 12
Split summaries by unfolding induction dll segment l to cur dll(cur, p) l p cur get: cur!next Technical Details: What about unfolding segments? How are segments unfolded? (Key: Segments are also inductively defined) Analysis doesn t forget the empty case [POPL 08] l dll segment l to cur null dll(h, p) = if (h =null) then true else h!prev = p and dll(h!next, h) cur How does the analysis know to do unfolding? dll segment l to cur dll(n, cur) l p n cur Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 13
Outline How do we decide where to unfold? Derives additional information to guide unfolding 1 splitting and interpreting update 2 type pre-analysis on checker definitions dll(h, p) = if (h =null) then true else h!prev = prev and dll(h!next, h) summarizing checkers abstract interpretation Contribution: Turns testing code into specification for static analysis shape analyzer Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 14
Types for deciding where to unfold Summary If it exists, where is: cur!next ? l dll segment l to cur dll(cur, p) p 0 cur Instance -1 p!next ? l p n cur Checker Definition Checker Run (call tree/derivation) h : {nexth0i, prevh0i } p : {nexth-1i, prevh-1i } dll(l,null) -2 dll(p,l) dll(h, p) = if (h =null) then true else h!prev = p and dll(h!next, h) -1 Says: For h!next/h!prev, unfold from h For p!next/p!prev, unfold before h dll(cur,p) 0 dll(n,cur) 1 dll(null,n) Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 15
Types make the analysis robust with respect to how checkers are written h : {nexth0i, prevh0i } p : {nexth-1i, prevh-1i } Doubly-linked list checker (as before) Instance dll(h, p) = if (h =null) then true else h!prev = p and dll(h!next, h) cur Alternative doubly-linked list checker h : {nexth0i, prevh-1i } dll0(h) = if (h!next =null) then true else h!next!prev = h and dll0(h!next) Instance Different types for different unfolding cur cur!prev ? -1 Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 16
Summary of checker parameter types Tell where to unfold for which fields Make analysis robust with respect to how checkers are written Learn where in summaries unfolding won t help Can be inferred automatically with a fixed- point computation on the checker definitions Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 17
Times negligible for data structure operations (often in sec or 1/10 sec) Results: Performance Expressiveness: Different data structures Max. Num. Graphs at a Program Pt 1 1 2 5 5 5 5 6 4 Analysis Time (ms) Benchmark singly-linked list reverse doubly-linked list reverse doubly-linked list copy doubly-linked list remove doubly-linked list remove and back search tree with parent insert search tree with parent insert and back two-level skip list rebalance Linux scull driver (894 loc) (char arrays ignored, functions inlined) TVLA: 290 ms 0.6 1.4 5.3 6.5 6.8 8.3 47.0 87.0 Space Invader only analyzes lists (built-in) TVLA: 850 ms 9710.0 Verified shape invariant as given by the checker is preserved across the operation. Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 18
Conclusion Key Insight: Checkers as specifications Developer View: Global, Expressed in a familiar style Analysis View: Capture developer intent, Not arbitrary inductive definitions Constructing the program analysis Intermediate states: Generalized segment predicates dll segment l to cur Splitting: Checker parameter types with levels h : {nexth0i, prevh0i} p : {nexth-1i, prevh-1i} Bor-Yuh Evan Chang - Extensible Shape Analysis by Designing with the User in Mind 19
What can inductive shape analysis do for you? http://xisa.cs.berkeley.edu