Understanding Program Analysis with Set Constraints
Explore the concept of program analysis with set constraints, delving into techniques like set-variable-based analysis, constant propagation, and constraint graphs. Learn about term constraints, additional implicit constraints, and function calls in the context of set-constraint based analysis. Gain insights into how constraint resolution plays a key role in determining program variables and values using set constraints.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Program Analysis with Set Constraints Ravi Chugh
Set-constraint based analysis Another technique for computing information about program variables Phase 1: constraint generation Create set variables corresponding to program Add inclusion constraints between these sets Usually a local, syntax-directed process (ASTs vs CFGs) Phase 2: constraint resolution Solve for values of all set variables Extends naturally to inter-procedural analysis
Constant propagation Want to determine whether x and y are constant values when they are used int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } We will build a flow- insensitive analysis
Set constraints Terms t := c (constant) (set variable) (constructed term) | X | C(t1,...,tn) Constraints t1 t2 Constructors C(v1,...,vn) is an n-arg ctor C with variances vi viis either + (covariant) or (contravariant) Covariance corresponds to forwards flow Contravariance corresponds to backwards flow (set inclusion)
Additional constraints Implicit constraints added by following rules: 1) Transitivity if t1 t2and t2 t3then t1 t3 2) Variance through constructed terms if C(...,ti,...) C(...,ui,...) then ti uifor covariant positions of C ui tifor contravariant positions of C
Constraint graphs 1 X 1 X X Y Y B C A Ctor Ctor(A,B,C) Ctor(D,E,F) where Ctor(+,-,+) E F D Ctor
Function calls Define ctor Fun(-,+) for one input/one output To encode a function def/call: int z = id(2); Fun(i,r) id Fun(2,z) By contravariance, the actual 2 flows to i By covariance, the return value of id flows to z r i Fun id z 2 Fun
int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... }
int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... }
Fun(i,r1) abs int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } i Fun r1 abs
Fun(i,r1) abs int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } i Fun r1 abs
Fun(i,r1) abs i r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } i Fun r1 abs
Fun(i,r1) abs i r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } i Fun r1 abs
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } T i Fun r1 abs
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } T i Fun r1 abs
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id T i j Fun Fun r1 r2 abs id
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id T i j Fun Fun r1 r2 abs id
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 T i j Fun Fun r1 r2 abs id
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 T i j Fun Fun r1 r2 abs id
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 1 a 2 b T i j Fun Fun r1 r2 abs id a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 1 a 2 b T i j Fun Fun r1 r2 abs id a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 1 a 2 b abs Fun(a,x) T i j Fun Fun r1 r2 abs id x Fun a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 1 a 2 b abs Fun(a,x) T i j Fun Fun r1 r2 abs id x Fun a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 1 a 2 b abs Fun(a,x) id Fun(b,y) T i j Fun Fun r1 r2 abs id x y Fun Fun a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 ?? x ?? y 1 a 2 b abs Fun(a,x) id Fun(b,y) T i j Fun Fun r1 r2 abs id x y Fun Fun a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 {1,T} x ?? y 1 a 2 b abs Fun(a,x) id Fun(b,y) T i j Fun Fun r1 r2 abs id x y Fun Fun a b 1 2
Fun(i,r1) abs i r1 T r1 int abs(int i) { if (...) { return i; } else { return i; } } int id(int j) { return j; } void main() { int a = 1, b = 2; int x = abs(a); int y = id(b); ... use x ... ... use y ... } Fun(j,r2) id j r2 {1,T} x {2} y 1 a 2 b abs Fun(a,x) id Fun(b,y) T i j Fun Fun r1 r2 abs id x y Fun Fun a b 1 2
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 int i = 1; int *p = &i; *p = 2; int j = *p;
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 Ref int i = 1; int *p = &i; *p = 2; int j = *p;
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 Ref int i = 1; int *p = &i; *p = 2; int j = *p; p
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 Ref int i = 1; int *p = &i; *p = 2; int j = *p; p 2 Ref
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 Ref int i = 1; int *p = &i; *p = 2; int j = *p; p j Ref 2 Ref
Pointers Handle pointers with a Ref(-,+) constructor Two args correspond to set and get operations i 1 Ref int i = 1; int *p = &i; *p = 2; int j = *p; p j Ref 2 Ref
More on functions This encoding supports higher-order functions Passing around Fun terms just like constants Function pointers also work int (*funcPtr)(int); int id(int j) { return j }; funcPtr = &id; int x = (*funcPtr)(0); j Fun id
More on functions This encoding supports higher-order functions Passing around Fun terms just like constants Function pointers also work int (*funcPtr)(int); int id(int j) { return j }; funcPtr = &id; int x = (*funcPtr)(0); j Fun id id Ref
More on functions This encoding supports higher-order functions Passing around Fun terms just like constants Function pointers also work int (*funcPtr)(int); int id(int j) { return j }; funcPtr = &id; int x = (*funcPtr)(0); j Fun id Ref funcPtr
More on functions This encoding supports higher-order functions Passing around Fun terms just like constants Function pointers also work int (*funcPtr)(int); int id(int j) { return j }; funcPtr = &id; int x = (*funcPtr)(0); j Fun id Ref funcPtr Ref x 0 Fun
Context (in)sensitivity Multiple call sites int x = id(1); int y = id(2); {1,2} x{1,2} y r id j Fun x y 1 2 Fun Fun
Context sensitivity Multiple call sites int x = id1(1); int y = id2(2); Option 1: Specialization Each call idi gets a new copy of id {1} x {2} y r2 Fun j2 r1 Fun j1 id2 id1 y 2 Fun x 1 Fun Eliminates smearing but increases graph size
Context sensitivity Option 2: Unique labeled edges for each call site Not using Fun constructor j r [1 ]1 [2 ]2 1 x 2 y There is flow only if there is a path that spells a substring of a well-bracketed string [a[b]b]a and [a]a[b are valid; [a[b]a]b is not For both options, if there are higher-order functions or function pointers, need a first pass to compute pointer targets
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); r j Fun id o.f o.g Fldf Fldg o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); r j Fun id o.f o.g Fldf Fldg o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); Fldg r j Fun id r3 p Fun o.f o.g Fldf Fldg readG o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); Fldg r j Fun id r3 p Fun o.f o.g Fldf Fldg w Fun readG Fldf o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); Fldg r j Fun id r3 p Fun o.f o.g Fldf Fldg w Fun readG Fldf o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); Fldg r j Fun id r3 p Fun o.f o.g Fldf Fldg w Fun readG z Fun Fldf o 3 Fldf 4 Fldg
Field sensitivity For each field f, define Fldf(-,+)constructor obj o = { f:3; g:4 }; int readG(obj p) { return p.g; } int w = id(o.f); int z = readG(o); Fldg r j Fun id r3 p Fun o.f o.g Fldf Fldg w Fun readG z Fun Fldf o 3 Fldf 4 Fldg
Scalability Constraint graph for entire program is in memory Even for flow-insensitive analyses, this can become a bottleneck Even worse for flow-sensitive analyses Techniques for analyzing parts of program in isolation and storing summaries of their observable effects
Summary Set constraints a natural way to express various program analyses Constant propagation, pointer analysis Closure analysis Receiver class analysis, prototype-based inheritance Information flow Rich literature on solving systems of constraints Non-trivial to extend to flow-sensitive or summary-based analyses Interference between functions and references