Dataflow Analysis in Program Optimization

1
Dataflow Analysis
 
 
 
 
2
Overview
To perform optimizations like constant propagation or
dead code elimination, we must
Analyze program to find opportunities for performing
optimizations safely
Transform program
Analysis is called 
dataflow analysis
Model values of interest as domain
Formulate system of fixpoint equations whose unknowns are
the required information at different points in program
Solve system of equations
Later we will study other types of program analyses
Some concepts
A statement is a
definition
 of a variable
v if it may write to v.
A statement is a 
use
 of
variable v if it may read
from v.
A variable v is 
live
 at a
point p in a CFG if
there is a path from p
to END on which there
is a use of v before any
definition of v.
 
 
3
 
{x,y,z,d,c}
if (c)
x = y+1
y=2*z
if(d)
x=y+z
z=1
z=x
 
{}
 
{x}
 
 
4
Use/Def for each 3-address op
 
if I is 
x = y OP z
 :  
 
use[I] = 
{y, z}
    
 
def[I] = 
{x}
 
if I is 
x = OP y
    :  use[I] = 
{y}
  
 
def[I] = 
{x}
 
if I is 
x = y
         :  use[I] = 
{y}
    
 
def[I] = 
{x}
 
if I is 
if (x)
         :   use[I] = 
{x}
    
 
def[I] = 
{}
 
if I is 
return x
    :   
 
use[I] = 
{x}
    
 
def[I] = 
{}
 
if I is 
x = f(y
1 
,…, y
n
)
 :  
 
use[I] = 
{y
1
, …, y
n
}
     
def[I] = 
{x}
Aliasing
Two or more program names for the same memory
location
Can happen if you have pointers or call-by-reference
parameters
       x = 3
       y = @x
     *y = 5
            …x…
Aliasing complicates def/use computation
We will assume no aliasing for now so variable name
is enough to determine defs and uses
x and *y are aliases
Dataflow information
Facts about program execution that are
useful for optimization at compile-time
Examples:
Is the value of x always a constant at this point
of the program? If so, what is the value?
Analysis: constant propagation
Given a use of x in a statement, which
assignment statements for x could have written
the value at that use? Analysis: reaching
definitions
Is this computation of “x+y” redundant because
it has been computed earlier in the program?
Analysis: common subexpression elimination
 
 
7
Program Points
Two program points
 for each instruction:
There is a program point before each instruction
There is a program point after each instruction
In a basic block:
Program point after an instruction = program point
before the successor instruction
 
 
8
Program Points: Example
Multiple successor blocks
means that point at the
end of a block has multiple
successor program points
Depending on the
execution, control flows
from a program point to
one of its successors
Reaching definitions
A definition d of variable v is said
to reach a point p if
there is a path from START to p
that contains d
there are no other definitions of v
on that path after d
Convention: START has a definition
for all variables of interest
How do we compute reaching
definitions at each point of
program?
To keep things simple, we consider
only one variable
in general, you would perform
analysis for all variables of interest
{d3}
To find definitions that reach a point p
Enumerate all paths from START to p
For each path, find the definition that
reaches p
Compute the union of these definitions
This is called the “meet-over-paths”
(MOP)solution
Confluence operation
:
How do we combine information from
different paths to a given point?
Problem:
If program has loops, set of paths to p
will be unbounded
Better idea: dataflow equations
One solution: meet-over-paths
{d3}
Dataflow equations
System of fixpoint equations in which unknowns
are solutions to the dataflow problem at different
points in program
Solve the system of equations as described in
previous lecture
For many problems, this gives the MOP solution
and for other problems, it gives a “safe”
approximation to MOP solution
Safe approximation to MOP solution
Not as precise as MOP solution but will not lead
to incorrect optimization
Example: an analysis that says no variable is a
constant anywhere in the program is a safe
approximation to MOP
Granularity of dataflow equations
Statement-level: unknowns are associated with
input or output points of each statement
Basic-block level: more common
Dataflow Equations
{d3}
START
a:= ….
if p(a)
a:= ….
a:= ….
 …a...
d1
d2
d3
{d2,d3}
{d2}
{d1}
{START}
Reaching definitions: domain
 x4
START
a:= ….
if p(a)
a:= ….
a:= ….
 …a...
d1
d2
d3
x5
 x3
 x1
x0
Unknowns: reaching definitions at output of each statement.
Solution at each point will be some element of domain.
 x2
 x2
Domain is Power-set ({START,d1,d2,d3})
Reaching definitions: equations
 x4
START
a:= ….
if p(a)
a:= ….
a:= ….
 …a...
d1
d2
d3
x5
 x3
 x1
x0
 x2
 x2
x0 = {START}
x1 = {d1}
x2 = x1
x3 = {d2}
x4 = {d3}
x5 = x3 U x4
Give a name to output of each statement
For each statement, write down equation for
output as a function of its input(s).
Rule:
Reaching definitions: multiple variables
You can solve for one variable at a time, but it is better to solve for all
variables at the same time
Assignment to variable x will
 “kill” all definitions of x that reach input of assignment but will
pass through all other assignments
“generate” itself at output
Equation: Out = (In – Kill) U Gen
Notation: Dx = set of definitions of variable x
More complicated example
x1 = {START}
x2 = x1 U x7
x3 = (x2 – {d6,d3}) U {d3}
x4 = (x3 – {d4}) U {d4}
x5 = x4
x6 = (x5 – {d3,d6}) U {d6}
x7 = ((x6 U x5) – {d7,d8}) U d7
x8 = (x2 – {d7,d8}) U {d8}
Observations
System of equations can be solved to find least solution
Iterative method for solving equations will converge even though there
are an unbounded number of paths in program
Intuitively we need to consider only a finite number of paths to
find solution to reaching definitions problem since longer paths do
not give us more information
In general, when you have loops, the system of equations will have
multiple solutions. Exercise: convince yourself using the example in
the last slide
Which one do we want?
We want the least solution. So in iterative solution, initialize all
unknowns to {}.
 
 
17
Classification of dataflow
problems
Direction of information propagation
Forward-flow
: information propagates in
the direction of control-flow from input to
output
Backward-flow
: information propagates in
reverse direction of control-flow from
output to input
All-paths vs. any-path
All-paths problem
: dataflow fact is true at
some point in program if it is true along all
paths to/from that point
Any-path problem
: dataflow fact is true at
some point in program if it is true along
any path to/from that point
Classification of reaching definitions?
 
 
18
Examples
Forward
Backward
All-paths
Any-path
Constant propagation
Available expressions
Reaching definitions
Anticipatable expressions
Live variables
Recall
 
{x,y,z,d,c}
if (c)
x = y+1
y=2*z
if(d)
x=y+z
z=1
z=x
 
{}
 
{x}
A variable v is 
live
 at a point p in a CFG if
there is a path from p to END on which
there is a use of v before any definition of
v
We want to compute the set of variables
live at each point in program
Backward-flow problem: information at p
depends on paths from p to END
Any-path problem: variable is live at p if
there is any path from p to END that
satisfies the condition
Domain: power-set of set of variables in
procedure
END
START
 
 
20
Straight-line code
Relation between Live sets:
 
 
Live1 = ( Live2-
{x} 
) 
 
{y}
  
Live2 = ( Live3-
{y} 
) 
 
{z}
  
Live3 = ( Live4-
{}
 
 
 ) 
 
{d}
Relation:
 
  
in[I] = ( out[I] – def[I] ) 
 
use[I]
Information flows backward!
Instructions:
 can compute in[I] if we
know out[I]
Live1
x = y+1
Live2
y =2*z
Live3
if (d)
Live4
I1
I2
I3
21
in[B
1
]
B
1
Analyze Control Flow
A variables is live at end of block B if it is live at
the beginning of 
one (or more)
 successor blocks
Characterizes all possible program executions
Mathematically:
  
out[B] = 
 
in[B’]
 
B
out[B]
in[B
n
]
B
n
B’
 
 
succ(B)
Confluence operator is union
Compute least solution
Live variables
L10 = { }
L3 = {x} U (L10- {z})
L2 = L4 U L3 U {c}
L7 = L2 – {z}
L8 = {y,z} U (L7 – {x})
L6 = L8 U L7 U {d}
L5 = {z} U (L6 – {y})
L4 = {y} U (L5 – {x})
 
{x,y,z,d,c}
 
{x,y,d,c}
 
{y,z,d,c}
 
{x,y,d,c}
if (c)
x = y+1
y=2*z
if(d)
x=y+z
z=1
z=x
 
{}
 
{x}
 
{x,y,z,d,c}
 
{x,z,d,c}
 
{x,y,z,d,c}
2
3
4
5
6
7
8
10
START
END
Available expressions
An expression “x op y” is available at
p if every path from START to p
contains an evaluation of x op y after
which there are no assignments to x
or y.
Classification: forward-flow, all-paths
Domain: power-set of expressions of
interest
Equation for “x = y op z”:
Out = (In – Ex) U {“y op z”}
Confluence operator: intersection
Compute greatest solution: start by
assuming all expressions are available
everywhere except at START and
iterate
Anticipatable expressions
An expression “x op y” is
anticipatable at p if every path
from p to END contains an
evaluation of “x op y” before any
assignment to x or y.
Classification: backward-flow, all-
paths
Domain: power-set of expressions
of interest
Equation for “x = y op z”:
In = (Out – Ex) U {“y op z”}
Confluence operator: intersection
Compute greatest solution: start
by assuming all expressions are
anticipatable everywhere except at
END and iterate
Constant propagation
A variable x is a constant c at a point
p if x has the value c at that point on
all paths from START to that point.
Classification: forward-flow, all-paths
Perform constant propagation on set
of variables V
Dataflow information at point p:
vector of size |V| where each value
comes from domain
Equation for statement  x = e
OUT = IN [(Eval(e) in IN) / x]
Confluence operation: join
Initialize all vectors to [
,
,
,..
]
except at START where it is [T,T,..T]
….   -2   -1   0   1   2 …
T
Domain for
constant propagation
Definitely not
constant
May or may not
be constant
Constant propagation (contd.)
Solution to dataflow
equations does not give
meet-over-paths solution in
general
Gives a safe approximation
Example on left:
In B5, x+y is constant 5 but
dataflow solution will give T,
which is a safe approximation
START
if (c)
x = 3
y = 2
x = 2
y = 3
…x+y..
B5
[2,3]
[3,2]
[T,T]
[T,T]
[T,T]
[T,T]
Dataflow vector: [x,y]
Implementing constant propagation
Using vectors to represent values of
variables is inefficient
Assignment is to one variable but values
of other variables must be copied from
input vector to output vector
Idea: can we use def-use chains to
perform constant propagation?
If we have an assignment “x = 3”, find
all uses reached only by this definition
and replace use with constant 3.
This works but it may not find all the
constants that the CFG algorithm
does
Solution: use a variant of def-use chains
called the 
static single assignment form
(SSA)
See later
OUT = IN [(Eval(e) in IN) / x]
x = e
IN
OUT
SOLVING DATAFLOW EQUATIONS
USING ELIMINATION
High-level idea
Solving systems of linear equations
Iterative methods: Jacobi, Gauss-Seidel,..
Elimination-based methods: Gaussian elimination
Dataflow equations
We have seen iterative method
For structured programs, we can use elimination-
based methods and avoid iteration
For unstructured programs, elimination can be
used to reduce the number of equations given to
iterative solver
Intuitive idea of elimination
Reaching-definitions
Basic-block level equations
…..
Out[B5] = Out[B4]
Out[B6] = {d7} U (Out[B5]- Dz)
Out[B7] = {d9} U (Out[B5]- Dz)
…..
Replace equations with this
   …..
   Out[F] = {d7,d9} U (Out[B4]-Dz)
   ……
Solve equations iteratively and
then interpolate solution into F
 
START
if (c)
z=w*w
z = x+y
…...
 
……
d7
d9
B4
B5
B6
B7
B8
 
F
….
General idea for reaching
definitions
gen[R]: set of definitions in R
for which there is a path from
that definition to exit free of
other definitions of that variable
kill[R]: set of definitions in R
that do not reach exit even if
they reach entry of R
Equation for region R:
out[R] = gen[R] U (in[R] – kill[R])
entry
exit
Elimination and interpolation
Elimination-based methods
Same idea can be used for other
dataflow problems
Compute gen and kill sets
For structured programs
Can solve dataflow equations without
iteration
Can compute dataflow solution from
AST without building control-flow graph
For unstructured programs
Generalization of loop: interval
Single-entry multiple-exit loop
Reducible program: program that can
be decomposed into nested intervals
(Cocke and Allen)
Irreducible program: needs iteration
John Cocke (in wheelchair)
and Fran Allen (pink shirt).
Both won the Turing Award
Summary
Program analysis is needed to perform optimizations safely
Analysis must consider all possible execution paths
Meet over paths (MOP) solution
Dataflow analysis
Solve a set of fixpoint equations to find dataflow information
Domain: usually a powerset ordered by containment
Transfer function for assignment statements
Confluence operator: usually either union or intersection
Type of solution
Confluence operator is union: least solution
Confluence operator is intersection: greatest solution
For many problems, dataflow solution gives MOP solution. For other
problems like constant propagation, it gives a safe approximation to MOP.
In practice
Compute transfer function for entire basic blocks so you have as many
unknowns as basic blocks rather than statements, and then iterate
For structured programs, you can avoid iteration entirely: elimination-
based methods
Sets are represented using bit-vectors and union/intersection are
implemented using bitwise OR/AND
Slide Note
Embed
Share

Dataflow analysis is crucial for optimizing programs by identifying opportunities for improvements like constant propagation and dead code elimination. This analysis involves modeling values of interest, formulating fixpoint equations, and solving them to enhance program efficiency. Concepts such as statement definitions, uses, live variables, and use/def for each 3-address operation play a vital role in this process. Aliasing, dataflow information, and program points are also key aspects in the realm of program optimization.

  • Dataflow Analysis
  • Program Optimization
  • Constant Propagation
  • Dead Code Elimination
  • Aliasing

Uploaded on Oct 05, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Dataflow Analysis 1

  2. Overview To perform optimizations like constant propagation or dead code elimination, we must Analyze program to find opportunities for performing optimizations safely Transform program Analysis is called dataflow analysis Model values of interest as domain Formulate system of fixpoint equations whose unknowns are the required information at different points in program Solve system of equations Later we will study other types of program analyses 2

  3. Some concepts A statement is a definition of a variable v if it may write to v. A statement is a use of variable v if it may read from v. A variable v is live at a point p in a CFG if there is a path from p to END on which there is a use of v before any definition of v. {x,y,z,d,c} if (c) x = y+1 y=2*z if(d) {x} x=y+z z=1 z=x {} 3

  4. Use/Def for each 3-address op if I is x = y OP z : use[I] = {y, z} def[I] = {x} if I is x = OP y : use[I] = {y} if I is x = y : use[I] = {y} if I is if (x) : use[I] = {x} if I is return x : use[I] = {x} if I is x = f(y1 , , yn) : use[I] = {y1, , yn} def[I] = {x} def[I] = {x} def[I] = {} def[I] = {} def[I] = {x} 4

  5. Aliasing Two or more program names for the same memory location Can happen if you have pointers or call-by-reference parameters x = 3 y = @x *y = 5 x x and *y are aliases Aliasing complicates def/use computation We will assume no aliasing for now so variable name is enough to determine defs and uses

  6. Dataflow information Facts about program execution that are useful for optimization at compile-time Examples: Is the value of x always a constant at this point of the program? If so, what is the value? Analysis: constant propagation Given a use of x in a statement, which assignment statements for x could have written the value at that use? Analysis: reaching definitions Is this computation of x+y redundant because it has been computed earlier in the program? Analysis: common subexpression elimination

  7. Program Points Two program points for each instruction: There is a program point before each instruction There is a program point after each instruction Point before x = y+1 Point after In a basic block: Program point after an instruction = program point before the successor instruction 7

  8. Program Points: Example Multiple successor blocks means that point at the end of a block has multiple successor program points x = y+1 y =2*z if (d) Depending on the execution, control flows from a program point to one of its successors x = y+z z = 1 8

  9. Reaching definitions A definition d of variable v is said to reach a point p if there is a path from START to p that contains d there are no other definitions of v on that path after d Convention: START has a definition for all variables of interest How do we compute reaching definitions at each point of program? To keep things simple, we consider only one variable in general, you would perform analysis for all variables of interest START {START} d1 a:= . {d1} if p(a) d2 d3 a:= . {d2} a:= . {d3} a... {d2,d3}

  10. One solution: meet-over-paths To find definitions that reach a point p Enumerate all paths from START to p For each path, find the definition that reaches p Compute the union of these definitions This is called the meet-over-paths (MOP)solution Confluence operation: How do we combine information from different paths to a given point? Problem: If program has loops, set of paths to p will be unbounded Better idea: dataflow equations START {START} d1 a:= . {d1} if p(a) d2 d3 a:= . {d2} a:= . {d3} a... {d2,d3}

  11. Dataflow Equations Dataflow equations System of fixpoint equations in which unknowns are solutions to the dataflow problem at different points in program Solve the system of equations as described in previous lecture For many problems, this gives the MOP solution and for other problems, it gives a safe approximation to MOP solution Safe approximation to MOP solution Not as precise as MOP solution but will not lead to incorrect optimization Example: an analysis that says no variable is a constant anywhere in the program is a safe approximation to MOP Granularity of dataflow equations Statement-level: unknowns are associated with input or output points of each statement Basic-block level: more common START {START} d1 a:= . {d1} if p(a) d2 d3 a:= . {d2} a:= . {d3} a... {d2,d3}

  12. Reaching definitions: domain START x0 d1 a:= . x1 if p(a) x2 Domain is Power-set ({START,d1,d2,d3}) x2 d2 d3 a:= . x3 a:= . x4 a... x5 Unknowns: reaching definitions at output of each statement. Solution at each point will be some element of domain.

  13. Reaching definitions: equations START Give a name to output of each statement x0 d1 For each statement, write down equation for output as a function of its input(s). a:= . x1 Rule: if p(a) x2 x2 xi1 xi2 dn d2 d3 (v==a): equation is xo = {dn} otherwise: xo = xi1 U xi2 v:= . a:= . x3 a:= . xo x4 x0 = {START} x1 = {d1} x2 = x1 x3 = {d2} x4 = {d3} x5 = x3 U x4 a... x5

  14. Reaching definitions: multiple variables You can solve for one variable at a time, but it is better to solve for all variables at the same time Assignment to variable x will kill all definitions of x that reach input of assignment but will pass through all other assignments generate itself at output Equation: Out = (In Kill) U Gen Notation: Dx = set of definitions of variable x xi1 xi2 dn Dataflow equation: xo = ((xi1 U xi2) Dv) U {dn} v:= . xo

  15. More complicated example START 1 x1 = {START} x2 = x1 U x7 x3 = (x2 {d6,d3}) U {d3} x4 = (x3 {d4}) U {d4} x5 = x4 x6 = (x5 {d3,d6}) U {d6} x7 = ((x6 U x5) {d7,d8}) U d7 x8 = (x2 {d7,d8}) U {d8} 2 if (c) x = y+1 3 y=2*z 4 if(d) 5 x=y+z 6 7 z=1 z=x 8

  16. Observations System of equations can be solved to find least solution Iterative method for solving equations will converge even though there are an unbounded number of paths in program Intuitively we need to consider only a finite number of paths to find solution to reaching definitions problem since longer paths do not give us more information In general, when you have loops, the system of equations will have multiple solutions. Exercise: convince yourself using the example in the last slide Which one do we want? We want the least solution. So in iterative solution, initialize all unknowns to {}.

  17. Classification of dataflow problems Direction of information propagation Forward-flow: information propagates in the direction of control-flow from input to output Backward-flow: information propagates in reverse direction of control-flow from output to input All-paths vs. any-path All-paths problem: dataflow fact is true at some point in program if it is true along all paths to/from that point Any-path problem: dataflow fact is true at some point in program if it is true along any path to/from that point Classification of reaching definitions? START p END 17

  18. Examples All-paths Any-path Constant propagation Available expressions Reaching definitions Forward Anticipatable expressions Live variables Backward START p END 18

  19. Recall START {x,y,z,d,c} if (c) A variable v is live at a point p in a CFG if there is a path from p to END on which there is a use of v before any definition of v x = y+1 We want to compute the set of variables live at each point in program y=2*z Backward-flow problem: information at p depends on paths from p to END if(d) Any-path problem: variable is live at p if there is any path from p to END that satisfies the condition x=y+z z=1 Domain: power-set of set of variables in procedure {x} z=x {} END

  20. Straight-line code Relation between Live sets: Live1 = ( Live2-{x} ) {y} Live2 = ( Live3-{y} ) {z} Live3 = ( Live4-{} ) {d} Live1 x = y+1 Live2 y =2*z Live3 if (d) Live4 I1 I2 Relation: in[I] = ( out[I] def[I] ) use[I] I3 Information flows backward! in[I] I Instructions: can compute in[I] if we know out[I] out[I] 20

  21. Analyze Control Flow A variables is live at end of block B if it is live at the beginning of one (or more) successor blocks Characterizes all possible program executions Mathematically: out[B] = in[B ] B out[B] in[B1] B1 in[Bn] Bn B succ(B) Confluence operator is union Compute least solution 21

  22. Live variables START {x,y,z,d,c} if (c) L10 = { } L3 = {x} U (L10- {z}) L2 = L4 U L3 U {c} L7 = L2 {z} L8 = {y,z} U (L7 {x}) L6 = L8 U L7 U {d} L5 = {z} U (L6 {y}) L4 = {y} U (L5 {x}) 2 {x,y,z,d,c} x = y+1 {x,z,d,c} 4 y=2*z 5 {x,y,z,d,c} 6 if(d) {y,z,d,c} {x,y,d,c} 8 x=y+z {x,y,d,c} 7 z=1 {x} 3 z=x {} 10 END

  23. Available expressions An expression x op y is available at p if every path from START to p contains an evaluation of x op y after which there are no assignments to x or y. Classification: forward-flow, all-paths Domain: power-set of expressions of interest Equation for x = y op z : Out = (In Ex) U { y op z } Confluence operator: intersection Compute greatest solution: start by assuming all expressions are available everywhere except at START and iterate START START if (c) if (c) z = x+y t = x+y z=t w=x+y t=x+y w=t B5 B5 x+y.. t.. Computation in B5 is totally redundant .

  24. Anticipatable expressions An expression x op y is anticipatable at p if every path from p to END contains an evaluation of x op y before any assignment to x or y. Classification: backward-flow, all- paths Domain: power-set of expressions of interest Equation for x = y op z : In = (Out Ex) U { y op z } Confluence operator: intersection Compute greatest solution: start by assuming all expressions are anticipatable everywhere except at END and iterate START START if (c) if (c) z = x+y t = x+y z=t t=x+y B5 B5 x+y.. t.. Computation in B5 is partially redundant . in B5 is removed. Partial redundancy

  25. Constant propagation A variable x is a constant c at a point p if x has the value c at that point on all paths from START to that point. Classification: forward-flow, all-paths Perform constant propagation on set of variables V Dataflow information at point p: vector of size |V| where each value comes from domain Equation for statement x = e OUT = IN [(Eval(e) in IN) / x] Confluence operation: join Initialize all vectors to [ , , ,..] except at START where it is [T,T,..T] Definitely not constant T . -2 -1 0 1 2 May or may not be constant Domain for constant propagation

  26. Constant propagation (contd.) Solution to dataflow equations does not give meet-over-paths solution in general Gives a safe approximation Example on left: In B5, x+y is constant 5 but dataflow solution will give T, which is a safe approximation START [T,T] if (c) [T,T] [T,T] x = 2 y = 3 x = 3 y = 2 [3,2] [2,3] [T,T] x+y.. B5 Dataflow vector: [x,y]

  27. Implementing constant propagation Using vectors to represent values of variables is inefficient Assignment is to one variable but values of other variables must be copied from input vector to output vector Idea: can we use def-use chains to perform constant propagation? If we have an assignment x = 3 , find all uses reached only by this definition and replace use with constant 3. This works but it may not find all the constants that the CFG algorithm does Solution: use a variant of def-use chains called the static single assignment form (SSA) See later IN x = e OUT OUT = IN [(Eval(e) in IN) / x]

  28. SOLVING DATAFLOW EQUATIONS USING ELIMINATION

  29. High-level idea Solving systems of linear equations Iterative methods: Jacobi, Gauss-Seidel,.. Elimination-based methods: Gaussian elimination Dataflow equations We have seen iterative method For structured programs, we can use elimination- based methods and avoid iteration For unstructured programs, elimination can be used to reduce the number of equations given to iterative solver

  30. Intuitive idea of elimination Reaching-definitions Basic-block level equations .. Out[B5] = Out[B4] Out[B6] = {d7} U (Out[B5]- Dz) Out[B7] = {d9} U (Out[B5]- Dz) .. Replace equations with this .. Out[F] = {d7,d9} U (Out[B4]-Dz) Solve equations iteratively and then interpolate solution into F START . B4 F B5 if (c) d9 d7 z = x+y B6 z=w*w B7 ... B8

  31. General idea for reaching definitions gen[R]: set of definitions in R for which there is a path from that definition to exit free of other definitions of that variable kill[R]: set of definitions in R that do not reach exit even if they reach entry of R Equation for region R: out[R] = gen[R] U (in[R] kill[R]) entry exit

  32. Elimination and interpolation

  33. Elimination-based methods Same idea can be used for other dataflow problems Compute gen and kill sets For structured programs Can solve dataflow equations without iteration Can compute dataflow solution from AST without building control-flow graph For unstructured programs Generalization of loop: interval Single-entry multiple-exit loop Reducible program: program that can be decomposed into nested intervals (Cocke and Allen) Irreducible program: needs iteration John Cocke (in wheelchair) and Fran Allen (pink shirt). Both won the Turing Award

  34. Summary Program analysis is needed to perform optimizations safely Analysis must consider all possible execution paths Meet over paths (MOP) solution Dataflow analysis Solve a set of fixpoint equations to find dataflow information Domain: usually a powerset ordered by containment Transfer function for assignment statements Confluence operator: usually either union or intersection Type of solution Confluence operator is union: least solution Confluence operator is intersection: greatest solution For many problems, dataflow solution gives MOP solution. For other problems like constant propagation, it gives a safe approximation to MOP. In practice Compute transfer function for entire basic blocks so you have as many unknowns as basic blocks rather than statements, and then iterate For structured programs, you can avoid iteration entirely: elimination- based methods Sets are represented using bit-vectors and union/intersection are implemented using bitwise OR/AND

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#