Ensuring Equivalence in Compiler Optimization Programs

undefined
Rahul Sharma, Eric Schkufza, Berkeley Churchill,
Alex Aiken
Prove two programs are equivalent
Compiler optimizations
Validate refactorings
Cross checking different implementations
Old and well studied problem
Undecidable in general
Major challenge: prove equivalence of loops
Straight line programs relatively easy
Prove equivalence of two binaries
while …
Trustworthy Compiler
CompCert, gcc –O0
Optimizing Compiler
gcc –O3, icc –O3
Straight
Line
Code
Trustworthy Compiler
CompCert, gcc –O0
STOKE (ASPLOS 13)
Random mutations
while …
Do not support “while” loops: [CHR00],
[FH02], [FH05], [AEF
+
05], [SBC
+
05], [MSF06]
Do not reason about termination: [SDE
+
08],
[GS09], [RE11], [LHM
+
12],  [PY13], [LMS
+
13]
Translation validation: [Nec00],[GZB05], …
Need information from the compiler
Decompose proof
 
movq 8(rsp), rdi
#rdi != 0
movq 8(rsp), rdi
decq rdi
movq rdi, 8(rsp)
retq
 
movq 8(rsp), r9
#r9 != 0
decq r9
retq
 
a
 
a’
 
b
 
b’
 
c
 
c’
 
Given a simulation relation, proofs for loops
reduce to proofs for loop free fragments
Use decision procedures
 
Main challenge: infer a simulation relation
Infer synchronization points
Infer invariants
 
We use compilers as black boxes
 
Mine relations from concrete executions
 
 
Run some tests to get 
data
From executions, unit tests, random tests, etc.
 
B
retq
B’
retq
2n
n
B;B
n
Attempt to detect synchronization points
Number of times program points are executed
Values align
 
movq 8(rsp), rdi
#rdi != 0
movq 8(rsp), rdi
decq rdi
movq rdi, 8(rsp)
retq
movq 8(rsp), r9
#r9 != 0
decq r9
retq
n
1
n
n+1
n+1
n
Invariants are restricted to equalities
Infer invariants from observed data values
movq 8(rsp), rdi
#rdi != 0
movq 8(rsp), rdi
decq rdi
movq rdi, 8(rsp)
retq
Invariants are restricted to equalities
Infer invariants from observed data values
 
movq 8(rsp), r9
#r9 != 0
decq r9
retq
 
 
The executions are synchronized
The invariants are maintained
 
movq 8(rsp), rdi
#rdi != 0
movq 8(rsp), rdi
decq rdi
movq rdi, 8(rsp)
retq
 
movq 8(rsp), r9
#r9 != 0
decq r9
retq
 
a
 
a’
 
b
 
b’
 
c
 
c’
States equal
Live outs equal
 
The executions are synchronized
The invariants  are maintained
 
Queries in quantifier free bitvector arithmetic
Complete SMT solvers!
Incorporate counter-examples in relations
 
Sound but not complete
If checking succeeds then equivalent
Can fail to infer a sound simulation relation
Insufficient data to infer a sound relation
Expressiveness of invariants
Inequalities, quantifiers, etc.
Expressiveness of SMT solver
Floating point, multiply, divide, etc.
Run tests and generate data
https://github.com/eschkufz/x64asm
Nullspace computation
libIML: integer matrix library
SMT solver: Z3
Compute kernel inside OpenSSL
Validating CompCert against gcc
Stochastic optimization for loops
Multiplication kernel
Extensive performance tests
Run the kernel ~15 million times
Choose 16 random tests for inference
Compile with gcc –O0 and gcc –O3
Successfully prove equivalence
Prove equivalence of loops in two stages
Infer simulation relation
Check the inferred relation using SMT solvers
Use runtime data for inference
No change required to the compilers
Better verifiers lead to better optimizers
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz,
and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci.
Comput. Program., 69(1-3):35–45, 2007
 T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to
discover polynomial and array invariants. ICSE 2012
P. Garg, C. Löding, P. Madhusudan, D. Neider: Learning Universally Quantified
Invariants of Linear Data Structures. CAV 2013
R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, A. V. Nori: A Data Driven
Approach for Algebraic Loop Invariants. ESOP 2013
R. Sharma, S. Gupta, B. Hariharan, A. Aiken, A. V. Nori: Verification as
Learning Geometric Concepts. SAS 2013
A.V. Nori, R. Sharma: Termination proofs from tests. ESEC/SIGSOFT FSE 2013
Slide Note
Embed
Share

Explore the challenges of proving equivalence in compiler optimization programs, validate refactorings, and analyze the trustworthiness of compilers through binary equivalence testing. Learn about handling loops, utilizing decision procedures, and running tests to confirm program behavior.

  • Compiler optimization
  • Equivalence proof
  • Trustworthy compilers
  • Loop handling
  • Decision procedures

Uploaded on Sep 16, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken

  2. Prove two programs are equivalent Compiler optimizations Validate refactorings Cross checking different implementations Old and well studied problem Undecidable in general Major challenge: prove equivalence of loops Straight line programs relatively easy

  3. Prove equivalence of two binaries Trustworthy Compiler CompCert, gcc O0 ? while Confidence of ?, Performance of ? Optimizing Compiler gcc O3, icc O3 ?

  4. while Straight Line Code Trustworthy Compiler CompCert, gcc O0 ? ? STOKE (ASPLOS 13) Random mutations

  5. Do not support while loops: [CHR00], [FH02], [FH05], [AEF+05], [SBC+05], [MSF06] Do not reason about termination: [SDE+08], [GS09], [RE11], [LHM+12], [PY13], [LMS+13] Translation validation: [Nec00],[GZB05], Need information from the compiler

  6. Decompose proof Rewrite ? a Target ? movq 8(rsp), r9 a movq 8(rsp), rdi #rdi != 0 #r9 != 0 b b decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq c c ??: states equal ??: 8(rsp)=rdi=r9 ??: live out equal

  7. Given a simulation relation, proofs for loops reduce to proofs for loop free fragments Use decision procedures Main challenge: infer a simulation relation Infer synchronization points Infer invariants We use compilers as black boxes Mine relations from concrete executions

  8. Run some tests to get data From executions, unit tests, random tests, etc.

  9. Ensure the loops iterate for equal iterations Use data to align ? and ? Target ? 2n n Rewrite ? n B;B B B retq retq

  10. Attempt to detect synchronization points Number of times program points are executed Values align n Rewrite ? Target ? movq 8(rsp), r9 movq 8(rsp), rdi #rdi != 0 n+1 #r9 != 0 n+1 decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) n retq 1 n

  11. Invariants are restricted to equalities Infer invariants from observed data values Target ? 8( 8(rsp rsp) ) rdi rdi movq 8(rsp), rdi #rdi != 0 2 2 1 1 movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq 0 0

  12. Invariants are restricted to equalities Infer invariants from observed data values Rewrite ? 8( 8(rsp rsp) ) rdi rdi r9 r9 movq 8(rsp), r9 2 2 2 1 1 1 #r9 != 0 0 0 0 decq r9 retq

  13. Mine all equalities 8( 8(rsp rsp) ) rdi rdi r9 r9 2 2 2 Find all ? s.t. ?? = 0 ? 1 1 1 Nullspace or kernel 0 0 0 ?1= [ 1,1,0] ?2= [0,1, 1] ? 8 ??? = ??? ??? = ?9 ? 4??? = ??? + 3 10??? + ??? = ???

  14. The executions are synchronized The invariants are maintained Rewrite ? a States equal Target ? movq 8(rsp), r9 a movq 8(rsp), rdi #rdi != 0 8 ??? = ??? ??? = ?9 #r9 != 0 b b decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq c c Live outs equal

  15. The executions are synchronized The invariants are maintained Queries in quantifier free bitvector arithmetic Complete SMT solvers! Incorporate counter-examples in relations Sound but not complete If checking succeeds then equivalent Can fail to infer a sound simulation relation

  16. Insufficient data to infer a sound relation Expressiveness of invariants Inequalities, quantifiers, etc. Expressiveness of SMT solver Floating point, multiply, divide, etc.

  17. Run tests and generate data https://github.com/eschkufz/x64asm Nullspace computation libIML: integer matrix library SMT solver: Z3

  18. Compute kernel inside OpenSSL Validating CompCert against gcc Stochastic optimization for loops

  19. Multiplication kernel Extensive performance tests Run the kernel ~15 million times Choose 16 random tests for inference Compile with gcc O0 and gcc O3 Successfully prove equivalence

  20. Program Bansal Stoke vs gcc -O0 1.58X Stoke vs gcc O3 1.04X SAXPY 9.22X 1.48X

  21. Prove equivalence of loops in two stages Infer simulation relation Check the inferred relation using SMT solvers Use runtime data for inference No change required to the compilers Better verifiers lead to better optimizers

  22. M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35 45, 2007 T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. ICSE 2012 P. Garg, C. L ding, P. Madhusudan, D. Neider: Learning Universally Quantified Invariants of Linear Data Structures. CAV 2013 R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, A. V. Nori: A Data Driven Approach for Algebraic Loop Invariants. ESOP 2013 R. Sharma, S. Gupta, B. Hariharan, A. Aiken, A. V. Nori: Verification as Learning Geometric Concepts. SAS 2013 A.V. Nori, R. Sharma: Termination proofs from tests. ESEC/SIGSOFT FSE 2013

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#