Ensuring Equivalence in Compiler Optimization Programs
Explore the challenges of proving equivalence in compiler optimization programs, validate refactorings, and analyze the trustworthiness of compilers through binary equivalence testing. Learn about handling loops, utilizing decision procedures, and running tests to confirm program behavior.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Prove two programs are equivalent Compiler optimizations Validate refactorings Cross checking different implementations Old and well studied problem Undecidable in general Major challenge: prove equivalence of loops Straight line programs relatively easy
Prove equivalence of two binaries Trustworthy Compiler CompCert, gcc O0 ? while Confidence of ?, Performance of ? Optimizing Compiler gcc O3, icc O3 ?
while Straight Line Code Trustworthy Compiler CompCert, gcc O0 ? ? STOKE (ASPLOS 13) Random mutations
Do not support while loops: [CHR00], [FH02], [FH05], [AEF+05], [SBC+05], [MSF06] Do not reason about termination: [SDE+08], [GS09], [RE11], [LHM+12], [PY13], [LMS+13] Translation validation: [Nec00],[GZB05], Need information from the compiler
Decompose proof Rewrite ? a Target ? movq 8(rsp), r9 a movq 8(rsp), rdi #rdi != 0 #r9 != 0 b b decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq c c ??: states equal ??: 8(rsp)=rdi=r9 ??: live out equal
Given a simulation relation, proofs for loops reduce to proofs for loop free fragments Use decision procedures Main challenge: infer a simulation relation Infer synchronization points Infer invariants We use compilers as black boxes Mine relations from concrete executions
Run some tests to get data From executions, unit tests, random tests, etc.
Ensure the loops iterate for equal iterations Use data to align ? and ? Target ? 2n n Rewrite ? n B;B B B retq retq
Attempt to detect synchronization points Number of times program points are executed Values align n Rewrite ? Target ? movq 8(rsp), r9 movq 8(rsp), rdi #rdi != 0 n+1 #r9 != 0 n+1 decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) n retq 1 n
Invariants are restricted to equalities Infer invariants from observed data values Target ? 8( 8(rsp rsp) ) rdi rdi movq 8(rsp), rdi #rdi != 0 2 2 1 1 movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq 0 0
Invariants are restricted to equalities Infer invariants from observed data values Rewrite ? 8( 8(rsp rsp) ) rdi rdi r9 r9 movq 8(rsp), r9 2 2 2 1 1 1 #r9 != 0 0 0 0 decq r9 retq
Mine all equalities 8( 8(rsp rsp) ) rdi rdi r9 r9 2 2 2 Find all ? s.t. ?? = 0 ? 1 1 1 Nullspace or kernel 0 0 0 ?1= [ 1,1,0] ?2= [0,1, 1] ? 8 ??? = ??? ??? = ?9 ? 4??? = ??? + 3 10??? + ??? = ???
The executions are synchronized The invariants are maintained Rewrite ? a States equal Target ? movq 8(rsp), r9 a movq 8(rsp), rdi #rdi != 0 8 ??? = ??? ??? = ?9 #r9 != 0 b b decq r9 retq movq 8(rsp), rdi decq rdi movq rdi, 8(rsp) retq c c Live outs equal
The executions are synchronized The invariants are maintained Queries in quantifier free bitvector arithmetic Complete SMT solvers! Incorporate counter-examples in relations Sound but not complete If checking succeeds then equivalent Can fail to infer a sound simulation relation
Insufficient data to infer a sound relation Expressiveness of invariants Inequalities, quantifiers, etc. Expressiveness of SMT solver Floating point, multiply, divide, etc.
Run tests and generate data https://github.com/eschkufz/x64asm Nullspace computation libIML: integer matrix library SMT solver: Z3
Compute kernel inside OpenSSL Validating CompCert against gcc Stochastic optimization for loops
Multiplication kernel Extensive performance tests Run the kernel ~15 million times Choose 16 random tests for inference Compile with gcc O0 and gcc O3 Successfully prove equivalence
Program Bansal Stoke vs gcc -O0 1.58X Stoke vs gcc O3 1.04X SAXPY 9.22X 1.48X
Prove equivalence of loops in two stages Infer simulation relation Check the inferred relation using SMT solvers Use runtime data for inference No change required to the compilers Better verifiers lead to better optimizers
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35 45, 2007 T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. ICSE 2012 P. Garg, C. L ding, P. Madhusudan, D. Neider: Learning Universally Quantified Invariants of Linear Data Structures. CAV 2013 R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, A. V. Nori: A Data Driven Approach for Algebraic Loop Invariants. ESOP 2013 R. Sharma, S. Gupta, B. Hariharan, A. Aiken, A. V. Nori: Verification as Learning Geometric Concepts. SAS 2013 A.V. Nori, R. Sharma: Termination proofs from tests. ESEC/SIGSOFT FSE 2013