Automated Program Repair and Anti-Patterns in Search-Based Program Repair
Automated program repair aims to fix bugs in software through fault localization, patch generation, and patch validation using search-based techniques. This research discusses anti-patterns, previous work, repair patterns from human patches, and challenges such as weak oracles in automated program repair.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Anti-patterns in Search-based Program Repair Shin Hwei Tan* Hiroaki Yoshida+ National University of Singapore* Fujitsu Laboratories of America, Inc.+ Mukul R. Prasad+ Abhik Roychoudhury* **Part of this work was conducted during at an internship at Fujitsu Laboratories of America
What is automated program repair? BUG! Given failing Test T , buggy program P 1.Fault localization- Where to fix? 2. Patch Generation - How to fix? 3. Patch Validation Are all tests passing?
Previous work on Search-Based Program Repair GenProg [ICSE '12] PAR [ICSE '13] SPR [FSE '15] Genetic Programming Improved version for adaptive equivalent search (AE) Mutations & crossovers Evolutionary Algorithm Specialized Search with condition synthesis Patterns with and without abstract condition Search Patterns learned from Human Patches Operators C Programs Java Programs C Programs Program
Search-Based Program Repair Final Patch Search-Based Repair Tools Tests Fail How do the tests look like? All Tests Pass Patch Generation Tests contains at least one failing test Candidate Patches How do the patches look like? Patch Evaluation
Search-Based Program Repair Test Script $command $argument1 $argument2 RETVAL=$? [ $RETVAL -eq 0 ] && echo Success [ $RETVAL -ne 0 ] && echo Failure Check exit status of command Non-zero exit status denotes test failure Tests Candidate Patches - exit(-2); Patch Evaluation
Repair patterns from human patches Human patches Automatic Program Repair int foo(){ + if(input1) + return(out1) //compute something } Conditional Control Flow: +if(a) + return b; that can be enforced on top of any search-based repair tool. Anti-patterns Set of generic forbidden transformations
Problem: Weak Oracle Failing Test Script $command $argument1 $argument2 RETVAL=$? [ $RETVAL -eq 0 ] && echo Success [ $RETVAL -ne 0 ] && echo Failure Test outcome determined by exit status Statements like exit call/assertions serve as test proxies Test proxies should not be randomlymanipulated A1: Anti-delete CFG exit node Remove return statements, exit calls, functions with the word error , assertions. static void BadPPM(char* file) { fprintf(stderr, "%s: Not a PPM file.\n", file); - exit(-2); }
Problem: Inadequate Test Coverage Repair tools allow removal of code as long as all test passes Statements are mistakenly considered as redundant code Anti-patterns: A2: Anti-delete Control Statement A3: Anti-delete Single-statement CFG A4: Anti-delete Set-Before-If A2: Anti-delete Control Statement Remove control statements (e.g., if-statements, switch- statements, loops). call_result = call_user_function_ex(...); - if (call_result == SUCCESS && ...) { - if (SUCCESS == statbuf_from_array(...)) - ret = 0; - } else if (call_result == FAILURE) {
Problem: Non-termination Automatically generated patches may incorrectly removes loop update Cause infinite loop A5:Anti-delete Loop-Counter Update Remove assignment statement A inside loop L if: ??? ?? ??????????? ????????? ?? ? {??? ?? ??? ?? ?????????? ?} = while( x> 5) - x++;
Problem: Trivial Patch Trivial patch patch that insert return-statements based on expected output Ex: +if(test1) + return out1 A6: Anti-append Early Exit Insert return/goto statement at any location except for after the last statement in a CFG node. + if ((type != 0)) + return; zend_error((1<<3L),"Uninitialized string offset:",...);
Problem: Functionality Removal Removes functionality by inserting T/F A7: Anti-append Trivial Conditions Insert trivial condition. A condition is trivial if and only if it is: 1) True/False Constant 2) Tautology/Contradiction in expression (e.g., if(x || y || !y)) 3) Static analysis (e.g., if(x || y != 0), y is initialized) - if ((fmap[j].key != format->ptr[i + 1])) + if ((fmap[j].key != format->ptr[i + 1]) && !(1)) continue;
Integrating Anti-patterns Final Patch Search-Based Repair Tools Tests Fail Patch Generation All Tests Pass Candidate Patches Tests contains at least one failing test Is Anti-pattern? Patch Evaluation YES NO
Experimental Setup Tests Subjects coreutils findutils grep make php libtiff python gmp gzip wireshark fbc lighthttpd Evaluation on: 1. GenProg AE Uses Adaptive Equivalent Search 2. SPR Uses staged repair algorithm with repair patterns 3. dGenProg GenProg without deletion operator Description kLoC File, Shell and Text manipulation Utility Directory Searching Utility Pattern Matching Utility Program executable generation utilities Programming Language Image Processing Library Programming Language 83.1 18.0 9.4 35.3 1046 4772 1054 1582 528 8471 Corebench benchmark 77 78 35 407 145 491 2814 97 62 GenProg benchmark Math Library 146 12 63 773 295 Data Compression Utility Network Packet Analyzer Compiler Web Server
Evaluation of Anti-patterns How do anti-patterns affect the quality of patches generated by search- based program repair tools? How many nonsensical patches can our anti-patterns eliminate to reduce manual inspection costs? When our modified tools produce the same patch, what is the speedup that we achieve? How does the use of anti-patterns compare to an approach that simply prohibits deletion? RQ1: RQ2: RQ3: RQ4:
Measuring Patch Quality (RQ1) Human generated patch: H Patch generated by Tool: A Original Repair Tool: To Modified Repair Tool: Tm Localizes Correct Line. A and H modify the same line Localizes Correct Function but Incorrect Line. A and H modify statements within the same function. Removes Less Functionality. A removes or skips over fewer lines of code from original program.
RQ1: Patch Quality mGenProg outperforms Genprog in: Localizes Correct Line Localizes Correct Function but Incorrect Line Removes Less Functionality Similar improvement in mSPR
RQ4: Comparison with GenProg without deletion (dGenProg) mGenProg outperforms dGenprog in: Localizes Correct Line Localizes Correct Function but Incorrect Line mGenProg removes less functionality than GenProg without deletion! dGenProg removes functionality via insertions of return/goto statements
RQ2: Reducing Manual Inspection Cost Figure 1: Number of Patches Found by SPR vs. mSPR 35 30 25 20 15 10 5 0 coreutils findutils grep make SPR mSPR Due to batch compilation, SPR may produce multiple patches Overall, SPR generates 87 patches, mSPR only generates 54 patches. All 33 additional patches are plausible but incorrect patches. Anti-patterns reduce manual inspection cost by eliminating nonsensical repairs.
RQ3: Speedup Repair Space Reduction (%) = 1-( ???????? ????? ?????? ?????????? ???????? ????? ?????? ??????????)*100 Repair Space Reduction (%) 70 61 mGenProg MSPR 60 47 50 43 43 43 42 41 38 37 37 36 35 40 32 31 31 26 30 22 20 19 15 20 9 10 0 0 Repair Time Speedup = ???????? ?????? ???? Modified Repair Time mGenProg s Average Repair Time Speedup: 1.42x mSPR s Average Repair Time Speedup: 1.69x Anti-patterns reduce overall repair time via repair space pruning.
Conclusions and Future Work Enforcing anti-patterns leads to patches that delete less functionality and better fix localization. Tools integrated with anti-patterns generate patches faster due to repair space reduction. Future Work Anti-patterns as specification for guiding repair Anti-patternsas selected code smells Tools integrated with anti-patterns can be used as fix localization tools Adapt anti-patterns to other search-based software engineering activities (e.g., specific code anti-patterns identifying energy hot-spots)