
Understanding Decompilers and Phases of Decompilation
Learn about decompilers, phases like Loading, Disassembly, Lifting, Dataflow Analysis, and more. Explore reverse compilation techniques in this comprehensive resource collection.
Presentation Transcript
Resources: Cifuentes' thesis, "Reverse Compilation Techniques". I will be posting other resources on the website.
Phases of Decompilation: 1. Loading; 2. Disassembly; 3. Lifting; 4. Dataflow Analysis; 5. Type Inference; 6. CodeGen; 7. Name Recovery???
Loading: You've done this!
Disassembly: You've done this too!
Lifting: Intermediate Representation (IR). A concise description of instruction semantics. Three-argument IR; examples: BAP, VEX. Abstracts across architectures. Common abstractions? Representing memory? Registers? Organized in basic blocks.
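To make the lifting slide concrete, here is a minimal sketch of turning two-operand, x86-style instructions into three-argument IR statements. The `IRStmt` type, the `lift` function, and the tiny instruction set are all made up for illustration; they are not the BAP or VEX formats. Note how the flag side effect of `add` becomes its own explicit statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRStmt:
    """One three-argument statement: dst = op(lhs, rhs). Hypothetical format."""
    dst: str
    op: str
    lhs: str
    rhs: str


def lift(mnemonic, dst, src):
    """Lift one two-operand instruction into a list of IR statements."""
    if mnemonic == "mov":
        return [IRStmt(dst, "copy", src, "")]
    if mnemonic in ("add", "sub"):
        return [
            IRStmt(dst, mnemonic, dst, src),  # the arithmetic itself
            IRStmt("ZF", "eq", dst, "0"),     # implicit flag update made explicit
        ]
    raise NotImplementedError(mnemonic)


# One basic block: mov rax, rbx ; add rax, 8
block = [("mov", "rax", "rbx"), ("add", "rax", "8")]
ir = [stmt for insn in block for stmt in lift(*insn)]
for stmt in ir:
    print(stmt)
```

A real lifter would model every flag, memory, and operand width; the point here is only that one machine instruction can expand into several simple IR statements.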
Data Flow Analysis: Constant Propagation; Calling Convention Analysis; Condition Code Propagation; Register Copy Elimination; Stack Frame Analysis; Dead Register & Condition Code Elimination; Variable Recovery.
Constant Propagation: resolving indirection; deobfuscation.
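As a rough illustration of how constant propagation resolves indirection, the sketch below runs a single forward pass over one basic block. The tuple IR `(dst, op, a, b)`, the op names, and the example addresses are assumptions made for this example, and only `add` is folded.

```python
def propagate_constants(block):
    """Forward pass over one basic block of (dst, op, a, b) tuples.

    Returns the rewritten block and the constants known at the end.
    """
    known = {}
    out = []
    for dst, op, a, b in block:
        # Replace operands whose value is already known.
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "const":
            known[dst] = a
        elif op == "add" and isinstance(a, int) and isinstance(b, int):
            # Fold the addition into a constant.
            op, a, b = "const", a + b, None
            known[dst] = a
        else:
            # dst now holds something we cannot evaluate.
            known.pop(dst, None)
        out.append((dst, op, a, b))
    return out, known


example = [
    ("rax", "const", 0x401000, None),
    ("rbx", "const", 0x20, None),
    ("rcx", "add", "rax", "rbx"),
    ("pc", "jmp", "rcx", None),  # indirect jump through rcx
]
resolved, known = propagate_constants(example)
print(resolved[-1])
```

After the pass, the jump operand is a concrete address instead of a register, which is the "resolving indirection" case. A full analysis would also join facts across basic blocks.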
Variable Recovery (stack layout figure; slot labels: ra, var_10, var_40, var_44, var_x, var_y).
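One simple way to get names like var_10 and var_44 is to group frame-pointer-relative memory accesses by offset. The sketch below assumes accesses are given as `(base, offset, size)` tuples and that locals sit at negative offsets from rbp; both the input format and the `var_<hex offset>` naming scheme are assumptions for illustration.

```python
def recover_stack_variables(accesses):
    """Map rbp-relative accesses to named stack slots.

    Each access is (base_register, offset, size_in_bytes).
    Returns {name: largest access size seen at that slot}.
    """
    slots = {}
    for base, offset, size in accesses:
        if base != "rbp" or offset >= 0:
            continue  # only locals below the frame pointer are handled here
        slots[offset] = max(size, slots.get(offset, 0))
    return {f"var_{-off:x}": size for off, size in sorted(slots.items())}


accesses = [
    ("rbp", -0x10, 8),
    ("rbp", -0x40, 4),
    ("rbp", -0x44, 4),
    ("rbp", -0x10, 4),  # same slot, narrower access
    ("rsp", 0, 8),      # ignored in this sketch
]
variables = recover_stack_variables(accesses)
print(variables)
```

This ignores arrays, structs, and overlapping accesses, which is where real variable recovery gets hard.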
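Dead Condition Code Elimination: important once code is lifted to IR, because every flag update becomes an explicit statement and most of those flags are never read.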
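A minimal sketch of dead condition code elimination, assuming a lifted block where each statement is `(dst, uses, text)` and flag writes are explicit. It walks the block backwards and drops any flag write that is not read before the flag is overwritten. The statement format and the default assumption that no flags are live at the end of the block are both simplifications for this example.

```python
FLAGS = {"ZF", "SF", "CF", "OF"}


def eliminate_dead_flags(block, live_out=frozenset()):
    """Backward liveness over one block; removes unread flag writes."""
    live = set(live_out)
    kept = []
    for dst, uses, text in reversed(block):
        if dst in FLAGS and dst not in live:
            continue  # nobody reads this flag value: drop the write
        live.discard(dst)   # this statement defines dst
        live.update(uses)   # and reads its operands
        kept.append((dst, uses, text))
    kept.reverse()
    return kept


lifted = [
    ("rax", ("rax", "rbx"), "rax = rax + rbx"),
    ("ZF", ("rax",), "ZF = (rax == 0)"),
    ("CF", ("rax",), "CF = carry"),
    ("rcx", ("rcx",), "rcx = rcx - 1"),
    ("ZF", ("rcx",), "ZF = (rcx == 0)"),
    ("pc", ("ZF",), "if ZF goto target"),
]
cleaned = eliminate_dead_flags(lifted)
for _, _, text in cleaned:
    print(text)
```

Only the last ZF write survives, because it is the only one the branch reads.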
Stack Frame Analysis: Abstract domain: {rsp, rbp} × {1, 2, …}, plus two special elements (presumably ⊤ and ⊥). Track constant offsets off of rbp and rsp. Determine an overapproximation of the stack frame size at all program points in the function.
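The idea of tracking constant offsets from rsp can be sketched on straight-line code as below. The instruction encoding (`"sub_rsp"`, `"add_rsp"`, and so on) and the 8-byte word size are assumptions for this example. It handles one path only; a real analysis would join offsets at control flow merges and fall back to an "unknown" element when they disagree.

```python
def stack_frame_size(insns, word=8):
    """Track rsp relative to function entry and return the deepest extent.

    Each instruction is (mnemonic, operand); operand is unused for push/pop.
    """
    offset = 0    # rsp minus its value at function entry
    deepest = 0   # most negative offset seen so far
    for mnemonic, operand in insns:
        if mnemonic == "push":
            offset -= word
        elif mnemonic == "pop":
            offset += word
        elif mnemonic == "sub_rsp":
            offset -= operand
        elif mnemonic == "add_rsp":
            offset += operand
        deepest = min(deepest, offset)
    return -deepest


function_body = [
    ("push", None),      # push rbp
    ("sub_rsp", 0x40),   # sub rsp, 0x40
    ("push", None),      # push rbx
    ("pop", None),       # pop rbx
    ("add_rsp", 0x40),   # add rsp, 0x40
    ("pop", None),       # pop rbp
]
frame_size = stack_frame_size(function_body)
print(frame_size)
```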
Control Flow Structuring: if (…) { while (…) { goto lbl; } } else if (…) { … } else { if (…) { lbl: for (…) { … } } }
Control Flow Structuring approaches: basic pattern matching; Cifuentes' thesis; Phoenix; No More Gotos. This is the focus of next week.
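As a taste of the basic pattern matching approach, the sketch below recognizes two shapes in a control flow graph: an if-then-else diamond and an if-then triangle. The CFG encoding (a dict from block name to successor list) and the function names are assumptions for this example, and it is far simpler than what the Cifuentes thesis, Phoenix, or No More Gotos actually do.

```python
def predecessors(cfg, node):
    """All blocks that have an edge into node."""
    return [n for n, succs in cfg.items() if node in succs]


def match_if(cfg, head):
    """Return a description of the conditional rooted at head, or None."""
    succs = cfg.get(head, [])
    if len(succs) != 2:
        return None
    then_blk, else_blk = succs
    t_next = cfg.get(then_blk, [])
    e_next = cfg.get(else_blk, [])
    # Diamond: both arms have one successor, the same join block.
    if (len(t_next) == 1 and t_next == e_next
            and predecessors(cfg, then_blk) == [head]
            and predecessors(cfg, else_blk) == [head]):
        return {"cond": head, "then": then_blk, "else": else_blk, "join": t_next[0]}
    # Triangle: the then-arm falls through to the other successor.
    if t_next == [else_blk] and predecessors(cfg, then_blk) == [head]:
        return {"cond": head, "then": then_blk, "else": None, "join": else_blk}
    return None


diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
triangle = {"A": ["B", "C"], "B": ["C"], "C": []}
print(match_if(diamond, "A"))
print(match_if(triangle, "A"))
```

When no pattern matches, a pattern-based structurer has to emit a goto, which is the situation the nested example with `goto lbl` illustrates.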
Type Inference: Early approaches were unification-based and rely on the equality constraint Type(a) == Type(b). Problems: overunification; not expressive enough (missing fields, pointer passing). TIE: subtyping! Relies on subtyping constraints, e.g. Type(a) <: Type(b), instead of equalities. Retypd: polymorphic types! existential types! (i.e. generics and interfaces)
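To see why equality constraints overunify, here is a small union-find sketch of unification-based inference. The `Unifier` class, the variable names, and the type names are all hypothetical. Every copy between two values forces them into the same type class, so a buffer that is both a `char*` and passed as a `void*` argument produces a conflict that a subtyping constraint would accept.

```python
class Unifier:
    """Union-find over type variables, with at most one type label per class."""

    def __init__(self):
        self.parent = {}
        self.label = {}

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def unify(self, a, b):
        """Impose Type(a) == Type(b)."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        la, lb = self.label.get(ra), self.label.get(rb)
        if la and lb and la != lb:
            raise TypeError(f"cannot unify {la} with {lb}")
        self.parent[ra] = rb
        if la:
            self.label[rb] = la

    def assign(self, v, type_name):
        root = self.find(v)
        existing = self.label.get(root)
        if existing and existing != type_name:
            raise TypeError(f"cannot unify {existing} with {type_name}")
        self.label[root] = type_name

    def type_of(self, v):
        return self.label.get(self.find(v))


u = Unifier()
u.unify("a", "b")        # b = a
u.unify("b", "c")        # c = b
u.assign("a", "int32_t")
print(u.type_of("c"))

v = Unifier()
v.unify("buf", "memcpy_arg0")    # buf is passed as the first argument
v.assign("memcpy_arg0", "void*")
try:
    v.assign("buf", "char*")     # also used as a char*
    conflict = False
except TypeError:
    conflict = True
print(conflict)
```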
Codegen: Now we have functions, IR, etc. But how do we get C? Translate the semantics of the IR and the control flow into pseudocode.
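Once control flow is structured, codegen is mostly a tree walk. The sketch below assumes a tiny, made-up AST of nested tuples (`"seq"`, `"assign"`, `"while"`, `"if"`) with expressions already rendered as strings, and prints C-like pseudocode from it.

```python
def emit(node, indent=0):
    """Return a list of C-like source lines for one AST node."""
    pad = "    " * indent
    kind = node[0]
    if kind == "assign":
        _, dst, expr = node
        return [f"{pad}{dst} = {expr};"]
    if kind == "seq":
        return [line for child in node[1] for line in emit(child, indent)]
    if kind == "while":
        _, cond, body = node
        return [f"{pad}while ({cond}) {{", *emit(body, indent + 1), f"{pad}}}"]
    if kind == "if":
        _, cond, then, other = node
        lines = [f"{pad}if ({cond}) {{", *emit(then, indent + 1)]
        if other is not None:
            lines += [f"{pad}}} else {{", *emit(other, indent + 1)]
        lines.append(f"{pad}}}")
        return lines
    raise ValueError(f"unknown node kind: {kind}")


program = ("seq", [
    ("assign", "var_10", "0"),
    ("while", "var_10 < 10", ("assign", "var_10", "var_10 + 1")),
])
lines = emit(program)
print("\n".join(lines))
```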
Future Directions: Name Recovery. Now we have variables, but their names don't mean much! Can we automate recovering variable names? DEEP LEARNING!!! Examples: DIRE, Debin.