Typed Assembly Language and Type Inference in Program Compilation
The provided content discusses the significance of typed assembly languages, certifying compilers, and the role of type inference in program compilation. It emphasizes the importance of preserving type information for memory safety and vulnerability prevention. The effectiveness of type inference methods, capable of type checking C# features, is highlighted alongside challenges in existing compilers. Additionally, it addresses issues such as broken C# pseudo-assembly and the need for trustworthy proof checkers.
Uploaded on Sep 28, 2024 | 0 Views
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Inferable Object Typed Assembly Language Ross Tate, Juan Chen, Chris Hawblitzel Inferable Object- -Oriented Typed Assembly Language Oriented
Typed Assembly Languages Compilers are great C# but they make mistakes and can introduce vulnerabilities Typed assembly language Certifying Compiler includes a proof of (memory) safety verified by a trusted proof checker no need to trust the compiler Certifying compilers TAL generate typed assembly language Trusted Proof Checker traditionally use type-preservation
Source Program sigs types/proofs IR1 annots Intermediate Representation Optimizations/Conversions Type-Preserving Compiler IR2 sigs types/proofs Class/Function Signatures annots Type/Proof Annotations Optimizations/Conversions sigs types/proofs annots x86 Proof Checker Burden to preserve types at each stage Hard to adopt in existing compilers Types/proofs increase size of executable
Source Program IR1 sigs Signature information is already preserved in traditional compilers Optimizations/Conversions Traditional Compiler IR2 sigs Optimizations/Conversions Easy to change compiler to write sig info to file x86 sigs Requires little change Smaller annotation size Type Inference Infer proof annotations Can inference be effective enough? x86 sigs annots Proof Checker
Effectiveness of Type Inference % of Inferable Methods 96.1%97.9% 100.0% 100.0% 100.0% 97.6% 97.4% 94.6% 100% 80% 60% 40% 20% 0% ahcbench mandelform sat_solver zinger lcscbench bartok asmlc Geomean Capable of type checking all C# features except: Exceptions and Delegates matters of implementation, not due to theoretical limitations
Typing C# Assembly Typing C# Assembly
Broken C# Pseudo-Assembly Could actually be an ArrayList Could actually be a LinkedList bool bad(a, b : List) { vt = a.vtable; mp = vt.isEmpty; c = mp(b); return c; } a s implementation of isEmpty may fail to work on b Grabs a s vtable Grabs a s implementation of isEmpty Calls a s isEmpty with b as this
Broken C# Pseudo-Assembly a and b are each instances of some (possibly different) Traditional TAL [PLDI 08] More specific function signature subclass of List bool bad(a, b : List) { bool bad(a, b : List. Ins( )) { vt = a.vtable; mp = vt.isEmpty; c = mp(b); return c; } c = mp(pack b as . Ins( )); open a as Ins( ); must be fresh a is given type exactly Ins( ) where List Pseudo-instruction for the type checker Via signature & memory layout information vt is given type VTable( ) mp is given type ( . Ins( )) bool open b as Ins( ); must be fresh b is given type exactly Ins( ) where List The this pointer must belong to Checks that there is some extending such that b has type Ins( ) Check fails since b has type Ins( ) and does not extend
Broken C# Pseudo-Assembly Traditional TAL [PLDI 08] Inferable TAL bool bad(a, b : List. Ins( )) { open a as Ins( ); No pack annotations vt = a.vtable; mp = vt.isEmpty; open b as Ins( ); c = mp(pack b as . Ins( )); No open annotations No loop invariants! c = mp(b); return c; Use type inference instead }
Inference Strategy Always open existential types as soon as possible Use subtyping in place of pack: : Given a valid substitution of variables Such that the bodies are subtypes after substitution Subsumes using open and pack [ ] . . Use abstract interpretation over existential types Then the existential types are subtypes Requires subtyping and join algorithms Subtyping alone of bounded existential types is undecidable! Constructive: includes abstract algorithms for inference Instructive: specifies type design guidelines Designed a category-theoretic framework for existential types
Type Checking with iTalX Inference Strategy Immediately open a and b List Ins( ) has fields: vtable : VTable( ) isEmpty : ( ) bool Check Fails does not extend bool bad(a, b : List. Ins( )) { vt = a.vtable; mp = vt.isEmpty; c = mp(b); return c; } Signature Information , : List, List. a : Ins( ) b : Ins( ) vt : VTable( ) mp : ( . Ins( )) bool Signature Information List VTable( ) has fields: Type Check Ins( ) . Ins( )
Evaluation of iTalX Evaluation of iTalX
Expressiveness of iTalX iTalX is capable of handling the following features: Classes, interfaces, generics, and multiple inheritance Dynamic dispatch and dynamic casts Covariant arrays as classes, and array-bounds checks By-reference parameters (ref), structs, and value types Jump tables and complex stack manipulation iTalX is also robust with respect to many optimizations iTalX should be able to handle the remaining features: Delegates and exceptions In experiments, iTalX currently verifies 97.9% of methods
Efficiency of iTalX Inference Time/Compilation Time 100% 80% 60% 34.7% 40% 16.0%8.2% 13.9% 13.7% 20% 9.3% 1.9% 1.3% 0% ahcbench mandelform sat_solver zinger lcscbench bartok asmlc Geomean Inferring Assembly-Level Types is Affordable
Type Annotation Size iTalX/Traditional TAL [PLDI 08] 100% 79% 80% 59% 56% 60% 40% 40% 40% 28% 23% 23% 20% 0% Type annotation size is significantly reduced
Implementation Burden of TAL Type Preservation [PLDI 08] Assembly-Level Type Inference Changes to an Existing Compiler (Bartok) 19,000 lines of code cut across code base 5,000 lines of code modular addition to code base Type Checker + Type Inference 13,800 lines of code 15,100 lines of code could be separated to reduce trusted computing base
Conclusion Type inference at the assembly level is expressive enough to verify C# with optimizations flexible enough to accommodate new language features efficient enough to use regularly during compilation compact enough to include in executable binaries modular enough to retrofit existing compilers with Thank You! Thank You!