Automatic Generation and Validation of Instruction Encoders and Decoders
This study presents a novel framework for formalizing instruction formats to support automatic verification of mutual inversion frameworks. It addresses challenges posed by the variety and complexity of instruction formats, focusing on 32-bit X86 format and providing a detailed exploration of the framework's contributions, design, and evaluation. The proposed framework leverages CSLED specification language, patterns for describing instruction structures, and algorithms for validation and synthesis of proofs, emphasizing the generation of encoders and decoders. The implementation includes over 5.2k lines of C++ code and 1k lines of Coq library, contributing significantly to the field of automatic generation and validation of instruction encoders and decoders.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Automatic Generation and Validation of Instruction Encoders and Decoders Xiangzhe Xu, Jinhua Wu, Yuting Wang, Zhenguo Yin, Pengfei Li Shanghai Jiao Tong University, Shanghai, China Computer Aided Verification (CAV) 2021
Motivation Background: Manipulation of Machine Code Machine Instructions Abstract Instructions Compilers & Assemblers ?????? Machine Code Add EAX, EBX 001001010 ?????? Binary Analysis Tools Problem: Mutual Inversion between Encoding and Decoding ? ?,?????? ? = ? ?????? ? = ? 2
Challenges Variety and Complexity of Instruction Formats Example: 32-bit X86 Instruction Format 1 byte ModRM ModRM 1 byte Opcode Opcode Opcode 1 byte SIB 4 byte 4 byte Tokens: 0x03 ModRM 0x1C 0x18 Opcode 0x90 0x03 SIB 0x8C Immediate Displacement NOP NOP Fields: Mod 00 00 RegOp 011 011 RM 100 000 Scale 10 Index 001 Base 100 EBX , [4 ADD ADD EBX EBX , , [4 * * ECX , ADD ADD EBX ESP] ESP] [ [EAX EAX] ] ECX + + 3
Contribution A Novel Framework for Formalizing Instruction Formats that Supports Automatic Verification of Mutual Inversion FrameworksSupport MutualInversion Automation Formalization Instr. Fmt. SLED YES YES NO NO RockSalt YES YES YES NO Narcissus NO YES YES NO Our Framework YES YES YES YES 6
The Framework Instruction Formats CSLED Specification Language Patterns Describing Instruction Structures Unambiguous Relations Between Abstract and Binary Instructions On Paper Encode Instruction Specifications In CSLED Input Generation of Encoders and Decoders Generating Definitions in Coq Algorithms In C++ Validation of Mutual Inversion Synthesizing Proofs in Coq Resolve Verification Conditions in SMT Solvers Proofs of Mutual Inversion In Coq Encoder Decoder 7
Evaluation & Conclusion Implementation Algorithms: 5.2k lines of C++ Coq Library: 1k lines of Coq (version 8.11.0) SMT Solver: Z3 with Theory of Bit-Vectors Test Case: 186 Representative X86-32 Instructions Sufficient for Assembling All X86-32 Instructions in CompCert 260 Lines of Specifications in CSLED Generates 45k Definitions and 42k Proofs in Coq Artifact available at: https://zenodo.org/record/4720184#.YNBOZGgzZPY 8