Understanding the open-source P4 Compiler

Understanding the
open-source P4
16
 compiler
February 15, 2022
P4 developer days
Mihai Budiu (mbudiu-vmw)
mbudiu@vmware.com
What is this about?
P4 is a programming language for network dataplanes
Current spec at 
Open-source “reference” implementation:
 [Apache 2 license]
This presentation should explain how you can modify this compiler
Additional details in
https://github.com/p4lang/p4c/blob/main/docs/compiler-design.pptxhttp//github.com/p4lang/p4chttps://p4.org/p4-spec/docs/P4-16-v1.2.2.html
2
Compiler goals
Support evolution of P4
Support multiple back-ends
Generate code for ASICs, NICs, FPGAs, software switches, and other targets
Provide support for other tools (debuggers, IDEs, control-plane, etc.)
Extensible architecture:
 easy to add new:
Targets
Language extensions
Passes
Optimizations
3
What do I need to know to write compiler code?
C++ (11)
We use garbage-collected C++
Looks more like Java (no 
delete
)
How to install the sources and build the binaries
Your target
The P4
16
 spec
The separation between language and architecture model
The compiler intermediate representation (IR) structure
Compilation using visitors
4
In this
presentation
Presentation Outline
Compiler architecture
Intermediate Representation (IR)
Visitors
5
Compiler structure
Mid-end
Front-end
Back-end
P4
IR
IR
Target-specific
Target-independent
Same IR
IR with target-specific
extensions
6
Parser
IR
output
Structure
Mid-end
Front-end
Back-end
P4
IR
IR
main()
Fixed
Library of pre-built passes
Mix and match
Custom
7
frontends/p4/*
midend/*
backends/*
Nanopass architecture
Very simple passes
Lots of them
./p4c-ebpf 
–v
 ../testdata/p4-16-samples/count_ebpf.p4
FrontEnd_0_P4V1::getV1ModelVersion
ParseAnnotationBodies_0_ParseAnnotations
ParseAnnotationBodies_1_ClearTypeMap
FrontEnd_1_ParseAnnotationBodies
FrontEnd_2_PrettyPrint
MidEnd_26_ParsersUnroll
MidEnd_27_EvaluatorPass
MidEnd_28_MidEndLast
8
378 passes!
(not all different)
Syntactic sugar
P4 Language Layered Design
9
Core
language
Front-end/
mid-end
Core
language
Back-end
Target
language
Many language constructs are eliminated entirely in the front-end/mid-end
Syntactic sugar constructs are thus automatically supported by all back-ends
IR (Intermediate Representation)
ir/ir.h
ir/*.def
10
Intermediate Representation
Object-oriented
Immutable
Strongly-typed (hard to build incorrect programs)
Directed Acyclic Graph (DAG) structure
No parent pointers
Sub-DAGs can be shared
Manipulated almost exclusively by visitors
A
B
C
shared sub-DAG
11
*.def files
Provide the definition of the IR classes
def language is a custom very small sub-set of C++
Front-end IR classes are in 
ir/*.def
IR class hierarchy can be extended by backends
$ ls ../ir/*.def
ir/base.def  ir/expression.def  ir/ir.def  ir/type.def  ir/v1.def
12
All expressions
All types
P4-14
Base classes
Other classes
A fragment of expression.def
abstract 
Expression {}
abstract
 Operation : Expression {
    
virtual
 
int
 getPrecedence() 
const
 = 0;
    
virtual
 cstring getStringOp() 
const
 = 0;
}
abstract
 Operation_Unary : Operation {
    Expression expr;
}
class
 Neg : Operation_Unary {
    stringOp = "-";
}
13
abstract
 Operation_Binary : Operation {
    Expression left;
    Expression right;
}
class
 Add : Operation_Binary {
    stringOp = "+";
    precedence = DBPrint::Prec_Add;
}
Operation_Binary
Expression
Expression
left
right
Generated C++
namespace 
IR
 
{
class 
Expression : 
public
 Node { … }
class
 Operation : 
public
 Expression {
   virtual
 
int
 getPrecedence() 
const
 = 0;
   virtual
 cstring getStringOp() 
const
 = 0;
}
class
 Operation_Binary : 
public
 Operation {
 
public
:
    
const
 IR::Expression* left = nullptr;
    
const
 IR::Expression* right = nullptr;
}
}  // namespace
14
IR fields
Class hierarchy
Strongly typed
Everything
descends from
Node
Front-end IR
IR::Expression – base class for all expressions
IR::Type – base class for all types
IR::Statement – base class for all statements
IR::Declaration – base class for many declarations
IR::Type_Declaration – base class for declarations that are also types
A few other constructs
P4Parser, P4Table, P4Control, Function, etc.
15
Other IR Classes
IR::Node
 – base of whole class hierarchy
IR::Vector<T>
 where T is an 
IR::Node
IR::IndexedVector<T>
Like vector, but maintains a hash-table for 
IDeclarations
 for quick look-up by name
Rejects multiple declarations with the same name
IR::ID
Represents an identifier (including source position)
Stores both the original name (provided by user) and the new internal name
16
The IR tree and the P4 parser
The P4 parser is written in bison
frontend/parsers/p4/p4parser.ypp
It explains how P4 constructs are represented
lvalue '=' expression ';'
      { $$ = new IR::AssignmentStatement(@2, $1, $3); }
expression:
 | expression "+" expression
      { $$ = new IR::Add(@1 + @3, $1, $3); }
17
Understanding the IR
This may seem daunting
P4 grammar 
 IR (very close correspondence)
If you understand the language, you understand most of the front-end IR
However, a few IR classes have no direct correspondence with language
You can probably ignore these
18
IR <=> P4
IR serializable to a P4 program
Simplifies debugging and testing
To read the IR: just generate and read P4
Example:
p4c-ebfp 
--top4 MidEndLast
 […]count_ebpf.p4
Dumps program after pass 
MidEndLast
 is executed
Output written in 
count_ebpf-MidEnd_28_MidEndLast.p4
19
Generate IR As P4 Comments
$ p4c-ebfp 
--top4 MidEndLast
 -v
 […]count_ebpf.p4
 
/* <ParserState>(58300) */
    
state
 ip {
   
/* <MethodCallStatement>(49508)
        <MethodCallExpression>(49509)
          <Member>(49510) 
extract
            <PathExpression>(49511) 
p
          <Vector<Type>>(5314),size=1
            <Type_Name>(5315) 
IPv4_h
          <Vector<Argument>>(49516),size=1
            <Argument>(49517)
              <Member>(49518) 
ipv4
                <PathExpression>(49519) 
he
aders
 */
        p.extract<IPv4_h>(headers.ipv4);
    
/*<PathExpression>(16385) 
accept
 */
        
transition
 accept;
    }
20
ParserState
MethodCallExpression
PathExpression
Member
extract
Vector
Vector
PathExpression
Type_Name
Argument
Member
ipv4
PathExpression
 
accept
 
IPv4_h
 
p
 
headers
MethodCallStatement
Node IDs
Each IR node has a unique ID
 
/* <ParserState>(
58300
) */
Compiler is deterministic
Same node ids will be generated in every execution
So you can set up breakpoints in gdb based on node ids
21
Visitors
 
22
Visitor pattern
https://en.wikipedia.org/wiki/Visitor_pattern
“In object-oriented programming and software engineering, the visitor design
pattern is a way of separating an algorithm from an object structure on which it
operates. A practical result of this separation is the ability to add new operations
to existing object structures without modifying those structures.”
“Structure” = IR
“Algorithms” = program manipulations
23
Compiler visitor (
ir/visitor.h
)
24
IR
DAG
visitor
IR 
DAG
node0
node1
const
 IR::Node* node0;
Visitor visitor;
const
 IR::Node* node1 = node0->apply(visitor);
Most Useful Visitor Kinds
25
ir/pass_manager.h
These are all subclasses of the abstract class 
Visitor
Visitors
IR classes
IR
manipulations
(visitors)
Auto-generated
Write only what
you need
26
Chaining visitors
const IR::Node* node;
IR::Visitor v1, v2;
const IR::Node* intermediate = node->apply(v1);
const IR::Node* final = intermediate->apply(v2);
node
v1
intermediate
v2
final
27
A PassManager is a sequence of Visitors
const
 IR::P4Program *FrontEnd::run(
   
const
 CompilerOptions &options,
   
const
 IR::P4Program* program) {
 
PassManager passes({
        
new
 P4V1::getV1ModelVersion,
        
new
 ParseAnnotationBodies(&parseAnnotations, &typeMap),
        
new
 PrettyPrint(options),
        
new
 ValidateParsedProgram(),
        
new
 CreateBuiltins(),
     });
     
return
 program->apply(passes);
}
28
passes
Visitors modify subtrees
Pattern-match on a node
Generate a new DAG to replace the node
29
Original DAG
New DAG
Copy-on-write applied to the IR DAG
Input DAG
Original DAG
visitor
New DAG
Output DAG
30
Input DAG
Cloned nodes
Original nodes
IR DAG is traversed depth-first
const
 IR::Node *Inspector::apply_visitor(
const
 IR::Node *n) {
     
if
 (visited(n) && visitDagOnce) {
            // do nothing
     } 
else
 {
            
if (this->preorder(n)) {
                visit_children(n);
                this->postorder(n);
            }
            setVisited(n);
     }
     
return
 n;
}
n
parent
children
1
2 preorder
3
5 postorder
4
6
31
Example visitor declaration
class
 StrengthReduction 
final
 : 
public
 Transform {
 
public
:
    StrengthReduction() { visitDagOnce = true; }
    
const
 IR::Node* postorder(IR::Sub* expr) 
override
;
    
const
 IR::Node* postorder(IR::Add* expr) 
override
;
    
const
 IR::Node* postorder(IR::Shl* expr) 
override
;
    
const
 IR::Node* postorder(IR::Shr* expr) 
override
;
    
const
 IR::Node* postorder(IR::Mul* expr) 
override
;
};
modifies program
Types of nodes processed
Repeated nodes produce
the same result
32
Example visitor method
static
 
bool
 isZero(
const
 IR::Expression* expr) 
const
 {
    
auto
 cst = expr->to<IR::Constant>();
    
if
 (cst == nullptr) 
return
 false;
    
return
 cst->value == 0;
}
const
 IR::Node* StrengthReduction::postorder(
         IR::Add* expr) {
    
if
 (isZero(expr->right)) 
return
 expr->left;
    
if
 (isZero(expr->left)) 
return
 expr->right;
    
return
 expr;
}
Helper function
33
The original node
In each visitor method the Node* handed to the method is a 
clone 
of the original node
You can use the 
getOriginal()
 method to access the original node
const
 IR::Node* SubstitutionVisitor::preorder(IR::Type_Var* tv) {
    
auto
 type = bindings->lookup(
getOriginal
());
    
if
 (type == nullptr)
        
return
 tv;
    LOG1("Replacing " << getOriginal() << " with " << type);
    
return
 type;
}
tv is actually a temporary clone of the
getOriginal() node.
34
The context
Available in a Visitor.
A data structure that describes the
path from root to the current node.
getContext() has a pointer to the
parent and to the parent’s context
35
Current node
Context
ReferenceMap
Data structure mapping a name to its declaration
IR::Path generalizes a name
ReferenceMap core methods:
const IR::IDeclaration* getDeclaration(const IR::Path* path)
cstring newName(cstring base)
The pass ResolveReferences creates a ReferenceMap
36
const bit<8> x = 10;
struct S { bit<8> s; }
action a(in S w, out bit<8> z)
{
    z = x + w.s;
}
refMap->getDeclaration(n)
TypeMap
A data structure matching an IR node to its type
Types are IR data structures, but not part of the program IR
Created by the TypeInference pass
37
const bit<8> x = 10;
struct S { bit<8> s; }
action a(in S w, out bit<8> z)
{
    z = x + w.s;
}
Type_Bits(8)
Type_Struct S
Field: s
Vector
Deleting a DAG
In general you can just return
 nullptr
 from the visitor
.
const IR::Node* RemoveUnusedDeclarations::preorder(IR::P4Table* tbl) {
    if (!refMap->isUsed(getOriginal())) {
        ::warning("Table %1% is not used; removing", tbl);
        LOG3("Removing " << tbl);
        
tbl = nullptr
;
    }
    prune();
    return tbl;
}
38
In general you can return an 
IR::Vector<T> 
/ 
IR::IndexedVector<T>
const IR::Node* SpecializeBlocks::postorder(IR::P4Control* cont) {
    auto insertions = blocks->findInsertions(getOriginal());
    auto result = new IR::Vector<IR::Node>();
    result->push_back(cont);
    for (auto d : *insertions)
        result->push_back(d);
    return result;
}
Insert after current node
Inserting an IR::Node
Keep current node
39
Generating a different language (e.g. JSON)
Use an 
Inspector
Keep a 
std::map<
const
 IR::Node*, Util::IJson*> map
;
Or some other data structure where you can build your result.
For p4c-ebpf we generate a string that is the C source
void
 postorder(
const
 IR::Operation_Binary* expression) 
override
 {
    auto e = new Util::JsonObject();
    e->emplace("op", expression->getStringOp());
    auto l = get(map, expression->left);
    e->emplace("left", l);
    auto r = get(map, expression->right);
    e->emplace("right", r);
    map.emplace(expression, e); // actual result
}
40
Conclusions
P4 is a relatively simple language
The front-end eliminates some P4 constructs
Back-ends deal with a simpler language (a subset of P4)
The P4 compiler front-end is highly modular and extensible
Many back-ends have been built on top of it
There are 6 back-ends in the original repository
The IR is relatively close to P4 itself
Most passes are very simple and do just one thing
41
Backup slides
 
42
Debugging logs
Use the LOG*() macros to log internal data structures
the LOG macros call the dbprint() method on IR objects
LOG1("Replacing " << id << " with " << newid);
Logging is controlled from the command-line with the –T flag:
-Tnode:2,pass_manager:1
logs at level 2 in file node.cpp, and level 1 in pass_manager.cpp
E.g., -Tpass_manager:1 will print passes as they are executed
43
Dumping IR
::dump(const IR::Node*)
dumps the internal IR
representation of a node as
human-readable text
Node id
Node type (class)
Fields
Children are 
indented
[107] P4Program
  declarations: [14] IndexedVector<Node>
    [26] Type_Struct name=P
      annotations: [15] Vector<Annotation>
      fields: [16] IndexedVector<StructField>
        [21] StructField name=f1
          annotations: [17] Vector<Annotation>
          type: [20] Type_Bits size=32 isSigned=0
        [25] StructField name=f2
          annotations: [22] Vector<Annotation>
          type: [24] Type_Bits size=32 isSigned=0
    [37] Type_Struct name=T
      annotations: [27] IndexedVector<Annotation>
      fields: [28] Vector<StructField>
        [32] StructField name=t1
          annotations: [29] Vector<Annotation>
          type: [31] Type_Bits size=32 isSigned=1
        [36] StructField name=t2
          annotations: [33] Vector<Annotation>
          type: [35] Type_Bits size=32 isSigned=1
44
What is the hard part?
Keeping track of various node versions
New versions of nodes are created while transformations occur
Even nodes that you are not touching
the ancestors of the nodes you are touching
References to nodes may become stale
pointing to old versions of the nodes, no longer in the IR tree
so your carefully constructed maps may need to be reconstructed if you do 
anything
e.g., ReferenceMap, TypeMap
In general, you cannot run two Transforms in sequence if they use some
precomputed data structures, since the first will change the program and
invalidate the maps
45
Where am I (and how did I get here)
getContext()
 gives your (current) parent node
A node can have many parents
findContext<T>()
 will find an ancestor context of type T
const IR::Node* MoveInitializers::postorder(
                IR::Declaration_Variable* decl){
    if (
getContext
() == nullptr)
        return decl;
    auto parent = getContext()->node;
    if (!parent->is<IR::P4Control>() &&
        !parent->is<IR::P4Parser>())
        // We are not in the local toplevel declarations
        return decl;
node
context
context
node
node
parent
parent
grand
parent
parent
getContext
parent
46
Slide Note
Embed
Share

The open-source P4 compiler is a tool designed to support the evolution of P4 by generating code for various networks and devices. This presentation covers the goals of the compiler, what you need to know to write compiler code, the compiler architecture, intermediate representation, visitors, compiler structure, nanopass architecture, and more. With a focus on enhancing P4's capabilities, this resource aims to provide insights into modifying the compiler to align with specific requirements.

  • P4 Compiler
  • Programming Language
  • Network Dataplanes
  • Compiler Design
  • Compiler Architecture

Uploaded on Feb 22, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Understanding the open-source P416compiler February 15, 2022 P4 developer days Mihai Budiu (mbudiu-vmw) mbudiu@vmware.com

  2. What is this about? P4 is a programming language for network dataplanes Current spec at https://p4.org/p4-spec/docs/P4-16-v1.2.2.html Open-source reference implementation: http//github.com/p4lang/p4c [Apache 2 license] This presentation should explain how you can modify this compiler Additional details in https://github.com/p4lang/p4c/blob/main/docs/compiler-design.pptx 2

  3. Compiler goals Support evolution of P4 Support multiple back-ends Generate code for ASICs, NICs, FPGAs, software switches, and other targets Provide support for other tools (debuggers, IDEs, control-plane, etc.) Extensible architecture: easy to add new: Targets Language extensions Passes Optimizations 3

  4. What do I need to know to write compiler code? C++ (11) We use garbage-collected C++ Looks more like Java (no delete) How to install the sources and build the binaries Your target The P416 spec The separation between language and architecture model The compiler intermediate representation (IR) structure Compilation using visitors In this presentation 4

  5. Presentation Outline Compiler architecture Intermediate Representation (IR) Visitors 5

  6. Compiler structure IR with target-specific extensions Same IR Parser Front-end Mid-end Back-end P4 IR IR IR output Target-independent Target-specific 6

  7. Structure Library of pre-built passes Fixed Custom Mix and match Front-end Mid-end Back-end P4 IR IR main() frontends/p4/* midend/* backends/* 7

  8. Nanopass architecture Very simple passes Lots of them ./p4c-ebpf v ../testdata/p4-16-samples/count_ebpf.p4 FrontEnd_0_P4V1::getV1ModelVersion ParseAnnotationBodies_0_ParseAnnotations ParseAnnotationBodies_1_ClearTypeMap FrontEnd_1_ParseAnnotationBodies FrontEnd_2_PrettyPrint MidEnd_26_ParsersUnroll MidEnd_27_EvaluatorPass MidEnd_28_MidEndLast 378 passes! (not all different) 8

  9. P4 Language Layered Design Syntactic sugar Core language Front-end/ mid-end Core language Back-end Target language Many language constructs are eliminated entirely in the front-end/mid-end Syntactic sugar constructs are thus automatically supported by all back-ends 9

  10. IR (Intermediate Representation) ir/ir.h ir/*.def 10

  11. Intermediate Representation A Object-oriented Immutable Strongly-typed (hard to build incorrect programs) Directed Acyclic Graph (DAG) structure No parent pointers Sub-DAGs can be shared Manipulated almost exclusively by visitors B C shared sub-DAG 11

  12. *.def files Provide the definition of the IR classes def language is a custom very small sub-set of C++ Front-end IR classes are in ir/*.def IR class hierarchy can be extended by backends $ ls ../ir/*.def ir/base.def ir/expression.def ir/ir.def ir/type.def ir/v1.def All expressions Base classes Other classes All types P4-14 12

  13. A fragment of expression.def abstract Operation_Binary : Operation { Expression left; Expression right; } abstract Expression {} abstract Operation : Expression { virtualint getPrecedence() const = 0; virtual cstring getStringOp() const = 0; class Add : Operation_Binary { stringOp = "+"; precedence = DBPrint::Prec_Add; } } abstract Operation_Unary : Operation { Expression expr; } Operation_Binary class Neg : Operation_Unary { left right stringOp = "-"; } Expression Expression 13

  14. Everything descends from Node Generated C++ namespace IR{ class Expression : publicNode { } class Operation : public Expression { virtualint getPrecedence() const = 0; virtual cstring getStringOp() const = 0; } class Operation_Binary : public Operation { public: const IR::Expression* left = nullptr; const IR::Expression* right = nullptr; } } // namespace Class hierarchy IR fields Strongly typed 14

  15. Front-end IR IR::Expression base class for all expressions IR::Type base class for all types IR::Statement base class for all statements IR::Declaration base class for many declarations IR::Type_Declaration base class for declarations that are also types A few other constructs P4Parser, P4Table, P4Control, Function, etc. 15

  16. Other IR Classes IR::Node base of whole class hierarchy IR::Vector<T> where T is an IR::Node IR::IndexedVector<T> Like vector, but maintains a hash-table for IDeclarations for quick look-up by name Rejects multiple declarations with the same name IR::ID Represents an identifier (including source position) Stores both the original name (provided by user) and the new internal name 16

  17. The IR tree and the P4 parser The P4 parser is written in bison frontend/parsers/p4/p4parser.ypp It explains how P4 constructs are represented lvalue '=' expression ';' { $$ = new IR::AssignmentStatement(@2, $1, $3); } expression: | expression "+" expression { $$ = new IR::Add(@1 + @3, $1, $3); } 17

  18. Understanding the IR This may seem daunting P4 grammar IR (very close correspondence) If you understand the language, you understand most of the front-end IR However, a few IR classes have no direct correspondence with language You can probably ignore these 18

  19. IR <=> P4 IR serializable to a P4 program Simplifies debugging and testing To read the IR: just generate and read P4 Example: p4c-ebfp --top4 MidEndLast [ ]count_ebpf.p4 Dumps program after pass MidEndLast is executed Output written in count_ebpf-MidEnd_28_MidEndLast.p4 19

  20. Generate IR As P4 Comments ParserState $ p4c-ebfp --top4 MidEndLast -v [ ]count_ebpf.p4 /* <ParserState>(58300) */ state ip { /* <MethodCallStatement>(49508) <MethodCallExpression>(49509) <Member>(49510) extract <PathExpression>(49511) p <Vector<Type>>(5314),size=1 <Type_Name>(5315) IPv4_h <Vector<Argument>>(49516),size=1 <Argument>(49517) <Member>(49518) ipv4 <PathExpression>(49519) headers */ p.extract<IPv4_h>(headers.ipv4); /*<PathExpression>(16385) accept */ transition accept; } MethodCallStatement PathExpression MethodCallExpression accept Member extract Vector Vector PathExpression Type_Name Argument Member ipv4 IPv4_h p PathExpression headers 20

  21. Node IDs Each IR node has a unique ID /* <ParserState>(58300) */ Compiler is deterministic Same node ids will be generated in every execution So you can set up breakpoints in gdb based on node ids 21

  22. Visitors 22

  23. Visitor pattern https://en.wikipedia.org/wiki/Visitor_pattern In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure on which it operates. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures. Structure = IR Algorithms = program manipulations 23

  24. Compiler visitor (ir/visitor.h) node0 node1 visitor IR DAG IR DAG const IR::Node* node0; Visitor visitor; const IR::Node* node1 = node0->apply(visitor); 24

  25. Most Useful Visitor Kinds ir/pass_manager.h These are all subclasses of the abstract class Visitor Visitor Description Simple read-only visitor that does not modify any IR nodes, just collects information. Inspector Full transformation visitor. Transform Combines several visitors, run in a sequence. PassManager Repeats a sequence of visitors until the IR DAG stops changing. PassRepeated 25

  26. Visitors IR classes Add Subtract VarDecl Parser Control Header Base Auto-generated ConstantFolding IR manipulations (visitors) DefUseAnalysis DeadCode Write only what you need Inlining 26

  27. Chaining visitors const IR::Node* node; IR::Visitor v1, v2; const IR::Node* intermediate = node->apply(v1); const IR::Node* final = intermediate->apply(v2); v1 v2 node final intermediate 27

  28. A PassManager is a sequence of Visitors const IR::P4Program *FrontEnd::run( const CompilerOptions &options, const IR::P4Program* program) { PassManager passes({ new P4V1::getV1ModelVersion, new ParseAnnotationBodies(&parseAnnotations, &typeMap), new PrettyPrint(options), new ValidateParsedProgram(), new CreateBuiltins(), }); return program->apply(passes); } passes 28

  29. Visitors modify subtrees Pattern-match on a node Generate a new DAG to replace the node Original DAG New DAG 29

  30. Copy-on-write applied to the IR DAG Input DAG Output DAG Input DAG visitor Original DAG New DAG Cloned nodes Original nodes 30

  31. IR DAG is traversed depth-first const IR::Node *Inspector::apply_visitor(const IR::Node *n) { if (visited(n) && visitDagOnce) { // do nothing } else { if (this->preorder(n)) { visit_children(n); this->postorder(n); } setVisited(n); } return n; } 6 1 parent 2 preorder n 5 postorder 3 4 children 31

  32. Example visitor declaration Repeated nodes produce the same result modifies program class StrengthReduction final : public Transform { public: StrengthReduction() { visitDagOnce = true; } const IR::Node* postorder(IR::Sub* expr) override; const IR::Node* postorder(IR::Add* expr) override; const IR::Node* postorder(IR::Shl* expr) override; const IR::Node* postorder(IR::Shr* expr) override; const IR::Node* postorder(IR::Mul* expr) override; }; Types of nodes processed 32

  33. Example visitor method Helper function staticbool isZero(const IR::Expression* expr) const { auto cst = expr->to<IR::Constant>(); if (cst == nullptr) return false; return cst->value == 0; } const IR::Node* StrengthReduction::postorder( IR::Add* expr) { if (isZero(expr->right)) return expr->left; if (isZero(expr->left)) return expr->right; return expr; } 33

  34. The original node In each visitor method the Node* handed to the method is a clone of the original node You can use the getOriginal() method to access the original node const IR::Node* SubstitutionVisitor::preorder(IR::Type_Var* tv) { auto type = bindings->lookup(getOriginal()); if (type == nullptr) return tv; LOG1("Replacing " << getOriginal() << " with " << type); return type; } tv is actually a temporary clone of the getOriginal() node. 34

  35. The context Available in a Visitor. A data structure that describes the path from root to the current node. getContext() has a pointer to the parent and to the parent s context Context Current node 35

  36. ReferenceMap const bit<8> x = 10; struct S { bit<8> s; } action a(in S w, out bit<8> z) { z = x + w.s; } refMap->getDeclaration(n) Data structure mapping a name to its declaration IR::Path generalizes a name ReferenceMap core methods: const IR::IDeclaration* getDeclaration(const IR::Path* path) cstring newName(cstring base) The pass ResolveReferences creates a ReferenceMap 36

  37. TypeMap const bit<8> x = 10; struct S { bit<8> s; } action a(in S w, out bit<8> z) { z = x + w.s; } Type_Bits(8) Type_Struct S Vector Field: s A data structure matching an IR node to its type Types are IR data structures, but not part of the program IR Created by the TypeInference pass 37

  38. Deleting a DAG In general you can just return nullptr from the visitor. const IR::Node* RemoveUnusedDeclarations::preorder(IR::P4Table* tbl) { if (!refMap->isUsed(getOriginal())) { ::warning("Table %1% is not used; removing", tbl); LOG3("Removing " << tbl); tbl = nullptr; } prune(); return tbl; } 38

  39. Inserting an IR::Node In general you can return an IR::Vector<T> / IR::IndexedVector<T> const IR::Node* SpecializeBlocks::postorder(IR::P4Control* cont) { auto insertions = blocks->findInsertions(getOriginal()); auto result = new IR::Vector<IR::Node>(); result->push_back(cont); for (auto d : *insertions) result->push_back(d); return result; } Keep current node Insert after current node 39

  40. Generating a different language (e.g. JSON) Use an Inspector Keep a std::map<const IR::Node*, Util::IJson*> map; Or some other data structure where you can build your result. For p4c-ebpf we generate a string that is the C source void postorder(const IR::Operation_Binary* expression) override { auto e = new Util::JsonObject(); e->emplace("op", expression->getStringOp()); auto l = get(map, expression->left); e->emplace("left", l); auto r = get(map, expression->right); e->emplace("right", r); map.emplace(expression, e); // actual result } 40

  41. Conclusions P4 is a relatively simple language The front-end eliminates some P4 constructs Back-ends deal with a simpler language (a subset of P4) The P4 compiler front-end is highly modular and extensible Many back-ends have been built on top of it There are 6 back-ends in the original repository The IR is relatively close to P4 itself Most passes are very simple and do just one thing 41

  42. Backup slides 42

  43. Debugging logs Use the LOG*() macros to log internal data structures the LOG macros call the dbprint() method on IR objects LOG1("Replacing " << id << " with " << newid); Logging is controlled from the command-line with the T flag: -Tnode:2,pass_manager:1 logs at level 2 in file node.cpp, and level 1 in pass_manager.cpp E.g., -Tpass_manager:1 will print passes as they are executed 43

  44. Node id Node type (class) Dumping IR [107] P4Program declarations: [14] IndexedVector<Node> [26] Type_Struct name=P annotations: [15] Vector<Annotation> fields: [16] IndexedVector<StructField> [21] StructField name=f1 annotations: [17] Vector<Annotation> type: [20] Type_Bits size=32 isSigned=0 [25] StructField name=f2 annotations: [22] Vector<Annotation> type: [24] Type_Bits size=32 isSigned=0 [37] Type_Struct name=T annotations: [27] IndexedVector<Annotation> fields: [28] Vector<StructField> [32] StructField name=t1 annotations: [29] Vector<Annotation> type: [31] Type_Bits size=32 isSigned=1 [36] StructField name=t2 annotations: [33] Vector<Annotation> type: [35] Type_Bits size=32 isSigned=1 ::dump(const IR::Node*) dumps the internal IR representation of a node as human-readable text Fields Children are indented 44

  45. What is the hard part? Keeping track of various node versions New versions of nodes are created while transformations occur Even nodes that you are not touching the ancestors of the nodes you are touching References to nodes may become stale pointing to old versions of the nodes, no longer in the IR tree so your carefully constructed maps may need to be reconstructed if you do anything e.g., ReferenceMap, TypeMap In general, you cannot run two Transforms in sequence if they use some precomputed data structures, since the first will change the program and invalidate the maps 45

  46. Where am I (and how did I get here) parent getContext() gives your (current) parent node A node can have many parents findContext<T>() will find an ancestor context of type T grand parent node context parent const IR::Node* MoveInitializers::postorder( IR::Declaration_Variable* decl){ if (getContext() == nullptr) return decl; auto parent = getContext()->node; if (!parent->is<IR::P4Control>() && !parent->is<IR::P4Parser>()) // We are not in the local toplevel declarations return decl; node parent parent context node 46

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#