Networking and Formal Methods Review
Delve into the intricacies of networking, focusing on routing layers, sample network configurations, transport abstractions, and TCP headers for network verification. Explore how data is transmitted across interconnected systems and the importance of formal methods in ensuring robust network communication.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
LECTURE 2: NETWORKING + FORMAL METHODS REVIEW George Varghese 1
Internet Headers Data | HTTP | Dest Port | Src Port| Prot | IP Dest | IP Source | MAC Dest | Mac Src Application IP TCP Ethernet
Focus on Routing Layer Every station has a 32 bit IP address that is used to reach it We use dotted decimal notation to make addresses readable. 128.97.1.2. Each number between dots represents a byte Stations in a common sub-network often share the same first few bits, a prefix. UCLA engineering has 128.97.*
Sample Network Core 1 128.97.2.* EE 128.97.1* CSE CS Router EE Router EE Web Server 128.97.2.6 Core 2 Your laptop 128.97.1.5
Sample Network Core 1 128.97.2.* EE 128.97.1* CS CS Router EE Router EE Web Server 128.97.2.6 Core 2 Your laptop 128.97.1.5
From the Apps Perspective 128.97.2.6 128.97.1.5 Web Server Browser RESPONSE UCLA Network GET TCP TCP
Transport Abstraction: Connected queues 128.97.2.6 128.97.1.5 Web Server Browser RESPONSE GET TCP TCP Queue Queue
Transport Abstraction: Connected queues, ports for multiplexing 128.97.2.6 128.97.1.5 Port 3154 Web Server Port 21 Browser FTP RESPONSE Queue 4 Queue 3 TCP TCP Queue 1 Queue 2 Port 16581 Port 80
TCP Header For network verification these are the only TCP fields that matter as they identify the application and so routers/firewalls use these fields for forwarding
IP Forwarding (Data Plane) 128.97.2.* Core 1 128.97.2.* 128.97.2.* EE 128.97.1* CS CS Router EE Router 128.97.2.6 EE Web Server 128.97.2.6 Core 2 Your laptop 128.97.1.5 128.97.2.*
Controlling Forwarding with ACLS 128.97.2.* Core 1 128.97.2.* 128.97.2.* EE 128.97.1* CS CS Router EE Router EE Web Server 128.97.2.6 128.97.2.6 Core 2 Your laptop 128.97.1.5 Drop if Dest Post = 1433 128.97.2.* A famous attack called Slammer targeted the SQL Port 1433. After that it is common to block this port
IP Forwarding + Load Balancing 128.97.2.* Core 1 128.97.2.* 128.97.2.* EE 128.97.1* CS CS Router EE Router 128.97.2.6 EE Web Server 128.97.2.6 Core 2 Your laptop 128.97.1.5 128.97.2.* When a router has equal cost routes, it forward a packet based on a hash function on the complete header (including TCP Ports): ECMP hashing
Router Model Modified Output ACLs Input ACLs Dest Port = 1433-> Drop
ACL Syntax Different for different vendors Logically, a conjunction of predicates on IP and TCP fields and an action For example, Dest IP = 129.97.* and Dest Port = 1433 Forward to Low Priority Queue Most routers allow 100-1000 of them. When multiple match, first one wins
Other interesting forwarding data planes Bridges: Forward based on MAC addresses MPLS Switches: Forward Based on small indexes to simplify lookups All of these different types (IP Prefix Matching, ACLs, MPLS, Bridges can be modelled as a sequence of rules (predicates + actions) A router dataplane can be abstracted a collection of rules on headers Insight behind design of Software Defined Networks but can be used also for verification.
Building Routes the control plane Core 1 128.97.2.*, cost 1 128.97.2.*, cost 2 128.97.2.* EE 128.97.1* CS CS Router EE Router EE Web Server 128.97.2.6 128.97.2.*, cost 5 Core 2 128.97.2.*, cost 1 Cost 4 Your laptop 128.97.1.5 128.97.2.* When a router gets multiple routes it picks, the shortest route installs it in its forwarding table and sends a new advertisement to its neighbors
Building Routes equal cost routes Core 1 128.97.2.*, cost 1 128.97.2.*, cost 2 128.97.2.* EE 128.97.1* CS CS Router EE Router EE Web Server 128.97.2.6 128.97.2.*, cost 2 Core 2 128.97.2.*, cost 1 Cost 1 Your laptop 128.97.1.5 128.97.2.* When a router gets multiple shortest path routes it installs them all into its Forwarding Table allowing ECMP load balancing
Building Routes on failure Core 1 128.97.2.*, cost 1 128.97.2.* EE 128.97.1* CS CS Router EE Router EE Web Server 128.97.2.6 128.97.2.*, cost 5 Core 2 128.97.2.*, cost 1 Cost 4 Your laptop 128.97.1.5 128.97.2.*
More complex control planes The slides is a model of a simple control plane, such as RIP for shortest path routes But often in data centers, one wants more complex choices of routes. For example, within Google, Google data centers may be allowed on some links Leads to the common use of a protocol called BGP even within organizations, between ISPs
BGP Routing: Beyond shortest path Route1 (p, . .) Route2 (p, . .) LP = 80 LP = 120 Route Processing Policy Static Route For p Static Routes take precedence Then come local preferences at this router (higher wins) Then comes some form of path length And more . . .
Configuration Files Lots of things in routing done automatically by control plane But managers want to steer traffic Managers get to set knobs to do so: for example ACLs in data plane, BGP parameters like LocPreference These are stored in so-called Config files. It is a form of manual router-by-router programing of routes
Recall Data Plane Bug: Setting parsimonious ACLs I POLICY Deny Any C UDP Internet and Compute can communicate Internet cannot send to controllers B Allow Any C Allow C Any D X E F H G DNS Services are now blocked! 26
Recall Control Plane Config Bug Core C via up C via up Static Route: C via M B2 B1 C via up C via M management network Data Center Network M Buggy static route causes B1 to propagate wrong route to C. Works fine till . . . Specification: ? routing messages received PropagatedRoute (B1, e) = PropagatedRoute (B2, e)
Lets summarize problem again Shortest Path D S Manual Configurations: Managers override default shortest paths for security, load balancing, and economic reasons Data Plane + Control Plane: Vendor-specific knobs in both Problem: Manually programming individual routers to implement global policy leads to cloud failures 28
THE CREATIVE HABIT George Varghese 29
How do you warm up? Find your own ritual to start creative work? Basketball players Stravinsky A Cook Yours?
Introduction to Formal Methods Data Plane Verification Data Plane Synthesis Control Plane Verification Control Plane Synthesis INTRODUCTION TO FORMAL METHODS FOR NOVICES (LIKE ME)
Main Idea: Network as a Program Model header as point in high dimensional space and all networking boxes as transformers of header space 3 1 Packet Forwarding 2 Match Action + + 0xx1..x1 11xx..0x Send to port 3 Rewrite with 1xx011..x1 Rewrite with 1x01xx..x1 Send to port 2 ROUTER ABSTRACTED AS SET OF GUARDED COMMANDS . . NETWORK BECOMES A PROGRAM CAN USE PL METHODS FOR VERIFICATION, TESTING, AND SYNTHESIS 32
Semantics Based Landscape (partial) Hoare style proofs (1960s-1980s): axioms, inference rules, invariants : scales poorly, hard for humans Model checking: search algorithmically over state space to check P. e.g., NuSMV, Impact, IC3 Proof Assistants: Humans help but system can check proofs. E.g., Coq, Isabelle. Testing: Symbolic execution to maximize coverage and check property failures (e.g., Klee, SAGE)
Pros and Cons Model checking: SMV, nuSMV, Impact, IC3 Pros: push-button verification ; Cons: finite state systems (e.g., fixed network). Hard to scale (needs abstraction, equivalencing, compression) Proof Assistants: Coq, Isabelle. Pros: Much harder work (must supply invariants) Cons: works for infinite state systems (e.g., all networks) and ideally needs methods to suggest invariants Testing: Pros: Can uncover bugs in large code without models Cons: Incomplete (state space too large), imprecise
To apply these techniques Need to have: Examples: a modelling language for networks state machines, Datalog, Propositional Logic a specification language for correctness CTL, Modal Logic, Datalog, a system, data structures to encode program and data and a solver NuSMV using BDDs for routers and packets
Reactive System/State Machine ?,????,? ? set of states ???? ? set initial states ? ? ? transitions ?3 ?2 ?0 ?1 ?0 ?1 ?2 Runs: ?0,?1,?2, , ?0 ???? ??,??+1 ?,? = 0,1,2, ?0 ?1 ?3 ?2 Computation Tree: ? describes edges in a tree of all behaviors. ?2 ?0 ?1 ?3
A Sample Network N1 R2 R1 src:10* dst: 01* src:10* dst:*** A B src:1** dst:*** src:1** dst:***; dst[1] := 0 D src:*** dst:1** R3 Program: Packets matching source IP 101 or 100, and destination IP 010 or 011 are forwarded from R1 to R2 States: Headers at Ports : (h, p) Initially: All possible packets is at Port A State Transitions: Router forwarding rules move packets between ports
Networks as Reactive Systems ?????,?????,??????,????? ???:????? 2????? - output ports ????? = ?.? ? ?????,? ??? ? } - node + out port ?????:????? ????? - link to next node @?.? ????? Such that ? = ????? ?.? ,? ???(? ) - ? ?????? ????? - ???? is a set packet injection points @? .? ?????? ????? ?????? ?????
CTL Specifications A for all branches, E for some branch. F if one some state in branch, G for all states in branch E F z x A F y x E G x x,y x, y, z y Example: Al-Sharer 2010: VoIP phones cannot communicate with laptop s?? = ? ???? ? ??? = ? ?????? ? ???(?) ?? (??? ? )
Model Checking Encoding of N1 10* *** 10* 01* A B 1** *** dst[1] := 0 1** *** D *** 1** Which packets can reach B from A? ? ???,???,??? ,??? , , = ?1 ?12 ?? = ?2 = ?1 ?13 ?? = ?3 = ?3 ?3? ?? = ? = ?3 ?32 ???0 = ?2 = ?2 ?2? ?? = ? ???? = ?, ?????????? ?, Check Check ??( = ?) - EF : There exists a path such that
Model Checking Evolution Explicit Model Checking: Works for protocols, doesn t scale to many state bits. Symbolic Model Checking: Works on sets of states encoded compactly with BDDs Bounded Model Checking: Unrolls state machine k times and uses SAT solver to verify or find counterexample Impact, IC3 (interpolants, inductive properties): Search for inductive invariants using SAT/SMT solvers
Binary Decision Diagrams From A. Ragunathan, Purdue University
BDDs for Networking Folks Compact encoding for Boolean Functions (beyond SMC) Examples uses: [Al-Sharer 10] : model router forwarding state machine [Yang 13]: model sets of headers Pros: Compositional: Unions, Intersections, Canonical form Works well with some model checkers (NuSMV) Code Available: e.g., BuDDy , CUDD Library Generalizations to non-Booleans available Con: Sensitive to variable ordering
SAT as another useful subroutine Used in BMC but of independent interest! Propositional SAT: Variables, negation, and, or Example: x1 x2, Satisfiable with x1 = TRUE Augmented with constraints, functions: SMT SMT is at least NP hard! But modern heuristic SMT solvers are very fast! Can reduce network verification tasks to SAT Many good solvers. Nikolaj promotes Z3!
Data Plane as Logic Circuit SAT [Malik] Combin ational Logic ?1 ?1() ?3 ?3() ?1 ?2 ?2() ?3 ?2 NetworkFormula: ? ?,? = ?1 ?2 ?3 Model it as a combinational logic circuit? Outputs and signals are functions of only the present value of the inputs If so can use SAT solvers and not (harder) state machine verifiers
SAT Encoding of N1 10* *** 10* 01* A B 1** *** dst[1] := 0 1** *** D *** 1** Which packets can reach B from A? ? ???,???,??? ,??? , , = ?1 ?12 ?? = ?2 = ?1 ?13 ?? = ?3 = ?3 ?3? ?? = ? = ?3 ?32 ???0 = ?2 = ?2 ?2? ?? = ? 0= ? ? ???0,???0,???1,???1, 0, 1 ( 1= ? (? ???1,???1,???2,???2, 1, 2 ( 2= ? (? ???2,???2,???3,???3, 2, 3 3= ?)))
Encoding reachable states with Datalog Declare facts as predicates: Parent (john, aju), Parent (aju, tim) Declare inference rules: Head holds if body Ancestor (X, Y) :- parent (X, Y) Ancestor (X, Y) :- parent (X, Z), ancestor (Z, Y) State queries ?- Ancestor (john, X) System runs to a fixed point and returns all variables john is an ancestor of (aju, tim) Useful if we can state analysis problem recursively and want all solutions not just one. Restrictions eff. decision procedure
Datalog Encoding of N1 10* *** 10* 01* A B 1** *** dst[1] := 0 1** *** D *** 1** Which packets can reach B from A?
ENCODING IN A REAL SOLVER (Z3) N1 encoded into Bounded Model Checking (SAT) and Datalog using Z3 s Python API
SAT Encoding of N1 10* *** 10* 01* A B 1** *** dst[1] := 0 1** *** D *** 1** Which packets can reach B from A? ? ???,???,??? ,??? , , = ?1 ?12 ?? = ?2 = ?1 ?13 ?? = ?3 = ?3 ?3? ?? = ? = ?3 ?32 ???0 = ?2 = ?2 ?2? ?? = ? 0= ? ? ???0,???0,???1,???1, 0, 1 ( 1= ? (? ???1,???1,???2,???2, 1, 2 ( 2= ? (? ???2,???2,???3,???3, 2, 3 3= ?)))