Understanding Branching Programs and Decision Trees in Computation
Branching Programs and Decision Trees serve as simple models for computation, exploring space and time complexity trade-offs. The dual terminology, encompassing Switching Networks to Binary Decision Diagrams, reveals the evolution of these models in complexity theory and formal verification.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Branching Programs Paul Beame University of Washington
Branching Programs are related to a very simple way to think about computation...
Decision Trees (Query Algorithms) ?? ? ? ?1 ?? ?? ? ? ? ? ?? ?? ?? ?? ? ? ? ? ? ? ? ? ?? ?? ?? ?? ?? ?? ?? ?? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 1 Decision Tree for ??????(??,??,??,??)
Decision DAGs (Branching Programs/BDDs) ?? ? ? ?? ?? ? ? ? ? ?? ?? ? ? ? ? ?? ?? ? ? ? ? 0 1 Decision DAG for ??????(??,??,??,??)
Dual Terminology Nondeterministic version is the oldest Switching Networks (Shannon 1949) First explicit lower bound by (Ne iporuk 1966) Branching Programs (Masek 1976) Goal: A simple combinatorial model capturing Space (and Time) complexity Multi-way/output (Borodin-Cook 1982) for Time-Space tradeoffs Barrington s Theorem (1986) Most popular terminology in complexity theory Binary Decision Diagrams (Akers 1978, Boute 1976) Goal: Convenient representation of Boolean functions Adopted by (Bryant 1986) for restricted version: Ordered Binary Decision Diagrams (OBDDs) often incorrectly abbreviated simply as BDDs also known as Trellises in coding theory Usually more restricted versions than Branching Programs Most popular terminology in formal verification
Modeling Space-Bounded Computation Read-only Input ? ?? ?is a bit ? ?1 1 ? ?2 2 ? ?3 3 ? ?4 4 . . . . . . . . . . ? ?? ? Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Write-only Output y y1 1 y y2 2 y y3 3 y y4 4 . . . . . . y y? ? Turing Machines
Modeling Space-Bounded Computation Read-only Input ? ?? ?has ? ?(log bits (log ? ?) ) ? ?1 1 ? ?2 2 ? ?3 3 ? ?4 4 . . . . . . . . . . ? ?? ? R(1) ? ?R(10) if R(5)=0 then R(5) 4 R(7) R(R(4))*R(2) for R(1)=1 to R(0) do R(R(2)) R(R(2))+R(1) ? ?R(13) R(7) Space/c log log ? ? . . 0 . . 1 . . 2 . . 3 . . 4 . . 5 . . . . . . . . . . . . Write-only Output y y1 1 y y2 2 y y3 3 y y4 4 . . . . . . y y? ? Random-Access Machines (RAMs)
Modeling Space-Bounded Computation Read-only Input ? ?? ?is a bit ? ?1 1 ? ?2 2 ? ?3 3 ? ?4 4 . . . . . . . . . . ? ?? ? Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Write-only Output y y1 1 y y2 2 y y3 3 y y4 4 . . . . . . y y? ? Configuration = snapshot of machine operation
Modeling Space-Bounded Computation Read-only Input ? ?? ?has ? ?(log bits (log ? ?) ) ? ?1 1 ? ?2 2 ? ?3 3 ? ?4 4 . . . . . . . . . . ? ?? ? R(1) ? ?R(10) if R(5)=0 then R(5) 4 R(7) R(R(4))*R(2) for R(1)=1 to R(0) do R(R(2)) R(R(2))+R(1) ? ?R(13) R(7) Space/c log log ? ? . . 0 . . 1 . . 2 . . 3 . . 4 . . 5 . . . . . . . . . . . . Write-only Output y y1 1 y y2 2 y y3 3 y y4 4 . . . . . . y y? ? Configuration = snapshot of machine operation
Full Configuration Graphs for Machine M M Graph ? ?M M,? ? depending on input size? ?-not on input One node per space-?(?)-configuration of M M Labeled by input location ??read in this configuration One out-edge per possible value of ??leading to next configuration of M M (depending on ??value) Start node is initial configuration Halting configurations are sink nodes # of nodes is 2 2O(?(? ?))for both Turing machines and RAMs w.l.o.g. add a step counter to make ? ?M M,? ?acyclic increases space by at most constant factor -
Branching programs Start Time ? = length(longest path) ?? ? = (?,?,?,?, ) ??= ? ??= ? ? ? Space ? = log2(# nodes) ?? ?? ? ? Levels ? ? ??= ? Multiway: inputs ?? ? ?? ?? ?? ? ??= ? ??= ? ? ? ? ? ?? ?? ? ? ?? ?? ?? ?,? ? ? ? ? Levelled w.l.o.g. ??= ? ??= ? ??= ? 1 0 Sink node Multi-output version
Time and Space Theorem:? is computable in time ?(?) and space ? ? log2? by a Turing machine or RAM ? on inputs of length ? can be represented by a Branching Program (BP) with time ?(?) and space ?(?(?)) Corollary:? does not have a polynomial-size BP ? is not computable in space ?(log?) Note: Branching programs are much more powerful than TMs or RAMs since they can perform arbitrary non-local computations on their storage in one time-step. What bounds do we know?
Simple Bounds & Key Questions Decision trees are branching programs For functions ? ? defined on {?,?}? Time ? and Size ?? so Space ? Space bounds ? satisfy log2? ? ?* since it takes space log2? just to read the input. Key questions for branching programs: Space (size) lower bounds? Tradeoff lower bounds between time ? and space ?? Special classes of branching programs with good properties? *For multi-way BP, Size ?? so ? ?log2?
Outline 1. Branching program basics 2. Space (size) lower bounds 3. Multi-output functions Time-Space tradeoff lower bounds for general BPs 4. Single-output functions Restricted classes of BPs OBDDs, Read-once (FBDDs), Oblivious, Read-k Lower bound methods for restricted classes Lower bound methods for general BPs Applying tradeoffs: BPs and static data structures 5. Multi-output functions using single-output techniques Time-Space tradeoff lower bound for encoding good codes
Shannon Lower Bound Lemma: The # of different functions ?:{?,?}? {?,?} computed by Boolean BPs of size ? is at most (???)? Proof: Each Boolean BP of size ? is specified by Variables for each of its nodes: ?? Node names reached by 0 and 1 outedges: ??(? ?) Name of the root: ? Overcounting: ?! different labellings of each BP Total at most ?? ?? ? ? +?/?! (???)? Theorem: Almost all functions ?:{?,?}? {?,?} require BP size ?? ? ? Proof: The # of different functions is ??? To capture at least a (? ?/?) fraction of all functions, need (???)? ???/?; i.e. ? ?? ?? ? (hence space ? = ?(?)). ?? ? ? ?
Subfunctions: Restricting a branching program Set ??= ? ?? ? ? Restricted BP computes a subfunction ?? ?? ? ? ? ? Can repeat to get other subfunctions ?? ?? ?? ? ? ? ? ? Remaining BP has subset of original nodes with restricted variables removed ? ?? ?? ?? ?,? ? ? ? ? 1 0
Element Distinctness and its many subfunctions Defn: For ??,??, ,?? ??= ?,?? log? ? ???,???,??, ,?? = 1 1 iff all ?? are different Subfunctions of ???,?? where all but ??are restricted: The constant function 0 if any equal pair is fixed ?? ?? ?different functions, Otherwise ? ? depending on which values ?? should avoid.
Neiporuk Lower Bound Theorem: Any BP for ???,??: ?,??? ??? ? {0,1} requires size ? ??which is ? ??/????? {0,1} Proof: Let ? be a BP for ???,?? Break up input into blocks ??, ,?? of ?log? bits each ?? is one of the ? numbers input to ???,??. Let ?? be # of nodes of ? with variables in ?? by Shannon argument, for different restrictions outside ??, at most (??log? ??)?? distinct subfunctions for ???,?? on ?? but this is at least ?? ? so ?? ?/?. ? ? choices for ? ? so total size is ??/?. Still largest size lower bound known for any explicit function
Time-Space Tradeoffs for Multi-output Functions
Time-Space Tradeoffs Lower bounds proven for multi-output functions, e.g. Sorting: ?????,??: ?? ? ?? ? Matrix-vector product for certain matrices ? ??:??? ??? given by ??? = ?? for ? ???x? e.g. Discrete Fourier Transform Convolution & Integer multiplication Matrix Multiplication Generalized string matching Universal hash function computation
Time-Space Upper Bounds for DFT Theorem [Savage-Swamy 1978]: For every space bound ? log?? the DFT can be computed in time ? = ?(??/? + ? log?) log ? Sketch: In ?(?) time can compute ? outputs using space ? + log( ? ?) and a single Pass over the input. ?(?/?) passes in total.
Key Property of the DFT Matrix [Yesha 1984] Every ? ? submatrix (not necessarily contiguous) of the DFT matrix for? ?/? and ? ??/? has full rank ?. ? ??/? columns ? ?/? rows For an ? ? matrix if this holds for all ? ?? and ? (? ?)? then we call the matrix ?-hard
Key Lemma Defn: Matrix ? ???x? is ?-hard iff every sub-matrix of ? ? with ? ?? ?rows and at least (? ?)? columns has full rank ?. Key Lemma: If matrix ? ???x? is ?-hard then for any fixed assignment a to at most ?? entries of ? ? and any fixed output assignment b on ? ? [? ?] with |? ?|= ? ? ?? ?, Pr[b? ?=(? ?? ?)? ?|? ? agrees with a] 1 1/? ?? ? Proof: fixed fixed rank ? ? = free
Time-Space Tradeoff Lower Bound Theorem[Abrahamson 89]: For every space bound ? log??, any BP computing ??? = ?? for an ? ? matrix ? that is ?-hard requires time ? = ?(??/?). Corollary: Computing the DFT requires ? = ?(??/?) Proof method: Introduced by [Borodin-Cook 1982] for Sorting: Use # of output values produced as measure of progress If ? ? is small then For each input, can identify large progress in a small piece of BP defined by a BP node Show: each piece can have large progress for only a few inputs So... many nodes needed
Borodin-Cook Lower Bound Method Break BP into layers of ?? time steps Consider behavior on input ? trace trace(?) = sequence of nodes at boundaries between layers visited by input ? ?? T T ? There are ? = The BP produces ? outputs for every input ? So, every input ? has some layer ? ?producing ? = ?/? = ???/? outputs on input ? 1. ?? layers 2. I.e., the mini-BP of length ?? rooted at node v? ?in trace trace(?) produces ? outputs on input ? At most ?? such mini-BPs do this for all inputs
Borodin-Cook Lower Bound Method So... Every input ? ? has some mini-BP of length?? that produces ? = ?/? = ???/? correct outputs for input ? ? At most ?? such mini-BPs together must do this for all inputs Lemma: If matrix ? ???x? is ?-hard then for ? ? ?? ?any mini-BP of length ??produces ? correct outputs for at most a ? ? fraction of all inputs Proof of Theorem given Lemma: Only ?? mini-BPs, each working for only ? ???/? fraction of all inputs. So ?? ? ???/? ?. That is ? ??? (log ?)/?.
Borodin-Cook Lower Bound Method Lemma: If matrix ? ???x? is ?-hard then for ? ? ?? ?, any mini-BP of length ??produces ? correct outputs for at most a ? ? fraction of all inputs. Proof: Each input follows a unique path. So suffices to show this for each path in the mini-BP separately, conditioned on following that path. A path fixes ? output values and ?? input values so it follows from the Key Lemma: Key Lemma: If matrix ? ???x? is ?-hard then for any fixed assignment a to at most ?? entries of ? ? and any fixed output assignment b on ? ? [? ?] with |? ?|= ? ? Pr[b? ?=(? ?? ?)? ?|? ? agrees with a] 1 1/? ?? ? ?? ?,
All you really need A notion of tough outputs, and A probability distribution on input vectors s.t. ? ?tough outputs] 2 2-? ?min 1. Pr [? ? has For any assignment a to assignment b to ? ? ?maxtough output positions Pr [?(?) agrees with b|? ? agrees with a] 1/ ? input positions and any 2. 1/? ?? ? Then for every ? between ?min and ? ?max you get ? = ?(?? (log ?)/?) Note: The space bound is only enforced every ? steps
Some Time-Space Tradeoffs Lower bound of roughly ? = ?(??/?) for Sorting, Unique Elements [Borodin-Cook 1982, B 1991] Matrix-vector product and DFT [Abrahamson 1991] Convolution & Integer multiplication [Abrahamson 1991] Generalized string matching [Abrahamson 1987] Decoding good error-correcting codes [Gr nemeier 2006] Exact frequency moments in sliding windows [B-Clifford-Machmouchi 2013] Lower bound of ? = ?(??/?) for Universal hash function computation [Mansour-Nisan-Tiwari 1993] Lower bound of ??= ?(??/?) for ?x? Matrix multiplication [Abrahamson 1991]
Some Open Problems Prove an ?(?2 2) size lower bound for an explicit Boolean function Find time-space tradeoff lower bounds for other multi-output functions, e.g. Encoding asymptotically good error-correcting codes. [Bazzi-Mitter 2005] conjectured ? = ?(??/?) They showed ? = ? ? ??? ??? ??? ? ? ? ? which we will prove later. Element distinctness in sliding windows [B-Clifford-Machmouchi 2013]? = ?(??/?/??/?) ?