Understanding Lower Bounds in the Cell Probe Model

Slide Note
Embed
Share

Exploring the concept of lower bounds for static data structures, this content delves into the tradeoffs between query time and space efficiency. It discusses the need for proving lower bounds, the model of data structures, and how CPUs access memory in computational devices. Kasper Green Larsen from Aarhus University provides insights and examples in this educational material.


Uploaded on Sep 14, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Data Structure Lower Bounds in the Cell Probe Model Kasper Green Larsen Aarhus University Kasper Green Larsen 1/45

  2. Part I Static Data Structures Kasper Green Larsen 2/45

  3. Static Data Structures Goal: Represent a set of input data, such that queries about the data can be answered efficiently. Example: Represent the road network of Denmark (data), such that, given two cities (query), one can compute the distance between them (along shortest path). 2 Data Structure 1: Query time: 1 Space: n2 3 Data Structure 1: Precompute and store F C Dist(A,C) is: 1+3+3=7 all answers in table. E 15 3 B 2 1 Data Structure 2: Store graph. Run Dijkstra s when given query. 2 Data Structure 2: Query time: nlgn+m Space: m+n 4 A D Kasper Green Larsen 3/45

  4. Lower Bounds Static Data Structures: Tradeoffs between query time and space. Obvious question: Can the problem be solved more efficiently? Lower Bounds: Goal: Show achieved tradeoff is best possible, that is: There cannot exist a better data structure. Kasper Green Larsen 4/45

  5. This Lecture The following question: How do we prove lower bounds for static data structures? To answer this: We first need a model of what a data structure can do. Kasper Green Larsen 5/45

  6. A Model What do we want from our lower bounds? They must apply to PCs/MACs/SmartPhones etc.: They must apply regardless of programming language: Kasper Green Larsen 6/45

  7. A Model What is common to all these computational devices? Instructions are executed by a CPU. The CPU must access memory (RAM). Memory Access Kasper Green Larsen 7/45

  8. A Model What is the memory (RAM)? A set of S memory cells. Each stores w bits. Each has an address amongst 1, ,S. We assume w max{lgS,lgn}: A cell can address any input object or other cell (can store a pointer). In practice, we have w=32, or 64. Can address any other RAM cell. Kasper Green Larsen 8/45

  9. A Model What can a CPU do? Varies a lot between different architectures: Common: First CPU reads RAM cells into registers. It takes at least one clock-cycle to access a cell. Then: CPU executes instructions on registers. Sum, Multiplication, Bitwise AND, OR, XOR, etc. Comparisons: =0, x>y, x<0, etc. What do we want from our lower bounds? They should not depend on the instruction set! Any! Kasper Green Larsen 9/45

  10. The Cell Probe Model A static data structure is: A collection of S memory cells. Each cell stores w max{lgS,lgn} bits, and has an address amongst 1, ,S. The contents of the cells encodes the input data. Queries: Read up to t memory cells and announce answer based on contents (Enough registers to remember everything). Cell read at each step may depend arbitrarily on the query and the contents of the previously read cells. Kasper Green Larsen 10/45

  11. The Cell Probe Model Queries: Read up to t memory cells and announce answer based on contents (Enough registers to remember everything). Cell read at each step may depend arbitrarily on the query and the contents of the previously read cells. Formally: Depth-t decision tree for each query. Internal nodes: annotated with cell addresses. av Edges: annotated with cell contents. 00 11 01 10 al ai aj ak Leaf nodes: annotated with query answers. 8km Kasper Green Larsen 11/45

  12. The Cell Probe Model Queries: Read up to t memory cells and announce answer based on contents (Enough registers to remember everything). Cell read at each step may depend arbitrarily on the query and the contents of the previously read cells. Query Algorithm: Start at root. Internal node v: -Read cell of addr. av -Descent into child along edge matching cell contents. Leaf: -Return answer Formally: Depth-t decision tree for each query. av 00 11 01 10 al ai aj ak 8km Kasper Green Larsen 12/45

  13. The Cell Probe Model Queries: Read up to t memory cells and announce answer based on contents (Enough registers to remember everything). Cell read at each step may depend arbitrarily on the query and the contents of the previously read cells. Formally: Depth-t decision tree for each query. q2 q3 q4 q1 Kasper Green Larsen 13/45

  14. The Cell Probe Model The cell probe model [Yao 81]: Clean model. We lower bound the number of memory accesses, therefore also the number of clock-cycles needed. We assume the computer can perform arbitrary instructions on the cells. Lower bounds thus apply to all our computational devices: Kasper Green Larsen 14/45

  15. This Lecture The following question: How do we prove lower bounds for static data structures in the cell probe model? Kasper Green Larsen 15/45

  16. Time Line Static Lower Bounds Highest query time lower bounds. Assume linear space, ncqueries for c=O(1) and w= (lgn). 1995: Miltersen: t c. Kasper Green Larsen 16/45

  17. Communication Complexity Static Lower Bounds [Miltersen 95]: Reduction to communication game: Aarhus to Copenhagen. Communicates bits Bob receives road network (data). Alice receives two cities (query). Goal: Compute answer to Alice s query and minimize communication. Kasper Green Larsen 17/45

  18. Communication Complexity Static Lower Bounds [Miltersen 95]: Reduction to communication game: The distance is 305km. I need cell #13 #13 contains 0010 Aarhus to Copenhagen. I need cell #5 #42 contains 1110 Data Structure to Protocol: Bob builds data structure on his input. Alice simulates query algorithm on her query: For each cell needed: Ask Bob for contents. Bits send: Alice: tlgS Bob: tw Kasper Green Larsen 18/45

  19. Communication Complexity Static Lower Bounds [Miltersen 95]: Reduction to communication game: Bits send: Alice: tlgS Bob: tw Aarhus to Copenhagen. Using tools from communication complexity, prove: Either Alice sends A bits or Bob sends B bits. Implication: Either tlgS A or tw B, i.e. t min{A/lgS,B/w}. Kasper Green Larsen 19/45

  20. Communication Complexity Static Lower Bounds [Miltersen 95]: Reduction to communication game: Bits send: Alice: tlgS Bob: tw Aarhus to Copenhagen. Using tools from communication complexity, prove: Either Alice sends A bits or Bob sends B bits. We change our toy problem for simplest possible proof. Kasper Green Larsen 20/45

  21. The Problem Polynomial Evaluation: Input: Degree n-1 polynomial, P, over finite field. For this talk, field is prime field ?pfor p= (n2). Query: Element x in ?p, evaluate P(x). Example p=7, n=3: P(x) = 3x2+x+5. Query x=4, has answer: P(4) = (3*16+4+5) mod 7 = 1. Cell probe model with ? = ?(lg?). Kasper Green Larsen 21/45

  22. The Problem Polynomial Evaluation: Input: Degree n-1 polynomial, P, over finite field. For this talk, field is prime field ?pfor p= (n2). Query: Element x in ?p, evaluate P(x). The Key Property: Any degree n-1 polynomial over ?p is uniquely determined from n points on the polynomial. Same as over the real numbers. The answer to a query x in ?p gives a point on P, namely (x,P(x)). Kasper Green Larsen 22/45

  23. Communication Complexity Static Lower Bounds [Miltersen 95]: Alice receives x in ?p, Bob receives polynomial P. Recall p= (n2). Communicates bits x=5 Bits send: Alice: tlgS Bob: tw Using tools from communication complexity, prove: Either Alice sends lgp/4 bits or Bob sends nlgp bits. Implies tlgS>lgp/4 i.e. t= (lgn/lgS) or t= (nlgp/w)= (n). Kasper Green Larsen 23/45

  24. Communication Complexity A Rectangle Argument. Assume Alice sends < lgp/4 bits and Bob sends B bits where p= (n2). Row for the p queries of Alice. Column for pninputs of Bob. Entry (x,P) stores P(x). x Kasper Green Larsen 24/45

  25. Communication Complexity A Rectangle Argument. Assume Alice sends < lgp/4 bits and Bob sends B bits where p= (n2). Protocol implies rectangle with: p/2lgp/4= p3/4>> n rows. pn/2Bcolumns. Answer to each x fixed in rectangle. Why? Follow protocol and fix most likely message. Alice knows answer when no more communication. Alice says: 0 1 1 Bob says: 0 x Kasper Green Larsen 25/45

  26. Communication Complexity Subtle Detail: Alice knows answer at end of protocol. Normally assume both players do. For 1-bit output, easy: Announce answer when know. If both players know: Monochromatic rect. For poly eval: Answer takes lgp bits. As much as Alice s input! Can only assume Alice knows. Each row in rectangle is monochromatic/fixed. x Kasper Green Larsen 26/45

  27. Communication Complexity A Rectangle Argument. Assume Alice sends < lgp/4 bits and Bob sends B bits where p= (n2). Protocol implies rectangle with: p/2lgp/4= p3/4>> n rows. pn/2Bcolumns. Answer to each x fixed in rectangle. Observe: Any two polys can share no more than n-1 points (uniquely determined from n). Thus 2B pni.e. B > nlgp. x Kasper Green Larsen 27/45

  28. Communication Complexity Static Lower Bounds [Miltersen 95]: Alice receives x in ?p, Bob receives polynomial P. Recall p= (n2). Communicates bits x=5 Bits send: Alice: tlgS Bob: tw Using tools from communication complexity, we proved: Either Alice sends lgp/4 bits or Bob sends nlgp bits. Implies tlgS>lgp/4 i.e. t= (lgn/lgS) or t= (nlgp/w)= (n). Kasper Green Larsen 28/45

  29. Communication Complexity Static Lower Bounds [Miltersen 95]: Reduction to communication game: x Show: Either Alice sends A bits or Bob sends B bits. Implies: t min{A/lgS,B/w}. Space is at least linear, i.e. S n. Bound never better than t 2. Limitations: Trivial Protocol: Alice sends query to Bob. Bob answers. Costs lgp=2lgn bits. Implies: Cannot prove Alice must send more than 2lgn bits. t 2lgn/lgS is the highest bound that can be proved. Any DS problem with nc queries: Bound never better than t c. Kasper Green Larsen 29/45

  30. Time Line Static Lower Bounds Highest query time lower bounds. Assume linear space, ncqueries for c=O(1) and w= (lgn). 1995: Miltersen: t c. 2006: Patrascu and Thorup: t lgn/lglgn. Kasper Green Larsen 30/45

  31. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1=5 Communicates bits x2=13 x3=9 Bob receives polynomial (data). Alice receives k field elements (k queries). Goal: Compute answers to all of Alice s queries and minimize communication. Kasper Green Larsen 31/45

  32. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: The answers: 18, 1, 13. I need cells #13,#15,#24 x1=5 They store: 0010,1001,1111 x2=13 x3=9 Data Structure to Protocol: Bob builds data structure on his input. Alice simulates query algorithm on her queries in parallel: For the cells needed at each step: Ask Bob for contents. Kasper Green Larsen 32/45

  33. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1, ,xk Bits send: Alice: ktlg(S/k) Bob: ktw Data Structure to Protocol: Bob builds data structure on his input. Alice simulates query algorithm on her queries in parallel: For the cells needed at each step: Ask Bob for contents. The Trick: Set of k cells cheaper to specify than k individual cells: lg ? ? klg(S/k) bits, i.e. lg(S/k) bits per cell. Kasper Green Larsen 33/45

  34. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1, ,xk Show: Either Alice sends A bits or Bob sends B bits. Implies: t min{A/klg(S/k),B/kw}. Kasper Green Larsen 34/45

  35. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Assume: Alice sends <klgp/4 bits. Bob sends B bits. x1, ,xk Fixing most likely message: Rectangle with: /2klgp/4 rows. pn/2Bcolumns. Answer to each x1, ,xkfixed in rectangle. p ( ) k Kasper Green Larsen 35/45

  36. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Rectangle with: /2klgp/4 rows. pn/2Bcolumns. p ( ) k x1, ,xk How many distinct queries can Alice answer in rectangle? Each row distinct set of k queries. Kasper Green Larsen 36/45

  37. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Rectangle with: /2klgp/4 rows. pn/2Bcolumns. p ( ) k x1, ,xk Let U be union of all rows. Let X=|U|. Must have: p ( ) k X ( ) k /2klgp/4 Since each row is a k-sized subset of U. Kasper Green Larsen 37/45

  38. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Fact: (n/k)k (en/k)k ( ) k Rectangle with: /2klgp/4 rows. pn/2Bcolumns. ( ) k ( ) k n p ( ) k p X x1, ,xk /2klgp/4 (eX/k)k (p/k)k/2klgp/4 X p/(e2lgp/4)= p3/4/e > n. So > n distinct queries. Kasper Green Larsen 38/45

  39. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Rectangle with: /2klgp/4 rows. pn/2Bcolumns. p ( ) k x1, ,xk Each row set of k points. Assume Alice sends <klgp/4 bits. Then union of query sets has size X>n. Therefore Bob sends B nlgp. Kasper Green Larsen 39/45

  40. Communication Complexity Rectangle Argument for Polynomial Evaluation Let Alice have k elements in ?p. Assume p = (n2). Let Bob have degree n-1 polynomial P. Conclusion: Either Alice sends klgp/4 bits. Or Bob sends nlgp bits. x1, ,xk Kasper Green Larsen 40/45

  41. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1, ,xk Showed for any k: Either Alice sends klgp/4 bits or Bob sends nlgp bits. In protocol: Alice sends tklg(S/k) bits and Bob sends tkw bits. We can choose k: Fix k=nlgp/(2tw). Then Bob sends nlgp/2 bits. So Alice must send klgp/4 bits: tklg(S/k) klgp/4 t lgp/lg(Stw/(nlgp)). Kasper Green Larsen 41/45

  42. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1, ,xk The lower bound: t lgp/lg(Stw/(nlgp)) For w,lgp = (lgn): t lgn/lg(St/n) For linear space S=O(n): t lgn/lgt t lgn/lglgn Sw/nlgpis space overhead over linear space! Kasper Green Larsen 42/45

  43. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: x1, ,xk The lower bound: t lgp/lg(Stw/(nlgp)) Technical Remark: Must have k 1 for previous argument to go through. We chose k=nlgp/(2tw). So either t lgp/lg(Stw/(nlgp)) or t nlgp/(2w). Kasper Green Larsen 43/45

  44. Communication Complexity Static Lower Bounds [Patrascu and Thorup 06]: Refined reduction to communication game: Bits send: Alice: ktlg(S/k) Bob: ktw x1, ,xk Show: Either Alice sends A bits or Bob sends B bits. Implies: t min{A/klg(S/k),B/kw}. Limitations: Trivial Protocols: One player sends input to other. 1: Bob sends nlgp bits 2: Alice sends klgp bits. Implies: (1) implies must set k nlgp/(tw) n/t. (2) and k n/t implies t lgp/lg(St/n) is the peak. Space is at least linear, i.e. S n. Peaks at t lgp/lglgp. Kasper Green Larsen 44/45

  45. Time Line Static Lower Bounds Highest query time lower bounds. Assume linear space, ncqueries for c=O(1) and w= (lgn). 1995: Miltersen: t c. 2006: Patrascu and Thorup: t lgn/lglgn. 2012: Larsen: t lgn. Kasper Green Larsen 45/45

Related