Understanding Skyline Computation Algorithms and Expected Size Analysis
Explore topics related to skyline computation algorithms, skyline point generation, and expected skyline size analysis. Learn about Pareto optimality, dependent points generation, and the expected size of skylining points in a random setting. Delve into the algorithms, probabilities, and complexities involved in computing skyline points efficiently.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Backwards Backwards Analysis Analysis Gerth St lting Brodal Pre-talent track activity in Algorithms, Department of Computer Science, Aarhus University, January 29, 2020
Pareto Pareto optimal / non optimal / non- -dominated dominated points / skyline points / skyline No points Pareto optimal Dominated points Vilfredo Pareto (1848 1923) Skyline
Exercise Exercise 1 (skyline 1 (skyline construction construction) ) Given n points (x1, y1), ..., (xn, yn) in R2 Give an algorithm for computing the points on the skyline ? What is the running time of your algorithm ? skyline Skyline
Problem Problem expected expected skyline skyline size size ? ? Consider n points (x1, y1), ..., (xn, yn) in R2 Each xiand yiis selected independently and uniformly at random from [0, 1] What is the expected skyline size ? 1 0 0 1 Skyline size
Exercise Exercise 2 (dependent points) 2 (dependent points) Describe an algorithm for generating n distinct points (x1, y1), ..., (xn, yn) in R2 Each xiand yiis selected uniformly at random from [0, 1] The points are not independent 1 0 0 1 Skyline size
Generating Generating random Assume the points are generated by the following algorithm randompoints points 1) Generate n random x-values x1, ..., xn 2) sort the x-values in decreasing order 3) for decreasing xigenerate random yi 1 (xi, yi) (xi, yi) is a skyline point yi= max(y1, ..., yi) Prob[(xi, yi) skyline point] = 1 since y1, ..., yiindependt and the same distribution, all permutations are equally likely, i.e. probability yito be largest among i values is 1/i i (x1, y1) 0 0 1 Skyline size
Expected Expected skyline skyline size size Stochastic variable: Xi= 1 if point i on skyline otherwise 0 E[|skyline|] = E[X1 + + Xn] = E[X1] + + E[Xn] = iProb[(xi, yi) skyline point] = 1 = i=1..n i (harmonic number Hn) ln n + 1 1 (xi, yi) 1+ 1 2+ + 1 1 n (x1, y1) 0 0 1 Skyline size
n n-th Harmonic number Hn= 1/1 + 1/2 + 1/3 + + 1/n = 1/i 1 i = 1 n n Hn 1 Hn 1/n 1/x dx = [ ln x ] = ln n ln 1 = ln n 1 ln n + 1/n Hn ln n + 1 1/n 1/1 1/2 1/3 1/4 1/n 1 2 3 4 5 n Harmonic numbers
Exercise Exercise 3 ( 3 (divide divide- -and and- -conquer conquer skyline) skyline) Consider the following algorithm Find the topmost point p in O(n) time recurse on points to the right of p 1 p Show that the expected running time is O(n) 0 0 1 Skyline
QuickSort QuickSort a a randomized randomized sorting sorting algorithm algorithm Quicksort(x1, ..., xn) Pick a random pivot element xi Partition remaining elements: S smaller than xi, and L larger than xi Recursively sort S and L 7 4 15 12 8 11 3 5 1 14 6 7 4 3 5 1 6 8 15 12 11 14 3 1 4 7 5 6 11 15 12 14 1 3 5 6 7 12 14 15 3 5 6 12 15 6 QuickSort
QuickSort QuickSort analysis analysis I I Alternative Quicksort Consider a random permutation of the input, such that x (1)is the first pivot, then x (2), x (3), .... Observations Changes the order unsorted sublists are partitioned, but the pivots are still selected uniformly among the elements xiand xjare compared if and only if xior xjis selected as a pivot before any element in the sorted list between xiand xj output xj xi QuickSort
QuickSort QuickSort analysis analysis II II E[comparisons by quicksort] = E[ i<jcost of comparing xiand xj] = i<jE[cost of comparing xiand xj] = i<j 2 q p + 1 2n 2 d n d 2n ((ln n + 1) 1) = 2n ln n 2 |r j r(i)| + 1where r(i) = position of xiin output = 1 p<q n 1 r(i) output r(j) xj p xi q d QuickSort
Exercise Exercise 4 ( 4 (random randomsearch search trees trees) ) Construct a unbalanced binary search tree for n numbers x1< < xnby inserting the numbers in random order What is the probability that xjis an ancestor of xi? What is the expected depth of a node xi? 15 8 17 13 3 5 10 Insert: 15, 8, 17, 13, 3, 5, 10 Random search trees
Convex Convex hull hull Convex hull = smallest polygon enclosing all points Convex hull
Exercise Exercise 5: 5: Convex Convex hull hull Give an O(n log n) worst-case algorithm finding the Convex Hull Convex hull
Convex Convex hull hull randomized randomized incremental incremental 1) Let p1, ..., pnbe a random permutation of the points 2) Compute convex hull of {p1, p2, p3} 3) c = (p1 + p2 + p3) / 3 4) for i = 4 to n insert piand construct Hifrom Hi-1 : if piinside Hi-1skip, otherwise insert piin Hi-1and delete chain inside p3 pi c c p2 H3 p1 Hi-1 Convex hull
Convex Convex hull hull inserting inserting p pi i pi e c Hi-1 Convex hull
Convex Convex hull hull inserting inserting p pi i pi e c How to find e for pi ? store set of points with e and reference to e from pi Hi-1 Convex hull
Convex Convex hull hull - - analysis analysis Each point inserted / deleted / inside at most once in convex hull E[# point-edge updates] = E[ 4 i n pp updated on insertion i] = 4 i n pE[p updated on insertion i] 2 i 3 2n (ln n + 1) since p only updated on insertion i if piis u or v Total expected time O(n log n) p u e v 4 i n p c Hi Convex hull
Binary Binary search (a (a debugging debugging case) search - - but case) but forgot forgot to sort the array... to sort the array... How many cells can ever be reached by a binary search ? 17 useful 31 43 useful 3 13 47 29 useful not reachable 2 41 11 7 23 19 5 37 find(41) 2 3 41 31 11 13 7 17 23 47 19 43 5 29 37 Binary searching unsorted array
Reachable Reachable nodes nodes analysis analysis Pr[ viuseful ] = |Li| / j|Lj| E[ # useful nodes at level ] = i ( |Li| / j|Lj| ) = 1 E[ # useful nodes in tree ] = height - 1 E[ # reachable nodes in tree ] height2= O(log2n) useful 17 useful 31 43 ]- ,17[ ]17,43[ ]43, [ # useful nodes ? v1 v2 v3 v4 L2 = L1 = { 2,3,4,5,11 } L3 = { 19,23,29,37,41 } L4 = { 47 } Binary searching unsorted array
Closest Closest pair pair p q Given n points, find pair (p, q) with minimum distance Algorithm (idea) permute points randomly p1, p2, ..., pn for i = 2..n compute i= distance between closest pair for p1, ..., pi Observation Pr[ i< i-1] 2 since minimum distance can only decrease if piis defining i i Closest pair
Closest Closest pair pair grid grid cells cells Analysis E[rebuild cost] = E[ 3 i nrebuild cost inserting pi] = 3 i nE[rebuild cost inserting pi] 3 i n Total expected time O(n) Construct grid cells with side-length i-1 Point (x, y) in cell ( x/ i 1, y/ i 1) Note: 4 points in cell if all pairwise distances i-1 Neighbors of piwithin distance i-1are in 9 cells Store non-empty cells in a hash tabel (using randomization) Rebuild grid whenever idecreases pi i-1 2 i i 2n Closest pair
7 0.9 Handin Handin ( (Treaps Treaps) ) 13 0.5 3 0.1 A treap is a binary search tree with a random priority assigned to each element when inserted (in the example elements are white and priorities yellow). A left-to-right inorder traversal gives the elements in sorted order, whereas the priorities satisfy heap order, i.e. priorities increase along a leaf-to-root path. 21 0.4 11 0.3 8 0.1 a) Argue that an arbitrary element in a treap of size n has expected depth O(log n) b) Describe how to insert an element into a treap and give running time Treaps
References References Raimund Seidel, Backwards Analysis of Randomized Geometric Algorithms, in Pach J. (eds) New Trends in Discrete and Computational Geometry. Algorithms and Combinatorics, vol 10, 37-67. Springer, Berlin, Heidelberg 1993. DOI: 10.1007/978-3-642-58043-7_3 (public available at CiteSeerX). [Chapters 4-5] Sariel Har-Peled, Backwards analysis, lecture notes, 2018. sarielhp.org. [Chapters 35.1-35.3] Raimund Seidel and Cecilia R. Aragon, Randomized search trees, Algorithmica, 16(4 5), 464 497, 1996. DOI 10.1007/BF01940876. David R. Karger, Philip N. Klein, Robert E. Tarjan, A randomized linear-time algorithm to find minimum spanning trees, Journal of the ACM, 42(2), 321-328, 1995. DOI 10.1145/201019.201022. Timothy M. Chan, Backwards analysis of the Karger-Klein-Tarjan algorithm for minimum spanning trees, Information Processing Letters, 67(6), 303-304, 1998. DOI 10.1016/S0020-0190(98)00129-X. References