Sketching as a Tool for Algorithmic Design by Alex Andoni - Overview
Utilizing sketching in algorithmic design, Alex Andoni from Columbia University explores methodologies such as succinct efficient algorithms, dimension reduction, sampling, metric embeddings, and more. The approach involves numerical linear algebra, similarity search, and geometric min-cost matching, aiming for efficient solutions and approximate near neighbor search. Sketch-And-Solve techniques, oblivious subspace embedding, and regression methods are discussed for practical applications. The presentation emphasizes the importance of dimension reduction and the use of various linear maps for achieving desired outcomes in algorithmic design.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Sketching as a tool for Algorithmic Design Alex Andoni (Columbia University)
Methodology ? Succinct Efficient Algorithms data representations 000000 011100 010100 000100 010100 011111 011111 000000 011100 010100 000100 010100 dimension reduction Sketching (streaming) Sampling Metric embeddings lossy good for specific task 000000 001100 000100 000100 110100 000000 001100 000100 000100 110100 111111 111111 3
Plan a random sketch of sketching Numerical Linear Algebra Similarity Search Geometric min-cost matching 4
Numerical Linear Algebra Problem: Least Square Regression ? = ???????||?? ?|| where ? is ? ? matrix ? ? approximation 1 + ? Idea: Sketch-And-Solve solve ? = ???????||??? ??|| Where ? is dimension-reduction matrix reduce to dimension ? ? Hope: ||?? ?|| 1 + ? ||?? ?||
Sketch-And-Solve [Sarlos 06, Clarkson-Woodruff 13, ] Oblivious Subspace Embedding: linear map ?: ? ?s.t. for any linear subspace ? ? of dimension ?: ? ? ||? ? || ||?|| 1 ? ?: random 1, together with CountSketch Pr ? 1 ? ? 1 ? ? Sketch-And-Solve: ? ??? ? + log1 ??? ? + ?? 1 +Preconditioner: ? ?
1 regression ? No good dimension reduction in 1 Sketch: linear map ?: ? ?, estimator ?, s.t. for any ? ? : Pr ?: random Cauchy vars [Indyk 00] ?(? ? ) ||?||1 (1 ?) 1 ? ? Poor DR: linear map ?: ? ?s.t. for any linear subspace ? ?of dimension ?: ? ? 1 ||? ? ||1 ||?||1 ?: random 1/Exp vars, & CountSketch [Woodruff Zhang 13] ?? 1 Pr ? 1 ? ? 1 ? ? Can get: ? ??? ? log? + [CW 13, ] More: low-rank approximation & optimization, matrix multiplication, see [Woodruff, FnTTCS 14]
Plan Numerical Linear Algebra Similarity Search Geometric min-cost matching 8
Approximate Near Neighbor Search Preprocess: a set of ? point approximation ? > 1 Query: given a query point ?, report a point ? ? with the smallest distance to ? up to factor ? > 1 ? ? ?? ? Near neighbor: threshold ? ? Parameters: space & query time 9
Locality-Sensitive Hashing [Indyk-Motwani 98] ? ? Random hash function on ? satisfying: for closepair (when ? ? ?) Pr[ (?) = (?)]is high for farpair (when ? ? > ??) Pr[ (?) = (? )]is small ?2= ? ? ?1= not-so-small Use several hash tables Pr[ (?) = (?)] ? =log1/?1 log1/?2 1 ?1 ??, where ?2 ? ? 10 ? ??
LSH maps [Datar-Indyk-Immorlica-Mirrokni 04, A.-Indyk 06] Space ?1+? Time Exponent ?? ? = ? ? = 1/? ? 1/?2 1 2?2 1 ? = 1/2 ? = 1/4 ? = 1/7 ? ? ? ? Even better LSH maps? NO: example of isoperimetry [Motwani-Naor-Panigrahy 06, O Donell-Wu-Zhou 11] Can get better maps, if allowed to depend on the dataset [A-Indyk-Nguyen-Razenshteyn 14, A-Razenshteyn 15] Applications to: Max/Min inner product (MIPS), greedy coordinate descent, kernel approximation & estimation, 11
Plan Numerical Linear Algebra Similarity Search Geometric min-cost matching Can exploit sketches for: input internal state / partial computations Computation input output 12
LP for Geometric Matching Problem: Given two sets ?,? of points in ?, Find min-cost matching a.k.a., Earth-Mover Distance, optimal transport, Wasserstein metric, etc Classically: LP with ?2 variables Best: ?(?2/?4) time* [Altschuler-Weed-Rigolet 17] But can hope for ?2 runtime! min ? + s.t. ?? = ||?? ??|| ??? ?2 ?? 1 ?? and ??? = 1 ?? Solve-And-Sketch [A.-Nikolov-Onak-Yaroslavtsev 14]: For fixed ?,?, can solve in ?1+?(1) time 13
Solve-And-Sketch (=Divide & Conquer) Partition the space hierarchically in a nice way In each part Compute a solution for the local view Sketch the solution using small space Combine local sketches into (more) global solution 14
Solve-And-Sketch for Geo Matching Partition the space hierarchically in a nice way In each part Compute a solution for the local view Sketch the solution using small space Combine local sketches into (more) global solution all potential local solutions quad-tree cannot precompute any local solution after committing to a wrong alternation, cannot get <2 approximation!
Sketching ALL local solutions Let ? = size of ?-net ( portal points) Define solution function ?: ? + input ? ? defines the flow (matching) at the portals ( interface to the rest) ?(?) is the min-cost matching assuming flow ? at portals Sketch of ?: ? log ? ? size ? for any ?, can estimate ?(?) up to 1 + ? factor
Overall algorithm Recomposing sketches into global solution: can formulate as a (small) LP Reinterpretation: decompose the LP into many small ones small ones have size ?? 1 ok to use any poly-time LP solver recompose hierarchically (divide and conquer)
Finale Succinct Efficient Algorithms data representations Numerical Linear Algebra Similarity Search Geometric min-cost matching Dimension reduction Sketching (streaming) Sampling Metric embeddings Just a small selection 18