Near-Optimal Quantum Algorithms for String Problems
This paper discusses near-optimal quantum algorithms for various string problems like exact pattern matching, longest common substring, lexicographically minimal string rotation, longest palindromic substring, and more. It explores quantum black-box models, query complexities, and previous sublinear quantum algorithms.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Near-Optimal Quantum Algorithms for String Problems Ce Jin MIT Shyan Akmal MIT SODA 2022
Basic Problems in Stringology Exact Pattern Matching Longest Common Substring Lexicographically Minimal String Rotation Longest Palindromic Substring Longest Square Substring ? = nan? = banana ? = banana? = ananas ? = bananaMinRot(?) = abanan ? = bananas ? = banbananas KMP 70 Weiner 73 Booth 80, Shiloach 81 Manacher 75 Main-Lorentz 79, Crochemore 81 Well-known (quasi-) linear time classical algorithms Recent interest in quantum algorithms
Quantum Black-box Model Input string ? 1..? ? polynomially bounded alphabet = {0,1,2, ,poly(?)} Quantum Oracle ?? |? |0 |? |?[?] (? ? ) For a quantum query algorithm (with 2/3 success probability): Query complexity ?= number of quantum queries to ?? (Goal: sublinear ? ?1 ? ) Example: For ? {0,1}? computing OR(?) takes ( ?) quantum queries (Grover Search) Time complexity? also counts # elementary gates (& quantum random-access gates) (Goal: time-efficient implementation with ? ? polylog(?) )
Simple Example: Longest Common Prefix ?= bananas ? = banquet LCP = 3 Lexicographical order: ? ? Input: ? 1..? ,?[1..?] as oracles Output: Length of LCP of ?,? Binary Search for prefix length ?: Grover Search detects ? [1..?]: ? ? ? ? in ? ? ? total quantum time ? quantum time. ? hides polylog(?) factors
Previous Sublinear Quantum Algorithms Problem Quantum Time Quantum Query LB ? ?1/2 Ramesh & Vinay (2003) ? ?3/4 Wang & Ying (2020)* ? ?5/6 Le Gall & Seddighin (2020, ITCS 22)* ?1/2 tight Exact Pattern Matching (? = nan? = banana) ?1/2 Lexico. Minimal String Rotation (? = bananaMinRot(?) = abanan ) ? ?2/3 ? Longest Common Substring (? = banana? = ananas ) ? ?1/2 ?1/2 Longest Palindromic Substring (? = bananas ) tight Le Gall & Seddighin (2020, ITCS 22) * They obtained additional results for average-case/approximation/non-repetitive cases
Our Results Problem Quantum Time Quantum Query LB ? ?1/2 Ramesh & Vinay (2003) ??/?+? 1 This work ? ??/? This work ?1/2 tight Exact Pattern Matching (? = nan? = banana) ?1/2 tight Lexico. Minimal String Rotation (? = bananaMinRot(?) = abanan ) ?2/3 Longest Common Substring (? = banana? = ananas ) tight This talk ? ?1/2 ?1/2 Longest Palindromic Substring (? = bananas ) tight Le Gall & Seddighin (2020, ITCS 22) ? ?1/2 This work ?1/2 Longest Square Substring (? = banbananas) tight (and more: Maximum/Minimum Suffix, Longest Lyndon Substring, )
This talk: LCS Problem Quantum Time Quantum Query LB ? ?5/6 Le Gall & Seddighin (2020, ITCS 22)* ?2/3 Longest Common Substring (LCS) ? ??/? This work Le Gall & Seddighin tight Input: two strings ? 1..? ,? 1..? Output: maximum ? such that ? ?..? + ? = ?[?..? + ?) for some ?,? reduction from List Disjointness Problem Input: Two integer lists ?1, ,??, ?1, ,?? Output: Return YES iff ?? ?? for all ?,?(i.e, ??? ?,? = ?) (? = banana? = ananas output = 5) Quantum query LB: ?2/3(Aaronson & Shi 02) Quantum walk algorithm: ?(?2/3) (Ambainis 04) * [LG-S] gave tight ?(?2/3) algo for 1 + ? -approx LCS in non-repetitive strings
LCS Warm up (Le Gall & Seddighin) Decision problem: LCS ?,? ?? i.e., find ?,? such that ? ?..? + ? = ?[?..? + ?) ? 1 ? ? 1 ? ? ? + ? ? ? + ? Reduces to List Disjointness each lexicographical comparison takes ? ?(?2/3 ?) time via Ambainis algorithm (comparison-based version) ? quantum time Considering all ?(?2) possible pairs is too wasteful
Anchoring for LCS [SV13, CCIKPRRW18, ACPR19, ACPR20, BGKK20, CGP20, CKPR21, ] ( Decision problem: LCS ?,? ?? ) Construct anchor sets??,?? [?]such that: If LCS ?,? ?, then exists anchored length-? common substring. ? b a n a b a n a n a h h = anchor n a b a n a b a b a b a ?
Anchoring for LCS [SV13, CCIKPRRW18, ACPR19, ACPR20, BGKK20, CGP20, CKPR21, ] ( Decision problem: LCS ?,? ?? ) Construct anchor sets ??,?? [?]such that: If LCS ?,? ?, then exists anchored length-? common substring. ? b a n a b a n a n a h h = anchor n a b a n a b a b a b a ?
Anchoring for LCS [SV13, CCIKPRRW18, ACPR19, ACPR20, BGKK20, CGP20, CKPR21, ] ( Decision problem: LCS ?,? ?? ) Construct anchor sets ??,?? [?]such that: If LCS ?,? ?, then exists an anchored length-? common substring. (common substring ? ?..? + ? = ?[?..? + ?) with 0 < ? such that ? + ??,? + ??) Task: Find ? ??,? ??(a witness pair ) such that LCP(? ? ?..? 1???,? ? ?..? 1???) + LCP ?[?..? + ?),?[?..? + ?) ?. ? = ? ? + ? b a n a b a n a n a h h = anchor n a b a n a b a b a b a ? ? ? + = ?
Algorithm outline ( Decision problem: LCS ?,? ?? ) ? = ?(?/?3/4) ? = ? 1. Design anchor sets ??,?? Small size ? = ??+ |??| ? Easy to compute: Strongly explicitly -?-time computable Given 1 ? ?, output the ?-th anchor in ? quantum time. 2. Find a witness pair ? ??,? ?? (Compare) warm-up LCS algorithm: ? ? ?2/3 quantum time 1. Design anchor sets ??,?? Small size ? = ??+ |??| ? Easy to compute ? Quantum Walk [Ambainis 04] ? Quantum Walk [Magniez-Nayak-Roland-Santha 11] ? + ? ?2/3 quantum time with dynamic data structures (Skip Lists, 2D Segment Trees, ) to maintain Lexicographical order & LCP array & other stuff Overall ? ?2/3 quantum time
Difference Cover [Mae85, BK03] (Construct anchor sets ??,?? [?]such that: If LCS ?,? ?, then exists anchored length-? common substring) ??= { ?,2 ?,3 ?, } ??= ? ?/?{?? + 1,?? + 2, ,?? + ?} ? ? ? ? Pro: ?(1)-time-computable (input-oblivious) Con: Size ? = ?(?/ ?) too large There are ?(?/?) size constructions (e.g. [Ben-Nun, Golan, Kociumaka, and Kraus 20]) (not easily computable) (input-dependent)
Approximate Difference Cover Proximity parameter 1 ? ? (previously ? = 1) ??= { ??,2 ??,3 ??, } ??= ? ?/?{?? + ?,?? + 2?, ,?? + Size = ?(?/ ??) ?? ? ? ?/? ?} ? ? ? ? ? ? Only guarantees ? ? ? ? Next: Align them based on their ?(?)context String Synchronizing Sets [Kempa-Kociumaka STOC 19] with parameter ? < ? (Finally we will set ? = ?)
?-string synchronizing set [KK19] (concatenated input ? ?$?) A (randomized) subset of positions ? [1.. ? ]: Consistency: whether ? ? depends only on ?[?..? + 2?)(and random coins) Density: ?..? + ? ? is nonempty, for all ? Sparsity: ?[ ?..? + ? ? ] ?(1), for all ? For every ? in the approximate difference cover, add synchronizing positions from [?,? + 2?) into anchor set (computable in ? ? time) #anchors = |apx diff cover| ? 1 = ?(?/ ??) ? ? + 2? bananabananabbananaana < ? > ? ? bbbanabananabbanananaa ? ? + 2? (common substr length ? ?) (? = 3) ? bold : sync. positions : apx diff cover
Periodic case No synchronizing positions in highly-periodic regions Extend the period to both directions (up to ? distance) in ? quantum time (binary search + Grover search) Add the starting positions of the Lyndon roots near the boundaries into anchor set ? aanabababababababababanab ? : apx diff cover bnnbbababababababababanaa ? Lyndon root = ab (defn: lexico minimal rotation of the length-p substr, where p = minimal period) #anchors = |apx diff cover| ? 1 = ?(?/ ??) Reporting time = ?(? + ?) ? ? ? ?/?3/4 Set ? = ?
Algorithm outline (Decision problem: LCS ?,? ??) ? = ?(?/?3/4) ? = ? 1. Design anchor sets Small size ? = ??+ |??| ? Easy to compute: Strongly explicitly -?-time computable Given 1 ? ?, output the ?-th anchor in ? quantum time. 2. Find a witness pair ? ??,? ?? ? ? ? + ? ?2/3 quantum time Overall ? ?2/3 quantum time
Open Problem LCS(?,?) can be computed in ?(?2/3) quantum time, but.. LCS ?,? ?? could be decided faster for large ? Le Gall-Seddighin: ?(?/ ?) quantum time What s the optimal dependency on ?? Thanks!
Finding a witness pair using quantum walks (Anchor set ? = ?? ??: size ? = ?) Quantum Walk Search on Johnson Graph [Ambainis 04, Szegedy 04, Magniez-Nayak-Roland-Santha 11, ] Quantum analogue of the following Markov chain State: ?-subset ? ? ( ? = ? ?) Transition: ? ? ? {?} (where ? ?,? ?). Search for an ?-subset ? ? that contains a witness pair ?2 ?2fraction of ?-subsets are marked (if a witness pair exists). Associate every ?-subset ? with a data structure data ? ? = - Setup step: - Update step: - Checking step: check if ? contains a witness pair, given ????(?) Total complexity: S + ? (C + ? U) initialize the data structure update data(?) when ? ? ? {?} 1
Finding a witness pair using quantum walks (Anchor set ? = ?? ??: size ? = ?, ?-time computable) (Witness pair ? ??,? ??: LCP ?[?..? + ?),?[?..? + ?) + LCP(? ? ?..? 1???,? ? ?..? 1???) ?. ) For ?-subset ? ?, data ? includes Lexicographical order of length-? substrings associated with all anchors in ? ( ? ?..? + ? ,? ? ?..? 1???,?[?..? + ?), ?[? ?..? 1]??? for all ? ? ?? and all ? ? ?? ) LCPs of lexicographically adjacent pairs ( together determine LCP of every pair) Update step: In ? time, compute the anchor ? ? to be inserted. Binary search on the lexicographical order, where each comparison takes ? Query & Time complexity ? ?(using suitable data structures to maintain the order) Setup step: ? insertions Checking step ( does ?contain a witness pair? ): Query complexity = 0. Time complexity?? back band bona bond ? time. LCP=2 LCP=1 LCP=3
Finding a witness pair using quantum walks (Anchor set ? = ?? ??: size ? = ?, ?-time computable) (Witness pair ? ??,? ??: LCP ?[?..? + ?),?[?..? + ?) + LCP(? ? ?..? 1???,? ? ?..? 1???) ?. ) For ?-subset ? ?, data ? includes Lexicographical order of length-? substrings associated with all anchors in ? ( ? ?..? + ? ,? ? ?..? 1???,?[?..? + ?), ?[? ?..? 1]??? for all ? ? ?? and all ? ? ?? ) LCPs of lexicographically adjacent pairs ( together determine LCP of every pair) Update step: In ? time, compute the anchor ? ? to be inserted. Binary search on the lexicographical order, where each comparison takes ? Query & Time complexity ? ?(using suitable data structures to maintain the order) Setup step: ? insertions Checking step does ?contain a witness pair? Query complexity = 0. Time complexity?? Linear ?(?)time is too slow ? time. (also called Two string families LCP problem)
Time complexity of checking step Task: Maintain a dynamic anchor subset ? ? to support Check whether ? contains a witness pair Insert/delete an anchor The data structure must be quantum-walk-friendly : (1) History independence (2) Worst-case time complexity per update (w.p. 1 (This precludes the amortized polylog(?) data structure of [Charalampopoulos-Gawrychowski-Pokorski 20]) Grover search for ? ? ?? and 0..? : Query: Does there exist ? ?? such that LCP ?[?..? + ?),?[?..? + ?) and This can be reduced to 2D orthogonal range search 1 poly ? ) LCP ? ? ?..? 1???,? ? ?..? 1??? ? Checking time complexity ?( ??) Total cost = S +? ? (C + ? U) ? ? + ? ? + ? ?2/3 ? + ? ??