Approximation Algorithms for NP-complete Problems
Dive into the world of approximation algorithms for NP-complete problems such as Min Vertex Cover, with a focus on solutions that are provably good but not necessarily optimal. Explore approximation techniques for tackling these computationally hard problems efficiently.
CMPT 405/705 Design & Analysis of Algorithms, March 14, 2022
Plan for today
- Approximation algorithms for NP-complete problems
  - 2-approximation for Min Vertex Cover
  - ln(n)-approximation for Set Cover
- Some basic probability theory
  - Linearity of expectation
  - Markov's inequality
  - Chebyshev's inequality
  - Chernoff's inequality
- More approximation algorithms
  - 7/8-approximation for Max-3-SAT
Coping with NP-hardness
We don't know efficient algorithms for NP-complete problems; the best known algorithms are essentially exponential time.
Q: What can we do? We have several options:
1. Give up.
2. Algorithms that work for some inputs, but not in the worst case.
3. Algorithms that give a solution that is good, but not optimal.
In this course we will focus on option 3.
2-approximation algorithm for Minimum Vertex Cover
The Min Vertex Cover Problem
Input: A graph G = (V,E).
Output: A smallest collection of vertices touching all edges of G.
The Min Vertex Cover Problem
Input: A graph G = (V,E).
Output: A minimum vertex cover, i.e., a smallest subset of V touching all edges.
Fact: The problem of finding a smallest vertex cover is NP-complete. In particular, we don't know a polynomial time algorithm for this problem, and we don't believe a polynomial time algorithm exists.
Theorem: The Min Vertex Cover Problem has a polynomial time 2-approximation algorithm. That is, if the algorithm gets a graph with a vertex cover of size k, then it outputs a vertex cover of size at most 2k.
The Min Vertex Cover Problem
Theorem: The Min Vertex Cover Problem has a polynomial time 2-approximation algorithm. That is, if the algorithm gets a graph with a vertex cover of size k, then it outputs a vertex cover of size at most 2k.
Algorithm:
1. Set C = empty set.
2. While G contains edges do:
   - Pick any edge (u,v) in G.
   - Add u and v to C.
   - Remove u and v from G, together with all edges touching them.
3. Return C.
What is the runtime of the algorithm?
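To make the procedure concrete, here is a minimal Python sketch (the function name, input representation, and the small example graph are my own choices, not from the slides):

```python
def vertex_cover_2approx(edges):
    """2-approximation for Min Vertex Cover via greedily chosen disjoint edges."""
    cover = set()
    for u, v in edges:
        # Consider only edges not yet covered; the chosen edges are
        # pairwise vertex-disjoint, which is exactly what the proof uses.
        if u not in cover and v not in cover:
            cover.add(u)  # add both endpoints of the chosen edge
            cover.add(v)
    return cover

# Example: the path 1-2-3-4 has OPT = {2, 3}; the output has size at most 4.
print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)]))  # e.g. {1, 2, 3, 4}
```

With this representation a single pass over the edge list suffices, so the runtime is O(|E|).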
The Min Vertex Cover Problem
Theorem: If G has a vertex cover of size k, then the algorithm returns a vertex cover of size at most 2k. That is, the algorithm gives a 2-approximation for the Min Vertex Cover problem.
Proof: It is clear that C is a vertex cover: when the algorithm ends, no edge is left uncovered. Fix any vertex cover C* in G of size k. We show that |C| ≤ 2k.
1. Observe that for any edge the algorithm chooses, at least one of its endpoints must be in C*. (Why? Because C* covers every edge, in particular the chosen ones.)
2. Since the edges chosen by the algorithm are pairwise vertex-disjoint, each of them uses up a distinct vertex of C*, so the number of edges we choose is at most |C*| = k.
3. For each such edge, we add 2 vertices to our solution.
4. Therefore, the output contains at most 2k vertices.
More on approximation algorithms
More on approximation algorithms
Definition: Let L be a minimization problem (e.g., Min Vertex Cover, Min Set Cover, the Traveling Salesman Problem). An algorithm ALG is an α-approximation for L (for α > 1) if for any input it returns a solution that is at most α·OPT.
Same for maximization: Definition: Let L be a maximization problem (Max-SAT, Max Clique, Max-Cut). An algorithm ALG is an α-approximation for L (for 0 < α < 1) if for any input it returns a solution that is at least α·OPT.
More on approximation algorithms
We focus on poly-time approximation algorithms for NP-complete problems. Examples:
- Min Vertex Cover has a poly-time 2-approximation algorithm. On the other hand, achieving better than a 1.99-approximation is NP-hard.
- The Traveling Salesman Problem (TSP) in general graphs cannot be approximated within any factor in poly time (unless P = NP). TSP on metric graphs has a 1.5-approximation algorithm.
- Max-Cut has a 0.878-approximation algorithm. On the other hand, achieving better than a 0.95-approximation is NP-hard.
- Max Clique on n vertices has a log²(n)/n-approximation algorithm. A better 1/n^0.99-approximation is NP-hard. More on this next.
ln(n)-approximation for Set Cover
ln(n) approximation for Set Cover
Input: A universe of n elements U = {a1, …, an}, and m subsets S1, …, Sm of U.
Output: A smallest collection of the sets that covers all elements of U.
ln(n) approximation for Set Cover
Input: A universe of n elements U = {a1, …, an}, and m subsets S1, …, Sm of U.
Output: A smallest collection of the sets that covers all elements of U.
Fact: The problem of finding a smallest such collection is NP-complete. In particular, we don't know a polynomial time algorithm solving this problem, and we don't believe a polynomial time algorithm exists. We can instead ask for an almost optimal solution.
Goal: Design a poly-time algorithm that outputs a solution close to OPT.
ln(n) approximation for Set Cover
Greedy Algorithm:
1. Let C = empty set // elements covered so far
2. Let SOL = empty set
3. While C ≠ U do:
   - Find Si ∉ SOL that covers a maximal number of elements in U \ C.
   - Add Si to SOL.
   - Add the elements of Si to C.
4. Return SOL.
What is the runtime of the algorithm?
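A minimal Python sketch of the greedy rule (the names and the toy instance are my own; it assumes the union of the sets equals the universe, so the loop terminates):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most still-uncovered elements."""
    covered, sol = set(), []
    while covered != universe:
        # The greedy choice: maximize the number of newly covered elements.
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        sol.append(best)
        covered |= sets[best]
    return sol

U = {1, 2, 3, 4, 5}
print(greedy_set_cover(U, [{1, 2, 3}, {2, 4}, {3, 4, 5}, {5}]))  # e.g. [0, 2]
```

A straightforward implementation spends O(m·n) time per iteration and makes at most n iterations, i.e., it runs in polynomial time.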
ln(n) approximation for Set Cover
What can we guarantee about the quality of the solution?
Theorem: If there are k sets that cover all elements, then the algorithm returns at most k·ln(n) sets that cover all elements. That is, our greedy algorithm is a ln(n)-approximation algorithm for Set Cover.
ln(n) approximation for Set Cover
Theorem: If there are k sets that cover all elements, then the algorithm returns at most k·ln(n) sets that cover all elements.
Proof: Suppose there are k sets that cover all n elements. Then one of the sets must have size at least n/k. Since in the first step we pick the set covering the largest number of elements, we cover at least n/k elements in the first step. So after the first step we have at most n - n/k = (1 - 1/k)·n elements left. The optimal set cover consists of k sets, hence there is a set that covers at least a 1/k fraction of the remaining elements, i.e., at least (1/k)·(1 - 1/k)·n elements. Therefore, after two steps we are left with at most (1 - 1/k)·(1 - 1/k)·n elements.
ln(n) approximation for Set Cover
Proof cont.: After the first step we have at most n - n/k = n·(1 - 1/k) elements left. The optimal set cover (still) consists of k sets, hence some set covers at least a 1/k fraction of the remaining elements, i.e., at least (1/k)·(1 - 1/k)·n elements. Therefore, after two steps the number of uncovered elements is at most n·(1 - 1/k)·(1 - 1/k) = n·(1 - 1/k)². After three steps it is at most n·(1 - 1/k)³, and in general after t steps it is at most n·(1 - 1/k)^t.
ln(n) approximation for Set Cover
Proof cont.: After t steps the number of uncovered elements is at most n·(1 - 1/k)^t. Consider the algorithm after t = k·ln(n) steps. Then the number of elements left uncovered is at most n·(1 - 1/k)^(k·ln(n)) < n·e^(-ln(n)) = 1, using the fact that (1 - 1/k)^k < 1/e for all k > 1.
Conclusion: after t = k·ln(n) steps fewer than 1 elements are left uncovered, and therefore all elements are already covered.
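A quick numeric sanity check of the final bound (the parameters n and k are arbitrary, chosen only for illustration):

```python
import math

n, k = 1000, 7
t = math.ceil(k * math.log(n))  # t = k*ln(n) greedy steps, rounded up
print(n * (1 - 1 / k) ** t)     # ~0.52 < 1, so all elements are covered
```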
ln(n) approximation for Set Cover
Remarks: This algorithm is essentially optimal.
Theorem: It is NP-hard to find a 0.99·ln(n)-approximation for Set Cover.
log(n)/n approximation for Max Clique
log(n)/n approximation for Max Clique
Input: A graph G = (V,E) on n vertices.
Goal: Find a clique in G of maximum size.
log(n)/n approximation for Max Clique
Input: A graph G = (V,E) on n vertices.
Goal: Find a clique in G of maximum size.
Fact: The problem of finding a maximum clique is NP-complete. What about an almost optimal solution?
Theorem: For any constant ε > 0 it is NP-hard to find an ε-approximation for Max Clique. For example, if G contains a clique of size k = n^0.9, it is NP-hard to find a clique of size k/100.
Theorem: Given a graph on n vertices that has a clique of size n^0.99, it is NP-hard to find a clique of size n^0.01.
log(n)/n approximation for Max Clique
Theorem: There exists a poly-time log(n)/n-approximation algorithm for Max Clique. That is, if a graph contains a clique of size k, then the algorithm outputs a clique of size at least k·log(n)/n. For example, if G has a clique of size n/10, the algorithm will find a clique of size > log(n)/10.
Q: For which values of k is this algorithm interesting?
A: It is interesting for k > n/log(n). In HW4 you will design an algorithm that, given G that has a clique of size n/log²(n), finds a clique of size > log(n)/log log(n).
log(n)/n approximation for Max Clique
Theorem: There exists a poly-time log(n)/n-approximation algorithm for Max Clique.
Algorithm:
1. Partition V into t = n/log(n) sets, each of size log(n). That is, V = V1 ∪ V2 ∪ … ∪ Vt for t = n/log(n).
2. In each Vi find a maximum clique using brute force (in time 2^(O(log(n))) = poly(n)).
3. Output the largest clique found in step 2.
What is the runtime of the algorithm?
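A runnable Python sketch of this partition-and-brute-force idea (the graph representation, the block size ⌈log₂(n)⌉, and the helper names are my own choices):

```python
from itertools import combinations
from math import ceil, log2

def is_clique(cand, adj):
    # adj is a set of frozenset({u, v}) edges.
    return all(frozenset((u, v)) in adj for u, v in combinations(cand, 2))

def max_clique_approx(n, adj):
    """Brute-force a largest clique inside each block of ~log(n) vertices."""
    b = max(1, ceil(log2(n)))  # block size ~ log(n)
    best = []
    for start in range(0, n, b):
        block = range(start, min(start + b, n))
        # Each block has at most 2^b ~ n subsets, so this stays polynomial.
        for r in range(len(best) + 1, len(block) + 1):
            for cand in combinations(block, r):
                if is_clique(cand, adj):
                    best = list(cand)
                    break
    return best

# Example: a triangle 0-1-2 inside the first block of an 8-vertex graph.
edges = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2)]}
print(max_clique_approx(8, edges))  # [0, 1, 2]
```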
log(n)/n approximation for Max Clique
Theorem: There is a poly-time log(n)/n-approximation algorithm for Max Clique.
Analysis: Suppose the maximum clique in G has size k. If k < n/log(n), then the algorithm returns a clique of size at least 1, which is trivially at least k·log(n)/n. Suppose now that k ≥ n/log(n). The k vertices of the maximum clique are spread over only t = n/log(n) subsets Vi, so one of the Vi's must contain at least k/t = k·log(n)/n of them, and these vertices form a clique of the required size.
More on approximation algorithms
We focus on poly-time approximation algorithms for NP-complete problems. Examples:
- Min Vertex Cover has a poly-time 2-approximation algorithm.
- For Set Cover we saw a ln(n)-approximation algorithm.
- Max Clique on n vertices has a log(n)/n-approximation algorithm, and even a log²(n)/n-approximation algorithm. Ask me if you are interested.
Some basic facts in probability theory
Expectation of a random variable
Expectation: Let X ≥ 0 be a random variable taking integer values. The expectation of X is defined as E[X] = Σ_{i≥0} i·Pr[X = i].
*Do not confuse with the median Med[X]: this is a number M such that Pr[X ≤ M] ≥ 1/2 and Pr[X ≥ M] ≥ 1/2.
Example: X is the random variable with Pr[X=1] = 1/4, Pr[X=2] = 1/3, Pr[X=3] = 5/12. Then E[X] = 1·(1/4) + 2·(1/3) + 3·(5/12) = 13/6 and Med[X] = 2.
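A tiny Python check of this example, using exact rational arithmetic (the variable names are mine):

```python
from fractions import Fraction as F

dist = {1: F(1, 4), 2: F(1, 3), 3: F(5, 12)}  # the distribution from the slide
print(sum(i * p for i, p in dist.items()))     # 13/6
# Med[X] = 2: Pr[X <= 2] = 7/12 >= 1/2 and Pr[X >= 2] = 3/4 >= 1/2.
```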
Linearity of expectation
Linearity of expectation: Let X1, X2, …, Xn be random variables taking integer values. Then E[X1 + X2 + … + Xn] = E[X1] + E[X2] + … + E[Xn]. In particular, E[k·X1] = k·E[X1]. This is true even if the variables are dependent.
Example: choose x1 ∈ {0,1} uniformly at random (each value w.p. 1/2), and let x2 = x3 = x4 = 1 - x1. Then E[x1+x2+x3+x4] = E[x1] + E[x2] + E[x3] + E[x4] = 0.5+0.5+0.5+0.5 = 2.
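A short simulation of this dependent example (purely illustrative):

```python
import random

trials, total = 100_000, 0
for _ in range(trials):
    x1 = random.randint(0, 1)   # uniform over {0, 1}
    total += x1 + 3 * (1 - x1)  # x1 + x2 + x3 + x4 with x2 = x3 = x4 = 1 - x1
print(total / trials)           # ~2.0, matching linearity of expectation
```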
Markov's inequality
Markov's inequality: Let X be a random variable taking non-negative values. Then, for all t > 0, Pr[X ≥ t] ≤ E[X]/t. In particular, for all c > 1, Pr[X ≥ c·E[X]] ≤ 1/c.
Proof (for integer-valued X): Write E[X] = Σ_{i≥0} i·Pr[X = i] = Σ_{0≤i<t} i·Pr[X = i] + Σ_{i≥t} i·Pr[X = i] ≥ Σ_{i≥t} i·Pr[X = i] ≥ Σ_{i≥t} t·Pr[X = i] = t·Σ_{i≥t} Pr[X = i] = t·Pr[X ≥ t].
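An empirical illustration of Markov's inequality (the distribution and threshold are my own choices):

```python
import random

# X uniform over {0, ..., 9}, so E[X] = 4.5; take t = 8.
samples = [random.randrange(10) for _ in range(100_000)]
tail = sum(x >= 8 for x in samples) / len(samples)
print(tail, "<=", 4.5 / 8)  # ~0.2 <= 0.5625, as Markov guarantees
```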
Variance of a random variable
Variance: Let X be a random variable taking integer/real values. The variance of X is defined as Var[X] = E[(X - E[X])²]. Note that Var[X] ≥ 0, with equality if and only if X is constant.
Claim: Var[X] = E[X²] - E[X]².
Proof: By definition, Var[X] = E[(X - E[X])²] = E[X² - 2X·E[X] + E[X]²] = E[X²] - 2E[X]·E[X] + E[X]² = E[X²] - E[X]².
Fact: Let X1, X2, …, Xn be independent random variables. Then Var[X1 + X2 + … + Xn] = Var[X1] + Var[X2] + … + Var[Xn].
Chebyshev's inequality
Chebyshev's inequality: Let X be a random variable taking real values. Then, for all t > 0, Pr[|X - E[X]| > t] ≤ Var[X]/t² = (σ/t)², where σ = (Var[X])^(1/2) is the standard deviation of X.
Proof: Pr[|X - E[X]| > t] = Pr[(X - E[X])² > t²] ≤ E[(X - E[X])²]/t² = Var[X]/t², where the inequality is by Markov.
Chernoff bound
Theorem [Chernoff bound]: Suppose X1, X2, …, Xn are independent 0-1 random variables such that Pr[Xi = 1] = p. Let S = X1 + X2 + … + Xn. Then Pr[|S - pn| > εn] < 2·e^(-ε²n/3p). Equivalently, Pr[|S/n - p| > ε] < 2·e^(-ε²n/3p).
Think of p as a constant, say p = 1/3. If we take n Bernoulli(p) samples and let ε = c/√n for a constant c, then ε²n/3p is a constant, and Pr[|S - pn| > c·√n] < 2·e^(-ε²n/3p), which is a constant.
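A small simulation comparing the empirical tail with the stated bound (the parameters are arbitrary illustrations; the true tail is typically far below the bound):

```python
import math
import random

n, p, eps, trials = 1000, 1/3, 0.05, 10_000
deviations = 0
for _ in range(trials):
    s = sum(random.random() < p for _ in range(n))  # S = sum of n Bernoulli(p)
    if abs(s - p * n) > eps * n:
        deviations += 1
bound = 2 * math.exp(-eps ** 2 * n / (3 * p))
print(deviations / trials, "<", bound)              # e.g. ~0.001 < ~0.16
```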
Questions? Comments?