Randomized Algorithms for Approximating Parameters

Randomized Algorithms
Randomized Algorithms
CS648
Lecture 9
Random Sampling
part-I
(Approximating a parameter)
 
1
Overview
 of the Lecture
 
 
 
Randomization Framework for 
estimation
 of 
a parameter
1.
Number of balls 
from a bag
2.
Size of transitive closure 
of a directed graph
 
 
An 
Inspirational Problem 
from Continuous probability
 
AN
 INSPIRATIONAL 
PROBLEM FROM
CONTINUOUS PROBABILITY
 
 
Sampling points on a 
line segment
0
1
Sampling points on a 
Circle 
(of circumference 
1
)
1
Transforming 
a 
line segment 
to a 
circle
(just a different perspective)
The knot formed by
joining the ends of
the line segment
Give the 
knot
 a
uniformly random
rotation around
the circle
Transforming 
a 
line segment 
to a 
circle
(just a different perspective)
First uniformly
random point is the
knot
.
We have got the answer of the problem
(without any knowledge of continuous probability theory)
0
1
 
 
 
ESTIMATING THE NUMBER 
OF
BALLS 
IN A BAG
 
Estimating the number 
of 
Balls
 in a BAG
 
Estimating the number 
of 
Balls
 in a BAG
 
Can we use it to design
an  algorithm ?
Estimating the number 
of 
Balls
 in a BAG
 
How 
good
 is the 
estimate ?
multiple sampling.
Multiple samplings 
to
improve 
accuracy
 and reduce error probability
A better algorithm for 
estimating the number of balls:
Final result
Randomized framework 
for
estimating a 
parameter
 
ESTIMATING THE SIZE 
OF
TRANSITIVE CLOSURE 
OF A DIRECTED GRAPH
 
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
Randomized Monte Carlo Algorithm for
estimating the size of transitive closure of directed graph
MIN-Label
 Problem
MIN-Label
 Problem
MIN-Label
 Problem
Inference
 from the inspirational problem
 
RANDOMIZED MONTE CARLO ALGORITHM
FOR ESTIMATING THE SIZE OF
TRANSITIVE CLOSURE OF A DIRECTED GRAPH
 
 
 
 
 
 
 
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
 
Can you answer 
Question 2
 now ?
Estimating size 
of 
Transitive Closure 
of
a Directed Graph
Homework
 
 
 
Use 
Chernoff
 bound to get a high probability bound on the error.
Hint:
Proceed along similar lines as in the case of estimating number of balls in a bag.
 
Make sincere 
attempts
 to do this homework. I shall discuss the same briefly in
the beginning of the next class.
Slide Note
Embed
Share

Lecture 9 of CS648 covers the random sampling approach for estimating parameters, such as determining the number of balls in a bag and the size of the transitive closure of a directed graph. An inspirational problem from continuous probability theory is discussed, involving selecting points uniformly and independently from an interval. Various scenarios, such as sampling points on a line segment and a circle, are explored to understand expected values. The transformation of a line segment into a circle is also examined, providing insights without requiring advanced knowledge of continuous probability theory.

  • Randomized Algorithms
  • Parameters Estimation
  • Sampling
  • Probability Theory

Uploaded on Sep 22, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Randomized Algorithms CS648 Lecture 9 Random Sampling part-I (Approximating a parameter) 1

  2. Overview of the Lecture Randomization Framework for estimation of a parameter 1. Number of balls from a bag 2. Size of transitive closure of a directed graph An Inspirational Problem from Continuous probability

  3. AN INSPIRATIONAL PROBLEM FROM CONTINUOUS PROBABILITY

  4. Question:? points are selected randomly uniformly and independently from interval [0,1]. What is the expected value of the smallest number ? ? ? ? 1 0 We shall solve many problems dealing with random points in an interval [0,1] in this course. But we won t require any knowledge of continuous probability theory . All we shall require is the following fact which is quite obvious: P(point ? belongs to an interval of length )=

  5. Sampling points on a line segment Question: What is E[??] ? Answer: It appears to depend upon ?. 1 1 0 0 ?? ??+1 ?3 ?1 ?2 ?4

  6. Sampling points on a Circle (of circumference 1) Question: What is E[??] ? ?? ?1 ?? 1 ?2 ?3 ?4 By symmetry of the circle, each ?? has identical probability distribution. E[?1]=E[?2]= = E[??] E[?1+?2+ ??]= ?? E[?1]= ?? 1/? 1

  7. Transforming a line segment to a circle (just a different perspective) The knot formed by joining the ends of the line segment Give the knot a uniformly random rotation around the circle

  8. Transforming a line segment to a circle (just a different perspective) ??+1 ?1 ?? The next ? points are the usual points on the line segment. ?2 First uniformly random point is the knot. ?3 ?4 Selecting ? points randomly uniformly on a unit line Selecting ? + ? points randomly uniformly on a unit circle.

  9. We have got the answer of the problem (without any knowledge of continuous probability theory) Question: What is E[??] ? 1 1 0 0 ?? ??+1 ?3 ?1 ?2 ?4 1 E[?1] = = E[??] = = E[??+1] = ?+1

  10. ESTIMATING THE NUMBER OF BALLS IN A BAG

  11. Estimating the number of Balls in a BAG There is a bag containing balls. ?, the number of balls is unknown. Each ball has a unique label from [1,?]. AIM: To estimate ? accurately and with high probability. For example: Report a number ? such that with probability at least 99%, ? ?.? ? ? (? + ?.?)? n l : c : : t q : : l 5 4 2 j 1 3 i : : TOOL:Sampling :

  12. Estimating the number of Balls in a BAG IDEA: The label of a sample ball provides some info. X: random variable for the label of a ball sampled randomly uniformly from bag. Question: What is E[X] ? Answer: E[X]= ?? ?(? = ?) = ?? 1 ? 1 ? ?? = ?+1 ? 2? ?+1 2 = = n l : c : : t q : : l 5 Can we use it to design an algorithm ? 4 2 j 1 3 i : : :

  13. Estimating the number of Balls in a BAG A simple algorithm: 1. Pick a ball randomly and uniformly from the bag. 2. Let ? be its label. 3. ? 2? 1. 4. Report ? . n l : c : : t q : : l 5 4 2 j 1 3 i : : :

  14. How good is the estimate ? N-1 N ?/4 1 2 Question: What is P( ?< ?/2) ? Answer: 1 4 Question: How to reduce the error probability ? Answer: multiple sampling.

  15. Multiple samplings to improve accuracy and reduce error probability 1 2 N Question: Which ball among the ? sampled balls will have label closest to (? + 1)/2? Question: How many of ? balls are expected to have label (? + 1)/2? Answer: ?/2 .

  16. A better algorithm for estimating the number of balls: 1. ? ?; // ? is a multiset 2. Repeat? times { Pick a ball randomly uniformly from the bag. Let ? be the its label. ? ? {?}; return the ball into the bag; } 1. Let ? be the ? 2th largest label from ?. 2. ? 2? 1. 3. Report ?.

  17. Question: What is P( ?< ?/2) ? 1 2 N ?/? ? : number of balls sampled from [1 ?/4] P( ?< ?/2) = ?? ? is sum of ? Bernoulli random variables ?1, , ?? such that ??= 1 0 P(??=1) = ?? E[?] = ??[??] = ? 4 ?? s are independent. Applying Chernoff Bound, P(? ?/2) = P(? >?/2) if ?th ball has label from [1..?/4] otherwise ? 4) ? ? P(? 1 + 1 16

  18. Final result Theorem: The randomized Monte Carlo performs ? sampling and reports a number ? such that with probability at least ? ? ?/? ? ?? 16 ,

  19. Randomized framework for estimating a parameter Let ? be a parameter which needs to be estimated. Design a randomized experiment such that there is a random variable ? such that ? ? = ? ? If ? takes value ?, then return ?? as the estimate for ?. ? ?? To improve accuracy in estimation: repeat the experiment ? times. Let ? has taken value ??, ,??. Calculate ? such that ?? is most likely to be closest to ? ? . Return ? ??? .

  20. ESTIMATING THE SIZE OF TRANSITIVE CLOSURE OF A DIRECTED GRAPH

  21. Estimating size of Transitive Closure of a Directed Graph Let ? = (?,?) be a directed graph on ? vertices and ? edges, For any ? ?, Reach(?) = {? | ? is reachable from ?}. ?(?)= |Reach(?)| Problem: Given a directed graph ? = (?,?) on ? vertices and ? edges, compute ? ? for each ? ?. Applications: (Graph based Data bases) 1. Query requires collecting information stored at nodes reachable from a given node. 2. An estimate on the number of nodes reachable can be used to get an estimate on the time (or processing) required to answer the query. 3. This estimate can be used for optimizing a set of queries to be answered.

  22. Estimating size of Transitive Closure of a Directed Graph Problem: Given a directed graph ? = (?,?) on ? vertices and ? edges, compute ? ? for each ? ?. Deterministic Algorithm 1. Perform DFS/BFS from each ? ? to compute Reach(?). 2. 2. ? ? |Reach(?)|; 3. Return ?. Time complexity: O(??)

  23. Estimating size of Transitive Closure of a Directed Graph Problem: Given a directed graph ? = (?,?) on ? vertices and ? edges, compute ? ? for each ? ?. Randomized Monte Carlo Algorithm 1. For any ? > 0 and every vertex ? ?, computes ?(?) such that (1 ?)? ? ? ? (1 + ?)? ? Error Probability < 1/?? for any constant ? Time complexity: O((? + ?) log? ) 2. 3.

  24. Randomized Monte Carlo Algorithm for estimating the size of transitive closure of directed graph Ingredients A DeterministicO(? + ?) time algorithm for a problem MinLabel . 1. 2. Inference from the inspirational probability problem we discussed today.

  25. MIN-Label Problem Given a directed graph ? = (?,?) on ? vertices and ? edges, where each each ? ? stores a real number ? ? for each ? ?, minL ? = ??? ? ? ? Reach(?)} Problem: Given a directed graph ? = (?,?) on ? vertices and ? edges, and array ?(), compute minL ? for each ? ?.

  26. MIN-Label Problem Algorithm1 1. Compute ??: the graph obtained by reversing all edge directions. 2. Sort vertices in the increasing order of their ?() value. 3. Repeat until ?? { Pick vertex of least ?() value; Let it be ?; Perform DFS/BFS to compute Reach(?); For each vertex ? Reach(?), minL ? ?(?); Remove Reach(?)from ?? } ??Is empty Time complexity: O(? + ?log?)

  27. MIN-Label Problem Algorithm2 (usually many problems are easier on Directed acyclic graphs) 1. Compute Strongly connected components of ?. 2. Build DAG (directed acyclic graph) from ? after converting each SCC to a vertex. 3. Solve the problem on this DAG using DFS/BFS. Time complexity: O(? + ?)

  28. Inference from the inspirational problem If ? numbers are selected randomly uniformly and independently from [0,1], the expected value of the smallest number is = 1 ?+1. Question: If some numbers were selected randomly uniformly and independently from [0,1], and the smallest among them is ?, then what is a right guess for the numbers selected ? 1 ? 1 Answer: ??

  29. RANDOMIZED MONTE CARLO ALGORITHM FOR ESTIMATING THE SIZE OF TRANSITIVE CLOSURE OF A DIRECTED GRAPH

  30. ?

  31. 0.83 0.38 ? 0.53 0.22 0.45 0.71

  32. 0.65 0.901 0.265 0.28 0.81 0.34 0.49 0.63 0.54 0.83 0.38 0.14 0.74 0.53 0.22 0.45 0.71

  33. Estimating size of Transitive Closure of a Directed Graph A simple algorithm: 1. Assign to each ? a random no. ?(?) selected uniformly and independently from [0,1]. 2. Compute minL(?) for each ? ?; ? minL(?) ? 3. ?(?) ?? 4. Return ?.

  34. Estimating size of Transitive Closure of a Directed Graph A better algorithm: For? = 1 to ?do { 1. Assign to each ? a random no. selected uniformly and independently from [0,1]. 2. Compute minL(?) for each ? ?; 3.?[?,?] minL(?); } ?(?) ?? Return ?.

  35. ? Question 1: Which value among {?[1,?], , ?[?,?]} is likely to be closest to ?(?)+?? ? Question 2: How many of {?[1,?], , ?[?,?]} are likely to have value ?(?)+?? ? Question 3: What is the probability that ?[?,?] for any fixed ?is ?(?)+? ? Answer: (Hint: for this to happen all vertices in Reach(?) must get ?() ?(?)+?) ? ? ? ? This probability is = ? ? ? +? ? ? ? Can you answer Question 2 now ? 1 0 ? ?(?) + ?

  36. Estimating size of Transitive Closure of a Directed Graph A better algorithm: For? = 1 to ?do { 1. Assign to each ? a random no. selected uniformly and independently from [0,1]. 2. Compute minL(?) for each ? ?; 3.?[?,?] minL(?); } min*(?) (?/?)th largest value among {?[1,?], , ?[?,?]} ; ? ?(?) min (?) ?; Return ?.

  37. Homework Use Chernoff bound to get a high probability bound on the error. Hint: Proceed along similar lines as in the case of estimating number of balls in a bag. Make sincere attempts to do this homework. I shall discuss the same briefly in the beginning of the next class.

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#