Online Performance Guarantees for Sparse Recovery

Raja Giryes, Volkan Cevher
ICASSP 2011
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
Sparse approximation

Measurement process: u = D x* + n
n is white Gaussian noise: n_i ~ N(0, σ²)
Sparse approximation deals with determining x*.
We assume that x* is either a K-sparse vector (it has K non-zero elements) or a compressible signal (its elements obey a certain decay law).
The reconstruction result is denoted by x̂.
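Throughout, D denotes the measurement matrix (dictionary). As a concrete reference, here is a minimal sketch of this measurement model in Python/NumPy; the sizes m, N, K and the noise level sigma below are illustrative choices, not the settings of the experiments reported later.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not the experiment settings from the later slides).
m, N, K, sigma = 128, 256, 5, 0.05

# Measurement matrix D with unit-norm Gaussian columns.
D = rng.standard_normal((m, N))
D /= np.linalg.norm(D, axis=0)

# K-sparse signal x*: random support, Gaussian non-zero entries.
x_star = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x_star[support] = rng.standard_normal(K)

# Noisy measurements u = D x* + n with n_i ~ N(0, sigma^2).
n = sigma * rng.standard_normal(m)
u = D @ x_star + n
```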
Sparse approximation

Goal: bound the root mean squared error of the reconstruction.
We will first bound it from below.
After some definitions…
Restricted Isometry Property (RIP)

We say that a matrix D satisfies the Restricted Isometry Property (RIP) if
    ν_K ||x||_2² ≤ ||D x||_2² ≤ L_K ||x||_2²
for all K-sparse vectors x.
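Computing the exact restricted isometry constants requires checking every K-subset of columns, which is intractable. The sketch below only probes them by Monte Carlo sampling of random supports (so it can over-estimate ν_K and under-estimate L_K); it reuses D and K from the snippet above.

```python
# Monte Carlo probe of the RIP constants of D at sparsity K: sample random
# K-column submatrices and track the extreme squared singular values.
# The true nu_K / L_K require checking all supports, so this is only an
# optimistic estimate.
def estimate_rip_constants(D, K, trials=2000, seed=1):
    rng = np.random.default_rng(seed)
    N = D.shape[1]
    nu, L = np.inf, 0.0
    for _ in range(trials):
        S = rng.choice(N, size=K, replace=False)
        s = np.linalg.svd(D[:, S], compute_uv=False)
        nu = min(nu, s[-1] ** 2)   # smallest squared singular value seen
        L = max(L, s[0] ** 2)      # largest squared singular value seen
    return nu, L

nu_K, L_K = estimate_rip_constants(D, K)
```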
Oracle reconstruction

The least-squares oracle estimator is
    x̂_oracle = D_{x*_K}^† u.
It has full knowledge of the support of x*_K:
x*_K is a vector with the K largest elements of x* (and zeros elsewhere);
D_{x*_K} is the submatrix of D that contains the columns in the support of x*_K.
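A minimal sketch of this oracle, assuming the pseudoinverse (least squares on the true support) form written above; it reuses D, u and support from the earlier snippet.

```python
# Least-squares oracle: it is told the true support and solves a small
# least-squares problem restricted to those K columns of D.
def oracle_estimate(D, u, support):
    x_oracle = np.zeros(D.shape[1])
    # lstsq on the submatrix D_T gives the pseudoinverse solution D_T^+ u.
    coeffs, *_ = np.linalg.lstsq(D[:, support], u, rcond=None)
    x_oracle[support] = coeffs
    return x_oracle

x_oracle = oracle_estimate(D, u, support)
oracle_error = np.linalg.norm(x_oracle - x_star)
```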
Oracle reconstruction

The oracle error is bounded, up to constants that depend on the RIP constants, by
    E ||x̂_oracle − x*||_2 ≲ σ√K + σ̃_{x*,K},
where
    σ̃_{x*,K} = ||x* − x*_K||_2 + (1/√K) ||x* − x*_K||_1
is the irrecoverable energy.
[Blumensath and Davies; Giryes and Elad]
Exhaustive solution

A direct solution is obtained by solving
    min_x ||x||_0  s.t.  ||u − D x||_2 ≤ ε.
This problem is NP-hard.
In the exact K-sparse case its error is of the order of the oracle error with a factor of log N [Candes, 2006]:
    E ||x̂ − x*||_2² ≤ C log N · K σ².
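To make the combinatorial cost concrete, here is a brute-force sketch of the closely related fixed-K variant (minimize the residual over all supports of size K); the number of candidate supports, C(N, K), is what makes this intractable beyond toy sizes.

```python
from itertools import combinations

# Exhaustive search over all supports of size K, keeping the one with the
# smallest residual.  C(N, K) grows combinatorially, so this is only
# usable on tiny instances.
def exhaustive_k_sparse(D, u, K):
    N = D.shape[1]
    best_err, best_x = np.inf, None
    for S in combinations(range(N), K):
        cols = list(S)
        coeffs, *_ = np.linalg.lstsq(D[:, cols], u, rcond=None)
        err = np.linalg.norm(u - D[:, cols] @ coeffs)
        if err < best_err:
            best_x = np.zeros(N)
            best_x[cols] = coeffs
            best_err = err
    return best_x

# Example call (keep N and K tiny): x_hat = exhaustive_k_sparse(D[:, :20], u, 2)
```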
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
Approximation algorithms

The ℓ_0 norm can be relaxed to the ℓ_1 norm (Basis Pursuit):
    (BP_1):  min_x ||x||_1  s.t.  ||u − D x||_2 ≤ ε.
Or equivalently
    (BP_2):  min_x ½ ||u − D x||_2² + λ ||x||_1.
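As a sketch, the constrained form (BP_1) can be handed to a generic convex solver; using the cvxpy package and the particular choice of ε below are illustrative assumptions, not something prescribed by the slides. It reuses D, u, m, N and sigma from the first snippet.

```python
import cvxpy as cp

# Basis Pursuit, constrained form (BP_1):
#   minimize ||x||_1  subject to  ||u - D x||_2 <= eps.
eps = sigma * np.sqrt(m)              # rough estimate of the noise norm ||n||_2
x_var = cp.Variable(N)
problem = cp.Problem(cp.Minimize(cp.norm1(x_var)),
                     [cp.norm(u - D @ x_var, 2) <= eps])
problem.solve()
x_bp = x_var.value
```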
Approximation algorithms

The Dantzig Selector (DS) seeks to minimize the following objective:
    (DS):  min_x ||x||_1  s.t.  ||D^* (u − D x)||_∞ ≤ ε_DS.
Other algorithms include CoSaMP [Needell and Tropp 2009], SP [Dai and Milenkovic 2009] and IHT [Blumensath and Davies 2008, Herrity, Gilbert and Tropp 2006].
Pre-run guarantees

Under the RIP condition, the DS, BP, CoSaMP, SP and IHT solution x̂ obeys*
    ||x̂ − x*||_2² ≤ C² · 2(1 + a) log N · K σ²
with high probability [Candes and Tao 2007; Bickel, Ritov and Tsybakov 2009; Giryes and Elad 2010].

* Exact K-sparse case.
Pre-run guarantees

(Table: for each method – DS, BP, CoSaMP, SP and IHT – the constant in the bound, the required RIP condition, and the probability with which the bound holds.)
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
(novel part)
What do we have so far?

We have bounds for existing methods that guarantee near-oracle performance.
Each bound was developed in a separate work.
These performance guarantees do not take into account the output of the algorithms.
We will look for a general scheme:
general for any method;
takes into account the output of the algorithm.
Some notation (again)

K-term approximation error: σ_{x*,K}, a norm of the tail x* − x*_K.
Irrecoverable energy (reminder): σ̃_{x*,K} = ||x* − x*_K||_2 + (1/√K) ||x* − x*_K||_1.
Measurement error: f(x) = ||u − D x||_2².
Online bound 1 (assumed sparsity)

Theorem 4: Assume we have a reconstruction algorithm that returns x̂ as the estimate of x* and satisfies f(x̂) ≤ f(x*). Then the reconstruction result of the algorithm satisfies, up to constants that depend on the RIP constants of order 2K,
    ||x̂ − x*||_2 ≲ σ √(4(1 + a) K log N) + σ_{x*,K} + σ̃_{x*,K}
with probability exceeding 1 − (N^a √(π(1 + a) log N))^{-1}.
Online bound 2

Theorem 5: Assume we have a reconstruction algorithm that returns a K-sparse reconstruction x̂ as the estimate of x* and satisfies f(x̂) ≤ f(x*). Then the reconstruction result of the algorithm satisfies, up to constants that depend on the RIP constants of order 2K,
    ||x̂ − x*||_2 ≲ σ √(4(1 + a) K log N) + σ̃_{x*,K}
with probability exceeding 1 − (N^a √(π(1 + a) log N))^{-1}.
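To make the use of these bounds concrete, here is a heavily hedged sketch of how a Theorem 5-style post-run bound could be evaluated in code. The factor c_rip is a placeholder for the RIP-dependent constants (the exact expression is in the paper), the tail term is supplied by the caller, and none of these names come from the original slides.

```python
# Post-run (online) bound of the Theorem 5 flavour, up to RIP-dependent
# constants.  `c_rip` is a placeholder for those constants, not the exact
# expression from the paper.
def online_bound(f_hat, f_star_estimate, K, N, sigma, a=0.0, c_rip=1.0,
                 irrecoverable_energy=0.0):
    if f_hat > f_star_estimate:
        return None                # the bound is only claimed when f(x_hat) <= f(x*)
    noise_term = c_rip * sigma * np.sqrt(4.0 * (1.0 + a) * K * np.log(N))
    return noise_term + irrecoverable_energy

# Example with the oracle estimate from the earlier snippet as the candidate:
f_hat = np.linalg.norm(u - D @ x_oracle) ** 2    # f(x_hat), the measurement error
f_star_est = m * sigma ** 2                      # E||n||_2^2 as a proxy for f(x*)
bound = online_bound(f_hat, f_star_est, K, N, sigma)
```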
Concentration-of-measure

Part of the proof relies on a concentration-of-measure property [Candes and Tao 2007]:
    P( sup_i |⟨d_i, n⟩| ≤ σ √(2(1 + a) log N) ) ≥ 1 − (N^a √(π(1 + a) log N))^{-1},
where d_i denotes the i-th column of D.
So what can we do with these bounds?

Pre-run performance bounds: enforce f(x̂) ≤ f(x*) algorithmically and use Theorem 4.
Post-run performance bounds:
f(x*) can be approximated using f(x*) = ||n||_2².
Since ||n||_2² concentrates around its mean, the concentration-of-measure property can be used for a better approximation.
Having f(x̂) and an approximation of f(x*) after running an algorithm, we can use Theorem 5.
The latter can serve for parameter selection.
Agenda
The sparse approximation problem
Approximation algorithms
Online performance guarantees
Performance bound
Parameter selection
(novel part)
FLIHT method

Fast Lipschitz Iterative Hard Thresholding (FLIHT) – a fast version of IHT:
    y_{i+1} = x_i + ((a_i − 1)/a_{i+1}) (x_i − x_{i−1})
    x_{i+1} = H_K( y_{i+1} − (1/(2 L_{3K})) ∇f(y_{i+1}) )
H_K is a hard-thresholding operation that keeps the K largest elements.
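A sketch of this accelerated hard-thresholding recursion, assuming f(x) = ||u − D x||_2² and a FISTA-style momentum sequence a_i; the exact momentum sequence and step size used on the slide could not be verified, so both are assumptions here.

```python
def hard_threshold(x, K):
    # H_K: keep the K largest-magnitude entries, zero out the rest.
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -K)[-K:]
    out[idx] = x[idx]
    return out

def fliht(D, u, K, step, n_iter=200):
    """Accelerated iterative hard thresholding in the spirit of FLIHT.

    `step` plays the role of 1/(2 L_3K); a safe value has to come from an
    estimate of the RIP constant and is supplied by the caller here.
    """
    N = D.shape[1]
    x_prev = np.zeros(N)
    x = np.zeros(N)
    a = 1.0
    for _ in range(n_iter):
        a_next = (1.0 + np.sqrt(1.0 + 4.0 * a ** 2)) / 2.0   # FISTA-style momentum
        y = x + ((a - 1.0) / a_next) * (x - x_prev)
        grad = 2.0 * D.T @ (D @ y - u)                        # gradient of ||u - D y||_2^2
        x_prev, x = x, hard_threshold(y - step * grad, K)
        a = a_next
    return x

# Rough choice: use the Monte Carlo L_K estimate in place of L_3K.
x_fliht = fliht(D, u, K, step=1.0 / (2.0 * L_K))
```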
FISTA method

Fast Iterative Soft Thresholding Algorithm (FISTA) – a solver for BP.
Same as FLIHT, but with soft thresholding instead of hard thresholding, and with the Lipschitz constant L_f = 2 λ_max(D^T D) of the gradient instead of 2 L_{3K}.
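A minimal FISTA sketch for the unconstrained (BP_2)-style objective ||u − D x||_2² + λ||x||_1; the value of λ and the iteration count are illustrative assumptions.

```python
def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista(D, u, lam, n_iter=500):
    """FISTA for ||u - D x||_2^2 + lam*||x||_1: the same accelerated scheme as
    the FLIHT sketch, with soft thresholding in place of H_K and the global
    Lipschitz constant L_f = 2*lambda_max(D^T D) in place of the restricted one."""
    N = D.shape[1]
    L_f = 2.0 * np.linalg.eigvalsh(D.T @ D).max()   # Lipschitz constant of the gradient
    x_prev = np.zeros(N)
    x = np.zeros(N)
    a = 1.0
    for _ in range(n_iter):
        a_next = (1.0 + np.sqrt(1.0 + 4.0 * a ** 2)) / 2.0
        y = x + ((a - 1.0) / a_next) * (x - x_prev)
        grad = 2.0 * D.T @ (D @ y - u)
        x_prev, x = x, soft_threshold(y - grad / L_f, lam / L_f)
        a = a_next
    return x

x_fista = fista(D, u, lam=0.1)   # lam = 0.1 is an illustrative choice
```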
Experiments – setup

m = 512, N = 1024.
x* and the columns of D are normalized and drawn from the canonical Gaussian distribution.
The support of x* is chosen uniformly at random and K = 10.
The σ value ranges from high SNR down to a low SNR of 0 dB.
FLIHT uses the real value of K.
Experiments – bounds calculation

σ̃_{x*,K} is zero for FLIHT and FISTA (x* is exactly K-sparse).
σ_{x*,K} is negligible compared to the noise term in the bound.
We use a = 0.
ν_K and L_K are approximated using [Candes and Tao 2007]:
    ν_K ≈ (1 − √(K/m) − √(2t/m))²,   L_K ≈ (1 + √(K/m) + √(2t/m))².
FLIHT bound
(figure)
FISTA bound
(figure)
Agenda
The sparse approximation problem
Approximation algorithms
Online performance guarantee
Performance bound
Parameter selection
(novel part)
Recovering compressible signals

Reconstruct the K dominant elements of x*.
The rest of the elements should be discarded, since they are lost in the noise.
Choosing the right value of K is not an easy task.
Theorem 5 can be used for this task, as sketched below.
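A sketch of that selection loop: run the algorithm for a range of K, evaluate the hedged online_bound helper from the earlier snippet for each, and keep the K with the smallest bound. For compressible signals the irrecoverable-energy term should also be estimated and passed in; it is left at zero here only to keep the sketch short.

```python
# Parameter selection via the post-run bound: sweep K, evaluate the bound,
# and keep the K whose bound is smallest.
def select_K(D, u, sigma, K_values, step_for):
    N = D.shape[1]
    f_star_est = D.shape[0] * sigma ** 2          # proxy for f(x*) = ||n||_2^2
    best = (np.inf, None, None)
    for K in K_values:
        x_hat = fliht(D, u, K, step=step_for(K))
        f_hat = np.linalg.norm(u - D @ x_hat) ** 2
        b = online_bound(f_hat, f_star_est, K, N, sigma)
        if b is not None and b < best[0]:
            best = (b, K, x_hat)
    return best                                    # (bound value, selected K, estimate)

bound_val, K_sel, x_sel = select_K(D, u, sigma, K_values=range(1, 31),
                                   step_for=lambda K: 1.0 / (2.0 * L_K))
```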
Experiment – setup

x* is generated using the generalized Pareto distribution: each entry is a signed (± with probability 0.5) generalized Pareto variate obtained by inverse-CDF sampling from U ~ Uniform(0,1), with shape parameter r = 0.8 and a scale parameter of 100.
σ is selected as before.
FLIHT is applied with various values of K, ranging from 1 to 100.
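A sketch of generating such a compressible test signal with SciPy's generalized Pareto sampler; mapping the slide's r = 0.8 and scale value onto SciPy's (c, scale) parameters is an assumption, since the slide's exact parameterization could not be verified.

```python
from scipy.stats import genpareto

# Compressible test signal: generalized Pareto magnitudes with random signs.
rng = np.random.default_rng(2)
magnitudes = genpareto.rvs(c=0.8, scale=100.0, size=N)
signs = rng.choice([-1.0, 1.0], size=N)
x_compressible = signs * magnitudes
u_compressible = D @ x_compressible + sigma * rng.standard_normal(m)
```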
Experiments – bound calculation

σ_{x*,K} and σ̃_{x*,K} cannot be neglected here.
f(x*) is predicted using f(x*) = ||u − D x*||_2² = ||n||_2².
σ̃_{x*,K} is estimated using the statistics of the signal x* (the expected norms of the tail x* − x*_K under the generalized Pareto model).
The noise term is calculated as in the previous experiment.
FLIHT parameter selection
(figure)
Conclusion

Sparse recovery algorithms achieve strong recovery guarantees under RIP conditions.
Using online information can improve the prediction of their performance.
This can be used for parameter selection for those algorithms, e.g., when working with compressible signals.
Questions?