Online Performance Guarantees for Sparse Recovery

Raja Giryes, Volkan Cevher
ICASSP 2011
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
Sparse approximation

Measurement process: u = D x* + n
n is white Gaussian noise: n_i ~ N(0, σ²)
Sparse approximation deals with determining x*.
We assume that x* is either a K-sparse vector (it has K non-zero elements) or a compressible signal (its elements obey a certain decay law).
The reconstruction result is denoted by x̂.
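Throughout, D denotes the measurement matrix (dictionary). As a concrete reference, here is a minimal sketch of this measurement model in Python/NumPy; the sizes m, N, K and the noise level sigma below are illustrative choices, not the settings of the experiments reported later.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not the experiment settings from the later slides).
m, N, K, sigma = 128, 256, 5, 0.05

# Measurement matrix D with unit-norm Gaussian columns.
D = rng.standard_normal((m, N))
D /= np.linalg.norm(D, axis=0)

# K-sparse signal x*: random support, Gaussian non-zero entries.
x_star = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x_star[support] = rng.standard_normal(K)

# Noisy measurements u = D x* + n with n_i ~ N(0, sigma^2).
n = sigma * rng.standard_normal(m)
u = D @ x_star + n
```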
Sparse approximation

Goal: bound the root mean squared error of the reconstruction.
We will first bound it from below.
After some definitions…
Restricted Isometry Property (RIP)

We say that a matrix D satisfies the Restricted Isometry Property (RIP) if
    ν_K ||x||_2² ≤ ||D x||_2² ≤ L_K ||x||_2²
for all K-sparse vectors x.
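Computing the exact restricted isometry constants requires checking every K-subset of columns, which is intractable. The sketch below only probes them by Monte Carlo sampling of random supports (so it can over-estimate ν_K and under-estimate L_K); it reuses D and K from the snippet above.

```python
# Monte Carlo probe of the RIP constants of D at sparsity K: sample random
# K-column submatrices and track the extreme squared singular values.
# The true nu_K / L_K require checking all supports, so this is only an
# optimistic estimate.
def estimate_rip_constants(D, K, trials=2000, seed=1):
    rng = np.random.default_rng(seed)
    N = D.shape[1]
    nu, L = np.inf, 0.0
    for _ in range(trials):
        S = rng.choice(N, size=K, replace=False)
        s = np.linalg.svd(D[:, S], compute_uv=False)
        nu = min(nu, s[-1] ** 2)   # smallest squared singular value seen
        L = max(L, s[0] ** 2)      # largest squared singular value seen
    return nu, L

nu_K, L_K = estimate_rip_constants(D, K)
```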
Oracle reconstruction

The least-squares oracle estimator is
    x̂_oracle = D_{x*_K}^† u.
It has full knowledge of the support of x*_K:
x*_K is a vector with the K largest elements of x* (and zeros elsewhere);
D_{x*_K} is the submatrix of D that contains the columns in the support of x*_K.
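A minimal sketch of this oracle, assuming the pseudoinverse (least squares on the true support) form written above; it reuses D, u and support from the earlier snippet.

```python
# Least-squares oracle: it is told the true support and solves a small
# least-squares problem restricted to those K columns of D.
def oracle_estimate(D, u, support):
    x_oracle = np.zeros(D.shape[1])
    # lstsq on the submatrix D_T gives the pseudoinverse solution D_T^+ u.
    coeffs, *_ = np.linalg.lstsq(D[:, support], u, rcond=None)
    x_oracle[support] = coeffs
    return x_oracle

x_oracle = oracle_estimate(D, u, support)
oracle_error = np.linalg.norm(x_oracle - x_star)
```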
Oracle reconstruction

The oracle error is bounded, up to constants that depend on the RIP constants, by
    E ||x̂_oracle − x*||_2 ≲ σ√K + σ̃_{x*,K},
where
    σ̃_{x*,K} = ||x* − x*_K||_2 + (1/√K) ||x* − x*_K||_1
is the irrecoverable energy.
[Blumensath and Davies; Giryes and Elad]
Exhaustive solution

A direct solution is obtained by solving
    min_x ||x||_0  s.t.  ||u − D x||_2 ≤ ε.
This problem is NP-hard.
In the exact K-sparse case its error is of the order of the oracle error with a factor of log N [Candes, 2006]:
    E ||x̂ − x*||_2² ≤ C log N · K σ².
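To make the combinatorial cost concrete, here is a brute-force sketch of the closely related fixed-K variant (minimize the residual over all supports of size K); the number of candidate supports, C(N, K), is what makes this intractable beyond toy sizes.

```python
from itertools import combinations

# Exhaustive search over all supports of size K, keeping the one with the
# smallest residual.  C(N, K) grows combinatorially, so this is only
# usable on tiny instances.
def exhaustive_k_sparse(D, u, K):
    N = D.shape[1]
    best_err, best_x = np.inf, None
    for S in combinations(range(N), K):
        cols = list(S)
        coeffs, *_ = np.linalg.lstsq(D[:, cols], u, rcond=None)
        err = np.linalg.norm(u - D[:, cols] @ coeffs)
        if err < best_err:
            best_x = np.zeros(N)
            best_x[cols] = coeffs
            best_err = err
    return best_x

# Example call (keep N and K tiny): x_hat = exhaustive_k_sparse(D[:, :20], u, 2)
```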
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
Approximation algorithms

The ℓ_0 norm can be relaxed to the ℓ_1 norm (Basis Pursuit):
    (BP_1):  min_x ||x||_1  s.t.  ||u − D x||_2 ≤ ε.
Or equivalently
    (BP_2):  min_x ½ ||u − D x||_2² + λ ||x||_1.
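As a sketch, the constrained form (BP_1) can be handed to a generic convex solver; using the cvxpy package and the particular choice of ε below are illustrative assumptions, not something prescribed by the slides. It reuses D, u, m, N and sigma from the first snippet.

```python
import cvxpy as cp

# Basis Pursuit, constrained form (BP_1):
#   minimize ||x||_1  subject to  ||u - D x||_2 <= eps.
eps = sigma * np.sqrt(m)              # rough estimate of the noise norm ||n||_2
x_var = cp.Variable(N)
problem = cp.Problem(cp.Minimize(cp.norm1(x_var)),
                     [cp.norm(u - D @ x_var, 2) <= eps])
problem.solve()
x_bp = x_var.value
```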
Approximation algorithms

The Dantzig Selector (DS) seeks to minimize the following objective:
    (DS):  min_x ||x||_1  s.t.  ||D^* (u − D x)||_∞ ≤ ε_DS.
Other algorithms include CoSaMP [Needell and Tropp 2009], SP [Dai and Milenkovic 2009] and IHT [Blumensath and Davies 2008, Herrity, Gilbert and Tropp 2006].
Pre-run guarantees

Under the RIP condition, the DS, BP, CoSaMP, SP and IHT solution x̂ obeys*
    ||x̂ − x*||_2² ≤ C² · 2(1 + a) log N · K σ²
with high probability [Candes and Tao 2007; Bickel, Ritov and Tsybakov 2009; Giryes and Elad 2010].

* Exact K-sparse case.
Pre-run guarantees

(Table: for each method – DS, BP, CoSaMP, SP and IHT – the constant in the bound, the required RIP condition, and the probability with which the bound holds.)
Agenda
The sparse approximation problem
Algorithms and pre-run guarantees
Online performance guarantees
Performance bound
Parameter selection
(novel part)
What do we have so far?

We have bounds for existing methods that guarantee near-oracle performance.
Each bound was developed in a separate work.
These performance guarantees do not take into account the output of the algorithms.
We will look for a general scheme:
general for any method;
takes into account the output of the algorithm.
Some notation (again)

K-term approximation error: σ_{x*,K}, a norm of the tail x* − x*_K.
Irrecoverable energy (reminder): σ̃_{x*,K} = ||x* − x*_K||_2 + (1/√K) ||x* − x*_K||_1.
Measurement error: f(x) = ||u − D x||_2².
Online bound 1 (assumed sparsity)

Theorem 4: Assume we have a reconstruction algorithm that returns x̂ as the estimate of x* and satisfies f(x̂) ≤ f(x*). Then the reconstruction result of the algorithm satisfies, up to constants that depend on the RIP constants of order 2K,
    ||x̂ − x*||_2 ≲ σ √(4(1 + a) K log N) + σ_{x*,K} + σ̃_{x*,K}
with probability exceeding 1 − (N^a √(π(1 + a) log N))^{-1}.
Online bound 2

Theorem 5: Assume we have a reconstruction algorithm that returns a K-sparse reconstruction x̂ as the estimate of x* and satisfies f(x̂) ≤ f(x*). Then the reconstruction result of the algorithm satisfies, up to constants that depend on the RIP constants of order 2K,
    ||x̂ − x*||_2 ≲ σ √(4(1 + a) K log N) + σ̃_{x*,K}
with probability exceeding 1 − (N^a √(π(1 + a) log N))^{-1}.
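To make the use of these bounds concrete, here is a heavily hedged sketch of how a Theorem 5-style post-run bound could be evaluated in code. The factor c_rip is a placeholder for the RIP-dependent constants (the exact expression is in the paper), the tail term is supplied by the caller, and none of these names come from the original slides.

```python
# Post-run (online) bound of the Theorem 5 flavour, up to RIP-dependent
# constants.  `c_rip` is a placeholder for those constants, not the exact
# expression from the paper.
def online_bound(f_hat, f_star_estimate, K, N, sigma, a=0.0, c_rip=1.0,
                 irrecoverable_energy=0.0):
    if f_hat > f_star_estimate:
        return None                # the bound is only claimed when f(x_hat) <= f(x*)
    noise_term = c_rip * sigma * np.sqrt(4.0 * (1.0 + a) * K * np.log(N))
    return noise_term + irrecoverable_energy

# Example with the oracle estimate from the earlier snippet as the candidate:
f_hat = np.linalg.norm(u - D @ x_oracle) ** 2    # f(x_hat), the measurement error
f_star_est = m * sigma ** 2                      # E||n||_2^2 as a proxy for f(x*)
bound = online_bound(f_hat, f_star_est, K, N, sigma)
```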
Concentration-of-measure

Part of the proof relies on a concentration-of-measure property [Candes and Tao 2007]:
    P( sup_i |⟨d_i, n⟩| ≤ σ √(2(1 + a) log N) ) ≥ 1 − (N^a √(π(1 + a) log N))^{-1},
where d_i denotes the i-th column of D.
So what can we do with these bounds?

Pre-run performance bounds: enforce f(x̂) ≤ f(x*) algorithmically and use Theorem 4.
Post-run performance bounds:
f(x*) can be approximated using f(x*) = ||n||_2².
Since ||n||_2² concentrates around its mean, the concentration-of-measure property can be used for a better approximation.
Having f(x̂) and an approximation of f(x*) after running an algorithm, we can use Theorem 5.
The latter can serve for parameter selection.
Agenda
The sparse approximation problem
Approximation algorithms
Online performance guarantees
Performance bound
Parameter selection
(novel part)
FLIHT method

Fast Lipschitz Iterative Hard Thresholding (FLIHT) – a fast version of IHT:
    y_{i+1} = x_i + ((a_i − 1)/a_{i+1}) (x_i − x_{i−1})
    x_{i+1} = H_K( y_{i+1} − (1/(2 L_{3K})) ∇f(y_{i+1}) )
H_K is a hard-thresholding operation that keeps the K largest elements.
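A sketch of this accelerated hard-thresholding recursion, assuming f(x) = ||u − D x||_2² and a FISTA-style momentum sequence a_i; the exact momentum sequence and step size used on the slide could not be verified, so both are assumptions here.

```python
def hard_threshold(x, K):
    # H_K: keep the K largest-magnitude entries, zero out the rest.
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -K)[-K:]
    out[idx] = x[idx]
    return out

def fliht(D, u, K, step, n_iter=200):
    """Accelerated iterative hard thresholding in the spirit of FLIHT.

    `step` plays the role of 1/(2 L_3K); a safe value has to come from an
    estimate of the RIP constant and is supplied by the caller here.
    """
    N = D.shape[1]
    x_prev = np.zeros(N)
    x = np.zeros(N)
    a = 1.0
    for _ in range(n_iter):
        a_next = (1.0 + np.sqrt(1.0 + 4.0 * a ** 2)) / 2.0   # FISTA-style momentum
        y = x + ((a - 1.0) / a_next) * (x - x_prev)
        grad = 2.0 * D.T @ (D @ y - u)                        # gradient of ||u - D y||_2^2
        x_prev, x = x, hard_threshold(y - step * grad, K)
        a = a_next
    return x

# Rough choice: use the Monte Carlo L_K estimate in place of L_3K.
x_fliht = fliht(D, u, K, step=1.0 / (2.0 * L_K))
```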
FISTA method

Fast Iterative Soft Thresholding Algorithm (FISTA) – a solver for BP.
Same as FLIHT, but with soft thresholding instead of hard thresholding, and with the Lipschitz constant L_f = 2 λ_max(D^T D) of the gradient instead of 2 L_{3K}.
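A minimal FISTA sketch for the unconstrained (BP_2)-style objective ||u − D x||_2² + λ||x||_1; the value of λ and the iteration count are illustrative assumptions.

```python
def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista(D, u, lam, n_iter=500):
    """FISTA for ||u - D x||_2^2 + lam*||x||_1: the same accelerated scheme as
    the FLIHT sketch, with soft thresholding in place of H_K and the global
    Lipschitz constant L_f = 2*lambda_max(D^T D) in place of the restricted one."""
    N = D.shape[1]
    L_f = 2.0 * np.linalg.eigvalsh(D.T @ D).max()   # Lipschitz constant of the gradient
    x_prev = np.zeros(N)
    x = np.zeros(N)
    a = 1.0
    for _ in range(n_iter):
        a_next = (1.0 + np.sqrt(1.0 + 4.0 * a ** 2)) / 2.0
        y = x + ((a - 1.0) / a_next) * (x - x_prev)
        grad = 2.0 * D.T @ (D @ y - u)
        x_prev, x = x, soft_threshold(y - grad / L_f, lam / L_f)
        a = a_next
    return x

x_fista = fista(D, u, lam=0.1)   # lam = 0.1 is an illustrative choice
```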
Experiments – setup

m = 512, N = 1024.
x* and the columns of D are normalized and drawn from the canonical Gaussian distribution.
The support of x* is chosen uniformly at random and K = 10.
The σ value ranges from high SNR down to a low SNR of 0 dB.
FLIHT uses the real value of K.
Experiments – bounds calculation

σ̃_{x*,K} is zero for FLIHT and FISTA (x* is exactly K-sparse).
σ_{x*,K} is negligible compared to the noise term in the bound.
We use a = 0.
ν_K and L_K are approximated using [Candes and Tao 2007]:
    ν_K ≈ (1 − √(K/m) − √(2t/m))²,   L_K ≈ (1 + √(K/m) + √(2t/m))².
FLIHT bound
(figure)
FISTA bound
(figure)
Agenda
The sparse approximation problem
Approximation algorithms
Online performance guarantee
Performance bound
Parameter selection
(novel part)
Recovering compressible signals

Reconstruct the K dominant elements of x*.
The rest of the elements should be discarded, since they are lost in the noise.
Choosing the right value of K is not an easy task.
Theorem 5 can be used for this task, as sketched below.
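A sketch of that selection loop: run the algorithm for a range of K, evaluate the hedged online_bound helper from the earlier snippet for each, and keep the K with the smallest bound. For compressible signals the irrecoverable-energy term should also be estimated and passed in; it is left at zero here only to keep the sketch short.

```python
# Parameter selection via the post-run bound: sweep K, evaluate the bound,
# and keep the K whose bound is smallest.
def select_K(D, u, sigma, K_values, step_for):
    N = D.shape[1]
    f_star_est = D.shape[0] * sigma ** 2          # proxy for f(x*) = ||n||_2^2
    best = (np.inf, None, None)
    for K in K_values:
        x_hat = fliht(D, u, K, step=step_for(K))
        f_hat = np.linalg.norm(u - D @ x_hat) ** 2
        b = online_bound(f_hat, f_star_est, K, N, sigma)
        if b is not None and b < best[0]:
            best = (b, K, x_hat)
    return best                                    # (bound value, selected K, estimate)

bound_val, K_sel, x_sel = select_K(D, u, sigma, K_values=range(1, 31),
                                   step_for=lambda K: 1.0 / (2.0 * L_K))
```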
Experiment – setup

x* is generated using the generalized Pareto distribution: each entry is a signed (± with probability 0.5) generalized Pareto variate obtained by inverse-CDF sampling from U ~ Uniform(0,1), with shape parameter r = 0.8 and a scale parameter of 100.
σ is selected as before.
FLIHT is applied with various values of K, ranging from 1 to 100.
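A sketch of generating such a compressible test signal with SciPy's generalized Pareto sampler; mapping the slide's r = 0.8 and scale value onto SciPy's (c, scale) parameters is an assumption, since the slide's exact parameterization could not be verified.

```python
from scipy.stats import genpareto

# Compressible test signal: generalized Pareto magnitudes with random signs.
rng = np.random.default_rng(2)
magnitudes = genpareto.rvs(c=0.8, scale=100.0, size=N)
signs = rng.choice([-1.0, 1.0], size=N)
x_compressible = signs * magnitudes
u_compressible = D @ x_compressible + sigma * rng.standard_normal(m)
```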
Experiments – bound calculation

σ_{x*,K} and σ̃_{x*,K} cannot be neglected here.
f(x*) is predicted using f(x*) = ||u − D x*||_2² = ||n||_2².
σ̃_{x*,K} is estimated using the statistics of the signal x* (the expected norms of the tail x* − x*_K under the generalized Pareto model).
The noise term is calculated as in the previous experiment.
FLIHT parameter selection
(figure)
Conclusion

Sparse recovery algorithms achieve strong recovery guarantees under RIP conditions.
Using online information can improve the prediction of their performance.
This can be used for parameter selection for those algorithms, e.g., when working with compressible signals.
Questions?