Generalization Bounds and Algorithms in Machine Learning

Generalization bounds play a crucial role in assessing the performance of machine learning algorithms. This presentation examines how uniform stability controls generalization error, why stable methods from convex optimization (regularized ERM, gradient descent) generalize, and how the classical high-probability bounds for uniformly stable algorithms can be strengthened.


Presentation Transcript


  1. Generalization bounds for uniformly stable algorithms. Vitaly Feldman (Google Brain), joint work with Jan Vondrák.

  2. Uniform stability. Domain $Z$ (e.g., $Z = X \times Y$). Dataset $S = (z_1, \ldots, z_n) \in Z^n$. Learning algorithm $A : Z^n \to W$. Loss function $\ell : W \times Z \to \mathbb{R}_+$. Uniform stability [Bousquet, Elisseeff 02]: $A$ has uniform stability $\gamma$ w.r.t. $\ell$ if for all neighboring $S, S'$ and all $z \in Z$: $|\ell(A(S), z) - \ell(A(S'), z)| \le \gamma$.
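
A minimal numerical illustration (my own toy example, not from the slides): the empirical mean on $Z = [0,1]$ with the absolute loss is $\frac{1}{n}$-uniformly stable, and the loss gap between neighboring datasets can be checked directly.

```python
# Toy check of uniform stability (illustrative; not from the slides).
# A(S) = mean(S) on Z = [0, 1] with loss(w, z) = |w - z|: replacing one
# point moves the mean by at most 1/n, and the loss is 1-Lipschitz in w,
# so A is (1/n)-uniformly stable.
import numpy as np

def A(S):
    return np.mean(S)                 # the learning algorithm

def loss(w, z):
    return abs(w - z)                 # 1-Lipschitz in w

rng = np.random.default_rng(0)
n = 100
S = rng.uniform(0, 1, n)
S_prime = S.copy()
S_prime[0] = rng.uniform(0, 1)        # neighboring dataset: one point replaced

zs = np.linspace(0, 1, 1000)          # proxy for "all z in Z"
gap = max(abs(loss(A(S), z) - loss(A(S_prime), z)) for z in zs)
print(f"observed stability gap {gap:.4f} <= 1/n = {1/n:.4f}")
```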

  3. Generalization error. Probability distribution $P$ over $Z$, $S \sim P^n$. Population loss: $\ell_P(w) = \mathbb{E}_{z \sim P}[\ell(w, z)]$. Empirical loss: $\ell_S(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(w, z_i)$. Generalization error/gap: $\Delta_S(A) = \ell_P(A(S)) - \ell_S(A(S))$.
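
Continuing the toy example above (my construction; the population loss is approximated by Monte Carlo), the gap $\Delta_S(A)$ can be estimated as follows.

```python
# Estimating the generalization gap Delta_S(A) = l_P(A(S)) - l_S(A(S))
# for the mean learner, with P = Uniform[0, 1] and the population loss
# approximated by a large fresh sample.
import numpy as np

rng = np.random.default_rng(1)
n = 100
S = rng.uniform(0, 1, n)
w = np.mean(S)                                        # A(S)
emp = np.mean(np.abs(w - S))                          # empirical loss l_S
pop = np.mean(np.abs(w - rng.uniform(0, 1, 10**6)))   # ~ population loss l_P
print(f"estimated generalization gap: {pop - emp:.4f}")
```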

  4. Stochastic convex optimization. $W = \{ w \in \mathbb{R}^d : \|w\|_2 \le 1 \}$; for all $z \in Z$, $\ell(w, z)$ is convex and 1-Lipschitz in $w$. Goal: minimize the population loss $\ell_P(w) = \mathbb{E}_{z \sim P}[\ell(w, z)]$ over $W$, with $w^* = \operatorname{argmin}_{w \in W} \ell_P(w)$. For all $P$, for $A$ being SGD with rate $\eta = 1/\sqrt{n}$: $\Pr_S\!\big[\ell_P(A(S)) \le \ell_P(w^*) + O\big(\sqrt{\log(1/\delta)/n}\big)\big] \ge 1 - \delta$. Uniform convergence error: $\tilde{\Theta}\big(\sqrt{d/n}\big)$, and ERM might have generalization error $\Omega\big(\sqrt{d/n}\big)$ [Shalev-Shwartz, Shamir, Srebro, Sridharan 09; Feldman 16].
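
A minimal sketch of this setup (my own toy instance; the loss $\ell(w, (x, y)) = |\langle w, x\rangle - y|$ with $\|x\|_2 \le 1$ is convex and 1-Lipschitz in $w$): one-pass projected SGD over the unit ball with step size $\eta = 1/\sqrt{n}$.

```python
# One-pass projected subgradient descent (SGD) on a convex, 1-Lipschitz
# loss loss(w, (x, y)) = |<w, x> - y| over the unit ball W (toy instance).
import numpy as np

def project_ball(w):
    nrm = np.linalg.norm(w)
    return w if nrm <= 1 else w / nrm

rng = np.random.default_rng(2)
d, n = 5, 10_000
w_true = project_ball(rng.normal(size=d))
X = rng.normal(size=(n, d))
X /= np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1)  # ||x||_2 <= 1
y = X @ w_true + 0.1 * rng.normal(size=n)

eta = 1 / np.sqrt(n)                  # step size from the slide
w = np.zeros(d)
avg = np.zeros(d)                     # averaged iterate (standard for SGD)
for x_i, y_i in zip(X, y):
    g = np.sign(x_i @ w - y_i) * x_i  # subgradient, norm <= 1
    w = project_ball(w - eta * g)
    avg += w / n

excess = np.mean(np.abs(X @ avg - y)) - np.mean(np.abs(X @ w_true - y))
print(f"empirical excess loss of SGD average vs w_true: {excess:.4f}")
```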

  5. Stable optimization. Strongly convex ERM [BE 02, SSSS 09]: $A_\lambda(S) = \operatorname{argmin}_{w \in W} \ell_S(w) + \frac{\lambda}{2}\|w\|_2^2$; $A_\lambda$ is $\frac{1}{\lambda n}$-uniformly stable and minimizes $\ell_S$ within $\frac{\lambda}{2}$. Gradient descent on smooth losses [Hardt, Recht, Singer 16]: $A_T(S)$ runs $T$ steps of GD on $\ell_S$ with $\eta = \frac{1}{\sqrt{T}}$; $A_T$ is $\frac{\sqrt{T}}{n}$-uniformly stable and minimizes $\ell_S$ within $\frac{2}{\sqrt{T}}$. (A stability check for a regularized ERM is sketched below.)
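
A minimal sketch (assuming squared loss, so the strongly convex ERM is ridge regression; this specific instance is mine, not the slides'): the distance between solutions on neighboring datasets shrinks roughly like $1/(\lambda n)$.

```python
# Regularized (strongly convex) ERM via ridge regression:
# A_lambda(S) = argmin_w (1/n) * sum_i (x_i.w - y_i)^2 / 2 + (lam/2)||w||^2.
# On neighboring datasets the solutions should differ by O(1/(lam * n)).
import numpy as np

def ridge(X, y, lam):
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

rng = np.random.default_rng(3)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), rng.normal()   # replace one example

for lam in [0.01, 0.1, 1.0]:
    shift = np.linalg.norm(ridge(X, y, lam) - ridge(X2, y2, lam))
    print(f"lambda={lam:4.2f}  ||w - w'|| = {shift:.5f}"
          f"  (1/(lam*n) = {1/(lam*n):.5f})")
```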

  6. Generalization bounds. For $A$ and $\ell$ with range $[0,1]$ and uniform stability $\gamma$:
  o $\mathbb{E}_S[\Delta_S(A)] \le \gamma$ [Rogers, Wagner 78]
  o W.p. $\ge 1-\delta$: $\Delta_S(A) \le \big(\gamma\sqrt{n} + \frac{1}{\sqrt{n}}\big)\sqrt{\log(1/\delta)}$ [Bousquet, Elisseeff 02]; vacuous when $\gamma \ge 1/\sqrt{n}$
  o This work, w.p. $\ge 1-\delta$: $\Delta_S(A) \le O\!\left(\sqrt{\big(\gamma + \frac{1}{n}\big)\log(1/\delta)}\right)$
  (A numeric comparison of these bounds is sketched below.)
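
A small numeric comparison of the two high-probability regimes (constants are illustrative only, not from the paper):

```python
# Comparing the scaling of the classical high-probability bound [BE 02]
# and the bound from this work, for a gamma-uniformly-stable algorithm
# with loss range [0, 1]. Constants are set to 1 for illustration.
import numpy as np

def be02(gamma, n, delta):
    return (gamma * np.sqrt(n) + 1 / np.sqrt(n)) * np.sqrt(np.log(1 / delta))

def this_work(gamma, n, delta):
    return np.sqrt((gamma + 1 / n) * np.log(1 / delta))

n, delta = 100, 0.01
for gamma in [1 / n, 1 / np.sqrt(n), 0.5]:
    print(f"gamma={gamma:6.3f}  BE02={be02(gamma, n, delta):7.3f}  "
          f"this work={this_work(gamma, n, delta):7.3f}")
```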

  7. Comparison. [Figure: the [BE 02] bound vs. the bound of this work on the generalization error, plotted as a function of the stability parameter $\gamma$ for $n = 100$.]

  8. Second moment. This work: $\mathbb{E}_S\big[\Delta_S(A)^2\big] \le O\!\big(\gamma^2 + \frac{1}{n}\big)$; by Chebyshev, w.p. $\ge 1-\delta$: $\Delta_S(A) \le O\!\left(\frac{\gamma + 1/\sqrt{n}}{\sqrt{\delta}}\right)$. Previously [Devroye, Wagner 79; BE 02]: $\mathbb{E}_S\big[\Delta_S(A)^2\big] \le O\!\big(\gamma + \frac{1}{n}\big)$. [Figure: the resulting second-moment bounds of [BE 02] and of this work on the generalization error, as a function of $\gamma$.]
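
The Chebyshev step spelled out (my reconstruction; the constant $c$ is unspecified):

```latex
% Chebyshev's inequality applied to the second-moment bound
% E_S[Delta_S(A)^2] <= c (gamma^2 + 1/n):
\[
  \Pr_S\big[\,|\Delta_S(A)| \ge t\,\big]
    \;\le\; \frac{\mathbb{E}_S\!\big[\Delta_S(A)^2\big]}{t^2}
    \;\le\; \frac{c\,(\gamma^2 + 1/n)}{t^2}.
\]
% Setting the right-hand side to delta and solving for t gives, w.p. >= 1 - delta,
\[
  \Delta_S(A) \;\le\; \sqrt{\frac{c\,(\gamma^2 + 1/n)}{\delta}}
             \;=\; O\!\left(\frac{\gamma + 1/\sqrt{n}}{\sqrt{\delta}}\right).
\]
```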

  9. Implications. For stochastic convex optimization, w.p. $\ge 1-\delta$: previously $\ell_P(A(S)) \le \min_{w \in W} \ell_P(w) + O\!\left(\frac{\sqrt{\log(1/\delta)}}{n^{1/4}}\right)$; this work gives $\ell_P(A(S)) \le \min_{w \in W} \ell_P(w) + O\!\left(\frac{\sqrt{\log(1/\delta)}}{n^{1/3}}\right)$. Differentially private prediction [Dwork, Feldman 18]: a randomized algorithm $A(S, x)$ has $\epsilon$-DP prediction if for all neighboring $S, S'$ and all $x$: $D_\infty\!\big(A(S, x)\,\|\,A(S', x)\big) \le \epsilon$. For any loss $\ell : Y \times Y \to [0,1]$, $\mathbb{E}_A[\ell(A(S, x), y)]$ has uniform stability $e^\epsilon - 1$. Hence stronger generalization bounds for DP prediction algorithms.
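
A one-line reconstruction (mine) of why $\epsilon$-DP prediction implies uniform stability $e^\epsilon - 1$ for losses with range $[0,1]$:

```latex
% Max-divergence at most epsilon in both directions means the output
% distributions are within a multiplicative e^epsilon factor, so for a
% loss with range [0, 1]:
\[
  \mathbb{E}_A\big[\ell(A(S,x), y)\big] - \mathbb{E}_A\big[\ell(A(S',x), y)\big]
  \;\le\; \big(e^{\varepsilon} - 1\big)\,\mathbb{E}_A\big[\ell(A(S',x), y)\big]
  \;\le\; e^{\varepsilon} - 1,
\]
% and symmetrically with S and S' exchanged, giving uniform stability
% gamma = e^epsilon - 1.
```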

  10. Data-dependent functions. Consider $f : Z^n \times Z \to \mathbb{R}$, e.g., $f(S, z) = \ell(A(S), z)$. $f$ has uniform stability $\gamma$ if for all neighboring $S, S'$ and all $z \in Z$: $|f(S, z) - f(S', z)| \le \gamma$. Generalization error/gap: $\Delta_S = \mathbb{E}_{z \sim P}[f(S, z)] - \frac{1}{n}\sum_{i=1}^{n} f(S, z_i)$.

  11. Generalization in expectation. Goal: $\mathbb{E}_S\big[\mathbb{E}_{z \sim P} f(S, z)\big] \le \mathbb{E}_S\big[\frac{1}{n}\sum_i f(S, z_i)\big] + \gamma$. For all $i$ and $z \in Z$: $f(S^{i \leftarrow z}, z_i) \le f(S, z_i) + \gamma$, where $S^{i \leftarrow z}$ denotes $S$ with $z_i$ replaced by $z$; since $z_1, \ldots, z_n, z$ are all i.i.d. from $P$, exchanging $z$ and $z_i$ gives $\mathbb{E}_{S, z}[f(S, z)] = \mathbb{E}_{S, z}\big[f(S^{i \leftarrow z}, z_i)\big]$, and averaging over $i$ yields the claim (details below).
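
The exchange argument written out (my reconstruction of the standard proof):

```latex
% For each fixed i, (S, z) and (S^{i<-z}, z_i) have the same distribution
% when z_1, ..., z_n, z are i.i.d. from P, and S, S^{i<-z} are neighboring:
\[
  \mathop{\mathbb{E}}_{S,\,z \sim P}\big[f(S, z)\big]
   \;=\; \mathop{\mathbb{E}}_{S,\,z}\big[f(S^{i \leftarrow z}, z_i)\big]
   \;\le\; \mathop{\mathbb{E}}_{S}\big[f(S, z_i)\big] + \gamma .
\]
% Averaging over i = 1, ..., n:
\[
  \mathbb{E}_S\big[\mathbb{E}_{z \sim P} f(S, z)\big]
   \;\le\; \mathbb{E}_S\Big[\tfrac{1}{n}\textstyle\sum_{i=1}^{n} f(S, z_i)\Big] + \gamma .
\]
```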

  12. Concentration via McDiarmid. For all neighboring $S, S'$: (1) $\big|\mathbb{E}_{z \sim P} f(S, z) - \mathbb{E}_{z \sim P} f(S', z)\big| \le \gamma$; (2) $\big|\frac{1}{n}\sum_i f(S, z_i) - \frac{1}{n}\sum_i f(S', z'_i)\big| \le \gamma + \frac{1}{n}$ for $f$ with range $[0,1]$. Hence $\Delta_S$ has bounded differences $|\Delta_S - \Delta_{S'}| \le 2\gamma + \frac{1}{n}$, and McDiarmid gives, w.p. $\ge 1-\delta$: $\Delta_S \le \mathbb{E}_S[\Delta_S] + c\sqrt{n \log(1/\delta)}$, where $c = 2\gamma + \frac{1}{n}$.
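
For reference, the standard form of McDiarmid's inequality being invoked (statement only, with the usual textbook constants):

```latex
% If g : Z^n -> R satisfies the bounded-differences condition
% |g(S) - g(S')| <= c for all neighboring S, S', then for S ~ P^n:
\[
  \Pr_S\big[\, g(S) - \mathbb{E}_S[g(S)] \ge t \,\big]
    \;\le\; \exp\!\left(-\frac{2 t^2}{n c^2}\right),
\]
% i.e., w.p. >= 1 - delta,  g(S) <= E[g(S)] + c * sqrt(n * ln(1/delta) / 2).
% Here it is applied with g = Delta_S and c = 2*gamma + 1/n.
```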

  13. Proof technique. Based on [Nissim, Stemmer 15; BNSSSU 16]. Let $D$ be a distribution over $\mathbb{R}$: if $\Pr_{v \sim D}[v \ge t] \ge \delta$, then $\mathbb{E}_{v_1, \ldots, v_m \sim D^m}\big[\max\{0, v_1, \ldots, v_m\}\big] \ge \frac{t}{2}$ for $m = \frac{\ln 2}{\delta}$. So let $S_1, \ldots, S_m \sim P^n$; we need to bound $\mathbb{E}_{S_1, \ldots, S_m}\big[\max_j \Delta_{S_j}\big]$, i.e., compare $\mathbb{E}_{S_1, \ldots, S_m,\, j = \operatorname{argmax}_{j'}\{\Delta_{S_{j'}}\}}\big[\mathbb{E}_{z \sim P} f(S_j, z)\big]$ with $\mathbb{E}_{S_1, \ldots, S_m,\, j = \operatorname{argmax}_{j'}\{\Delta_{S_{j'}}\}}\big[\frac{1}{n}\sum_i f(S_j, z_i)\big]$.
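
Why bounding the expected max of independent copies controls the tail (my reconstruction):

```latex
% If Pr_{v ~ D}[v >= t] >= delta, then for m = ln(2)/delta independent copies
\[
  \Pr\big[\max_j v_j \ge t\big] \;\ge\; 1 - (1 - \delta)^m
    \;\ge\; 1 - e^{-\delta m} \;=\; \tfrac{1}{2},
\]
% and since max{0, v_1, ..., v_m} >= t on that event,
\[
  \mathbb{E}\big[\max\{0, v_1, \ldots, v_m\}\big] \;\ge\; \tfrac{t}{2}.
\]
% Contrapositive: an upper bound on the expected max of m copies of Delta_S
% yields a high-probability upper bound on Delta_S itself.
```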

  14. Stable max. Exponential mechanism [McSherry, Talwar 07]. $\mathrm{EM}_\epsilon$: sample index $j$ with probability $\propto e^{\epsilon v_j}$. 1. Stable: $\mathrm{EM}_\epsilon$ is $2\epsilon$-differentially private, so by the generalization-in-expectation argument $\mathbb{E}_{S_1,\ldots,S_m,\, j=\mathrm{EM}_\epsilon}\big[\mathbb{E}_{z \sim P} f(S_j, z)\big] \le e^{2\epsilon}\,\mathbb{E}_{S_1,\ldots,S_m,\, j=\mathrm{EM}_\epsilon}\big[\frac{1}{n}\sum_i f(S_j, z_i)\big] + \big(e^{2\epsilon} - 1 + \epsilon\big)\,O\!\big(2\gamma + \frac{1}{n}\big)$. 2. Approximates max: $\mathbb{E}_{S_1,\ldots,S_m,\, j=\mathrm{EM}_\epsilon}\big[\Delta_{S_j}\big] \ge \mathbb{E}_{S_1,\ldots,S_m}\big[\max_j \Delta_{S_j}\big] - \frac{\ln m}{\epsilon}$.
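
A minimal implementation sketch of the exponential mechanism for approximate argmax selection (the $\epsilon v_j / 2$ scaling is the common textbook convention for sensitivity-1 scores; the slides' exact scaling may differ):

```python
# Exponential mechanism: sample index j with probability proportional to
# exp(eps * v_j / 2). Its output distribution changes by at most an e^eps
# factor between neighboring score vectors, while its expected score stays
# within O(ln(m)/eps) of the maximum.
import numpy as np

def exponential_mechanism(v, eps, rng):
    logits = eps * np.asarray(v) / 2
    p = np.exp(logits - logits.max())   # subtract max for numerical stability
    p /= p.sum()
    return rng.choice(len(v), p=p)

rng = np.random.default_rng(4)
v = rng.uniform(0, 1, 50)
eps = 2.0
picks = [exponential_mechanism(v, eps, rng) for _ in range(2000)]
print("max v:", v.max(), " E[v_pick]:", np.mean([v[j] for j in picks]))
```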

  15. Game over. Combining the two properties: $\mathbb{E}_{S_1,\ldots,S_m}\big[\max_j \Delta_{S_j}\big] \le \frac{\ln m}{\epsilon} + \big(e^{2\epsilon} - 1\big)\cdot O\!\big(\gamma + \frac{1}{n}\big)$. Picking $\epsilon = \sqrt{\frac{\ln m}{\gamma + 1/n}}$ gives $\mathbb{E}\big[\max_j \Delta_{S_j}\big] \le O\!\left(\sqrt{\big(\gamma + \frac{1}{n}\big)\ln m}\right)$. Recall: if $\Pr_{v \sim D}[v \ge t] \ge \delta$, then $\mathbb{E}_{v_1,\ldots,v_m \sim D^m}\big[\max\{0, v_1, \ldots, v_m\}\big] \ge \frac{t}{2}$ for $m = \frac{\ln 2}{\delta}$; with this $m$, the bound above yields $\Delta_S \le O\!\left(\sqrt{\big(\gamma + \frac{1}{n}\big)\log(1/\delta)}\right)$ w.p. $\ge 1-\delta$.
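
The balancing step in one line (my reconstruction; constants suppressed, using $e^{2\epsilon} - 1 = O(\epsilon)$ for $\epsilon \le 1$):

```latex
\[
  \mathbb{E}\Big[\max_j \Delta_{S_j}\Big]
    \;\lesssim\; \frac{\ln m}{\varepsilon}
      + \varepsilon\Big(\gamma + \tfrac{1}{n}\Big)
  \quad\xrightarrow{\;\varepsilon = \sqrt{\ln m / (\gamma + 1/n)}\;}\quad
  \sqrt{\Big(\gamma + \tfrac{1}{n}\Big)\ln m}\,;
\]
% with m = ln(2)/delta this is sqrt((gamma + 1/n) * ln(1/delta)) up to
% constants, matching the high-probability bound announced on slide 6.
```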

  16. Conclusions. Better understanding of uniform stability. New proof technique. Open problems:
  o Gap between upper and lower bounds
  o High-probability generalization without strong uniformity
