Foundations of Parameter Estimation and Decision Theory in Machine Learning


Explore the foundations of parameter estimation and decision theory in machine learning, covering frequentist estimation, properties of estimators, Bayesian parameter estimation, and the maximum likelihood estimator. Understand concepts such as consistency, the bias-variance trade-off, and the Bayesian approach to parameter estimation.



Presentation Transcript


  1. Parameter Estimation and Decision Theory
  Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya

  2. Example
  Observe whether the sky is cloudy or not cloudy on n successive days; predict whether the sky will be cloudy on the (n+1)-th day.
  Step 1: Parameter estimation. Model the observation as a random variable with a known distribution family but an unknown parameter, and guess the unknown parameter from the data.
  Step 2: Decision making. Use the guess about the unknown parameter to find the probability of the event of interest, and decide based on that probability.
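
  To make the two steps concrete, here is a minimal Python sketch; the observation sequence and the 1/2 decision threshold are invented for illustration, not taken from the lecture.

```python
# Minimal sketch of the two-step procedure (hypothetical data):
# model each day as a Bernoulli variable with unknown parameter p.

observations = [1, 0, 1, 1, 0, 1, 1]  # 1 = cloudy, 0 = not cloudy

# Step 1: parameter estimation -- guess p from the sample.
p_hat = sum(observations) / len(observations)

# Step 2: decision making -- predict cloudy if the estimated
# probability of a cloudy day exceeds 1/2.
prediction = "cloudy" if p_hat > 0.5 else "not cloudy"
print(f"estimated p = {p_hat:.3f}, predict day n+1 is {prediction}")
```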

  3. Frequentist Estimation Problem
  Problem: find the true value of a parameter based on a data sample.
  Estimator: a function from the sample space to the parameter space.
  Estimate: a specific point in the parameter space, namely the value of the estimator on the observed sample.
  Loss: a measure of the error of an estimate with respect to the true value of the parameter.
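
  A small sketch to separate the three definitions (the sample and the true parameter value are invented for illustration):

```python
# An estimator is a function from samples to parameter values;
# an estimate is its value on one observed sample.

def estimator(sample):
    """Estimator: map a sample (list of 0/1 outcomes) to a parameter."""
    return sum(sample) / len(sample)

sample = [1, 1, 0, 1, 0]              # hypothetical data
estimate = estimator(sample)          # estimate: one point, here 0.6

def loss(theta_true, theta_hat):
    """Squared-error loss of an estimate w.r.t. the true parameter."""
    return (theta_true - theta_hat) ** 2

print(estimate, loss(0.5, estimate))  # loss needs the (unknown) truth
```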

  4. Properties of Estimators
  Consistency: whether the true value is recovered as the sample size tends to infinity.
  Bias: the expected deviation of the estimate from the true value.
  Variance: the spread of the estimate around its own expectation.
  Mean squared error: decomposes as MSE = bias^2 + variance.
  Bias-variance trade-off: reducing one often increases the other.
  Properties of the maximum likelihood estimator: asymptotically unbiased; consistent; asymptotically efficient (smallest variance among unbiased estimators in the large-sample limit).
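
  These properties can be checked empirically. A Monte Carlo sketch for the Bernoulli MLE follows; the true parameter, sample size, and trial count are arbitrary illustrative choices.

```python
import random

# Estimate bias, variance, and MSE of the Bernoulli MLE p_hat = N_H / N.
random.seed(0)
p_true, N, trials = 0.3, 20, 100_000

estimates = []
for _ in range(trials):
    sample = [1 if random.random() < p_true else 0 for _ in range(N)]
    estimates.append(sum(sample) / N)

mean_est = sum(estimates) / trials
bias = mean_est - p_true
variance = sum((e - mean_est) ** 2 for e in estimates) / trials
mse = sum((e - p_true) ** 2 for e in estimates) / trials

print(f"bias={bias:+.4f} variance={variance:.4f} mse={mse:.4f}")
# MSE = bias^2 + variance; for this MLE the bias is ~0 and the
# variance shrinks like p(1-p)/N as N grows (consistency).
```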

  5. Bayesian Parameter Estimation
  Model the parameter \theta as a random variable with a prior distribution p(\theta).
  Parameter estimation problem: find the posterior distribution p(\theta \mid D) of \theta given the observed data D:
  p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}
  Likelihood: L(\theta) = p(D \mid \theta)
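
  One way to see the formula in action is a brute-force grid approximation; this sketch (the toss counts and grid resolution are invented) computes posterior proportional to likelihood times prior and normalizes by the evidence p(D).

```python
# Bayes' rule on a discretized parameter grid (a sketch, not the
# lecture's method): posterior = likelihood * prior / evidence.

N_H, N_T = 7, 3                           # hypothetical coin-toss counts
grid = [i / 100 for i in range(1, 100)]   # candidate values of theta

prior = [1.0 for _ in grid]               # uniform prior over the grid
likelihood = [t**N_H * (1 - t)**N_T for t in grid]
unnorm = [l * p for l, p in zip(likelihood, prior)]
evidence = sum(unnorm)                    # plays the role of p(D)
posterior = [u / evidence for u in unnorm]

print(max(zip(posterior, grid)))          # grid point of highest posterior
```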

  6. Bayesian Parameter Estimation
  Point estimation:
  Maximum likelihood estimator: \hat{\theta}_{ML} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} p(D \mid \theta)
  Maximum a posteriori estimator: \hat{\theta}_{MAP} = \arg\max_{\theta} p(\theta \mid D) = \arg\max_{\theta} p(\theta)\, p(D \mid \theta)
  Bayesian estimator: \hat{\theta}_{Bayes} = E[\theta \mid D] = \int \theta\, p(\theta \mid D)\, d\theta

  7. Maximum Likelihood Estimator: Illustration
  Given a sequence of coin tosses, guess the probability of heads.
  Model: x_i \sim \mathrm{Ber}(p)
  Likelihood: L(p) = p(x_1, \ldots, x_N; p)
  Log-likelihood: \ell(p) = \log L(p) = \sum_i \log p(x_i; p) = N_H \log p + N_T \log(1 - p),
  where N_H is the number of heads and N_T the number of tails in N tosses.
  Maximum likelihood estimate: \hat{p}_{ML} = \arg\max_p \ell(p) = N_H / N
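
  A short sketch with an invented toss sequence; it evaluates the log-likelihood above and checks the closed-form maximizer N_H / N against a few nearby parameter values.

```python
import math

# MLE for the Bernoulli parameter from a toss sequence.
tosses = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1]   # 1 = heads, 0 = tails
N_H, N = sum(tosses), len(tosses)
N_T = N - N_H

def log_likelihood(p):
    return N_H * math.log(p) + N_T * math.log(1 - p)

p_mle = N_H / N   # closed form: argmax of the log-likelihood
# sanity check: the closed form beats nearby values of p
assert all(log_likelihood(p_mle) >= log_likelihood(p)
           for p in (0.5, 0.6, 0.8))
print(f"N_H={N_H}, N={N}, p_MLE={p_mle}")
```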

  8. MAP Estimator: Illustration
  Model p as a random variable with a prior distribution p \sim \mathrm{Beta}(a, b): p(p; a, b) \propto p^{a-1} (1 - p)^{b-1}
  Formulate the posterior distribution:
  p(p \mid D) \propto \left[ p^{N_H} (1 - p)^{N_T} \right] \left[ p^{a-1} (1 - p)^{b-1} \right] = p^{N_H + a - 1} (1 - p)^{N_T + b - 1}
  Maximum a posteriori estimate: \hat{p}_{MAP} = \arg\max_p p(p \mid D) = \frac{N_H + a - 1}{N + a + b - 2}
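
  The closed form is easy to evaluate; the counts and hyperparameters below are illustrative choices, not values from the lecture.

```python
# MAP estimate under a Beta(a, b) prior: the mode of the posterior
# Beta(N_H + a, N_T + b).
N_H, N_T = 7, 3
a, b = 2.0, 2.0
N = N_H + N_T

p_map = (N_H + a - 1) / (N + a + b - 2)
p_mle = N_H / N
print(f"p_MLE={p_mle:.3f}  p_MAP={p_map:.3f}")
# With a = b = 2 the prior pulls the estimate toward 1/2: 8/12 ~ 0.667.
```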

  9. Bayes Estimator: Illustration
  Model p as a random variable with a prior distribution p \sim \mathrm{Beta}(a, b): p(p; a, b) \propto p^{a-1} (1 - p)^{b-1}
  Formulate the posterior distribution:
  p(p \mid D) \propto p^{N_H + a - 1} (1 - p)^{N_T + b - 1}, i.e., p \mid D \sim \mathrm{Beta}(a + N_H, b + N_T)
  Bayes estimate: \hat{p}_B = E[p \mid x_1, \ldots, x_N] = \frac{N_H + a}{N + a + b}
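
  With the same illustrative counts and hyperparameters as above, the posterior mean also has a one-line closed form:

```python
# Bayes (posterior-mean) estimate under a Beta(a, b) prior; the
# posterior is Beta(a + N_H, b + N_T), whose mean is known in closed form.
N_H, N_T = 7, 3
a, b = 2.0, 2.0
N = N_H + N_T

p_bayes = (N_H + a) / (N + a + b)
print(f"p_Bayes = {p_bayes:.3f}")   # 9/14 ~ 0.643
```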

  10. Bayes Estimator: Analysis
  \hat{p}_B = E[p \mid x_1, \ldots, x_N] = \frac{N_H + a}{N + a + b} = \frac{a + b}{N + a + b} \cdot \frac{a}{a + b} + \frac{N}{N + a + b} \cdot \frac{N_H}{N}
  A weighted average of the prior mean a/(a+b) and the MLE N_H/N.
  The weight of the MLE is proportional to the number of observations.
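
  A numeric check of the weighted-average decomposition, reusing the same illustrative numbers:

```python
# Verify: p_Bayes = w_prior * prior_mean + w_mle * p_MLE.
N_H, N_T = 7, 3
a, b = 2.0, 2.0
N = N_H + N_T

prior_mean = a / (a + b)
p_mle = N_H / N
w_prior = (a + b) / (N + a + b)
w_mle = N / (N + a + b)

p_bayes = (N_H + a) / (N + a + b)
assert abs(p_bayes - (w_prior * prior_mean + w_mle * p_mle)) < 1e-12
print(f"weights: prior={w_prior:.3f}, MLE={w_mle:.3f}")
# As N grows, w_mle -> 1: the data swamp the prior.
```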

  11. Role of Priors
  Uniform prior vs Beta prior.
  A uniform prior, p(p) \propto 1, is the special case \mathrm{Beta}(1, 1). The posterior is then
  p(p \mid D) \propto p^{N_H} (1 - p)^{N_T}, i.e., p \mid D \sim \mathrm{Beta}(N_H + 1, N_T + 1).
  The MAP estimate reduces to the MLE, \hat{p}_{MAP} = N_H / N, while the Bayes estimate becomes \hat{p}_B = \frac{N_H + 1}{N + 2} (Laplace's rule of succession).
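
  A small sketch of the (N_H + 1)/(N + 2) rule; the toss counts are invented, and the point is that the smoothed estimate never collapses to 0 or 1 on small samples.

```python
# Under a uniform prior (Beta(1, 1)), the Bayes estimate is Laplace's
# rule of succession; the MAP reduces to the plain MLE N_H / N.
def laplace_estimate(n_heads, n):
    return (n_heads + 1) / (n + 2)

# Even after 0 heads in 5 tosses the smoothed estimate stays positive,
# unlike the MLE 0/5 = 0.
print(laplace_estimate(0, 5))   # 1/7 ~ 0.143
print(laplace_estimate(7, 10))  # 8/12 ~ 0.667
```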

  12. Decision Theory
  Choose a specific point estimate under uncertainty.
  Loss functions measure the extent of error; the choice of estimate depends on the loss function.

  13. Loss Functions
  0-1 loss: L(\theta, \hat{\theta}) = I(\theta \neq \hat{\theta}), i.e., 0 if \hat{\theta} = \theta and 1 otherwise. The expected loss is minimized by the MAP estimate (the posterior mode).
  Squared (L2) loss: L(\theta, \hat{\theta}) = (\theta - \hat{\theta})^2. The expected loss E[(\theta - \hat{\theta})^2 \mid D] (minimum mean squared error) is minimized by the Bayes estimate (the posterior mean).
  L1 loss: L(\theta, \hat{\theta}) = |\theta - \hat{\theta}|. The expected loss is minimized by the posterior median.
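
  These minimizers can be verified by brute force on a discretized posterior. The sketch below uses the Beta(8, 4) posterior from N_H = 7, N_T = 3 under a uniform prior (counts and grid resolution are illustrative choices).

```python
# Minimize each expected posterior loss by grid search and compare
# against the known minimizers (mean for L2, median for L1).
grid = [i / 1000 for i in range(1, 1000)]
w = [t**7 * (1 - t)**3 for t in grid]   # unnormalized Beta(8, 4) posterior
Z = sum(w)
post = [x / Z for x in w]

def expected_loss_minimizer(loss):
    return min(grid, key=lambda est: sum(p * loss(t, est)
                                         for t, p in zip(grid, post)))

l2_min = expected_loss_minimizer(lambda t, e: (t - e) ** 2)
l1_min = expected_loss_minimizer(lambda t, e: abs(t - e))
print(f"L2 minimizer ~ {l2_min:.3f} (posterior mean = {8/12:.3f})")
print(f"L1 minimizer ~ {l1_min:.3f} (posterior median)")
# 0-1 loss on the grid is minimized at the posterior mode, p = 7/10.
```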

  14. Predictive Distribution
  Find the probability of the outcome of the (n+1)-th experiment given the outcomes of the previous n experiments: p(x_{n+1} \mid x_1, \ldots, x_n).
  Frequentist: construct a point estimate \hat{\theta} of the parameter from the n outcomes, then plug it in: p(x_{n+1} \mid x_1, \ldots, x_n) \approx p(x_{n+1}; \hat{\theta})
  Bayesian: average over the entire posterior distribution of \theta: p(x_{n+1} \mid x_1, \ldots, x_n) = \int p(x_{n+1} \mid \theta)\, p(\theta \mid x_1, \ldots, x_n)\, d\theta
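
  For the Beta-Bernoulli model the Bayesian integral has a closed form (the predictive probability of heads equals the posterior mean), which makes the plug-in vs Bayesian contrast easy to show; the counts and uniform prior below are illustrative.

```python
# Plug-in vs Bayesian predictive probability of heads on toss n+1.
N_H, N_T = 7, 3
N = N_H + N_T
a, b = 1.0, 1.0   # uniform prior, Beta(1, 1)

# Frequentist: plug the point estimate into the model.
p_next_plugin = N_H / N                 # p(x_{n+1} = 1; p_MLE) = 0.7

# Bayesian: integrate over the posterior Beta(a + N_H, b + N_T); for a
# Bernoulli outcome the integral equals the posterior mean.
p_next_bayes = (N_H + a) / (N + a + b)  # 8/12 ~ 0.667

print(p_next_plugin, p_next_bayes)
```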

  15. Summary
  The parameter estimation problem: frequentist vs Bayesian.
  MLE, MAP, and Bayes estimators for Bernoulli trials.
  Optimal estimators for different loss functions.
  Prediction using estimated parameters.
