Understanding Properties of OLS Estimators in Econometrics
Exploring the concept of sampling error, deriving properties of OLS estimators, and examining the accuracy of sample estimates in regression analysis. The focus is on unbiasedness, consistency, and standard error calculations in estimating population parameters using random samples. Real-life examples from school data illustrate these key econometric concepts.
Chapter 5 Properties of our Estimators
Terminology: the following two terms refer to the same thing: (1) the OLS estimator, (2) b.
Learning Objectives
- Demonstrate the concept of sampling error
- Derive the best, linear, and unbiased (BLUE) properties of the ordinary least-squares (OLS) estimator
- Develop the formula for the standard error of the OLS coefficient
- Describe the consistency property of the OLS estimator
20 More Schools: API and Free Lunch Eligibility (FLE) for 20 California Elementary Schools
Fitted line for one sample of 20 schools: API_i = 951.87 − 2.11 FLE_i + e_i
[Scatter plot of API (roughly 650–950) against FLE (%), showing two fitted lines: y = −2.11x + 951.87 for one sample of 20 schools and y = −1.29x + 885.38 for a different sample of 20 schools]
This relationship is different for a different 20 schools. What if we draw random samples of 20 schools 10,000 times and use them to estimate our model 10,000 times?
The Population: All 5,765 Schools
Population regression: API_i = 925.3 − 1.80 FLE_i + ε_i
[Scatter plot of API (roughly 500–1000) against FLE (0–100%) for all 5,765 schools, with fitted line y = −1.80x + 925.3]
Slope Estimates from 10,000 Random Samples of 20 Schools Each
The average slope estimate is −1.80 (this is also the slope from a regression using all 5,765 schools!). Most of the time we get close to the true population parameter, but not always; on average, we get the true population parameter.
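The repeated-sampling experiment above can be sketched in a few lines of Python. The population below is synthetic, built to mimic the API/FLE relationship with an assumed error spread, since the actual school data are not included here:

```python
# A minimal sketch of the repeated-sampling experiment.
# Assumption: a synthetic population mimicking API = 925.3 - 1.80*FLE + error.
import random

random.seed(0)

N_POP = 5765
pop = []
for _ in range(N_POP):
    fle = random.uniform(0, 100)                     # free-lunch eligibility (%)
    api = 925.3 - 1.80 * fle + random.gauss(0, 60)   # assumed error spread
    pop.append((fle, api))

def ols_slope(sample):
    """OLS slope: sum(x_i * y_i) / sum(x_i^2) in deviations from the means."""
    n = len(sample)
    xbar = sum(x for x, _ in sample) / n
    ybar = sum(y for _, y in sample) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in sample)
    sxx = sum((x - xbar) ** 2 for x, _ in sample)
    return sxy / sxx

# Draw 10,000 random samples of 20 schools; each sample gives one slope estimate.
slopes = [ols_slope(random.sample(pop, 20)) for _ in range(10_000)]
avg_slope = sum(slopes) / len(slopes)
print(round(avg_slope, 2))   # close to the population slope of -1.80
```

Individual slope estimates scatter around the population value, but their average across the 10,000 samples lands very near it, which is unbiasedness in action.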
The Econometric Property of Unbiasedness in Action
- Choosing a sample of 20 schools and running a regression gives us an unbiased estimate of the population coefficient
- The average estimate across samples of 20 equals the population value
- We are not systematically overestimating or underestimating the value of the parameter
- The expected value of our estimate is the population parameter value
- In this illustration, the key to unbiasedness is random sampling from the population
Accuracy: How Close Can We Expect Our Sample Estimate to Be to the Population Parameter?
- The smaller the standard deviation of the estimates across samples, the more accurate a given sample estimate is likely to be
- The standard deviation of an estimator across samples is known as its standard error
- We want to use statistical procedures that give us the smallest possible standard error
- One way to get a small standard error is to use a large sample
Larger Samples Generally Give Smaller Standard Errors If we had used samples of 100 instead of 20 in our experiment, it turns out that we would have obtained a standard error of 0.18 instead of 0.41 and a more accurate estimate.
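A quick simulation shows the same pattern. The population here is again synthetic, so the exact numbers differ from the 0.41 and 0.18 reported above, but the standard error shrinks with the sample size in the same way:

```python
# Sketch: the spread of slope estimates shrinks as the sample grows.
# Assumption: a synthetic population mimicking API = 925.3 - 1.80*FLE + error.
import random
import statistics

random.seed(1)
pop = [(fle, 925.3 - 1.80 * fle + random.gauss(0, 60))
       for fle in (random.uniform(0, 100) for _ in range(5765))]

def ols_slope(sample):
    n = len(sample)
    xbar = sum(x for x, _ in sample) / n
    ybar = sum(y for _, y in sample) / n
    return (sum((x - xbar) * (y - ybar) for x, y in sample)
            / sum((x - xbar) ** 2 for x, _ in sample))

def se_across_samples(n, reps=2000):
    """Standard deviation of the OLS slope across repeated samples of size n."""
    return statistics.stdev(ols_slope(random.sample(pop, n)) for _ in range(reps))

se20 = se_across_samples(20)
se100 = se_across_samples(100)   # noticeably smaller than se20
```

The standard error for samples of 100 comes out well below the one for samples of 20, mirroring the roughly 1/sqrt(N) shrinkage the formula predicts.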
Another Way to Get a Small Standard Error Is to Use an Efficient Estimation Procedure
If you want to estimate the population mean, the best (most efficient) linear unbiased estimator (BLUE) is the sample mean:
Ybar = (1/N) Σ_{i=1}^{N} Y_i
If you want to estimate the slope of a regression of Y on X, the BLUE turns out to be the OLS slope:
b_1 = Σ_{i=1}^{N} x_i y_i / Σ_{i=1}^{N} x_i²
where x_i = X_i − Xbar and y_i = Y_i − Ybar are deviations from the sample means.
Next: (1) show that these are BLUE; (2) along the way, estimate Var(b_1), which we use in Chapter 6 for hypothesis tests.
Sample Mean Estimator Is Linear and Unbiased (Given CR1)
E[Ybar] = E[(1/N) Σ_i Y_i]    (by definition)
        = (1/N) Σ_i E[Y_i]    (expectation of a sum = sum of expectations)
        = (1/N) · N·μ         (assume CR1, so E[Y_i] = μ for all i)
        = μ
Deriving the Variance Requires CR2 & CR3
V[Ybar] = V[(1/N) Σ_i Y_i]    (by definition)
        = (1/N²) Σ_i V[Y_i]   (assume CR3: uncorrelated errors, so no covariance terms)
        = (1/N²) Σ_i σ²       (assume CR2: V[Y_i] = σ² for all i)
        = (1/N²) · N·σ² = σ²/N
Estimate σ² with the sample variance:
s² = (1/(N−1)) Σ_i (Y_i − Ybar)²
So Ybar ~ N(μ, σ²/N), and s.e.(Ybar) = σ/√N.
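The σ²/N result is easy to confirm by simulation. The mean and standard deviation below are arbitrary synthetic choices:

```python
# Sketch: check Var(Ybar) = sigma^2 / N by simulation.
# Assumption: MU and SIGMA are arbitrary synthetic values.
import random
import statistics

random.seed(2)
MU, SIGMA, N = 700.0, 90.0, 25

# Compute the sample mean of N draws, many times over.
ybars = [statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
         for _ in range(20_000)]

sd_of_ybar = statistics.stdev(ybars)   # simulated standard error of Ybar
theoretical = SIGMA / N ** 0.5         # sigma / sqrt(N)
```

The simulated standard deviation of Ybar across samples lands close to σ/√N, here 90/√25 = 18.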
Classical Regression Assumptions
Population model: Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_K X_Ki + ε_i
Sample model:     Y_i = b_0 + b_1 X_1i + b_2 X_2i + ... + b_K X_Ki + e_i
- CR1: Representative sample. Our sample is representative of the population we want to say something about.
- CR2: Homoskedastic errors. The variance of the error is constant over all of our data: Var(ε_i) = σ².
- CR3: Uncorrelated errors. The errors are uncorrelated across observations: Cov(ε_i, ε_j) = 0 for i ≠ j.
- CR4: Normally distributed errors. This assumption is only required for small samples.
- CR5: Exogenous X. The values of the explanatory variable are exogenous, or given (there are no errors in the X direction). Only required for causality.
OLS Estimator Is Linear and Unbiased (Given CR1)
The OLS slope is linear in the Y_i:
b_1 = Σ_i w_i Y_i, where w_i = x_i / Σ_j x_j² and x_i = X_i − Xbar.
Two facts about the weights:
Σ_i w_i = (Σ_i x_i) / Σ_j x_j² = 0    (since Σ_i x_i = Σ_i (X_i − Xbar) = N·Xbar − N·Xbar = 0)
Σ_i w_i X_i = Σ_i w_i x_i = (Σ_i x_i²) / Σ_j x_j² = 1
Unbiasedness:
E[b_1] = E[Σ_i w_i (β_0 + β_1 X_i + ε_i)]    (substituting the population model for Y_i)
       = β_0 Σ_i w_i + β_1 Σ_i w_i X_i + E[Σ_i w_i ε_i]
       = β_0·0 + β_1·1 + Σ_i E[w_i ε_i]      (expectation of a sum = sum of expectations)
       = β_1
Remember: X is uncorrelated with ε in the population model, so E[w_i ε_i] = 0.
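The two weight identities used above can be checked numerically on any set of X values (the values below are synthetic):

```python
# Sketch: numerical check that sum(w_i) = 0 and sum(w_i * X_i) = 1
# for the OLS weights w_i = x_i / sum(x_j^2). X values are synthetic.
import random

random.seed(7)
X = [random.uniform(0, 100) for _ in range(20)]
Xbar = sum(X) / len(X)
x = [xi - Xbar for xi in X]              # deviations from the sample mean
sxx = sum(xi ** 2 for xi in x)
w = [xi / sxx for xi in x]               # OLS weights

sum_w = sum(w)                                   # ~ 0
sum_wX = sum(wi * Xi for wi, Xi in zip(w, X))    # ~ 1
```

Both identities hold to floating-point precision, whatever X values you feed in.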
Deriving Its Variance Requires CR2 & CR3
V[b_1] = V[Σ_i w_i Y_i]      (by definition of b_1)
       = Σ_i w_i² V[Y_i]     (assume CR3; we condition on X, i.e., take it as given)
       = Σ_i w_i² V[ε_i]     (we condition on X, i.e., take it as given)
       = σ² Σ_i w_i²         (assume CR2)
       = σ² / Σ_i x_i²       (since Σ_i w_i² = Σ_i x_i² / (Σ_j x_j²)² = 1 / Σ_i x_i²)
So b_1 ~ N(β_1, σ²/Σ_i x_i²).
Estimate σ² with:
s² = (1/(N−K−1)) Σ_i (Y_i − Yhat_i)²    (sum of squared errors over degrees of freedom)
Properties of OLS Estimator b
Given assumptions CR1, CR2, CR3, and CR4:
b_1 ~ N(β_1, σ²/Σ_i x_i²),  or equivalently  (b_1 − β_1)/s.e.[b_1] ~ N(0, 1)
CR4 is not necessary if N is large, thanks to the CLT.
We don't know σ², so we have to estimate it with:
s² = (1/(N−K−1)) Σ_i (Y_i − Yhat_i)²    (sum of squared errors over degrees of freedom)
est. s.e.[b_1] = s / √(Σ_i x_i²)
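Putting the pieces together, the estimated standard error of b_1 from a single sample can be computed directly. The coefficients and error spread below are synthetic assumptions:

```python
# Sketch: estimated standard error of b1 from one sample,
# using s^2 = SSE / (N - K - 1) with K = 1 regressor. Data are synthetic.
import random

random.seed(3)
N = 50
data = [(x, 925.3 - 1.80 * x + random.gauss(0, 60))
        for x in (random.uniform(0, 100) for _ in range(N))]

xbar = sum(x for x, _ in data) / N
ybar = sum(y for _, y in data) / N
sxx = sum((x - xbar) ** 2 for x, _ in data)
b1 = sum((x - xbar) * (y - ybar) for x, y in data) / sxx
b0 = ybar - b1 * xbar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in data)   # sum of squared errors
s2 = sse / (N - 1 - 1)                                 # N - K - 1 degrees of freedom
est_se_b1 = (s2 / sxx) ** 0.5                          # s / sqrt(sum x_i^2)
```

This is exactly the quantity Chapter 6 uses to build hypothesis tests on b_1.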
How to Get a Smaller Standard Error
s.e.[b_1] = σ / √(Σ_i x_i²)
- Sample drawn from a population with a smaller σ
- Larger sample size (so the denominator sums over more observations)
- Lots of variation in X (so big x_i's. Why?)
Properties of b
1. If CR1 holds, then b is unbiased (i.e., E[b] = β)
2. If CR1-CR3 hold, then b is BLUE (i.e., it has a smaller standard error than any other linear unbiased estimator)
3. If CR1-CR3 hold, then the standard error formula is s.e.[b_1] = σ / √(Σ_i x_i²)
4. If CR1-CR4 hold, then b is BUE (i.e., it has a smaller standard error than any other unbiased estimator)
Showing the "Best" in BLUE (N = 2; X_1 is person 1's income and X_2 is person 2's income)
Which of the following is an unbiased estimator of mean income? (Circle the best answer.)
(1/2)X_1 + (1/2)X_2    (1/4)X_1 + (3/4)X_2    Both    Neither
Which of the following is the best (that is, least-variance) estimator of mean income?
(1/2)X_1 + (1/2)X_2    (1/4)X_1 + (3/4)X_2    Both    Neither
Generally, for the class of linear unbiased estimators (LUE) of the mean,
Xtilde = Σ_i c_i X_i,
only the weights c_i = 1/N make it BLUE. That's why Ybar = (1/N) Σ_i Y_i.
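A simulation makes the quiz answer concrete: both weightings are unbiased, but the equal weights have the smaller variance. The income distribution here is an arbitrary assumption:

```python
# Sketch: with two observations, (1/2, 1/2) and (1/4, 3/4) weightings are
# both unbiased for the mean, but equal weights give the smaller variance.
# Assumption: synthetic incomes drawn from a normal distribution.
import random
import statistics

random.seed(4)
MU, SIGMA = 50_000.0, 10_000.0
reps = 20_000

even, uneven = [], []
for _ in range(reps):
    x1, x2 = random.gauss(MU, SIGMA), random.gauss(MU, SIGMA)
    even.append(0.5 * x1 + 0.5 * x2)
    uneven.append(0.25 * x1 + 0.75 * x2)

# Both are centered on MU (unbiased) ...
mean_even, mean_uneven = statistics.fmean(even), statistics.fmean(uneven)
# ... but equal weights give variance sigma^2/2 vs (1/16 + 9/16) sigma^2.
var_even, var_uneven = statistics.variance(even), statistics.variance(uneven)
```

Theoretically var_even is σ²/2 while var_uneven is (10/16)σ², so the (1/2, 1/2) estimator wins, just as the c_i = 1/N rule says.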
Same Logic Applies to OLS
b_1 = Σ_i w_i Y_i
The OLS estimator has the lowest standard error of all linear unbiased estimators. Given assumptions CR1-CR3, it is BLUE. This is the Gauss-Markov Theorem.
If CR2 or CR3 breaks down, we lose the "B" in BLUE.
If CR1 is violated, we lose the "U" too!
Multiple Regression
Y_i = b_0 + b_1 X_1i + b_2 X_2i + ... + b_K X_Ki + e_i
E[b_k] = β_k  and  s.e.[b_k] = σ / √(Σ_i v_ki²)
v_ki is the part of variable X_k that is not correlated with any of the other right-hand-side variables, i.e., the residual from a regression of X_k on all the other explanatory variables. For two right-hand-side variables:
X_1i = c_0 + c_1 X_2i + v_1i
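This partialling-out idea can be verified numerically: the coefficient on X_1 from the two-regressor normal equations equals the slope of Y on the residual v_1 from regressing X_1 on X_2. All data below are synthetic:

```python
# Sketch: the coefficient on X1 in a two-regressor model equals the slope
# of Y on v1, the residual of X1 after regressing it on X2. Data synthetic.
import random

random.seed(5)
N = 200
rows = []
for _ in range(N):
    x2 = random.uniform(0, 10)
    x1 = 2.0 + 0.5 * x2 + random.gauss(0, 1)       # X1 correlated with X2
    y = 1.0 + 3.0 * x1 - 2.0 * x2 + random.gauss(0, 1)
    rows.append((x1, x2, y))

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

x1, x2, y = (demean([r[i] for r in rows]) for i in range(3))

s11 = sum(a * a for a in x1)
s22 = sum(a * a for a in x2)
s12 = sum(a * b for a, b in zip(x1, x2))
s1y = sum(a * b for a, b in zip(x1, y))
s2y = sum(a * b for a, b in zip(x2, y))

# Coefficient on X1 from the two-regressor normal equations
b1_multi = (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)

# Residual of X1 after taking out X2, then the slope of Y on that residual
c1 = s12 / s22
v1 = [a - c1 * b for a, b in zip(x1, x2)]
b1_fwl = sum(a * b for a, b in zip(v1, y)) / sum(a * a for a in v1)
```

The two routes give the same coefficient (this is the Frisch-Waugh-Lovell result), and the standard error formula above uses exactly Σ v_1i² in its denominator.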
Consistent Estimator
- Consistency means that the estimator is more likely to be close to the population parameter as the sample size increases
- Only need to assume CR1
When is an estimator not consistent?
- Silly estimator (estimate average height, but count all the tall people twice)
- Weird distribution (estimate average wages, but some people have zero hours worked, so you divide by zero)
- CR1 fails (increasing the sample size will not improve your estimator if you are not using representative data)
Features of the Sample OLS Regression (you can check these on a spreadsheet or in Stata)
(1) The least-squares regression residuals sum to zero
(2) The actual and predicted values of Y_i have the same mean, since the residuals sum to zero and Y_i = Yhat_i + e_i
(3) The sample residuals are uncorrelated with X_i (nothing more in Y_i can be explained by X_i or by a linear function of X_i)
(4) The covariance between the predicted values Yhat_i and the residuals is zero; this follows from (3)
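All four features can be checked numerically on a small synthetic sample; they hold exactly (up to floating-point rounding), not just approximately:

```python
# Sketch: numerical check of the four residual properties of OLS.
# Assumption: a small synthetic sample with arbitrary coefficients.
import random

random.seed(6)
N = 30
data = [(x, 5.0 - 0.7 * x + random.gauss(0, 2))
        for x in (random.uniform(0, 20) for _ in range(N))]

xbar = sum(x for x, _ in data) / N
ybar = sum(y for _, y in data) / N
sxx = sum((x - xbar) ** 2 for x, _ in data)
b1 = sum((x - xbar) * (y - ybar) for x, y in data) / sxx
b0 = ybar - b1 * xbar

resid = [y - (b0 + b1 * x) for x, y in data]
fitted = [b0 + b1 * x for x, _ in data]

sum_resid = sum(resid)                                   # (1) ~ 0
mean_fitted = sum(fitted) / N                            # (2) equals ybar
cov_x_e = sum((x - xbar) * e
              for (x, _), e in zip(data, resid))         # (3) ~ 0
cov_yhat_e = sum((f - mean_fitted) * e
                 for f, e in zip(fitted, resid))         # (4) ~ 0
```

These identities follow mechanically from the least-squares normal equations, which is why any OLS fit, in a spreadsheet or in Stata, reproduces them.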
What We Learned
- An unbiased estimator gets the right answer in an average sample.
- Larger samples produce more accurate estimates (smaller standard errors) than smaller samples.
- Under assumptions CR1-CR3, OLS is the best linear unbiased estimator; it is BLUE.
- We can use our sample data to estimate the accuracy of our sample coefficient as an estimate of the population coefficient.
- Consistency means that the estimator will get the right answer if applied to the whole population.