Properties of OLS Estimators in Econometrics

 
Chapter 5
 
Properties of our Estimators
 
Terminology: These two things are the same:

1. “the OLS estimator”

2. b
 
Learning Objectives
 
Demonstrate the concept of sampling error
 
Derive the Best, Linear, and Unbiased (BLUE) properties of
the ordinary least-squares (OLS) estimator
 
Develop the formula for the standard error of the OLS
coefficient
 
Describe the consistency property of the OLS
estimator
 
20 More Schools

(Figure: API and free-lunch eligibility (FLE) for 20 California elementary schools, with fitted line API = 951.87 − 2.11 FLE.)

This relationship is different for a different 20 schools (for example, another sample of 20 gives API = 885.38 − 1.29 FLE).
What if we draw random samples of 20 schools 10,000 times and use them to estimate our model 10,000 times?

The Population: All 5,765 Schools

(Figure: the same scatterplot for all 5,765 schools, with fitted line API = 925.3 − 1.80 FLE.)
 
Slope Estimates from 10,000 Random Samples of 20 Schools Each

Average estimate = −1.80 (this is also the slope from a regression using all 5,765 schools!)
Most of the time we get close to the true population parameter, but not always! (On average we get the true population parameter.)
The Econometric Property of Unbiasedness in Action

Choosing a sample of 20 schools and running a regression gives us an unbiased estimate of the population coefficient.
The average estimate across samples of 20 equals the population value.
We are not systematically overestimating or underestimating the value of the parameter.
The expected value of our estimate is the population parameter value.
In this illustration, the key to unbiasedness is random sampling from the population.
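The sampling experiment above can be reproduced with a short simulation. This is a minimal sketch assuming a synthetic population (the actual API/FLE data are not included here); only the intercept 925.3 and slope −1.80 come from the slides, while the spread of FLE and the error standard deviation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "population" of 5,765 schools (illustrative values, not the real API data)
N_POP, BETA0, BETA1 = 5765, 925.3, -1.80
fle = rng.uniform(0, 100, N_POP)                       # free-lunch eligibility (%)
api = BETA0 + BETA1 * fle + rng.normal(0, 60, N_POP)   # test scores with noise

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / (xd ** 2).sum()

# Draw 10,000 random samples of 20 schools and re-estimate the slope each time
slopes = []
for _ in range(10_000):
    idx = rng.choice(N_POP, size=20, replace=False)
    slopes.append(ols_slope(fle[idx], api[idx]))
slopes = np.array(slopes)

print(f"average slope estimate: {slopes.mean():.2f}")                 # close to -1.80
print(f"std. dev. across samples (the standard error): {slopes.std():.2f}")
```

Because each sample of 20 is drawn at random (CR1), the distribution of the 10,000 slope estimates is centered on the population slope.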
 
Accuracy: How Close Can We Expect
Our Sample Estimate to Be to the
Population Parameter?
 
The smaller the standard deviation across samples, the more accurate a given sample estimate is likely to be
 
The standard deviation across samples is known as a
standard error
 
We want to use statistical procedures that give us the
smallest possible standard error
 
One way to get a small standard error is to use a large
sample.
 
Larger Samples Generally Give
Smaller Standard Errors
 
If we had used samples of 100 instead of 20 in our experiment, it turns out that we would
have obtained a standard error of 0.18 instead of 0.41—and a more accurate estimate.
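A rough back-of-the-envelope check, using the standard-error formula derived later in this chapter and assuming the amount of X-variation per observation is similar at both sample sizes (so the standard error shrinks with the square root of N):

$$
se(b_1) = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} x_i^2}} \;\propto\; \frac{1}{\sqrt{N}},
\qquad
\frac{se_{N=20}}{se_{N=100}} \approx \sqrt{\frac{100}{20}} = \sqrt{5} \approx 2.2,
\qquad
\frac{0.41}{0.18} \approx 2.3.
$$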
Another Way to Get a Small Standard
Error Is to Use an Efficient Estimation
Procedure
If you want to estimate the population mean, the most
efficient (best) linear unbiased estimator (BLUE) is:

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$

If you want to estimate the slope of a regression of Y on X, the BLUE
turns out to be:

$$b_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2}$$

(where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from the sample means)

Next:
(1) Show these are BLUE
(2) Along the way, estimate $Var(b_1)$, which we use in Ch 6 for hypothesis tests.
Sample Mean Estimator is Linear and Unbiased (Given CR1)

$$E[\bar{Y}] = E\left[\frac{1}{N}\sum_{i=1}^{N} Y_i\right] \quad \text{(by definition)}$$
$$= \frac{1}{N}\sum_{i=1}^{N} E[Y_i] \quad \text{(expectation of sum = sum of expectations)}$$
$$= \frac{1}{N}\sum_{i=1}^{N} \mu = \frac{N\mu}{N} = \mu \quad \text{(assume CR1)}$$

Deriving the Variance Requires CR2 & CR3

$$V[\bar{Y}] = V\left[\frac{1}{N}\sum_{i=1}^{N} Y_i\right] \quad \text{(by definition)}$$
$$= \frac{1}{N^2}\sum_{i=1}^{N} V[Y_i] \quad \text{(assume CR3)}$$
$$= \frac{1}{N^2}\sum_{i=1}^{N} \sigma^2 = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \quad \text{(assume CR2)}$$

Estimate $\sigma^2$ with $s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$, so that $\bar{Y} \sim N\!\left(\mu, \frac{\sigma^2}{N}\right)$ and $\widehat{se}(\bar{Y}) = \frac{s}{\sqrt{N}}$.
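A quick simulation check of the $\sigma^2/N$ result; the population values used here ($\mu = 0$, $\sigma = 10$, $N = 25$) are illustrative assumptions, not numbers from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, N = 0.0, 10.0, 25        # illustrative population values (assumptions)

# 50,000 independent samples of size N; take the mean of each
sample_means = rng.normal(mu, sigma, size=(50_000, N)).mean(axis=1)

print(f"simulated sd of the sample mean: {sample_means.std():.3f}")
print(f"sigma / sqrt(N):                 {sigma / np.sqrt(N):.3f}")   # variance sigma^2/N => sd sigma/sqrt(N)
```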
 
Classical Regression Assumptions

Population model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_K X_{Ki} + \varepsilon_i$
Sample model: $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_K X_{Ki} + e_i$
 
CR1: Representative sample.
Our sample is representative of the population we want to say something about.

CR2: Homoskedastic errors.
The variance of the error is constant over all of our data: $Var[\varepsilon_i] = \sigma^2$.

CR3: Uncorrelated errors.
The errors are uncorrelated across observations: $Cov[\varepsilon_i, \varepsilon_j] = 0$ for $i \neq j$.

CR4: Normally distributed errors.
This assumption is only required for small samples.

CR5: Exogenous X.
The values of the explanatory variable are exogenous, or given (there are no errors in the X-direction). Only required for causality.
 
OLS Estimator is Linear and Unbiased (Given CR1)

$$b_1 = \sum_{i=1}^{N} w_i Y_i, \qquad w_i = \frac{x_i}{\sum_{i=1}^{N} x_i^2}$$

The weights satisfy

$$\sum_{i=1}^{N} w_i = \frac{\sum_i (X_i - \bar{X})}{\sum_i x_i^2} = \frac{N\bar{X} - N\bar{X}}{\sum_i x_i^2} = 0,
\qquad
\sum_{i=1}^{N} w_i X_i = \frac{\sum_i x_i X_i}{\sum_i x_i^2} = \frac{\sum_i x_i^2}{\sum_i x_i^2} = 1$$

Substituting the population model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (by definition of $b_1$):

$$b_1 = \sum_i w_i(\beta_0 + \beta_1 X_i + \varepsilon_i)
= \beta_0 \sum_i w_i + \beta_1 \sum_i w_i X_i + \sum_i w_i \varepsilon_i
= \beta_0 \cdot 0 + \beta_1 \cdot 1 + \sum_i w_i \varepsilon_i$$

So

$$E[b_1] = \beta_1 + \sum_i E[w_i \varepsilon_i] = \beta_1 \quad \text{(expectation of sum = sum of expectations)}$$

Remember: X is uncorrelated with $\varepsilon$ in the population model!
 
Deriving Its Variance Requires CR2 & CR3

$$V[b_1] = V\left[\sum_i w_i Y_i\right] \quad \text{(by definition of } b_1\text{)}$$
$$= \sum_i w_i^2 V[Y_i] \quad \text{(assume CR3; we condition on } X\text{, i.e., take it as given)}$$
$$= \sigma^2 \sum_i w_i^2 \quad \text{(assume CR2)}$$
$$= \sigma^2 \frac{\sum_i x_i^2}{\left(\sum_i x_i^2\right)^2} = \frac{\sigma^2}{\sum_{i=1}^{N} x_i^2}$$

Properties of OLS Estimator

Given assumptions CR1, CR2, CR3, and CR4:

$$b_1 \sim N\!\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^{N} x_i^2}\right)
\qquad \text{or, equivalently,} \qquad
\frac{b_1 - \beta_1}{se[b_1]} \sim N(0, 1)$$

CR4 is not necessary if N is large, thanks to the CLT.

We don't know $\sigma^2$, so we have to estimate it with:

$$s^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - K - 1} = \frac{\text{sum of squared residuals}}{N - K - 1},
\qquad
\text{est. } se[b_1] = \frac{s}{\sqrt{\sum_{i=1}^{N} x_i^2}}$$
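The estimated standard error can be computed directly from a single sample. A minimal sketch with an assumed data-generating process (Y = 5 + 2X + error, with error standard deviation 3; these numbers are illustrative, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(2)

# One simulated sample from an assumed DGP: Y = 5 + 2X + eps, sd(eps) = 3
N, sigma = 50, 3.0
X = rng.uniform(0, 10, N)
Y = 5.0 + 2.0 * X + rng.normal(0, sigma, N)

x = X - X.mean()                                   # deviations from the mean
b1 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()   # OLS slope
b0 = Y.mean() - b1 * X.mean()                      # OLS intercept
resid = Y - (b0 + b1 * X)

K = 1                                              # one explanatory variable
s2 = (resid ** 2).sum() / (N - K - 1)              # s^2 = sum of squared residuals / (N - K - 1)
est_se_b1 = np.sqrt(s2 / (x ** 2).sum())           # est. se(b1) = s / sqrt(sum x_i^2)

print(f"b1 = {b1:.3f}")
print(f"estimated se(b1) = {est_se_b1:.3f}")
print(f"'true' se(b1)    = {sigma / np.sqrt((x ** 2).sum()):.3f}")   # uses the normally unknown sigma
```

With N = 50, CR4 is not essential; the CLT makes the usual normal approximation reasonable.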
 
How to Get a Smaller Standard Error

$$se[b_1] = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} x_i^2}}$$

A sample drawn from a population with a smaller σ

A larger sample size (so the denominator sums over more observations)

Lots of variation in X (so big $x_i$'s; why?)
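A minimal numeric illustration of the third point (the σ and X values are assumptions chosen for illustration): spreading out the X's increases $\sum x_i^2$ in the denominator, which shrinks the standard error.

```python
import numpy as np

sigma = 3.0                           # assumed error standard deviation
X_narrow = np.linspace(4, 6, 50)      # little variation in X
X_wide   = np.linspace(0, 10, 50)     # lots of variation in X

for label, X in [("narrow X", X_narrow), ("wide X", X_wide)]:
    x = X - X.mean()                  # deviations from the mean
    se_b1 = sigma / np.sqrt((x ** 2).sum())
    print(f"{label}: sum(x_i^2) = {(x ** 2).sum():8.1f}, se(b1) = {se_b1:.3f}")
```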
 
Properties of b

1. If CR1 holds, then b is unbiased (i.e., E[b] = β).

2. If CR1-CR3 hold, then b is BLUE (i.e., it has a smaller standard error than any other linear unbiased estimator).

3. If CR1-CR3 hold, then the standard error formula is

$$se[b_1] = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} x_i^2}}$$

4. If CR1-CR4 hold, then b is BUE (i.e., it has a smaller standard error than any other unbiased estimator).
 
 
 
 
 
Showing the “Best” in “BLUE”

(N = 2; X1 is person 1's income and X2 is person 2's income)

Which of the following is an unbiased estimator of mean income? (Circle the best answer.)

$$\tfrac{1}{2}X_1 + \tfrac{1}{2}X_2 \qquad \tfrac{1}{4}X_1 + \tfrac{3}{4}X_2 \qquad \text{Both} \qquad \text{Neither}$$

Which of the following is the “best” (that is, least-variance) estimator of mean income?

$$\tfrac{1}{2}X_1 + \tfrac{1}{2}X_2 \qquad \tfrac{1}{4}X_1 + \tfrac{3}{4}X_2 \qquad \text{Both} \qquad \text{Neither}$$

Generally, for the class of linear unbiased estimators (LUE) of the mean:

$$\tilde{X} = \sum_{i=1}^{N} c_i X_i, \qquad \text{with } \sum_{i=1}^{N} c_i = 1$$

…only the weights $c_i = 1/N$ make it BLUE. That's why $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$.
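A worked check of the two candidates, assuming the two incomes are independent draws with common mean $\mu$ and variance $\sigma^2$ (both sets of weights sum to one, so both estimators are unbiased):

$$
E\!\left[\tfrac{1}{2}X_1 + \tfrac{1}{2}X_2\right] = \mu,
\qquad
E\!\left[\tfrac{1}{4}X_1 + \tfrac{3}{4}X_2\right] = \mu,
$$
$$
Var\!\left[\tfrac{1}{2}X_1 + \tfrac{1}{2}X_2\right] = \left(\tfrac{1}{4} + \tfrac{1}{4}\right)\sigma^2 = \tfrac{1}{2}\sigma^2,
\qquad
Var\!\left[\tfrac{1}{4}X_1 + \tfrac{3}{4}X_2\right] = \left(\tfrac{1}{16} + \tfrac{9}{16}\right)\sigma^2 = \tfrac{5}{8}\sigma^2 > \tfrac{1}{2}\sigma^2.
$$

So both are unbiased, but the equal-weight estimator (the sample mean) has the smaller variance.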
 
Same Logic Applies to OLS

$$b_1 = \sum_{i=1}^{N} w_i Y_i$$

The OLS estimator has the lowest standard error of all linear unbiased estimators. Given assumptions CR1-CR3, it is BLUE.

This is the Gauss-Markov Theorem.

If CR2 or CR3 breaks down, we lose the “B” in “BLUE”.

If CR1 is violated, we lose the “U” too!
 
Multiple Regression

$$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_K X_{Ki} + e_i$$

$$E[b_k] = \beta_k \qquad \text{and} \qquad se[b_k] = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} v_{ki}^2}}$$

$v_k$ is the part of variable $X_k$ that is not correlated with any of the other right-hand-side variables, i.e., the residual from a regression of $X_k$ on all the other explanatory variables.

For two RHS variables:

$$X_{1i} = c_0 + c_1 X_{2i} + v_{1i}$$
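The partialling-out idea can be verified numerically. This is a sketch with assumed data (two correlated regressors and made-up coefficients), not the textbook's example: the coefficient on $X_1$ from the full multiple regression equals the coefficient from regressing $Y$ on $v_1$ alone.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed DGP with two correlated regressors: Y = 1 + 2*X1 + 3*X2 + eps
N = 500
X2 = rng.normal(0, 1, N)
X1 = 0.5 * X2 + rng.normal(0, 1, N)
Y = 1 + 2 * X1 + 3 * X2 + rng.normal(0, 1, N)

# Full multiple regression (constant, X1, X2) by least squares
A = np.column_stack([np.ones(N), X1, X2])
b_full = np.linalg.lstsq(A, Y, rcond=None)[0]

# v1: residual from regressing X1 on a constant and X2 (the part of X1 uncorrelated with X2)
A2 = np.column_stack([np.ones(N), X2])
v1 = X1 - A2 @ np.linalg.lstsq(A2, X1, rcond=None)[0]

# Simple regression of Y on v1 reproduces the multiple-regression coefficient on X1
b1_partial = (v1 * Y).sum() / (v1 ** 2).sum()

print(f"coefficient on X1, full regression:     {b_full[1]:.6f}")
print(f"coefficient on X1, via partialling out: {b1_partial:.6f}")   # identical
```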
 
Consistent Estimator
 
Consistency means that the estimator is more likely to be
close to the population parameter as the sample size
increases
 
Only need to assume CR1.
 
When not consistent?
Silly estimator (estimate average height and count all the tall
people twice)
 
Weird distribution (estimate average wages but some people have
zero hours worked, so divide by zero)
 
CR1 fails (increasing the sample size will not improve your estimator
if you are not using representative data)
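A minimal sketch of consistency in action, using an assumed population (mean 4.0, standard deviation 3.0; both are illustrative assumptions): as the sample size grows, a larger and larger share of sample means land within 0.1 of the true value.

```python
import numpy as np

rng = np.random.default_rng(5)

mu, sd = 4.0, 3.0                    # assumed population mean and sd
for N in [20, 200, 2000, 20000]:
    estimates = rng.normal(mu, sd, size=(2000, N)).mean(axis=1)   # 2,000 samples of size N
    share_close = np.mean(np.abs(estimates - mu) < 0.1)           # P(|estimate - mu| < 0.1)
    print(f"N = {N:6d}: share of estimates within 0.1 of mu = {share_close:.2f}")
```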
 
Features of the Sample OLS Regression
(you can check these on a spreadsheet or in Stata)

(1) The least-squares regression residuals sum to zero.

(2) The actual and predicted values of $Y_i$ have the same mean, since residuals sum to zero and $Y_i = \hat{Y}_i + e_i$.

(3) The sample residuals are uncorrelated with $X_i$ (nothing more in $Y_i$ can be explained by $X_i$ or by a linear function of $X_i$).

(4) The covariance between the predicted values $\hat{Y}_i$ and the residuals is zero; this follows from (3).
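These four features can be checked on any sample. A minimal sketch using simulated data (the data-generating process here is an arbitrary assumption, standing in for a spreadsheet or Stata dataset):

```python
import numpy as np

rng = np.random.default_rng(4)

# Any sample will do; this one is simulated: Y = 3 + 1.5X + eps
N = 100
X = rng.uniform(0, 10, N)
Y = 3 + 1.5 * X + rng.normal(0, 2, N)

x = X - X.mean()
b1 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
e = Y - Y_hat

print(f"(1) sum of residuals:          {e.sum():.2e}")                   # ~ 0
print(f"(2) mean(Y) - mean(Y_hat):     {Y.mean() - Y_hat.mean():.2e}")   # ~ 0
print(f"(3) covariance of e and X:     {np.cov(e, X)[0, 1]:.2e}")        # ~ 0
print(f"(4) covariance of e and Y_hat: {np.cov(e, Y_hat)[0, 1]:.2e}")    # ~ 0
```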
 
 
What We Learned
 
An unbiased estimator gets the right answer in an average
sample.
 
Larger samples produce more accurate estimates (smaller
standard error) than smaller samples.
 
Under assumptions CR1-CR3, OLS is the best linear unbiased estimator: it is BLUE.
 
We can use our sample data to estimate the accuracy of our
sample coefficient as an estimate of the population coefficient.
 
Consistency means that the estimator will get the right answer if applied to the whole population.