Endogeneity and Instrumental Variable Estimation Methods

Endogeneity and instrumental variable estimation method
OBID A. KHAKIMOV
Review
Four complications that induce correlation between X and ε:
1. Omitted variables bias
2. Measurement error
3. Simultaneous causality
4. Using lagged values of the dependent variable as explanators, in the presence of serial correlation
Endogeneity
1. Omission of relevant variables
Measurement error
Measurement error: independent variable
Simultaneity
Consider the simultaneous system
Reduced forms
Responses to Endogeneity
Remedy 1: IV estimation method
Remedy 2: Differencing methods
Limitations:
- Differencing will not eliminate selection bias.
- Differencing only eliminates fixed (time-invariant) variables; sometimes endogenous variables change values over time.
Responses to Endogeneity
Approach #3: Difference it out (continued)
Limitations:
- DD (difference-in-differences) models will not eliminate selection bias.
- DD models only eliminate fixed variables; sometimes endogenous variables change values over time.
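The point about differencing can be illustrated with a small simulation. This is only a sketch: the two-period data-generating process, the variable names, and the coefficient value are assumptions made for illustration and do not come from the slides. A fixed unit effect `a` biases the levels regression, and first-differencing removes it; a confounder that changed between periods would not cancel.

```python
import random

random.seed(0)
n = 20_000
beta = 2.0  # assumed true effect of x on y

def slope(x, y):
    """OLS slope of y on x (regression with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((s - mx) * (t - my) for s, t in zip(x, y))
    sxx = sum((s - mx) ** 2 for s in x)
    return sxy / sxx

# Fixed effect a_i is unobserved and correlated with x in both periods.
a = [random.gauss(0, 1) for _ in range(n)]
x1 = [ai + random.gauss(0, 1) for ai in a]
x2 = [ai + random.gauss(0, 1) for ai in a]
y1 = [beta * xi + ai + random.gauss(0, 1) for xi, ai in zip(x1, a)]
y2 = [beta * xi + ai + random.gauss(0, 1) for xi, ai in zip(x2, a)]

b_levels = slope(x1, y1)  # biased upward: a_i is an omitted variable

dx = [s - t for s, t in zip(x2, x1)]
dy = [s - t for s, t in zip(y2, y1)]
b_diff = slope(dx, dy)    # a_i cancels in the difference

print(f"levels OLS: {b_levels:.3f}  first-difference OLS: {b_diff:.3f}")
```

In this setup the levels estimate should sit well above the assumed true value while the differenced estimate should be close to it.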
Omitted Variables
1. Find additional data so that every relevant variable is included.
2. Ignore it, if the omitted variable is uncorrelated with all included variables.
3. Find a proxy variable. The proxy z must be redundant (= ignorable) once the omitted variable q is controlled for:
      $E(y \mid x, q, z) = E(y \mid x, q)$
Measurement Error
1. Improve measurement
2. Argue that the degree of error is small
   - Use outside data for validation
3. Argue that error is uncorrelated with included variables
Instrumental Variables
An Instrumental Variable is a variable that is correlated with X but uncorrelated with ε.
If $Z_i$ is an instrumental variable:
1. $E(Z_i X_i) \neq 0$
2. $E(Z_i \varepsilon_i) = 0$
Instrumental Variables
The econometrician can use an instrumental variable Z to estimate the effect on Y of only that part of X that is correlated with Z.
Because Z is uncorrelated with ε, any part of X that is correlated with Z must also be uncorrelated with ε.
An instrumental variable lets the econometrician find a part of X that behaves as though it had been randomly assigned.
Instrumental Variables
One way to see this is in terms of two regression equations:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$X_i = \gamma_0 + \gamma_1 Z_i + \eta_i$
Note that in this model X is endogenous (it may be correlated with ε).
The instrumental variables model requires that:
1. $\gamma_1 \neq 0$, so that Z predicts X, and
2. Z is uncorrelated with ε (Z is exogenous): $\mathrm{Cov}\{\varepsilon, Z\} = 0$
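The two-equation setup can be simulated to see what the two requirements buy. The sketch below is illustrative only: the parameter values, the confounder `c`, and the helper `cov` are assumptions, not part of the slides. With one instrument and one regressor, the IV slope reduces to the ratio Cov(Z, Y) / Cov(Z, X).

```python
import random

random.seed(1)
n = 50_000
beta1 = 1.0   # assumed true effect of X on Y
gamma1 = 0.8  # assumed first-stage coefficient (must be nonzero)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((s - ma) * (t - mb) for s, t in zip(a, b)) / len(a)

z = [random.gauss(0, 1) for _ in range(n)]   # instrument, independent of the errors
c = [random.gauss(0, 1) for _ in range(n)]   # unobserved confounder
eta = [ci + random.gauss(0, 1) for ci in c]  # first-stage error
eps = [ci + random.gauss(0, 1) for ci in c]  # structural error, correlated with eta
x = [gamma1 * zi + e for zi, e in zip(z, eta)]
y = [beta1 * xi + e for xi, e in zip(x, eps)]

b_ols = cov(x, y) / cov(x, x)  # inconsistent, because Cov(X, eps) != 0
b_iv = cov(z, y) / cov(z, x)   # uses only the variation in X that comes from Z

print(f"OLS: {b_ols:.3f}  IV: {b_iv:.3f}")
```

Setting `gamma1` close to zero in this sketch makes the denominator of `b_iv` small, which is the weak-instrument problem.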
Implication
Estimate the model by OLS and by IV, and compare the estimates.
If the OLS and IV estimates agree, use OLS.
If they differ, use IV.
But test indirectly, using the Wu-Hausman test.
THE INSTRUMENTAL VARIABLES (IV) ESTIMATOR
Suppose that one or more of the regressors in X is not independent of the equation error term, even in the limit as the sample size goes to infinity. That is, X is correlated with u, the equation disturbance.
However, suppose we have another variable, Z (an instrument for X), that has the properties:
(1) Z and X are correlated
(2) Z and u are uncorrelated
Now define the IV estimator as:
$\hat\beta_{IV} = (Z'X)^{-1}Z'Y$
Generalised IV estimator (GIVE)
A more general form of the IV estimator, used when we have more instrumental variables than "endogenous" X variables.
GIVE is potentially more efficient than simple IV, if instruments are well-chosen.
Test whether instruments are 'valid' using Sargan's test.
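A rough sketch of Sargan's test for one endogenous regressor and two instruments follows. It assumes homoskedastic errors; the data-generating process, the helper names (`demean`, `dot`, `fit2`, `sargan`), and the size of the direct effect are illustrative assumptions. The statistic is n times the R-squared from regressing the 2SLS residuals on all instruments, compared with a chi-square whose degrees of freedom equal the number of instruments minus the number of endogenous regressors (here 1, with a 5% critical value of 3.84).

```python
import random

random.seed(2)
n = 20_000

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def fit2(y, z1, z2):
    """Fitted values from OLS of y on z1 and z2 (all demeaned), via Cramer's rule."""
    a11, a12, a22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    b1, b2 = dot(z1, y), dot(z2, y)
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - b2 * a12) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return [c1 * s + c2 * t for s, t in zip(z1, z2)]

def sargan(direct_effect):
    """Sargan statistic when z2 affects y directly by direct_effect (0.0 = valid)."""
    z1 = [random.gauss(0, 1) for _ in range(n)]
    z2 = [random.gauss(0, 1) for _ in range(n)]
    c = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    x = [s + t + ci + random.gauss(0, 1) for s, t, ci in zip(z1, z2, c)]
    y = [xi + ci + direct_effect * t + random.gauss(0, 1)
         for xi, ci, t in zip(x, c, z2)]
    z1, z2, x, y = demean(z1), demean(z2), demean(x), demean(y)
    xhat = fit2(x, z1, z2)                     # first stage with both instruments
    b = dot(xhat, y) / dot(xhat, x)            # 2SLS (GIVE) slope
    e = [yi - b * xi for xi, yi in zip(x, y)]  # residuals use the actual x
    ehat = fit2(e, z1, z2)                     # regress residuals on the instruments
    return n * dot(ehat, ehat) / dot(e, e)     # n * R^2

s_valid = sargan(0.0)    # both instruments excluded from the y equation
s_invalid = sargan(0.5)  # z2 violates the exclusion restriction

print(f"valid: {s_valid:.2f}  invalid: {s_invalid:.2f}  (5% critical value 3.84)")
```

The test can only be run when the equation is over-identified, and a small statistic does not prove validity: it checks whether the instruments agree with one another.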
 
Finding a suitable Instrumental Variable
How do we find an instrumental variable?
There are two methods:
- Arbitrary search and test.
- Two stage least squares.
Two Stage Least Squares (2SLS) offers an excellent direct estimation method in the case of exactly or over-identified equations.
Strong IVs
A strong instrument has a high correlation with the endogenous variable.
How strong a correlation? Staiger & Stock (1997) recommend a first-stage partial F statistic of 10 or greater.
- Run the 1st stage with and without the IV.
- Compare the two fits with a partial F test on the excluded IV: an F statistic of 10 or more is taken as sufficient evidence of strength.
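As a sketch of the first-stage check, consider the simplest case of one instrument and no other covariates, where the partial F statistic for the instrument equals (n - 2) R² / (1 - R²) from the first-stage regression. The function name `first_stage_f`, the simulated data, and the two coefficient values are assumptions for illustration.

```python
import random

random.seed(3)
n = 2_000

def first_stage_f(x, z):
    """F statistic for H0: coefficient on z is zero in x = g0 + g1*z + error."""
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxz = sum((s - mx) * (t - mz) for s, t in zip(x, z))
    sxx = sum((s - mx) ** 2 for s in x)
    szz = sum((t - mz) ** 2 for t in z)
    r2 = sxz ** 2 / (sxx * szz)  # R^2 of the first-stage regression
    return (len(x) - 2) * r2 / (1 - r2)

z = [random.gauss(0, 1) for _ in range(n)]
x_strong = [0.5 * zi + random.gauss(0, 1) for zi in z]   # Z clearly predicts X
x_weak = [0.02 * zi + random.gauss(0, 1) for zi in z]    # Z barely predicts X

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)

print(f"strong instrument F: {f_strong:.1f}  weak instrument F: {f_weak:.1f}")
```

With additional exogenous covariates in the model, the comparison would instead be an F test of the restricted first stage (without the IV) against the unrestricted one (with it).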
Weak IVs
If the IVs are weak:
- 2SLS and 2SRI (two-stage residual inclusion) are consistent, but there can be considerable bias even in large samples
- standard errors are too small
- 2SLS and 2SRI perform poorly
Weak IVs
What to do if IVs are weak?
If there is a single endogenous variable, use a conditional likelihood ratio (CLR) test:
* perform a regular likelihood ratio test
* adjust the critical values
* available in Stata; see Stata Journal, 3, 57-70, and http://elsa.berkeley.edu/wp/marcelo.pdf by Moreira and Poi
Two stage least squares as IV estimation
The first stage involves the creation of an instrument. Use the reduced form equation for P to get its fitted value, Phat.
The second stage involves a variant of instrumental variables estimation. Replace P by Phat in the supply equation and use OLS in this second stage of the estimation process.
So it is in fact a special, and perhaps less arbitrary, way of doing instrumental variables estimation.
 
Two stage least squares estimation with modern econometric software
Although one could undertake 2SLS estimation manually (running the reduced form regression, saving the fitted values, and then running the second stage, structural form, regression), modern software allows you to get the results automatically with one set of instructions.
You need to tell the software which RHS variable is endogenous and which other variables should be used as regressors in the reduced form (first stage) of the regression.
Using the automatic IV procedure will also guarantee appropriate estimates of the second stage standard errors.
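The manual two-step procedure, and the reason its second-stage standard errors should not be trusted, can be sketched as follows. The simulated data, the outcome name `q`, and the helper `ols` are assumptions for illustration; `p` and `phat` follow the P and Phat notation used above. The key point is that the residual variance must be computed with the actual P, while a naive second-stage OLS computes it with Phat.

```python
import math
import random

random.seed(4)
n = 20_000

def ols(x, y):
    """Intercept and slope from OLS of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((s - mx) * (t - my) for s, t in zip(x, y))
    sxx = sum((s - mx) ** 2 for s in x)
    slope = sxy / sxx
    return my - slope * mx, slope

z = [random.gauss(0, 1) for _ in range(n)]                        # instrument
c = [random.gauss(0, 1) for _ in range(n)]                        # unobserved confounder
p = [0.5 * zi + ci + random.gauss(0, 1) for zi, ci in zip(z, c)]  # endogenous regressor
q = [1.0 * pi + ci + random.gauss(0, 1) for pi, ci in zip(p, c)]  # outcome, true slope 1.0

# Stage 1: reduced form for P, keep the fitted values.
g0, g1 = ols(z, p)
phat = [g0 + g1 * zi for zi in z]

# Stage 2: OLS of the outcome on Phat gives the 2SLS point estimate.
a, b_2sls = ols(phat, q)

# Standard errors: same denominator, different residuals.
mph = sum(phat) / n
ssx = sum((v - mph) ** 2 for v in phat)
rss_naive = sum((qi - a - b_2sls * v) ** 2 for qi, v in zip(q, phat))   # uses Phat
rss_correct = sum((qi - a - b_2sls * pi) ** 2 for qi, pi in zip(q, p))  # uses actual P
se_naive = math.sqrt(rss_naive / (n - 2) / ssx)
se_correct = math.sqrt(rss_correct / (n - 2) / ssx)

print(f"2SLS slope: {b_2sls:.3f}  naive SE: {se_naive:.4f}  correct SE: {se_correct:.4f}")
```

In this particular setup the naive standard error is too large; in other setups it can be too small. Either way it is not the right quantity, which is why the automatic IV routine is preferred.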
 
IV over-identification test
Hausman specification test
The test compares two coefficient estimates (for example OLS and IV), and the test statistic follows a chi-square distribution.
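A minimal sketch of the Hausman comparison for a single slope coefficient, assuming homoskedastic errors: the statistic is the squared difference between the IV and OLS estimates divided by the difference of their estimated variances, compared with a chi-square with 1 degree of freedom (5% critical value 3.84). The simulated data and helper names are illustrative assumptions.

```python
import random

random.seed(5)
n = 20_000

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

z = center([random.gauss(0, 1) for _ in range(n)])                   # instrument
c = [random.gauss(0, 1) for _ in range(n)]                           # confounder
x = center([zi + ci + random.gauss(0, 1) for zi, ci in zip(z, c)])   # endogenous
y = center([xi + ci + random.gauss(0, 1) for xi, ci in zip(x, c)])   # true slope 1.0

b_ols = dot(x, y) / dot(x, x)
b_iv = dot(z, y) / dot(z, x)

# Homoskedastic variance estimates of each slope estimator.
s2_ols = sum((yi - b_ols * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
s2_iv = sum((yi - b_iv * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
v_ols = s2_ols / dot(x, x)
v_iv = s2_iv * dot(z, z) / dot(z, x) ** 2

hausman = (b_iv - b_ols) ** 2 / (v_iv - v_ols)
reject_exogeneity = hausman > 3.84  # 5% critical value, chi-square with 1 df

print(f"OLS {b_ols:.3f}  IV {b_iv:.3f}  Hausman {hausman:.1f}  reject: {reject_exogeneity}")
```

A large statistic says the two estimates differ by more than sampling error would explain, which is evidence against treating X as exogenous.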
Paper replication (IV method)
Daron Acemoglu, Simon Johnson, and James A. Robinson, "The Colonial Origins of Comparative Development: An Empirical Investigation," The American Economic Review, Vol. 91, No. 5 (Dec., 2001), pp. 1369-1401.
Introduction
Research question: How can the effect of institutions on economic development be measured accurately?
"Countries with better 'institutions,' more secure property rights, and less distortionary policies will invest more in physical and human capital, and will use these factors more efficiently to achieve a greater level of income (e.g., Douglass C. North and Robert P. Thomas, 1973; Eric L. Jones, 1981; North, 1981)."
Introduction
Theory:
1. There were different types of colonization policies:
   A) European powers set up "extractive states".
   B) European powers set up new colonies with European institutions, with strong emphasis on private property and checks against government power.
2. Where the disease environment was not favorable, fewer colonizers migrated and "extractive states" were more likely to develop.
3. Colonial institutions persisted even after independence.
Literature and theory
William H. McNeill (1976), Crosby (1986), and Jared M. Diamond (1997) have discussed the influence of diseases on human history.
Diamond (1997), in particular, emphasizes comparative development, but his theory is based on the geographical determinants of the incidence of the neolithic revolution.
Ronald E. Robinson and John Gallagher (1961), Lewis H. Gann and Peter Duignan (1962), Donald Denoon (1983), and Philip J. Cain and Anthony G. Hopkins (1993) emphasize that settler colonies such as the United States and New Zealand are different from other colonies, and point out that these differences were important for their economic success.
Friedrich A. von Hayek (1960) argued that the British common law tradition was superior to the French civil law, which was developed during the Napoleonic era to restrain judges' interference with state policies.
Literature and theory
Curtin (1964) documents how early British expectations for settlement in West Africa were dashed by very high mortality among early settlers.
The Pilgrims decided to migrate to the United States rather than Guyana because of the high mortality rates in Guyana (see Crosby, 1986 pp. 143-44).
Robinson and Gallagher (1961), Gann and Duignan (1962), Denoon (1983), and Cain and Hopkins (1993) have documented the development of "settler colonies," where Europeans settled in large numbers, and life was modeled after the home country.
When the establishment of European-like institutions did not arise naturally, the settlers were ready to fight for them against the wishes of the home country.
Literature and theory
There are a number of economic mechanisms that will lead to institutional persistence:
1. Setting up institutions that place restrictions on government power and enforce property rights is costly. It may not pay the elites at independence to switch to extractive institutions.
2. The gains to an extractive strategy may depend on the size of the ruling elite.
3. If agents make irreversible investments that are complementary to a particular set of institutions, they will be more willing to support them, making these institutions persist (see, e.g., Acemoglu, 1995).
 
 
 
 
 
Compare Nigeria, which has approximately the 25th percentile of the institutional measure in this sample, 5.6, and Chile, which has approximately the 75th percentile of the institutions index, 7.8. The estimate in column (1), 0.52, indicates that there should be on average a 1.14-log-point difference between the log GDPs of the corresponding countries (or approximately a 2-fold difference: $e^{1.14} - 1 \approx 2.1$). In practice, this GDP gap is 253 log points (approximately 11-fold).
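The arithmetic behind this comparison can be checked directly. The sketch below uses only the numbers quoted in the text (the coefficient 0.52, the index values 5.6 and 7.8, and the observed gap of 253 log points); the variable names are illustrative.

```python
import math

coef = 0.52                # estimate in column (1), as quoted in the text
nigeria, chile = 5.6, 7.8  # institutions index values, as quoted in the text

# Predicted gap in log GDP implied by the coefficient.
log_gap = coef * (chile - nigeria)

# Convert log-point gaps to proportional differences, exp(gap) - 1.
predicted_fold = math.exp(log_gap) - 1
observed_fold = math.exp(2.53) - 1  # 253 log points = 2.53 in logs

print(f"predicted log gap: {log_gap:.2f}")
print(f"predicted proportional difference: {predicted_fold:.1f}")
print(f"observed proportional difference: {observed_fold:.1f}")
```

The same exp(gap) - 1 conversion is applied to both the predicted and the observed gap, so the two figures are directly comparable.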
where Y is income per capita in country i, R is the protection against expropriation measure, and $X_i$ is a vector of other covariates.
R - current institutions (protection against expropriation between 1985 and 1995)
C - early (circa 1900) institutions
S - European settlements in the colony (fraction of the population with European descent in 1900)
M - mortality rates faced by settlers
X - vector of covariates that affect all variables
 
 
Robustness check 
 
 
 
 
 

Endogeneity in econometrics can create challenges such as omitted variables bias, measurement error, simultaneous causality, and using lagged values. This can affect the accuracy of models. One way to address this is through instrumental variable estimation methods. These methods help deal with endogeneity issues by finding suitable instrumental variables that are correlated with the endogenous variable but not the error term. The content also covers complications inducing correlation between variables and responses to endogeneity issues, like IV estimation and differencing methods.


Uploaded on Sep 26, 2024


Presentation Transcript



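  3. Endogeneity 1. Omission of relevant variables
     True model: $Y = X_1\beta_1 + X_2\beta_2 + e_t$
     Model which you estimate: $Y = X_1\beta_1 + u_t$, where $u_t = X_2\beta_2 + e_t$
     $\hat\beta_1 = (X_1'X_1)^{-1}X_1'Y = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + e_t)$
     $\hat\beta_1 = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'e_t$
     $E(\hat\beta_1) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}E(X_1'e_t)$
     If $E(X_1'e_t) = 0$, then $E(\hat\beta_1) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2$, which differs from $\beta_1$ unless $X_1'X_2\beta_2 = 0$.

  4. Measurement error
     $Y = Y^* + v_y$, where $Y^* = X\beta + u_t$
     Model which you estimate: $Y = X\beta + u_t + v_y$
     $\hat\beta = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + u_t + v_y)$
     $E(\hat\beta) = (X'X)^{-1}X'X\beta + (X'X)^{-1}E(X'u_t) + (X'X)^{-1}E(X'v_y)$
     If $E(X'u_t) = 0$, then $E(\hat\beta) = \beta + (X'X)^{-1}E(X'v_y)$, and in most cases $E(X'v_y) = 0$.

  5. Measurement error: independent variable
     $X^* = X + v_x$, where $X^*$ is the observed regressor and $X$ the true one
     $Y = X\beta + u_t = (X^* - v_x)\beta + u_t = X^*\beta + e_t$, with $e_t = u_t - v_x\beta$
     $\hat\beta = (X^{*\prime}X^*)^{-1}X^{*\prime}Y = \beta + (X^{*\prime}X^*)^{-1}X^{*\prime}(u_t - v_x\beta)$
     $E(\hat\beta) = \beta + (X^{*\prime}X^*)^{-1}E(X^{*\prime}u_t) - (X^{*\prime}X^*)^{-1}E(X^{*\prime}v_x)\beta$
     If $E(X^{*\prime}u_t) = 0$, then $E(\hat\beta) = \beta - (X^{*\prime}X^*)^{-1}E(X^{*\prime}v_x)\beta$. Because $X^*$ contains $v_x$, $E(X^{*\prime}v_x) \neq 0$ and the estimator is biased.

  6. Simultaneity
     Consider the simultaneous system, where $\alpha_1, \alpha_2, \beta_1, \beta_2$ are the structural coefficients:
     (A) $Y_t = \alpha_1 X_t + \alpha_2 Z_t + u_t$
     (B) $X_t = \beta_1 Y_t + \beta_2 W_t + v_t$
     Reduced forms:
     (C) $Y_t = \frac{\alpha_2}{1 - \alpha_1\beta_1} Z_t + \frac{\alpha_1\beta_2}{1 - \alpha_1\beta_1} W_t + \frac{u_t + \alpha_1 v_t}{1 - \alpha_1\beta_1}$
     (D) $X_t = \frac{\beta_1\alpha_2}{1 - \alpha_1\beta_1} Z_t + \frac{\beta_2}{1 - \alpha_1\beta_1} W_t + \frac{\beta_1 u_t + v_t}{1 - \alpha_1\beta_1}$
     Because $X_t$ depends on $u_t$ in (D), $X_t$ is correlated with the error term of (A) whenever $\beta_1 \neq 0$.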
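The omitted-variable decomposition and the effect of measurement error in a regressor can both be checked numerically. The sketch below uses scalar regressors with mean zero and no intercept, so each matrix product reduces to a sum of products; the coefficient values and variable names are assumptions chosen for illustration.

```python
import random

random.seed(6)
n = 50_000
beta1, beta2 = 1.0, 2.0  # assumed true coefficients

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

# Omitted variable: x2 is correlated with x1 but left out of the regression.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
e = [random.gauss(0, 1) for _ in range(n)]
y = [beta1 * a + beta2 * b + u for a, b, u in zip(x1, x2, e)]

b1_short = dot(x1, y) / dot(x1, x1)            # (X1'X1)^{-1} X1'Y
bias_term = dot(x1, x2) / dot(x1, x1) * beta2  # (X1'X1)^{-1} X1'X2 beta2
noise_term = dot(x1, e) / dot(x1, x1)          # (X1'X1)^{-1} X1'e

# Measurement error in the regressor: we observe x_obs = x_true + v.
x_true = [random.gauss(0, 1) for _ in range(n)]
x_obs = [a + random.gauss(0, 1) for a in x_true]
y_me = [beta1 * a + random.gauss(0, 1) for a in x_true]
b_me = dot(x_obs, y_me) / dot(x_obs, x_obs)    # attenuated toward zero

print(f"short regression: {b1_short:.3f} = {beta1} + {bias_term:.3f} + {noise_term:.3f}")
print(f"slope with mismeasured regressor: {b_me:.3f} (true value {beta1})")
```

The first identity holds exactly in any sample. The second result is approximate: with equal variances for the true regressor and the measurement error, the slope is pulled roughly halfway toward zero.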


  14. Implication
      Estimate the model by OLS and by IV, and compare the estimates.
      If $\hat\beta_{OLS} = \hat\beta_{IV}$, use OLS.
      If $\hat\beta_{OLS} \neq \hat\beta_{IV}$, use IV.
      But test indirectly, using the Wu-Hausman test.

  15. THE INSTRUMENTAL VARIABLES (IV) ESTIMATOR
      Suppose that one or more of the regressors in X is not independent of the equation error term, even in the limit as the sample size goes to infinity. That is, X is correlated with u, the equation disturbance. However, suppose we have another variable, Z (an instrument for X), that has the properties: (1) Z and X are correlated; (2) Z and u are uncorrelated. Now define the IV estimator as:
      $\hat\beta_{IV} = (Z'X)^{-1}Z'Y$

  16. $Y = X\beta + u$
      $\hat\beta_{IV} = (Z'X)^{-1}Z'Y$
      $\hat\beta_{IV} = (Z'X)^{-1}Z'(X\beta + u)$
      $\hat\beta_{IV} = (Z'X)^{-1}Z'X\beta + (Z'X)^{-1}Z'u$
      $\hat\beta_{IV} = \beta + (Z'X)^{-1}Z'u$
      If $\mathrm{Cor}(Z, u) = 0$, then $E(\hat\beta_{IV}) = \beta$ in large samples (the IV estimator is consistent).
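The last step of the derivation, that the IV estimate equals the true coefficient plus a term in Z'u, can be verified on simulated data. This is a sketch with a single regressor and a single instrument, both mean zero and without an intercept, so the matrix products are sums of products; the parameter values and variable names are assumptions.

```python
import random

random.seed(7)
n = 50_000
beta = 1.5  # assumed true coefficient

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # disturbance, independent of z
# x depends on z (relevance) and on u (endogeneity).
x = [zi + 0.8 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [beta * xi + ui for xi, ui in zip(x, u)]

b_iv = dot(z, y) / dot(z, x)            # (Z'X)^{-1} Z'Y
sampling_error = dot(z, u) / dot(z, x)  # (Z'X)^{-1} Z'u
b_ols = dot(x, y) / dot(x, x)           # (X'X)^{-1} X'Y, for comparison

print(f"IV: {b_iv:.4f} = {beta} + {sampling_error:.4f}   OLS: {b_ols:.4f}")
```

The decomposition is an exact identity in every sample. What the instrument conditions add is that the Z'u term shrinks toward zero as the sample grows, while the corresponding X'u term in OLS does not.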

