Understanding Endogeneity and Instrumental Variable Estimation Methods


Endogeneity in econometrics arises from omitted variables, measurement error, simultaneous causality, and lagged dependent variables used as regressors when the errors are serially correlated. Each of these makes a regressor correlated with the error term, which biases OLS estimates. Instrumental variable (IV) estimation addresses the problem by using instruments that are correlated with the endogenous variable but not with the error term. The presentation reviews these sources of endogeneity and the main responses to them, including IV estimation and differencing methods.


Uploaded on Sep 26, 2024 | 0 Views



Presentation Transcript


  1. Endogeneity and Instrumental Variable Estimation Method. OBID A. KHAKIMOV

  2. Review: Four complications that induce correlation between X and the error term
     1. Omitted Variables Bias
     2. Measurement Error
     3. Simultaneous Causality
     4. Using Lagged Values of the Dependent Variable as Explanators, in the presence of serial correlation

  3. Endogeneity 1. Omission of relevant variables
     True model: $Y = X_1\beta_1 + X_2\beta_2 + e_t$
     The model which you estimate: $Y = X_1\beta_1 + u_t$, where $u_t = X_2\beta_2 + e_t$
     $\hat\beta_1 = (X_1'X_1)^{-1}X_1'Y$
     $= (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + e_t)$
     $= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'e_t$
     If $E(X_1'e_t) = 0$, then $E(\hat\beta_1) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2$, so $E(\hat\beta_1) \neq \beta_1$ unless $X_1'X_2 = 0$ or $\beta_2 = 0$.
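The omitted-variable derivation can be checked numerically. The following is a minimal sketch, assuming simulated data with hypothetical coefficients (`beta1 = 1`, `beta2 = 2`) and an omitted regressor `x2` that is correlated with `x1`; none of these values come from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta1, beta2 = 1.0, 2.0                # hypothetical true coefficients

x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)     # omitted regressor, correlated with x1
e = rng.normal(size=n)
y = beta1 * x1 + beta2 * x2 + e        # true model

# Short regression of y on x1 only (all variables are mean zero, so no intercept)
b_short = (x1 @ y) / (x1 @ x1)

# Sample analogue of the bias term (X1'X1)^{-1} X1'X2 beta2
bias_formula = (x1 @ x2) / (x1 @ x1) * beta2

print("short-regression estimate:", b_short)
print("beta1 + bias term:        ", beta1 + bias_formula)
```

The two printed numbers should nearly coincide, and both should sit well away from the true `beta1`, because the short regression attributes part of the effect of `x2` to `x1`.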

  4. Measurement error: dependent variable
     $Y^* = Y + v_y$
     The model which you estimate: $Y^* = X\beta + u_t + v_y$
     $\hat\beta = (X'X)^{-1}X'Y^*$
     $= \beta + (X'X)^{-1}X'(u_t + v_y)$
     $E(\hat\beta) = \beta + (X'X)^{-1}E(X'u_t) + (X'X)^{-1}E(X'v_y)$
     If $E(X'u_t) = 0$, then $E(\hat\beta) = \beta + (X'X)^{-1}E(X'v_y)$, so the estimator is biased whenever the measurement error is correlated with X, that is, whenever $E(X'v_y) \neq 0$.

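  5. Measurement error: independent variable
     $X^* = X + v_x$
     $Y = X\beta + u_t = (X^* - v_x)\beta + u_t$
     $Y = X^*\beta + e_t$, where $e_t = u_t - v_x\beta$
     $\hat\beta = (X^{*\prime}X^*)^{-1}X^{*\prime}Y = \beta + (X^{*\prime}X^*)^{-1}X^{*\prime}(u_t - v_x\beta)$
     $E(\hat\beta) = \beta + (X^{*\prime}X^*)^{-1}E(X^{*\prime}u_t) - (X^{*\prime}X^*)^{-1}E(X^{*\prime}v_x)\beta$
     and if $E(X^{*\prime}u_t) = 0$:
     $E(\hat\beta) = \beta - (X^{*\prime}X^*)^{-1}E(X^{*\prime}v_x)\beta$
     Because $X^*$ contains $v_x$, $E(X^{*\prime}v_x) \neq 0$ and the estimate is pulled toward zero.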
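A small simulation illustrates the effect of measurement error in a regressor. This is a sketch under stated assumptions: the observed regressor is the true one plus independent noise (`x_obs = x_true + v_x`), the error variance equals the variance of the true regressor, and the slope `beta = 2` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = 2.0                          # hypothetical true slope

x_true = rng.normal(size=n)         # correctly measured regressor
v_x = rng.normal(size=n)            # measurement error, independent of x_true and u
x_obs = x_true + v_x                # what the analyst actually observes
u = rng.normal(size=n)
y = beta * x_true + u

# OLS of y on the mismeasured regressor
b_ols = (x_obs @ y) / (x_obs @ x_obs)

# Sample analogue of the bias: -(X*'X*)^{-1} X*'v_x beta
bias_term = -(x_obs @ v_x) / (x_obs @ x_obs) * beta

print("OLS on mismeasured x:", b_ols)
print("beta + bias term:    ", beta + bias_term)
```

With equal signal and noise variances the estimate should land near half the true slope, which is the attenuation toward zero that makes mismeasured regressors a source of endogeneity.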

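  6. Simultaneity
     Consider the simultaneous system
     $Y_t = \alpha_1 X_t + \alpha_2 Z_t + u_t$   (A)
     $X_t = \beta_1 Y_t + \beta_2 W_t + v_t$   (B)
     Reduced forms
     $Y_t = \frac{\alpha_1\beta_2}{1-\alpha_1\beta_1} W_t + \frac{\alpha_2}{1-\alpha_1\beta_1} Z_t + \frac{\alpha_1 v_t + u_t}{1-\alpha_1\beta_1}$   (C)
     $X_t = \frac{\beta_1\alpha_2}{1-\alpha_1\beta_1} Z_t + \frac{\beta_2}{1-\alpha_1\beta_1} W_t + \frac{\beta_1 u_t + v_t}{1-\alpha_1\beta_1}$   (D)
     Equation (D) shows that $X_t$ depends on $u_t$, so $X_t$ is correlated with the error term in (A).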
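The simultaneity problem can be illustrated by generating data from the reduced forms and then estimating equation (A) by OLS. This is a rough sketch: the coefficient names `a1, a2, b1, b2` stand for the parameters of (A) and (B), and their values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
a1, a2 = 0.5, 1.0     # hypothetical coefficients of equation (A)
b1, b2 = 0.5, 1.0     # hypothetical coefficients of equation (B)

Z, W = rng.normal(size=n), rng.normal(size=n)   # exogenous variables
u, v = rng.normal(size=n), rng.normal(size=n)   # structural errors

d = 1.0 - a1 * b1
Y = (a1 * b2 * W + a2 * Z + a1 * v + u) / d     # reduced form (C)
X = (b1 * a2 * Z + b2 * W + b1 * u + v) / d     # reduced form (D)

# The reduced forms must reproduce both structural equations
structural_ok = bool(
    np.allclose(Y, a1 * X + a2 * Z + u) and np.allclose(X, b1 * Y + b2 * W + v)
)

# OLS on equation (A): regress Y on X and Z
coef, *_ = np.linalg.lstsq(np.column_stack([X, Z]), Y, rcond=None)
a1_ols = coef[0]
cov_xu = np.mean(X * u)   # sample covariance, since both have mean zero

print("structural equations hold:", structural_ok)
print("cov(X, u):", cov_xu)
print("OLS estimate of a1:", a1_ols, "versus true value", a1)
```

Because `X` contains `u` through the feedback from `Y`, the covariance is non-zero and the OLS coefficient on `X` should overshoot the true value.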

  7. Responses to Endogeneity
     Remedy 1: IV estimation method
     Remedy 2: Differencing methods
     Limitations of differencing:
     - it will not eliminate selection bias.
     - it only eliminates fixed (time-invariant) variables; sometimes endogenous variables change values over time.

  8. Responses to Endogeneity. Approach #3: Difference it out (continued)
     Limitations:
     - DD models will not eliminate selection bias.
     - DD models only eliminate fixed variables; sometimes endogenous variables change values over time.

  9. Omitted Variables
     1. Find additional data so that every relevant variable is included.
     2. Ignore it, if the omitted variable is uncorrelated with all included variables.
     3. Find a proxy variable. The proxy z must be redundant (= ignorable): E(y | x, q, z) = E(y | x, q)

  10. Measurement Error
     1. Improve measurement.
     2. Argue that the degree of error is small (use outside data for validation).
     3. Argue that the error is uncorrelated with the included variables.

  11. Instrumental Variables
     An instrumental variable is a variable that is correlated with X but uncorrelated with $\varepsilon$. If $Z_i$ is an instrumental variable:
     1. $E(Z_i X_i) \neq 0$
     2. $E(Z_i \varepsilon_i) = 0$

  12. Instrumental Variables
     The econometrician can use an instrumental variable Z to estimate the effect on Y of only that part of X that is correlated with Z. Because Z is uncorrelated with $\varepsilon$, any part of X that is correlated with Z must also be uncorrelated with $\varepsilon$. An instrumental variable lets the econometrician find a part of X that behaves as though it had been randomly assigned.

  13. Instrumental Variables
     One way to see this is in terms of two regression equations:
     $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
     $X_i = \alpha_0 + \alpha_1 Z_i + \mu_i$
     Note that in this model X is endogenous (it may be correlated with $\varepsilon$). The instrumental variables model requires that:
     1. $\alpha_1 \neq 0$, so that Z predicts X, and
     2. Z is uncorrelated with $\varepsilon$ (Z is exogenous): $Cov\{\varepsilon, Z\} = 0$

  14. Implication
     Estimate the model by OLS and by IV, and compare the estimates.
     If $\hat\beta_{IV} \approx \hat\beta_{OLS}$, use OLS.
     If $\hat\beta_{IV} \neq \hat\beta_{OLS}$, use IV.
     But test INDIRECTLY using the Wu-Hausman test.

  15. THE INSTRUMENTAL VARIABLES (IV) ESTIMATOR
     Suppose that one or more of the regressors in X is not independent of the equation error term, even in the limit as the sample size goes to infinity. That is, X is correlated with u, the equation disturbance. However, suppose we have another variable, Z (an instrument for X), that has the properties:
     (1) Z and X are correlated
     (2) Z and u are uncorrelated
     Now define the IV estimator as: $\hat\beta_{IV} = (Z'X)^{-1}Z'Y$

  16. $Y = X\beta + u$
     $\hat\beta_{IV} = (Z'X)^{-1}Z'Y$
     $= (Z'X)^{-1}Z'(X\beta + u)$
     $= (Z'X)^{-1}Z'X\beta + (Z'X)^{-1}Z'u$
     $= \beta + (Z'X)^{-1}Z'u$
     If $Cor(Z, u) = 0$, then $E(\hat\beta_{IV}) = \beta$ in large samples.
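The IV estimator $(Z'X)^{-1}Z'Y$ takes only a few lines to compute. The sketch below assumes simulated data in which `x` is endogenous by construction and `z` is a valid instrument; the true slope `beta = 2` and all other numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
beta = 2.0                               # hypothetical true slope

z = rng.normal(size=n)                   # instrument: drives x, unrelated to u
u = rng.normal(size=n)                   # equation disturbance
x = z + 0.8 * u + rng.normal(size=n)     # x is correlated with u, so it is endogenous
y = 1.0 + beta * x + u

X = np.column_stack([np.ones(n), x])     # regressors, with intercept
Z = np.column_stack([np.ones(n), z])     # instruments, with intercept

b_ols = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'Y
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)    # (Z'X)^{-1} Z'Y

print("OLS slope:", b_ols[1])
print("IV slope: ", b_iv[1])
```

In this setup the OLS slope should be visibly too large while the IV slope should sit close to the true value, since only the variation in `x` that comes from `z` is used.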

  17. Generalised IV estimator (GIVE)
     - A more general form of the IV estimator, for the case where we have more instrumental variables than endogenous X variables.
     - GIVE is potentially more efficient than simple IV, if the instruments are well chosen.
     - Test whether the instruments are valid using Sargan's test.

  18. Finding a suitable Instrumental Variable
     How do we find an instrumental variable? There are two methods:
     - Arbitrary search and test.
     - Two stage least squares.
     Two Stage Least Squares (2SLS) offers an excellent direct estimation method in the case of exactly or over-identified equations.

  19. Strong IVs
     A strong instrument has a high correlation with the endogenous variable. How strong a correlation? Staiger & Stock (1997) recommend a first-stage partial F statistic of 10 or greater for the excluded instrument.
     - Run the first stage with and without the IV.
     - Compute the partial F statistic from the two fits: a value of 10 or more is taken as sufficient evidence of strength.
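The first-stage comparison "with and without the IV" can be expressed as a partial F statistic computed from the two residual sums of squares. The sketch below uses simulated data; the helper names `rss` and `partial_f`, the exogenous control `w`, and all coefficients are hypothetical choices for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def partial_f(x, exog, instruments):
    """F statistic for excluding `instruments` from the first-stage regression of x."""
    n = len(x)
    unrestricted = np.column_stack([exog, instruments])
    q = unrestricted.shape[1] - exog.shape[1]     # number of excluded instruments
    k = unrestricted.shape[1]                     # parameters in the full first stage
    rss_r, rss_u = rss(exog, x), rss(unrestricted, x)
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

rng = np.random.default_rng(4)
n = 500
w = rng.normal(size=n)                            # exogenous control
z = rng.normal(size=n)                            # candidate instrument
x = 0.5 * z + 0.3 * w + rng.normal(size=n)        # endogenous regressor

exog = np.column_stack([np.ones(n), w])
F = partial_f(x, exog, z)
print("first-stage partial F:", F)
```

The statistic is then compared with whichever rule-of-thumb threshold is adopted; shrinking the coefficient on `z` toward zero in the simulation is a quick way to see what a weak first stage looks like.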

  20. Weak IVs
     If the IVs are weak, 2SLS and 2SRI are consistent, but:
     - there can be considerable bias even in large samples
     - standard errors are too small
     - 2SLS and 2SRI perform poorly

  21. Weak IVs
     What to do if IVs are weak? If there is a single endogenous variable, use a conditional likelihood ratio (CLR) test:
     * perform a regular likelihood ratio test
     * adjust the critical values
     * available in Stata; see Stata Journal, 3, 57-70 and http://elsa.berkeley.edu/wp/marcelo.pdf by Moreira and Poi

  22. Two stage least squares as IV estimation
     The first stage involves the creation of an instrument. Use the reduced form equation for P to get its fitted value, Phat. The second stage involves a variant of instrumental variables estimation: replace P by Phat in the supply equation and use OLS in this second stage of the estimation process. So it is in fact a special, and perhaps less arbitrary, way of doing instrumental variables estimation.

  23. Two stage least squares estimation with modern econometric software
     Although one could undertake 2SLS estimation manually, running the reduced form regression, saving the fitted values and then running the second stage (structural form) regression, modern software allows you to get the results automatically with one set of instructions. You need to tell the software which RHS variable is endogenous and which other variables should be used as regressors in the reduced form (first stage) of the regression. Using the automatic IV procedure will also guarantee appropriate estimates of the second stage standard errors.
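The manual two-stage procedure, and the reason the second-stage standard errors need care, can be sketched as follows. The supply equation, the instrument `income` (a demand shifter), and every coefficient here are assumptions made up for the illustration; `ols` is a small helper, not a library function.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(5)
n = 100_000
beta = 1.5                                   # hypothetical supply slope

income = rng.normal(size=n)                  # hypothetical instrument (demand shifter)
u = rng.normal(size=n)                       # supply shock
p = income + 0.5 * u + rng.normal(size=n)    # price is endogenous: it depends on u
q = 2.0 + beta * p + u                       # supply equation

ones = np.ones(n)
Zmat = np.column_stack([ones, income])
Xmat = np.column_stack([ones, p])

# Stage 1: reduced form for P, keep the fitted values Phat
p_hat = Zmat @ ols(Zmat, p)
Xhat = np.column_stack([ones, p_hat])

# Stage 2: OLS of Q on Phat
b_2sls = ols(Xhat, q)

# Direct IV estimator for comparison (identical when exactly identified)
b_iv = np.linalg.solve(Zmat.T @ Xmat, Zmat.T @ q)

# Error variance: the naive second stage uses Phat, the correct one uses actual P
k = Xmat.shape[1]
sigma2_naive = np.sum((q - Xhat @ b_2sls) ** 2) / (n - k)
sigma2_correct = np.sum((q - Xmat @ b_2sls) ** 2) / (n - k)
se_correct = np.sqrt(sigma2_correct * np.diag(np.linalg.inv(Xhat.T @ Xhat)))

print("2SLS:", b_2sls, "IV:", b_iv)
print("naive sigma^2:", sigma2_naive, "correct sigma^2:", sigma2_correct)
print("corrected standard errors:", se_correct)
```

The two error-variance estimates differ because the naive second-stage residuals are built from `p_hat` instead of the actual `p`, which is why standard errors read off a hand-run second stage are unreliable and the automatic routines are preferred.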

  24. IV over-identification test

  25. Hausman specification test
     The test compares two sets of coefficient estimates (for example OLS and IV); the test statistic follows a chi-square distribution.
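For a single regressor, a Hausman-type comparison of OLS and IV can be sketched as below. This is one simple variant, assuming homoskedastic errors and using the IV residual variance in both variance estimates so that their difference stays positive; the function name `hausman_scalar` and the simulated data are hypothetical.

```python
import numpy as np

def hausman_scalar(x, y, z):
    """Hausman statistic comparing OLS and IV slopes for one regressor.

    Variables are demeaned first, so no explicit intercept is needed.
    """
    x, y, z = x - x.mean(), y - y.mean(), z - z.mean()
    n = len(y)
    b_ols = (x @ y) / (x @ x)
    b_iv = (z @ y) / (z @ x)
    resid_iv = y - x * b_iv
    sigma2 = (resid_iv @ resid_iv) / (n - 2)       # common error-variance estimate
    var_ols = sigma2 / (x @ x)
    var_iv = sigma2 * (z @ z) / (z @ x) ** 2
    return (b_iv - b_ols) ** 2 / (var_iv - var_ols)

rng = np.random.default_rng(6)
n = 5_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = z + 0.8 * u + rng.normal(size=n)               # endogenous by construction
y = 2.0 * x + u

H = hausman_scalar(x, y, z)
CHI2_1_CRIT_5PCT = 3.841                           # 5% critical value, chi-square with 1 df
reject_exogeneity = bool(H > CHI2_1_CRIT_5PCT)
print("Hausman statistic:", H, "reject exogeneity of x:", reject_exogeneity)
```

A large statistic says the OLS and IV estimates differ by more than sampling noise would explain, which points toward using IV; a small one gives no evidence against OLS.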

  26. Paper replication (IV method) The Colonial Origins of Comparative Development: An Empirical Investigation Author(s): Daron Acemoglu, Simon Johnson, James A. Robinson Source: The American Economic Review, Vol. 91, No. 5 (Dec., 2001), pp. 1369-1401

  27. Introduction
     Research question: How can the effect of institutions on economic development be measured accurately?
     Countries with better "institutions," more secure property rights, and less distortionary policies will invest more in physical and human capital, and will use these factors more efficiently to achieve a greater level of income (e.g., Douglass C. North and Robert P. Thomas, 1973; Eric L. Jones, 1981; North, 1981).

  28. Introduction
     Theory:
     1. There were different types of colonization policies: A) European powers set up "extractive states"; B) settlers established new colonies with European institutions, with strong emphasis on private property and checks against government power.
     2. Where the disease environment was not favorable, fewer colonizers migrated and "extractive states" were more likely to develop.
     3. Colonial institutions persisted even after independence.

  29. Literature and theory
     William H. McNeill (1976), Crosby (1986), and Jared M. Diamond (1997) have discussed the influence of diseases on human history. Diamond (1997), in particular, emphasizes comparative development, but his theory is based on the geographical determinants of the incidence of the neolithic revolution.
     Ronald E. Robinson and John Gallagher (1961), Lewis H. Gann and Peter Duignan (1962), Donald Denoon (1983), and Philip J. Cain and Anthony G. Hopkins (1993) emphasize that settler colonies such as the United States and New Zealand are different from other colonies, and point out that these differences were important for their economic success.
     Friedrich A. von Hayek (1960) argued that the British common law tradition was superior to the French civil law, which was developed during the Napoleonic era to restrain judges' interference with state policies.

  30. Literature and theory
     Curtin (1964) documents how early British expectations for settlement in West Africa were dashed by very high mortality among early settlers. The Pilgrims decided to migrate to the United States rather than Guyana because of the high mortality rates in Guyana (see Crosby, 1986 pp. 143-44).
     Robinson and Gallagher (1961), Gann and Duignan (1962), Denoon (1983), and Cain and Hopkins (1993) have documented the development of "settler colonies," where Europeans settled in large numbers, and life was modeled after the home country. When the establishment of European-like institutions did not arise naturally, the settlers were ready to fight for them against the wishes of the home country.

  31. Literature and theory
     There are a number of economic mechanisms that will lead to institutional persistence:
     1. Setting up institutions that place restrictions on government power and enforce property rights is costly. It may not pay the elites at independence to switch to extractive institutions.
     2. The gains to an extractive strategy may depend on the size of the ruling elite.
     3. If agents make irreversible investments that are complementary to a particular set of institutions, they will be more willing to support them, making these institutions persist (see, e.g., Acemoglu, 1995).

  32. Compare two countries: Nigeria, which has approximately the 25th percentile of the institutional measure in this sample, 5.6, and Chile, which has approximately the 75th percentile of the institutions index, 7.8. The estimate in column (1), 0.52, indicates that there should be on average a 1.14-log-point difference between the log GDPs of the corresponding countries (or approximately a 2-fold difference: $e^{1.14} - 1 \approx 2.1$). In practice, this GDP gap is 253 log points (approximately 11-fold).
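The arithmetic behind the Nigeria and Chile comparison can be reproduced directly. The sketch below uses only the numbers quoted on the slide (the coefficient 0.52, the index values 5.6 and 7.8, and the observed gap of 253 log points, taken here to mean 2.53 in log units); the variable names are illustrative.

```python
import math

coef = 0.52                  # estimated effect of the institutions index on log GDP
nigeria, chile = 5.6, 7.8    # institutions index at about the 25th and 75th percentiles

predicted_log_gap = coef * (chile - nigeria)           # predicted difference in log GDP
predicted_fold = math.exp(1.14) - 1                    # proportional difference implied by 1.14 log points

observed_log_gap = 2.53                                # 253 log points, as quoted
observed_fold = math.exp(observed_log_gap) - 1         # proportional difference actually observed

print("predicted log gap:", round(predicted_log_gap, 2))
print("predicted fold difference:", round(predicted_fold, 1))
print("observed fold difference:", round(observed_fold, 1))
```

Setting the predicted and observed figures side by side shows how much of the income gap between the two countries the OLS estimate accounts for, and how much is left unexplained.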

  33. Variables in the estimating equations: Y is income per capita in country i, R is the protection against expropriation measure, and $X_i$ is a vector of other covariates.
     R - Current institutions (protection against expropriation between 1985 and 1995)
     C - Early (circa 1900) institutions
     S - European settlements in the colony (fraction of the population with European descent in 1900)
     M - Mortality rates faced by settlers
     X - Vector of covariates that affect all variables

  34. Robustness check
