Simulation-Based Estimation Techniques in Econometrics

Explore simulation-based estimation techniques in econometrics through topics like conditional log likelihood, likelihood functions for random effects, and obtaining unconditional likelihood using methods like Butler and Moffitt. Discover applications in innovation studies with a Probit model for German manufacturing firms.

  • Econometrics
  • Simulation
  • Estimation
  • Innovation
  • Probit Model


Presentation Transcript


  1. Econometrics I. Professor William Greene, Stern School of Business, Department of Economics. 23-1/32, Part 23: Simulation Based Estimation

  2. Econometrics I, Part 23: Simulation Based Estimation

  3. Settings. Conditional and unconditional log likelihoods: the likelihood function to be maximized contains unobservables, so integration techniques are needed. Bayesian estimation: the prior times the likelihood is intractable, so the question is how to obtain posterior means, which are open-form integrals. The problem in both cases is how to do the integration.

  4. A Conditional Log Likelihood.
     Conditional (on random v_i) density: f(y_i | θ, x_i, v_i)
     Unconditional density: f(y_i | θ, x_i) = ∫ f(y_i | θ, x_i, v_i) h(v_i | ω) dv_i
     Log likelihood function: log-L(θ, ω) = Σ_{i=1}^n log ∫ f(y_i | θ, x_i, v_i) h(v_i | ω) dv_i
     The integral does not exist in closed form. How to do the maximization?

  5. Application: Innovation. Sample: 1,270 German manufacturing firms. Panel: 5 years, 1984-1988. Response: process or product innovation in the survey year (yes or no). Inputs: imports of products in the industry, pressure from foreign direct investment, and other covariates. Model: probit with common firm effects. (Irene Bertschek, doctoral thesis; Journal of Econometrics, 1998.)

  6. Likelihood Function for Random Effects.
     Joint conditional (on u_i = σv_i) density for observation i:
     g(y_i1, ..., y_iT | v_i) = Π_{t=1}^T Φ[(2y_it − 1)(β′x_it + σv_i)]
     Unconditional likelihood for observation i:
     L_i = ∫ [Π_{t=1}^T g(y_it | x_it, v_i)] h(v_i) dv_i
     How do we do the integration to get rid of the heterogeneity in the conditional likelihood?

  7. Obtaining the Unconditional Likelihood. The Butler and Moffitt (1982) method, quadrature, is used by most current software (e.g., Stata's GLLAMM). It works only for normally distributed heterogeneity.

  8. Hermite Quadrature.
     ∫ f(x, v) exp(−v²) dv ≈ Σ_{h=1}^H f(x, v_h) W_h
     Adapt to integrating out a normal variable:
     ∫ f(x, v) [1/(σ√(2π))] exp(−(1/2)(v/σ)²) dv
     Change the variable to z = v/(σ√2), so v = (σ√2)z and dv = (σ√2)dz:
     (1/√π) ∫ f(x, σ√2 z) exp(−z²) dz
     This can be accurately approximated by Hermite quadrature:
     (1/√π) Σ_{h=1}^H f(x, σ√2 z_h) W_h
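As a quick numerical check of the change of variable above, the Gauss-Hermite nodes z_h and weights W_h can be taken from NumPy; the test function and the value of σ here are illustrative choices, not part of the slide.

```python
import numpy as np

# Gauss-Hermite nodes z_h and weights W_h for the weight function exp(-z^2)
z, w = np.polynomial.hermite.hermgauss(8)

def gh_normal_expectation(f, sigma, nodes, weights):
    """(1/sqrt(pi)) * sum_h W_h f(sqrt(2)*sigma*z_h) approximates
    E[f(v)] for v ~ N(0, sigma^2), per the change of variable z = v/(sigma*sqrt(2))."""
    return (weights * f(np.sqrt(2.0) * sigma * nodes)).sum() / np.sqrt(np.pi)

# E[v^2] = sigma^2; with 8 nodes the rule is exact for polynomials up to degree 15
approx = gh_normal_expectation(lambda v: v**2, 1.5, z, w)
```

With σ = 1.5 the rule reproduces E[v²] = 2.25 to machine precision, since the integrand is a low-order polynomial.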

  9. Example: 8 Point Quadrature.
     Nodes for 8 point Hermite quadrature (use both signs, + and −):
     0.381186990207322, 1.157193712446780, 1.981656756695843, 2.930637420257244
     Weights for 8 point Hermite quadrature:
     0.661147012558200, 0.207802325814900, 0.017077983007410, 0.000199604072211

  10. Butler and Moffitt's Approach.
      Random effects log likelihood function:
      log L = Σ_{i=1}^N log ∫ [Π_{t=1}^{T_i} g(y_it, β′x_it + σv_i)] h(v_i) dv_i
      Butler and Moffitt: compute this by Hermite quadrature. When h(v_i) is the normal density,
      ∫ f(v_i) h(v_i) dv_i ≈ Σ_{h=1}^H w_h f(z_h),
      where z_h = quadrature node and w_h = quadrature weight (adjusted by the change of variable above); σ is estimated along with β.

  11. The Simulated Log Likelihood.
      log L = Σ_{i=1}^N log ∫ [Π_{t=1}^{T_i} g(y_it, β′x_it + σv_i)] φ(v_i) dv_i,
      where v_i is the normally distributed effect. Use the law of large numbers: let v_i1, ..., v_iR be a random sample of R draws from the standard normal population. Then
      (1/R) Σ_{r=1}^R Π_{t=1}^{T_i} g(y_it, β′x_it + σv_ir)  →P  ∫ [Π_{t=1}^{T_i} g(y_it, β′x_it + σv_i)] φ(v_i) dv_i

  12. Monte Carlo Integration.
      (1/R) Σ_{r=1}^R f(u_ir)  →P  ∫ f(u_i) g(u_i) du_i = E_u[f(u_i)]
      (Certain smoothness conditions must be met.) Drawing u_ir by 'random sampling': u_ir = t(v_ir), v_ir ~ U[0,1]. E.g., u_ir = μ + σΦ⁻¹(v_ir) for N[μ, σ²]. Requires many draws, typically hundreds or thousands.
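A minimal illustration of the law-of-large-numbers approximation, using u ~ N[0, 1] and f(u) = exp(u), for which the exact answer E[exp(u)] = e^(1/2) ≈ 1.6487 is known; the seed and draw count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(12345)

def mc_expectation(f, draws):
    """(1/R) sum_r f(u_r): the sample mean converges to E[f(u)] by the LLN."""
    return f(draws).mean()

# R = 200,000 draws from the standard normal population
draws = rng.standard_normal(200_000)
est = mc_expectation(np.exp, draws)   # exact value is exp(0.5)
```

With this many draws the estimate sits well within Monte Carlo sampling error of the exact value; with only a handful of draws it would not, which is the "requires many draws" point above.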

  13. Generating Random Draws.
      The most common approach is the 'inverse probability transform.' Let u = a random draw from the standard uniform (0,1), and let x = the desired population to draw from, with CDF F(x). The random draw is then x = F⁻¹(u).
      Example: exponential, f(x) = θ exp(−θx), F(x) = 1 − exp(−θx). Equate u to F(x): x = −(1/θ) log(1 − u).
      Example: Normal(μ, σ). The inverse function does not exist in closed form, but there are good polynomial approximations to produce a draw v from N[0,1] from a U(0,1). Then x = μ + σv.
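The exponential example transcribes directly into code; θ here is the rate parameter and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)

def exponential_draws(theta, n):
    """Inverse probability transform: x = F^{-1}(u) = -(1/theta) * log(1 - u)."""
    u = rng.uniform(size=n)
    return -np.log(1.0 - u) / theta

# Exponential(theta) has mean 1/theta, so with theta = 2 the sample mean should be near 0.5
x = exponential_draws(theta=2.0, n=500_000)
```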

  14. Drawing Uniform Random Numbers.
      Computer generated random numbers are not random; they are Markov chains that look random. The original IBM SSP random number generator for 32 bit computers: SEED originates at some large odd number.
      d1 = 16807.0 (= 7^5, a strange number); d2 = 2147483655.0 (2^31 + 7); d3 = 2147483647.0 (2^31 − 1)
      SEED = Mod(d1*SEED, d3)   ! MOD(a,p) = a − INT(a/p)*p
      X = SEED/d2 is a pseudo-random value between 0 and 1.
      Problems: (1) Short period: based on 32 bits, it recycles after 2^31 − 1 values. (2) Evidently not very close to random (recent tests have discredited this RNG). (3) The current state of the art is the Mersenne Twister (the default in R, Matlab, etc.): period 2^19937 − 1, and it passes the (DieHard) randomness tests.
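The generator on the slide transcribes directly; it is shown here only as a historical illustration, not something to use in practice.

```python
def ibm_ssp_uniforms(seed, n):
    """SEED = mod(16807 * SEED, 2^31 - 1); X = SEED / (2^31 + 7).
    A multiplicative congruential generator with period at most 2^31 - 2."""
    d1, d3, d2 = 16807, 2147483647, 2147483655.0
    out = []
    for _ in range(n):
        seed = (d1 * seed) % d3   # MOD(a, p) = a - INT(a/p) * p
        out.append(seed / d2)
    return out

xs = ibm_ssp_uniforms(seed=123457, n=1000)   # values in (0, 1) that "look" uniform
```

Because the modulus 2^31 − 1 is prime and the start seed is nonzero, the state never reaches 0, so every output lies strictly between 0 and 1.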

  15. Poisson with Mean = 4.1. (Slide shows a table of Poisson probabilities and the cumulative distribution.) Uniform draw = .72159; Poisson draw = 4.
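For a discrete distribution such as the Poisson, the inverse transform walks up the CDF to the smallest value x with F(x) ≥ u (conventions at the cell boundaries vary across textbook tables). A minimal sketch:

```python
import math

def poisson_inverse_transform(mean, u):
    """Return the smallest x with F(x) >= u for X ~ Poisson(mean)."""
    p = math.exp(-mean)        # P(X = 0)
    cdf, x = p, 0
    while cdf < u:
        x += 1
        p *= mean / x          # recursion: P(X = x) = P(X = x-1) * mean / x
        cdf += p
    return x

draw = poisson_inverse_transform(4.1, 0.5)
```

For mean 4.1, F(3) ≈ 0.414 and F(4) ≈ 0.609, so u = 0.5 maps to a draw of 4 under this convention.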


  17. Quasi-Monte Carlo Integration Based on Halton Sequences.
      Coverage of the unit interval is the objective, not randomness of the set of draws.
      Halton sequences: let p = a prime number and r = the sequence of integers (e.g., 10, 11, 12, ...), decomposed in base p as r = Σ_{i=0}^I b_i p^i. Then H(r|p) = Σ_{i=0}^I b_i p^(−i−1).
      For example, using base p = 5, the integer r = 37 has b0 = 2, b1 = 2, and b2 = 1 (37 = 1×5² + 2×5¹ + 2×5⁰). Then H(37|5) = 2×5⁻¹ + 2×5⁻² + 1×5⁻³ = 0.488.
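The radical-inverse construction can be sketched in a few lines; `halton` is a hypothetical helper name.

```python
def halton(r, p):
    """H(r|p): write r = sum_i b_i p^i in base p, then reflect the
    digits about the decimal point: H = sum_i b_i p^(-i-1)."""
    h, f = 0.0, 1.0 / p
    while r > 0:
        r, b = divmod(r, p)   # peel off the next base-p digit
        h += b * f
        f /= p
    return h

val = halton(37, 5)   # digits (b0, b1, b2) = (2, 2, 1), so H = 0.4 + 0.08 + 0.008 = 0.488
```

In practice one evaluates H(r|p) along a run of consecutive integers r, using a different prime p for each dimension of the integral.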

  18. Halton Sequences vs. Random Draws. Requires far fewer draws: for one dimension, about 1/10 as many. Accelerates estimation by a factor of 5 to 10.


  20. Model for a Count: C_i is Negative Binomial.
      E[C_i | x_i, SNAP_i, ε_i] = λ_i = exp(β′x_i + δ SNAP_i + ε_i)
      ε_i = unobserved normally distributed (0, σ²) heterogeneity
      Negative binomial distribution for C_i:
      P(C_i) = [Γ(θ + C_i) / (Γ(θ) Γ(C_i + 1))] A_i^{C_i} (1 − A_i)^θ,  A_i = λ_i / (θ + λ_i)
      Probit model for SNAP: Prob(SNAP_i = 1 | z_i, ε_i) = Φ(α′z_i + τε_i)

  21. NegBin(count | SNAP) and Probit(SNAP).

  22. Simulated log likelihood for the NegBin-Probit model:
      ln L_S = Σ_{i=1}^N ln (1/R) Σ_{r=1}^R [Γ(θ + C_i) / (Γ(θ) Γ(C_i + 1))] A_ir^{C_i} (1 − A_ir)^θ × Φ[(2 SNAP_i − 1)(α′z_i + τv_ir)],
      where A_ir is based on λ_ir = exp(β′x_i + δ SNAP_i + σv_ir), and v_ir = a set of standard normal random draws (based on Halton). R = 400 draws for each individual.

  23. Panel Data Estimation: A Random Effects Probit Model.
      y*_it = β′x_it + ε_it + u_i,  t = 1, ..., T_i,  i = 1, ..., N
      y_it = 1(y*_it > 0)  (observation mechanism)
      ε_it ~ N[0, 1], u_i ~ N[0, σ²], so (ε_it + u_i | x_i1, ..., x_iT) ~ N[0, 1 + σ²]
      Var[ε_it + u_i] = 1 + σ²;  Corr[ε_it + u_i, ε_is + u_i] = ρ = σ²/(1 + σ²)

  24. Log Likelihood.
      log L = Σ_{i=1}^n log ∫ [Π_{t=1}^{T_i} Φ((2y_it − 1)(β′x_it + σv_i))] φ(v_i) dv_i, with total variance 1 + σ².
      Quadrature:
      log L_Q = Σ_{i=1}^n log (1/√π) Σ_{h=1}^H W_h Π_{t=1}^{T_i} Φ[(2y_it − 1)(β′x_it + √2 σz_h)],
      W_h = quadrature weight, z_h = quadrature node.
      Simulated:
      log L_S = Σ_{i=1}^n log (1/R) Σ_{r=1}^R Π_{t=1}^{T_i} Φ[(2y_it − 1)(β′x_it + σv_ir)],
      v_ir = rth draw from the standard normal for individual i. (v_i1, ..., v_iR) are reused for all computations of the function or its derivatives.
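The quadrature and simulated versions of this log likelihood can be compared on synthetic data. Everything below is an assumption for illustration: a toy data generating process with one regressor, true β = 0.5 and σ = 1, a Φ helper built from math.erf, and modest H and R.

```python
import numpy as np
from math import erf, sqrt

# Standard normal CDF, vectorized over arrays
Phi = np.vectorize(lambda a: 0.5 * (1.0 + erf(a / sqrt(2.0))))

rng = np.random.default_rng(0)
N, T, beta, sigma = 200, 5, 0.5, 1.0
x = rng.standard_normal((N, T))
u = sigma * rng.standard_normal((N, 1))                 # common firm effect
y = (beta * x + u + rng.standard_normal((N, T)) > 0).astype(float)

def loglike_quadrature(beta, sigma, H=32):
    z, w = np.polynomial.hermite.hermgauss(H)           # nodes z_h, weights W_h
    q = (2.0 * y - 1.0)[:, :, None]
    a = q * (beta * x[:, :, None] + sqrt(2.0) * sigma * z)
    Li = (Phi(a).prod(axis=1) * w).sum(axis=1) / sqrt(np.pi)
    return np.log(Li).sum()

def loglike_simulated(beta, sigma, R=300):
    v = rng.standard_normal((N, 1, R))                  # draws v_ir, reused per i
    q = (2.0 * y - 1.0)[:, :, None]
    a = q * (beta * x[:, :, None] + sigma * v)
    Li = Phi(a).prod(axis=1).mean(axis=1)
    return np.log(Li).sum()

lq = loglike_quadrature(beta, sigma)
ls = loglike_simulated(beta, sigma)
```

At the true parameter values the two approximations agree closely, which previews the quadrature-vs-simulation comparison later in the deck.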

  25. Application: Innovation

  26. Application: Innovation

  27. ρ = σ²/(1 + σ²) = 1.17072²/(1 + 1.17072²) = 0.578

  28. Quadrature vs. Simulation.
      Computationally, comparably difficult; numerically, essentially the same answer. MSL is consistent in R.
      Advantages of simulation: it can integrate over any distribution, not just the normal, and it can integrate over multiple random variables, which quadrature is largely unable to do. Models based on simulation are being extended in many directions. The simulation based estimator also allows estimation of conditional means, essentially the same as Bayesian posterior means.

  29. A Random Parameters Model.
      Prob(Innovation) = Φ(β_1i + β_2i FDI + β_3 Imports + β_4 logSales + β_5 Employment + β_6 Productivity)
      (β_1i, β_2i)′ ~ N[(β_1, β_2)′, (σ_11, σ_12; σ_12, σ_22)],
      and four fixed (nonrandom) parameters, β_3, ..., β_6.

  30. Log likelihood for the random parameters model:
      log L = Σ_{i=1}^n log ∫∫ [Π_{t=1}^{T_i} Φ((2y_it − 1)((β_1 + σ_1 v_1i) + (β_2 + σ_2 v_2i)FDI_it + β′x_it))] φ(v_1i) φ(v_2i) dv_1i dv_2i
      Simulated:
      log L_S = Σ_{i=1}^n log (1/R) Σ_{r=1}^R Π_{t=1}^{T_i} G(v_1i,r, v_2i,r),
      where v_ik,r = rth draw from the standard normal for individual i and variable k. The draws (v_i1,1, ..., v_i1,R, ...) are reused for all computations of the function or its derivatives.
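When the two random parameters are correlated (σ_12 ≠ 0), the pair of draws (v_1i, v_2i) with covariance matrix Σ can be built from independent standard normal (or Halton based) primitives via a Cholesky factor. A minimal sketch; the Σ values below are illustrative, not estimates from the model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative covariance of (beta_1i, beta_2i) around their means
Sigma = np.array([[0.8, 0.3],
                  [0.3, 0.5]])
L = np.linalg.cholesky(Sigma)          # lower triangular, Sigma = L @ L.T

R = 100_000
v = rng.standard_normal((R, 2))        # independent N(0,1) primitives
draws = v @ L.T                        # each row is a draw from N(0, Sigma)

emp = np.cov(draws, rowvar=False)      # empirical covariance, close to Sigma
```

The same factorization is how simulation handles multiple correlated random variables, the case the deck notes quadrature is largely unable to do.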

  31. Estimates of a Random Parameters Model.
      ----------------------------------------------------------------------
      Probit Regression                 Start Values for IP
      Dependent variable                IP
      Log likelihood function          -4134.84707
      Estimation based on N =           6350, K = 6
      Information Criteria: Normalization=1/N
                            Normalized    Unnormalized
      AIC                      1.30420      8281.69414
      --------+-------------------------------------------------------------
      Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
      --------+-------------------------------------------------------------
      Constant|  -2.34719***     .21381      -10.978    .0000
         FDIUM|   3.39290***     .39359        8.620    .0000     .04581
          IMUM|    .90941***     .14333        6.345    .0000     .25275
      LOGSALES|    .24292***     .01937       12.538    .0000    10.5401
            SP|   1.16687***     .14072        8.292    .0000     .07428
          PROD|  -4.71078***     .55278       -8.522    .0000     .08962
      --------+-------------------------------------------------------------

  32. RPM


  34. Parameter Heterogeneity


  36. Movie Model
