Introduction to Vector Autoregressions in Econometrics


Explore the world of Vector Autoregressions (VARs) in econometrics with Tony Yates. This lecture provides an overview of VARs, covering motivation, estimation techniques, and key concepts such as identification and factor models. Learn about the applications of VARs in macroeconomics and the resources recommended by experts in the field. Gain insight into the algebra of VARs and their real-world implications without the need for coding. Follow Tony's academic journey and delve into the intricacies of economic analysis through VARs.


Uploaded on Sep 18, 2024



Presentation Transcript


  1. MSc Time Series Econometrics Module 2 Lecture 1: VARs, introduction, motivation, estimation, preliminaries Tony Yates Spring 2014, Bristol

  2. Me New to academia. 20 years in the Bank of England, in the directorate responsible for monetary policy. Various jobs on the Inflation Report and Inflation Forecast, latterly as senior advisor leading the monetary strategy team. Research interests: VARs, TVP-VARs, monetary policy design, learning, DSGE, heterogeneous agents. Also teaching MSc time series on VARs.

  3. Follow my outputs if you are interested: my research homepage; my blog, longandvariable, on macro and public policy; Twitter feed @tonyyates.

  4. Topics we will cover Vector autoregressions: motivation Estimation, MLE, OLS, Bayesian using analytical and Gibbs Sampling MCMC methods Identification [short run restrictions, long run restrictions, sign restrictions, max share criteria] interpretation, use, contribution to macroeconomics Factor models in vector autoregressions TVP VAR estimation using kernels. If we have time, for fun: The Kalman Filter, Bootstrapping

  5. Course/learning strategy Rudimentary algebra of VARs, estimation, identification, etc. Some examples of code, to i) give an insight into what might lie in store if you continue, and ii) sometimes help to demystify the algebra. Applications: their contribution and impact. In the exam, you will be expected to understand and reproduce the algebra, and to cite and comment on the applications. You won't be expected to write code.

  6. VARs useful sources Chris Sims, 'Macroeconomics and Reality' Lutz Kilian, 'Structural Vector Autoregressions' Fabio Canova, 'Methods for Applied Macroeconomic Research' James Hamilton, 'Time Series Analysis' Helmut Lütkepohl, 'New Introduction to Multiple Time Series Analysis'

  7. Useful sources, ctd Stock and Watson, 'Implications of Dynamic Factor Models for VAR Analysis' Stock and Watson, 'Dynamic Factor Models'

  8. Matrix/linear algebra pre-requisites Scalar, vector, matrix. Transpose Inverse (matrix equivalent of dividing). Diagonal matrix. Eigenvalues and eigenvectors. Powers of a matrix. Matrix series sums. Matrix equivalent of geometric scalar sums. Variance-covariance matrix. Cholesky factor of a variance-covariance matrix. Givens matrix.

  9. Some applications Christiano, Eichenbaum, Evans: 'Monetary policy shocks: what have we learned and to what end?' Christiano, Eichenbaum and Evans (2005): 'Nominal rigidities and the dynamic effects of a monetary policy shock' Mountford and Uhlig (2008): 'What are the effects of fiscal policy shocks?' Gali (1999): 'Technology, employment and the business cycle'.

  10. VAR motivation: Cowles Commission models The dominant paradigm was large-scale macroeconometric models, especially in policy institutions, with many estimated equations. Academic origins in the foundational work to create national accounts; the Keynesian formulation of macroeconomics; and Haavelmo's notion of a probability model applied to this. Nice discussion in Sims's Nobel acceptance lecture.

  11. Silly example of a CC model Sims: the equation for C excludes U, TU; the equation for Y excludes C, U; and so on....
  $C_t = c_0 + c_1 Y_t + c_2 Y_{t-1} + u_{C,t}$
  $Y_t = y_0 + y_1 N_t + y_2 N_{t-1} + y_3 W_t + u_{Y,t}$
  $W_t = w_0 + w_1 U_t + w_2 TU_t + u_{W,t}$
  $U_t = u_0 + u_1 Y_t + u_2 Y_{t-1} + u_{U,t}$
  $TU_t = tu_0 + tu_1 Y_t + tu_2 U_t + u_{TU,t}$
  Those u's look exogenous and are meant to be, but are they really primitive shocks? Both the Sims and Lucas critiques would suggest not. Lucas: the equation for C sounds like common sense, but is it the C that reflects the solution to a consumer's problem in a general equilibrium model? Maybe, or maybe not.

  12. Critiques of Cowles Commission approach Lucas (1976): laws of motion have to come from solving the problems of agents in the model. If not, correlations will change when policy changes. Sims (1980): 'incredible' identification restrictions.

  13. Response to Sims and Lucas critiques No incredible restrictions: everything is left in. Reduced-form shocks span the structural shocks. Structural shocks and their effects are sought through identification, with reference to classes of Lucas-critique-proof models. Modest policy interventions [Sims and Zha].

  14. Cowles Commission variables set out as a VAR model
  $Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + Z_t, \quad Y_t = (C_t, Y_t, W_t, U_t, TU_t, N_t)'$
  $A_1 = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{16} \\ A_{21} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ A_{61} & \dots & \dots & A_{66} \end{pmatrix}$
  Simultaneity is encoded in the reduced-form errors, to be disentangled into structural shocks through identification. Potentially, everything is a function of everything else lagged.

  15. VAR topics and contributions to macro (1) Estimating parameters in a DSGE model: Rotemberg and Woodford (1998); Christiano, Eichenbaum and Evans (2005). Identify a monetary policy shock. Choose the parameters of the model so that the impulse response to this shock in the DSGE model is as close as possible to the corresponding IRF in the VAR.

  16. VAR topics and contributions to macro (2) Evaluating the RBC claim that technology shocks cause business cycles: Gali (1999). Identified technology shocks as the only thing that could change labour productivity (Y/N) in the long run. Showed that these caused hours worked to fall, not rise. Inconsistent with the RBC model ('make hay while the sun shines').

  17. VAR topics and contributions to macro (3) Cogley and Sargent (2005) VAR with time-varying parameters Multivariate counterpart to persistence, predictability Bayesian estimation By how much did inflation predictability change in the post-war period? If it changed a lot, what does that suggest about its causes?

  18. Misc technical preliminaries to help you read the papers The lag polynomial operator:
  $\Phi(L) y_{t-1} = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} = (\phi_1 L^0 + \phi_2 L^1 + \dots + \phi_p L^{p-1}) y_{t-1}$
  $L y_t = y_{t-1}, \quad L^{-1} y_t = y_{t+1}$
  Lag / lead operators are denoted by positive / negative powers of L.

  19. Misc technical preliminaries... Cholesky decomposition $\Sigma$ is a v-cov matrix, with elements symmetric about the diagonal; the Cholesky factor is the lower-triangular matrix on the RHS:
  $\Sigma = PP', \quad P = \begin{pmatrix} a & 0 & 0 \\ \cdot & b & 0 \\ \cdot & \cdot & c \end{pmatrix}, \quad a, b, c > 0$
  The diagonals of the vcov matrix are positive because these correspond to variances. Here we can decompose further using an orthonormal matrix $Q$, since $QQ' = I$:
  $\Sigma = PQQ'P' = (PQ)(PQ)'$
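A quick NumPy sketch of both points (the matrix entries are illustrative, not from the slides): the Cholesky factor reproduces $\Sigma$, and so does $P$ post-multiplied by any orthonormal matrix, because $QQ' = I$.

```python
import numpy as np

# A small variance-covariance matrix (symmetric, positive definite; illustrative numbers).
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Cholesky factor: lower triangular P with Sigma = P P'
P = np.linalg.cholesky(Sigma)
assert np.allclose(P @ P.T, Sigma)

# Any orthonormal Q gives another factorisation: (PQ)(PQ)' = P Q Q' P' = Sigma
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
PQ = P @ Q
assert np.allclose(PQ @ PQ.T, Sigma)
```

This non-uniqueness is exactly why identification schemes (short-run, long-run, sign restrictions) are needed to pin down one rotation.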

  20. Misc technical preliminaries A Givens matrix is an example of an orthonormal matrix:
  $P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & -s & 0 \\ 0 & s & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad c = \cos\theta, \; s = \sin\theta$
  Also known as a Givens rotation. Useful theorem: any orthonormal matrix can be shown to be a product of Givens matrices with different thetas, the number depending on the dimension of the orthonormal matrix concerned.

  21. Products of orthonormal matrices
  $P_a P_a' = I, \; P_b P_b' = I \implies (P_a P_b)(P_a P_b)' = I$
  If two matrices are orthonormal, then so is the product of those matrices.
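A NumPy sketch of both claims, using a hypothetical `givens` helper (the 4x4 pattern from the Givens slide; the thetas are illustrative):

```python
import numpy as np

def givens(n, i, j, theta):
    """n x n identity with a rotation by theta in the (i, j) plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i], G[j, j] = c, c
    G[i, j], G[j, i] = -s, s
    return G

Pa = givens(4, 1, 2, 0.7)   # rotation in the (2, 3) plane, as on the slide
Pb = givens(4, 0, 3, -1.2)  # a second rotation in a different plane

I = np.eye(4)
assert np.allclose(Pa @ Pa.T, I)                 # a Givens matrix is orthonormal
assert np.allclose((Pa @ Pb) @ (Pa @ Pb).T, I)   # so is a product of them
```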

  22. The VAR impulse response function Impulse response function in a univariate time series model: take some unit value for $e_1$, then substitute into the equation for y repeatedly.
  $y_t = \rho y_{t-1} + e_t$
  $y_1 = e, \; y_2 = \rho e, \; y_3 = \rho^2 e, \; \dots, \; y_n = \rho^{n-1} e$
  $y_{irf} = (e, \rho e, \rho^2 e, \dots, \rho^{n-1} e)$

  23. AR(1) impulse response function [Figure: IRF of $y_t = 0.85 y_{t-1} + e_t$ with $e_1 = 1$, plotted over 100 periods; the response decays monotonically from 1 towards 0.] The IRF shrinks back to zero as we multiply 1 by successively higher powers of 0.85.

  24. Matlab code to plot IRF in an AR(1)

  25. Matlab code to plot an AR(1) impulse response
%compute impulse response in an AR(1).
%illustration for MSc Time Series Econometrics module.
%y_t = rho*y_{t-1} + e_t.
%calibrate parameters
rho = 0.85;        %persistence parameter in the AR(1)
samp = 100;        %length of time we will compute y for
y = zeros(samp,1); %create a vector to store our ys
                   %semicolons suppress output of every command to screen
e = 1;             %value of the shock in period 1
y(1) = e;          %first period, y = shock
for t = 2:samp
    y(t) = rho*y(t-1); %loop to project forwards the effect of the unit shock
end
%now plot the impulse response
time = 1:samp;     %create a vector of numbers to record the progression of time
plot(time,y)

  26. VAR impulse response A multivariate example. For the IRF to $e_1$, choose $e = (1, 0)'$ for the first period, then project forwards....
  $Y_t = A Y_{t-1} + e_t, \quad Y_t = \begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}, \quad A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
  $Y_{irf,1} = (e, Ae, A^2 e, \dots)$

  27. Matlab code to plot VAR(1) impulse response function
%compute impulse response in a VAR(1).
%illustration for MSc Time Series Econometrics module.
%y_t = A*y_{t-1} + e_t.
clear all;         %housekeeping to clear all variables, so each time you run the program
                   %while debugging you know you are not adding onto previous values
%calibrate parameters
A = [0.6 0.2; 0.2 0.6];
samp = 100;        %length of time we will compute y for
y = zeros(samp,2); %a matrix this time, samp by 2, storing our bivariate series: columns are y1, y2
e = [1;0];         %we will simulate a shock to the first equation; the first-period shock is now a 2 by 1 vector
y(1,:) = e';       %first period, y = shock; the colon ':' means 'all entries in this dimension', and ' is transpose
for t = 2:samp
    y(t,:) = (A*y(t-1,:)')'; %loop to project forwards the effect of the unit shock
end
%now plot the impulse response
time = 1:samp;     %create a vector of numbers to record the progression of time
subplot(2,1,1)
plot(time,y(:,1))
subplot(2,1,2)
plot(time,y(:,2))

  28. VAR(1) impulse response [Figure: two panels over 100 periods, the IRF of $y_1$ decaying from 1 towards 0, and the hump-shaped IRF of $y_2$ rising to about 0.24 before decaying.] $Y_t = A Y_{t-1} + e_t$, $e_1 = (1, 0)'$, $A = [0.6 \; 0.2; 0.2 \; 0.6]$. Note that the eigenvalues of A are 0.4 and 0.8.

  29. Using a VAR to forecast Forecast at future horizon h, conditional on starting from the steady state, = the IRF to the latest estimated shock:
  $Y^f_{t+h} = A^h e_t, \quad e_s = 0, \; s < t$
  Forecast conditional on all the shocks estimated to have occurred = the sum of IRFs to those shocks at increasing horizons:
  $Y^f_{t+h} = A^h e_t + A^{h+1} e_{t-1} + A^{h+2} e_{t-2} + \dots + A^{h+n} e_{t-n}$
  Terms further to the right get smaller, as they involve higher powers of A [= smaller for a stable VAR]: they reflect the response to shocks that hit further and further back in time. Forecasts further and further out shrink back to the steady state for the same reason: higher powers of A, whose eigenvalues are < 1 in absolute value.
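A NumPy sketch of the forecasting formula, with illustrative values for A and for two estimated shocks (none of these numbers come from the slides):

```python
import numpy as np

A = np.array([[0.6, 0.2],
              [0.2, 0.6]])          # a stable VAR(1) coefficient matrix
shocks = [np.array([1.0, 0.0]),     # e_t   (most recent estimated shock)
          np.array([0.0, 0.5])]     # e_{t-1}

h = 4  # forecast horizon

# Forecast conditional on all estimated shocks:
# Y_{t+h} = A^h e_t + A^{h+1} e_{t-1} + ...
forecast = sum(np.linalg.matrix_power(A, h + lag) @ e
               for lag, e in enumerate(shocks))

# As h grows the forecast shrinks back to the steady state (zero here),
# because every term involves ever higher powers of a stable A.
far = sum(np.linalg.matrix_power(A, 50 + lag) @ e
          for lag, e in enumerate(shocks))
assert np.linalg.norm(far) < np.linalg.norm(forecast)
```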

  30. VAR(1) representation of a VAR(p)
  $Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + A_p Y_{t-p} + e_t$
  $\tilde{Y}_t = \tilde{A} \tilde{Y}_{t-1} + \tilde{e}_t, \quad \tilde{Y}_t = \begin{pmatrix} Y_t \\ Y_{t-1} \\ \vdots \\ Y_{t-p+1} \end{pmatrix}$
  $\tilde{A} = \begin{pmatrix} A_1 & A_2 & \dots & A_{p-1} & A_p \\ I_k & 0 & \dots & 0 & 0 \\ 0 & I_k & \dots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & I_k & 0 \end{pmatrix}, \quad \tilde{e}_t = \begin{pmatrix} e_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}$
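A NumPy sketch of stacking a VAR(p) into its companion form, using a hypothetical `companion` helper and illustrative VAR(2) coefficients:

```python
import numpy as np

def companion(A_list):
    """Stack VAR(p) coefficient matrices [A1, ..., Ap] (each k x k)
    into the kp x kp companion matrix of the VAR(1) representation."""
    k = A_list[0].shape[0]
    p = len(A_list)
    top = np.hstack(A_list)                      # first block row: [A1 A2 ... Ap]
    bottom = np.hstack([np.eye(k * (p - 1)),     # identity blocks shift lags down
                        np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

# A VAR(2) in k = 2 variables (illustrative numbers)
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
A_tilde = companion([A1, A2])
assert A_tilde.shape == (4, 4)
```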

  31. Moving average representation of a VAR(p)
  $Y_t = \sum_{i=0}^{\infty} A^i e_{t-i}$
  We have used the VAR(1) form of the VAR(p). In words, it means Y is the sum of shocks, where each shock is premultiplied by a higher and higher power of A as we go back in time. Notice how this is related to the formula for the VAR impulse response we computed before.

  32. Persistence, memory, predictability, stability When a shock hits, how long does it take for its effects to die out? Applications: business cycle theory is concerned with mechanisms for propagation. Consumption and output are not white noise: why? Bad monetary and fiscal policy could be part of the story about why shocks take time to die out. Time series notions of persistence etc. are one way to characterise propagation and bad policy. Why 'bad'? Because more persistence means larger variance, and larger variance, for most utility functions, is bad.

  33. Persistence and variance
  $y_t = \rho y_{t-1} + e_t$
  $\text{var}(y_t) = \rho^2 \text{var}(y_{t-1}) + \text{var}(e_t) + 2\rho \, \text{cov}(y_{t-1}, e_t) = \rho^2 \text{var}(y_{t-1}) + \text{var}(e_t)$
  NB: $\text{var}(y_t) = \text{var}(y_{t-1}) \implies \text{var}(y_t) = \dfrac{\text{var}(e_t)}{1 - \rho^2}$
  Persistence (higher rho) means a lower denominator, which means a higher unconditional variance of y. Most economic models assume agents don't like variance. So persistence is interesting economically, since it usually indicates something bad is happening.
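The closed form can be checked against the MA representation numerically; a NumPy sketch with illustrative parameter values:

```python
import numpy as np

rho, sig2 = 0.85, 1.0   # persistence and shock variance (illustrative)

# Closed form: var(y) = var(e) / (1 - rho^2)
var_closed = sig2 / (1 - rho**2)

# MA representation: var(y) = sum_s rho^(2s) * var(e), truncated far out
var_ma = sum(rho**(2 * s) * sig2 for s in range(2000))

assert abs(var_closed - var_ma) < 1e-6
```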

  34. Persistence and predictability
  $P_{t,j} = 1 - \dfrac{e_i' \left( \sum_{h=0}^{j-1} A_t^h \Sigma_t A_t^{h\prime} \right) e_i}{e_i' \left( \sum_{h=0}^{\infty} A_t^h \Sigma_t A_t^{h\prime} \right) e_i}$
  Multivariate predictability: one minus the ratio of the j-step-ahead forecast error variance to the unconditional variance. If the dimension of $Y_t$ is 1 and we set the horizon j = 1, this formula delivers $\rho^2$, i.e. persistence squared.

  35. AR stability Univariate model:
  $y_t = \rho y_{t-1} + e_t, \quad |\rho| < 1$
  $e, \; \rho e, \; \rho^2 e, \; \dots, \; \rho^n e; \quad \lim_{t \to \infty} \rho^t e = 0$
  The effects of a shock eventually die out, so the series converges to something. Stability = stationarity = the series have convergent sums = first and second moments are independent of t, and computable.

  36. $y_t = e_t + \rho e_{t-1} + \rho^2 e_{t-2} + \dots + \rho^n e_{t-n} = \sum_{s=0}^{\infty} \rho^s e_{t-s}, \quad \lim_{s \to \infty} \rho^s e_{t-s} = 0$
  The contribution of a shock very far back goes to zero. Here we take the perspective that today's data is the sum of the effects of shocks going back into the infinite past. Since today's data is finite and well-defined, it must be that shocks infinitely far back have no effect; otherwise today's data would be infinitely large.

  37. VAR(1) stability
  $Y_t = A Y_{t-1} + e_t$
  $x \in \text{eig}(A): \; \det(A - x I_K) = 0; \quad \text{stability requires } |x| < 1 \text{ for all } x$
  The stability condition echoes the AR case, but where does the dependence on the eigenvalues come from?
  $Y_t = A^0 e_t + A^1 e_{t-1} + A^2 e_{t-2} + \dots + A^n e_{t-n} = (A^0 L^0 + A^1 L^1 + A^2 L^2 + \dots + A^n L^n) e_t$
  Here we write out the vector Y as the sum of contributions from shocks going back further and further into time.

  38. Explaining the VAR(1) eigenvalue stability condition
  $A^n = P D^n P^{-1}, \quad D^n = \begin{pmatrix} \lambda_1^n & 0 & \dots & 0 \\ 0 & \lambda_2^n & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \dots & 0 & \lambda_d^n \end{pmatrix}, \quad P = (v_1, v_2, \dots, v_d)$
  The crucial thing is to make this $A^n$ go to zero as n goes to infinity. The importance of the eigenvalues in this happening comes from the fact that we can write a square matrix using the eigenvalue-eigenvector decomposition, and then compute the power of A using the powers of the eigenvalues of A. So to make $A^n$ go to zero, we have to make all the diagonal elements of $D^n$ go to zero, and by analogy with the AR(1) case, this means the eigenvalues have to be < 1 in absolute value.
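A NumPy sketch of the argument with an illustrative stable A: the eigendecomposition reproduces $A^n$, and $A^n$ dies out because the diagonal of $D^n$ goes to zero.

```python
import numpy as np

A = np.array([[0.6, 0.2],
              [0.2, 0.6]])          # eigenvalues 0.8 and 0.4, both inside the unit circle

eigvals, P = np.linalg.eig(A)       # A = P D P^{-1}
assert np.all(np.abs(eigvals) < 1)  # stability condition

# A^n computed via P D^n P^{-1} matches direct matrix powers...
n = 10
Dn = np.diag(eigvals**n)
An = P @ Dn @ np.linalg.inv(P)
assert np.allclose(An, np.linalg.matrix_power(A, n))

# ...and dies out as n grows, because the diagonal of D^n shrinks to zero.
assert np.linalg.norm(np.linalg.matrix_power(A, 200)) < 1e-12
```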

  39. VAR(p) stability Write the VAR(p) in its VAR(1) companion form, with $\tilde{A}$ and $\tilde{e}_t$ defined as before:
  $\tilde{Y}_t = \tilde{A} \tilde{Y}_{t-1} + \tilde{e}_t$
  $x \in \text{eig}(\tilde{A}): \; |x| < 1 \text{ for all } x$

  40. VAR estimation Linear, multivariate model Suggests estimation by... OLS! Or MLE, which in these circumstances [linear; Gaussian errors] is equivalent. MLE cumbersome because you may have many parameters over which to optimise Why is this cumbersome? Well, you tell me.

  41. OLS estimation of VAR parameters Univariate case:
  $y_t = \beta y_{t-1} + e_t, \quad y = (y_2, \dots, y_T)', \quad x = (y_1, \dots, y_{T-1})'$
  $\hat{\beta} = (x'x)^{-1} x'y$
  Multivariate case, for the VAR(1) representation of the VAR(p):
  $\hat{A} = (X'X)^{-1} X'Y$
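A NumPy sketch of the univariate formula on simulated data (the simulation design is illustrative, not from the slides); the multivariate case just swaps the scalar products for matrix ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) with known rho, then recover it by OLS.
rho_true, T = 0.85, 5000
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho_true * y[t - 1] + rng.standard_normal()

x = y[:-1]                     # regressor: y_{t-1}
yy = y[1:]                     # regressand: y_t
rho_hat = (x @ yy) / (x @ x)   # (x'x)^{-1} x'y in the scalar case

assert abs(rho_hat - rho_true) < 0.05
```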

  42. OLS estimate of reduced form residual variance covariance matrix Univariate case: take the data, subtract the prediction, multiply the residual vector by its transpose....
  $y_t = \hat{\beta} y_{t-1} + \hat{e}_t, \quad \hat{e} = y - \hat{\beta} x, \quad y = (y_2, \dots, y_T)', \quad x = (y_1, \dots, y_{T-1})'$
  $\hat{\sigma}^2 = (1/T) \, \hat{e}'\hat{e}$
  Analogously for the multivariate case:
  $\hat{\Sigma} = (1/T) \, \hat{e}'\hat{e}, \quad \hat{e} = Y - \hat{A} X$

  43. The likelihood function for a VAR Why bother, if we can use OLS, and given the drawbacks of MLE? The log posterior is the sum of the log likelihood and the log prior: so we need it for Bayesian estimation. It is also key to understanding many concepts: the origin and derivation of standard errors from the slope of the LF; estimation when we can't evaluate the LF but have to approximate it by simulation; pseudo-ML when the data are non-Gaussian.

  44. Likelihood fn for an AR(1)
  $y_t - \mu_0 = \rho (y_{t-1} - \mu_0) + e_t, \quad e_t \sim N(0, \sigma_e^2)$
  $E(y_1) = \mu_0, \quad \text{var}(y_1) = E(y_1 - \mu_0)^2 = \sigma_e^2 / (1 - \rho^2)$
  $f_{y_1}(y_1; \mu_0, \rho, \sigma_e) = \left( 2\pi \dfrac{\sigma_e^2}{1 - \rho^2} \right)^{-1/2} \exp\left( -\dfrac{(y_1 - \mu_0)^2}{2\sigma_e^2 / (1 - \rho^2)} \right)$
  $y_2 = \mu_0 + \rho (y_1 - \mu_0) + e_2 \implies y_2 | y_1 \sim N(\mu_0 + \rho (y_1 - \mu_0), \, \sigma_e^2)$
  $f_{y_2 | y_1}(y_2 | y_1; \mu_0, \rho, \sigma_e) = (2\pi \sigma_e^2)^{-1/2} \exp\left( -\dfrac{(y_2 - \mu_0 - \rho (y_1 - \mu_0))^2}{2\sigma_e^2} \right)$

  45. LF for an AR(1)/ctd...
  $f_y(y_1, y_2, \dots) = f_{y_1} f_{y_2 | y_1} f_{y_3 | y_2} \dots f_{y_T | y_{T-1}}$
  $\log f_y(y_1, y_2, \dots) = -0.5 \log 2\pi - 0.5 \log \dfrac{\sigma_e^2}{1 - \rho^2} - \dfrac{(y_1 - \mu_0)^2}{2\sigma_e^2 / (1 - \rho^2)} - \dfrac{T-1}{2} \log 2\pi - \dfrac{T-1}{2} \log \sigma_e^2 - \sum_{t=2}^{T} \dfrac{(y_t - \mu_0 - \rho (y_{t-1} - \mu_0))^2}{2\sigma_e^2}$
  Logging turns a complicated product into a long sum. We can maximise this and ignore the constants.
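A NumPy sketch of this log likelihood, in the zero-mean special case for simplicity; the grid search and simulated data are illustrative, just to show the LF peaking near the true rho:

```python
import numpy as np

def ar1_loglik(rho, sig2, y):
    """Exact Gaussian log likelihood of a zero-mean AR(1),
    including the unconditional density of the first observation."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    v1 = sig2 / (1 - rho**2)                            # var of y_1
    ll = -0.5 * np.log(2 * np.pi * v1) - y[0]**2 / (2 * v1)
    resid = y[1:] - rho * y[:-1]                        # one-step prediction errors
    ll += -0.5 * (T - 1) * np.log(2 * np.pi * sig2) \
          - np.sum(resid**2) / (2 * sig2)
    return ll

# The likelihood should peak near the true rho on simulated data.
rng = np.random.default_rng(1)
rho_true = 0.8
y = np.zeros(500)
for t in range(1, 500):
    y[t] = rho_true * y[t - 1] + rng.standard_normal()

grid = np.linspace(-0.95, 0.95, 191)
best = grid[np.argmax([ar1_loglik(r, 1.0, y) for r in grid])]
assert abs(best - rho_true) < 0.1
```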

  46. Likelihood for a VAR(p)
  $Y_t - \mu_0 \,|\, Y_{t-1}, Y_{t-2}, \dots, Y_{t-p} \sim N\!\left( A_1 (Y_{t-1} - \mu_0) + A_2 (Y_{t-2} - \mu_0) + \dots + A_p (Y_{t-p} - \mu_0), \; \Sigma \right), \quad A = (A_1, A_2, \dots, A_p)$
  $f_{Y_t | Y_{t-1}, \dots}(Y_t; \mu_0, A, \Sigma) = (2\pi)^{-n/2} |\Sigma^{-1}|^{1/2} \exp\left( -\tfrac{1}{2} (Y_t - \mu_0 - A X_t)' \Sigma^{-1} (Y_t - \mu_0 - A X_t) \right), \quad X_t = ((Y_{t-1} - \mu_0)', \dots, (Y_{t-p} - \mu_0)')'$
  $\log f_Y(Y_1, Y_2, \dots, Y_T) = \sum_{t=1}^{T} \log f_{Y_t | Y_{t-1}, \dots} = -\tfrac{Tn}{2} \log 2\pi + \tfrac{T}{2} \log|\Sigma^{-1}| - \tfrac{1}{2} \sum_{t=1}^{T} (Y_t - \mu_0 - A X_t)' \Sigma^{-1} (Y_t - \mu_0 - A X_t)$
  (conditioning on the first p observations)

  47. Recap Remember the likelihood assumes Gaussian errors. In some circumstances you can get consistent estimates of the parameters (but not of the standard errors) even if this is violated.

  48. Distributions for a VAR's impulse response functions The IRF is easy in an AR(1):
  $y_t = \rho y_{t-1} + e_t; \quad \text{irf} = (e, \rho e, \rho^2 e, \dots, \rho^n e)$
  It involves powers of only one coefficient, so distributions of rho can be used to compute distributions of the elements of the IRF vector. Work it out. Things are harder with a VAR, as the IRF involves many, jointly distributed coefficients:
  $Y_{t+h}^{irf} = A^h e, \quad h = 0, 1, 2, \dots$

  49. Bootstrapping algorithm for VAR IRFs Suppose we have estimated a VAR(p), written in VAR(1) form: $Y_t = \hat{A} Y_{t-1} + e_t$
  1. Draw, with replacement, a time series of shocks $e^i = (e^i_1, e^i_2, \dots)$
  2. Create a new time series of observables using $Y_t = \hat{A} Y_{t-1} + e^i_t$
  3. Re-estimate the VAR to produce $\hat{A}^i$
  4. Compute IRFs using a unit shock $e_0 = (1, 0, 0, \dots)'$ and powers of $\hat{A}^i$
  5. Set $i = i + 1$; return to step 1 if $i \le$ iter
  Set iter = 200 or so. The algorithm will generate iter vectors h long, where h = the maximum chosen horizon of the IRF. Q: how would you do step 1 with a computer random number generator?
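A NumPy sketch of the whole algorithm on simulated data (design choices such as T = 300 and the coefficient values are illustrative). One common answer to the closing question: draw integer time indices uniformly with replacement, then pick those rows of the residual matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from a known VAR(1), estimate it, then bootstrap the IRF.
A_true = np.array([[0.6, 0.2], [0.2, 0.6]])
T = 300
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + rng.standard_normal(2)

def ols_var1(Y):
    """OLS for a VAR(1) with data in rows: returns A_hat and residuals."""
    X, Z = Y[:-1], Y[1:]
    A_hat = np.linalg.solve(X.T @ X, X.T @ Z).T   # (X'X)^{-1} X'Y, transposed to row form
    resid = Z - X @ A_hat.T
    return A_hat, resid

A_hat, resid = ols_var1(Y)

iters, h = 200, 10
irfs = np.zeros((iters, h, 2))
for i in range(iters):
    # Step 1: resample residuals with replacement via uniform integer indices.
    idx = rng.integers(0, len(resid), size=T - 1)
    e_i = resid[idx]
    # Step 2: build an artificial sample from the estimated VAR.
    Y_i = np.zeros((T, 2))
    for t in range(1, T):
        Y_i[t] = A_hat @ Y_i[t - 1] + e_i[t - 1]
    # Step 3: re-estimate.
    A_i, _ = ols_var1(Y_i)
    # Step 4: IRF to a unit shock in equation 1, using powers of A_i.
    e0 = np.array([1.0, 0.0])
    for j in range(h):
        irfs[i, j] = np.linalg.matrix_power(A_i, j) @ e0

# Percentiles across the iter draws give pointwise IRF error bands.
bands = np.percentile(irfs, [5, 95], axis=0)
assert irfs.shape == (iters, h, 2)
```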
