
Key Concepts in Multivariate Probability Distributions
Explore the intricacies of multivariate probability distributions, including parameter estimation, different distributions like MVN and MVE, and the importance of considering correlation to avoid biased results in modeling multiple correlated random variables.

Multivariate Probability Distributions
Lecture 17
- Chapter 7 (study this closely)
- Chapter 16
- Sections 3.9.1-3.9.7 and 4.3
- Lecture 17 Multivariate Empirical Dist.xlsx
- Lecture 17 Multivariate Normal Dist.xlsx
- Lecture 17 Correct Std Dev EMP.xlsx

Multivariate Probability Distributions
- A multivariate (MV) distribution describes two or more random variables that are correlated
- It can be MV Normal, MV Empirical (EMP), MV Beta, or MV Mixed (for example, Normal for X1 and EMP for X2)
- So far we have worked with univariate distributions; now we have several distributions that are assumed to be correlated with one another

Parameter Estimation for MV Dist.
- Data are generated contemporaneously
- Price and yield are observed each year for related commodities
- Corn and sorghum are used interchangeably for animal feed, so their prices are related
- Steer and heifer prices are related
- Yields of crops on the same farm share the same weather conditions
- Supply and demand forces affect prices similarly; in a bear market or a bull market, prices move together
- Prices for tech stocks move together
- Prices for an industry's or sector's stocks move together

Why Go to the Extra Effort for MV?
- If correlation is ignored when the random variables are in fact correlated, the results are biased
- Suppose $Z = \tilde{X}_1 + \tilde{X}_2$ or $Z = \tilde{X}_1 \cdot \tilde{X}_2$, and the model is simulated without correlation, so that $\rho_{1,2} = 0$
  - If the true $\rho_{1,2} > 0$, the model will understate the risk for $Z$
  - If the true $\rho_{1,2} < 0$, the model will overstate the risk for $Z$
- If $Z = \tilde{X}_1 \cdot \tilde{X}_2$, the mean of $Z$ is biased as well
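
The direction of the bias can be checked with a small simulation. The following is a minimal Python sketch, not part of the lecture materials. The means, standard deviations, sample size, and the function name `simulate_z` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

def simulate_z(rho, n=20000, seed=1):
    """Simulate Z = X1 + X2 and Z = X1 * X2 for two normal variables with correlation rho."""
    rng = random.Random(seed)
    mu1, sd1, mu2, sd2 = 10.0, 2.0, 5.0, 1.0  # hypothetical parameters
    sums, prods = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = mu1 + sd1 * z1
        # Mix z1 into the second draw so that corr(x1, x2) = rho.
        x2 = mu2 + sd2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        sums.append(x1 + x2)
        prods.append(x1 * x2)
    return sums, prods

for rho in (0.0, 0.8, -0.8):
    sums, prods = simulate_z(rho)
    # Standard deviation of the sum and mean of the product for each correlation.
    print(rho, round(statistics.stdev(sums), 2), round(statistics.mean(prods), 2))
```

For these parameters, Var(X1 + X2) = σ1² + σ2² + 2ρσ1σ2 = 5 + 4ρ and E[X1·X2] = μ1μ2 + ρσ1σ2 = 50 + 2ρ. Simulating with ρ = 0 when the true ρ is 0.8 therefore understates the standard deviation of the sum and the mean of the product; a true ρ of -0.8 gives the opposite error.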

Different MV Distributions
- Multivariate Normal distribution (MVN)
- Multivariate Empirical (MVE)
- Multivariate Mixed, where each variable is distributed differently; for example, a MV Mixed distribution with five variables: X ~ Uniform, Y ~ Normal, Z ~ Empirical, R ~ Beta, and S ~ Gamma
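
Parameters for a MVN Distribution
- Deterministic component $\hat{Y}_{ij}$: a vector of means or predicted values for period i, used to simulate all of the j variables, for example
  $\hat{Y}_{ij} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2$ or $\hat{Y}_{ij} = \bar{Y}_j$ (the vector of means)
- Stochastic component $\hat{e}_{ji}$: a matrix of residuals from the predicted or mean values for each period i and each random variable j,
  $\hat{e}_{ji} = Y_{ij} - \hat{Y}_{ij}$,
  which are summarized as the standard deviation of the residuals, $\hat{\sigma}_{e_j}$
- Multivariate component, calculated from the residuals: the covariance matrix ($\Sigma$) for all M random variables in the distribution (an M x M matrix), or the correlation matrix ($P$), estimated using residuals about the forecast (or the means). For M = 4, showing the upper triangle of each symmetric matrix:
  $$\Sigma = \begin{bmatrix} \sigma^2_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ & \sigma^2_{22} & \sigma_{23} & \sigma_{24} \\ & & \sigma^2_{33} & \sigma_{34} \\ & & & \sigma^2_{44} \end{bmatrix} \quad \text{or} \quad P = \begin{bmatrix} 1 & \rho_{12} & \rho_{13} & \rho_{14} \\ & 1 & \rho_{23} & \rho_{24} \\ & & 1 & \rho_{34} \\ & & & 1 \end{bmatrix}$$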
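
As an illustration of the three components, the sketch below estimates a mean vector, the residuals about it, and the covariance and correlation matrices in plain Python. The data and all function names are hypothetical; the lecture itself does this in Excel with Simetar, so this is only an assumed equivalent of the arithmetic.

```python
import math

# Hypothetical history: rows are years, columns are two related variables.
history = [
    [2.10, 1.95],
    [2.45, 2.20],
    [2.30, 2.25],
    [2.80, 2.50],
    [2.60, 2.35],
]

def column_means(rows):
    """Deterministic component when the mean vector is used as the forecast."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def residuals(rows, predicted):
    """Stochastic component: actual value minus predicted value."""
    return [[r[j] - predicted[j] for j in range(len(r))] for r in rows]

def covariance_matrix(resid):
    """Sample covariance of the residuals (divisor n - 1, residuals about the mean)."""
    n, m = len(resid), len(resid[0])
    return [[sum(e[j] * e[k] for e in resid) / (n - 1) for k in range(m)]
            for j in range(m)]

def correlation_matrix(cov):
    """Rescale the covariance matrix by the residual standard deviations."""
    m = len(cov)
    sd = [math.sqrt(cov[j][j]) for j in range(m)]
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(m)] for j in range(m)]

means = column_means(history)
resid = residuals(history, means)
sigma = covariance_matrix(resid)
rho = correlation_matrix(sigma)
print(means)
print(sigma)
print(rho)
```

If the deterministic component is a regression forecast rather than the mean, the same `residuals` step applies with the predicted values passed in, although the appropriate degrees of freedom for the covariance would differ.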
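
Three Variable MVN Distribution
- Deterministic component for three random variables
  $\hat{C}_i = a + b_1 C_{i-1}$
  $\hat{W}_i = a + b_1 T_i + b_2 W_{i-1}$
  $\hat{S}_i = a + b_1 T_i$
- Stochastic component
  $\hat{e}_{Ci} = C_i - \hat{C}_i$
  $\hat{e}_{Wi} = W_i - \hat{W}_i$
  $\hat{e}_{Si} = S_i - \hat{S}_i$
- Multivariate component calculated from the residuals (upper triangle shown)
  $$\Sigma = \begin{bmatrix} \sigma^2_{cc} & \sigma_{cw} & \sigma_{cs} \\ & \sigma^2_{ww} & \sigma_{ws} \\ & & \sigma^2_{ss} \end{bmatrix} \quad \text{or} \quad P = \begin{bmatrix} 1 & \rho_{e_c, e_w} & \rho_{e_c, e_s} \\ & 1 & \rho_{e_w, e_s} \\ & & 1 \end{bmatrix}$$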

Simulating MVN in Simetar
- One-step procedure for 4 random variables
- Highlight 4 cells if the distribution is for 4 variables, type
  =MVNORM(4x1 Means Vector, 4x4 Covariance Matrix)
  for example =MVNORM(A1:A4, B1:E4), and press Control Shift Enter
- Here the 4 means (or forecasted values) are in column A, rows 1-4, and the covariance matrix is in columns B-E, rows 1-4
- If you use the historical means, the MVN will validate perfectly using Compare Two Series
- If you use forecasts rather than means, the validation test fails for the mean vector, because the forecast $\hat{Y}_{ij,T+i}$ always differs from the historical mean
- The CV moves inversely to the mean: it falls below the historical CV as the means increase relative to history and rises above it as the means decrease
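
Outside of Excel, a one-step MVN draw is commonly built from a Cholesky factor of the covariance matrix. The sketch below shows that general technique in plain Python; it is not Simetar's implementation of MVNORM, and the means, covariance matrix, and function names are hypothetical (three variables are used to keep it short).

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def mvn_draw(means, chol, rng):
    """One correlated draw: means + L * z, where z holds independent standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in means]
    return [means[i] + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i in range(len(means))]

means = [100.0, 50.0, 20.0]          # hypothetical means or forecasts
cov = [[25.0, 12.0, 3.0],            # hypothetical covariance matrix
       [12.0, 16.0, 2.0],
       [3.0, 2.0, 4.0]]

chol = cholesky(cov)
rng = random.Random(42)
draws = [mvn_draw(means, chol, rng) for _ in range(5000)]
print(draws[0])
```

Replacing `means` with forecasts shifts every draw while leaving the covariance unchanged, which is why a means test against history would then be expected to fail.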

Example of Validation Problem for Historical Mean vs. Y-Hat(T+i)
[Figure: line chart of the series X, X-Bar, and Y-Hat over observations 1-23; vertical axis from 0 to 180]

Simulating MVN in Simetar
- Two-step procedure for a 4-variable MVN
- Step 1: highlight 4 cells, type =CUSD(Location of the Correlation Matrix), and press Control Shift Enter; for example, =CUSD(B1:E4) for a 4x4 correlation matrix in cells B1:E4
- Step 2: use the individual CUSDs to calculate the random values with the Simetar NORM function:
  $\tilde{X}_1$ = NORM(Mean1, $\sigma_1$, CUSD1)
  $\tilde{X}_2$ = NORM(Mean2, $\sigma_2$, CUSD2)
  $\tilde{X}_3$ = NORM(Mean3, $\sigma_3$, CUSD3)
  $\tilde{X}_4$ = NORM(Mean4, $\sigma_4$, CUSD4)
- Use the two-step procedure to gain more control of the process
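
The two-step idea can be sketched in Python as: first generate correlated uniform deviates, then pass each one through the inverse normal CDF. The sketch assumes that correlated uniforms are obtained by applying the standard normal CDF to correlated standard normals; whether Simetar's CUSD works exactly this way is not stated in the slides. All parameter values and function names are hypothetical, and two variables are used for brevity.

```python
import math
import random
from statistics import NormalDist

def correlated_uniforms(rho, rng):
    """Two correlated uniform(0,1) deviates built from correlated standard normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    std_normal = NormalDist()
    # Clamp away from 0 and 1 so the inverse CDF in step 2 is always defined.
    return [min(max(std_normal.cdf(z), 1e-12), 1.0 - 1e-12) for z in (z1, z2)]

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical parameters for two variables.
marginals = [NormalDist(100.0, 5.0), NormalDist(50.0, 4.0)]
rho = 0.6

rng = random.Random(7)
sims = []
for _ in range(5000):
    u = correlated_uniforms(rho, rng)                              # step 1
    sims.append([d.inv_cdf(p) for d, p in zip(marginals, u)])      # step 2

print(round(pearson([s[0] for s in sims], [s[1] for s in sims]), 3))
```

Keeping the uniform deviates as a separate step is what makes it possible to swap in a different inverse CDF for each variable later.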
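
Parameters for MV Empirical
- Deterministic component for three random variables
  $\hat{C}_i = a + b_1 C_{i-1}$
  $\hat{W}_i = a + b_1 T_i + b_2 W_{i-1}$
  $\hat{S}_i = a + b_1 T_i$
- Stochastic component calculated from residuals
  $\hat{e}_{Ci} = C_i - \hat{C}_i$
  $\hat{e}_{Wi} = W_i - \hat{W}_i$
  $\hat{e}_{Si} = S_i - \hat{S}_i$
- Calculate the parameters of the stochastic empirical distribution using the F(x) icon
  $S_{Ci} = \text{Sorted}(\hat{e}_{Ci} / \hat{C}_i)$
  $S_{Wi} = \text{Sorted}(\hat{e}_{Wi} / \hat{W}_i)$
  $S_{Si} = \text{Sorted}(\hat{e}_{Si} / \hat{S}_i)$
- Multivariate component is a correlation matrix calculated using the unsorted residuals from Step II (upper triangle shown)
  $$P = \begin{bmatrix} 1 & \rho_{e_c, e_w} & \rho_{e_c, e_s} \\ & 1 & \rho_{e_w, e_s} \\ & & 1 \end{bmatrix}$$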
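
For a single variable, the empirical parameters amount to sorting the fractional residuals and pairing them with cumulative probabilities. The sketch below uses hypothetical data and assumes equally spaced probabilities from 0 to 1; the exact probabilities Simetar's F(x) tool assigns may differ.

```python
# Hypothetical actual values and forecasts for one variable.
actual = [102.0, 95.0, 110.0, 98.0, 105.0]
y_hat = [100.0, 100.0, 104.0, 101.0, 103.0]

# Fractional deviations from the forecast, in the original (unsorted) order.
# Keep this unsorted series: the correlation matrix is estimated from it.
frac_dev = [(a - f) / f for a, f in zip(actual, y_hat)]

# Sorted deviations S_i and their cumulative probabilities F(S_i).
sorted_dev = sorted(frac_dev)
n = len(sorted_dev)
cum_prob = [i / (n - 1) for i in range(n)]  # assumed spacing: 0, 0.25, ..., 1

for s, p in zip(sorted_dev, cum_prob):
    print(round(s, 4), p)
```

Sorting destroys the year-by-year pairing across variables, which is why the correlation matrix has to come from the unsorted residuals.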

Simulating MVE in Simetar
- One-step procedure for a 4-variable MVE
- Highlight 4 cells if the distribution is for 4 variables, then type
  =MVEMP(Location Actual Data ,,,, Location Y-Hats, Option)
  - Option = 0: use actual data
  - Option = 1: use percent deviations from mean
  - Option = 2: use percent deviations from trend
  - Option = 3: use differences from mean
- End this function with Control Shift Enter
- Example: =MVEMP(B5:D14 ,,,, G7:I6, 2), where the 10 observations for the 3 random variables are in rows 5-14 of columns B-D and are simulated as percent deviations from trend

Two Step MVE
- Highlight 4 cells if the distribution has 4 random variables, type =CUSD(Location of Correlation Matrix), and press Control Shift Enter; for example, =CUSD( A12:A15)
- This produces correlated uniform standard deviates (CUSD)
- Next use the CUSDs to calculate the random values
- BE SURE to maintain the exact order of CUSDs and variables (Mean here could also be $\hat{Y}$)
  $\tilde{X}_1$ = Mean1 * (1 + Empirical(S1, F(Si), CUSD1))
  $\tilde{X}_2$ = Mean2 * (1 + Empirical(S2, F(Si), CUSD2))
  $\tilde{X}_3$ = Mean3 * (1 + Empirical(S3, F(Si), CUSD3))
  $\tilde{X}_4$ = Mean4 * (1 + Empirical(S4, F(Si), CUSD4))
- Use the two-step procedure if you want more control of the process, and for all homework and tests
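
A rough Python analogue of the two-step MVE is sketched below. It assumes that the empirical inverse is a linear interpolation between the sorted deviations and that correlated uniforms come from correlated standard normals; neither detail is confirmed by the slides. The sorted deviations, means, correlation, and function names are all hypothetical, and two variables are used for brevity.

```python
import bisect
import math
import random
from statistics import NormalDist

def empirical_inverse(sorted_dev, cum_prob, u):
    """Interpolate the sorted deviations at cumulative probability u."""
    if u <= cum_prob[0]:
        return sorted_dev[0]
    if u >= cum_prob[-1]:
        return sorted_dev[-1]
    hi = bisect.bisect_right(cum_prob, u)   # first probability strictly above u
    lo = hi - 1
    w = (u - cum_prob[lo]) / (cum_prob[hi] - cum_prob[lo])
    return sorted_dev[lo] + w * (sorted_dev[hi] - sorted_dev[lo])

def correlated_uniforms(rho, rng):
    """Two correlated uniform deviates from correlated standard normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    std_normal = NormalDist()
    return std_normal.cdf(z1), std_normal.cdf(z2)

# Hypothetical sorted fractional deviations and probabilities for two variables.
sorted_dev_1 = [-0.10, -0.04, 0.00, 0.05, 0.12]
sorted_dev_2 = [-0.08, -0.03, 0.01, 0.04, 0.09]
cum_prob = [0.0, 0.25, 0.5, 0.75, 1.0]
means = [100.0, 50.0]
rho = 0.7

rng = random.Random(11)
sims = []
for _ in range(2000):
    u1, u2 = correlated_uniforms(rho, rng)
    # The order matters: u1 always goes with variable 1, u2 with variable 2.
    x1 = means[0] * (1.0 + empirical_inverse(sorted_dev_1, cum_prob, u1))
    x2 = means[1] * (1.0 + empirical_inverse(sorted_dev_2, cum_prob, u2))
    sims.append((x1, x2))

print(sims[0])
```

Every simulated value stays between Mean * (1 + smallest deviation) and Mean * (1 + largest deviation), a property of interpolating within the observed deviations.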

Parameter Estimation for MVE
- Highlight all the variables at once in the F(x) menu
- Notice that I highlighted three variables in the EMP Distribution menu
- You can highlight as many variables as you want for the MVEMP

If =CUSD() Returns #VALUEs
- When the matrix is not positive semi-definite, CUSD returns #VALUE
- Highlight the cells, press F2, enter TRUE in the Always Calculate option, and press Control Shift Enter

MV Mixed Distributions
- What if you need to simulate a MV distribution made up of variables that are not all Normal or all Empirical? For example:
  - X ~ Normal
  - Y ~ Beta
  - T ~ Gamma
  - Z ~ Empirical
- Develop parameters for each variable
- Estimate the correlation matrix for the random variables in the MV distribution

MV Mixed Distributions
- Simulate a vector of correlated uniform standard deviates using the =CUSD() function
- =CUSD(correlation matrix) is an array function, so highlight the number of cells that matches the number of variables in the MV distribution
- Use the CUSDi values in the appropriate Simetar functions for each random variable, as:
  =NORM(Mean, Std Dev, CUSD1)
  =BETAINV(CUSD2, Alpha, Beta)
  =GAMMAINV(CUSD3, P1, P2)
  =Mean*(1+EMP(Si, F(Si), CUSD4))
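
The same pattern can be sketched in Python: one vector of correlated uniforms, then a different inverse CDF per variable. Because the Python standard library has no inverse Beta or Gamma CDF, this sketch swaps in a Uniform marginal alongside the Normal and Empirical ones; that substitution, the correlation matrix, all parameters, and all function names are assumptions made for illustration.

```python
import bisect
import math
import random
from statistics import NormalDist

def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / L[j][j]
    return L

def correlated_uniforms(chol, rng):
    """Correlated uniforms: correlate standard normals, then apply the normal CDF."""
    z = [rng.gauss(0.0, 1.0) for _ in chol]
    std_normal = NormalDist()
    out = []
    for i in range(len(chol)):
        corr_z = sum(chol[i][k] * z[k] for k in range(i + 1))
        out.append(min(max(std_normal.cdf(corr_z), 1e-12), 1.0 - 1e-12))
    return out

def empirical_inverse(sorted_dev, cum_prob, u):
    """Interpolate the sorted deviations at cumulative probability u."""
    if u <= cum_prob[0]:
        return sorted_dev[0]
    if u >= cum_prob[-1]:
        return sorted_dev[-1]
    hi = bisect.bisect_right(cum_prob, u)
    lo = hi - 1
    w = (u - cum_prob[lo]) / (cum_prob[hi] - cum_prob[lo])
    return sorted_dev[lo] + w * (sorted_dev[hi] - sorted_dev[lo])

# Hypothetical correlation matrix and marginal parameters.
corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
chol = cholesky(corr)
normal_x = NormalDist(100.0, 10.0)          # X ~ Normal
uniform_lo, uniform_hi = 20.0, 60.0         # Y ~ Uniform (stand-in for Beta or Gamma)
sorted_dev = [-0.10, -0.04, 0.00, 0.05, 0.12]
cum_prob = [0.0, 0.25, 0.5, 0.75, 1.0]
emp_mean = 50.0                             # Z ~ Empirical about this mean

rng = random.Random(3)
sims = []
for _ in range(3000):
    u = correlated_uniforms(chol, rng)
    x = normal_x.inv_cdf(u[0])
    y = uniform_lo + (uniform_hi - uniform_lo) * u[1]
    z = emp_mean * (1.0 + empirical_inverse(sorted_dev, cum_prob, u[2]))
    sims.append((x, y, z))

print(sims[0])
```

With a library that provides inverse Beta and Gamma CDFs, those marginals would slot into the loop in the same way as the uniform one here.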

Validation of MV Distributions
- Simulate the model with the random variables specified as the key output variables (KOVs), then test the simulated random values
- Perform the following tests:
  - Use the Compare Two Series tab in HoHi to test the means and covariance for the historical series vs. the simulated series
  - Use the Check Correlation tab to test the correlation matrix used as input for the MV model vs. the implied correlation in the simulated random variables
- The null hypothesis (Ho) is: simulated correlation_ij = historical correlation coefficient_ij
- If the null hypothesis is true, the calculated t statistics will not exceed the critical value for the Student t tests, so you fail to reject Ho
- Use caution on the means tests if your forecasted mean is different from the historical mean
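
To show the logic of a correlation check, the sketch below tests whether one simulated correlation coefficient is consistent with its historical value. It uses a Fisher z test, a standard textbook alternative, rather than the Student t test that Simetar's Check Correlation tab reports; the function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_z_test(r_sim, r_hist, n_iter, alpha=0.05):
    """Test Ho: the correlation behind the simulated sample equals r_hist.

    r_sim  : correlation computed from n_iter simulated iterations
    r_hist : historical correlation, treated here as a known value
    Returns (test statistic, critical value, reject Ho?).
    """
    z_stat = (math.atanh(r_sim) - math.atanh(r_hist)) * math.sqrt(n_iter - 3)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z_stat, critical, abs(z_stat) > critical

# Hypothetical example: 500 iterations, historical correlation of 0.60.
print(fisher_z_test(0.58, 0.60, 500))   # small gap between simulated and historical
print(fisher_z_test(0.30, 0.60, 500))   # large gap between simulated and historical
```

As with the t test in the slide, a statistic inside the critical bounds means Ho is not rejected, which is the outcome you want when validating the simulated correlation.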

Validation of MV Distributions
- The 2-Sample Hotelling T² test checks whether the historical means vector equals the simulated means vector
- Box's M test checks whether the historical covariance and the simulated covariance are equal
- The Complete Homogeneity test simultaneously checks the means vectors and the covariance matrices

MVN Distribution Validation
- Demonstrate and validate a MVN distribution with 3 variables
- The validation test using Compare Two Series shows that the random variables maintained the historical covariance and means, with a Fail to Reject message for all three tests