Understanding Three-Way Interactions in Regression Models


Three-way interactions in regression models add complexity to interpreting the effects of predictors. This article explains how to decompose three-way interactions in Stata, model them effectively, and assess their significance using contrast tests. Practical examples and Stata commands are provided for better understanding.


Uploaded on Jul 15, 2024 | 0 Views



Presentation Transcript


  1. Decomposing three-way interactions in Stata

     IDRE Statistical Consulting

  2. Three-way interactions in regression models

     Although three-way interactions are in some ways a straightforward extension of two-way interactions, the complexity of interpretation and the number of questions that can be posed increase rapidly when three-way interactions are present.

     Recall that when we have two predictor variables, say X and Z, their two-way interaction, XZ, models how the effect of X varies with levels of Z, and how the effect of Z varies with levels of X.

     A three-way interaction of X, Z, and W simultaneously models:

     - how the effect of X varies with (or is moderated by) the combination of levels of Z and W
     - how the effect of Z varies with the combination of levels of X and W
     - how the effect of W varies with the combination of levels of X and Z
     - how the two-way interaction of X and Z varies with levels of W
     - how the two-way interaction of X and W varies with levels of Z
     - how the two-way interaction of Z and W varies with levels of X
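As a worked illustration of the list above, consider a linear model with three predictors, written here as X, Z, and W and treated as continuous for simplicity. The coefficient labels b_0 through b_7 are notation introduced for this sketch and do not come from the slides. Differentiating the expected outcome shows why the effect of X depends on the combination of Z and W, and why the two-way interaction of X and Z depends on W:

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 Z + b_3 W + b_4 XZ + b_5 XW + b_6 ZW + b_7 XZW + \varepsilon \\
\frac{\partial E[Y]}{\partial X} &= b_1 + b_4 Z + b_5 W + b_7 ZW \\
\frac{\partial^2 E[Y]}{\partial X \, \partial Z} &= b_4 + b_7 W
\end{aligned}
```

If b_7 = 0, the two-way interaction of X and Z no longer changes with W. With categorical predictors the same logic applies, with indicator variables in place of Z and W.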

  3. Modeling three-way interaction in Stata

     When we are modeling the three-way interaction of variables X, Z, and W, we recommend including all lower-order terms in the model. So the model should include:

     - X, Z, W
     - all two-way product terms: XZ, XW, ZW
     - the three-way product term: XZW

     In Stata, this can be accomplished with the ## shorthand notation. Remember to put c. before any continuous predictors in the interaction (putting i. in front of categorical predictors is recommended as well).

         regress loss c.hours##i.gender##i.prog

     is short for

         regress loss c.hours i.gender i.prog c.hours#i.gender c.hours#i.prog i.gender#i.prog c.hours#i.gender#i.prog
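One way to confirm that the shorthand and the written-out specification describe the same model is to fit both and compare the stored estimates side by side. This is a minimal sketch that assumes the seminar's data, with variables loss, hours, gender, and prog, are already in memory. The names shorthand and longform are arbitrary labels chosen for this example.

```stata
* Assumes the seminar's data (loss, hours, gender, prog) are already in memory

* Fit the model with the ## shorthand and store the results
regress loss c.hours##i.gender##i.prog
estimates store shorthand

* Fit the same model with every term written out and store the results
regress loss c.hours i.gender i.prog c.hours#i.gender c.hours#i.prog ///
    i.gender#i.prog c.hours#i.gender#i.prog
estimates store longform

* Show coefficients and standard errors of both fits side by side
estimates table shorthand longform, b(%9.4f) se
```

The two columns of the resulting table are expected to match.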

  4. Assessing the 3-way interaction

     Execute the 3-way interaction model in Stata:

         regress loss c.hours##i.gender##i.prog

     -------------------------------------------------------------------------------------
                    loss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     --------------------+----------------------------------------------------------------
                         |
     gender#prog#c.hours |
            female#swim  |  -10.14613   2.005172    -5.06   0.000    -14.08156   -6.210699
            female#read  |  -3.912888   1.924244    -2.03   0.042    -7.689485   -.1362919
                         |
                   _cons |   2.969748   2.070931     1.43   0.152    -1.094742    7.034238
     -------------------------------------------------------------------------------------

     Both 3-way interaction coefficients have p < 0.05.

     If you would like to conduct a joint (omnibus) test of the 3-way interaction coefficients, use contrast:

     - contrast must follow an estimation command like regress
     - contrast tests linear hypotheses about the coefficients (such as whether they are all zero)

     The following contrast tests whether any of the 3-way interaction coefficients are different from zero. Specify the 3-way interaction term exactly as it appears in the output table:

         contrast gender#prog#c.hours

     -------------------------------------------------------
                         |         df           F        P>F
     --------------------+----------------------------------
     gender#prog#c.hours |          2       13.15     0.0000
                         |
             Denominator |        888
     -------------------------------------------------------
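As a possible cross-check on the omnibus test, the same joint hypothesis can be posed with testparm, a standard Stata postestimation command that runs a Wald test that all coefficients in the listed terms are zero. This is a sketch, assuming the seminar's data are already in memory. In a linear model the result is expected to agree with the contrast output.

```stata
* Assumes the seminar's data (loss, hours, gender, prog) are already in memory
regress loss c.hours##i.gender##i.prog

* Joint test of the 3-way interaction, as on the slide
contrast gender#prog#c.hours

* Wald test that every coefficient in the 3-way term is zero
testparm i.gender#i.prog#c.hours
```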

  5. Plotting a 3-way interaction

     1. Use margins to estimate predicted values across combinations of levels of the 3 variables involved in the interaction, and then use marginsplot to plot those values
     2. Put the variable whose effect is of greatest interest on the x-axis
     3. Use separate lines to represent one moderator
     4. Use separate graphs to represent the other moderator

     Execute the following margins command to estimate expected weight loss for each combination of gender, program, and hours ranging from 1 to 4:

         margins gender#prog, at(hours=(1 2 3 4))

     Then, for marginsplot:

     - Use the x() option (short for xdimension()) to specify which variable is depicted on the x-axis
     - Use the by() option to split the graphs by a specific moderator

     First, try specifying marginsplot alone with no options. The result is one plot with hours on the x-axis and 6 lines, one for each combination of gender and program. That is too many colors to discern, so let's try splitting.

  6. Plotting exercise

     Use the by() option in marginsplot to split the graphs by gender:

         marginsplot, by(gender)

     Now it is easier to see:

     - which lines correspond to which combination of gender and program
     - the 2-way interaction of program and hours within each gender

     Perhaps we are not interested in the effect of hours, but instead want to focus on program differences. Specify a marginsplot command where program (prog) is on the x-axis, and the graphs are still split by gender:

         marginsplot, x(prog) by(gender)

     Now instead focus on gender effects by putting gender on the x-axis and splitting the graphs by prog:

         marginsplot, x(gender) by(prog)

     We can use the byopts() option with specification rows(1) to put all graphs on one row:

         marginsplot, x(gender) by(prog) byopts(rows(1))
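Each marginsplot call normally replaces the previous graph, so it can help to give each plot its own name and keep all of them open for comparison. The sketch below assumes the 3-way interaction model has already been fit on data in memory. The graph names (by_gender, x_prog, x_gender) are arbitrary labels chosen for this example.

```stata
* Assumes the model from the slides has already been fit:
*     regress loss c.hours##i.gender##i.prog
margins gender#prog, at(hours=(1 2 3 4))

* hours on the x-axis, one panel per gender
marginsplot, by(gender) name(by_gender, replace)

* program on the x-axis, one panel per gender
marginsplot, x(prog) by(gender) name(x_prog, replace)

* gender on the x-axis, one panel per program, all panels on one row
marginsplot, x(gender) by(prog) byopts(rows(1)) name(x_gender, replace)
```

Adding the noci option to any of these calls suppresses the confidence intervals, which can make crowded plots easier to read.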

  7. Questions you want to answer with your 3-way interaction model

     Usually, the questions you'll want to answer with a 3-way interaction in the model will fall into two broad categories (and you don't need to answer them all):

     1. Simple effects/slopes analysis: How does the effect of one variable change with combinations of levels of the other two variables?
        A. How does the effect of hours vary across male joggers vs male swimmers vs female joggers vs ...
           i. Is the effect of hours (significantly) different from zero for male joggers? For male swimmers?
           ii. Is the effect of hours different between male joggers and female joggers?
        B. How do gender differences vary when looking at joggers who exercise 1 hour vs joggers who exercise 4 hours vs swimmers who exercise 2 hours?
        C. How do the differences between exercise programs vary across males who exercise 1 hour vs females who exercise 1 hour vs females who exercise 3 hours?
     2. Conditional interaction analysis: How does the interaction of two variables vary with the third variable?
        A. How does the interaction of hours and program vary by gender?
           i. Is the two-way interaction of hours and program (significantly) different from zero for each gender?
        B. How does the interaction of hours and gender vary by program?
        C. How does the interaction of program and gender vary by hours?

  8. Simple effect analysis in Stata

     The margins command will again be our primary tool for simple effects analysis in Stata. You'll need to specify:

     - For which variable you want the simple effects estimated
       - Put the simple effect variable inside dydx()
     - At which levels of the other 2 variables, the moderators, you want the simple effects evaluated
       - Put categorical moderators before the comma
         - This will estimate the simple slope at each level of the categorical moderator
         - If there are two categorical moderators, put them both before the comma with # between them
         - This will estimate simple slopes at each combination of levels of the categorical moderators
       - Use at() to specify levels of continuous moderators at which to evaluate the simple effect

  9. Example simple effects analysis

     Say we want to estimate the simple slope of hours for each combination of gender and program. First, let's orient ourselves with the corresponding graph. Run these 2 commands again:

         margins gender#prog, at(hours=(1 2 3 4))
         marginsplot, by(gender)

     Now, how will we specify the margins command to estimate the slopes of hours depicted in that graph?

     - What goes inside of dydx()?
     - Are gender and prog categorical or continuous?
     - So if we want the slopes of hours at each combination of gender and prog, how do we specify that?

         margins gender#prog, dydx(hours)

     Each of the estimates from that margins command is represented by a line on the graph.
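To see where these simple slopes come from, they can also be assembled by hand from the regression coefficients with lincom. The sketch below rests on an assumption that is not stated in the slides: that gender is coded 1 = male, 2 = female and prog is coded 1 = jog, 2 = swim, 3 = read, with male and jog as base levels. The coeflegend option lists the actual coefficient names, which should be checked before relying on the lines that follow.

```stata
* Assumes the seminar's data are in memory and that gender is coded
* 1 = male, 2 = female and prog is coded 1 = jog, 2 = swim, 3 = read.
* These codes are an assumption; verify them with coeflegend below.
regress loss c.hours##i.gender##i.prog

* Show the name Stata gives each coefficient, for use inside _b[]
regress, coeflegend

* Slope of hours in the reference cell (male joggers under the assumed coding)
lincom _b[hours]

* Slope of hours for female joggers
lincom _b[hours] + _b[2.gender#c.hours]

* Slope of hours for female swimmers
lincom _b[hours] + _b[2.gender#c.hours] + _b[2.prog#c.hours] ///
    + _b[2.gender#2.prog#c.hours]

* All six simple slopes at once, for comparison
margins gender#prog, dydx(hours)
```

In a linear model, each lincom estimate is expected to match the corresponding row of the margins output.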

  10. Comparing simple effects/slopes

     We can conveniently use the margins command to test pairwise differences among the 6 slopes we just estimated. Add the pwcompare(effects) option to estimate differences between estimates from margins. The effects specification requests p-values:

         margins gender#prog, dydx(hours) pwcompare(effects)

     Not all comparisons will necessarily be of interest or even sensible. One interesting non-significant difference is (female#read) vs (male#read): the slope of hours does not differ significantly between females and males in the reading program (p = 0.227).

     -------------------------------------------------------------------------------------------------
                                     |   Contrast  Delta-method    Unadjusted           Unadjusted
                                     |      dy/dx    Std. Err.      t    P>|t|     [95% Conf. Interval]
     --------------------------------+----------------------------------------------------------------
     hours                           |
                         gender#prog |
          (male#swim) vs (male#jog)  |   4.691721   1.400084     3.35   0.001     1.943862     7.43958
          (male#read) vs (male#jog)  |  -8.143013   1.368196    -5.95   0.000    -10.82829   -5.457739
         (female#jog) vs (male#jog)  |   5.449994   1.444471     3.77   0.000     2.615019    8.284969
        (female#swim) vs (male#jog)  |  -.0044117   1.404319    -0.00   0.997    -2.760582    2.751759
        (female#read) vs (male#jog)  |  -6.605907   1.319995    -5.00   0.000    -9.196581   -4.015232
         (male#read) vs (male#swim)  |  -12.83473   1.354277    -9.48   0.000    -15.49269   -10.17678
        (female#jog) vs (male#swim)  |   .7582728   1.431294     0.53   0.596    -2.050841    3.567387
       (female#swim) vs (male#swim)  |  -4.696133   1.390762    -3.38   0.001    -7.425696    -1.96657
       (female#read) vs (male#swim)  |  -11.29763   1.305563    -8.65   0.000    -13.85998   -8.735279
        (female#jog) vs (male#read)  |   13.59301   1.400117     9.71   0.000     10.84508    16.34093
       (female#swim) vs (male#read)  |   8.138601   1.358655     5.99   0.000     5.472052    10.80515
       (female#read) vs (male#read)  |   1.537106   1.271306     1.21   0.227    -.9580094    4.032221
      (female#swim) vs (female#jog)  |  -5.454406   1.435437    -3.80   0.000     -8.27165   -2.637161
      (female#read) vs (female#jog)  |   -12.0559   1.353054    -8.91   0.000    -14.71146   -9.400344
     (female#read) vs (female#swim)  |  -6.601495   1.310103    -5.04   0.000    -9.172755   -4.030235
     -------------------------------------------------------------------------------------------------
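The p-values and confidence intervals in this table are labeled Unadjusted. With 15 pairwise comparisons, it may be worth requesting a multiple-comparison adjustment through the mcompare() option of margins. The sketch below uses the Bonferroni method as one possible choice, not as the seminar's recommendation, and assumes the 3-way interaction model has just been fit.

```stata
* Assumes the model from the slides has just been fit:
*     regress loss c.hours##i.gender##i.prog

* Pairwise differences among the six slopes of hours,
* with Bonferroni-adjusted p-values and confidence intervals
margins gender#prog, dydx(hours) pwcompare(effects) mcompare(bonferroni)
```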

  11. Simple effects exercise

     Now let's focus on the simple effects of gender. First, recreate the appropriate graph where gender is on the x-axis, and the plots are split by program:

         margins gender#prog, at(hours=(1 2 3 4))
         marginsplot, x(gender) by(prog) byopts(rows(1))

     Now use margins to estimate the simple effects of gender, across programs and across hours ranging from 1 to 4:

         margins prog, dydx(gender) at(hours=(1 2 3 4))

     And use margins again to estimate differences between those simple effects:

         margins prog, dydx(gender) at(hours=(1 2 3 4)) pwcompare(effects)

  12. Assessing 2-way interactions in a 3-way interaction model

     A non-zero 3-way interaction coefficient suggests that the interaction of 2 variables is moderated by a third variable. We can thus probe the interaction of those 2 variables across any or all levels of the third variable, the moderator:

     - How does the interaction change as the moderator changes?
     - Is the interaction (significantly) different from zero at different levels of the moderator?

     For example, we can look at how the 2-way interaction of program and hours is moderated by gender:

     - How does the interaction of program and hours differ between the genders?
     - Is the interaction (significantly) different from zero for both genders?

  13. Use graphs to interpret how a 2-way interaction varies with a moderator

     The fact that we have evidence for a 3-way interaction suggests that the 2-way interaction of program and hours is different between the 2 genders.

     - A 2-way interaction is indicated by non-parallel lines
     - A 3-way interaction is indicated by different patterns of lines across levels of the moderator

     We can see this in our 3-way interaction plots:

         margins gender#prog, at(hours=(1 2 3 4))
         marginsplot, by(gender)

  14. Use contrast to test for 2-way interactions across levels of the moderator

     The fact that the lines representing the effect of hours are not parallel across programs within each gender suggests that program and hours interact within each gender.

     We can test for 2-way interactions across levels of a moderator using the contrast command:

     - Specify contrast, then the interacting variables separated by #, then an @, and finally the moderator
     - Put c. in front of continuous interactors; continuous variables cannot be specified after the @ (we cannot look at a 2-way interaction across levels of a continuous moderator using contrast)

     Let's assess the 2-way interaction of program and hours for each gender:

         contrast prog#c.hours@gender

     The resulting output suggests that prog and hours interact for each gender:

     -------------------------------------------------------
                         |         df           F        P>F
     --------------------+----------------------------------
     prog@gender#c.hours |
                   male  |          2       46.33     0.0000
                 female  |          2       40.72     0.0000
                  Joint  |          4       43.53     0.0000
                         |
             Denominator |        888
     -------------------------------------------------------
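The hypotheses behind the per-gender rows of this contrast can also be written out as Wald tests on the coefficients, which shows what is being tested. This is a sketch under an assumed coding that the slides do not state: gender 1 = male, 2 = female and prog 1 = jog, 2 = swim, 3 = read, with male and jog as base levels. Run regress, coeflegend to confirm the coefficient names first.

```stata
* Assumes the seminar's data are in memory and that gender is coded
* 1 = male, 2 = female and prog is coded 1 = jog, 2 = swim, 3 = read
* (an assumption; confirm with: regress, coeflegend)
regress loss c.hours##i.gender##i.prog

* Males (base gender): do the slopes of hours differ across programs?
test (_b[2.prog#c.hours] = 0) (_b[3.prog#c.hours] = 0)

* Females: the same question, adding the 3-way terms to each 2-way term
test (_b[2.prog#c.hours] + _b[2.gender#2.prog#c.hours] = 0) ///
     (_b[3.prog#c.hours] + _b[2.gender#3.prog#c.hours] = 0)

* The same tests in one command
contrast prog#c.hours@gender
```

Under the assumed coding, the two test results are expected to correspond to the male and female rows of the contrast table.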

  15. Assessing 2-way interactions exercise

     Let's assess the 2-way interaction of gender and hours across programs. First, let's recreate a graph to help us interpret our tests:

         margins gender#prog, at(hours=(1 2 3 4))
         marginsplot, x(gender) by(prog) byopts(rows(1))

     Based on the graph, for which programs do you expect to find that gender and hours interact?

     Use a contrast command to test the interaction of gender and hours across programs.

  16. Assessing 2-way interactions exercise

     Let's assess the 2-way interaction of gender and hours across programs. First, let's recreate a graph to help us interpret our tests:

         margins gender#prog, at(hours=(1 2 3 4))
         marginsplot, x(gender) by(prog) byopts(rows(1))

     Based on the graph, for which programs do you expect to find that gender and hours interact?

     Use a contrast command to test the interaction of gender and hours across programs:

         contrast gender#c.hours@prog

     -------------------------------------------------------
                         |         df           F        P>F
     --------------------+----------------------------------
     gender@prog#c.hours |
                    jog  |          1       14.24     0.0002
                   swim  |          1       11.40     0.0008
                   read  |          1        1.46     0.2270
                  Joint  |          3        9.03     0.0000
                         |
             Denominator |        888
     -------------------------------------------------------

     There is not good evidence of an interaction in the read condition.

  17. Caveats for nonlinear models

     These methods of estimating simple effects using margins work differently for non-linear models.

     - margins estimates marginal quantities: population-averaged means and effects
     - We are actually using margins to get conditional rather than marginal estimates, but in linear models (regress, mixed) they are equivalent
     - In non-linear models (e.g. logit/logistic, poisson, etc.) they are not the same, because the outcome undergoes a non-linear transformation through the link function (e.g. logit() or log())
     - You will not, for example, be able to obtain odds ratio estimates out of margins
       - You can use lincom for this, but it requires a bit more understanding of the regression coefficients
     - The margins command can still be used for non-linear models to estimate simple effects and slopes on the scale of the linear predictor (e.g. log odds for logistic regression, log counts/incidence for poisson regression)
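The caveats above can be illustrated with a logistic version of the model. The sketch below is hypothetical throughout: highloss is a binary outcome invented for this example by splitting loss at its median, and the coding gender 1 = male, 2 = female with male and jog as base levels is an assumption. The predict(xb) option asks margins to work on the linear-predictor (log-odds) scale, and the or option of lincom exponentiates a combination of coefficients into an odds ratio.

```stata
* Assumes the seminar's data (loss, hours, gender, prog) are in memory.
* highloss is a hypothetical binary outcome created only for this sketch.
summarize loss, detail
generate byte highloss = loss > r(p50) if !missing(loss)

logit highloss c.hours##i.gender##i.prog

* Simple slopes of hours on the log-odds scale
margins gender#prog, dydx(hours) predict(xb)

* Odds ratio per additional hour for female joggers
* (assumes gender is coded 1 = male, 2 = female; jog is the base program)
lincom _b[hours] + _b[2.gender#c.hours], or
```

Without predict(xb), margins after logit reports effects on the probability scale, which generally depend on the values of all covariates.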
