Overview of DAGs in Causal Inference
Understanding Directed Acyclic Graphs (DAGs) in causal inference is crucial for guiding research questions and analyzing causal relationships. This overview covers the basics of DAGs, their requirements, and applications in analyzing causal assumptions. Dive into the world of DAGs to enhance your research methodologies and interpretations.
Presentation Transcript
To get ready for class: 1. Get births ready as usual. 2. Install package dagitty. [Image: Brad Pitt in the movie Snatch]
DAGs (& the class project) EPID 799C Fall 2017
Overview: Class project chat. Overview / review of DAGs. DAGs in R.
Class Project: Let's review last year's: http://learnr.web.unc.edu/files/2018/10/EPID-799C-Projects-2017.zip (Find at bottom of schedule by the due date)
Project Themes: Use your own data / existing projects. Play with new visuals (extensions to ggplot? Maps?). Replicate a previous analysis (718?). Explore a new technique (multi-level). Use new R tools (purrr, advanced dplyr). It's for you! Invest in yourself! We're just making you do it.
Back to DAGs, Confounding & EMM
From last class (and upcoming HW4). These were all CRUDE effects. Still, they're different! DAGs inform our control set.
What is a DAG? DAGs (Directed Acyclic Graphs) document causal assumptions / knowledge from our head or literature. They can be used to guide us to better answer questions like: Does a change in A prompt a change in B? Or the other way around? Or not in either direction, but their association is caused by some third thing?
DAG Requirements. Directed: ->, not --. Acyclic: A -> B, not A -> B and B -> A. Graph: connected, not dangling*. Other disciplines, from engineering to other forms of statistical models, relax these requirements.
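The acyclic requirement can be checked in software. This is a minimal sketch, assuming the dagitty package from the class prep is installed; the two toy graphs and their node names are made up for illustration.

```r
library(dagitty)

# A valid DAG: every edge is directed and no path loops back on itself.
g_ok <- dagitty("dag {
  A -> B
  B -> C
}")

# Not a DAG: A -> B and B -> A together form a directed cycle.
g_cycle <- dagitty("dag {
  A -> B
  B -> A
}")

isAcyclic(g_ok)     # TRUE when the graph has no directed cycle
isAcyclic(g_cycle)  # FALSE when it does
```

dagitty() will parse the cyclic graph without complaint, so an explicit isAcyclic() check is a cheap sanity test after hand-editing a graph.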
Colliders: Controlling for a collider (a node that two arrows point into) induces a biased, non-causal association between its causes. This is collider stratification bias.
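A small simulation can make collider stratification bias concrete. This is a sketch on made-up data, not the class births data: the variable names and the assumed structure (exposure and outcome independent, both causing the collider) are illustrative only.

```r
set.seed(799)
n <- 10000

# Assumed toy structure: A -> K <- B, with no effect of A on B at all.
exposure_a <- rnorm(n)
outcome_b  <- rnorm(n)
collider_k <- exposure_a + outcome_b + rnorm(n)

# Crude model: the collider path is already blocked, so the estimate is near 0.
crude <- unname(coef(lm(outcome_b ~ exposure_a))["exposure_a"])

# Adjusting for the collider opens the path and induces an association.
adjusted <- unname(coef(lm(outcome_b ~ exposure_a + collider_k))["exposure_a"])

round(c(crude = crude, adjusted = adjusted), 2)
```

Under these assumptions the crude estimate sits close to the true null, while the collider-adjusted estimate is pulled clearly away from it (negative here), even though A has no effect on B.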
EMM: Two notes. Note that an EMM may or may not be a confounder (a confounder influences the values of A and B directly; an EMM modifies the effect of A on B), so it may not be in the DAG node network. EMMs are good for DAG notes, though! Also note that it may be rare for an exposure / intervention to switch direction entirely. Effect measure modification worth acknowledging may be a matter of degree, or important to report because of context, regardless of statistical interaction, p=whatever.
In Sum: Create a model, throw things in, reduce by p-value / backwards selection, or any of a number of techniques. This may be good at predicting the outcome from exposure and other variables. There is nuance here, but...
In Sum: ...if we want the causal effect, we have to be intentional about which parts of this flow we block. We leave direct and indirect causal paths open, and leave alone paths already blocked by colliders (do not control!).
How do we do this? Encode the nodes and directed edges of the DAG from the literature / content knowledge. Then, by eye, hand, or software:
1. Document all paths between Exposure and Outcome.
2. Identify whether they are already blocked (collider), backdoor (confounded), or causal (direct or indirect through a mediator).
3. Select nodes to statistically control, often ideally as few as possible, to block the open backdoor paths without blocking the causal paths.
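The three steps above can be sketched with dagitty. The graph below is a hypothetical toy DAG (E = exposure, O = outcome, C = confounder, M = mediator, K = collider), not the class DAG; it assumes the dagitty package is installed.

```r
library(dagitty)

# Hypothetical toy DAG, one edge per line.
g <- dagitty("dag {
  E [exposure]
  O [outcome]
  C -> E
  C -> O
  E -> M
  M -> O
  E -> K
  O -> K
}")

# Step 1: document all paths between exposure and outcome.
all_paths <- paths(g, from = "E", to = "O")
all_paths$paths

# Step 2: see which paths are open (TRUE) or already blocked (FALSE),
# and which are causal (directed from E to O).
all_paths$open
causal_paths <- paths(g, from = "E", to = "O", directed = TRUE)
causal_paths$paths

# Step 3: minimal set(s) of nodes to control for the total effect.
adj_sets <- adjustmentSets(g, exposure = "E", outcome = "O")
adj_sets
```

In this toy graph the path through K is blocked by the collider, the path through M is causal, and the path through C is the open backdoor path, so controlling for C alone should be sufficient.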
A note on functional form! Remember maternal age? In order to improve precision of estimates (and acknowledge the linear assumptions of GLMs), it behoves us to model covariates as well as possible, balancing parsimony and interpretation. Hence mage2 or splines of some kind. This does not apply as directly to our exposures, which we want to interpret in actionable, communicable ways!
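As a rough illustration of the functional form point, the sketch below compares a linear, a quadratic, and a natural-spline term for maternal age. The data are simulated with an assumed U-shaped risk, and the outcome y is hypothetical; only the variable name mage comes from the slides.

```r
library(splines)  # ships with base R

set.seed(799)
n <- 5000

# Simulated data with an assumed U-shaped relationship (not the births data).
mage <- round(runif(n, min = 15, max = 45))
risk <- plogis(-2 + 0.01 * (mage - 28)^2)
d <- data.frame(y = rbinom(n, size = 1, prob = risk), mage = mage)

# Three ways to model the same covariate.
fit_linear <- glm(y ~ mage,              family = binomial, data = d)
fit_quad   <- glm(y ~ mage + I(mage^2),  family = binomial, data = d)
fit_spline <- glm(y ~ ns(mage, df = 3),  family = binomial, data = d)

# Lower AIC suggests a better fit after penalizing extra parameters.
AIC(fit_linear, fit_quad, fit_spline)
```

When the true relationship is curved, as assumed here, the linear term fits noticeably worse than the quadratic or spline versions. The cost is that the covariate's coefficients become harder to read, which matters less for a covariate than for the exposure.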
...and a note on interpretation! Reminder: the Table 2 fallacy (Westreich & Greenland 2013). If covariate estimates are largely uninterpretable in a causal inference context, we might as well let them go and model them more precisely (albeit obfuscated). Westreich, D., and S. Greenland. The Table 2 Fallacy: Presenting and Interpreting Confounder and Modifier Coefficients. American Journal of Epidemiology 177, no. 4 (February 15, 2013): 292-98. https://doi.org/10.1093/aje/kws412.
DAG critiques
Reality isn't DAGGY: DAGs (even if large) are a model, and so a simplified version of reality, in this particular case requiring unidirectionality and acyclic assumptions. Reality is often a system with feedback loops and inter-relationships that may not be modeled well with unidirectional models, perhaps especially with social / network processes. There are other methods!
Not all nodes / edges are alike: Race-ethnicity in particular is a heavily overloaded construct for causal inference (VanderWeele & Robinson 2014), reaching back to represent historical and current systemic oppression and racism, physical phenotype, and experiences of cultural and ascribed identity. Parts of this construct may have different causal relationships. Break it apart if you can, and regardless, be mindful / nuanced in your interpretation.
Not always interested in causal effects: Prediction, association, and other techniques have a place in a public health toolbox.
VanderWeele, Tyler J., and Whitney R. Robinson. On the Causal Interpretation of Race in Regressions Adjusting for Confounding and Mediating Variables. Epidemiology 25, no. 4 (July 2014): 473-84. doi:10.1097/EDE.0000000000000105.
Our (toy) DAG: Directed Acyclic Graphs (DAGs) inform our variable selection and treatment in models (based on their status as mediators, confounders, effect measure modifiers, etc.). We will not elaborate in this class! Take the Epi sequence for more. DAG from EPID 716 / Christy Avery
Let's Try: Dagitty. Check it out here: www.dagitty.net. Premade DAG for you: dagitty.net/moAh6a6
DAGs in R The online dagitty tool (http://www.dagitty.net/dags.html#) exports R code for the dagitty package. Can be directly downloaded into R! The new ggdag package is a beefed-up version using tidy data structures we can recognize. DL the R script from the website
Let's Try: DAGs in R with dagitty. dagitty(): makes DAGs. adjustmentSets(): minimal / all adjustment sets. paths(): paths! children(), etc. downloadGraph(): design in dagitty, pull down in R. instrumentalVariables(), SEM stuff, testing your data against a DAG, etc.: currently beyond me!
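A short tour of a few of these functions, as a sketch on a made-up graph (Z, X, Y, U are hypothetical node names) and assuming dagitty is installed. The downloadGraph() call is left commented out because it needs a network connection; the URL is the premade DAG mentioned earlier in the slides.

```r
library(dagitty)

# Hypothetical toy graph: Z affects X; U confounds X and Y.
g <- dagitty("dag {
  Z -> X
  X -> Y
  U -> X
  U -> Y
}")

# Walk the graph structure.
kids    <- children(g, "X")   # nodes X points into
folks   <- parents(g, "Y")    # nodes pointing into Y
elders  <- ancestors(g, "Y")  # Y plus everything upstream of it

# Independencies the DAG implies, which could be tested against data.
implied <- impliedConditionalIndependencies(g)
implied

# Candidate instruments for the effect of X on Y.
ivs <- instrumentalVariables(g, exposure = "X", outcome = "Y")
ivs

# Pull a graph designed on the dagitty website into R (needs internet):
# g_web <- downloadGraph("dagitty.net/moAh6a6")
```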
Let's Try: DAGs in R with ggdag. tidy_dagitty(), then ggdag(dag_df) + theme_dag() gets you nice ggplot2 geometries; edit the plot as usual. See: https://cran.r-project.org/web/packages/ggdag/vignettes/intro-to-ggdag.html And: https://github.com/malcolmbarrett/ggdag
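The ggdag workflow named above might look like the sketch below. It assumes dagitty, ggdag, and ggplot2 are installed; the graph and the plot title are made up for illustration, and the default layout is random unless a seed is supplied.

```r
library(dagitty)
library(ggdag)
library(ggplot2)

# Hypothetical toy graph (same style of syntax the dagitty website exports).
g <- dagitty("dag {
  C -> E
  C -> O
  E -> O
}")

# Convert to a tidy structure; the seed only fixes the node layout.
dag_df <- tidy_dagitty(g, seed = 799)

# Plot, then edit like any other ggplot.
dag_plot <- ggdag(dag_df) +
  theme_dag() +
  labs(title = "Toy DAG (hypothetical)")

dag_plot
```

Because the result is an ordinary ggplot object, it can be saved with ggsave() or dropped into an R Markdown chunk like any other figure.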
Practical Uses: R may be helpful for quickly changing and rerunning for minimal sets across a few different DAG scenarios. Probably better than by hand. Nice pair with the dagitty website. Maybe useful for SEM? I dunno. But you can stick your DAGs in papers / R Markdown, make ggplots of them, etc.
Other Packages: If you do this a lot, find your favorite! R skills let you translate data structures across packages as you need to. dagitty and ggdag: Today! DiagrammeR: Prettier, but no adjustment sets? https://donlelek.github.io/2015-03-31-dags-with-r/ dagR: Prettier and adjustment sets, but funny syntax? http://rstudio-pubs-static.s3.amazonaws.com/2609_e3d86d0748c04eb18d5f56d6a99feb3f.html