Coalescence Times at Two Loci under Markovian Coalescent Models
This presentation discusses coalescence times at two loci using Markovian coalescent approximations and pedigree models. The speaker, Shai Carmi from The Hebrew University of Jerusalem, presents joint work with other researchers, focusing on the ARG, SMC, and the effect of shared pedigree on estimators. The two-locus model is explored for its simplicity in studying drift and recombination, with applications in LD measurement, demographic inference, and mutation/recombination rate estimation.
Presentation Transcript
Coalescence Times at Two Loci under Markovian Coalescent Approximations and Pedigree Models. Shai Carmi, The Hebrew University of Jerusalem. Oxford, September 2016. Joint work with: Peter Wilton, Léandra King, John Wakeley (Harvard); Asger Hobolth (Aarhus).
Outline. (1) Background: two-locus models, the ARG, SMC/SMC'. (2) A two-locus model for SMC' and model comparison. (3) The effect of the shared pedigree on estimators of $\theta$.
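Model: Two Loci, Two Sequences. Two sequences (Sequence 1, Sequence 2) are sampled at the present and followed at two loci (Locus 1, Locus 2) separated by genetic distance $\rho = 4N_e r$. Time is measured in generations scaled by $2N_e$. The time to the most recent common ancestor (TMRCA), i.e. the coalescence time, is $t_1$ at locus 1 and $t_2$ at locus 2. [Figure: the genealogies of the two sequences at each locus, drawn backwards in time from the present.]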
Why Use the Two-Locus Model? It is the simplest model for drift + recombination, yet the details can be complex and the behavior is non-trivial. End goal: the joint PDF $f(t_1,t_2 \mid \rho)$, from which follow the conditional PDF $f(t_2 \mid t_1)$, $\mathrm{Cov}(t_1,t_2)$, and $\mathrm{Prob}(t_1=t_2)$. Useful for: the $r^2$ measure of LD (McVean, 2002); demographic inference (PSMC and variants, Harris & Nielsen, 2013, ...); mutation/recombination rate estimation.
The Ancestral Recombination Graph (ARG). As a Markov process backwards in time (Griffiths, 1981; Hudson, 1983; Simonsen and Churchill, 1997; Wakeley, 2009; Hobolth and Jensen, 2014), the ARG gives $\mathrm{Cov}(t_1,t_2) = \frac{\rho+18}{\rho^2+13\rho+18}$. Along the sequence of a continuous chromosome (Wiuf and Hein, 1999), each recombination event breaks the genealogy, and the new genealogy depends on all previous ones: complicated. [Figure: the two views of the ARG, each with its starting point marked "Start here".]
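The Sequentially Markov Coalescent (SMC). McVean and Cardin, 2005 (also Li and Durbin, 2011, and others). Naturally defined along the sequence: $t_1 \sim \mathrm{Exp}(1)$; $t_{i+1} \sim q_{\mathrm{SMC}}(t_{i+1} \mid t_i)$, with $t_{i+1} \neq t_i$. [Figure: TMRCA versus coordinate along the sequence, a step function that changes at every recombination event and takes the distinct values $t_1, \dots, t_8$.]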
The SMC'. Marjoram and Wall, 2006. Also in: Eriksson et al., 2009; Harris and Nielsen, 2013; Carmi et al., 2014; Zheng et al., 2014; Schiffels and Durbin, 2014; Palacios et al., 2015; and various simulators. $t_1 \sim \mathrm{Exp}(1)$; $t_{i+1} \sim q_{\mathrm{SMC'}}(t_{i+1} \mid t_i)$. [Figure: TMRCA versus coordinate along the sequence; some recombination events leave the TMRCA unchanged, e.g. $t_1=t_2$ and $t_5=t_6=t_7$.]
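The difference between the two transition rules can be sketched in a few lines of Python. This is a minimal illustration for two sequences with time in coalescent units, not the speaker's code; the function name `next_tmrca` and its arguments are hypothetical. It assumes the standard descriptions of the models: the recombination point is uniform on a branch of height s; under the SMC the floating lineage can only join the other lineage (rate 1), while under the SMC' it may also rejoin the branch it left (total rate 2 below s), which leaves the TMRCA unchanged.

```python
import random

def next_tmrca(s, rng, prime):
    """Sample the TMRCA to the right of a recombination event, given TMRCA s to its left."""
    u = rng.uniform(0.0, s)  # time of the recombination event on the broken branch
    if not prime:
        # SMC: the old branch above u is discarded, so the floating lineage
        # coalesces with the single remaining lineage at rate 1.
        return u + rng.expovariate(1.0)
    # SMC': below s the floating lineage sees two branches (total rate 2).
    v = u + rng.expovariate(2.0)
    if v < s:
        # Half of these events rejoin the branch it left: the TMRCA is unchanged.
        return s if rng.random() < 0.5 else v
    # Otherwise it coalesces above s, where only one lineage remains (rate 1).
    return s + rng.expovariate(1.0)

rng = random.Random(1)
print([round(next_tmrca(1.0, rng, prime=True), 3) for _ in range(5)])
print([round(next_tmrca(1.0, rng, prime=False), 3) for _ in range(5)])
```

In this sketch only the SMC' branch can return exactly `s`, which corresponds to the runs of equal TMRCAs in the SMC' figure.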
What About Two Fixed Loci? SMC is easy: $\mathrm{Cov}(t_1,t_2) = \frac{1}{1+\rho}$. Open questions: a two-locus model for SMC'? The joint PDF $f(t_1,t_2)$? Comparison to ARG/SMC?
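As a quick numerical comparison, the two closed-form covariances quoted in the talk, $(\rho+18)/(\rho^2+13\rho+18)$ for the ARG and $1/(1+\rho)$ for the SMC, can be evaluated side by side. The function names below are illustrative only; exact fractions are used so the values can be read off without rounding.

```python
from fractions import Fraction

def cov_arg(rho):
    """Covariance of the two coalescence times under the ARG."""
    return (rho + 18) / (rho**2 + 13 * rho + 18)

def cov_smc(rho):
    """Covariance of the two coalescence times under the SMC."""
    return 1 / (1 + rho)

for rho in (Fraction(0), Fraction(1), Fraction(10)):
    print(rho, cov_arg(rho), cov_smc(rho))
```

Both expressions equal 1 at $\rho = 0$, and since $(\rho+18)(1+\rho) \ge \rho^2+13\rho+18$ for $\rho \ge 0$, the SMC value never exceeds the ARG value.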
Model Rules Comparison. [Figure: diagrams comparing the coalescence rules of the ARG, SMC, and SMC', distinguishing ancestral from non-ancestral material.]
A Two-Locus Model for SMC' (Peter Wilton). States $R_j$: $j$ recombination events. State $I$: independent loci. States $C_L$, $C_R$: single-locus coalescence. State $C_B$: two-locus coalescence.
Joint Density of Coalescence Times. The Kolmogorov equations were used to derive a closed-form $f(t_1,t_2)$.
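Covariance of Coalescence Times. SMC': a closed-form expression for $\mathrm{Cov}(t_1,t_2)$ as a function of $\rho$ [formula not recoverable from the extracted slide text]. Theorem: for every two-locus model for which $t \sim \mathrm{Exp}(1)$, $\mathrm{Cov}(t_1,t_2) = P(t_1 = t_2)$. See also Eriksson et al., 2009.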
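The identity between the covariance and the probability of equal coalescence times can be checked by simulation in the simplest case, the SMC, where both should be close to $1/(1+\rho)$. The sketch below is an illustration under stated assumptions, not the authors' code: it walks along the sequence from locus 1 to locus 2, assumes recombination breakpoints arrive at rate $t$ per unit of $\rho$ for a genealogy of height $t$, and applies the SMC rule (uniform breakpoint, then coalescence at rate 1) at each breakpoint. The names `two_locus_smc` and `summarize` are hypothetical.

```python
import random

def two_locus_smc(rho, rng):
    """Sample (t1, t2) at two loci a scaled distance rho apart under the SMC."""
    t1 = rng.expovariate(1.0)
    t, pos = t1, 0.0
    while True:
        # Distance to the next breakpoint: rate t per unit of rho
        # (two branches of length t, each recombining at rate rho/2 per unit time).
        pos += rng.expovariate(t)
        if pos >= rho:
            return t1, t
        u = rng.uniform(0.0, t)       # height of the breakpoint on the tree
        t = u + rng.expovariate(1.0)  # SMC: re-coalesce with the other lineage

def summarize(rho, n, seed=0):
    """Monte Carlo estimates of Cov(t1, t2) and P(t1 == t2)."""
    rng = random.Random(seed)
    pairs = [two_locus_smc(rho, rng) for _ in range(n)]
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    p_equal = sum(a == b for a, b in pairs) / n
    return cov, p_equal

print(summarize(rho=1.0, n=50000))
```

With $\rho = 1$ both estimates should land near 0.5, up to Monte Carlo noise.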
SMC' and ARG: Deeper Connections. Theorem: the joint density of coalescence times ($s$ and $t$) on either side of a recombination site is the same under SMC' and the ARG. It is given by (for $s \neq t$) $f(s,t \mid s \neq t) = \frac{3}{4}\left(1-e^{-2s}\right)e^{-t}$ for $s < t$, and $\frac{3}{4}\left(1-e^{-2t}\right)e^{-s}$ for $s > t$. Corollary: SMC' is the canonical first-order Markov approximation to the ARG (see paper for definitions).
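As a consistency check on the theorem's density, read here as $f(s,t \mid s \neq t) = \tfrac{3}{4}(1-e^{-2s})e^{-t}$ for $s<t$ and the mirror image for $s>t$, one can verify that it integrates to one. The short calculation below is a worked check under that reading, not part of the original slides.

```latex
\begin{align*}
\int_0^\infty \int_s^\infty \tfrac{3}{4}\,\bigl(1-e^{-2s}\bigr)\,e^{-t}\,dt\,ds
  &= \tfrac{3}{4}\int_0^\infty \bigl(1-e^{-2s}\bigr)\,e^{-s}\,ds
   = \tfrac{3}{4}\Bigl(1-\tfrac{1}{3}\Bigr) = \tfrac{1}{2}, \\
\int_0^\infty \int_t^\infty \tfrac{3}{4}\,\bigl(1-e^{-2t}\bigr)\,e^{-s}\,ds\,dt
  &= \tfrac{1}{2} \quad \text{(by the symmetry } s \leftrightarrow t\text{)}, \\
\text{so that}\quad \iint_{s \neq t} f(s,t \mid s \neq t)\,ds\,dt &= \tfrac{1}{2} + \tfrac{1}{2} = 1 .
\end{align*}
```

The factor $\tfrac{3}{4}$ is therefore exactly the normalizing constant of the conditional density.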
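Implications to Demographic Inference. Generate genealogies under the ARG with a given $N$, and assume the list of distinct TMRCAs is known. Define $\hat{N}_{\mathrm{SMC}} = \arg\max_N \prod_i q_{\mathrm{SMC}}(t_{i+1} \mid t_i; N)$ and $\hat{N}_{\mathrm{SMC'}} = \arg\max_N \prod_i q_{\mathrm{SMC'}}(t_{i+1} \mid t_i; N)$. Theorem: $\hat{N}_{\mathrm{SMC}} \approx 0.95N$. Numerically: $\hat{N}_{\mathrm{SMC'}} \approx N$.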
Is the ARG a Good Model? In reality: individuals are diploid and can be male or female; the family tree is shared across all loci, even unlinked ones.
The Effect of the Shared Pedigree on Tajima's Estimator. Consider two individuals, $n$ unlinked loci, and $\theta = 4N\mu$. Tajima's/Watterson's estimator: $\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} \pi_i$, where $\pi_i$ is the number of sequence differences at locus $i$. Theorem: for $n \to \infty$, $\mathrm{Var}(\hat{\theta}) \approx \frac{\theta^2}{12N}$. The non-zero variance results from all gene genealogies sharing the same pedigree. [Figure: $\mathrm{Var}(\hat{\theta})/\theta^2$ versus $N$.]
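For contrast with the pedigree result, the estimator can be sketched under the standard assumption of fully independent genealogies, which is not the shared-pedigree model of the talk. In this baseline, each locus has a coalescence time drawn from Exp(1) and mutations accumulate at rate $\theta$ along it, so the number of differences has mean $\theta$ and variance $\theta + \theta^2$, and the variance of the average over $n$ loci, $(\theta+\theta^2)/n$, vanishes as $n$ grows. The function names and the symbol $\pi_i$ for per-locus differences are illustrative assumptions.

```python
import random

def pairwise_differences(theta, rng):
    """Differences between two sequences at one locus with an independent genealogy."""
    t = rng.expovariate(1.0)           # coalescence time in units of 2N generations
    count, clock = 0, rng.expovariate(theta)
    while clock < t:                   # mutations arrive at rate theta until time t
        count += 1
        clock += rng.expovariate(theta)
    return count

def theta_hat(diffs):
    """Average number of pairwise differences across loci."""
    return sum(diffs) / len(diffs)

rng = random.Random(0)
theta, n_loci = 2.0, 20000
diffs = [pairwise_differences(theta, rng) for _ in range(n_loci)]
print(theta_hat(diffs))
```

Under this independent-loci baseline the estimate concentrates around $\theta$ as the number of loci increases; the slide's theorem says that with a shared pedigree a small residual variance remains.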
Two-Locus Models Inspired by Real Genealogies (Léandra King). We modified the Wright-Fisher model to imitate real human (single-generation) genealogical patterns. The variance of $\hat{\theta}$ is higher for the human-inspired models. [Figure: $\mathrm{Var}(\hat{\theta})/\theta^2$ versus $2N$.]
Acknowledgements. Harvard University: Peter Wilton, Léandra King, John Wakeley. Aarhus University: Asger Hobolth. Columbia University: Itsik Pe'er. Publications: P. R. Wilton, S. Carmi*, and A. Hobolth*. The SMC' is a highly accurate approximation to the ancestral recombination graph. Genetics 200, 343 (2015). L. King, J. Wakeley, and S. Carmi. A non-zero variance of Tajima's estimator for two sequences even for infinitely many unlinked loci (bioRxiv 069989). Thank you for your attention!