Advancements in Statistical Genomics: FarmCPU and Method Development


Exploring the evolution of statistical genomics techniques, this lecture delves into the history of FarmCPU and BLINK, addressing challenges in GWAS and the development of models like PC+SNP+e and PC+Kinship+e. It also covers popular software packages in the field and the importance of moving beyond traditional tools like PLINK.


Uploaded on Oct 04, 2024



Presentation Transcript


  1. Statistical Genomics Lecture 21: FarmCPU. Zhiwu Zhang, Washington State University

  2. Outline: History of method and software development; FarmCPU; BLINK

  3. Models (diagram flattened in extraction). Equations shown: y = PC + SNP + e; y = PC + Kinship + e; y = PC + QTNs + e; y = PC + Kinship + SNP + e. Diagram labels: "QTNs +" (shown twice); BLINK: -2LL; FarmCPU: -2LL; QTNs

  4. Problems in GWAS: (a) Computing difficulties: millions of markers, individuals, and traits. (b) False positives, e.g., Amgen scientists tried to replicate 53 high-profile cancer research findings but could replicate only 6 (Nature, 2012, 483: 531). (c) False negatives.

  5. GWAS Stream (methods shown in the figure): Q, PC, PC+K, EMMA, EMMAX, Q+K, SELECT, MLMM, CMLM, P3D, GCTA, ECMLM, FaST-LMM, GEMMA, FarmCPU, GenABEL, BLINK

  6. Figure: methods compared by computing speed and by power | type I error, with the labels "Speed improvement" and "Power improvement". Methods shown: t test, GLM, MLM, EMMA, P3D/EMMAX, CMLM, ECMLM, SUPER, MLMM, GEMMA, FaST-LMM, GenABEL, Select

  7. Usage of Software Packages

     Software  Leading authors                         Corresponding authors                   Language  Released  Citations
     PUMA      Gabriel E. Hoffman                      Jason G. Mezey                          C++       2013      27
     TATES     Sophie van der Sluis                    Sophie van der Sluis                    Fortran   2013      76
     GAPIT     Lipka AE                                Zhang Z                                 R         2012      284
     MLMM      Vincent S                               Nordborg M                              R/python  2012      226
     GEMMA     Zhou X                                  Stephens M                              C++       2012      445
     FastLMM   Christoph L, Listgarten J, Heckerman D  Christoph L, Listgarten J, Heckerman D  C++       2011      348
     Qxpak     M. Pérez-Enciso                         M. Pérez-Enciso                         Fortran   2004      141
     EMMAX     Kang HM                                 Sabatti C & Eskin E                     C++       2010      813
     GCTA      Jian Y                                  Jian Y                                  C++       2011      1338
     GenABEL   Aulchenko YS                            Aulchenko YS                            R         2007      990
     TASSEL    Bradbury, Zhang, and Kroon              Bradbury PJ                             Java      2006      1596
     PLINK     Purcell S                               Purcell S                               C++       2007      12111

     Slide annotation: 65%
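The 65% figure on slide 7 is not labeled in the transcript. One plausible reading, offered here as an assumption rather than something the slide states, is that it is PLINK's share of the citations summed over the twelve listed packages. The short check below uses only the citation counts from the slide.

```python
# Citation counts copied from slide 7 (software -> citations).
citations = {
    "PUMA": 27, "TATES": 76, "GAPIT": 284, "MLMM": 226,
    "GEMMA": 445, "FastLMM": 348, "Qxpak": 141, "EMMAX": 813,
    "GCTA": 1338, "GenABEL": 990, "TASSEL": 1596, "PLINK": 12111,
}

total = sum(citations.values())
# Assumption: the slide's "65%" refers to PLINK's share of this total.
plink_share = citations["PLINK"] / total

print(f"total citations = {total}")         # total citations = 18395
print(f"PLINK share = {plink_share:.1%}")   # PLINK share = 65.8%
```

Under that reading, PLINK alone accounts for roughly two thirds of the listed citations, which is in line with the slide's annotation.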

  8. Why do human geneticists not go beyond PLINK?

  9. MLM was more enriched for flowering time genes

  10. Model Development. Notation: Si: testing marker; Q: population structure; K: kinship; S: pseudo QTNs. Diagram labels: adjustment on marker; adjustment on covariates

  11. SUPER algorithm (diagram): y = PC + SNP + e; y = PC + Kinship + e; bins; -2LL; QTNs; y = PC + Kinship + SNP + e

  12. FarmCPU algorithm (diagram): y = PC + SNP + e; y = PC + Kinship + e; bins; -2LL; QTNs; y = PC + QTNs + SNP + e

  13. Figure: methods compared by computing speed and by power | type I error, with the labels "Speed improvement" and "Power improvement". Methods shown: t test, GLM, MLM, EMMA, P3D/EMMAX, CMLM, ECMLM, SUPER, MLMM, GEMMA, FaST-LMM, GenABEL, Select, FarmCPU, BLINK

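  14. FARM-CPU (Fixed And Random Model Circuitous Probability Unification). Fixed model: y = M1 + ... + Mt + mi + e, where M1, ..., Mt are pseudo QTNs and mi is the testing marker. P-value table (layout lost in extraction): rows are the testing markers m1, ..., mj, mk, ..., ml; columns hold the P values of the testing SNP (p1, ..., pl, with NA entries) and of each pseudo QTN M1, M2, ..., Mt in every test (P11, ..., P1l; P21, ..., P2l; Pt1, ..., Ptl), summarized as P1, P2, ..., Pt. Steps labeled on the diagram: Substitution; Optimization. Random model: y = u + e, with Var(u) obtained from SVD(M)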
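The fixed-model step on slide 14 can be sketched as an ordinary least-squares scan in which the pseudo QTNs enter as covariates. This is a minimal illustration and not the FarmCPU implementation: the function names, the use of principal components as the only other covariates, and the normal approximation to the t distribution are all assumptions made for brevity. The substitution rule is also an assumption about what the slide's "Substitution" label means: a marker that is itself a pseudo QTN cannot be tested (NA), so it receives the most significant P value it obtained as a covariate across all tests.

```python
import math
import numpy as np

def two_sided_p(t_stat):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

def fixed_model_scan(y, covariates, genotypes, qtn_idx):
    """Scan markers with y = covariates + pseudo QTNs + marker + e.

    y: (n,) phenotype; covariates: (n, c), e.g. PCs; genotypes: (n, m);
    qtn_idx: columns of `genotypes` used as pseudo QTNs (hypothetical input).
    Returns one p-value per marker.
    """
    n, m = genotypes.shape
    qtn_idx = list(qtn_idx)
    base = np.column_stack([np.ones(n), covariates, genotypes[:, qtn_idx]])
    first_qtn_col = base.shape[1] - len(qtn_idx)
    p_marker = np.full(m, np.nan)
    # Most significant p-value of each pseudo QTN across all marker tests.
    p_qtn_min = np.ones(len(qtn_idx))
    for j in range(m):
        if j in qtn_idx:
            continue  # already in the model as a covariate; stays NA for now
        X = np.column_stack([base, genotypes[:, j]])
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        se = np.sqrt(sigma2 * np.diag(np.linalg.pinv(X.T @ X)))
        t = beta / se
        p_marker[j] = two_sided_p(t[-1])
        for k in range(len(qtn_idx)):
            p_qtn_min[k] = min(p_qtn_min[k], two_sided_p(t[first_qtn_col + k]))
    # Substitution (assumed rule): fill each NA with the pseudo QTN's best p-value.
    for k, j in enumerate(qtn_idx):
        p_marker[j] = p_qtn_min[k]
    return p_marker

# Simulated example: marker 3 is causal and is also supplied as the pseudo QTN.
rng = np.random.default_rng(0)
n, m = 200, 20
G = rng.integers(0, 3, size=(n, m)).astype(float)
PC = rng.normal(size=(n, 2))
y = 1.0 * G[:, 3] + rng.normal(size=n)
p = fixed_model_scan(y, PC, G, [3])
```

In this toy run the pseudo QTN at index 3 ends up with a very small P value after substitution instead of NA, while the remaining markers are tested with its effect already absorbed by the model. The random-model step that chooses the pseudo QTNs is not shown.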

  15. Re-analysis of Arabidopsis data (Xiaolei Liu)

  16. Flowering time genes enriched

  17. Associations on flowering time

  18. It is time for human geneticists to move forward

  19. Substitution makes a difference

  20. Converge fast

  21. FarmCPU is computationally efficient: testing 60K SNPs

  22. Half a million individuals, half a million SNPs: three days. But the new version of PLINK is faster

  23. Summary: History of method and software development; FarmCPU
