Exome sequence analysis identifies rare coding variants associated with a machine learning-based marker for coronary artery disease – Nature Genetics
Roth, G. A. et al. Global burden of cardiovascular diseases and risk factors, 1990–2019. J. Am. Coll. Cardiol. 76, 2982–3021 (2020).
Khera, A. V. & Kathiresan, S. Genetics of coronary artery disease: discovery, biology and clinical translation. Nat. Rev. Genet. 18, 331–344 (2017).
Chen, Z. & Schunkert, H. Genetics of coronary artery disease in the post-GWAS era. J. Intern. Med. 290, 980–992 (2021).
Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet. 54, 1803–1815 (2022).
Tcheandjieu, C. et al. Large-scale genome-wide association study of coronary artery disease in genetically diverse populations. Nat. Med. 28, 1679–1692 (2022).
Plenge, R. M., Scolnick, E. M. & Altshuler, D. Validating therapeutic targets through human genetics. Nat. Rev. Drug Discov. 12, 581–594 (2013).
Plenge, R. M. Disciplined approach to drug discovery and early development. Sci. Transl. Med. 8, 349ps15 (2016).
Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).
Do, R. et al. Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction. Nature 518, 102–106 (2015).
Yao, K. et al. Exome sequencing identifies rare mutations of LDLR and QTRT1 conferring risk for early-onset coronary artery disease in Chinese. Natl Sci. Rev. 9, nwac102 (2022).
Khera, A. V. et al. Gene sequencing identifies perturbation in nitric oxide signaling as a nonlipid molecular subtype of coronary artery disease. Circ. Genom. Precis. Med. 15, e003598 (2022).
Martin, S. S. et al. 2024 heart disease and stroke statistics: a report of US and global data from the American Heart Association. Circulation 149, e347–e913 (2024).
Maddox, T. M. et al. Nonobstructive coronary artery disease and risk of myocardial infarction. JAMA 312, 1754–1763 (2014).
Park, D. W. et al. Extent, location, and clinical significance of non-infarct-related coronary artery disease among patients with ST-elevation myocardial infarction. JAMA 312, 2019–2027 (2014).
Forrest, I. S. et al. Machine learning-based marker for coronary artery disease: derivation and validation in two longitudinal cohorts. Lancet 401, 215–225 (2023).
Petrazzini, B. O. et al. Coronary risk estimation based on clinical data in electronic health records. J. Am. Coll. Cardiol. 79, 1155–1166 (2022).
Mbatchou, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat. Genet. 53, 1097–1103 (2021).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Sveinbjornsson, G. et al. Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nat. Genet. 48, 314–317 (2016).
Zhou, W. et al. Efficiently controlling for case–control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
Nikpay, M. et al. A comprehensive 1,000 genomes–based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
Tarugi, P. et al. Molecular diagnosis of hypobetalipoproteinemia: an ENID review. Atherosclerosis 195, e19–e27 (2007).
Ference, B. A. et al. Variation in PCSK9 and HMGCR and risk of cardiovascular disease and diabetes. N. Engl. J. Med. 375, 2144–2153 (2016).
Schmidt, A. F. et al. PCSK9 genetic variants and risk of type 2 diabetes: a mendelian randomisation study. Lancet Diabetes Endocrinol. 5, 97–105 (2017).
Lotta, L. A. et al. Association between low-density lipoprotein cholesterol–lowering genetic variants and risk of type 2 diabetes: a meta-analysis. JAMA 316, 1383–1391 (2016).
Benn, M., Nordestgaard, B. G., Grande, P., Schnohr, P. & Tybjærg-Hansen, A. PCSK9 R46L, low-density lipoprotein cholesterol levels, and risk of ischemic heart disease: 3 independent studies and meta-analyses. J. Am. Coll. Cardiol. 55, 2833–2842 (2010).
Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).
Thomas, D. G., Wei, Y. & Tall, A. R. Lipid and metabolic syndrome traits in coronary artery disease: a Mendelian randomization study. J. Lipid Res. 62, 100044 (2021).
Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Schrodi, S. J. The impact of diagnostic code misclassification on optimizing the experimental design of genetic association studies. J. Healthc. Eng. 2017, 7653071 (2017).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Klarin, D. et al. Genetic analysis in UK Biobank links insulin resistance and transendothelial migration pathways to coronary artery disease. Nat. Genet. 49, 1392–1397 (2017).
Honigberg, M. C. et al. Premature menopause, clonal hematopoiesis, and coronary artery disease in postmenopausal women. Circulation 143, 410–423 (2021).
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).
Kursa, M. B. & Rudnicki, W. R. Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13 (2010).
Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380, 1347–1358 (2019).
Liaw, A. & Wiener, M. Classification and regression by randomForest. R. N. 2, 18–22 (2002).
Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26 (2008).
Grün, B., Kosmidis, I. & Zeileis, A. Extended beta regression in R: shaken, stirred, mixed, and partitioned. J. Stat. Softw. 48, 1–25 (2012).
McCaw, Z. R., Lane, J. M., Saxena, R., Redline, S. & Lin, X. Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics 76, 1262–1272 (2020).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Ng, P. C. & Henikoff, S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814 (2003).
Chun, S. & Fay, J. C. Identification of deleterious mutations within three human genomes. Genome Res. 19, 1553–1561 (2009).
Schwarz, J. M., Cooper, D. N., Schuelke, M. & Seelow, D. MutationTaster2: mutation prediction for the deep-sequencing age. Nat. Methods 11, 361–362 (2014).
Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).
Liu, Y. et al. ACAT: a fast and powerful P value combination method for rare-variant analysis in sequencing studies. Am. J. Hum. Genet. 104, 410–421 (2019).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
Mendez, D. et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940 (2019).
Online Mendelian Inheritance in Man, OMIM. McKusick-Nathans Institute of Genetic Medicine. (Johns Hopkins University, 2022); https://omim.org/
R Core Team. R: a language and environment for statistical computing. (R Foundation for Statistical Computing, 2019); https://www.r-project.org/
Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 12, 77 (2011).
Petrazzini, B. O. et al. Exome sequence analysis identifies rare coding variants associated with a machine learning-based marker for coronary artery disease. Zenodo https://doi.org/10.5281/zenodo.11086022 (2024).