@article{4a84513b5e7f465099d3d083fd60e8ea,
title = "Accurate assignment of disease liability to genetic variants using only population data",
abstract = "Purpose: The growing size of public variant repositories prompted us to test the accuracy of pathogenicity prediction of DNA variants using population data alone. Methods: Under the a priori assumption that the ratio of the prevalence of variants in healthy population vs that in affected populations form 2 distinct distributions (pathogenic and benign), we used a Bayesian method to assign probability to a variant belonging to either distribution. Results: The approach, termed Bayesian prevalence ratio (BayPR), accurately parsed 300 of 313 expertly curated CFTR variants: 284 of 296 pathogenic/likely pathogenic variants in 1 distribution and 16 of 17 benign/likely benign variants in another. BayPR produced an area under the receiver operating characteristic curve of 0.99 for 103 functionally confirmed missense CFTR variants, which is equal to or exceeds 10 commonly used algorithms (area under the receiver operating characteristic curve range = 0.54-0.99). Application of BayPR to expertly curated variants in 8 genes associated with 7 Mendelian conditions led to the assignment of a disease-causing probability of ≥80% to 1350 of 1374 (98.3%) pathogenic/likely pathogenic variants and of ≤20% to 22 of 23 (95.7%) benign/likely benign variants. Conclusion: Irrespective of the variant type or functional effect, the BayPR approach provides probabilities of pathogenicity for DNA variants responsible for Mendelian disorders using only the variant counts in affected and unaffected population samples.",
keywords = "Bayesian analysis, Population frequency, Prevalence ratio, Variant classification, Variant interpretation",
author = "Collaco, {Joseph M.} and Raraigh, {Karen S.} and Joshua Betz and Aksit, {Melis Atalar} and Nenad Blau and Jordan Brown and Dietz, {Harry C.} and Gretchen MacCarrick and Nogee, {Lawrence M.} and Sheridan, {Molly B.} and Vernon, {Hilary J.} and Beaty, {Terri H.} and Louis, {Thomas A.} and Cutting, {Garry R.}",
note = "Funding Information: This research was funded by a National Institutes of Health grant to J.M.C. under grant number R01 HL128475. This study was part of the RD-Connect initiative and was supported by an FP7-HEALTH-2012-INNOVATION-1 EU Grant to N.B. under grant number 305444. L.M.N. was supported by grants from the National Institutes of Health, American Thoracic Society, and Eudowood Foundation. G.R.C. was supported by grants from the National Institute of Diabetes and Digestive and Kidney Diseases (grant number R01 DK44003) and the Cystic Fibrosis Foundation under grant number CUTTIN13A1. The funders had no role in the conceptualization, design, data collection, analysis, decision to publish, or preparation of the manuscript. The rest of the authors report no conflicts of interest. Funding Information: This research would not have been possible without the generous data contributions of patients and healthy subjects to the Clinical and Functional Translation of CFTR project, the BIOPKU, and Genome Aggregation Database data sets and the Human TAFAZZIN Gene Variants Database. Conceptualization: J.M.C., T.H.B., T.A.L., G.R.C.; Data Curation: K.S.R., N.B., J.Br., H.C.D., G.M., L.M.N., M.B.S., H.J.V., G.R.C.; Formal Analysis: J.M.C., J.Be.; Funding Acquisition: J.M.C., N.B., L.M.N., G.R.C.; Investigation: J.M.C., K.S.R., J.Be., M.A.A., T.A.L., G.R.C.; Methodology: J.Be., T.A.L.; Project Administration: G.R.C.; Resources: J.Be., K.S.R., N.B., J.Br., H.C.D., G.M., L.M.N., M.B.S., H.J.V., G.R.C.; Software: J.Be., T.A.L.; Supervision: G.R.C.; Validation: J.M.C., K.S.R., G.R.C.; Visualization: J.M.C., J.Be., K.S.R.; Writing-original draft: J.M.C., J.Be., K.S.R., G.R.C.; Writing-review and editing: J.M.C., J.Be., K.S.R., M.A.A., N.B., J.Br., H.C.D., G.M., L.M.N., M.B.S., H.J.V., T.H.B., T.A.L., G.R.C. All data were de-identified.
The Clinical and Functional Translation of CFTR project is acknowledged by the Johns Hopkins University Institutional Review Board (NA_00018599) and does not require institutional review board approval. All other data sets were obtained from publicly available resources and/or from researchers who provided de-identified variant information that meets the National Institutes of Health Exemption 4 (§46.104(d)(ii)) and does not qualify as human subject research. https://cftr2.org, https://gnomad.broadinstitute.org/, http://www.biopku.org/home/home.asp, https://www.barthsyndrome.org/research/tafazzindatabase.html, https://useast.ensembl.org/Tools/VEP Publisher Copyright: {\textcopyright} 2021",
year = "2022",
month = jan,
doi = "10.1016/j.gim.2021.08.012",
language = "English (US)",
volume = "24",
pages = "87--99",
journal = "Genetics in Medicine",
issn = "1098-3600",
publisher = "Elsevier",
number = "1",
}