TY - JOUR
T1 - REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants
AU - Ioannidis, Nilah M.
AU - Rothstein, Joseph H.
AU - Pejaver, Vikas
AU - Middha, Sumit
AU - McDonnell, Shannon K.
AU - Baheti, Saurabh
AU - Musolf, Anthony
AU - Li, Qing
AU - Holzinger, Emily
AU - Karyadi, Danielle
AU - Cannon-Albright, Lisa A.
AU - Teerlink, Craig C.
AU - Stanford, Janet L.
AU - Isaacs, William B.
AU - Xu, Jianfeng
AU - Cooney, Kathleen A.
AU - Lange, Ethan M.
AU - Schleutker, Johanna
AU - Carpten, John D.
AU - Powell, Isaac J.
AU - Cussenot, Olivier
AU - Cancel-Tassin, Geraldine
AU - Giles, Graham G.
AU - MacInnis, Robert J.
AU - Maier, Christiane
AU - Hsieh, Chih Lin
AU - Wiklund, Fredrik
AU - Catalona, William J.
AU - Foulkes, William D.
AU - Mandal, Diptasri
AU - Eeles, Rosalind A.
AU - Kote-Jarai, Zsofia
AU - Bustamante, Carlos D.
AU - Schaid, Daniel J.
AU - Hastie, Trevor
AU - Ostrander, Elaine A.
AU - Bailey-Wilson, Joan E.
AU - Radivojac, Predrag
AU - Thibodeau, Stephen N.
AU - Whittemore, Alice S.
AU - Sieh, Weiva
N1 - Funding Information: This research was funded by NIH grants U01CA089600, R01CA094069, R01LM009722, R01MH105524, K07CA143047, and F32HG008330 and by the Intramural Research Program of the National Human Genome Research Institute, NIH.
N1 - Publisher Copyright: © 2016 American Society of Human Genetics
PY - 2016/10/6
Y1 - 2016/10/6
N2 - The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10^−12) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. The area under the receiver operating characteristic curve (AUC) for REVEL was 0.046–0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants and 0.027–0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar than the AUCs for other ensemble methods. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
UR - http://www.scopus.com/inward/record.url?scp=84991615407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991615407&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2016.08.016
DO - 10.1016/j.ajhg.2016.08.016
M3 - Article
C2 - 27666373
AN - SCOPUS:84991615407
SN - 0002-9297
VL - 99
SP - 877
EP - 885
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 4
ER -