TY - JOUR
T1 - A method to predict the impact of regulatory variants from DNA sequence
AU - Lee, Dongwon
AU - Gorkin, David U.
AU - Baker, Maggie
AU - Strober, Benjamin J.
AU - Asoni, Alessandro L.
AU - McCallion, Andrew S.
AU - Beer, Michael A.
N1 - Funding Information:
This research was supported in part by US National Institutes of Health grant R01 NS62972 to A.S.M. and by grant R01 HG007348 to M.A.B.
Publisher Copyright:
© 2015 Nature America, Inc. All rights reserved.
PY - 2015/8/30
Y1 - 2015/8/30
N2 - Most variants implicated in common human disease by genome-wide association studies (GWAS) lie in noncoding sequence intervals. Despite the suggestion that regulatory element disruption represents a common theme, identifying causal risk variants within implicated genomic regions remains a major challenge. Here we present a new sequence-based computational method to predict the effect of regulatory variation, using a classifier (gkm-SVM) that encodes cell type-specific regulatory sequence vocabularies. The induced change in the gkm-SVM score, deltaSVM, quantifies the effect of variants. We show that deltaSVM accurately predicts the impact of SNPs on DNase I sensitivity in their native genomic contexts and accurately predicts the results of dense mutagenesis of several enhancers in reporter assays. Previously validated GWAS SNPs yield large deltaSVM scores, and we predict new risk-conferring SNPs for several autoimmune diseases. Thus, deltaSVM provides a powerful computational approach to systematically identify functional regulatory variants.
UR - http://www.scopus.com/inward/record.url?scp=84938276507&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84938276507&partnerID=8YFLogxK
U2 - 10.1038/ng.3331
DO - 10.1038/ng.3331
M3 - Article
C2 - 26075791
AN - SCOPUS:84938276507
SN - 1061-4036
VL - 47
SP - 955
EP - 961
JO - Nature Genetics
JF - Nature Genetics
IS - 8
ER -