TY - JOUR
T1 - Prioritization of causal genes from genome-wide association studies by Bayesian data integration across loci
AU - Mousavi, Zeinab
AU - Arvanitis, Marios
AU - Duong, Thuy Vy
AU - Brody, Jennifer A.
AU - Battle, Alexis
AU - Sotoodehnia, Nona
AU - Shojaie, Ali
AU - Arking, Dan E.
AU - Bader, Joel S.
N1 - Publisher Copyright: © 2025 Mousavi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2025/1
Y1 - 2025/1
AB - Motivation: Genome-wide association studies (GWAS) have identified genetic variants, usually single-nucleotide polymorphisms (SNPs), associated with human traits, including disease and disease risk. These variants (or causal variants in linkage disequilibrium with them) usually affect the regulation or function of a nearby gene. A GWAS locus can span many genes, however, and prioritizing which gene or genes in a locus are most likely to be causal remains a challenge. Better prioritization and prediction of causal genes could reveal disease mechanisms and suggest interventions. Results: We describe a new Bayesian method, termed SIGNET for significance networks, that combines information both within and across loci to identify the most likely causal gene at each locus. The SIGNET method builds on existing methods that focus on individual loci with evidence from gene distance and expression quantitative trait loci (eQTL) by sharing information across loci using protein-protein and gene regulatory interaction network data. In an application to cardiac electrophysiology with 226 GWAS loci, only 46 (20%) have within-locus evidence from Mendelian genes, protein-coding changes, or colocalization with eQTL signals. At the remaining 180 loci lacking functional information, SIGNET selects 56 genes other than the minimum distance gene, equal to 31% of the information-poor loci and 25% of the GWAS loci overall. Assessment by pathway enrichment demonstrates improved performance by SIGNET. Review of individual loci shows literature evidence for genes selected by SIGNET, including PMP22 as a novel causal gene candidate.
UR - http://www.scopus.com/inward/record.url?scp=85214452260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85214452260&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1012725
DO - 10.1371/journal.pcbi.1012725
M3 - Article
C2 - 39774334
AN - SCOPUS:85214452260
SN - 1553-734X
VL - 21
JO - PLOS Computational Biology
JF - PLOS Computational Biology
IS - 1
M1 - e1012725
ER -