TY - JOUR
T1 - A subregion-based burden test for simultaneous identification of susceptibility loci and subregions within
AU - Zhu, Bin
AU - Mirabello, Lisa
AU - Chatterjee, Nilanjan
N1 - Publisher Copyright: Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
PY - 2018/10
Y1 - 2018/10
N2 - In rare variant association studies, aggregating rare and/or low-frequency variants may increase statistical power for detection of the underlying susceptibility gene or region. However, it is unclear which variants, or which class of variants, in a gene contribute most to the association. We propose a subregion-based burden test (REBET) to simultaneously select susceptibility genes and identify important underlying subregions. The subregions are predefined by shared biologic characteristics, such as the protein domain or functional impact. Based on a subset-based approach that considers local correlations between combinations of test statistics of subregions, REBET is able to properly control the type I error rate while adjusting for multiple comparisons in a computationally efficient manner. Simulation studies show that REBET can achieve power competitive with alternative methods when rare variants cluster within subregions. In two case studies, REBET is able to identify known disease susceptibility genes and, more importantly, pinpoint the previously unreported most susceptible subregions, which represent protein domains essential for gene function. The R package REBET is available at https://dceg.cancer.gov/tools/analysis/rebet.
AB - In rare variant association studies, aggregating rare and/or low-frequency variants may increase statistical power for detection of the underlying susceptibility gene or region. However, it is unclear which variants, or which class of variants, in a gene contribute most to the association. We propose a subregion-based burden test (REBET) to simultaneously select susceptibility genes and identify important underlying subregions. The subregions are predefined by shared biologic characteristics, such as the protein domain or functional impact. Based on a subset-based approach that considers local correlations between combinations of test statistics of subregions, REBET is able to properly control the type I error rate while adjusting for multiple comparisons in a computationally efficient manner. Simulation studies show that REBET can achieve power competitive with alternative methods when rare variants cluster within subregions. In two case studies, REBET is able to identify known disease susceptibility genes and, more importantly, pinpoint the previously unreported most susceptible subregions, which represent protein domains essential for gene function. The R package REBET is available at https://dceg.cancer.gov/tools/analysis/rebet.
KW - burden test
KW - disease susceptibility genes
KW - rare variant association studies
KW - subset-based approach
UR - http://www.scopus.com/inward/record.url?scp=85054638896&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054638896&partnerID=8YFLogxK
U2 - 10.1002/gepi.22134
DO - 10.1002/gepi.22134
M3 - Article
C2 - 29931698
AN - SCOPUS:85054638896
SN - 0741-0395
VL - 42
SP - 673
EP - 683
JO - Genetic Epidemiology
JF - Genetic Epidemiology
IS - 7
ER -