TY - JOUR
T1 - Using Principal Components of Genetic Variation for Robust and Powerful Detection of Gene-Gene Interactions in Case-Control and Case-Only Studies
AU - Bhattacharjee, Samsiddhi
AU - Wang, Zhaoming
AU - Ciampa, Julia
AU - Kraft, Peter
AU - Chanock, Stephen
AU - Yu, Kai
AU - Chatterjee, Nilanjan
N1 - Funding Information:
The authors would like to acknowledge Dr. Jay Sethuraman for helpful discussions on matching algorithms and Dr. B.J. Stone for proofreading and editing the manuscript. The research of S.B., J.C., K.Y., S.C., and N.C. was supported by the National Cancer Institute Intramural Program and a Gene-Environment Initiative (GEI) grant from the National Heart, Lung, and Blood Institute. The research of P.K. was supported by NIH P01 CA87969. We would also like to thank three anonymous reviewers for their helpful comments to improve the manuscript. This study utilized the high-performance computational capabilities of the StatPro Linux cluster at the National Cancer Institute and the Biowulf Linux cluster at the National Institutes of Health, Bethesda, MD.
PY - 2010
Y1 - 2010
AB - Many popular methods for exploring gene-gene interactions, including the case-only approach, rely on the key assumption that physically distant loci are in linkage equilibrium in the underlying population. These methods utilize the presence of correlation between unlinked loci in a disease-enriched sample as evidence of interactions among the loci in the etiology of the disease. We use data from the CGEMS case-control genome-wide association study of breast cancer to demonstrate empirically that the case-only and related methods have the potential to create large-scale false positives because of the presence of population stratification (PS) that creates long-range linkage disequilibrium in the genome. We show that the bias can be removed by considering parametric and nonparametric methods that assume gene-gene independence between unlinked loci, not in the entire population, but only conditional on population substructure that can be uncovered based on the principal components of a suitably large panel of PS markers. Applications in the CGEMS study as well as simulated data show that the proposed methods are robust to the presence of population stratification and are yet much more powerful, relative to standard logistic regression methods that are also commonly used as robust alternatives to the case-only type methods.
UR - http://www.scopus.com/inward/record.url?scp=77649235526&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77649235526&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2010.01.026
DO - 10.1016/j.ajhg.2010.01.026
M3 - Article
C2 - 20206333
AN - SCOPUS:77649235526
SN - 0002-9297
VL - 86
SP - 331
EP - 342
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 3
ER -