Abstract
Most common human diseases result from the combined effect of genes, environmental factors, and their interactions, so including gene–environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. In this article, we therefore develop a kernel machine regression framework to model the overall genetic effect of a SNP-set while accounting for possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite-sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method maintains the correct type I error rate and can have higher power than score-based approaches in many situations.
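To make the composite-kernel idea concrete, here is a minimal sketch of a weighted average of a genetic main-effect kernel and a GE interaction kernel. The abstract does not say which kernels the authors use, so everything specific here is an assumption made for illustration: the linear kernel, the choice of an elementwise (Hadamard) product of the genetic and environment kernels as the GE kernel, and the names `linear_kernel`, `composite_kernel`, and `rho`.

```python
import numpy as np


def linear_kernel(X):
    """Linear kernel X X^T, scaled by the number of columns (assumed choice)."""
    X = np.asarray(X, dtype=float)
    return X @ X.T / X.shape[1]


def composite_kernel(G, E, rho):
    """Weighted average of a main-effect kernel and a GE interaction kernel.

    G   : (n, p) genotype matrix for the SNP-set
    E   : (n, q) environmental covariates
    rho : weight in [0, 1] placed on the genetic main-effect kernel
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    K_main = linear_kernel(G)
    # Assumption: the GE kernel is the elementwise product of the genetic
    # and environment kernels, which stays positive semidefinite.
    K_ge = K_main * linear_kernel(E)
    return rho * K_main + (1.0 - rho) * K_ge


# Toy data: 6 subjects, 4 SNPs coded 0/1/2, one environmental exposure.
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(6, 4))
E = rng.normal(size=(6, 1))
K = composite_kernel(G, E, rho=0.5)
```

With `rho = 1` the composite kernel reduces to the main-effect kernel alone, and with `rho = 0` it contains only the interaction part; intermediate weights blend the two. The sketch covers kernel construction only, not the LRT, RLRT, or Monte Carlo steps.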
Original language | English (US) |
---|---|
Pages (from-to) | 625-637 |
Number of pages | 13 |
Journal | Biometrics |
Volume | 75 |
Issue number | 2 |
State | Published - 2019 |
Keywords
- gene–environment interactions
- kernel machine testing
- likelihood ratio test
- multiple variance components
- spectral decomposition
- unidentifiable conditions
ASJC Scopus subject areas
- Statistics and Probability
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology
- General Agricultural and Biological Sciences
- Applied Mathematics