Abstract
Splitting extended families into their component nuclear families in order to apply a genetic association method designed for nuclear families is a widespread practice in familial genetic studies. Dependence among the genotypes and phenotypes of nuclear families from the same extended family arises from genetic linkage of the tested marker with a risk variant, or from familial specificity of genetic effects due to gene-environment interaction. This raises concerns about the validity of inference conducted under the assumption that the nuclear families are independent. Indeed, we prove theoretically that, in a conditional logistic regression analysis applicable to disease cases and their genotyped parents, the naive model-based estimator of the variance of the coefficient estimates underestimates the true variance. However, simulations with realistic effect sizes of risk variants, and with realistic variation of these effects from family to family, reveal that the underestimation is negligible. The simulations also show that the model-based variance estimator is more efficient than a robust empirical estimator. We therefore recommend using the model-based estimator of variance for inference on the effects of genetic variants.
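The two variance estimators contrasted above can be illustrated with a minimal sketch. It is not the authors' code: the function names (`simulate_trios`, `fit_clogit`), the single-marker multiplicative model, the allele frequency, and the pairing of trios into "extended families" are all assumptions made for illustration. Each case is matched to the three pseudo-controls formed from the parental alleles it did not inherit; the model-based variance is the inverse of the conditional-likelihood information, and the robust empirical variance is a sandwich estimator that sums scores within each extended family. Because the simulated trios here are independent, the two estimates should be similar; the sketch does not reproduce the linkage or gene-environment settings studied in the paper.

```python
import numpy as np


def simulate_trios(n, beta, freq, rng):
    """Simulate n case-parent trios (hypothetical helper, illustration only).

    Returns an (n, 4) array of risk-allele counts: column 0 is the case,
    columns 1-3 are the pseudo-controls built from the other three
    parental allele combinations.
    """
    father = rng.binomial(1, freq, size=(n, 2))
    mother = rng.binomial(1, freq, size=(n, 2))
    # The four equally likely offspring genotypes given the parents.
    kids = (father[:, :, None] + mother[:, None, :]).reshape(n, 4)
    # Given an affected child, each genotype has probability ∝ exp(beta * g).
    w = np.exp(beta * kids)
    p = w / w.sum(axis=1, keepdims=True)
    u = rng.random((n, 1))
    case = np.minimum((p.cumsum(axis=1) < u).sum(axis=1), 3)
    rows = np.arange(n)
    x = kids.copy()
    # Swap the sampled case genotype into column 0.
    x[rows, 0], x[rows, case] = kids[rows, case], kids[rows, 0]
    return x


def score_and_info(beta, x):
    """Per-matched-set score and information of the conditional likelihood."""
    w = np.exp(beta * x)
    p = w / w.sum(axis=1, keepdims=True)
    mean = (p * x).sum(axis=1)
    score = x[:, 0] - mean
    info = (p * x**2).sum(axis=1) - mean**2
    return score, info


def fit_clogit(x, cluster, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of a one-parameter conditional logistic model.

    Returns (beta_hat, model_based_variance, cluster_robust_variance).
    """
    x = np.asarray(x, dtype=float)
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(beta, x)
        step = score.sum() / info.sum()
        beta += step
        if abs(step) < tol:
            break
    score, info = score_and_info(beta, x)
    total_info = info.sum()
    model_var = 1.0 / total_info
    # Sandwich estimator: sum the scores within each extended family first.
    _, idx = np.unique(cluster, return_inverse=True)
    cluster_score = np.bincount(idx, weights=score)
    robust_var = (cluster_score**2).sum() / total_info**2
    return beta, model_var, robust_var


rng = np.random.default_rng(1)
n_trios = 2000
x = simulate_trios(n_trios, beta=0.4, freq=0.3, rng=rng)
family = np.arange(n_trios) // 2  # assumption: two trios per extended family
beta_hat, model_var, robust_var = fit_clogit(x, family)
print(f"beta_hat={beta_hat:.3f}  model SE={model_var**0.5:.4f}  "
      f"robust SE={robust_var**0.5:.4f}")
```

A Wald test of no association would then compare `beta_hat**2 / model_var` (or the same ratio with `robust_var`) to a chi-square distribution with one degree of freedom.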
Original language | English (US) |
---|---|
Pages (from-to) | 533-549 |
Number of pages | 17 |
Journal | Statistical Applications in Genetics and Molecular Biology |
Volume | 14 |
Issue number | 6 |
State | Published - Dec 1 2015 |
Externally published | Yes |
Keywords
- Wald test
- conditional logistic regression
- empirical variance estimation
- genetic linkage
- pseudo-controls
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Genetics
- Computational Mathematics