Analysis of case-control studies of genetic and environmental factors with missing genetic information and haplotype-phase ambiguity

Christine Spinka, Raymond J. Carroll, Nilanjan Chatterjee

Research output: Contribution to journal › Article › peer-review

59 Scopus citations


Case-control studies of unrelated subjects are now widely used to study the role of genetic susceptibility and gene-environment interactions in the etiology of complex diseases. Exploiting an assumption of gene-environment independence, and treating the distribution of environmental exposures as completely nonparametric, Chatterjee and Carroll [2005] (Biometrika 92:399-418) recently developed an efficient retrospective maximum-likelihood method for analysis of case-control studies. In this article, we develop an extension of the retrospective maximum-likelihood approach to studies where genetic information may be missing on some study subjects. In particular, special emphasis is given to haplotype-based studies where missing data arise due to linkage-phase ambiguity of genotype data. We use a profile likelihood technique and an appropriate expectation-maximization (EM) algorithm to derive a relatively simple procedure for parameter estimation, with or without a rare disease assumption, and possibly incorporating information on the marginal probability of the disease for the underlying population. We also describe two alternative robust approaches that are less sensitive to the underlying gene-environment independence and Hardy-Weinberg-equilibrium assumptions. The performance of the proposed methods is studied using simulation studies in the context of haplotype-based studies of gene-environment interactions. An application of the proposed method is illustrated using a case-control study of ovarian cancer designed to investigate the interaction between BRCA1/2 mutations and reproductive risk factors in the etiology of ovarian cancer.
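
As a rough illustration of the haplotype-phase ambiguity mentioned in the abstract, the sketch below runs a textbook EM algorithm that estimates haplotype frequencies from unphased genotypes at two biallelic SNPs under Hardy-Weinberg equilibrium. It is not the retrospective profile-likelihood procedure developed in the article: it ignores disease status, environmental exposures, and case-control sampling. The function names `compatible_pairs` and `em_haplotype_frequencies`, the two-SNP restriction, and the example counts are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement

# The four possible haplotypes over two biallelic SNPs (0/1 = allele at each SNP).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return index pairs (a, b), a <= b, of haplotypes whose allele counts
    add up to the unphased genotype (minor-allele count at each SNP)."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(range(len(HAPLOTYPES)), 2)
        if all(HAPLOTYPES[a][k] + HAPLOTYPES[b][k] == genotype[k] for k in range(2))
    ]


def em_haplotype_frequencies(genotype_counts, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium.

    genotype_counts maps an unphased genotype such as (1, 1) to the number of
    subjects carrying it. Only the double heterozygote (1, 1) is phase-ambiguous.
    """
    freqs = [0.25] * len(HAPLOTYPES)          # uniform starting values
    total = 2 * sum(genotype_counts.values())  # two haplotypes per subject
    for _ in range(n_iter):
        expected = [0.0] * len(HAPLOTYPES)
        for genotype, count in genotype_counts.items():
            if count == 0:
                continue
            pairs = compatible_pairs(genotype)
            # E-step: posterior weight of each compatible pair under HWE
            # (factor 2 for the two orderings of a heterozygous pair).
            weights = [freqs[a] * freqs[b] * (1 if a == b else 2) for a, b in pairs]
            norm = sum(weights)
            for (a, b), w in zip(pairs, weights):
                expected[a] += count * w / norm
                expected[b] += count * w / norm
        # M-step: new frequencies are expected haplotype counts over 2N.
        new_freqs = [e / total for e in expected]
        converged = max(abs(n - o) for n, o in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# Hypothetical data: 10 subjects with genotype (0, 0), 10 with (2, 2),
# and 5 phase-ambiguous double heterozygotes (1, 1).
freqs = em_haplotype_frequencies({(0, 0): 10, (2, 2): 10, (1, 1): 5})
for hap, f in zip(HAPLOTYPES, freqs):
    print(hap, round(f, 4))
```

In this toy data set the unambiguous subjects carry only haplotypes (0, 0) and (1, 1), so the EM iterations assign the double heterozygotes almost entirely to that pair. With more loci, the number of compatible haplotype pairs per genotype grows quickly, which is one reason phase ambiguity is treated as a missing-data problem.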

Original language: English (US)
Pages (from-to): 108-127
Number of pages: 20
Journal: Genetic Epidemiology
Issue number: 2
State: Published - Sep 2005
Externally published: Yes


  • Case-control studies
  • EM algorithm
  • Gene-environment interactions
  • Haplotype
  • Semiparametric methods

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)

