Multiple imputation of missing phenotype data for QTL mapping

Jennifer F. Bobb, Daniel O. Scharfstein, Michael J. Daniels, Francis S. Collins, Samir Kelada

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Missing phenotype data can be a major hurdle to mapping quantitative trait loci (QTL). Though in many cases experiments may be designed to minimize the occurrence of missing data, it is often unavoidable in practice; thus, statistical methods to account for missing data are needed. In this paper we describe an approach for combining multiple imputation with QTL mapping. The methods are applied to map genes associated with increased breathing effort in mice after lung inflammation due to allergen challenge in developing lines of the Collaborative Cross, a new mouse genetics resource. Missing data poses a particular challenge in this study because the desired phenotype summary to be mapped is a function of incompletely observed dose-response curves. Comparison of the multiple imputation approach to two naive approaches for handling missing data suggests that these simpler methods may yield poor results: ignoring missing data through a complete case analysis may lead to incorrect conclusions, while using a last observation carried forward procedure, which does not account for uncertainty in the imputed values, may lead to anti-conservative inference. The proposed approach is widely applicable to other studies with missing phenotype data.
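To illustrate why a single-imputation procedure can be anti-conservative, the following is a minimal sketch of the standard pooling step used in multiple imputation (Rubin's rules). It is not taken from the paper: the function name `pool_rubin` and the example numbers are hypothetical, and the paper's actual imputation model and QTL test statistic are not described in this abstract.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimate from each of the m imputed data sets
    variances: squared standard error from each imputed data set
    Returns (pooled_estimate, total_variance).
    Hypothetical helper for illustration only.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)            # average of the m point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up numbers: three imputed data sets, same within-imputation variance.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
single_imputation_variance = 0.5  # what one filled-in data set would report
```

The total variance adds a between-imputation term to the within-imputation variance. An analysis based on one filled-in data set, such as last observation carried forward, reports only the within-imputation part, so its standard errors are too small whenever the imputed values are uncertain.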

Original language: English (US)
Article number: 29
Journal: Statistical Applications in Genetics and Molecular Biology
Issue number: 1
State: Published - 2011
Externally published: Yes


  • missing data
  • multiple imputation
  • quantitative trait loci

ASJC Scopus subject areas

  • Statistics and Probability
  • Molecular Biology
  • Genetics
  • Computational Mathematics

