Multiple model evaluation absent the gold standard through model combination

Edwin S. Iversen, Giovanni Parmigiani, Sining Chen

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


We describe a method for evaluating an ensemble of predictive models given a sample of observations comprising the model predictions and the outcome event measured with error. Our formulation allows us to simultaneously estimate the measurement error parameters, the true outcome (the "gold standard"), and a relative weighting of the predictive scores. We describe conditions necessary to estimate the gold standard and to calibrate these estimates, and we detail how our approach is related to, but distinct from, standard model combination techniques. We apply our approach to data from a study designed to evaluate a collection of BRCA1/BRCA2 gene mutation prediction scores. In this example, genotype is measured with error by one or more genetic assays. We estimate the true genotype of each individual in the data set, the operating characteristics of the commonly used genotyping procedures, and a relative weighting of the scores. Finally, we compare the scores against the gold standard genotype and find that Mendelian scores are, on average, the more refined and better calibrated of those considered and that the comparison is sensitive to measurement error in the gold standard.
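
As a rough illustration of why measurement error in the outcome matters, the sketch below applies Bayes' rule to a single individual: a prior carrier probability (for example, one prediction score) is updated with an imperfect assay result. This is not the authors' model, which estimates the error parameters, the true genotypes, and the score weights jointly. The function name `posterior_carrier` and the sensitivity and specificity values are assumptions chosen for illustration only.

```python
def posterior_carrier(prior, assay_positive, sensitivity, specificity):
    """Posterior probability that an individual is a true carrier.

    prior          -- prior carrier probability, e.g. from a prediction score
    assay_positive -- True if the (imperfect) assay reported a mutation
    sensitivity    -- assumed P(assay positive | true carrier)
    specificity    -- assumed P(assay negative | true non-carrier)
    """
    if assay_positive:
        like_carrier = sensitivity
        like_noncarrier = 1.0 - specificity
    else:
        like_carrier = 1.0 - sensitivity
        like_noncarrier = specificity
    numerator = prior * like_carrier
    denominator = numerator + (1.0 - prior) * like_noncarrier
    return numerator / denominator


# Hypothetical values: a negative assay with imperfect sensitivity
# lowers the carrier probability but does not rule a mutation out.
p = posterior_carrier(prior=0.2, assay_positive=False,
                      sensitivity=0.8, specificity=1.0)
```

With these assumed values, a negative result leaves a residual carrier probability above zero, which is why treating the assay result as the gold standard can distort a comparison of prediction scores.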

Original language: English (US)
Pages (from-to): 897-909
Number of pages: 13
Journal: Journal of the American Statistical Association
Issue number: 483
State: Published - Sep 2008


Keywords

  • Bayesian analysis
  • Breast cancer susceptibility genes
  • Measurement error
  • Model combination
  • Model evaluation

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

