Personalized Risk Prediction in Clinical Oncology Research: Applications and Practical Issues Using Survival Trees and Random Forests

Chen Hu, Jon Arni Steingrimsson

Research output: Contribution to journal › Article › peer-review



A crucial component of making individualized treatment decisions is accurately predicting each patient’s disease risk. In clinical oncology, disease risk is often measured through time-to-event outcomes, such as overall survival and progression/recurrence-free survival, which are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular, largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple, interpretable tree-structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly combine traditional demographic and tumor pathological factors with high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and on how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with that of more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specification, the choice of tuning parameters, and the degree of missingness in the covariates.
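As a rough illustration of what a splitting rule for right-censored outcomes can look like, the sketch below scores candidate splits on a single covariate with the two-sample log-rank statistic, one common choice for survival trees. The abstract does not say which splitting rules the article uses, so the choice of the log-rank rule, the function names `logrank_statistic` and `best_split`, and the toy data are all assumptions made for illustration only.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for one candidate split.

    times   : observed follow-up times (event or censoring time)
    events  : 1 if the event was observed, 0 if the time is right-censored
    in_left : True if the subject falls in the left child node
    """
    # Only distinct times at which an event was observed contribute.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t (overall and in left node).
        n = sum(1 for u in times if u >= t)
        n_left = sum(1 for u, g in zip(times, in_left) if u >= t and g)
        # Events occurring exactly at time t (overall and in left node).
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d_left = sum(
            1 for u, e, g in zip(times, events, in_left)
            if u == t and e == 1 and g
        )
        observed_minus_expected += d_left - d * n_left / n
        if n > 1:
            p = n_left / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0  # split carries no information (e.g. one child is empty)
    return observed_minus_expected ** 2 / variance


def best_split(x, times, events):
    """Return (threshold, statistic) maximizing the log-rank statistic.

    Candidate thresholds are midpoints between consecutive distinct
    covariate values; (None, 0.0) is returned if no split separates the data.
    """
    values = sorted(set(x))
    best = (None, 0.0)
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        in_left = [v <= threshold for v in x]
        stat = logrank_statistic(times, events, in_left)
        if stat > best[1]:
            best = (threshold, stat)
    return best


# Hypothetical toy data: one covariate, six subjects, no censoring.
toy_x = [1, 2, 3, 4, 5, 6]
toy_times = [1, 2, 3, 10, 11, 12]
toy_events = [1, 1, 1, 1, 1, 1]
print(best_split(toy_x, toy_times, toy_events))
```

A full survival tree would apply such a search recursively over all covariates and then prune the resulting tree; a survival forest would repeat this on bootstrap samples with random covariate subsets. Neither step is shown here.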

Original language: English (US)
Pages (from-to): 333-349
Number of pages: 17
Journal: Journal of Biopharmaceutical Statistics
Issue number: 2
State: Published - Mar 4 2018


  • CART
  • Cancer
  • risk prediction
  • survival analysis
  • survival forests
  • survival trees

ASJC Scopus subject areas

  • Statistics and Probability
  • Pharmacology
  • Pharmacology (medical)

