TY - JOUR
T1 - A methodological comparison of risk scores versus decision trees for predicting drug-resistant infections
T2 - A case study using extended-spectrum beta-lactamase (ESBL) bacteremia
AU - Goodman, Katherine E.
AU - Lessler, Justin
AU - Harris, Anthony D.
AU - Milstone, Aaron M.
AU - Tamma, Pranita D.
N1 - Publisher Copyright: © 2019 by The Society for Healthcare Epidemiology of America.
PY - 2019/4/1
Y1 - 2019/4/1
N2 - Background: Timely identification of multidrug-resistant gram-negative infections remains an epidemiological challenge. Statistical models for predicting drug resistance can offer utility where rapid diagnostics are unavailable or resource-impractical. Logistic regression-derived risk scores are common in the healthcare epidemiology literature. Machine learning-derived decision trees are an alternative approach for developing decision support tools. Our group previously reported on a decision tree for predicting ESBL bloodstream infections. Our objective in the current study was to develop a risk score from the same ESBL dataset to compare these 2 methods and to offer general guiding principles for using each approach. Methods: Using a dataset of 1,288 patients with Escherichia coli or Klebsiella spp bacteremia, we generated a risk score to predict the likelihood that a bacteremic patient was infected with an ESBL-producer. We evaluated discrimination (original and cross-validated models) using receiver operating characteristic curves and C statistics. We compared risk score and decision tree performance, and we reviewed their practical and methodological attributes. Results: In total, 194 patients (15%) were infected with ESBL-producing bacteremia. The clinical risk score included 14 variables, compared to the 5 decision-tree variables. The positive and negative predictive values of the risk score and decision tree were similar (>90%), but the C statistic of the risk score (0.87) was 10% higher. Conclusions: A decision tree and risk score performed similarly for predicting ESBL infection. The decision tree was more user-friendly, with fewer variables for the end user, whereas the risk score offered higher discrimination and greater flexibility for adjusting sensitivity and specificity.
AB - Background: Timely identification of multidrug-resistant gram-negative infections remains an epidemiological challenge. Statistical models for predicting drug resistance can offer utility where rapid diagnostics are unavailable or resource-impractical. Logistic regression-derived risk scores are common in the healthcare epidemiology literature. Machine learning-derived decision trees are an alternative approach for developing decision support tools. Our group previously reported on a decision tree for predicting ESBL bloodstream infections. Our objective in the current study was to develop a risk score from the same ESBL dataset to compare these 2 methods and to offer general guiding principles for using each approach. Methods: Using a dataset of 1,288 patients with Escherichia coli or Klebsiella spp bacteremia, we generated a risk score to predict the likelihood that a bacteremic patient was infected with an ESBL-producer. We evaluated discrimination (original and cross-validated models) using receiver operating characteristic curves and C statistics. We compared risk score and decision tree performance, and we reviewed their practical and methodological attributes. Results: In total, 194 patients (15%) were infected with ESBL-producing bacteremia. The clinical risk score included 14 variables, compared to the 5 decision-tree variables. The positive and negative predictive values of the risk score and decision tree were similar (>90%), but the C statistic of the risk score (0.87) was 10% higher. Conclusions: A decision tree and risk score performed similarly for predicting ESBL infection. The decision tree was more user-friendly, with fewer variables for the end user, whereas the risk score offered higher discrimination and greater flexibility for adjusting sensitivity and specificity.
UR - http://www.scopus.com/inward/record.url?scp=85062369986&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062369986&partnerID=8YFLogxK
U2 - 10.1017/ice.2019.17
DO - 10.1017/ice.2019.17
M3 - Article
C2 - 30827286
AN - SCOPUS:85062369986
SN - 0899-823X
VL - 40
SP - 400
EP - 407
JO - Infection control and hospital epidemiology
JF - Infection control and hospital epidemiology
IS - 4
ER -