A Unified Framework on Generalizability of Clinical Prediction Models

Bohua Wan; Brian Caffo; S. Swaroop Vedula

doi:10.3389/frai.2022.872720

Bloomberg School of Public Health

Research output: Contribution to journal › Article › peer-review

Abstract

To be useful, clinical prediction models (CPMs) must be generalizable to patients in new settings. Evaluating generalizability of CPMs helps identify spurious relationships in data, provides insights on when they fail, and thus improves the explainability of the CPMs. There are discontinuities in concepts related to generalizability of CPMs in the clinical research and machine learning domains. Specifically, conventional statistical reasons to explain poor generalizability, such as inadequate model development for the purposes of generalizability, differences in coding of predictors and outcome between development and external datasets, measurement error, inability to measure some predictors, and missing data, all have differing and often complementary treatments in the two domains. Much of the current machine learning literature on generalizability of CPMs is in terms of dataset shift, of which several types have been described. However, little research exists to synthesize concepts in the two domains. Bridging this conceptual discontinuity in the context of CPMs can facilitate systematic development of CPMs and evaluation of their sensitivity to factors that affect generalizability. We survey generalizability and dataset shift in CPMs from both the clinical research and machine learning perspectives, and describe a unifying framework to analyze generalizability of CPMs and to explain their sensitivity to factors affecting it. Our framework leads to a set of signaling statements that can be used to characterize differences between datasets in terms of factors that affect generalizability of the CPMs.

Original language	English (US)
Article number	872720
Journal	Frontiers in Artificial Intelligence
Volume	5
DOIs	https://doi.org/10.3389/frai.2022.872720
State	Published - Apr 29 2022

Keywords

clinical prediction models
dataset shift
diagnosis
explainability
external validity
generalizability
prognosis

ASJC Scopus subject areas

Artificial Intelligence

Cite this

@article{d0b1910d72d947cc8774a7411eb5b2f0,
  title = "A Unified Framework on Generalizability of Clinical Prediction Models",
  abstract = "To be useful, clinical prediction models (CPMs) must be generalizable to patients in new settings. Evaluating generalizability of CPMs helps identify spurious relationships in data, provides insights on when they fail, and thus, improves the explainability of the CPMs. There are discontinuities in concepts related to generalizability of CPMs in the clinical research and machine learning domains. Specifically, conventional statistical reasons to explain poor generalizability such as inadequate model development for the purposes of generalizability, differences in coding of predictors and outcome between development and external datasets, measurement error, inability to measure some predictors, and missing data, all have differing and often complementary treatments, in the two domains. Much of the current machine learning literature on generalizability of CPMs is in terms of dataset shift of which several types have been described. However, little research exists to synthesize concepts in the two domains. Bridging this conceptual discontinuity in the context of CPMs can facilitate systematic development of CPMs and evaluation of their sensitivity to factors that affect generalizability. We survey generalizability and dataset shift in CPMs from both the clinical research and machine learning perspectives, and describe a unifying framework to analyze generalizability of CPMs and to explain their sensitivity to factors affecting it. Our framework leads to a set of signaling statements that can be used to characterize differences between datasets in terms of factors that affect generalizability of the CPMs.",
  keywords = "clinical prediction models, dataset shift, diagnosis, explainability, external validity, generalizability, prognosis",
  author = "Wan, Bohua and Caffo, Brian and Vedula, {S. Swaroop}",
  note = "Publisher Copyright: Copyright {\textcopyright} 2022 Wan, Caffo and Vedula.",
  year = "2022",
  month = apr,
  day = "29",
  doi = "10.3389/frai.2022.872720",
  language = "English (US)",
  volume = "5",
  journal = "Frontiers in Artificial Intelligence",
  issn = "2624-8212",
  publisher = "Frontiers Media S. A.",
}

TY  - JOUR
T1  - A Unified Framework on Generalizability of Clinical Prediction Models
AU  - Wan, Bohua
AU  - Caffo, Brian
AU  - Vedula, S. Swaroop
PY  - 2022/4/29
Y1  - 2022/4/29
AB  - To be useful, clinical prediction models (CPMs) must be generalizable to patients in new settings. Evaluating generalizability of CPMs helps identify spurious relationships in data, provides insights on when they fail, and thus, improves the explainability of the CPMs. There are discontinuities in concepts related to generalizability of CPMs in the clinical research and machine learning domains. Specifically, conventional statistical reasons to explain poor generalizability such as inadequate model development for the purposes of generalizability, differences in coding of predictors and outcome between development and external datasets, measurement error, inability to measure some predictors, and missing data, all have differing and often complementary treatments, in the two domains. Much of the current machine learning literature on generalizability of CPMs is in terms of dataset shift of which several types have been described. However, little research exists to synthesize concepts in the two domains. Bridging this conceptual discontinuity in the context of CPMs can facilitate systematic development of CPMs and evaluation of their sensitivity to factors that affect generalizability. We survey generalizability and dataset shift in CPMs from both the clinical research and machine learning perspectives, and describe a unifying framework to analyze generalizability of CPMs and to explain their sensitivity to factors affecting it. Our framework leads to a set of signaling statements that can be used to characterize differences between datasets in terms of factors that affect generalizability of the CPMs.
KW  - clinical prediction models
KW  - dataset shift
KW  - diagnosis
KW  - explainability
KW  - external validity
KW  - generalizability
KW  - prognosis
UR  - http://www.scopus.com/inward/record.url?scp=85130235255&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85130235255&partnerID=8YFLogxK
U2  - 10.3389/frai.2022.872720
DO  - 10.3389/frai.2022.872720
M3  - Article
C2  - 35573904
AN  - SCOPUS:85130235255
SN  - 2624-8212
VL  - 5
JO  - Frontiers in Artificial Intelligence
JF  - Frontiers in Artificial Intelligence
M1  - 872720
ER  - 
