TY - JOUR
T1 - An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database
AU - Afshar, Ali S.
AU - Li, Yijun
AU - Chen, Zixu
AU - Chen, Yuxuan
AU - Lee, Jae Hun
AU - Irani, Darius
AU - Crank, Aidan
AU - Singh, Digvijay
AU - Kanter, Michael
AU - Faraday, Nauder
AU - Kharrazi, Hadi
N1 - Publisher Copyright:
© 2021 The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association.
PY - 2021/7/1
Y1 - 2021/7/1
N2 - Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes; While there have been some reports on the data quality of nurse-verified vital sign data, little has been reported on the data quality of higher frequency time-series vital signs acquired in ICUs, that would enable such predictive modeling. In this study, we assessed the data quality issues, defined as the completeness, accuracy, and timeliness, of minute-by-minute time series vital signs data within the MIMIC-III data set, captured from 16009 patient-ICU stays and corresponding to 9410 unique adult patients. We measured data quality of four time-series vital signs data streams in the MIMIC-III data set: heart rate (HR), respiratory rate (RR), blood oxygen saturation (SpO2), and arterial blood pressure (ABP). Approximately, 30% of patient-ICU stays did not have at least 1 min of data during the time-frame of the ICU stay for HR, RR, and SpO2. The percentage of patient-ICU stays that did not have at least 1 min of ABP data was ∼56%. We observed ∼80% coverage of the total duration of the ICU stay for HR, RR, and SpO2. Finally, only 12.5%%, 9.9%, 7.5%, and 4.4% of ICU lengths of stay had ≥ 99% data available for HR, RR, SpO2, and ABP, respectively, that would meet the three data quality requirements we looked into in this study. Our findings on data completeness, accuracy, and timeliness have important implications for data scientists and informatics researchers who use time series vital signs data to develop predictive models of ICU outcomes.
AB - Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes; While there have been some reports on the data quality of nurse-verified vital sign data, little has been reported on the data quality of higher frequency time-series vital signs acquired in ICUs, that would enable such predictive modeling. In this study, we assessed the data quality issues, defined as the completeness, accuracy, and timeliness, of minute-by-minute time series vital signs data within the MIMIC-III data set, captured from 16009 patient-ICU stays and corresponding to 9410 unique adult patients. We measured data quality of four time-series vital signs data streams in the MIMIC-III data set: heart rate (HR), respiratory rate (RR), blood oxygen saturation (SpO2), and arterial blood pressure (ABP). Approximately, 30% of patient-ICU stays did not have at least 1 min of data during the time-frame of the ICU stay for HR, RR, and SpO2. The percentage of patient-ICU stays that did not have at least 1 min of ABP data was ∼56%. We observed ∼80% coverage of the total duration of the ICU stay for HR, RR, and SpO2. Finally, only 12.5%%, 9.9%, 7.5%, and 4.4% of ICU lengths of stay had ≥ 99% data available for HR, RR, SpO2, and ABP, respectively, that would meet the three data quality requirements we looked into in this study. Our findings on data completeness, accuracy, and timeliness have important implications for data scientists and informatics researchers who use time series vital signs data to develop predictive models of ICU outcomes.
KW - data quality
KW - intensive care unit
KW - physiologic monitoring
UR - http://www.scopus.com/inward/record.url?scp=85118184290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118184290&partnerID=8YFLogxK
U2 - 10.1093/jamiaopen/ooab057
DO - 10.1093/jamiaopen/ooab057
M3 - Article
C2 - 34350392
AN - SCOPUS:85118184290
SN - 2574-2531
VL - 4
JO - JAMIA Open
JF - JAMIA Open
IS - 3
M1 - ooab057
ER -