TY - JOUR
T1 - Artificial Intelligence for Reducing Workload in Breast Cancer Screening with Digital Breast Tomosynthesis
AU - Shoshan, Yoel
AU - Bakalo, Ran
AU - Gilboa-Solomon, Flora
AU - Ratner, Vadim
AU - Barkan, Ella
AU - Ozery-Flato, Michal
AU - Amit, Mika
AU - Khapun, Daniel
AU - Ambinder, Emily B.
AU - Oluyemi, Eniola T.
AU - Panigrahi, Babita
AU - DiCarlo, Philip A.
AU - Rosen-Zvi, Michal
AU - Mullen, Lisa A.
N1 - Funding Information:
Our study included the following two health care networks: Johns Hopkins Medicine institutional review boards approved the use of their data, with a waiver of the need to obtain written informed consent for this study, which was compliant with the Health Insurance Portability and Accountability Act; and a U.S. health care network, which provided institutional review board–exempt, retrospective, deidentified data that was approved for secondary use by IBM. The study was not financially supported by a grant or external company.
Publisher Copyright:
© 2022 Radiological Society of North America Inc. All rights reserved.
PY - 2022/4
Y1 - 2022/4
N2 - Background: Digital breast tomosynthesis (DBT) has higher diagnostic accuracy than digital mammography, but interpretation time is substantially longer. Artificial intelligence (AI) could improve reading efficiency. Purpose: To evaluate the use of AI to reduce workload by filtering out normal DBT screens. Materials and Methods: The retrospective study included 13 306 DBT examinations from 9919 women performed between June 2013 and November 2018 from two health care networks. The cohort was split into training, validation, and test sets (3948, 1661, and 4310 women, respectively). A workflow was simulated in which the AI model classified cancer-free examinations that could be dismissed from the screening worklist and used the original radiologists’ interpretations on the rest of the worklist examinations. The AI system was also evaluated with a reader study of five breast radiologists reading the DBT mammograms of 205 women. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and recall rate were evaluated in both studies. Statistics were computed across 10 000 bootstrap samples to assess 95% CIs, noninferiority, and superiority tests. Results: The model was tested on 4310 screened women (mean age, 60 years ± 11 [standard deviation]; 5182 DBT examinations). Compared with the radiologists’ performance (417 of 459 detected cancers [90.8%], 477 recalls in 5182 examinations [9.2%]), the use of AI to automatically filter out cases would result in 39.6% less workload, noninferior sensitivity (413 of 459 detected cancers; 90.0%; P = .002), and 25% lower recall rate (358 recalls in 5182 examinations; 6.9%; P = .002). In the reader study, AUC was higher in the standalone AI compared with the mean reader (0.84 vs 0.81; P = .002).
Conclusion: The artificial intelligence model was able to identify normal digital breast tomosynthesis screening examinations, which decreased the number of examinations that required radiologist interpretation in a simulated clinical workflow.
AB - Background: Digital breast tomosynthesis (DBT) has higher diagnostic accuracy than digital mammography, but interpretation time is substantially longer. Artificial intelligence (AI) could improve reading efficiency. Purpose: To evaluate the use of AI to reduce workload by filtering out normal DBT screens. Materials and Methods: The retrospective study included 13 306 DBT examinations from 9919 women performed between June 2013 and November 2018 from two health care networks. The cohort was split into training, validation, and test sets (3948, 1661, and 4310 women, respectively). A workflow was simulated in which the AI model classified cancer-free examinations that could be dismissed from the screening worklist and used the original radiologists’ interpretations on the rest of the worklist examinations. The AI system was also evaluated with a reader study of five breast radiologists reading the DBT mammograms of 205 women. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and recall rate were evaluated in both studies. Statistics were computed across 10 000 bootstrap samples to assess 95% CIs, noninferiority, and superiority tests. Results: The model was tested on 4310 screened women (mean age, 60 years ± 11 [standard deviation]; 5182 DBT examinations). Compared with the radiologists’ performance (417 of 459 detected cancers [90.8%], 477 recalls in 5182 examinations [9.2%]), the use of AI to automatically filter out cases would result in 39.6% less workload, noninferior sensitivity (413 of 459 detected cancers; 90.0%; P = .002), and 25% lower recall rate (358 recalls in 5182 examinations; 6.9%; P = .002). In the reader study, AUC was higher in the standalone AI compared with the mean reader (0.84 vs 0.81; P = .002).
Conclusion: The artificial intelligence model was able to identify normal digital breast tomosynthesis screening examinations, which decreased the number of examinations that required radiologist interpretation in a simulated clinical workflow.
UR - http://www.scopus.com/inward/record.url?scp=85126178582&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126178582&partnerID=8YFLogxK
U2 - 10.1148/RADIOL.211105
DO - 10.1148/RADIOL.211105
M3 - Article
C2 - 35040677
AN - SCOPUS:85126178582
SN - 0033-8419
VL - 303
SP - 69
EP - 77
JO - RADIOLOGY
JF - RADIOLOGY
IS - 1
ER -