TY - JOUR
T1 - Standalone AI for Breast Cancer Detection at Screening Digital Mammography and Digital Breast Tomosynthesis
T2 - A Systematic Review and Meta-Analysis
AU - Yoon, Jung Hyun
AU - Strand, Fredrik
AU - Baltzer, Pascal A.T.
AU - Conant, Emily F.
AU - Gilbert, Fiona J.
AU - Lehman, Constance D.
AU - Morris, Elizabeth A.
AU - Mullen, Lisa A.
AU - Nishikawa, Robert M.
AU - Sharma, Nisha
AU - Vejborg, Ilse
AU - Moy, Linda
AU - Mann, Ritse M.
N1 - Publisher Copyright:
© 2023 Radiological Society of North America Inc. All rights reserved.
PY - 2023/6
Y1 - 2023/6
N2 - Background: There is considerable interest in the potential use of artificial intelligence (AI) systems in mammographic screening. However, it is essential to critically evaluate the performance of AI before it can become a modality used for independent mammographic interpretation. Purpose: To evaluate the reported standalone performances of AI for interpretation of digital mammography and digital breast tomosynthesis (DBT). Materials and Methods: A systematic search was conducted in PubMed, Google Scholar, Embase (Ovid), and Web of Science databases for studies published from January 2017 to June 2022. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) values were reviewed. Study quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 and Comparative (QUADAS-2 and QUADAS-C, respectively). A random-effects meta-analysis and meta-regression analysis were performed for overall studies and for different study types (reader studies vs historic cohort studies) and imaging techniques (digital mammography vs DBT). Results: In total, 16 studies that include 1 108 328 examinations in 497 091 women were analyzed (six reader studies, seven historic cohort studies on digital mammography, and four studies on DBT). Pooled AUCs were significantly higher for standalone AI than radiologists in the six reader studies on digital mammography (0.87 vs 0.81, P = .002), but not for historic cohort studies (0.89 vs 0.96, P = .152). Four studies on DBT showed significantly higher AUCs in AI compared with radiologists (0.90 vs 0.79, P < .001). Higher sensitivity and lower specificity were seen for standalone AI compared with radiologists. Conclusion: Standalone AI for screening digital mammography performed as well as or better than radiologists. Compared with digital mammography, there is an insufficient number of studies to assess the performance of AI systems in the interpretation of DBT screening examinations.
AB - Background: There is considerable interest in the potential use of artificial intelligence (AI) systems in mammographic screening. However, it is essential to critically evaluate the performance of AI before it can become a modality used for independent mammographic interpretation. Purpose: To evaluate the reported standalone performances of AI for interpretation of digital mammography and digital breast tomosynthesis (DBT). Materials and Methods: A systematic search was conducted in PubMed, Google Scholar, Embase (Ovid), and Web of Science databases for studies published from January 2017 to June 2022. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) values were reviewed. Study quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 and Comparative (QUADAS-2 and QUADAS-C, respectively). A random-effects meta-analysis and meta-regression analysis were performed for overall studies and for different study types (reader studies vs historic cohort studies) and imaging techniques (digital mammography vs DBT). Results: In total, 16 studies that include 1 108 328 examinations in 497 091 women were analyzed (six reader studies, seven historic cohort studies on digital mammography, and four studies on DBT). Pooled AUCs were significantly higher for standalone AI than radiologists in the six reader studies on digital mammography (0.87 vs 0.81, P = .002), but not for historic cohort studies (0.89 vs 0.96, P = .152). Four studies on DBT showed significantly higher AUCs in AI compared with radiologists (0.90 vs 0.79, P < .001). Higher sensitivity and lower specificity were seen for standalone AI compared with radiologists. Conclusion: Standalone AI for screening digital mammography performed as well as or better than radiologists. Compared with digital mammography, there is an insufficient number of studies to assess the performance of AI systems in the interpretation of DBT screening examinations.
UR - http://www.scopus.com/inward/record.url?scp=85164041070&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85164041070&partnerID=8YFLogxK
U2 - 10.1148/radiol.222639
DO - 10.1148/radiol.222639
M3 - Article
C2 - 37219445
AN - SCOPUS:85164041070
SN - 0033-8419
VL - 307
JO - RADIOLOGY
JF - RADIOLOGY
IS - 5
M1 - e222639
ER -