TY - GEN
T1 - Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
AU - Yan, Benjamin
AU - Liu, Ruochen
AU - Kuo, David E.
AU - Adithan, Subathra
AU - Reis, Eduardo Pontes
AU - Kwak, Stephen
AU - Venugopal, Vasantha Kumar
AU - O'Connell, Chloe P.
AU - Saenz, Agustina
AU - Rajpurkar, Pranav
AU - Moor, Michael
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Automatically generated reports from medical images promise to improve the workflow of radiologists. Existing methods consider an image-to-report modeling task by directly generating a fully-fledged report from an image. However, this conflates the content of the report (e.g., findings and their attributes) with its style (e.g., format and choice of words), which can lead to clinically inaccurate reports. To address this, we propose a two-step approach for radiology report generation. First, we extract the content from an image; then, we verbalize the extracted content into a report that matches the style of a specific radiologist. For this, we leverage RadGraph, a graph representation of reports, together with large language models (LLMs). In our quantitative evaluations, we find that our approach leads to beneficial performance. Our human evaluation with clinical raters highlights that the AI-generated reports are indistinguishably tailored to the style of individual radiologists despite leveraging only a few examples as context.
UR - http://www.scopus.com/inward/record.url?scp=85183310642&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183310642&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85183310642
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 14676
EP - 14688
BT - Findings of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -