TY - JOUR
T1 - A human mesh-centered approach to action recognition in the operating room
AU - Liu, Benjamin
AU - Soenens, Gilles
AU - Villarreal, Joshua
AU - Jopling, Jeffrey
AU - Van Herzeele, Isabelle
AU - Rau, Anita
AU - Yeung-Levy, Serena
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/6
Y1 - 2024/6
N2 - Aim: Video review programs in hospitals play a crucial role in optimizing operating room workflows. In scenarios where split seconds can change the outcome of a surgery, the potential of such programs to improve safety and efficiency is profound. However, leveraging this potential requires a systematic and automated analysis of human actions. Existing methods predominantly rely on manual review, which is labor-intensive, inconsistent, and difficult to scale. Here, we present an AI-based approach to systematically analyze the behavior and actions of individuals from operating room (OR) videos. Methods: We designed a novel framework for human mesh recovery from long-duration surgical videos by integrating existing human detection, tracking, and mesh recovery models. We then trained an action recognition model to predict surgical actions from the predicted temporal mesh sequences. To train and evaluate our approach, we annotated an in-house dataset of 864 five-second clips from simulated surgical videos with their corresponding actions. Results: Our best model achieves an F1 score of 0.81 and an area under the precision-recall curve (AUPRC) of 0.85, demonstrating that human mesh sequences can be successfully used to recover surgical actions from operating room videos. Model ablation studies suggest that action recognition performance is enhanced by composing human mesh representations with lower arm, pelvic, and cranial joints. Conclusion: Our work presents promising opportunities for OR video review programs to study human behavior in a systematic, scalable manner.
AB - Aim: Video review programs in hospitals play a crucial role in optimizing operating room workflows. In scenarios where split seconds can change the outcome of a surgery, the potential of such programs to improve safety and efficiency is profound. However, leveraging this potential requires a systematic and automated analysis of human actions. Existing methods predominantly rely on manual review, which is labor-intensive, inconsistent, and difficult to scale. Here, we present an AI-based approach to systematically analyze the behavior and actions of individuals from operating room (OR) videos. Methods: We designed a novel framework for human mesh recovery from long-duration surgical videos by integrating existing human detection, tracking, and mesh recovery models. We then trained an action recognition model to predict surgical actions from the predicted temporal mesh sequences. To train and evaluate our approach, we annotated an in-house dataset of 864 five-second clips from simulated surgical videos with their corresponding actions. Results: Our best model achieves an F1 score of 0.81 and an area under the precision-recall curve (AUPRC) of 0.85, demonstrating that human mesh sequences can be successfully used to recover surgical actions from operating room videos. Model ablation studies suggest that action recognition performance is enhanced by composing human mesh representations with lower arm, pelvic, and cranial joints. Conclusion: Our work presents promising opportunities for OR video review programs to study human behavior in a systematic, scalable manner.
KW - action recognition
KW - artificial intelligence
KW - computer vision
KW - deep learning
KW - human mesh recovery
KW - operating room
KW - surgery
UR - http://www.scopus.com/inward/record.url?scp=85198331041&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85198331041&partnerID=8YFLogxK
U2 - 10.20517/ais.2024.19
DO - 10.20517/ais.2024.19
M3 - Article
AN - SCOPUS:85198331041
SN - 2771-0408
VL - 4
SP - 92
EP - 108
JO - Artificial Intelligence Surgery
JF - Artificial Intelligence Surgery
IS - 2
ER -