Abstract
Purpose: The expanding capabilities of surgical systems bring with them increasing complexity in the interfaces that humans use to control them. Robotic C-arm X-ray imaging systems, for instance, often require manipulation of independent axes via joysticks, while higher-level control options hide inside device-specific menus. The complexity of these interfaces hinders “ready-to-hand” use of high-level functions. Natural language offers a flexible, familiar interface for surgeons to express their desired outcome rather than remembering the steps necessary to achieve it, enabling direct access to task-aware, patient-specific C-arm functionality.
Methods: We present an English-language voice interface for controlling a robotic X-ray imaging system with task-aware functions for pelvic trauma surgery. Our fully integrated system uses a large language model (LLM) to convert natural spoken commands into machine-readable instructions. This enables low-level commands such as “Tilt back a bit,” which increases the angular tilt, as well as patient-specific directions such as “Go to the obturator oblique view of the right ramus,” which rely on automated image analysis.
Results: We evaluated our system on 212 prompts provided by an attending physician; the system performed satisfactory actions 97% of the time. To test the fully integrated system, we conducted a real-time study in which an attending physician placed orthopedic hardware along desired trajectories through an anthropomorphic phantom, interacting with the X-ray system solely via voice.
Conclusion: Voice interfaces offer a convenient, flexible way for surgeons to manipulate C-arms based on desired outcomes rather than device-specific processes. As LLMs grow increasingly capable, so too will their applications in supporting higher-level interactions with surgical assistance systems.
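The abstract does not specify what the "machine-readable instructions" look like. As an illustration only, the following minimal Python sketch assumes the LLM returns a small JSON object and shows how such output might be validated before it reaches a device. The schema, the field names (`type`, `axis`, `delta_deg`, `view`, `anatomy`), the axis names, and the function `parse_instruction` are all hypothetical and are not taken from the paper.

```python
import json
from dataclasses import dataclass

# Hypothetical axis names; a real C-arm exposes device-specific axes.
ALLOWED_AXES = {"tilt", "orbit"}


@dataclass(frozen=True)
class RelativeMove:
    """Low-level instruction, e.g. for "Tilt back a bit"."""
    axis: str
    delta_deg: float


@dataclass(frozen=True)
class GoToView:
    """Patient-specific instruction, e.g. an obturator oblique view."""
    view: str
    anatomy: str


def parse_instruction(raw: str):
    """Validate JSON text (as an LLM might return it) into a typed instruction.

    Raises ValueError for anything outside the assumed schema, so that
    unexpected model output is rejected rather than executed.
    """
    data = json.loads(raw)
    kind = data.get("type")
    if kind == "relative_move":
        axis = data["axis"]
        if axis not in ALLOWED_AXES:
            raise ValueError(f"unknown axis: {axis!r}")
        return RelativeMove(axis=axis, delta_deg=float(data["delta_deg"]))
    if kind == "go_to_view":
        return GoToView(view=str(data["view"]), anatomy=str(data["anatomy"]))
    raise ValueError(f"unknown instruction type: {kind!r}")


# Example inputs a language model might plausibly emit under this schema.
tilt = parse_instruction('{"type": "relative_move", "axis": "tilt", "delta_deg": 5}')
view = parse_instruction(
    '{"type": "go_to_view", "view": "obturator oblique", "anatomy": "right ramus"}'
)
```

Parsing into a closed set of typed instructions is one possible way to keep free-form model output from reaching the hardware unchecked; whether the published system works this way is not stated in the abstract.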
Original language | English (US) |
---|---|
Pages (from-to) | 1165-1173 |
Number of pages | 9 |
Journal | International Journal of Computer Assisted Radiology and Surgery |
Volume | 19 |
Issue number | 6 |
State | Published - Jun 2024 |
Keywords
- Autonomous imaging
- Image-guided surgery
- Large language models
- Machine learning
- Speech-to-text
ASJC Scopus subject areas
- Surgery
- Biomedical Engineering
- Radiology, Nuclear Medicine and Imaging
- Computer Vision and Pattern Recognition
- Computer Science Applications
- Health Informatics
- Computer Graphics and Computer-Aided Design