Society for Surgery of the Alimentary Tract


Back to 2025 Abstracts


LARGE LANGUAGE MODELS ENABLE ACCURATE DATA EXTRACTION AND CURATION FROM SYNOPTIC RADIOLOGY REPORTS FOR PANCREATIC CYST SURVEILLANCE
Ankur P. Choubey*1, Emanuel Eguia1, Alexander Hollingsworth2, Subrata Chatterjee2, Remo Alessandris1, Misha Armstrong1, Emily Manin1, Lily V. Saadat1, Jenny Flood1, Avijit Chatterjee2, Vinod Balachandran1, Jeffrey Drebin1, T. Peter Kingham1, Michael D'Angelica1, William Jarnagin1, Alice Wei1, Vineet S. Rolston3, Mark A. Schattner3, Kevin Soares1
1Hepatopancreatobiliary Surgery, Memorial Sloan Kettering Cancer Center Department of Surgery, New York, NY; 2Memorial Sloan Kettering Cancer Center, New York, NY; 3Memorial Sloan Kettering Cancer Center Department of Medicine, New York, NY

Introduction: Intraductal papillary mucinous neoplasms (IPMNs) are pre-malignant lesions that require long-term surveillance. Manual curation of radiographic features in cyst registries for data abstraction and longitudinal evaluation is time-consuming and limits widespread implementation. Our aim was to examine the accuracy and feasibility of using large language models (LLMs) to extract clinical variables from radiology reports.
Methods: A single-center retrospective study was performed including all patients under surveillance for pancreatic cysts. Five radiographic elements used to monitor cyst progression were evaluated: cyst size, main pancreatic duct (MPD) dilation ≥5 mm, MPD size, branch duct dilation, and presence of a solid component. LLMs on the OpenAI GPT-4 platform were used to extract these elements with a zero-shot approach, relying on prompting alone without any training data. A manually annotated institutional cyst database served as the gold standard for determining accuracy, sensitivity, and specificity.
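As an illustration of the zero-shot step described in the Methods, the sketch below assembles an instruction-only prompt (no labeled examples) and parses a structured reply. It is not the authors' prompt or pipeline: the field names, the JSON reply format, and the helper functions `build_prompt` and `parse_reply` are all hypothetical, and no model call is shown.

```python
import json

# Hypothetical field names and descriptions. The abstract lists the five
# radiographic elements but does not give a schema or the actual prompt.
FIELDS = {
    "cyst_size_mm": "largest cyst diameter in millimeters (number or null)",
    # The 5 mm threshold is an assumed reading of the Methods text.
    "mpd_dilation": "main pancreatic duct dilated to at least 5 mm (true, false, or null)",
    "mpd_size_mm": "main pancreatic duct diameter in millimeters (number or null)",
    "branch_duct_dilation": "branch duct dilation present (true, false, or null)",
    "solid_component": "solid component present (true, false, or null)",
}


def build_prompt(report_text: str) -> str:
    """Assemble a zero-shot instruction: a task description with no labeled examples."""
    field_lines = [f'- "{name}": {desc}' for name, desc in FIELDS.items()]
    return (
        "Extract the following elements from the radiology report below. "
        "Reply with a single JSON object using exactly these keys; "
        "use null when the report does not state a value.\n"
        + "\n".join(field_lines)
        + "\n\nReport:\n"
        + report_text
    )


def parse_reply(reply: str) -> dict:
    """Parse a JSON reply, keep only the expected keys, and fill missing ones with None."""
    data = json.loads(reply)
    return {name: data.get(name) for name in FIELDS}
```

Restricting the parsed output to a fixed set of keys makes each extracted record directly comparable, field by field, with a manually annotated entry.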
Results: Overall, 3199 scans from 991 patients were included. LLMs successfully extracted the selected radiographic elements with high accuracy. Among categorical variables, LLMs demonstrated accuracy rates of 98% for MPD dilation, 95% for branch duct dilation, and 97% for solid component compared to the manually annotated database. Accuracy rates for numerical data were 91% for cyst size and 97% for MPD size. Sensitivity ranged from 72% for presence of solid component to 97% for cyst size. Specificity varied from 89% for cyst size to 99% for presence of solid component.
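For a categorical element such as presence of a solid component, accuracy, sensitivity, and specificity follow from a confusion matrix that compares the LLM output with the manual annotation. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy labels are illustrative and are not study data. How the abstract's sensitivity and specificity were defined for numerical elements is not stated, so only the binary case is shown.

```python
def binary_metrics(gold, predicted):
    """Compare predicted boolean labels with gold-standard labels.

    Returns accuracy, sensitivity, and specificity; a metric is None
    when its denominator is zero.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = tn = fp = fn = 0
    for g, p in zip(gold, predicted):
        if g and p:
            tp += 1          # feature present, model reported it
        elif not g and not p:
            tn += 1          # feature absent, model reported absent
        elif p:
            fp += 1          # feature absent, model reported it
        else:
            fn += 1          # feature present, model missed it

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy labels for illustration only (not from the study).
gold = [True, True, False, False, False]
predicted = [True, False, False, False, True]
metrics = binary_metrics(gold, predicted)
```

When a feature is rare, as a solid component typically is, accuracy and specificity can stay high while sensitivity is noticeably lower, which is why the three metrics are reported separately.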
Conclusion: LLMs can accurately extract and curate data from synoptic radiology reports for pancreatic cyst surveillance and can be reliably used to assemble longitudinal databases. Future application of this work may potentiate the development of artificial intelligence-based surveillance models.

