IMPLEMENTATION OF AN AUTOMATED DATA ABSTRACTION WORKFLOW TO FACILITATE QUALITY IMPROVEMENT AND RESEARCH
Anai N. Kothari*1, Elsa M. Arvide1, Andrew Trans2, Timothy E. Newhook1, Morgan L. Bruno1, Whitney L. Dewhurst1, Thomas Aloia1, Stephen G. Swisher1, Jeffrey E. Lee1, Matthew Katz1, Jean-Nicolas Vauthey1, Ching-Wei D. Tzeng1
1The University of Texas MD Anderson Cancer Center Division of Surgery, Houston, TX; 2Palantir Technologies, San Francisco, CA
Extracting data from the electronic health record (EHR) can be time-consuming and resource-intensive. Reasons for this include underperformance of commercial EHR-based tools, large volumes of unstructured data, and inaccurate manual data entry. A possible solution is to develop standardized EHR documents and pair them with tools for data abstraction. The objective of this study was to create an automated workflow to abstract data from discharge summaries and to measure the performance of this approach.
A synoptic discharge summary note template was created for the hepatopancreatobiliary surgery service and implemented in October 2017. This note template included fields for date of ambulation, nasogastric tube removal, Foley catheter removal, diet initiation, and drain removal, as well as discharge pain management including opioid dosing (Figure 1). The workflow for automated abstraction was: 1) extraction of the plain text of discharge summaries using the Palantir Foundry data platform; 2) application of custom regular expressions defining the text pattern for each data element; 3) conversion to tabular format for analysis (Figure 2). Performance of this approach was measured by comparing data quality and extraction time against manual abstraction by 2 clinician reviewers on a sample of 20 patients.
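The regular-expression and tabular-conversion steps of the workflow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field labels, date format, and sample note below are hypothetical, since the abstract does not give the template wording, and the Palantir Foundry extraction step is replaced here by a plain string.

```python
import csv
import io
import re

# Hypothetical field labels and date format: the real template wording
# is not given in the abstract, so these patterns are illustrative only.
DATE = r"(\d{1,2}/\d{1,2}/\d{4})"
FIELD_PATTERNS = {
    "ambulation_date": re.compile(r"Date of ambulation:\s*" + DATE, re.IGNORECASE),
    "ng_tube_removal_date": re.compile(r"NG tube removed:\s*" + DATE, re.IGNORECASE),
    "foley_removal_date": re.compile(r"Foley catheter removed:\s*" + DATE, re.IGNORECASE),
    "diet_initiation_date": re.compile(r"Diet initiated:\s*" + DATE, re.IGNORECASE),
    "drain_removal_date": re.compile(r"Drain removed:\s*" + DATE, re.IGNORECASE),
    "discharge_opioid": re.compile(r"Discharge opioid:\s*(.+)", re.IGNORECASE),
}


def abstract_note(text):
    """Return one row (dict) per note; fields that do not match are None."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        row[name] = match.group(1).strip() if match else None
    return row


def to_csv(rows):
    """Convert abstracted rows to CSV text for downstream analysis."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical templated note standing in for extracted plain text.
SAMPLE_NOTE = """Date of ambulation: 10/03/2017
NG tube removed: 10/04/2017
Foley catheter removed: 10/04/2017
Diet initiated: 10/05/2017
Drain removed: 10/08/2017
Discharge opioid: oxycodone 5 mg
"""

if __name__ == "__main__":
    print(to_csv([abstract_note(SAMPLE_NOTE)]))
```

Returning None for unmatched fields keeps notes written without the template in the table as empty rows rather than silently dropping them, which makes template adoption easy to audit.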
A total of 1,682 discharges (1,447 patients) between October 1, 2017, and October 1, 2020, included a discharge summary. Of these, 1,049 (62.4%) used the standardized discharge summary template. Adoption of the template increased over time (45.0% in year 1 to 94.8% in year 3). Manual abstraction of the 6 data elements without assistance from the templated document took 219 seconds (IQR: 194-270) per patient, compared with near-instantaneous results with the automated approach. Automated abstraction had an accuracy of 97.1%. For just 6 simple data elements, a transition to a completely automated approach could save approximately 60.8 hours of uninterrupted human data collection per 1,000 patients.
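The time-savings and template-use figures follow directly from the reported numbers. The short check below assumes, as the abstract implies, that automated abstraction time is negligible and that every patient would otherwise take the reported 219 seconds; the variable names are illustrative.

```python
# Reported manual abstraction time per patient, in seconds.
MANUAL_SECONDS_PER_PATIENT = 219
PATIENTS = 1_000

# Assumption: automated abstraction time is treated as zero.
hours_saved = MANUAL_SECONDS_PER_PATIENT * PATIENTS / 3600

# Share of discharge summaries that used the standardized template.
template_share = 1_049 / 1_682 * 100

print(f"{hours_saved:.1f} hours saved per {PATIENTS:,} patients")
print(f"{template_share:.1f}% of discharge summaries used the template")
```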
Developing a structured discharge summary template and pairing it with an automated data extraction workflow facilitates accurate and efficient data abstraction. A similar approach could be used for other previously unstructured data streams including operative reports, daily progress notes, and pathology reports. This strategy minimizes resource-intensive manual data abstraction and does not rely on limited built-in commercial tools.