Abstract: SA-PO1096
Using a Large Language Model to Process Kidney Transplant Biopsy Reports for Accurate Data Extraction for Clinical Research
Session Information
- Transplantation: Clinical - Postkidney Transplant Outcomes and Potpourri
November 08, 2025 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Transplantation
- 2102 Transplantation: Clinical
Authors
- Ni, Luke, University of Pennsylvania, Philadelphia, Pennsylvania, United States
- Tandukar, Srijan, University of Pennsylvania, Philadelphia, Pennsylvania, United States
Background
Manually extracting data from unstructured text is both time-consuming and labor-intensive, and it is prone to errors that reduce efficiency and data reliability. To address this, we deployed a large language model (LLM) to extract data from kidney transplant biopsy reports and benchmarked its accuracy and speed.
Methods
Kidney transplant biopsy reports were obtained through SQL queries against Clarity, the reporting database of the Epic electronic health record system. Raw data exported as .csv files were loaded into Python. A 12-billion-parameter LLM was selected. Per HIPAA guidelines, no application programming interfaces (APIs) were used; all data processing was run locally on a device with an NVIDIA RTX 3070 Ti GPU. Variables and data types for extraction were defined a priori. Prompts were refined by trial and error using a one-shot approach, in which a single worked example is included in the prompt. The outputs were cross-checked through manual validation to ensure correctness.
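The extraction step described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the variable names in SCHEMA, the CSV column names (biopsy_id, report_text), the example report, and the prompt wording are all hypothetical, and the locally hosted model is represented by a generic `generate` callable that takes a prompt string and returns the model's text.

```python
import csv
import io
import json

# Hypothetical variables and data types, defined a priori (not from the abstract).
SCHEMA = {
    "rejection_present": bool,
    "banff_i_score": int,
    "diagnosis": str,
}

# A single worked example placed in the prompt (one-shot prompting). Hypothetical.
EXAMPLE_REPORT = "Mild interstitial inflammation (i1). No evidence of rejection."
EXAMPLE_OUTPUT = {
    "rejection_present": False,
    "banff_i_score": 1,
    "diagnosis": "no rejection",
}


def build_prompt(report_text):
    """Build a one-shot prompt: instructions, one solved example, then the new report."""
    fields = ", ".join(f"{name} ({typ.__name__})" for name, typ in SCHEMA.items())
    return (
        "Extract the following fields from the kidney transplant biopsy report "
        f"and answer with JSON only: {fields}.\n\n"
        f"Report: {EXAMPLE_REPORT}\n"
        f"JSON: {json.dumps(EXAMPLE_OUTPUT)}\n\n"
        f"Report: {report_text}\n"
        "JSON:"
    )


def parse_response(raw):
    """Return a dict matching SCHEMA, or None if the output is not valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    row = {}
    for name, typ in SCHEMA.items():
        value = data.get(name)
        # bool is a subclass of int in Python, so reject booleans for int fields.
        if typ is int and isinstance(value, bool):
            return None
        if not isinstance(value, typ):
            return None
        row[name] = value
    return row


def extract_all(csv_text, generate):
    """Run extraction over every report in the CSV text.

    `generate` stands in for the local LLM: any callable mapping a prompt
    string to the model's text output. Rows whose output fails validation
    get `extracted=None` so they can be flagged for manual review.
    """
    results = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        raw = generate(build_prompt(record["report_text"]))
        results.append(
            {"biopsy_id": record["biopsy_id"], "extracted": parse_response(raw)}
        )
    return results
```

Keeping the model behind a plain callable means the same code can be exercised with a stub during development and pointed at a locally hosted model afterwards, so no report text has to leave the device. Rows returned with `extracted` set to None are natural candidates for the manual validation step.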
Results
The workflows for traditional (manual) and LLM-based data extraction are presented in Figure 1, and the extracted output data are presented in Table 1.
Conclusion
Data extraction from unstructured text has traditionally been done manually by research assistants and trainees. We demonstrate that an LLM can achieve comparable accuracy in data extraction, saving researchers time and resources.
Figure 1. Manual versus Large Language Model Extraction Workflow Diagram for Kidney Biopsies
Table 1. Large Language Model Extraction Data of Kidney Biopsies