Kidney Week

Abstract: SA-PO1227

Automated Structured Medical Data Extraction from Audio Recordings of Outpatient Nephrology Encounters Using Large Language Models

Session Information

Category: CKD (Non-Dialysis)

  • 2302 CKD (Non-Dialysis): Clinical, Outcomes, and Trials

Authors

  • Neri, Luca, Renal Research Institute, New York, New York, United States
  • Kovarova, Vratislava, Fresenius Medical Care AG, Bad Homburg, HE, Germany
  • Morillo Navarro, Kevin, Fresenius Medical Care AG, Bad Homburg, HE, Germany
  • Silvestre-Llopis, Jordi, Fresenius Medical Care AG, Bad Homburg, HE, Germany
  • Nehezova, Katarina, Fresenius Medical Care AG, Bad Homburg, HE, Germany
  • Barbieri, Carlo, Fresenius Medical Care Italia SpA, Palazzo Pignano, Lombardia, Italy
  • Bellocchio, Francesco, Renal Research Institute, New York, New York, United States
  • Usvyat, Len A., Renal Research Institute, New York, New York, United States
  • Casana-Eslava, Raul Vicente, Fresenius Medical Care AG, Bad Homburg, HE, Germany
Background

Physician documentation often imposes a substantial time burden and can be incomplete. We developed and tested an end-to-end tool that leverages large language models to extract structured medical data from encounter recordings, with the goal of generating complete visit summaries while minimizing manual post-editing.

Methods

Fifteen outpatient follow-up visits for non-dialysis-dependent chronic kidney disease, conducted in a Nephrocare clinic in the Czech Republic, were processed sequentially: (1) audio-to-text transcription with Whisper; (2) machine translation to English via GPT-4; and (3) extraction of 25 ontology-defined data elements (Visit Type, History, Condition Evaluation, Vitals, Medication Review) using 33 GPT-4 prompts, each specifying the required fields for its element. All elements were counted in the evaluation. Accuracy was assessed against annotations made by the attending physicians.
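The abstract does not report implementation details; the Python sketch below illustrates one way such a three-stage pipeline could be assembled. The model choices, prompt wording, and the VITALS_PROMPT element are illustrative assumptions, not the authors' actual configuration.

```python
# Illustrative sketch of a transcribe -> translate -> extract pipeline.
# Model names, prompts, and the ontology element shown are assumptions for
# demonstration only; they are not the study's actual setup.
import json
import whisper                 # openai-whisper package
from openai import OpenAI

client = OpenAI()              # reads OPENAI_API_KEY from the environment


def transcribe(audio_path: str) -> str:
    """Step 1: audio-to-text transcription with Whisper."""
    model = whisper.load_model("medium")
    return model.transcribe(audio_path)["text"]


def translate_to_english(text: str) -> str:
    """Step 2: machine translation of the source-language transcript to English."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Translate the following clinical transcript to English."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content


def extract_element(transcript_en: str, element_prompt: str) -> dict:
    """Step 3: extract one ontology-defined element as structured JSON."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": element_prompt},
            {"role": "user", "content": transcript_en},
        ],
    )
    # Assumes the model returns valid JSON; a production pipeline would validate.
    return json.loads(resp.choices[0].message.content)


# Hypothetical prompt for a single element, specifying its required fields.
VITALS_PROMPT = (
    "Extract the patient's vital signs from the visit transcript. "
    "Return JSON with fields: systolic_bp, diastolic_bp, heart_rate, weight_kg. "
    "Use null for any field not mentioned; do not guess."
)

transcript = translate_to_english(transcribe("encounter.wav"))
vitals = extract_element(transcript, VITALS_PROMPT)
```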

Results

Against the gold-standard annotations, the pipeline achieved a macro-averaged F1 of 0.87, with 100% precision and 78% recall overall. Visit type, vitals, anthropometrics, recommendations, and some elements of the physical examination and medical history exceeded an F1 of 0.92. Lower accuracy was obtained for lab values and medication names (F1 = 0.71 and F1 = 0.67, respectively). We observed zero false positives (i.e., no hallucinations) across 5 hours of recordings.
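For clarity on the reported metrics, the sketch below shows how per-element precision, recall, and a macro-averaged F1 can be computed from true-positive, false-positive, and false-negative counts against the physicians' annotations. The counts shown are placeholders, not the study's data.

```python
# Sketch of the evaluation metrics reported above; counts are placeholders.
from dataclasses import dataclass


@dataclass
class ElementCounts:
    tp: int  # values extracted and matching the physician annotation
    fp: int  # values extracted but absent or incorrect in the annotation
    fn: int  # annotated values the pipeline missed


def f1(c: ElementCounts) -> float:
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0


# Macro-averaged F1: compute F1 per ontology element, then take the unweighted mean.
counts_per_element = {
    "visit_type": ElementCounts(tp=15, fp=0, fn=0),   # placeholder counts
    "lab_values": ElementCounts(tp=40, fp=0, fn=18),  # placeholder counts
}
macro_f1 = sum(f1(c) for c in counts_per_element.values()) / len(counts_per_element)
```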

Conclusion

GPT-4 can reliably automate transcription, translation, and structured extraction from non-English clinical audio, achieving high accuracy and minimal hallucination without manual correction. Future work will validate the approach on larger, multilingual cohorts, perform granular error analyses, and pilot real-time scribing with physician oversight.
