Abstract: FR-PO0448
Assessing Open-Weight, Large Language Model for Symptom Extraction from Dialysis Notes: A Cost-Effective and Privacy-Preserving Approach
Session Information
- Dialysis: Measuring and Managing Symptoms and Syndromes
November 07, 2025 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Dialysis
- 801 Dialysis: Hemodialysis and Frequent Dialysis
Authors
- Baroz, Frederic, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
- Festa, Maria Carolina, Centre Hospitalier de l'Universite de Montreal, Montreal, Quebec, Canada
- Suri, Rita, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
- Mavrakanas, Thomas A., Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
- Beaubien-Souligny, William, Centre Hospitalier de l'Universite de Montreal, Montreal, Quebec, Canada
Background
Hemodialysis often leads to intradialytic symptoms that affect quality of life. Traditional methods of studying these symptoms rely on patient questionnaires or free-text dialysis notes, both of which pose cost and quality challenges. This project evaluates open-source large language models (LLMs) that can run offline to extract symptoms from nursing notes while preserving data privacy.
Methods
We designed 36 information extraction agents by combining four open-source LLMs with distinct prompting approaches, such as in-context learning, chain-of-thought, and rule-based guidance. Each agent identified the presence of four target symptoms across 100 dialysis nursing notes. Outputs were compared to expert annotations performed independently and without knowledge of the agents’ results. Evaluation metrics included accuracy, recall (sensitivity), precision, specificity, and balanced F1 score.
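As an illustration of the metrics named above, the following is a minimal sketch of how one agent's outputs for a single symptom could be scored against expert annotations. It assumes each note carries a binary label (1 = symptom present, 0 = absent). The names evaluate and BinaryMetrics, and the example labels, are hypothetical and are not taken from the study. The F1 shown is the standard harmonic mean of precision and recall, which may differ from the exact variant the authors used.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    recall: float       # also called sensitivity
    precision: float
    specificity: float
    f1: float


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 when a metric is undefined (zero denominator).
    return num / den if den else 0.0


def evaluate(expert: list[int], agent: list[int]) -> BinaryMetrics:
    """Score binary agent outputs against expert labels for one symptom."""
    if len(expert) != len(agent):
        raise ValueError("expert and agent label lists must have equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for e, a in zip(expert, agent) if e and a)
    tn = sum(1 for e, a in zip(expert, agent) if not e and not a)
    fp = sum(1 for e, a in zip(expert, agent) if not e and a)
    fn = sum(1 for e, a in zip(expert, agent) if e and not a)

    recall = _safe_div(tp, tp + fn)
    precision = _safe_div(tp, tp + fp)
    return BinaryMetrics(
        accuracy=_safe_div(tp + tn, tp + tn + fp + fn),
        recall=recall,
        precision=precision,
        specificity=_safe_div(tn, tn + fp),
        f1=_safe_div(2 * precision * recall, precision + recall),
    )


if __name__ == "__main__":
    # Hypothetical labels for ten notes (illustrative only, not study data).
    expert_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    agent_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(evaluate(expert_labels, agent_labels))
```

Reporting specificity alongside recall and precision matters here because symptoms are rare in the notes, so accuracy alone can look high even when an agent misses most positive cases.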
Results
Among 100 nursing notes, chest pain was noted twice, dyspnea 14 times, dizziness 23 times, and nausea 16 times. The best-performing agent, based on Mixtral 8x7B paired with a simple prompt, achieved 93% specificity, 71% sensitivity, and 63% precision. Performance remained strong across most symptoms except for nausea, and generally decreased with smaller models or more elaborate prompting strategies (Figure).
Conclusion
This study provides the first assessment of open-source LLMs operating offline on consumer-level hardware for extracting symptoms from dialysis notes. It offers an accessible, low-cost method to support research into the quality of life of dialysis patients.
Figure: Performance of the top 6 agents. All agents use Mixtral-8x7B-instruct-v0.1 with various prompting strategies.
Funding
- Government Support – Non-U.S.