Kidney Week

Abstract: FR-PO0024

Accuracy of Large Language Model Chatbots for Hemodialysis Meal Planning

Session Information

Artificial Intelligence and Digital Health at the Bedside
November 07, 2025 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM

Category: Artificial Intelligence, Digital Health, and Data Science

300 Artificial Intelligence, Digital Health, and Data Science

Authors

Shi, Kevin Xin, University of California San Francisco, San Francisco, California, United States

Hamdan, Hiba, University of California Davis, Davis, California, United States

Cheng, Elizabeth, University of California Berkeley, Berkeley, California, United States

Tuot, Delphine S., University of California San Francisco, San Francisco, California, United States

Background

For hemodialysis patients, nutritional counseling is key. Personalized and feasible nutrition counseling is challenging, as dietary habits are shaped by factors such as cultural background and budget. Large language model (LLM) chatbots can potentially improve nutritional counseling, but the accuracy of these tools in this context is unknown.

Methods

Four LLMs, ChatGPT-o3-mini (OpenAI), Claude Sonnet 3.7 (Anthropic), Gemini 2.5 Flash Thinking Experimental (Google), and Llama 3.1 (Meta), were asked to create a culturally concordant one-day meal plan with specified portions and nutrient content (calories, protein, fiber, calcium, phosphorus, potassium, and sodium) for 50 simulated hemodialysis patients generated from national US demographic, biometric, and socioeconomic data. The nutrient content stated in the LLM meal plans was compared with values from validated nutrition databases (e.g., USDA, AUSNUT).
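The comparison step can be sketched as a relative-error check. This is a minimal illustration, not the authors' analysis code: the function names `within_tolerance` and `accuracy_rate` are hypothetical, the sample values are invented for demonstration, and the 10% margin is taken from the Results.

```python
def within_tolerance(stated, reference, tolerance=0.10):
    """Return True if the LLM-stated value lies within `tolerance`
    (relative error) of the database reference value."""
    if reference == 0:
        # Relative error is undefined; only an exact match counts.
        return stated == 0
    return abs(stated - reference) / abs(reference) <= tolerance


def accuracy_rate(pairs, tolerance=0.10):
    """Fraction of (stated, reference) pairs that fall within tolerance."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no meal plans to score")
    hits = sum(within_tolerance(s, r, tolerance) for s, r in pairs)
    return hits / len(pairs)


# Hypothetical potassium values in mg (NOT study data):
# (LLM-stated value, reference database value) for four meal plans.
example_pairs = [(2000, 2100), (1800, 2500), (2200, 2150), (1500, 2400)]
rate = accuracy_rate(example_pairs)
print(f"Within 10% of reference: {rate:.0%}")
```

Applying `accuracy_rate` per model and per nutrient would give one percentage for each cell of an accuracy table.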

Results

The simulated population had a mean age of 63 years, 58% had diabetes, 82% had fixed incomes, and 60% had ethnic cuisine preferences. The stated nutritional content of chatbot meal plans was generally inaccurate. The nutrient components (e.g., calories, protein) of most LLM meal plans fell outside a 10% error margin relative to reference values more than 50% of the time (Table). Accuracy was worse for micronutrients than for macronutrients. All LLMs underestimated phosphorus and potassium content in meal plans (Figure).
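Underestimation, as opposed to inaccuracy in general, shows up in the sign of the error. The sketch below is an assumption about how such bias could be quantified, not the method used in the abstract; `signed_percent_error` and `mean_bias` are hypothetical names and the numbers are illustrative only.

```python
from statistics import mean


def signed_percent_error(stated, reference):
    """Signed error as a percentage of the reference value.
    Negative means the LLM understated the nutrient content."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (stated - reference) * 100 / reference


def mean_bias(pairs):
    """Average signed percent error over (stated, reference) pairs."""
    return mean(signed_percent_error(s, r) for s, r in pairs)


# Hypothetical phosphorus values in mg (NOT study data):
# (LLM-stated value, reference database value).
phosphorus_pairs = [(800, 1000), (900, 1200), (1000, 1000)]
bias = mean_bias(phosphorus_pairs)
print(f"Mean bias: {bias:+.1f}%")  # a negative value indicates underestimation
```

A consistently negative mean bias across models for phosphorus and potassium would correspond to the pattern described above.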

Conclusion

LLMs failed to generate nutritionally accurate meal plans for hemodialysis patients. Future work should focus model searches on validated data sources.

Accuracy (within 10%) of LLM Outputs

Model	Calories	Protein	Fiber	Calcium	Phosphorus	Potassium	Sodium
ChatGPT	42%	52%	28%	9%	24%	41%	18%
Claude	38%	34%	24%	8%	12%	10%	16%
Gemini	28%	50%	32%	10%	12%	8%	24%
Llama	12%	28%	26%	36%	6%	20%	8%
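The macronutrient versus micronutrient gap can be checked directly from the table. The sketch below transcribes the table values; the grouping of calories, protein, and fiber as "macro" and the four minerals as "micro" is an assumption made here for illustration, since the abstract does not define the groups, and the unweighted averaging across models is likewise a simplification.

```python
# Percent of meal plans with stated values within 10% of reference,
# transcribed from the table above.
ACCURACY = {
    "ChatGPT": {"Calories": 42, "Protein": 52, "Fiber": 28, "Calcium": 9,
                "Phosphorus": 24, "Potassium": 41, "Sodium": 18},
    "Claude": {"Calories": 38, "Protein": 34, "Fiber": 24, "Calcium": 8,
               "Phosphorus": 12, "Potassium": 10, "Sodium": 16},
    "Gemini": {"Calories": 28, "Protein": 50, "Fiber": 32, "Calcium": 10,
               "Phosphorus": 12, "Potassium": 8, "Sodium": 24},
    "Llama": {"Calories": 12, "Protein": 28, "Fiber": 26, "Calcium": 36,
              "Phosphorus": 6, "Potassium": 20, "Sodium": 8},
}

# Assumed grouping (not specified in the abstract).
MACRO = ("Calories", "Protein", "Fiber")
MICRO = ("Calcium", "Phosphorus", "Potassium", "Sodium")


def mean_accuracy(nutrients):
    """Unweighted mean accuracy (%) over all models for the given nutrients."""
    values = [row[n] for row in ACCURACY.values() for n in nutrients]
    return sum(values) / len(values)


print(f"Macro: {mean_accuracy(MACRO):.1f}%  Micro: {mean_accuracy(MICRO):.1f}%")
```

Under this grouping the averages come to roughly 32.8% for the macro group and 16.4% for the micro group, in line with the statement that accuracy was worse for micronutrients.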

Funding

Other NIH Support

Digital Object Identifier (DOI)

doi: 10.1681/ASN.20254x1xptbw

Authors

Kevin Xin Shi, MD

Hiba Hamdan, MBBS, MPH

Elizabeth Cheng

Delphine S. Tuot, MD
