Kidney Week

Abstract: TH-PO005

ChatGPT vs. a First-Year Nephrology Fellow in Electrolyte and Acid-Base Disorders

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science

Authors

  • Mekraksakit, Poemlarp, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Krisanapan, Pajaree, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Craici, Iasmina, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Kalantari, Kambiz, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Thongprayoon, Charat, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Cheungpasitporn, Wisit, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Background

ChatGPT is a leading natural language processing model known for generating human-like responses across a variety of tasks. This study assesses ChatGPT's proficiency in answering questions on electrolyte and acid-base disorders in nephrology.

Methods

We used questions from nephSAP and KSAP, provided by the American Society of Nephrology (ASN), to assess ChatGPT's accuracy in answering basic questions about electrolyte and acid-base disorders. Questions with images were excluded because ChatGPT cannot process images. We evaluated a total of 152 questions: 122 from KSAP and 30 from nephSAP. ChatGPT was tested twice, with the two runs conducted 1 to 2 weeks apart. For comparison, we used the scores of a first-year nephrology fellow who had studied this topic extensively. The complete set of questions can be found at https://education.asn-online.org/.
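
The two-run comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual procedure: the function names `score` and `classify_changes`, the question IDs, and the toy answers are all hypothetical, and it assumes each question has a single best answer recorded as a letter.

```python
from __future__ import annotations

from collections import Counter


def score(answers: dict[str, str], key: dict[str, str]) -> float:
    """Return the percentage of questions in `key` answered correctly."""
    correct = sum(answers[q] == right for q, right in key.items())
    return 100 * correct / len(key)


def classify_changes(
    run1: dict[str, str], run2: dict[str, str], key: dict[str, str]
) -> Counter:
    """Count how answers that differ between two runs moved relative to the key."""
    changes: Counter = Counter()
    for q, right in key.items():
        first, second = run1[q], run2[q]
        if first == second:
            continue  # answer unchanged between runs
        if first != right and second == right:
            changes["incorrect_to_correct"] += 1
        elif first == right and second != right:
            changes["correct_to_incorrect"] += 1
        else:
            # Both answers wrong, but different from each other.
            changes["incorrect_to_incorrect"] += 1
    return changes


# Hypothetical toy data (NOT from the study): four questions, two runs.
key = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}
run1 = {"q1": "A", "q2": "B", "q3": "B", "q4": "A"}
run2 = {"q1": "A", "q2": "C", "q3": "D", "q4": "B"}

print(score(run1, key), score(run2, key))
print(classify_changes(run1, run2, key))
```

In this toy case both runs score the same even though three of the four answers changed, which is why accuracy and consistency are worth tracking separately.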

Results

On the 122 KSAP questions, ChatGPT achieved accuracies of 32.8% and 37.7% on the first and second runs, respectively. In comparison, the first-year nephrology fellow achieved an accuracy of 76.2%. On the 30 nephSAP questions, ChatGPT achieved an accuracy of 50% on the first run and 53.3% on the second run, whereas the first-year nephrology fellow correctly answered 83% of the questions. Notably, ChatGPT changed its answers on the second run for 56 of the 152 questions (36.8%). Among these 56 questions, ChatGPT changed its answer from incorrect to correct in 18 cases and from correct to incorrect in 10 cases.
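
The arithmetic behind these figures can be checked with a short Python sketch. All numbers are copied from the abstract (the 75% threshold comes from the Conclusion); the variable names are hypothetical, and reading the leftover changed answers as "one incorrect answer swapped for another" is an assumption that holds only if every question has a single best answer.

```python
# Figures copied from the abstract's Results and Conclusion paragraphs.
TOTAL_QUESTIONS = 152
CHANGED = 56
INCORRECT_TO_CORRECT = 18
CORRECT_TO_INCORRECT = 10
PASSING_THRESHOLD = 75.0  # percent, per the Conclusion

# Reported accuracies in percent, keyed by (test taker, question bank, attempt).
REPORTED_ACCURACY = {
    ("ChatGPT", "KSAP", "run 1"): 32.8,
    ("ChatGPT", "KSAP", "run 2"): 37.7,
    ("ChatGPT", "nephSAP", "run 1"): 50.0,
    ("ChatGPT", "nephSAP", "run 2"): 53.3,
    ("Fellow", "KSAP", "single attempt"): 76.2,
    ("Fellow", "nephSAP", "single attempt"): 83.0,
}

# Share of questions whose answer changed between runs.
change_rate = round(100 * CHANGED / TOTAL_QUESTIONS, 1)  # 36.8

# Changed answers not accounted for by the two reported categories
# (assumed here to be incorrect-to-incorrect switches).
other_changes = CHANGED - INCORRECT_TO_CORRECT - CORRECT_TO_INCORRECT  # 28

# Which reported scores meet the passing threshold?
passed = {who: acc >= PASSING_THRESHOLD for who, acc in REPORTED_ACCURACY.items()}

print(f"Answers changed: {change_rate}% of questions")
print(f"Changes outside the two reported categories: {other_changes}")
for (taker, bank, attempt), ok in passed.items():
    print(f"{taker:8} {bank:8} {attempt:15} {'pass' if ok else 'below threshold'}")
```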

Conclusion

ChatGPT's proficiency in addressing electrolyte and acid-base disorders in nephrology is limited. It did not reach the minimum passing threshold of 75% set by the ASN for nephrologists, its accuracy was lower than that of a dedicated first-year nephrology fellow, and its responses were inconsistent between runs. Therefore, ChatGPT is not a suitable replacement for human clinicians in this clinical setting.