Kidney Week

Abstract: TH-PO001

Revolutionizing Kidney Transplantation Education: Evaluating ChatGPT's Accuracy on Core Questions

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science

Authors

  • Garcia Valencia, Oscar Alejandro, Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Bentall, Andrew J., Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Issa, Naim S., Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • El Ters, Mireille, Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Craici, Iasmina, Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Thongprayoon, Charat, Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Davis, Paul W., Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States
  • Cheungpasitporn, Wisit, Mayo Clinic Division of Nephrology and Hypertension, Rochester, Minnesota, United States

Background

ChatGPT is a state-of-the-art, high-capacity language model that has shown proficiency in various language processing tasks, such as generating human-like responses. While there is growing speculation about ChatGPT serving as a potential substitute for physicians in a clinical environment, its proficiency in clinical subspecialties remains unclear. The aim of this study is to evaluate the performance of ChatGPT in answering questions related to kidney transplantation.

Methods

We evaluated ChatGPT's accuracy in answering questions related to kidney transplantation using the Nephrology Self-Assessment Program (NephSAP) and Kidney Self-Assessment Program (KSAP) of the American Society of Nephrology (ASN). Questions containing images were excluded due to current limitations in ChatGPT's image processing capabilities. A total of 117 questions were included in the evaluation, 60 from NephSAP and 57 from KSAP. Each question bank was run through ChatGPT (Mar 14 version, OpenAI) twice, 2 weeks apart, and the level of concordance between the two runs was determined.
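The two metrics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual analysis code: the function names and the five-question answer key are hypothetical, and concordance is assumed here to mean that both runs selected the same answer choice, whether or not that choice was correct.

```python
def accuracy(answers, key):
    """Fraction of answers that match the answer key."""
    assert len(answers) == len(key)
    return sum(a == k for a, k in zip(answers, key)) / len(key)


def concordance(run1, run2):
    """Fraction of questions given the same answer in both runs.

    Assumption: a repeated wrong answer still counts as concordant.
    """
    assert len(run1) == len(run2)
    return sum(a == b for a, b in zip(run1, run2)) / len(run1)


# Hypothetical toy data (not taken from NephSAP or KSAP).
key = ["A", "C", "B", "D", "A"]
run1 = ["A", "C", "D", "D", "B"]
run2 = ["A", "C", "D", "D", "A"]

print(f"run 1 accuracy: {accuracy(run1, key):.1%}")
print(f"run 2 accuracy: {accuracy(run2, key):.1%}")
print(f"concordance:    {concordance(run1, run2):.1%}")
```

In the toy data, question 3 is answered "D" in both runs although the key is "B", so it lowers accuracy but still counts toward concordance.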

Results

On the 60 NephSAP questions, ChatGPT achieved an accuracy of 58.3% on both the 1st and 2nd runs, with a concordance of 86.7% between runs. On the KSAP question bank (57 questions), the accuracy of ChatGPT was 47.4% and 42.1% on the 1st and 2nd runs, respectively, with a concordance of 68.4%. The overall concordance between the two runs across both question banks was 77.8%. Concordant correct answers were more frequent than concordant incorrect answers (43.6% vs 34.2%).

Conclusion

Upon evaluating ChatGPT's performance in answering questions related to kidney transplantation, we found that its accuracy was below the passing threshold set by the ASN for both NephSAP and KSAP. Excluding questions with clinical images, the overall accuracy of ChatGPT (NephSAP + KSAP) was 53% on the 1st run and 50.4% on the 2nd. From this we conclude that the current version of ChatGPT is not yet a reliable medical education tool for training nephrologists and requires further development.
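The pooled figures can be checked against the per-bank results with a short calculation. The sketch below is a consistency check only: the abstract reports percentages, not raw counts, so the counts used here (for example, 35 of 60 for 58.3%) are back-calculated assumptions, and the variable names are illustrative.

```python
# Counts back-calculated from the reported percentages (assumption:
# the abstract itself gives only percentages and question totals).
banks = {
    "NephSAP": {"n": 60, "correct_run1": 35, "correct_run2": 35, "concordant": 52},
    "KSAP": {"n": 57, "correct_run1": 27, "correct_run2": 24, "concordant": 39},
}


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


total_n = sum(bank["n"] for bank in banks.values())

# Pool each metric by summing counts across banks, then dividing by 117.
pooled = {
    field: pct(sum(bank[field] for bank in banks.values()), total_n)
    for field in ("correct_run1", "correct_run2", "concordant")
}

for name, bank in banks.items():
    print(name, pct(bank["correct_run1"], bank["n"]),
          pct(bank["correct_run2"], bank["n"]),
          pct(bank["concordant"], bank["n"]))
print("Pooled", pooled)
```

Under these assumed counts, the pooled values round to the 53% and 50.4% accuracies and the 77.8% concordance reported in the abstract.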