Kidney Week

Abstract: TH-PO006

Exploring ChatGPT's Aptitude in Essential Concepts of Hypertension

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science

Authors

  • Gonzalez Suarez, Maria Lourdes, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Schwartz, Gary L., Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Gregoire, James Robert, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Erickson, Stephen B., Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Thongprayoon, Charat, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Cheungpasitporn, Wisit, Mayo Clinic Minnesota, Rochester, Minnesota, United States
Background

ChatGPT is a state-of-the-art language model with human-like response generation capacity for various tasks. While there are debates about the possibility of ChatGPT replacing clinicians in clinical settings, its competence in nephrology, specifically in hypertension, remains uncertain. This study aims to assess ChatGPT's proficiency in addressing fundamental queries related to the diagnosis, treatment, and management of hypertension.

Methods

We evaluated ChatGPT's accuracy in answering hypertension questions from four issues of the American Society of Nephrology's Nephrology Self-Assessment Program (NephSAP) published between 2016 and 2022: V15N1, V17N1, V19N1, and V21N4. We excluded questions containing images because of ChatGPT's current limitations in image processing. The analysis included 95 NephSAP questions. Each question set was run 3 times using ChatGPT (version Mar 14, OpenAI), with attempts conducted 2 weeks apart, and we determined the level of agreement between the initial and subsequent attempts.

Results

ChatGPT achieved an accuracy of 65.5% on the first attempt, 76.4% on the second, and 78.1% on the third for the NephSAP questions. It gave more correct than incorrect answers on every attempt, and its accuracy increased with each successive attempt (Table 1).

Conclusion

Our findings indicate that ChatGPT's accuracy in addressing core concepts related to hypertension management falls below the minimum passing threshold of 75% established by the ASN for nephrologists, with an initial accuracy rate of 65.5%. This emphasizes the need for further development and training to improve ChatGPT's accuracy and consistency in the area of hypertension. Our study's outcomes have significant implications for ChatGPT's potential use as an educational tool for clinicians, highlighting the importance of ongoing research and development to broaden its proficiency in clinical subspecialties.

Accuracy of ChatGPT on Hypertension Questions

KSAP Issue        First Attempt (%)   Second Attempt (%)   Third Attempt (%)
V15N1             62                  75.9                 79.3
V17N1             83.3                93.3                 93.3
V19N1             53.3                63.3                 66.6
V21N4*            63.3                73.3                 73.3
Total accuracy    65.48               76.45                78.12

* Questions 1-25
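
The abstract does not state how the "Total accuracy" row was computed. As a rough check, the following sketch takes the per-issue percentages from the table and averages them per attempt without weighting by question count. This is an assumption rather than the authors' stated method, but it agrees with the reported totals to within rounding. The function names and the `PASS_THRESHOLD` constant are illustrative; the 75% value is the passing threshold cited in the Conclusion.

```python
# Per-issue accuracies (%) copied from the table above:
# (first attempt, second attempt, third attempt).
ACCURACY_BY_ISSUE = {
    "V15N1": (62.0, 75.9, 79.3),
    "V17N1": (83.3, 93.3, 93.3),
    "V19N1": (53.3, 63.3, 66.6),
    "V21N4": (63.3, 73.3, 73.3),
}

# Minimum passing score cited in the Conclusion (illustrative constant name).
PASS_THRESHOLD = 75.0


def mean_accuracy_per_attempt(table):
    """Unweighted mean of the per-issue accuracies for each attempt.

    Assumption: every issue counts equally, regardless of how many
    questions it contributed. The abstract does not confirm this.
    """
    columns = zip(*table.values())  # one tuple of issue scores per attempt
    return [sum(col) / len(col) for col in columns]


def summarize(table, threshold=PASS_THRESHOLD):
    """Return (attempt number, mean accuracy, meets threshold) per attempt."""
    means = mean_accuracy_per_attempt(table)
    return [
        (attempt, mean, mean >= threshold)
        for attempt, mean in enumerate(means, start=1)
    ]


if __name__ == "__main__":
    for attempt, mean, passed in summarize(ACCURACY_BY_ISSUE):
        status = "meets" if passed else "is below"
        print(f"Attempt {attempt}: {mean:.2f}% {status} the {PASS_THRESHOLD:.0f}% threshold")
```

Under this assumption, only the first attempt falls below the 75% threshold, which matches the Conclusion's focus on the initial accuracy rate.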