Kidney Week

Abstract: FR-PO0006

Machine Learning-Based Predictive Model for CKD Development in a Healthy Population

Session Information

Category: Artificial Intelligence, Digital Health, and Data Science

  • 300 Artificial Intelligence, Digital Health, and Data Science

Authors

  • Kuma, Akihiro, Hyogo Medical University, Nishinomiya, Japan
  • Kanda, Eiichiro, Kawasaki Medical School, Kurashiki, Japan
  • Kato, Akihiko, Kosai Municipal Hospital, Kosai, Japan
  • Kuragano, Takahiro, Hyogo Medical University, Nishinomiya, Japan

Background

Lifestyle-related diseases are known risk factors for the development of chronic kidney disease (CKD), which remains a significant public health concern. Thus, a new predictive system for CKD development is needed for use in an apparently healthy population.

Methods

This study used annual health checkup data from Japanese individuals aged 18–65 years, collected from 2017 to 2022. Participants with an estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m² or proteinuria ≥1+ in 2017 were excluded. CKD development was defined as an eGFR <60 mL/min/1.73 m² or proteinuria in 2022. To develop a predictive model for CKD onset over a five-year period, artificial intelligence (AI) was integrated with supervised machine learning techniques. The dataset included blood and urine test results, as well as responses to self-administered questionnaires. Four algorithms (logistic regression, support vector machine [SVM], random forest, and XGBoost) were trained using data from 2017 to 2021.
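The exclusion and outcome rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Checkup` record, its field names, and the integer dipstick encoding are hypothetical, and the outcome rule assumes that "proteinuria" in 2022 means a grade of 1+ or higher, which the abstract states only for the baseline exclusion.

```python
from dataclasses import dataclass

EGFR_THRESHOLD = 60.0  # mL/min/1.73 m2, threshold stated in the abstract


@dataclass
class Checkup:
    """One annual checkup record (hypothetical structure)."""
    egfr: float             # eGFR in mL/min/1.73 m2
    proteinuria_grade: int  # assumed encoding: 0 = negative, 1 = 1+, 2 = 2+, ...


def is_eligible(baseline: Checkup) -> bool:
    """Baseline (2017) rule: exclude eGFR < 60 or proteinuria >= 1+."""
    return baseline.egfr >= EGFR_THRESHOLD and baseline.proteinuria_grade < 1


def developed_ckd(followup: Checkup) -> bool:
    """Outcome (2022) rule: eGFR < 60 or proteinuria (assumed >= 1+)."""
    return followup.egfr < EGFR_THRESHOLD or followup.proteinuria_grade >= 1


def label_cohort(pairs):
    """Return one CKD label per eligible (baseline, follow-up) pair."""
    return [developed_ckd(followup) for baseline, followup in pairs
            if is_eligible(baseline)]


# Made-up example records, for illustration only.
example_pairs = [
    (Checkup(85.0, 0), Checkup(72.0, 0)),  # eligible, no CKD
    (Checkup(90.0, 0), Checkup(58.0, 0)),  # eligible, CKD by eGFR
    (Checkup(75.0, 0), Checkup(70.0, 1)),  # eligible, CKD by proteinuria
    (Checkup(55.0, 0), Checkup(50.0, 0)),  # excluded: baseline eGFR < 60
    (Checkup(88.0, 2), Checkup(80.0, 2)),  # excluded: baseline proteinuria
]
labels = label_cohort(example_pairs)
```

Keeping the eligibility check and the outcome check as separate functions mirrors the abstract, where the same eGFR threshold serves as an exclusion criterion at baseline and as an endpoint at follow-up.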

Results

Of the 24,558 recruited participants, 9,273 (93% male) met the eligibility criteria. The mean age and baseline eGFR were 36.3 years and 84.2 mL/min/1.73 m², respectively. A total of 1,041 participants (11.2%) developed CKD. The training dataset comprised 70% of the eligible participants, and the remaining 30% formed the test dataset. Training and testing accuracies were 89% and 88% for logistic regression, 89% and 90% for SVM, and 99% and 100% for random forest, respectively. In the random forest model, the five most important features for CKD development were baseline eGFR, triglyceride level, mean blood pressure, body mass index (BMI), and low-density lipoprotein cholesterol level. In the XGBoost model, the five most important features were baseline eGFR (F-score: 97.0), BMI (86.0), mean blood pressure (83.0), high-density lipoprotein cholesterol level (82.0), and gamma-glutamyl transpeptidase level (73.0). The test accuracy of XGBoost was 88%.
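The arithmetic behind these figures can be checked with a short sketch. It assumes a simple shuffled 70/30 split; the abstract does not say whether the split was random or stratified, nor does it give the exact subset sizes. The helper names and the seed are illustrative. The sketch also computes the accuracy of a trivial rule that always predicts "no CKD", which may serve as a reference point when reading accuracy figures for an outcome with 11.2% incidence.

```python
import random

ELIGIBLE = 9273    # eligible participants, from the abstract
CKD_CASES = 1041   # participants who developed CKD, from the abstract


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_fraction)
    return indices[:cut], indices[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


train_idx, test_idx = split_indices(ELIGIBLE)

# Incidence of CKD among eligible participants.
incidence = CKD_CASES / ELIGIBLE

# Accuracy of always predicting "no CKD" on the full eligible cohort.
y_true = [True] * CKD_CASES + [False] * (ELIGIBLE - CKD_CASES)
y_all_negative = [False] * ELIGIBLE
majority_baseline = accuracy(y_true, y_all_negative)

print(f"train size: {len(train_idx)}, test size: {len(test_idx)}")
print(f"incidence: {incidence:.1%}")
print(f"always-negative accuracy: {majority_baseline:.1%}")
```

Under these assumptions the split yields 6,491 training and 2,782 test participants, the incidence reproduces the reported 11.2%, and the always-negative rule scores about 88.8%.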

Conclusion

The predictive model can estimate CKD development five years later in an apparently healthy population. The identified risk factors for CKD development included baseline eGFR, mean blood pressure, BMI, and dyslipidemia.

Digital Object Identifier (DOI)