Abstract: TH-PO384
Prediction Model from Big Data for Rapid GFR Decline in CKD Patients by Machine Learning Technique
Session Information
- CKD: Risk Scores and Translational Epidemiology
November 07, 2019 | Location: Exhibit Hall, Walter E. Washington Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: CKD (Non-Dialysis)
- 2101 CKD (Non-Dialysis): Epidemiology, Risk Factors, and Prevention
Authors
- Inaguma, Daijo, Fujita Health University School of Medicine, Toyoake, Aichi, Japan
- Yanagiya, Ryosuke, Fujita Health University School of Medicine, Toyoake, Aichi, Japan
- Koseki, Akira, IBM, Tokyo, Japan
- Iwamori, Toshiya, IBM, Tokyo, Japan
- Kudo, Michiharu, IBM, Tokyo, Japan
- Yuzawa, Yukio, Fujita Health University School of Medicine, Toyoake, Aichi, Japan
Background
Recent studies have focused on kidney function trajectory because it may be related to the incidence of cardiovascular (CV) disease and all-cause mortality. We aimed to identify risk factors for rapid GFR decline and to build a machine learning-based predictive model using a single large hospital database.
Methods
We used a database derived from Fujita Health University Hospital. Medical data were available for 120,689 patients with recorded eGFR, of whom 21,198 met the CKD criteria. Rapid GFR decline in patients with CKD was defined as an eGFR decline of ≥30% over 2 years; we used the average eGFR over the preceding 90 days to avoid transient spikes in measurements. We then selected 5,818 unique CKD patients with rapid GFR decline, from whom 10,093 samples of rapid GFR decline were obtained. We built prediction models to classify rapid GFR decline using machine learning algorithms including logistic regression, decision tree, and random forest. Explanatory variables included the preceding 90 days of data on eGFR, proteinuria, serum creatinine (Cr), blood pressure, and body mass index, together with sex and age. For the longitudinal variables, we used the average, standard deviation (SD), and exponentially smoothed average (ESA) as explanatory variables for the prediction model. The contribution of each variable to rapid GFR decline was assessed by the weight of that variable.
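The summary features and the outcome definition described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the smoothing factor `alpha`, the use of the sample SD, and the comparison of two 90-day windows taken 2 years apart are all assumptions, since the abstract does not specify them.

```python
from statistics import mean, stdev


def exponentially_smoothed_average(values, alpha=0.5):
    """ESA over values ordered oldest to newest.

    alpha is an assumed smoothing factor; the abstract does not report one.
    """
    if not values:
        raise ValueError("values must not be empty")
    esa = values[0]
    for v in values[1:]:
        # Newer measurements get weight alpha, the running estimate 1 - alpha.
        esa = alpha * v + (1 - alpha) * esa
    return esa


def summarize_window(values, alpha=0.5):
    """Return the three summary features for one variable's 90-day window."""
    return {
        "mean": mean(values),
        # Sample SD (assumption); defined as 0.0 when only one value exists.
        "sd": stdev(values) if len(values) > 1 else 0.0,
        "esa": exponentially_smoothed_average(values, alpha),
    }


def is_rapid_decline(baseline_window, followup_window, threshold=0.30):
    """True if the 90-day average eGFR fell by at least `threshold`.

    The two windows are assumed to be 2 years apart.
    """
    baseline = mean(baseline_window)
    followup = mean(followup_window)
    return (baseline - followup) / baseline >= threshold
```

For example, `summarize_window([60, 58, 56])` would yield the mean, SD, and ESA of three hypothetical eGFR readings, and the resulting values for each variable would form one row of the model's input.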
Results
For the prediction model, we used 10,093 serial samples from the 5,818 CKD patients with rapid GFR decline and 10,093 samples from CKD patients without rapid GFR decline. There were no significant differences in age or sex between the two groups. In the random forest model, mean proteinuria, ESA of proteinuria, SD of serum Cr, ESA of serum Cr, and SD of hematocrit were associated with rapid GFR decline. The random forest model predicted rapid GFR decline with an area under the curve (AUC) of 0.75, whereas the AUC was 0.69 for both the logistic regression and decision tree models. In the decision tree analysis, the incidence of rapid GFR decline was 90% when all of the following criteria were fulfilled: mean urine protein of 4+ or higher; ESA of serum Cr ≤1.33; SD of hemoglobin ≤1.03.
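The decision tree criteria reported above can be written as a single rule. This is an illustrative sketch only: the function name is hypothetical, and it assumes that urine protein dipstick grades are encoded numerically (so that 4+ corresponds to 4) and that serum Cr and hemoglobin are in the same units as the reported thresholds, which the abstract does not state.

```python
def meets_high_incidence_rule(mean_urine_protein_grade, esa_serum_cr, sd_hemoglobin):
    """Return True when all three reported decision tree criteria hold.

    Assumes dipstick grades are encoded as numbers, with 4+ mapped to 4.
    """
    return (
        mean_urine_protein_grade >= 4   # mean urine protein of 4+ or higher
        and esa_serum_cr <= 1.33        # ESA of serum Cr <= 1.33
        and sd_hemoglobin <= 1.03       # SD of hemoglobin <= 1.03
    )
```

A patient satisfying this rule falls in the leaf where the abstract reports a 90% incidence of rapid GFR decline; the rule itself makes no statement about patients outside that leaf.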
Conclusion
A machine learning-based random forest model could be useful for identifying patients with rapid GFR decline in real-world clinical settings.