Abstract: TH-PO021
Machine Learning Models for IgA Nephropathy Diagnosis: A Retrospective Study on Predictive Performance and Influential Variables
Session Information
- AI, Digital Health, Data Science - I
November 02, 2023 | Location: Exhibit Hall, Pennsylvania Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Augmented Intelligence, Digital Health, and Data Science
- 300 Augmented Intelligence, Digital Health, and Data Science
Authors
- Noda, Ryunosuke, St. Marianna University School of Medicine, Kawasaki, Kanagawa, Japan
- Ichikawa, Daisuke, St. Marianna University School of Medicine, Kawasaki, Kanagawa, Japan
- Shibagaki, Yugo, St. Marianna University School of Medicine, Kawasaki, Kanagawa, Japan
Background
IgA nephropathy is often treated with therapies that carry a risk of complications, such as steroids, so it requires a definitive diagnosis by invasive renal biopsy rather than a non-invasive clinical diagnosis. Although machine learning (ML) has shown promise for diagnostic tasks in recent years, its value in nephrology remains unclear. In this study, we investigated the diagnostic performance of ML algorithms for IgA nephropathy.
Methods
We conducted a retrospective cohort study of 1,419 cases that underwent renal biopsy at our hospital from January 2006 to September 2022. Cases with indeterminate diagnoses or overlapping pathologies were excluded. The remaining cases were randomly divided into training and test datasets at an 8:2 ratio. We used 44 explanatory variables, including age at the time of renal biopsy, gender, blood test results, and urinalysis results. Using Python, we evaluated multiple ML algorithms: K-nearest neighbor, support vector machine, random forest, extreme gradient boosting, and LightGBM. The model with the highest mean area under the curve (AUC) in stratified 5-fold cross-validation on the training set was selected. We then compared the test-set AUC of this model with that of a logistic regression (LR) model. To interpret the predictions, we used SHapley Additive exPlanations (SHAP).
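The model-selection step described above (stratified 5-fold cross-validation, ranking candidates by mean AUC) can be sketched in plain Python. This is only an illustrative sketch, not the authors' code: the data are synthetic, the two candidate "models" (`fit_centroid`, `fit_first_feature`) are hypothetical stand-ins for the algorithms named in the abstract, and the AUC is assumed to be the area under the ROC curve, computed here as the probability that a random positive case scores higher than a random negative one.

```python
import random

def roc_auc(labels, scores):
    """ROC AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Assign every index to one of k folds, dealing each class out evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def mean_cv_auc(fit, X, y, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the k AUCs."""
    aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [model(X[i]) for i in held_out]
        aucs.append(roc_auc([y[i] for i in held_out], scores))
    return sum(aucs) / len(aucs)

# Hypothetical candidates: each fit function returns a scoring function.
def fit_centroid(X, y):
    """Score a case by its projection onto (positive mean - negative mean)."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    w = [sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
         for j in range(len(X[0]))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def fit_first_feature(X, y):
    """Ignore the training data and use the first variable as the score."""
    return lambda x: x[0]

# Synthetic data only: 60 positive and 140 negative cases with two variables.
rng = random.Random(42)
y = [1] * 60 + [0] * 140
X = [[rng.gauss(0.5 * t, 1.0), rng.gauss(1.0 * t, 1.0)] for t in y]

candidates = {"centroid": fit_centroid, "first_feature": fit_first_feature}
cv_auc = {name: mean_cv_auc(fit, X, y) for name, fit in candidates.items()}
best = max(cv_auc, key=cv_auc.get)
print(cv_auc, "selected:", best)
```

In practice the candidates would be fitted library models, and the selected model would then be refitted on the full training set and scored once on the held-out test set.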
Results
In cross-validation on the training set, LightGBM outperformed the other ML models, with the highest AUC (0.92). Its AUC in the test set was also 0.92, compared with 0.88 for LR. SHAP analysis showed that the variables contributing most to prediction were, in descending order, urinary red blood cell count, serum albumin, IgA/C3 ratio, urinary protein/creatinine ratio, and age.
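The idea behind ranking variables by their SHAP contribution can be illustrated with an exact, brute-force Shapley computation on a toy model. This sketch is not the SHAP library and not the authors' analysis: the model `toy_predict`, the variable names, the cases, and the single baseline vector used to represent an "absent" variable are all assumptions made for illustration. SHAP implementations typically average over background data and use faster algorithms for tree models.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one case; absent variables take their baseline value."""
    n = len(x)

    def value(present):
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def rank_variables(predict, cases, baseline, names):
    """Order variables by mean absolute Shapley value across cases, largest first."""
    mean_abs = [0.0] * len(names)
    for x in cases:
        for i, v in enumerate(shapley_values(predict, x, baseline)):
            mean_abs[i] += abs(v) / len(cases)
    return sorted(zip(names, mean_abs), key=lambda t: t[1], reverse=True)

# Hypothetical toy model with one interaction term (not a fitted clinical model).
def toy_predict(z):
    return 2.0 * z[0] + 1.0 * z[1] + z[0] * z[2]

names = ["var_a", "var_b", "var_c"]  # placeholder names
baseline = [0.0, 0.0, 0.0]
cases = [[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]]

phi = shapley_values(toy_predict, cases[0], baseline)
ranking = rank_variables(toy_predict, cases, baseline, names)
print(phi)
print(ranking)
```

For each case the Shapley values sum to the difference between the model output for that case and the output at the baseline, which is what makes the per-variable contributions additive.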
Conclusion
Our findings suggest that ML, particularly the LightGBM model, may outperform conventional logistic regression in diagnosing IgA nephropathy. The influential variables identified were consistent with those reported in the existing literature. These results highlight the potential utility of ML in IgA nephropathy diagnosis, although further validation is needed before clinical use.
Funding
- Private Foundation Support