Abstract: TH-PO020
Early Identification of the Need for CRRT in Children: A Machine Learning Approach
Session Information
- AI, Digital Health, Data Science - I
November 02, 2023 | Location: Exhibit Hall, Pennsylvania Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Augmented Intelligence, Digital Health, and Data Science
- 300 Augmented Intelligence, Digital Health, and Data Science
Authors
- Menon, Shina, Seattle Children's Hospital, Seattle, Washington, United States
- Li, Qingyang, Seattle Children's Hospital, Seattle, Washington, United States
- Van De Sompele, David R., Seattle Children's Hospital, Seattle, Washington, United States
- Doud, Alexander James, Seattle Children's Hospital, Seattle, Washington, United States
- Vong, Kin L., Seattle Children's Hospital, Seattle, Washington, United States
- Bourdrez, Hillary, Seattle Children's Hospital, Seattle, Washington, United States
- Wainwright, Mark, Seattle Children's Hospital, Seattle, Washington, United States
Background
Continuous renal replacement therapy (CRRT) is the preferred modality of renal replacement in critically ill children with acute kidney injury. We sought to develop a machine learning (ML) model that identifies patients who will need CRRT at least one day before initiation.
Methods
We used data from patients admitted to the pediatric ICU (PICU) between 2008 and 2021 with a length of stay >24 hours. Candidate data elements were selected based on clinical expertise and anticipated United States Core Data for Interoperability requirements. To address class imbalance, we oversampled the CRRT patient group, drawing each new sample from a random 24-hour window preceding the outcome of interest. Because test cases include data collected 1 day before the CRRT decision, this process is designed to enable earlier identification of the need for CRRT. We engineered features by vectorizing the patient state and then selected features using either correlation-based feature selection (CFS) or information gain (IG), each in combination with 5 classification methods: random forest, logistic regression, naïve Bayes (NB), support vector machine, and extreme gradient boosting. Data curation, analysis, and model development were conducted in Python (version 3.9).
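The window-based oversampling step described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `sample_windows`, the `(timestamp, name, value)` observation format, and the 24-hour `lead` gap before the outcome are all assumptions made for illustration, since the abstract does not specify how windows were positioned.

```python
import random
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # window length stated in the abstract


def sample_windows(observations, admit_time, outcome_time, n_samples,
                   lead=timedelta(hours=24), rng=None):
    """Draw n_samples random 24-hour windows ending at least `lead` before the outcome.

    observations: list of (timestamp, name, value) tuples for one encounter
    (a hypothetical format). Returns a list of (start, end, rows) tuples.
    The `lead` gap is an assumption, not a detail given in the abstract.
    """
    rng = rng or random.Random()
    earliest_end = admit_time + WINDOW   # first point where a full window fits
    latest_end = outcome_time - lead     # keep the window clear of the outcome
    if latest_end < earliest_end:
        return []  # stay too short to fit a full window before the cutoff
    span_seconds = (latest_end - earliest_end).total_seconds()
    samples = []
    for _ in range(n_samples):
        end = earliest_end + timedelta(seconds=rng.uniform(0, span_seconds))
        start = end - WINDOW
        # Keep only observations that fall inside the half-open window [start, end).
        rows = [obs for obs in observations if start <= obs[0] < end]
        samples.append((start, end, rows))
    return samples


# Example: one hypothetical CRRT encounter oversampled five times.
admit = datetime(2020, 1, 1, 8, 0)
crrt_start = datetime(2020, 1, 5, 8, 0)
obs = [(admit + timedelta(hours=h), "creatinine", 1.0 + 0.05 * h)
       for h in range(0, 96, 6)]
windows = sample_windows(obs, admit, crrt_start, n_samples=5,
                         rng=random.Random(0))
```

Each sampled window would then be vectorized into one training row for the minority class. Sampling the window position at random, rather than always taking the final 24 hours, exposes the model to patient states at varying distances from the outcome.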
Results
A total of 19,457 PICU encounters (159 with CRRT) were identified and, after data augmentation, stratified into training (85%) and test (15%) sets. Models constructed using IG contained 436 features versus 42 features using CFS. The top features included blood urea nitrogen, creatinine, platelets, Glasgow Coma Scale score, and fluid balance. The NB model outperformed the other approaches. NB with IG feature selection achieved an area under the ROC curve (AUROC) of 0.925, an area under the precision-recall curve (AUPRC) of 0.813, an accuracy of 0.88, and an F1 score of 0.88. NB with CFS achieved comparable accuracy and F1 score but slightly better discrimination, with an AUROC of 0.945 and an AUPRC of 0.890.
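For readers less familiar with the reported metrics, the following sketch shows one common way each can be computed from labels and predicted scores. The function names, the 0.5 decision threshold, and the use of average precision as the AUPRC estimator are assumptions; the abstract does not state which estimators or threshold the authors used, and the toy data below are invented for illustration only.

```python
def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5).

    Pairwise O(n^2) form, chosen for clarity rather than speed.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes distinct scores; tied scores are ranked in arbitrary order.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    tp, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / k  # precision at each rank where a positive is retrieved
    return total / n_pos


def accuracy_and_f1(y_true, scores, threshold=0.5):
    """Accuracy and F1 score after thresholding the scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy example (invented labels and scores, not study data).
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc = auroc(labels, probs)
ap = average_precision(labels, probs)
acc, f1 = accuracy_and_f1(labels, probs)
```

AUROC and AUPRC are threshold-free, whereas accuracy and F1 depend on the chosen cutoff. With an outcome as rare as CRRT in this cohort, AUPRC is usually the more informative of the two ranking metrics because it is sensitive to false positives among the highest-scored cases.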
Conclusion
We present an ML model that leverages structured, time-series data to identify patients in the ICU who are likely to need CRRT. Limitations include the single-institution design and the exclusion of unstructured data, which might improve model performance if incorporated.