Abstract: TH-PO875
Using Machine Learning to Identify People Considering Living Kidney Donation on Reddit
Session Information
- Transplantation: Donation and Access
November 02, 2023 | Location: Exhibit Hall, Pennsylvania Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Transplantation
- 2102 Transplantation: Clinical
Authors
- Waterman, Amy D., Houston Methodist Hospital, Houston, Texas, United States
- Nielsen, Joshua E., University of Louisville, Louisville, Kentucky, United States
- Davis, LaShara A., Houston Methodist Hospital, Houston, Texas, United States
- Chen, Xiaoyu, University of Louisville, Louisville, Kentucky, United States
- Gentili, Monica, University of Louisville, Louisville, Kentucky, United States
Background
Machine learning (ML) strategies may help identify people on online digital platforms who are considering living kidney donation (LKD), both potential kidney recipients and potential living kidney donors, so that helpful information can be targeted to them. Our study aimed to: (1) identify people sharing their personal LKD experiences on the digital platform Reddit, and (2) determine whether ML models could classify posts using either simplified or more nuanced labels.
Methods
A multidisciplinary team of engineers and transplant experts created and piloted a user labeling system to code 3,292 posts created by Reddit users from 2010 to 2023 using three simplified labels and six nuanced labels (Table). To validate the system, four team members independently labeled the same 100 posts manually, and the label definitions were refined until the team reached agreement. The remaining 3,192 posts were manually labeled using the refined definitions. We then explored automating this classification with Bidirectional Encoder Representations from Transformers (BERT), an ML language model. Two BERT models were trained, one to predict the simplified labels and one to predict the nuanced labels. Exploratory work on automatic classification using ChatGPT was also included.
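The validation step, in which four team members label the same posts and definitions are revised until the labels converge, can be illustrated with a minimal sketch. It assumes agreement is measured as the share of posts on which every rater assigned the same label; the abstract does not state which agreement measure was used, and the label names below are hypothetical placeholders rather than the study's labels.

```python
from collections import Counter


def unanimous_fraction(ratings):
    """Return the share of posts on which all raters chose the same label.

    ratings: list of per-post label lists, one label per rater.
    """
    if not ratings:
        raise ValueError("no posts to score")
    unanimous = sum(1 for labels in ratings if len(set(labels)) == 1)
    return unanimous / len(ratings)


def disputed_posts(ratings):
    """Return (post index, label counts) for posts without full agreement."""
    return [
        (index, Counter(labels))
        for index, labels in enumerate(ratings)
        if len(set(labels)) > 1
    ]


# Hypothetical labels from four raters on three posts (not study data).
example_ratings = [
    ["donor", "donor", "donor", "donor"],
    ["recipient", "recipient", "other", "recipient"],
    ["other", "other", "other", "other"],
]

fraction = unanimous_fraction(example_ratings)
disputed = disputed_posts(example_ratings)
```

Posts returned by `disputed_posts` are the ones whose definitions would be candidates for discussion and refinement before the remaining posts are labeled.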
Results
The BERT model classified the simplified labels with 87% accuracy, but when trained to classify all six nuanced labels, it reached only 67.1% accuracy. Preliminary experiments using ChatGPT showed poorer agreement with the manually assigned user labels than the BERT models (69% and 45.3% for simplified and nuanced labels, respectively).
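The percentages above can be read as the share of posts for which a model's predicted label matches the manually assigned one. The sketch below assumes that plain definition of accuracy, which the abstract does not spell out, and adds a per-label breakdown (recall) that is not reported in the abstract; the label names and values are hypothetical.

```python
def accuracy(manual, predicted):
    """Return the fraction of posts whose predicted label equals the manual label."""
    if len(manual) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("no posts to score")
    correct = sum(1 for m, p in zip(manual, predicted) if m == p)
    return correct / len(manual)


def per_label_recall(manual, predicted):
    """Return, for each manual label, the fraction of its posts predicted correctly."""
    totals = {}
    hits = {}
    for m, p in zip(manual, predicted):
        totals[m] = totals.get(m, 0) + 1
        hits[m] = hits.get(m, 0) + (1 if m == p else 0)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical manual labels and model predictions (not study data).
manual_labels = ["donor", "donor", "recipient", "other"]
model_labels = ["donor", "other", "recipient", "other"]

overall = accuracy(manual_labels, model_labels)
by_label = per_label_recall(manual_labels, model_labels)
```

A per-label view of this kind can show whether a drop in accuracy on the nuanced labels is spread evenly or concentrated in a few hard-to-separate categories.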
Conclusion
Using expert-defined classification criteria combined with ML methods, it is possible to identify people who may be interested in LKD on digital platforms. Current methods perform better on simplified classifications than on nuanced ones, but there is room for improvement, and advances in ML may increase the predictive power of future models. Future work will explore ways to enhance the BERT method by integrating it with ChatGPT.