Kidney Week

Abstract: SA-PO0020

Fine-Tuned Transformers Illuminate Kidney Single-Cell Heterogeneity

Session Information

Category: Artificial Intelligence, Digital Health, and Data Science

  • 300 Artificial Intelligence, Digital Health, and Data Science

Authors

  • Ziyadeh, Elias Mark, University of Pennsylvania, Philadelphia, Pennsylvania, United States
  • Li, Chenyu, University of Pennsylvania, Philadelphia, Pennsylvania, United States
  • Susztak, Katalin, University of Pennsylvania, Philadelphia, Pennsylvania, United States

Group or Team Name

  • Susztak Lab

Background

Kidney disease remains a global health burden, in part due to the kidney's complex cellular landscape and subtle transcriptional changes that challenge standard analysis methods. While single-cell RNA sequencing (scRNA-seq) enables high-resolution profiling, downstream tools often rely on distance metrics that may overlook injury states or rare cell subtypes. Transformer-based models like Geneformer and Universal Cell Embeddings (UCE) learn contextual gene expression patterns from large, diverse datasets, but kidney cells are underrepresented in these corpora. We hypothesized that fine-tuning on kidney-specific data could improve model accuracy and biological insight.

Methods

We compiled a high-quality kidney atlas of 720,924 scRNA-seq profiles from healthy and diseased human samples. After preprocessing and batch correction, we fine-tuned Geneformer for three tasks: (1) classifying healthy vs. injured tubular cells, (2) predicting transcription factor (TF) dose responses to fibrotic stimuli, and (3) simulating gene knockouts. UCE was adapted to create kidney-specific embeddings using masked-label training. Models were evaluated via F1 scores (classification), AUC (dose-response), fibrosis score shifts (perturbation), and silhouette scores (clustering).
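For readers less familiar with the evaluation metrics named above, the following is a minimal pure-Python sketch of F1, ROC AUC, and the mean silhouette score. It is illustrative only and is not the authors' evaluation code; the function names and the toy inputs in the usage note are assumptions introduced here.

```python
from math import sqrt


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, using Euclidean distance.

    a = mean distance to the point's own cluster, b = mean distance to the
    nearest other cluster. Points in singleton clusters contribute 0.
    Requires at least two distinct labels.
    """
    def dist(u, v):
        return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    total = 0.0
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / sum(1 for l in labels if l == other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)
```

As a usage example on toy data, `f1_score([1, 1, 0, 0], [1, 0, 1, 0])` returns 0.5 and `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` returns 0.75. A silhouette score near 1 indicates compact, well-separated clusters, while values near 0 indicate overlapping ones.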

Results

Fine-tuned Geneformer distinguished injured from healthy distal convoluted and connecting tubule cells with an F1 of 0.909, and injured from healthy thick ascending limb cells with an F1 of 0.891. It predicted TF dose-sensitivity with an AUC of 0.86. Simulated knockout of fibrosis-linked genes (e.g., TMCO1) led to anti-fibrotic signature shifts aligned with prior data. Kidney-specific UCE embeddings produced improved cell type clusters and uncovered subpopulations not resolved by models that were not fine-tuned.
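To illustrate what a simulated knockout can look like at the input level, here is a toy sketch. It assumes a Geneformer-style rank encoding, in which each cell is represented as a list of gene tokens ordered by expression scaled by a corpus-wide per-gene median, and it models deletion as removing the gene's token from that list. The helper names, gene labels, and numbers are hypothetical; this is not the authors' pipeline or the Geneformer API.

```python
def rank_encode(expression, medians):
    """Order expressed genes by median-scaled expression, highest first.

    expression: dict of gene -> raw count for one cell (toy values)
    medians:    dict of gene -> corpus-wide median expression (toy values)
    Genes with zero expression are dropped; ties are broken by gene name.
    """
    scaled = {g: v / medians[g] for g, v in expression.items() if v > 0}
    return sorted(scaled, key=lambda g: (-scaled[g], g))


def knock_out(tokens, gene):
    """Simulate deletion by removing the gene's token from the ranked list."""
    return [g for g in tokens if g != gene]


# Hypothetical cell with four placeholder genes.
cell = {"GENE_A": 10, "GENE_B": 4, "GENE_C": 0, "GENE_D": 6}
medians = {"GENE_A": 5, "GENE_B": 1, "GENE_C": 1, "GENE_D": 2}

original = rank_encode(cell, medians)
perturbed = knock_out(original, "GENE_D")
```

In a full workflow, both the original and the perturbed token lists would be passed through the fine-tuned model, and the resulting embeddings or a downstream score (here, a fibrosis score) would be compared to estimate the effect of the deletion.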

Conclusion

Kidney-specific fine-tuning significantly improves the performance of transformer-based models in resolving injury states, predicting responses, and simulating gene perturbations. This approach combines the generalizability of foundation models with the specificity of renal data, laying the groundwork for next-generation biomarker discovery and therapeutic development in nephrology.

Funding

  • NIDDK Support
