Kidney Week

Abstract: TH-PO0735

Deep Learning Detection of Stain- and Center-Specific Bias in Assessing Multicenter Lupus Nephritis Whole-Slide Images

Session Information

Category: Glomerular Diseases

  • 1402 Glomerular Diseases: Clinical, Outcomes, and Therapeutics

Authors

  • Daouk, Mohammad, University of Houston, Houston, Texas, United States
  • Becker, Jan U., Universitatsklinikum Koln Klinische Infektiologie, Cologne, NRW, Germany
  • Kambham, Neeraja, Stanford University, Stanford, California, United States
  • Chang, Anthony, The University of Chicago, Chicago, Illinois, United States
  • Mohan, Chandra, University of Houston, Houston, Texas, United States
  • Nguyen, Hien V, University of Houston, Houston, Texas, United States

Background

Convolutional neural networks (CNNs) built to grade lupus nephritis (LN) glomeruli may learn slide-specific “shortcuts”, such as stain or scanner brand, rather than pathology. We measured the size of this bias in a multi-institutional cohort.

Methods

From 363 whole-slide images (WSIs) spanning 4 stains (H&E, PAS, Masson trichrome, and silver) and 3 centers, we extracted 9,674 glomerular patches at three magnifications, z-normalized them, and split them by WSI into 85% train and 15% hold-out sets. On the 85% set we ran 5-fold cross-validation (each fold: 80% train, 20% validation). An ImageNet-pretrained ResNet-18 was fine-tuned in three regimes:
  • Single-head, predicting stain type.
  • Single-head, predicting center.
  • Dual-head, with a lesion head (proliferative vs non-proliferative) plus a bias head (stain or center) whose loss was scaled by λ ∈ {10⁻¹, …, 10⁻⁴, 0, −10⁻⁴, …, −10⁻¹}.

The weight λ controls how strongly the model relies on the bias head during training; a negative λ inverts the bias target. Uncertainty was estimated via 50 Monte-Carlo dropout passes.
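The dual-head objective can be illustrated with a minimal sketch. It assumes the total loss is the lesion cross-entropy plus λ times the bias cross-entropy, which is one plausible reading of "scaled by λ"; the abstract does not give the exact formula, and the function names (`cross_entropy`, `dual_head_loss`) and the example logits are hypothetical. Only the λ grid is taken from the text.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def dual_head_loss(lesion_logits, lesion_target, bias_logits, bias_target, lam):
    """ASSUMED form: total = lesion loss + lam * bias loss.

    With lam > 0 the bias head is trained to predict stain/center.
    With lam < 0 minimizing the total rewards a HIGHER bias loss,
    which is our interpretation of "inverts the bias target".
    """
    lesion = cross_entropy(lesion_logits, lesion_target)
    bias = cross_entropy(bias_logits, bias_target)
    return lesion + lam * bias

# Lambda grid from the abstract: 10^-1 ... 10^-4, 0, -10^-4 ... -10^-1
LAMBDAS = (
    [10.0 ** -k for k in range(1, 5)]
    + [0.0]
    + [-(10.0 ** -k) for k in range(4, 0, -1)]
)

if __name__ == "__main__":
    # Hypothetical logits: 2 lesion classes, 4 stain classes.
    lesion_logits, bias_logits = [1.2, -0.3], [0.1, 2.0, -1.0, 0.4]
    for lam in LAMBDAS:
        total = dual_head_loss(lesion_logits, 0, bias_logits, 1, lam)
        print(f"lambda={lam:+.0e}  total_loss={total:.4f}")
```

At λ = 0 the bias head contributes nothing to the objective, so this setting serves as the lesion-only reference point in such a sweep.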

Results

At λ = 10⁻¹: stain = 1.00, center = 0.99, lesion = 0.87. At λ = 10⁻⁴: stain = 0.30, center = 0.88, lesion = 0.86. For λ ≤ −10⁻³, the bias head was silenced (≤0.05 accuracy) and lesion dropped to 0.80. Across 45 models, stain vs lesion: Spearman r = 0.57 (p < 0.001); center vs lesion: r = 0.49 (p = 0.001).
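For readers unfamiliar with the statistic, the following sketch shows how a Spearman rank correlation between per-model bias-head and lesion-head scores could be computed. It is a generic textbook implementation (Pearson correlation of average ranks), not the authors' analysis code, and the function names are hypothetical. The demo values are made up for illustration and are not the study's data.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    # Illustrative, made-up per-model accuracies (NOT the study's data).
    bias_acc = [0.05, 0.30, 0.62, 0.88, 1.00]
    lesion_acc = [0.80, 0.86, 0.84, 0.85, 0.87]
    print(f"Spearman r = {spearman(bias_acc, lesion_acc):.2f}")
```

A positive value under this setup would mean that models whose bias head is more accurate also tend to score higher on the lesion task, which is the pattern the abstract reports.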

Conclusion

Standard CNN pipelines readily exploit stain- and center-specific artefacts embedded in multi-center LN datasets, creating an illusion of high diagnostic performance. Mandatory bias audits, stain-invariant color augmentation, domain-adversarial or contrastive training, and cross-center external validation are essential before clinical deployment. Paradoxically, the strong stain signal also suggests that stain-dependent morphological information could be harnessed deliberately, provided models are insulated from confounding. Rigorous characterization and mitigation of technical shortcuts are prerequisites for trustworthy AI-assisted renal pathology.
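As one illustration of the cross-center external validation recommended above, the sketch below builds leave-one-center-out splits, so that every slide from the held-out center is unseen during training. This is a common scheme chosen here as an assumption; the abstract does not specify a protocol, and the function name, slide IDs, and center labels are hypothetical.

```python
from collections import defaultdict

def leave_one_center_out(slides):
    """Yield (held_out_center, train_ids, test_ids) for each center.

    slides: iterable of (slide_id, center) pairs.
    All slides from the held-out center go to the test set, so no
    center-specific artefact seen in training appears at test time.
    """
    by_center = defaultdict(list)
    for slide_id, center in slides:
        by_center[center].append(slide_id)
    for held_out in sorted(by_center):
        test_ids = list(by_center[held_out])
        train_ids = [
            s for c in sorted(by_center) if c != held_out for s in by_center[c]
        ]
        yield held_out, train_ids, test_ids

if __name__ == "__main__":
    # Hypothetical slide IDs and center labels for demonstration only.
    demo = [("wsi_01", "A"), ("wsi_02", "A"), ("wsi_03", "B"), ("wsi_04", "C")]
    for center, train_ids, test_ids in leave_one_center_out(demo):
        print(center, "train:", train_ids, "test:", test_ids)
```

Splitting at the slide level within each center also keeps patches from the same WSI on one side of the split, in line with the per-WSI split described in Methods.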

Funding

  • Other NIH Support
