Kidney Week

Abstract: SA-PO392

Studying Rare Disease Using an Electronic Health Record (EHR) and Machine Learning Based Approach: The Kaiser Permanente Southern California (KPSC) Membranous Nephropathy (MN) Cohort

Session Information

Category: Glomerular Diseases

  • 1203 Glomerular Diseases: Clinical, Outcomes, and Trials


  • Sun, Amy Z., Kaiser Permanente Los Angeles Medical Center, Los Angeles, California, United States
  • Shu, Yu-Hsiang, Kaiser Permanente, Pasadena, California, United States
  • Harrison, Teresa N., Kaiser Permanente, Pasadena, California, United States
  • O'Shaughnessy, Michelle M., Stanford University, Palo Alto, California, United States
  • Sim, John J., Kaiser Permanente Los Angeles Medical Center, Los Angeles, California, United States

Large-scale epidemiologic studies of glomerular diseases such as MN are needed. Identifying MN patients using the EHR is limited by the need to manually review kidney biopsy pathology reports (the gold standard diagnostic test) to confirm cases. The ability to accurately identify patients with MN using only structured EHR data (e.g., diagnosis codes) would enhance the efficiency and scale of observational and comparative effectiveness studies within this population.


A retrospective cohort study was performed among KPSC patients who underwent a kidney biopsy between 6/28/1999 and 6/25/2015 (n=5542). Biopsies were manually reviewed and designated as MN or non-MN. The sensitivity (SN), specificity (SP), and positive predictive value (PPV) of ICD9 diagnosis codes appearing within 1 year after biopsy were determined using 2 approaches: 1) clinical (581.1, 582.1, or 583.1, the MN-specific codes) and 2) machine learning (the ICD9 codes, kidney-related or not, with the highest predictive performance).
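The evaluation described above can be sketched in a few lines of Python: flag each patient from their post-biopsy ICD9 codes, then compare the flag against the biopsy label to get SN, SP, and PPV. This is a minimal illustration, not the study's actual code. The function names, the `min_count` parameter, and the toy patients are hypothetical, and counting code occurrences (rather than distinct codes) is an assumption the abstract does not confirm.

```python
from typing import Iterable, NamedTuple, Tuple

# MN-specific ICD9 codes named in the abstract (clinical approach).
MN_CODES = frozenset({"581.1", "582.1", "583.1"})


class Metrics(NamedTuple):
    sensitivity: float
    specificity: float
    ppv: float


def flag_mn(codes: Iterable[str], target=MN_CODES, min_count: int = 1) -> bool:
    """True when at least `min_count` recorded codes fall in `target`."""
    return sum(code in target for code in codes) >= min_count


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def evaluate(patients: Iterable[Tuple[Iterable[str], bool]],
             target=MN_CODES, min_count: int = 1) -> Metrics:
    """`patients` yields (codes within 1 year after biopsy, biopsy_is_mn)."""
    tp = fp = fn = tn = 0
    for codes, is_mn in patients:
        flagged = flag_mn(codes, target, min_count)
        if flagged and is_mn:
            tp += 1
        elif flagged:
            fp += 1
        elif is_mn:
            fn += 1
        else:
            tn += 1
    return Metrics(
        sensitivity=_ratio(tp, tp + fn),  # flagged among biopsy-proven MN
        specificity=_ratio(tn, tn + fp),  # unflagged among non-MN
        ppv=_ratio(tp, tp + fp),          # biopsy-proven MN among flagged
    )


# Hypothetical toy cohort, for illustration only (not study data).
toy_cohort = [
    (["581.1", "585.9"], True),
    (["583.1", "583.1"], True),
    (["585.9"], True),    # MN on biopsy, but no MN code recorded
    (["582.1"], False),   # MN code recorded, but biopsy is non-MN
    (["250.00"], False),
    (["401.9"], False),
]

any_code = evaluate(toy_cohort, min_count=1)
two_codes = evaluate(toy_cohort, min_count=2)
```

Raising `min_count` on the toy cohort trades sensitivity for specificity and PPV, which is the same direction of trade-off the abstract reports when more codes are required.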


Among biopsy-proven MN cases, 59% and 86% received an MN diagnosis within 30 days and 1 year after biopsy, respectively. The SN and SP of the clinical approach were 86% and 76%, respectively, but the PPV was only 26%. If >2 codes were required, SP and PPV improved, but SN declined. The machine learning approach found that using just 2 ICD9 codes (581.1 or 583.1) improved SP to 94% and PPV to 58%, with a decrease in SN to 83%. If ≥3 codes were required, SP was 98%, PPV 78%, and SN 64%.
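The low PPV despite reasonable SN and SP follows from Bayes' rule: when MN is uncommon among biopsied patients, even a modest false-positive rate produces many more false positives than true positives. The sketch below relates PPV to SN, SP, and prevalence, and back-calculates the prevalence implied by each reported triple. The function names are hypothetical, the abstract does not report prevalence, and because the inputs are rounded percentages the implied values are only approximate.

```python
def ppv_from_rates(sn: float, sp: float, prevalence: float) -> float:
    """Bayes' rule: PPV = SN*p / (SN*p + (1 - SP)*(1 - p))."""
    true_pos = sn * prevalence
    false_pos = (1.0 - sp) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sn: float, sp: float, ppv: float) -> float:
    """Solve the PPV formula for p: p = PPV*(1-SP) / (SN*(1-PPV) + PPV*(1-SP))."""
    fp_term = ppv * (1.0 - sp)
    return fp_term / (sn * (1.0 - ppv) + fp_term)


# (SN, SP, PPV) as reported in the abstract, rounded to whole percentages.
reported = {
    "clinical: 581.1, 582.1, or 583.1": (0.86, 0.76, 0.26),
    "machine learning: 581.1 or 583.1": (0.83, 0.94, 0.58),
    ">=3 codes required": (0.64, 0.98, 0.78),
}

for label, (sn, sp, ppv) in reported.items():
    p = implied_prevalence(sn, sp, ppv)
    print(f"{label}: implied MN prevalence about {p:.1%}")
```

If the three rules were evaluated on the same cohort, the implied prevalences should roughly agree, so this also serves as a quick consistency check on the reported figures.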


Our study is one of the first to leverage the EHR (ICD codes) to identify patients with biopsy-proven MN. Data-driven approaches showed better overall performance than a solely clinical approach. Expanding machine learning approaches to include demographics, additional clinical data, or free text from pathology reports might further increase diagnostic performance.