Kidney Week

Abstract: PO0525

Identifying and Clustering CKD Progression Trajectories Using Machine Learning

Session Information

Category: CKD (Non-Dialysis)

  • 2101 CKD (Non-Dialysis): Epidemiology, Risk Factors, and Prevention


  • Abdul Sultan, Alyshah, AstraZeneca, Cambridge, United Kingdom
  • Rhodes, Kirsty, AstraZeneca, Cambridge, United Kingdom
  • Doulis, Michail, AstraZeneca, Gothenburg, Sweden
  • Brookes-Smith, Irena, AstraZeneca, Cambridge, United Kingdom
  • Faria, Jolyon S., AstraZeneca, Cambridge, United Kingdom
  • Salazar, Jose Domingo, AstraZeneca, Cambridge, United Kingdom
  • James, Glen, AstraZeneca, Cambridge, United Kingdom
  • MacPhee, Iain, AstraZeneca, Cambridge, United Kingdom
  • Unwin, Robert J., AstraZeneca, Cambridge, United Kingdom
  • Wright, David, AstraZeneca, Cambridge, United Kingdom
  • Patel, Mishal, AstraZeneca, Cambridge, United Kingdom
  • Metcalfe, Paul D., AstraZeneca, Cambridge, United Kingdom
  • Jermutus, Lutz, AstraZeneca, Cambridge, United Kingdom

Evidence suggests that estimated glomerular filtration rate (eGFR) slope can serve as a surrogate clinical endpoint in renal clinical trials. However, large population-based studies provide limited data on the characteristics of fast and slow progressors defined by eGFR slope.


We identified patients with CKD aged ≥18 years in the UK Clinical Practice Research Datalink (CPRD) between 2004 and 2019, defining CKD as two consecutive eGFR values of <75 ml/min/1.73 m² recorded more than 90 days apart. eGFR measurements were extracted over a 3-year observation period after the index date (the date of the second eGFR measurement). Patients were clustered by eGFR trajectory using a statistical technique, linear mixed-effects models (LMM), and machine learning techniques (unsupervised machine learning and Bayesian approaches). The association between trajectory clusters and all-cause mortality was assessed using Cox regression analysis.
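The cohort-entry rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' extraction code: the function name `find_index_date`, the `(date, eGFR)` tuple layout, and the reading of "consecutive" as adjacent measurements in date order are all hypothetical choices made for the example.

```python
from datetime import date

EGFR_THRESHOLD = 75.0  # ml/min/1.73 m², the cut-off stated in the abstract
MIN_GAP_DAYS = 90      # the two readings must be more than 90 days apart


def find_index_date(measurements):
    """Return the date of the second qualifying eGFR reading, or None.

    `measurements` is a list of (date, egfr) tuples for one patient.
    Assumption: "consecutive" means adjacent readings in date order.
    """
    ordered = sorted(measurements, key=lambda m: m[0])
    for (d1, e1), (d2, e2) in zip(ordered, ordered[1:]):
        both_low = e1 < EGFR_THRESHOLD and e2 < EGFR_THRESHOLD
        if both_low and (d2 - d1).days > MIN_GAP_DAYS:
            return d2  # index date = date of the second eGFR measurement
    return None


# Made-up readings for one hypothetical patient.
history = [
    (date(2010, 1, 5), 82.0),
    (date(2010, 6, 1), 71.0),
    (date(2010, 7, 15), 69.0),  # only 44 days after the previous reading
    (date(2011, 1, 20), 66.0),  # 189 days after the previous reading
]
index_date = find_index_date(history)
```

In this made-up history the first pair below the threshold is too close together, so the index date falls on the later reading that is more than 90 days after its predecessor.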


In this preliminary analysis, we identified 407,108 patients with 1.8 million eGFR measurements (median 4 [IQR: 2-6] measurements per patient). Using LMM, we found that 5% of patients declined rapidly, with an average eGFR change of -4.78 ml/min/1.73 m² per year (95% CI: -9.40 to -3.28), whereas the majority (95%) remained stable or progressed slowly. A distinct fast-progressing cluster, showing broadly linear patterns, was also detected using unsupervised machine learning and Bayesian methods. Overall, all three clustering approaches agreed, and the findings were replicated in the validation dataset. Compared with stable/slow progressors, fast progressors were approximately 3 times more likely to die following the 3-year observation period (hazard ratio [HR] 2.82; 95% CI: 2.75-2.90).
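To make the notion of an eGFR slope concrete, the sketch below fits an ordinary least-squares line to one patient's readings and reports the change per year. It is a deliberately simplified stand-in: the abstract used linear mixed-effects models, unsupervised machine learning, and Bayesian approaches, none of which are reproduced here. The function name, the example readings, and the -5 cut-off used to flag a fast progressor are assumptions for illustration only.

```python
def egfr_slope_per_year(times_in_years, egfr_values):
    """Ordinary least-squares slope of eGFR against time (per year).

    A simplification for illustration; not the LMM used in the abstract.
    """
    n = len(times_in_years)
    if n < 2 or n != len(egfr_values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times_in_years) / n
    mean_e = sum(egfr_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_in_years)
    if sxx == 0:
        raise ValueError("measurements must span more than one time point")
    sxy = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(times_in_years, egfr_values))
    return sxy / sxx


# Hypothetical cut-off for illustration; the abstract derives clusters
# from models rather than from a fixed threshold.
FAST_DECLINE_CUTOFF = -5.0  # ml/min/1.73 m² per year

# Made-up readings at years 0 to 3 for two hypothetical patients.
fast_slope = egfr_slope_per_year([0, 1, 2, 3], [60.0, 55.0, 50.0, 45.0])
slow_slope = egfr_slope_per_year([0, 1, 2, 3], [60.0, 59.5, 59.0, 58.5])

fast_is_flagged = fast_slope <= FAST_DECLINE_CUTOFF
slow_is_flagged = slow_slope <= FAST_DECLINE_CUTOFF
```

A per-patient fit like this ignores the pooling across patients that a mixed-effects model provides, which matters when the median patient has only four measurements.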


A clear fast-progressing cluster was identified, with an average eGFR decline of ≥5 ml/min/1.73 m² per year and a higher risk of all-cause mortality than the other clusters. Whilst Bayesian and unsupervised machine learning methods can detect non-linear patterns, we found broadly linear trajectories.


  • Commercial Support