
Kidney Week

Please note that you are viewing an archived section from 2023 and some content may be unavailable. To unlock all content for 2023, please visit the archives.

Abstract: TH-PO045

Using an Artificial Intelligence Tool Incorporating Natural Language Processing to Identify Low-Prevalence Cases of ANCA-Associated Vasculitis in Electronic Health Records

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science


  • van Leeuwen, Jolijn R., Leids Universitair Medisch Centrum, Leiden, Zuid-Holland, Netherlands
  • Penne, Erik Lars, Noordwest Ziekenhuisgroep, Alkmaar, Noord-Holland, Netherlands
  • Rabelink, Ton J., Leids Universitair Medisch Centrum, Leiden, Zuid-Holland, Netherlands
  • Knevel, Rachel, Leids Universitair Medisch Centrum, Leiden, Zuid-Holland, Netherlands
  • Teng, Yoe Kie Onno, Leids Universitair Medisch Centrum, Leiden, Zuid-Holland, Netherlands

Anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) is a rare, life-threatening, systemic autoimmune disease. Because of its low prevalence and heterogeneous registration, there is an urgent need to improve identification of AAV patients within the electronic health record (EHR) systems of health organizations to facilitate clinical research.


Our aim was to identify low-prevalence AAV patients with high sensitivity within large EHR systems (>2,000,000 records) using an artificial intelligence (AI) search tool. We combined a search on structured and unstructured data with natural language processing (NLP)-based exclusion. We developed the method in an academic center with an established AAV training set (n=203) and validated it in a non-academic center with a validation set (n=84). All identified patient records were reviewed anonymously for AAV diagnosis.
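
The two-stage idea described above (a broad search over structured and unstructured data, followed by NLP-based exclusion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the `anca_positive` field, the keyword, the negation cues, and the rule for combining criteria are hypothetical and are not the tool or the queries used in the study.

```python
import re

# Hypothetical search term and negation cues; not the study's actual queries.
DISEASE_TERM = "vasculitis"
NEGATION_CUES = ("no evidence of", "ruled out", "unlikely")

# Toy records: one structured field (a lab flag) and one free-text note each.
records = [
    {"id": 1, "anca_positive": True, "note": "Biopsy confirms ANCA-associated vasculitis."},
    {"id": 2, "anca_positive": False, "note": "No evidence of vasculitis on biopsy."},
    {"id": 3, "anca_positive": False, "note": "Routine follow-up for hypertension."},
    {"id": 4, "anca_positive": True, "note": "Rising creatinine. Nephrology consulted."},
]


def matches_search(record):
    """Broad, sensitivity-oriented search: a structured hit OR a free-text hit."""
    return record["anca_positive"] or DISEASE_TERM in record["note"].lower()


def only_negated_mentions(note):
    """True if the note mentions the disease and every such sentence is negated."""
    sentences = [s for s in re.split(r"[.!?]", note.lower()) if DISEASE_TERM in s]
    return bool(sentences) and all(
        any(cue in s for cue in NEGATION_CUES) for s in sentences
    )


# Stage 1: broad search. Stage 2: drop records whose only mentions are negated.
candidates = [r for r in records if matches_search(r)]
kept = [r for r in candidates if not only_negated_mentions(r["note"])]

print([r["id"] for r in candidates])  # [1, 2, 4]
print([r["id"] for r in kept])        # [1, 4]
```

In this toy setting the exclusion step removes a false positive (record 2) without touching records retrieved through structured data alone, which is the mechanism by which PPV can rise while sensitivity is largely preserved.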


The final search strategy combined four queries on disease description, laboratory measurements, medication, and specialisms. In the training center, this search identified 608 patients, of whom 346 were AAV patients upon manual review. Of the training set, 197/203 patients were retrieved, corresponding to a sensitivity of 97%. Applying NLP-based exclusion left 444 patients, including 339 AAV patients, increasing the positive predictive value (PPV) from 57% to 78% at a sensitivity of 96%. In the validation center, the search strategy identified 333 patients, of whom 194 were AAV patients, including 82/84 (98%) patients of the validation set. After NLP-based exclusion, 223 patients remained, including 196 AAV patients, improving PPV from 58% to 86% at a sensitivity of 98%. Our identification method outperformed ICD-10 coding, predominantly in identifying myeloperoxidase (MPO)-positive AAV patients and patients with few specialisms involved.
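
As a worked illustration of how the pre-exclusion figures above follow from the reported counts, the sketch below computes PPV as confirmed AAV patients divided by all patients retrieved, and sensitivity as reference-set patients retrieved divided by the size of the reference set. The function and variable names are illustrative, and only the counts stated in the text are used.

```python
def ppv(confirmed_cases: int, retrieved: int) -> float:
    """Positive predictive value: share of retrieved patients who are true cases."""
    return confirmed_cases / retrieved


def sensitivity(reference_found: int, reference_total: int) -> float:
    """Share of the known reference set that the search retrieved."""
    return reference_found / reference_total


# Training center, before NLP-based exclusion (counts from the text).
training_ppv = ppv(346, 608)
training_sensitivity = sensitivity(197, 203)

# Validation center, before NLP-based exclusion (counts from the text).
validation_ppv = ppv(194, 333)
validation_sensitivity = sensitivity(82, 84)

print(f"Training:   PPV {training_ppv:.0%}, sensitivity {training_sensitivity:.0%}")
print(f"Validation: PPV {validation_ppv:.0%}, sensitivity {validation_sensitivity:.0%}")
# Training:   PPV 57%, sensitivity 97%
# Validation: PPV 58%, sensitivity 98%
```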


We demonstrated excellent performance of an AI-based identification method incorporating NLP for identifying AAV patients in EHRs, and we validated its applicability and transportability in a second, non-academic center. This method can accelerate research efforts while avoiding the limitations of ICD-10-based registration.


  • Commercial Support – Vifor Pharma