Kidney Week

Abstract: SA-PO611

Leveraging Statistical Natural Language Processing (NLP) to Surface Clinically Relevant Biomarkers in Pediatric Nephrotic Syndrome

Session Information

Category: Genetic Diseases of the Kidney

  • 802 Non-Cystic Mendelian Diseases

Authors

  • Hildebrandt, Friedhelm, Boston Children's Hospital, Boston, Massachusetts, United States
  • Lovric, Svjetlana, Boston Children's Hospital, Boston, Massachusetts, United States
  • Shril, Shirlee, Boston Children's Hospital, Boston, Massachusetts, United States
  • Mcneillie, Patrick, IBM Watson Health, Bethesda, Maryland, United States
  • Dankwa-Mullan, Irene, IBM Watson Health, Bethesda, Maryland, United States
  • Leibovitz, Evan, IBM, Cambridge, Massachusetts, United States
  • Scanlan, Kevin J, IBM Watson Health, Bethesda, Maryland, United States

Background

The majority of pediatric patients with idiopathic nephrotic syndrome (NS) have minimal change disease, which is generally responsive to steroid therapy. Patients with genetic forms of steroid-resistant nephrotic syndrome (SRNS) are unresponsive to steroid therapy. Thus, therapeutic decisions are based on the underlying etiology, renal histology and genetic screening. While the mechanisms of NS are not well understood, recent advances in molecular genetics have shown that single-gene defects are responsible for 25-33% of all cases of isolated and syndromic SRNS. Biomarkers offer significant clinical value, providing information on disease diagnosis, prognosis, risk assessment, and treatment efficacy. However, extracting biomarkers from unstructured literature is time-consuming and requires domain expertise. This study evaluates the potential for NLP and cognitive analytics to facilitate a review of the SRNS literature and accelerate the discovery of potential biomarkers.

Methods

Boston Children’s Hospital (BCH) and IBM collaborated to train a machine learning model to identify appropriate entities and relationships across literature articles focused on SRNS. The team identified and labeled 11 entity types and 50 relationship types across 180 literature articles. The trained model was then tested against the unstructured text of articles, and its outputs were analyzed for accuracy and precision.

Results

Comparing the expert output with that of the trained model showed 100% precision (23/23) and 92.0% sensitivity (23/25). One false negative was due to a lack of co-reference resolution, which links the lexical subject across multiple sentences. The other false negative was due to the gene not being identified as relevant. The model took less than 30 seconds to identify the relevant biomarkers and provided passage-level references to enable seamless follow-up by the researcher.
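
The reported figures can be reproduced from the counts given above. The following is a minimal sketch, not the authors' evaluation code: it assumes 23 true positives, 0 false positives (implied by the 23/23 precision), and 2 false negatives (the two missed biomarkers), and the function names are illustrative only.

```python
from fractions import Fraction


def precision(tp: int, fp: int) -> Fraction:
    """Fraction of model-reported items that the expert also identified."""
    return Fraction(tp, tp + fp)


def sensitivity(tp: int, fn: int) -> Fraction:
    """Fraction of expert-identified items that the model recovered."""
    return Fraction(tp, tp + fn)


# Counts inferred from the abstract (assumption: no false positives).
tp, fp, fn = 23, 0, 2

print(f"precision   = {float(precision(tp, fp)):.1%}")    # 100.0%
print(f"sensitivity = {float(sensitivity(tp, fn)):.1%}")  # 92.0%
```

With only 25 expert-identified items, each missed biomarker shifts sensitivity by four percentage points, so these estimates carry considerable uncertainty.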

Conclusion

The machine learning model provided rapid and accurate extraction of potential molecular biomarkers for NS. With additional training, this model could be extended to other rare diseases, accelerating mutational analysis for therapeutic interventions.

Funding

  • Commercial Support –