Abstract: SA-PO561
Heterogeneity in Electronic Health Record (EHR) Phenotype Concepts in Collagen Type IV-Associated Nephropathies
Session Information
- Genetic Diseases: Diagnosis
November 05, 2022 | Location: Exhibit Hall, Orange County Convention Center, West Building
Abstract Time: 10:00 AM - 12:00 PM
Category: Genetic Diseases of the Kidneys
- 1102 Genetic Diseases of the Kidneys: Non-Cystic
Authors
- Nestor, Jordan Gabriela, Columbia University, New York, New York, United States
- Kiryluk, Krzysztof, Columbia University, New York, New York, United States
- Weng, Chunhua, Columbia University, New York, New York, United States
Background
Limited appreciation for the full spectrum of disease manifestations of collagen type IV-associated nephropathies (COL4A-AN) contributes to delays in diagnosis. Understanding the diversity of phenotypes is further complicated by the heterogeneity of terms used to describe phenotype concepts in the EHR.
Methods
We extracted terms from published COL4A-AN case series and mapped them to concept unique identifiers (CUIs) in the Unified Medical Language System (UMLS). We identified 100 exome-sequenced Columbia Biobank participants with diagnostic variant(s) in COL4A3/4/5 and performed a heuristic manual chart review. We counted the total number of unique concepts identified across structured (e.g., ICD-9/10 and SNOMED-CT codes) and unstructured (e.g., clinical narratives and raw laboratory values) formats. Each encoded data element was mapped to standardized terminologies of the OMOP Common Data Model. We then analyzed the diversity of codes used and conducted qualitative interviews with providers on billing practices.
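The code-diversity analysis described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the lookup table `CODE_TO_CONCEPT`, the function `codes_per_concept`, and the toy records are all hypothetical, and the concept labels stand in for the UMLS CUIs that the actual mapping would use. The sketch only shows the general idea of grouping distinct billing codes under the phenotype concept they express.

```python
from collections import defaultdict

# Hypothetical lookup from (vocabulary, code) to a phenotype concept label.
# Illustrative only; the study mapped data elements to UMLS CUIs and
# OMOP standard terminologies, which are not reproduced here.
CODE_TO_CONCEPT = {
    ("ICD10CM", "R31.0"): "hematuria",
    ("ICD10CM", "R31.9"): "hematuria",
    ("ICD10CM", "R31.21"): "hematuria",
    ("ICD10CM", "H90.3"): "hearing loss",
    ("ICD10CM", "H91.90"): "hearing loss",
}


def codes_per_concept(records):
    """Group the distinct (vocabulary, code) pairs seen for each concept.

    Records whose code is not in CODE_TO_CONCEPT are skipped.
    """
    seen = defaultdict(set)
    for vocabulary, code in records:
        concept = CODE_TO_CONCEPT.get((vocabulary, code))
        if concept is not None:
            seen[concept].add((vocabulary, code))
    return dict(seen)


# Toy billing records invented for this example (not study data).
toy_records = [
    ("ICD10CM", "R31.0"),
    ("ICD10CM", "R31.9"),
    ("ICD10CM", "R31.9"),   # repeated code counts once
    ("ICD10CM", "R31.21"),
    ("ICD10CM", "H90.3"),
    ("ICD10CM", "Z00.00"),  # not in the lookup, so ignored
]

diversity = codes_per_concept(toy_records)
for concept, codes in sorted(diversity.items()):
    print(f"{concept}: {len(codes)} distinct code(s)")
```

A concept with many distinct codes, as hematuria has in the toy data, is one that a phenotype algorithm keyed to a single code would tend to miss.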
Results
Most of the rich descriptions were documented within the text of clinical narratives written by kidney experts. In addition, a review of the raw urinalysis data revealed temporal, diagnostic evidence of hematuria in nearly half the cohort. Across structured data formats, we found numerous billing codes used to document particular concepts, such as hematuria and hearing loss. Through qualitative interviews, we found that nephrologists selected codes that reflected the primary disease addressed in the visit and ones that demonstrated the medical complexity of the patient’s disease to maximize reimbursement.
Conclusion
EHR data heterogeneity is an obstacle to the development of accurate and valid phenotype algorithms for COL4A-AN and should be accounted for in EHR phenotyping. Extracting concepts from clinical text using natural language processing techniques, in addition to structured data elements like billing codes, may prove useful.
UMLS Concept Map for COL4A-AN
Funding
- Other NIH Support