Kidney Week

Abstract: FR-PO022

Accuracy of Bayesian Improved First Name Surname Geocoding (BIFSG) for Race and Ethnicity Imputation in a Kidney Care Management Program to Assess Racial Disparities

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science

Authors

  • Bruce, Liana DesHarnais, Somatus, McLean, Virginia, United States
  • Krasniak, Christopher S., Somatus, McLean, Virginia, United States
  • Eddings, Cliff S., Somatus, McLean, Virginia, United States
  • Phan, Brandon, Somatus, McLean, Virginia, United States
  • Mikhael, Bassem, Somatus, McLean, Virginia, United States
  • Kimura, Joe, Somatus, McLean, Virginia, United States

Background

Self-reported data are the ideal source for classifying race and ethnicity when working to improve equity and close health outcome disparities, but response rates for these data are low, typically <20%. Our goal was to validate race and ethnicity imputed using the BIFSG algorithm against self-reported data in a kidney care management population.

Methods

Patients are assessed at baseline, and their responses are classified into six standardized combined categories defined by the Office of Management and Budget. We applied RAND’s indirect estimation method to generate estimates from first names, surnames, and ZIP Codes. Accuracy, specificity, sensitivity, and positive predictive value (PPV) were then calculated by comparing BIFSG-imputed values with self-reported values in a validation subsample.
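To illustrate the kind of Bayesian update that a name-and-geography method such as BIFSG performs, the sketch below combines a surname-based prior with first-name and geography likelihoods, assuming the three sources are conditionally independent given the category. The function name `bifsg_posterior`, the category labels, and every probability value are hypothetical; none of them come from RAND's implementation or from this study's data.

```python
from typing import Dict


def bifsg_posterior(
    p_cat_given_surname: Dict[str, float],
    p_firstname_given_cat: Dict[str, float],
    p_geo_given_cat: Dict[str, float],
) -> Dict[str, float]:
    """Return normalized posterior probabilities per category.

    Assumes conditional independence of surname, first name, and
    geography given the category (a naive-Bayes-style update).
    """
    unnormalized = {
        cat: p_cat_given_surname[cat]
        * p_firstname_given_cat[cat]
        * p_geo_given_cat[cat]
        for cat in p_cat_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has non-zero probability")
    return {cat: value / total for cat, value in unnormalized.items()}


# Hypothetical inputs for two placeholder categories (illustrative only).
posterior = bifsg_posterior(
    p_cat_given_surname={"group_a": 0.7, "group_b": 0.3},
    p_firstname_given_cat={"group_a": 0.01, "group_b": 0.03},
    p_geo_given_cat={"group_a": 0.02, "group_b": 0.01},
)

# One possible decision rule: assign the category with the largest posterior.
predicted = max(posterior, key=posterior.get)
```

Assigning the single most probable category is only one possible rule; whether the study used that rule or a probability threshold is not stated in the abstract.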

Results

Of 326,679 patients, 53,695 (16%) self-reported race/ethnicity. BIFSG produced a prediction for 269,354 (82%) of the overall population, including 44,964 of the self-report cohort. After imputation, 278,085 (85%) of patients had non-missing race/ethnicity. Overall imputed value accuracy compared to self-report was 99%. PPV was highest for Hispanic and lowest for American Indian or Alaskan Native, while accuracy was highest for Native and lowest for White.
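The coverage figures in this paragraph are consistent with a simple inclusion-exclusion count: patients with a self-report plus patients with a BIFSG prediction, minus those who have both. The short check below uses only the counts reported above; the variable names are illustrative.

```python
# Counts as reported in the Results paragraph.
total_patients = 326_679
self_reported = 53_695
bifsg_predicted = 269_354
both = 44_964  # self-reported patients who also received a BIFSG prediction

# Inclusion-exclusion: patients with at least one source of race/ethnicity.
covered = self_reported + bifsg_predicted - both


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


self_report_pct = pct(self_reported, total_patients)
predicted_pct = pct(bifsg_predicted, total_patients)
covered_pct = pct(covered, total_patients)
```

Here `covered` evaluates to 278,085, and the three percentages round to 16, 82, and 85, matching the reported values.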

Conclusion

Imputation of race/ethnicity can improve analyses of health disparities in kidney disease. The BIFSG imputation model obtained highly accurate (99%) predictions of race and ethnicity in a large chronic kidney disease population, increasing coverage of racial identity from 16% to 85%. The BIFSG algorithm could be supplemented with additional sources (e.g., historical records) to impute residual missing values. Incorporating additional data and advanced machine learning models will improve predictions to better track health disparities.

| Metric | Calculation | Overall | White | Black or African American | Hispanic | Asian or Pacific Islander | American Indian or Alaskan Native |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Precision (PPV) | TP/(TP+FP) | 80.53% | 75.10% | 89.14% | 92.23% | 69.20% | 51.35% |
| Recall (Sensitivity) | TP/(TP+FN) | 80.48% | 93.96% | 64.66% | 82.79% | 60.49% | 6.55% |
| Specificity | TN/(TN+FP) | 99.31% | 88.71% | 99.45% | 99.90% | 99.98% | 99.99% |
| Accuracy | (TN+TP)/(TN+TP+FP+FN) | 98.67% | 90.11% | 97.17% | 99.67% | 99.94% | 99.91% |

TP: True Positives, FP: False Positives, FN: False Negatives, TN: True Negatives
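The four formulas in the table can be computed per category by treating that category as "positive" and all others as "negative" (a one-vs-rest comparison). The sketch below is a minimal illustration of that calculation; the function name `one_vs_rest_metrics` and the toy labels are hypothetical and do not reproduce the study's data or its actual pipeline.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(
    self_reported: Sequence[str], imputed: Sequence[str], category: str
) -> Dict[str, float]:
    """Compute PPV, sensitivity, specificity, and accuracy for one category."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(self_reported, imputed):
        if guess == category and truth == category:
            tp += 1  # imputed and self-reported both match the category
        elif guess == category:
            fp += 1  # imputed as the category, self-reported otherwise
        elif truth == category:
            fn += 1  # self-reported as the category, imputed otherwise
        else:
            tn += 1  # neither source indicates the category

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Toy labels with placeholder categories (illustrative only).
truth = ["A", "A", "B", "B", "C", "A"]
guess = ["A", "B", "B", "B", "A", "A"]
metrics_b = one_vs_rest_metrics(truth, guess, "B")
```

Because true negatives enter both specificity and accuracy, a small category can show high accuracy even when its sensitivity is low, as the American Indian or Alaskan Native column in the table illustrates (6.55% sensitivity, 99.91% accuracy).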

Funding

  • Commercial Support – Somatus