Designing a standard protocol for manually reviewing patient data demographics for record linkage


  • Sen Xiong Indiana University School of Medicine, Regenstrief Institute
  • Shaun Grannis, MD, MS, FAAP Indiana University School of Medicine, Regenstrief Institute



Background and Hypothesis: Accurate record linkage is essential for addressing the fragmentation of patient data across independent healthcare organizations. Evaluating record linkage methods accurately requires so-called “gold standard” datasets with labeled true matches and non-matches. These datasets are created through human review, the process of manually assessing potentially linked patient demographic records and determining whether each record pair belongs to the same individual. However, the human review process is susceptible to bias and human error, so record linkage accuracy evaluations can be skewed by inaccurate gold standards. Consistent and scientifically rigorous methods for creating gold standard record linkage datasets must be developed, as none have yet been described. In this study, we describe a repeatable process for developing consistent manually reviewed datasets and analyze the results obtained when 15 human reviewers assessed 200 record pairs following our protocol.

Experimental Design/Methods: We obtained patient records from the Indiana Network for Patient Care and the Marion County Health Department. We created record pairs for manual review by probabilistically linking the datasets using multiple blocking schemes. Two hundred record pairs were then manually reviewed by 15 different individuals, and the results were analyzed.
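As an illustration of how multiple blocking schemes generate candidate record pairs, the following is a minimal sketch and not the study's actual pipeline. The field names (`last`, `dob`, `zip`), the two blocking keys, and the sample records are all hypothetical, and the probabilistic scoring step that would follow blocking is omitted.

```python
from collections import defaultdict

def block_pairs(records_a, records_b, key_funcs):
    """Return candidate (id_a, id_b) pairs that agree on at least one blocking key."""
    pairs = set()
    for key in key_funcs:
        # Index dataset B by the current blocking key.
        index = defaultdict(list)
        for rec in records_b:
            k = key(rec)
            if k is not None:
                index[k].append(rec["id"])
        # Pair each record in A with every B record sharing the same key.
        for rec in records_a:
            k = key(rec)
            if k is not None:
                for id_b in index.get(k, []):
                    pairs.add((rec["id"], id_b))
    return pairs

# Hypothetical blocking schemes (assumptions, not taken from the study).
def by_dob_and_zip(rec):
    return (rec["dob"], rec["zip"])

def by_last_and_birth_year(rec):
    return (rec["last"], rec["dob"][:4])

# Hypothetical sample records.
dataset_a = [
    {"id": "A1", "last": "SMITH", "dob": "1980-01-02", "zip": "46202"},
    {"id": "A2", "last": "JONES", "dob": "1975-05-05", "zip": "46204"},
]
dataset_b = [
    {"id": "B1", "last": "SMYTH", "dob": "1980-01-02", "zip": "46202"},
    {"id": "B2", "last": "JONES", "dob": "1975-05-15", "zip": "46220"},
]

candidates = block_pairs(dataset_a, dataset_b, [by_dob_and_zip, by_last_and_birth_year])
print(sorted(candidates))
```

Using more than one scheme means a pair missed by one key (for example, a misspelled last name) can still be captured by another, at the cost of more candidate pairs to score and review.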

Results: Of the 200 record pairs reviewed by 15 reviewers, 155 were concordant and 45 were discordant; 40 of the 45 discordant pairs were the result of outliers.
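A tally of this kind can be sketched as follows. This is a hedged example under stated assumptions, not the analysis used in the study: the abstract does not define "outlier," so the sketch assumes a discordant pair is outlier-driven when at most `max_dissenters` reviewers disagree with the majority. The labels, function names, and sample votes are hypothetical.

```python
from collections import Counter

def classify_pair(votes, max_dissenters=1):
    """Classify one record pair from its list of reviewer decisions."""
    counts = Counter(votes)
    if len(counts) == 1:
        return "concordant"  # every reviewer gave the same decision
    majority_size = counts.most_common(1)[0][1]
    dissenters = len(votes) - majority_size
    # Assumption: a few dissenting reviewers count as outliers.
    if dissenters <= max_dissenters:
        return "discordant_outlier"
    return "discordant_split"

def summarize(reviews, max_dissenters=1):
    """Count pairs in each category; reviews maps pair id -> list of decisions."""
    return Counter(classify_pair(v, max_dissenters) for v in reviews.values())

# Hypothetical decisions from five reviewers on three record pairs.
reviews = {
    "pair_1": ["match", "match", "match", "match", "match"],
    "pair_2": ["match", "match", "match", "match", "non-match"],
    "pair_3": ["match", "match", "non-match", "non-match", "match"],
}

summary = summarize(reviews)
print(summary)
```

The `max_dissenters` threshold is a design choice; with 15 reviewers, a study would need to state explicitly how many dissenting votes still count as outlier behavior.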

Conclusion and Potential Impact: The record pair evaluation results suggest that some empirical rules can be established for the manual review process, though the nuances of reviewers' reasoning require further discussion and a larger sample size. Nonetheless, establishing a standard for manual review is a step toward more complete patient records and better health care.