Entity Resolution to Improve the Quality of Transactional Data in a Healthcare System
Abstract
Healthcare data involves a complex network of entities such as patients, providers, and payers. Tracking every entity in the system with a high degree of confidence is one of the biggest data quality challenges in healthcare. Often referred to as "entity resolution", the precise association of each patient's care episodes is essential to retrieving complete patient histories. In this applied paper on transactional data from the healthcare system, we first draw up an inventory of problems related to patient disambiguation, such as identifier dissociations and collisions. Then, on a real dataset with more than 150 billion patient-healthcare professional interactions, we propose approaches to correctly re-associate the interactions with a unique patient identifier. The results show a 93% reduction in the gap between the number of patients observed and the number of patients expected according to the Census.