RNTI

Nettoyage de données guidé par la sémantique inter-colonnes (Data cleaning guided by inter-column semantics)
In EGC 2016, vol. RNTI-E-30, pp. 549-550
Abstract
The volume of unstructured and heterogeneous data is growing rapidly, and these data come from multiple sources of varying quality. As a result, data are often handled without any knowledge of their structure or semantics: the metadata may be insufficient or missing altogether. Data anomalies may stem from poor or missing semantic descriptions. We propose an approach for better understanding the semantics and structure of the data. It first corrects intra-column anomalies (homogenization) and then inter-column anomalies caused by violations of semantic dependencies.
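To make the two-step idea concrete, the following is a minimal illustrative sketch and not the authors' method, which the abstract does not detail. It assumes that homogenization can be approximated by a lookup table of known spellings, and that a semantic dependency can be approximated by a functional dependency (here, city determines country) repaired by majority vote. The names `homogenize`, `repair_dependency`, `canonical`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def homogenize(values, canonical):
    """Intra-column step: map each raw value to a canonical spelling.

    `canonical` is a hypothetical lookup table (lower-cased variant -> canonical form).
    Values missing from the table are only stripped of surrounding whitespace.
    """
    cleaned = []
    for v in values:
        key = v.strip().lower()
        cleaned.append(canonical.get(key, v.strip()))
    return cleaned


def repair_dependency(rows, lhs, rhs):
    """Inter-column step: enforce the dependency lhs -> rhs by majority vote.

    For each lhs value, the most frequent rhs value is assumed to be correct
    (an assumption of this sketch), and disagreeing rows are overwritten.
    Returns the repaired rows and the indices of the rows that changed.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lhs]][row[rhs]] += 1
    majority = {k: c.most_common(1)[0][0] for k, c in counts.items()}

    repaired, changed = [], []
    for i, row in enumerate(rows):
        fixed = dict(row)  # copy so the input rows are left untouched
        if fixed[rhs] != majority[fixed[lhs]]:
            fixed[rhs] = majority[fixed[lhs]]
            changed.append(i)
        repaired.append(fixed)
    return repaired, changed


# Hypothetical sample data: inconsistent spellings and one wrong country.
cities = ["Paris", " paris", "PARIS", "Lyon", "lyon ", "Paris"]
countries = ["France", "France", "Italy", "France", "France", "France"]
canonical = {"paris": "Paris", "lyon": "Lyon"}

clean_cities = homogenize(cities, canonical)
rows = [{"city": c, "country": k} for c, k in zip(clean_cities, countries)]
repaired, changed = repair_dependency(rows, "city", "country")
print(changed)      # indices of rows whose country was corrected
print(repaired[2])  # the row that violated city -> country
```

The order of the two steps matters in this sketch: without homogenization, "Paris", " paris", and "PARIS" would be treated as three distinct cities, and the inconsistent country in the third row would go undetected.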