RNTI

Une approche combinée pour l'enrichissement d'ontologie à partir de textes et de données du LOD (A combined approach for ontology enrichment from texts and LOD data)
In EGC 2016, vol. RNTI-E-30, pp. 171-182
Abstract
This paper proposes an approach for automatically labeling documents that describe products with very specific concepts reflecting particular user needs. The approach addresses a threefold challenge: 1) the concepts used for labeling have no direct terminology in the documents, 2) their formal definitions are not known in advance, and 3) not all the necessary information is mentioned in the documents. To solve this problem, we propose a two-step annotation process guided by an ontology. The first step populates the ontology with information extracted from the documents, supplemented with information from external resources. The second step reasons over the extracted data, either to learn concept definitions or to apply definitions that have already been learned. The SAUPODOC approach is thus a novel approach to ontology enrichment that builds on the foundations of the Semantic Web, combining the contributions of Linked Open Data (LOD) with text analytics, machine learning, and reasoning tools. An evaluation on two application domains yields good-quality results and demonstrates the value of the approach.