Learning sequential structures for entity and relation extraction in call-for-tender texts
Abstract
In this article, we present a study that applies machine learning methods for sequential
structure extraction to the extraction of semantic relations from call-for-tender databases on public
facilities projects. One of the relationships we consider concerns the impact of a development
project. We characterize it as an association between the concepts that define the infrastructure
(buildings) and the concepts that define their siting, namely surface areas. This sequential
structure extraction task is treated as a sequence labeling problem. A
comparative study is carried out using several statistical learning techniques. It demonstrates
the robustness of the conditional random field (CRF) model for this kind of task when long-range
features that describe the context in which the labels occur are taken into account.
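
As a minimal illustration of the sequence labeling formulation mentioned above, the sketch below encodes entity spans as one BIO label per token and builds context-window features for each token. The example sentence, the label names BUILDING and SURFACE, the feature names, and both helper functions are assumptions made for illustration; they are not taken from the study.

```python
from typing import Dict, List, Tuple


def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, type) spans into one BIO label per token."""
    labels = ["O"] * n_tokens
    for start, end, kind in spans:
        labels[start] = f"B-{kind}"          # first token of the entity
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"          # remaining tokens of the entity
    return labels


def window_features(tokens: List[str], i: int, width: int = 2) -> Dict[str, str]:
    """Describe token i by itself and by its neighbours within `width` positions."""
    feats = {"w0": tokens[i].lower(), "is_digit": str(tokens[i].isdigit())}
    for off in range(-width, width + 1):
        j = i + off
        if off != 0 and 0 <= j < len(tokens):
            feats[f"w{off:+d}"] = tokens[j].lower()  # e.g. "w-2", "w+1"
    return feats


# Hypothetical call-for-tender fragment (not from the study's corpus).
tokens = ["construction", "of", "a", "school", "of", "1200", "m2"]
labels = spans_to_bio(len(tokens), [(3, 4, "BUILDING"), (5, 7, "SURFACE")])
features = [window_features(tokens, i) for i in range(len(tokens))]

for tok, lab in zip(tokens, labels):
    print(f"{tok}\t{lab}")
```

Widening `width` lets the feature set for a surface token reach back to the building mention, which is one simple way to expose longer-range context to a sequence labeler such as a CRF.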