Discovery and Extraction of Correlated N-ary Relation Arguments in Texts
Abstract
In this paper, we present a hybrid method, based on data mining approaches and syntactic relations, to
automatically discover and extract relevant data from plain text. We use a domain Ontological and
Terminological Resource (OTR) in which relevant data are modelled as n-ary relations. An n-ary relation
links a studied object (e.g. a packaging) with its features, represented as several arguments (e.g. its thickness). Our
work focuses on extracting those arguments from texts in order to populate the OTR with new instances.
The method relies on discovering implicit rules concerning the expression of arguments in texts using
sequential pattern mining and sequential rules, and on integrating specific syntactic relations in the discovered
sequential patterns to construct linguistic sequential patterns of correlated arguments in texts.
We have conducted conclusive experiments on a corpus from the food packaging domain, where the relevant data to
be extracted are experimental results on packagings.
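
As a purely illustrative sketch of the sequential pattern mining step mentioned above (not the authors' implementation), the following Python code enumerates order-preserving subsequences of toy token sequences and keeps those that reach a minimum support. The sentences, the labels NUM and UNIT, and the function names are all hypothetical; a real system would use a dedicated mining algorithm rather than brute-force enumeration.

```python
from collections import Counter
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequent_sequential_patterns(sequences, min_support, max_len=3):
    """Brute-force mining: count, for each ordered subsequence of length
    <= max_len, the number of input sequences that contain it."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        for n in range(1, max_len + 1):
            # combinations() preserves the original order of the items
            seen.update(combinations(seq, n))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical sentences, already reduced to tokens and argument labels.
sentences = [
    ["packaging", "thickness", "NUM", "UNIT"],
    ["thickness", "of", "NUM", "UNIT"],
    ["packaging", "temperature", "NUM", "UNIT"],
]

patterns = frequent_sequential_patterns(sentences, min_support=3)
```

With these toy inputs, the only patterns shared by all three sentences involve a numeric value followed by a unit, which is the kind of regularity in argument expression that the method aims to discover.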