Constraint Extraction from Data Validation Specifications
Abstract
Data validation specifications consist mainly of sentences whose verb phrases express
constraints to be verified. To process these natural language specifications automatically,
we need to extract and identify those constraints using natural language processing
tools. In this article, we present an experiment with a neural network model based
on fine-tuning BERT. A list of constraints to identify and a corpus of sentences and
clauses were created, and a paraphrase generator was used to compensate for the
lack of training data. The results are promising but could still be improved, for example by
increasing the amount of data.