Anomaly Detection in Contractual Legal Documents
Abstract
This study aims to identify different types of anomalies in image corpora of legal contracts with a homogeneous structure, such as sets of contracts from the same source. To this end, we combine structural and semantic analysis methods. The proposed structural analysis methods adapt to different types of contracts and require only a small amount of annotated data. Building on the structural analysis, we present a preliminary study on extracting structural and semantic anomalies from the logical content of the documents, using original text categorization methods based on folding. Each stage of this process is evaluated in detailed experiments on real contract databases.