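Reconnaissance de sections et d'entités dans les décisions de justice : application des modèles probabilistes HMM et CRF (Recognition of Sections and Entities in Court Decisions: Application of the HMM and CRF Probabilistic Models)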
In EGC 2017, vol. RNTI-E-33, pp. 201-212
Abstract
A court decision is a text document that summarizes the outcome of a court case. Lawyers regularly use decisions as a source for interpreting the law and for understanding the opinions of judges. The huge quantity of available decisions calls for automated solutions to assist legal practitioners. As part of a larger project, we propose to address some of the challenges related to searching and analyzing the growing set of court decisions in France. The first phase of this project focuses on extracting information from decisions in order to build a jurisprudential knowledge base that structures and organizes them. Such a base facilitates descriptive and predictive analysis of corpora of decisions. This paper presents an application of probabilistic models to the zoning of decisions into sections and to the recognition of entities in their content (location, date, participants, rules of law, ...). Our tests show the advantage of approaches based on Conditional Random Fields (CRF) over simpler and faster models based on Hidden Markov Models (HMM). We present the technical aspects of selecting and annotating the training corpus and of defining discriminating descriptors. The specific characteristics of the texts are important and should be taken into account when applying information extraction methods in a specific domain.
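To make the zoning task concrete, the following is a minimal sketch of how an HMM could label the lines of a decision with section names using Viterbi decoding. It is an illustration only: the section labels (`header`, `body`, `conclusion`), the observation symbols (`meta`, `text`, `ruling`), and all probabilities are hypothetical values chosen by hand, not the label set, descriptors, or parameters used in the paper. A CRF would replace the single observation symbol per line with a richer set of overlapping descriptors.

```python
import math

# Hypothetical section labels; the paper's actual label set may differ.
STATES = ["header", "body", "conclusion"]

# Toy, hand-set probabilities (assumptions, not estimated from any corpus).
START = {"header": 0.8, "body": 0.15, "conclusion": 0.05}
TRANS = {
    "header": {"header": 0.6, "body": 0.39, "conclusion": 0.01},
    "body": {"header": 0.01, "body": 0.79, "conclusion": 0.2},
    "conclusion": {"header": 0.01, "body": 0.01, "conclusion": 0.98},
}
# Each line is reduced to one coarse, hypothetical observation symbol.
EMIT = {
    "header": {"meta": 0.7, "text": 0.2, "ruling": 0.1},
    "body": {"meta": 0.1, "text": 0.8, "ruling": 0.1},
    "conclusion": {"meta": 0.05, "text": 0.25, "ruling": 0.7},
}


def viterbi(observations):
    """Return the most probable section sequence under the toy HMM."""
    if not observations:
        return []
    # Log-space scores of the best path ending in each state at the first line.
    first = observations[0]
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][first]) for s in STATES}]
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for obs in observations[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            ptr[s] = best
        scores.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


lines_as_symbols = ["meta", "meta", "text", "text", "ruling", "ruling"]
sections = viterbi(lines_as_symbols)
```

With these toy parameters, the strong self-transitions keep adjacent lines in the same section, and the near-zero backward transitions encode the assumption that a decision moves from header to body to conclusion without returning.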