RNTI

CATI: an interactive approach for discovering and classifying large document corpora
In EGC 2022, vol. RNTI-E-38, pp. 75-86
Abstract
In this paper we present CATI, an approach to multimodal document classification implemented in an assisted, interactive web application for manipulating document collections. The application helps users who are not computer scientists discover, browse, and classify large document collections in which documents contain text and may come with images and metadata such as timestamp, author, and geolocation. CATI provides classification assistants such as event detection and text- and image-based document clustering. Its interface helps users select among several features, based on text and other information, to classify the documents. Our study shows that using the classification assistants and helping users choose the right features gives good classification results for large document collections within a few clicks.