RNTI

CORPEX: Exploratory analysis of a biomedical corpus using co-clustering (original title: CORPEX : Analyse exploratoire d'un corpus biomédical à l'aide de la classification croisée)
In EGC 2023, vol. RNTI-E-39, pp. 597-604
Abstract
We propose an interface that supports corpus analysis through interactive visualizations of co-clusters, allowing users to explore the topics in a set of texts. The user can create or load a corpus of documents, clean it, and study the terms and the documents simultaneously. This article details the functionality for dynamically generating corpora, especially in a biomedical context, and for loading document-term matrices of already pre-processed corpora. Analyzing the corpus by co-clustering and jointly visualizing the terms and documents according to the resulting co-partition are effective tools for quickly understanding the topics in a corpus. Results are saved automatically, which makes it easy to rerun different co-clustering analyses and to obtain cross-views of the topics at different levels of granularity.
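
To make the notions of document-term matrix and co-partition concrete, here is a minimal Python sketch. It is not the CORPEX implementation: the toy corpus, the function names `document_term_matrix` and `block_summary`, and the hand-assigned cluster labels are all assumptions made for illustration. The labels stand in for the output of whichever co-clustering algorithm is used. The sketch only shows how a co-partition turns the matrix into a small table of blocks, one per pair of document cluster and term cluster.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def block_summary(matrix, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Sum the counts that fall in each (document cluster, term cluster) block."""
    blocks = [[0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            blocks[row_labels[i]][col_labels[j]] += value
    return blocks


# Toy corpus, invented for illustration only.
docs = [
    "gene expression protein gene",
    "protein gene mutation",
    "patient trial treatment patient",
    "treatment patient trial dose",
]
vocab, matrix = document_term_matrix(docs)

# Hypothetical co-partition, assigned by hand: it stands in for the
# result of a co-clustering algorithm and is not computed here.
doc_labels = [0, 0, 1, 1]
term_cluster = {
    "gene": 0, "expression": 0, "protein": 0, "mutation": 0,
    "patient": 1, "trial": 1, "treatment": 1, "dose": 1,
}
term_labels = [term_cluster[term] for term in vocab]

blocks = block_summary(matrix, doc_labels, term_labels, 2, 2)
print(vocab)
print(blocks)  # [[7, 0], [0, 8]]
```

In this toy case all counts fall on the diagonal blocks, so each document cluster is described by exactly one term cluster, which is the kind of structure a joint view of terms and documents is meant to reveal. Rerunning `block_summary` with a different number of clusters gives a coarser or finer view of the same matrix.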