SENSE-LM: A Synergy Between Language Models and Sensorimotor Representations for Retrieving Olfactory and Auditory References in Written Documents
Abstract
The five human senses – vision, taste, smell, hearing, and touch – shape human perception
through multiple modalities. Extracting references to sensory experiences in text is a complex
task with broad applications. This paper introduces SENSE-LM, an information extraction system
designed to extract sensory references from large text collections. By combining a language
model, BERT, with linguistic resources such as sensorimotor norms, SENSE-LM performs sensory
extraction at both coarse-grained (sentence classification) and fine-grained (sensory term
extraction) levels. Our evaluation on olfaction- and audition-centered texts shows that SENSE-LM
outperforms state-of-the-art methods on both tasks.