Rule-based normalization: an efficient strategy for LLM-based event extraction
Abstract
In this paper, we explore the integration of LLMs with symbolic processing to achieve
fine-grained event extraction. We show that the weakness of LLMs in producing
structured information, often pointed out in the literature, can be overcome by designing a
domain-tailored mapping function (hybridization). To support this claim, we compare
the results of an in-context learning method with those of our hybrid methodology and show that
the latter achieves superior results (+6.3%) on a new dataset of subject-predicate-object triples
in the medical domain (681 triples from 200 sentences). This result is achieved by leaving the
LLM (Llama-3) free to generate the predicate types it is most familiar with, and then applying
a mapping function to normalize them. Besides improving the explainability and controllability of
the output, introducing such a function (which was implemented in five days) roughly halves
the greenhouse gas (GHG) emissions produced when processing the corpus.
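
As an illustration of the kind of mapping function described above, the following minimal sketch normalizes free-form predicates produced by an LLM into a closed set of predicate types. The labels (TREATS, CAUSES, DIAGNOSES, OTHER), the regular expressions, and the function names are hypothetical and chosen only for this example; they are not the rules or the label set used in the paper.

```python
import re

# Hypothetical rule set: each rule pairs a pattern over the free-form
# predicate with a target predicate type (assumed labels, not the paper's).
RULES = [
    (re.compile(r"\b(treat|cure|alleviat|reliev)\w*", re.IGNORECASE), "TREATS"),
    (re.compile(r"\b(caus|induc|trigger|lead)\w*", re.IGNORECASE), "CAUSES"),
    (re.compile(r"\b(diagnos|detect|reveal)\w*", re.IGNORECASE), "DIAGNOSES"),
]
FALLBACK = "OTHER"  # assumed label for predicates that no rule covers


def normalize_predicate(raw):
    """Return the type of the first rule matching the raw predicate."""
    for pattern, label in RULES:
        if pattern.search(raw):
            return label
    return FALLBACK


def normalize_triple(triple):
    """Normalize only the predicate; subject and object are kept as generated."""
    subj, pred, obj = triple
    return (subj.strip(), normalize_predicate(pred), obj.strip())


# Toy triples standing in for unconstrained LLM output.
llm_output = [
    ("aspirin", "is used to relieve", "headache"),
    ("smoking", "leads to", "lung cancer"),
    ("MRI", "shows", "lesion"),
]
normalized = [normalize_triple(t) for t in llm_output]
```

Because the rules are applied in order and are plain data, each normalized predicate can be traced back to the rule that produced it, which is one way such a function could support explainability and controllability.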