Studying Uncertainty in Scientific Articles: Putting a Linguistic Method into Perspective
Abstract
Uncertainty is an integral part of the scientific research process and is inherent in the construction
of new knowledge. In this article, we examine the way in which uncertainty is expressed
in scientific articles, and propose an annotation framework that takes into account the
different dimensions of this notion. Scientific uncertainty is defined here as the expression of a
lack of knowledge, or a lack of precision in the information, about an identified subject or concept.
We present a gold standard dataset of 1,839 manually annotated sentences drawn from scientific
articles in several disciplines. We also propose a linguistic knowledge-based approach
for the automatic annotation of articles and for the detection and categorisation of scientific
uncertainty. We compare our approach, in terms of precision, recall and F1 score, with
few-shot prompting of the Phi-3.5 and Llama 3 large language models on the same
annotation task. This comparative evaluation shows similar scores across the
approaches, with F1 scores of up to 0.858 for our approach.
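
As a reminder of how the three evaluation metrics relate to one another, the following minimal sketch computes precision, recall and F1 for a single label over sentence-level annotations. The function name `precision_recall_f1`, the label strings and the toy data are assumptions made for illustration only; they are not taken from the dataset or from the evaluation code used in the article.

```python
def precision_recall_f1(gold, predicted, positive="uncertain"):
    """Compute precision, recall and F1 for one label over paired sentence labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # True positives: sentences labelled positive in both gold and prediction
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: predicted positive, but gold says otherwise
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: gold positive, but missed by the prediction
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, invented for illustration (hypothetical, not from the gold standard)
gold = ["uncertain", "certain", "uncertain", "certain", "uncertain"]
predicted = ["uncertain", "uncertain", "uncertain", "certain", "certain"]
p, r, f = precision_recall_f1(gold, predicted)
```

The same function could be applied separately to the output of each annotation method against the gold standard, which is one simple way to obtain directly comparable scores.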