RNTI

Classification de questions en langage naturel par le type sémantique des réponses attendues (Classification of natural language questions by the semantic type of the expected answers)
In EGC 2021, vol. RNTI-E-37, pp. 181-192
Abstract
Question answering (QA) systems traditionally comprise three tasks: 1) analysis of the question, 2) analysis of the text collection containing the answers, and 3) search and extraction of the answers. Over the last decade, learning-based QA systems have taken the form of end-to-end models in which these three stages are no longer explicitly represented. As a result, the most recent QA systems make many mistakes when the answer is not in the text or when reasoning is required. In particular, the semantic type of the expected answer (TSA) may be inconsistent with the semantic type of the returned answer. In this article, we focus on the task of identifying the TSA. First, we propose a taxonomy to represent TSAs. Second, we experiment with models built on CamemBERT and trained on FQUAD, a French dataset of questions and their associated answers. The evaluation is carried out on PIAF, another French dataset of questions and their associated answers.
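To make the notion of TSA consistency concrete, here is a minimal sketch of the check described above: predict the type a question expects, then compare it with the type of the answer a QA system returned. It is not the method of the article, which relies on CamemBERT models; a few hand-written regular expressions stand in for the classifier. The labels (`PERSON`, `LOCATION`, `DATE`, `QUANTITY`, `OTHER`) and the functions `expected_answer_type` and `is_consistent` are hypothetical, since the abstract does not give the actual taxonomy.

```python
import re

# Hypothetical coarse labels: the taxonomy proposed in the article is not
# described in the abstract, so these are illustrative assumptions only.
RULES = [
    (re.compile(r"^\s*(who|whom|whose)\b", re.IGNORECASE), "PERSON"),
    (re.compile(r"^\s*where\b", re.IGNORECASE), "LOCATION"),
    (re.compile(r"^\s*when\b", re.IGNORECASE), "DATE"),
    (re.compile(r"^\s*how\s+(many|much)\b", re.IGNORECASE), "QUANTITY"),
]


def expected_answer_type(question: str) -> str:
    """Return the first label whose pattern matches the question, else OTHER."""
    for pattern, label in RULES:
        if pattern.search(question):
            return label
    return "OTHER"


def is_consistent(question: str, returned_answer_type: str) -> bool:
    """Check a returned answer's type against the type the question expects.

    Questions classified as OTHER are never flagged, because this toy
    classifier has no opinion about them.
    """
    expected = expected_answer_type(question)
    return expected == "OTHER" or expected == returned_answer_type
```

With these rules, a `PERSON` answer returned for a "When ..." question would be flagged as inconsistent, which is the kind of error the abstract attributes to end-to-end systems. A learned classifier would replace `expected_answer_type` while leaving the consistency check unchanged.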