Iterative and semi-supervised design of conversational assistants through interactive clustering of questions
Abstract
Designing the dataset needed to train a chatbot is most often a manual and
tedious step. To guarantee efficient and objective annotation, we propose an
active learning method based on constraint annotation. It is an iterative approach that relies on
a clustering algorithm to segment the data and uses the annotator's knowledge to guide the clustering from
unlabeled questions to a relevant intent structure. In this paper, we study the optimal modeling
parameters for obtaining an exploitable dataset with a minimum of annotations, and show that this
approach makes it possible to build a coherent structure for training a chatbot.
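
As an illustration of the iterative loop described above, the following is a minimal sketch and not the implementation studied in the paper. It assumes must-link / cannot-link constraints on pairs of questions, a word-overlap (Jaccard) similarity, a greedy single-link clustering, and a "most similar unannotated pairs first" sampling rule. All function names, the threshold, and the batch size are hypothetical choices made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity (assumed stand-in for a real text vectorizer)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def constrained_clustering(questions, must_link, cannot_link, threshold=0.3):
    """Greedy single-link clustering that respects pairwise constraints.

    Assumes the constraints are mutually consistent; conflicts are not detected.
    Returns one cluster identifier per question.
    """
    parent = list(range(len(questions)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def violates(i, j):
        # A merge is forbidden if it would put a cannot-link pair together.
        ri, rj = find(i), find(j)
        return any({find(a), find(b)} == {ri, rj} for a, b in cannot_link)

    # Must-link pairs are merged unconditionally.
    for i, j in must_link:
        parent[find(i)] = find(j)

    # Remaining merges go from most to least similar pair, down to the threshold.
    pairs = sorted(
        combinations(range(len(questions)), 2),
        key=lambda p: jaccard(questions[p[0]], questions[p[1]]),
        reverse=True,
    )
    for i, j in pairs:
        if jaccard(questions[i], questions[j]) < threshold:
            break
        if find(i) != find(j) and not violates(i, j):
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(questions))]


def interactive_clustering(questions, annotate, n_iterations=3, batch_size=2):
    """Alternate constraint annotation and constrained clustering.

    `annotate(i, j)` is the annotator: it returns True when questions i and j
    express the same intent, False otherwise.
    """
    must_link, cannot_link = set(), set()
    labels = constrained_clustering(questions, must_link, cannot_link)
    for _ in range(n_iterations):
        done = must_link | cannot_link
        # Assumed sampling rule: most similar unannotated pairs first.
        candidates = sorted(
            (p for p in combinations(range(len(questions)), 2) if p not in done),
            key=lambda p: jaccard(questions[p[0]], questions[p[1]]),
            reverse=True,
        )
        if not candidates:
            break
        for i, j in candidates[:batch_size]:
            (must_link if annotate(i, j) else cannot_link).add((i, j))
        # Re-cluster with the enlarged constraint set.
        labels = constrained_clustering(questions, must_link, cannot_link)
    return labels
```

In this sketch the annotator never names intents directly. Each answer only states whether two questions belong together, and the intent structure emerges from the clusters after each round.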