Construction and Exploitation of a Multilingual Algerian Corpus for Opinion and Emotion Analysis
Abstract
This paper addresses the lack of resources for opinion and emotion analysis of North African dialects in general and the Algerian dialect in particular. We propose "TWIFIL", a collaborative platform for annotating multilingual public data. The result is a human-generated corpus of extracted tweets. The purpose of this work is twofold. First, it addresses the shortage of relevant data for opinion and emotion analysis of the Algerian dialect. Second, it provides a more reliable annotated corpus, since the annotations reflect the judgment of more than one person. We also report on a number of evaluations that we performed to test the generated corpus.