Complementarity of Vector Representations for Semantic Similarity
Abstract
The goal of the Semantic Textual Similarity task is to automatically quantify the semantic
similarity of two text snippets. Since 2012, the task has been organized annually as
part of the SemEval evaluation campaign. This paper presents a method that aims to combine
different sentence-based vector representations in order to improve the computation of semantic
similarity values. Our hypothesis is that such a combination of different representations
allows us to pinpoint different semantic aspects, which improves the accuracy of similarity
computations. The method's main difficulty lies in the selection of the most complementary
representations, for which we present an optimization method. Our final system is based on the
winning system of the 2015 evaluation campaign, augmented with the complementary vector
representations selected by our optimization method. We also present evaluation results on
the data set of the 2016 campaign, which confirm the benefit of our method.
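
To make the idea of selecting complementary representations concrete, the following is a minimal sketch, not the optimization method of this paper (which the abstract does not detail). It assumes that each representation yields one cosine similarity per sentence pair, that similarities are combined by a plain average, and that a greedy forward search keeps a representation only if it raises the Pearson correlation with gold scores. The function names (`cosine`, `pearson`, `combined_scores`, `greedy_select`) and the averaging and greedy strategies are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def combined_scores(features, selected):
    """Average the per-pair similarities of the selected representations.

    `features` maps a (hypothetical) representation name to its list of
    similarity scores, one score per sentence pair.
    """
    n = len(next(iter(features.values())))
    return [sum(features[name][i] for name in selected) / len(selected)
            for i in range(n)]


def greedy_select(features, gold):
    """Greedy forward selection (an assumed strategy, for illustration only).

    Repeatedly adds the representation that most improves the correlation
    between the averaged similarity and the gold scores; stops when no
    remaining candidate improves it.
    """
    selected, best = [], float("-inf")
    remaining = set(features)
    while remaining:
        scored = [(pearson(combined_scores(features, selected + [name]), gold), name)
                  for name in sorted(remaining)]
        score, name = max(scored)
        if score <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = score
    return selected, best
```

In use, each entry of `features` would be filled by calling `cosine` on the two sentence vectors produced by one representation, for every pair in a training set; `greedy_select` then returns the subset whose averaged similarity correlates best with the human judgments on that set.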