A sentence similarity measure based on semantic kernels
Abstract
We propose a new approach to measuring semantic similarity between sentences, based on the
semantic kernels that the sentences contain. Kernels, which are (subject, verb, object) triples,
are meant to summarize the general meaning of the sentence they belong to. Based on the
semantic similarities between kernel elements, we build descriptive features summarizing information
about the semantic similarity between the sentences the kernels originate from. Then, using a
supervised machine learning technique, we estimate the coefficients of the descriptive features.
The learning process is done on a benchmark containing sentences whose semantic similarities
were evaluated by human experts. Comparative studies with other semantic similarity measures in
the literature show good performance of our approach. Based on the latter, an application is
being developed for highlighting semantic parts related to the elements described in abstracts
of scientific articles.
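
The pipeline outlined in this abstract (kernels, kernel-level features, learned coefficients) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the character-bigram overlap `toy_word_similarity` is only a stand-in for a real semantic word similarity, the three features (subject, verb, object similarity) are one possible choice of descriptive features, and plain gradient descent on a linear model stands in for whichever supervised technique is actually used. The kernels and scores in the toy data are invented.

```python
from typing import List, Sequence, Tuple

Kernel = Tuple[str, str, str]  # hypothetical representation: (subject, verb, object)


def toy_word_similarity(a: str, b: str) -> float:
    """Placeholder similarity: Jaccard overlap of character bigrams, in [0, 1].

    A real system would use a semantic measure here; this is only a stand-in.
    """
    def bigrams(w: str) -> set:
        w = w.lower()
        return {w[i:i + 2] for i in range(len(w) - 1)} or {w}

    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y)


def kernel_features(k1: Kernel, k2: Kernel) -> List[float]:
    """One feature per kernel slot: subject, verb, and object similarity."""
    return [toy_word_similarity(a, b) for a, b in zip(k1, k2)]


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear model: w[0] is the bias, w[1:] are the feature coefficients."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def mse(w: Sequence[float], X: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Mean squared error of the linear model on (X, y)."""
    return sum((predict(w, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)


def fit_coefficients(X: Sequence[Sequence[float]], y: Sequence[float],
                     lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Estimate coefficients by batch gradient descent on the mean squared error."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)  # prepend 1.0 for the bias term
            err = sum(wj * xj for wj, xj in zip(w, row)) - yi
            for j, xj in enumerate(row):
                grad[j] += 2 * err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Invented kernel pairs with made-up "human" similarity scores in [0, 1].
pairs = [
    (("cat", "eats", "fish"), ("cat", "eats", "fish"), 1.0),
    (("cat", "eats", "fish"), ("cats", "eat", "fishes"), 0.9),
    (("cat", "eats", "fish"), ("dog", "chases", "ball"), 0.1),
    (("doctor", "treats", "patient"), ("doctor", "reads", "book"), 0.3),
]
X = [kernel_features(k1, k2) for k1, k2, _ in pairs]
y = [score for _, _, score in pairs]
weights = fit_coefficients(X, y)
```

After fitting, `predict(weights, kernel_features(k1, k2))` gives an estimated similarity for a new pair of kernels. A linear model keeps the learned coefficients directly interpretable as the weight given to each kernel slot, which is the main reason for choosing it in this sketch.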