Metric embedding for semantic similarity computation at scale
Abstract
In this paper, we explore the embedding of the shortest-path metric of a knowledge base
(WordNet) into the Hamming hypercube, in order to improve computation performance.
We show that, although an isometric embedding is intractable, it is possible to achieve good
non-isometric embeddings. We report a speedup of three orders of magnitude for the task
of computing Leacock and Chodorow (LCH) similarities while keeping strong correlations
(r = .819, ρ = .826).
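For reference, the Leacock and Chodorow similarity is commonly defined as sim_LCH(c1, c2) = -log(len(c1, c2) / (2D)), where len(c1, c2) is the shortest-path length between the two concepts in the taxonomy and D is its maximum depth. The following sketch is purely illustrative, not the implementation evaluated in this paper: it assumes concepts have already been assigned binary codes (packed here as Python integers, an assumption) whose Hamming distance approximates the shortest-path length, so that an XOR plus popcount replaces a graph traversal.

    import math

    def hamming_distance(code_a: int, code_b: int) -> int:
        # Binary codes packed as integers: XOR marks differing bits,
        # and counting them gives the Hamming distance.
        return bin(code_a ^ code_b).count("1")

    def lch_similarity(path_length: int, taxonomy_depth: int) -> float:
        # Standard Leacock-Chodorow formula: -log(len / (2 * D)).
        return -math.log(path_length / (2.0 * taxonomy_depth))

    def approx_lch(code_a: int, code_b: int, taxonomy_depth: int) -> float:
        # With a non-isometric embedding, the Hamming distance only
        # approximates the true shortest-path length, so the result
        # correlates with, but does not equal, the graph-based value.
        approx_len = max(hamming_distance(code_a, code_b), 1)  # avoid log(0)
        return lch_similarity(approx_len, taxonomy_depth)

Because the Hamming distance reduces to a handful of bitwise operations, it can be evaluated orders of magnitude faster than a shortest-path search over the taxonomy, which is the source of the speedup reported above.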