From Document Representations to Programs: Can the Distributional Hypothesis Really Be Applied to Programming Languages?
Abstract
Many deep learning models have been applied to programming languages, all of them relying
on natural language models and their underlying distributional hypothesis, but never questioning
the relevance of the latter. In this paper we therefore explore whether this hypothesis still
holds for programming languages. We apply several methods to variants of doc2vec, a
well-known natural language processing model that is easy to understand and to adapt.
Among other contributions, we propose a set of short programs that make it possible to observe
both syntactic and semantic analogies.