RNTI

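De représentations de documents à programmes : l'hypothèse distributionnelle peut-elle vraiment être utilisée sur les langages de programmation? [From document representations to programs: can the distributional hypothesis really be used on programming languages?]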
In EGC 2025, vol. RNTI-E-41, pp. 427-434
Abstract
Many deep learning models have been applied to programming languages. All of them rely on natural language models and their underlying distributional hypothesis, yet none questions the relevance of the latter. In this paper we therefore explore whether this hypothesis still holds for programming languages. We use several methods, which we apply to variants of doc2vec, a well-known natural language processing model that is easy to understand and to adapt. Among other contributions, we propose a set of short programs that allow the observation of both syntactic and semantic analogies.