BioSTransformers: Language models for zero-shot learning on biomedical texts
In EGC 2023, vol. RNTI-E-39, pp.409-416
Abstract
Training language transformers on biomedical data has shown promising results. However, these language models require fine-tuning on task-specific supervised data, which is rarely available in the biomedical domain. We propose to use siamese neural models (sentence transformers) that embed the texts to be compared in a shared vector space, and apply them to two main tasks: biomedical article classification and question answering. Our models optimize a self-supervised contrastive learning objective on articles from the MEDLINE bibliographic database paired with their MeSH (Medical Subject Headings) keywords. Results on several benchmarks show that the proposed models can solve these tasks without labeled examples (zero-shot) and are comparable to biomedical transformers fine-tuned on supervised data specific to each task.
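The zero-shot setup described above can be illustrated with a minimal sketch: embed the document and each candidate label in the same vector space, then rank labels by cosine similarity. The `embed` function below is a hypothetical bag-of-words stand-in for the actual BioSTransformers encoder (which produces dense neural embeddings); only the pipeline shape is meant to reflect the paper.

```python
from collections import Counter
import math

def embed(text):
    # Toy stand-in for a sentence-transformer encoder: a normalized
    # bag-of-words vector (sparse dict). A real model would return a
    # dense vector from a trained siamese network.
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()}

def cosine(a, b):
    # Cosine similarity between two normalized sparse vectors.
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def zero_shot_classify(doc, labels):
    # Embed the document once, embed each candidate (MeSH-style) label,
    # and return the label closest in the embedding space.
    d = embed(doc)
    scores = {lab: cosine(d, embed(lab)) for lab in labels}
    return max(scores, key=scores.get)

labels = ["Neoplasms", "Cardiovascular Diseases", "Nervous System Diseases"]
best = zero_shot_classify("Study of tumor growth in neoplasms patients", labels)
print(best)
```

No labeled training examples for the classification task are used at inference time; the label set itself acts as the "classifier", which is what makes the approach zero-shot.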