Calibrating Learning Models to Improve Automatic Detectors of Mislabeled Examples
Abstract
Mislabeled data is a pervasive issue that undermines the performance of machine-learning
models across various industries. Methods for detecting mislabeled instances usually involve
training a base machine-learning model and then probing it on every instance to obtain
a trust score indicating whether the provided label is genuine or incorrect. In this paper, we experiment with
the calibration of this base model. Our empirical findings show that employing calibration
methods improves the accuracy and robustness of mislabeled instance detection, providing a
practical and effective solution for industry applications.
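The detection pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn, uses Platt (sigmoid) scaling as the calibration method, and defines the trust score as the out-of-fold calibrated probability assigned to each instance's provided label.

```python
# Hypothetical sketch of calibrated mislabel detection (not the paper's code):
# calibrate a base classifier, then score each instance by the calibrated
# probability of its provided label; low-trust instances are flagged.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import cross_val_predict

# Synthetic binary data with injected label noise
X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.RandomState(0)
flipped = rng.choice(len(y), size=25, replace=False)
y_noisy = y.copy()
y_noisy[flipped] = 1 - y_noisy[flipped]

# Base model wrapped with sigmoid (Platt) calibration
base = CalibratedClassifierCV(LogisticRegression(), method="sigmoid", cv=3)

# Out-of-fold probabilities, so each instance is scored by a model
# that never trained on its (possibly wrong) label
proba = cross_val_predict(base, X, y_noisy, cv=5, method="predict_proba")

# Trust score: calibrated probability of the label the dataset provides
trust = proba[np.arange(len(y_noisy)), y_noisy]

# Flag the lowest-trust instances as suspected mislabels
suspects = np.argsort(trust)[:25]
```

Other calibration methods (e.g. isotonic regression via `method="isotonic"`) and other trust scores can be substituted in the same pipeline.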