Modeling Data Lake Metadata as a Data Vault
Abstract
With the rise of big data, business intelligence has devised solutions for managing large data
volumes and variety. Data lakes are an answer from a storage point of view, but they require
adequate metadata management to guarantee efficient data access. Starting from a multidimensional
metadata model designed for a data lake, which lacks schema evolvability, we propose to use a
data vault to address this issue. To illustrate the feasibility of this approach, we instantiate our
conceptual metadata model into relational and document-oriented logical and physical models.
We also compare the physical models in terms of metadata storage and query response time.