A Preventive Approach for Elastic Management of Parallel and Distributed Data Stream Processing
Abstract
In stream processing, it is important to guarantee performance, result quality, and
scalability to end users. Adjusting resource usage to processing requirements, so that
only the necessary resources are consumed, is a major challenge for both Big Data and
Green IT. The approach proposed in this article dynamically and automatically adapts
the parallelism degree of the operators belonging to the same continuous query, taking
into account the evolution of input stream rates. We propose i) a metric that estimates
the activity level of operators in the near future; ii) the AUTOSCALE approach, which
evaluates the gain brought by a set of parallelism degree modifications at both local
and global scope; and iii) an integration into Apache Storm, with performance tests
comparing our approach to the native solution of this stream processing engine.