Reframing for Non-Linear Dataset Shift
Abstract
Discriminative classification models assume that the training and
deployment data share the same distribution of data attributes. When such
models are deployed under conditions where the data distribution differs,
their performance can vary significantly. This phenomenon is called Dataset
Shift. In this paper we propose a method that first determines whether
there is a significant shift in the distributions of attributes between the training
and deployment datasets. If a shift exists, the proposed method
then uses a hill-climbing approach to map the shift, whether linear or
non-linear, onto a quadratic transformation. Experimental
results on three real-life datasets show strong performance gains for
the proposed method over previously established methods such as retraining and
linear reframing.
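
The two stages described above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper, and it rests on several assumptions that the abstract does not state: the shift test is taken to be a two-sample Kolmogorov-Smirnov statistic on a single attribute, the quadratic transformation is written as x' = a*x^2 + b*x + c and applied to the deployment attribute before it is passed to the unchanged model, and the hill climber maximises accuracy on a small labelled deployment sample. All function names (ks_statistic, quadratic, accuracy, hill_climb, demo) and the toy data are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x, then compare the CDF values at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def quadratic(x, params):
    """Assumed form of the quadratic transformation: a*x^2 + b*x + c."""
    a, b, c = params
    return a * x * x + b * x + c


def accuracy(params, xs, ys, model):
    """Accuracy of the fixed model on deployment inputs after transformation."""
    hits = sum(model(quadratic(x, params)) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def hill_climb(score, start=(0.0, 1.0, 0.0), step=0.5, min_step=1e-3):
    """Coordinate-wise hill climbing; starts from the identity transformation."""
    best, best_score = list(start), score(start)
    while step > min_step:
        improved = False
        for k in range(3):
            for delta in (step, -step):
                cand = list(best)
                cand[k] += delta
                s = score(cand)
                if s > best_score:  # accept strict improvements only
                    best, best_score, improved = cand, s, True
        if not improved:
            step /= 2  # refine the search once no neighbour is better
    return tuple(best), best_score


def demo():
    # Toy data: one attribute, label is 1 when the attribute exceeds 0.5.
    train_x = [(i + 0.5) / 200 for i in range(200)]
    labels = [x > 0.5 for x in train_x]
    model = lambda x: x > 0.5  # stands in for an already trained classifier
    # Non-linear shift (assumed for illustration): deployment sees sqrt(x).
    deploy_x = [math.sqrt(x) for x in train_x]
    shift = ks_statistic(train_x, deploy_x)
    base = accuracy((0.0, 1.0, 0.0), deploy_x, labels, model)
    params, tuned = hill_climb(lambda p: accuracy(p, deploy_x, labels, model))
    return shift, base, tuned, params
```

Calling demo() returns the shift statistic, the accuracy of the unmodified model on the shifted data, the accuracy after hill climbing, and the fitted parameters. Because accuracy is piecewise constant in (a, b, c), a simple climber of this kind can stall on plateaus; random restarts or a smoother objective would be natural refinements.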