Model drift

Overview

Models are trained on a specific dataset. Over time, a model may be asked to run inference on data that differs from that dataset, and the data the model was trained on may itself change. This is data drift: the data the model processes differs significantly, on average, from the training dataset. As a result, the model’s predictions become less accurate and its evaluation metrics score lower.

When this happens, replace the model with a new version trained on a new dataset. Models can also be retrained on missing or newer samples to improve their predictive power. Model Ops monitors model performance metrics and alerts you when retraining should be considered.

By monitoring the distribution of the data feeding a model and detecting when it changes, you can find out sooner when predictions deviate from normal. You can also identify why a model is making wrong predictions: whether it is receiving the wrong data or is simply no longer up to date.
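To make the idea of watching a distribution concrete, here is a minimal sketch that compares the share of each predicted label in a baseline window with its share in a recent window. It is not how Modzy computes drift; the function names `label_distribution` and `largest_shift` and the sample labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def label_distribution(labels):
    """Return each label's share of the total as a dict of proportions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def largest_shift(baseline, current):
    """Return the label whose share changed the most, with the signed change."""
    labels = set(baseline) | set(current)
    changes = {
        label: current.get(label, 0.0) - baseline.get(label, 0.0)
        for label in labels
    }
    label = max(changes, key=lambda name: abs(changes[name]))
    return label, changes[label]


# Hypothetical model outputs from two time windows.
baseline_labels = ["cat"] * 5 + ["dog"] * 3 + ["bird"] * 2
current_labels = ["cat"] * 2 + ["dog"] * 2 + ["bird"] * 6

baseline = label_distribution(baseline_labels)
current = label_distribution(current_labels)
label, change = largest_shift(baseline, current)
print(f"Largest shift: {label} ({change:+.2f})")
```

A large change in one label's share is a prompt to check whether the input data has changed or the model is out of date.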

Monitor models

To see model drift data, send a request to get a drift summary. The response includes the baseline and current distribution values with their chiSquaredValue and pValue scores.

Retrain

Update your model’s baseline with a revised dataset to minimize model drift.

Drift parameters

The following list explains the drift parameters returned by Modzy.


baseline-distribution	The baseline training data used to evaluate model drift.
baseline-period	The date range that holds the training data used to evaluate model drift.
distribution	The result data values and the count of each.
chiSquaredValue	Quantifies the difference between two statistical distributions. Modzy uses Pearson’s version to identify variations in model outputs over time as a drift measure. This value is unbounded.
pValue	A value between 0 and 1 derived from the `chiSquaredValue`. It gets larger as the compared distributions become more dissimilar. The threshold values are compared to this value. This value differs from the generally accepted statistical definition of the pValue, which gets smaller as the Chi Square value increases, because fewer distributions would produce a higher Chi Square value when compared to the baseline distribution. The version Modzy uses is equal to 1 – pValue if you consider the pValue to be defined in the standard statistical way.
thresholds	Threshold values range from 0 to 1 and correspond to pValues for a Chi Squared test. The pValue they are compared against gets larger as the compared distributions become more dissimilar.
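The relationship between `chiSquaredValue` and the pValue described above can be sketched in a few lines of standard-library Python. This is an illustration, not Modzy's implementation. The function names, the sample counts, and the 0.95 threshold are hypothetical. The sketch also assumes that the degrees of freedom equal the number of categories minus one, that every category has a nonzero baseline count, and that a pValue above a threshold signals drift.

```python
import math


def pearson_chi_squared(baseline_counts, current_counts):
    """Pearson's statistic: sum of (observed - expected)^2 / expected.

    Expected counts scale the baseline proportions to the current sample
    size. Assumes every baseline count is greater than zero.
    """
    categories = sorted(baseline_counts)
    baseline_total = sum(baseline_counts.values())
    current_total = sum(current_counts.get(c, 0) for c in categories)
    statistic = 0.0
    for category in categories:
        expected = baseline_counts[category] / baseline_total * current_total
        observed = current_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic, len(categories) - 1


def chi2_cdf(x, df):
    """Chi-squared CDF via the series for the regularized lower incomplete
    gamma function. Adequate for moderate statistic values."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    scale = math.exp(a * math.log(half_x) - half_x - math.lgamma(a))
    return min(1.0, total * scale)


def drift_p_value(baseline_counts, current_counts):
    """1 - (standard p-value): grows as the distributions diverge."""
    statistic, df = pearson_chi_squared(baseline_counts, current_counts)
    return chi2_cdf(statistic, df)


# Hypothetical counts of model outputs per category.
baseline = {"cat": 500, "dog": 300, "bird": 200}
current = {"cat": 20, "dog": 30, "bird": 50}
threshold = 0.95  # hypothetical threshold, not a Modzy default

score = drift_p_value(baseline, current)
print("drift suspected" if score > threshold else "no drift detected")
```

Because the standard p-value is the upper tail probability of the chi-squared distribution, returning the CDF directly gives the 1 – pValue form, which stays near 0 for matching distributions and approaches 1 as they diverge.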