Bias–Variance Tradeoff

Alex Egg

“Bias and variance are the two components of imprecision in predictive models, and in general there is a trade-off between them, so normally reducing one tends to increase the other. Bias in predictive models is a measure of model rigidity and inflexibility, and means that your model is not capturing all the signal it could from the data. Bias is also known as under-fitting. Variance on the other hand is a measure of model inconsistency, high variance models tend to perform very well on some data points and really bad on others. This is also known as over-fitting and means that your model is too flexible for the amount of training data you have and ends up picking up noise in addition to the signal, learning random patterns that happen by chance and do not generalize beyond your training data. [1]

If your model is performing really well on the training set, but much poorer on the hold-out set, then it’s suffering from high variance. On the other hand if your model is performing poorly on both training and test data sets, it is suffering from high bias.”
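The train-versus-hold-out check described above can be sketched in a few lines. This is only an illustration under assumed conditions: the sine-wave ground truth, the noise level, the sample sizes, and the polynomial degrees are all hypothetical choices, not something taken from the quoted text.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical ground truth: a sine wave plus Gaussian noise.
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def mse(model, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return float(np.mean((model(x) - y) ** 2))

x_train, y_train = make_data(20)    # small training set
x_test, y_test = make_data(1000)    # larger hold-out set

results = {}
for degree in (1, 4, 10):
    # Least-squares polynomial fit; a higher degree means a more flexible model.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={results[degree][0]:.3f}  "
          f"hold-out MSE={results[degree][1]:.3f}")
```

Reading the output with the rule of thumb from the quote: a model whose error is high on both sets is likely suffering from high bias, while a large gap between a low training error and a much higher hold-out error points to high variance.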

[Image: bias/variance visual analogy]

Figure 1: I like this visual analogy of the bias/variance tradeoff. A completely random model is at the top-right. A perfect model is at the bottom-left. Most of the time our models fall somewhere between the top-left and bottom-right.

[1] P. Domingos. A Few Useful Things to Know about Machine Learning. Communications of the ACM, 55(10):78–87, 2012.



Last edited by Alex Egg, 2017-06-20 16:17:42