
Diagnosing Bias and Variance

Core ML Concept
  • Machine learning models rarely work perfectly on the first attempt
  • Key to improvement: deciding what to try next
  • Bias-variance analysis provides clear guidance on how to improve model performance
Polynomial Regression Example

Using the housing price prediction example with polynomial regression:

High Bias (Underfitting)

  • Linear model (d=1)
  • Too simple to capture data patterns
  • Error is high on both training and new data

High Variance (Overfitting)

  • 4th-order polynomial (d=4)
  • Captures noise in training data
  • Fits training data well but performs poorly on new data

Just Right

  • Quadratic model (d=2)
  • Captures underlying pattern without fitting noise
  • Performs well on both training and new data
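The same comparison can be reproduced numerically. Below is a minimal sketch (not from the original example) that fits degree-1, degree-2, and degree-4 polynomials to synthetic housing-style data and reports the training and cross-validation errors; the data-generating function, noise level, 60/40 split, and use of scikit-learn are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic "housing" data: size in thousands of square feet vs. price.
# (The data-generating function and noise level are assumptions.)
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 3.5, size=60).reshape(-1, 1)
y = 100 + 60 * x.ravel() + 15 * x.ravel() ** 2 + rng.normal(0, 25, size=60)

# Hold out part of the data as the cross-validation set.
x_train, x_cv, y_train, y_cv = train_test_split(x, y, test_size=0.4, random_state=0)

for d in (1, 2, 4):  # the three model complexities discussed above
    poly = PolynomialFeatures(degree=d, include_bias=False)
    model = LinearRegression().fit(poly.fit_transform(x_train), y_train)
    j_train = mean_squared_error(y_train, model.predict(poly.transform(x_train)))
    j_cv = mean_squared_error(y_cv, model.predict(poly.transform(x_cv)))
    print(f"d={d}: J_train={j_train:7.1f}  J_cv={j_cv:7.1f}")
```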
Key Indicators

Instead of visual inspection, use training and cross-validation errors:

High Bias (Underfitting)

  • J_train is high (poor performance on training data)
  • J_cv is also high
  • J_train and J_cv are usually close to each other

High Variance (Overfitting)

  • J_train is low (good performance on training data)
  • J_cv is much higher than J_train
  • Large gap between training and CV performance

Just Right

  • J_train is relatively low
  • J_cv is also relatively low
  • Small gap between J_train and J_cv
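As a concrete illustration, here is a hedged sketch of how J_train and J_cv might be computed as average squared errors (using the usual 1/(2m) squared-error convention); the example numbers are made up.

```python
import numpy as np

def squared_error_cost(y_true, y_pred):
    """J = (1 / (2m)) * sum((y_hat - y)^2): average squared error over m examples."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2) / 2.0)

# Made-up predictions: close to the targets on the training set, noticeably off
# on the CV set. Low J_train with a much higher J_cv points to high variance.
j_train = squared_error_cost([200.0, 310.0, 450.0], [202.0, 308.0, 452.0])
j_cv = squared_error_cost([250.0, 330.0, 500.0], [190.0, 410.0, 430.0])
print(f"J_train = {j_train:.1f}, J_cv = {j_cv:.1f}")
```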

Bias-Variance as a Function of Model Complexity


As model complexity increases:

  1. Training Error (J_train) typically decreases:
  • Simple models (low d) have high training error
  • Complex models (high d) have low training error
  2. Cross-Validation Error (J_cv) follows a U-shaped curve:
  • Too simple (low d): high J_cv due to underfitting
  • Too complex (high d): high J_cv due to overfitting
  • “Just right” (middle): lowest J_cv
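This complexity sweep can be sketched in code as follows; the synthetic data, degree range, and 60/40 split are assumptions for illustration, not part of the original example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
x = rng.uniform(0.5, 3.5, size=80).reshape(-1, 1)
y = 100 + 60 * x.ravel() + 15 * x.ravel() ** 2 + rng.normal(0, 25, size=80)
x_tr, x_cv, y_tr, y_cv = train_test_split(x, y, test_size=0.4, random_state=1)

def cost(model, poly, x, y):
    """Average squared error (J) of the fitted model on one data set."""
    residuals = model.predict(poly.transform(x)) - y
    return np.mean(residuals ** 2) / 2

results = []
for d in range(1, 11):
    poly = PolynomialFeatures(degree=d, include_bias=False)
    model = LinearRegression().fit(poly.fit_transform(x_tr), y_tr)
    results.append((d, cost(model, poly, x_tr, y_tr), cost(model, poly, x_cv, y_cv)))

# J_train tends to shrink as d grows; J_cv typically falls and then rises again.
for d, j_tr, j_cv in results:
    print(f"d={d:2d}  J_train={j_tr:8.1f}  J_cv={j_cv:8.1f}")
best_d = min(results, key=lambda r: r[2])[0]
print("degree with lowest J_cv:", best_d)
```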
Quick Reference
  1. High Bias (Underfitting)
  • Indicator: J_train is high
  • Additional sign: J_cv is similarly high
  • Region: Left side of the model complexity curve
  2. High Variance (Overfitting)
  • Indicator: J_cv >> J_train (much greater than)
  • Additional sign: J_train is typically low
  • Region: Right side of the model complexity curve
  3. Both High Bias and High Variance
  • Uncommon but possible (especially in neural networks)
  • Indicator: J_train is high AND J_cv >> J_train
  • Example: Model that overfits some regions of input space while underfitting others

Looking ahead:

  • Next topic: How regularization affects bias and variance
  • Understanding this relationship helps determine when to use regularization
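The three numbered cases above can be collapsed into a small diagnostic helper. This is only a sketch: the `baseline` parameter (what counts as "high" J_train) and the `gap_factor` (how much larger J_cv must be to count as a large gap) are assumptions introduced here, to be judged against an acceptable level of error for your problem.

```python
def diagnose(j_train, j_cv, baseline, gap_factor=2.0):
    """Classify a model using the quick-reference rules above.

    `baseline` stands in for the error level you would consider acceptable;
    `gap_factor` decides how much larger J_cv must be than J_train to count
    as a "large gap". Both thresholds are assumptions for illustration.
    """
    high_bias = j_train > baseline
    high_variance = j_cv > gap_factor * j_train
    if high_bias and high_variance:
        return "high bias AND high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "looks just right"

print(diagnose(j_train=1.2, j_cv=1.3, baseline=1.0))   # high bias
print(diagnose(j_train=0.2, j_cv=1.1, baseline=1.0))   # high variance
print(diagnose(j_train=1.5, j_cv=4.0, baseline=1.0))   # both
print(diagnose(j_train=0.3, j_cv=0.4, baseline=1.0))   # just right
```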

Bias-variance analysis provides a systematic framework for diagnosing model performance issues. By examining training and cross-validation errors, you can determine whether your model is underfitting or overfitting, and make informed decisions about how to improve it. This approach works even when visualization of the model isn’t possible, making it invaluable for complex, real-world machine learning applications.