Advanced Evaluation
In the previous video, we used training/test splits to evaluate models
Problem: If we use the test set to select our model:
Test error becomes an overly optimistic estimate of generalization error
We’ve essentially “leaked” information from the test set into our model selection process
Better Approach
Instead of two-way splits, use a three-way split of your data:
Training Set (typically ~60%): Used to fit model parameters (w,b)
Cross-Validation Set (typically ~20%): Used for model selection
Test Set (typically ~20%): Used for final evaluation of selected model
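As a concrete illustration, here is a minimal sketch of producing such a 60/20/20 split with scikit-learn's train_test_split; the array names X and y and the random placeholder data are assumptions for the example, not part of the course material:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))   # features
y = rng.normal(size=1000)        # targets

# First carve off 60% for training, then split the remaining 40%
# in half: 20% cross-validation, 20% test.
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.40, random_state=1)
X_cv, X_test, y_cv, y_test = train_test_split(X_temp, y_temp, test_size=0.50, random_state=1)

print(X_train.shape, X_cv.shape, X_test.shape)   # (600, 2) (200, 2) (200, 2)
```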
For each subset, we compute an error measure:
Training Error
J_train(w,b) = (1/(2m_train)) ∑_{i=1}^{m_train} (f_{w,b}(x^(i)) - y^(i))²
Measures how well model fits training data
Cross-Validation Error
J_cv(w,b) = (1/(2m_cv)) ∑_{i=1}^{m_cv} (f_{w,b}(x_cv^(i)) - y_cv^(i))²
Used for model selection
Also called validation error or dev error
Test Error
J_test(w,b) = (1/(2m_test)) ∑_{i=1}^{m_test} (f_{w,b}(x_test^(i)) - y_test^(i))²
Only used for final evaluation
Provides an unbiased estimate of generalization error
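All three quantities are the same squared-error cost evaluated on different subsets. A small helper makes that explicit; this is a sketch that assumes `model` is any fitted regressor with a `predict` method and that the split arrays come from the sketch above:

```python
import numpy as np

def squared_error_cost(model, X, y):
    """(1/(2m)) times the sum of squared prediction errors on (X, y)."""
    m = X.shape[0]
    return np.sum((model.predict(X) - y) ** 2) / (2 * m)

# After fitting `model` on the training set only:
# J_train = squared_error_cost(model, X_train, y_train)
# J_cv    = squared_error_cost(model, X_cv,    y_cv)
# J_test  = squared_error_cost(model, X_test,  y_test)
```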
Step-by-Step Approach
Train multiple models with different polynomial degrees (d=1,2,…,10)
For each d, fit parameters w^d, b^d using only the training set
Evaluate each model on the cross-validation set
Compute J_cv(w^1,b^1), J_cv(w^2,b^2), …, J_cv(w^10,b^10)
Select the model with the lowest cross-validation error
If d=4 gives the lowest J_cv, choose this model
Estimate generalization error using the test set
Report J_test(w^4,b^4) as your final performance estimate
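Below is a sketch of this degree-selection loop using scikit-learn pipelines (PolynomialFeatures plus LinearRegression); it assumes the split arrays and the squared_error_cost helper from the earlier sketches:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

models, cv_errors = [], []
for d in range(1, 11):                          # candidate degrees d = 1..10
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)                 # fit w, b on the training set only
    models.append(model)
    cv_errors.append(squared_error_cost(model, X_cv, y_cv))   # selection uses the CV set

best_d = int(np.argmin(cv_errors)) + 1          # degree with the lowest J_cv
best_model = models[best_d - 1]

# The test set is touched only once, for the final report.
J_test = squared_error_cost(best_model, X_test, y_test)
print(f"selected d={best_d}, estimated generalization error J_test={J_test:.4f}")
```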
Broader Applications
The same three-way split approach works for selecting between different neural network architectures:
Train multiple neural network architectures (different sizes/depths)
Each trained on the training set only
Evaluate each network on the cross-validation set
For classification, J_cv is typically the fraction of misclassified examples
Select the architecture with the lowest cross-validation error
Report final performance using the test set
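Here is a sketch of the same selection procedure for network architectures, with scikit-learn's MLPClassifier standing in for whatever framework you use; the candidate layer sizes are illustrative, and classification-style targets plus a train/CV/test split are assumed:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Candidate architectures: hidden-layer sizes, chosen only for illustration.
architectures = [(25,), (25, 15), (50, 25, 10)]

trained, cv_errors = [], []
for hidden in architectures:
    clf = MLPClassifier(hidden_layer_sizes=hidden, max_iter=1000, random_state=1)
    clf.fit(X_train, y_train)                              # training set only
    cv_errors.append(np.mean(clf.predict(X_cv) != y_cv))   # fraction misclassified on the CV set
    trained.append(clf)

best = int(np.argmin(cv_errors))                           # architecture with the lowest J_cv
test_error = np.mean(trained[best].predict(X_test) != y_test)   # final, one-time report
print(f"chosen architecture {architectures[best]}, test error {test_error:.3f}")
```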
Use Training Set For
Fitting model parameters (w, b)
Training neural networks
Use Cross-Validation Set For
Selecting model type or architecture
Choosing hyperparameters
Making any other model decisions
Use Test Set For
ONLY final evaluation
Never for making decisions
Getting an unbiased estimate of generalization error
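To make this division of roles concrete, here is a sketch of choosing a single hyperparameter with the cross-validation set while the test set is consulted exactly once at the end; the regularization strength is only an illustrative choice of hyperparameter, and the names follow the earlier sketches:

```python
import numpy as np
from sklearn.linear_model import Ridge

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]                     # candidate regularization strengths
cv_errors = []
for lam in lambdas:
    model = Ridge(alpha=lam).fit(X_train, y_train)           # training set: fit parameters
    cv_errors.append(squared_error_cost(model, X_cv, y_cv))  # CV set: make the decision

best_lambda = lambdas[int(np.argmin(cv_errors))]             # decision made without the test set
final_model = Ridge(alpha=best_lambda).fit(X_train, y_train)
J_test = squared_error_cost(final_model, X_test, y_test)     # test set: final evaluation only
```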
This three-way split approach is widely used in practice for model selection
Next, we’ll explore powerful diagnostics to improve model performance
The most important diagnostic: bias and variance analysis
The three-way split into training, cross-validation, and test sets provides a robust framework for both selecting the best model and fairly estimating its performance on new data. By reserving the cross-validation set for model selection decisions and keeping the test set completely untouched until the final evaluation, we avoid the optimistic bias that comes from testing on data that influenced our model choices.