Model Selection and Training/Cross Validation/Test Sets

Advanced Evaluation
  • In the previous video, we used a training/test split to evaluate models
  • Problem: if we also use the test set to select our model:
    • Test error becomes an overly optimistic estimate of generalization error
    • We’ve essentially “leaked” information from the test set into the model selection process
Better Approach

Instead of a two-way split, use a three-way split of your data (a short code sketch follows this list):

  1. Training Set (typically ~60%): Used to fit model parameters (w,b)
  2. Cross-Validation Set (typically ~20%): Used for model selection
  3. Test Set (typically ~20%): Used for final evaluation of selected model
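
As a quick illustration, here is a minimal sketch of a 60/20/20 split using scikit-learn’s train_test_split; the arrays X and y below are placeholders standing in for your own features and targets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 5)   # placeholder feature matrix
y = np.random.rand(1000)      # placeholder targets

# First carve off the 60% training set ...
X_train, X_temp, y_train, y_temp = train_test_split(X, y, train_size=0.60, random_state=0)
# ... then split the remaining 40% in half: 20% cross-validation, 20% test.
X_cv, X_test, y_cv, y_test = train_test_split(X_temp, y_temp, test_size=0.50, random_state=0)

print(len(X_train), len(X_cv), len(X_test))  # 600 200 200
```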

For each subset, we compute an error measure (a code sketch follows these definitions):

Training Error

  • J_train(w,b) = (1/(2m_train)) ∑ (f(x^(i)) - y^(i))², summed over the m_train training examples
  • Measures how well the model fits the training data

Cross-Validation Error

  • J_cv(w,b) = (1/(2m_cv)) ∑ (f(x_cv^(i)) - y_cv^(i))², summed over the m_cv cross-validation examples
  • Used for model selection
  • Also called validation error or dev error

Test Error

  • J_test(w,b) = (1/(2m_test)) ∑ (f(x_test^(i)) - y_test^(i))², summed over the m_test test examples
  • Only used for final evaluation
  • Provides unbiased estimate of generalization
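
The three error measures share the same form and differ only in which split they are computed on. Here is a minimal sketch, assuming a fitted model object with a predict method and the splits from the previous snippet.

```python
import numpy as np

def mean_squared_error_half(model, X, y):
    """J(w,b) = (1/(2m)) * sum((f(x^(i)) - y^(i))^2) on a given data split."""
    predictions = model.predict(X)
    m = len(y)
    return np.sum((predictions - y) ** 2) / (2 * m)

# J_train = mean_squared_error_half(model, X_train, y_train)
# J_cv    = mean_squared_error_half(model, X_cv, y_cv)
# J_test  = mean_squared_error_half(model, X_test, y_test)
```
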
Step-by-Step Approach
  1. Train multiple models with different polynomial degrees (d=1,2,…,10)
  • For each d, fit parameters w^(d), b^(d) using only the training set
  2. Evaluate each model on the cross-validation set
  • Compute J_cv(w^(1),b^(1)), J_cv(w^(2),b^(2)), …, J_cv(w^(10),b^(10))
  3. Select the model with the lowest cross-validation error
  • If d=4 gives the lowest J_cv, choose this model
  4. Estimate generalization error using the test set
  • Report J_test(w^(4),b^(4)) as your final performance estimate (see the sketch below)
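
Putting the steps together, here is a minimal sketch of the degree-selection loop. PolynomialFeatures and LinearRegression from scikit-learn stand in for “fit parameters w, b at degree d”, and the splits and mean_squared_error_half helper from the earlier snippets are assumed.

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

cv_errors = {}
models = {}
for d in range(1, 11):                       # d = 1, 2, ..., 10
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)              # fit w, b on the training set only
    models[d] = model
    cv_errors[d] = mean_squared_error_half(model, X_cv, y_cv)

best_d = min(cv_errors, key=cv_errors.get)   # lowest cross-validation error
J_test = mean_squared_error_half(models[best_d], X_test, y_test)
print(f"chose d={best_d}, estimated generalization error J_test={J_test:.4f}")
```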

Beyond Polynomials: Neural Network Architecture Selection

Broader Applications

The same three-way split approach works for selecting between different neural network architectures:

  1. Train multiple neural network architectures (different sizes/depths)
  • Each is trained on the training set only
  2. Evaluate each network on the cross-validation set
  • For classification, J_cv is typically the fraction of misclassified examples
  3. Select the architecture with the lowest cross-validation error
  4. Report final performance using the test set (see the sketch below)
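
The same selection loop works for architectures. Here is a minimal sketch using scikit-learn’s MLPClassifier as a stand-in for the neural networks; the hidden-layer sizes are hypothetical choices, and classification data split into (X_train, y_train), (X_cv, y_cv), (X_test, y_test) is assumed.

```python
from sklearn.neural_network import MLPClassifier

architectures = [(25,), (25, 15), (50, 25, 10)]  # hypothetical hidden-layer configurations
cv_error = {}
nets = {}
for hidden in architectures:
    net = MLPClassifier(hidden_layer_sizes=hidden, max_iter=1000, random_state=0)
    net.fit(X_train, y_train)                    # train on the training set only
    # J_cv = fraction of misclassified cross-validation examples
    cv_error[hidden] = 1.0 - net.score(X_cv, y_cv)
    nets[hidden] = net

best = min(cv_error, key=cv_error.get)           # lowest cross-validation error
test_error = 1.0 - nets[best].score(X_test, y_test)  # reported once, at the very end
print(f"chose architecture {best}, test error {test_error:.3f}")
```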

Use Training Set For

  • Fitting model parameters (w, b)
  • Training neural networks

Use Cross-Validation Set For

  • Selecting model type or architecture
  • Choosing hyperparameters
  • Making any other model decisions

Use Test Set For

  • ONLY final evaluation
  • Never for making decisions
  • Getting unbiased estimate of generalization

Key Points

  • This three-way split approach is widely used in practice for model selection
  • Next, we’ll explore powerful diagnostics to improve model performance
  • The most important diagnostic: bias and variance analysis

The three-way split into training, cross-validation, and test sets provides a robust framework for both selecting the best model and fairly estimating its performance on new data. By reserving the cross-validation set for model selection decisions and keeping the test set completely untouched until the final evaluation, we avoid the optimistic bias that comes from testing on data that influenced our model choices.