
ReLU Lab

Definition
  • ReLU (Rectified Linear Unit) is defined as:
  • a = max(0,z)
  • Provides a continuous linear relationship for positive inputs, together with an ‘off’ range where the output is zero
  • This ‘off’ range is what makes ReLU a non-linear activation
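For reference, a minimal NumPy sketch of this definition (the `relu` helper name is just illustrative):

```python
import numpy as np

def relu(z):
    """ReLU activation: a = max(0, z), applied element-wise."""
    return np.maximum(0, z)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```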

Piecewise Linear Functions

  • Functions composed of linear pieces
  • The slope is constant within each linear piece
  • The slope changes abruptly at transition points
  • At transition points, a new linear function is added
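For concreteness, here is a sketch of such a function with three segments (the slopes and transition points are illustrative, not the lab's exact target):

```python
import numpy as np

def piecewise_target(x):
    """A hypothetical 3-segment piecewise linear function: the slope changes
    abruptly at x = 1 and x = 2, but the function itself stays continuous."""
    return np.where(x < 1, 1.0 * x,                    # segment 1: slope 1
           np.where(x < 2, 1.0 + 2.0 * (x - 1),        # segment 2: slope 2
                           3.0 + 0.5 * (x - 2)))       # segment 3: slope 0.5

print(piecewise_target(np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])))
# [0.   0.5  1.   2.   3.   3.25 3.5 ]
```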

Role of Non-Linear Activation

  • Responsible for disabling a unit's contribution before or after its transition point
  • Allows network to model complex functions by “stitching together” linear segments
  • Enables selective activation of different parts of the network
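To make this selective activation concrete, here is a single ReLU unit with a hypothetical weight and bias (not values taken from the lab): it contributes nothing until the input passes its transition point, then adds a linear ramp.

```python
import numpy as np

w, b = 1.0, -1.0                  # hypothetical unit: switches on at x = 1
x = np.linspace(0.0, 3.0, 7)
print(np.maximum(0, w * x + b))   # [0.  0.  0.  0.5 1.  1.5 2. ]
```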

Lab Exercise: Modeling Piecewise Linear Functions

  • First layer: 3 units (each responsible for one segment of the function)
  • Unit 0: Pre-programmed and fixed to map the first segment
  • Units 1 & 2: Need weight/bias adjustments to model 2nd and 3rd segments
  • Output unit: Fixed to sum the outputs of the first layer
  • Use sliders to modify weights and biases to match the target
  • Start with w₁ and b₁, leaving w₂ and b₂ at zero until the 2nd segment is matched
  • Clicking rather than sliding provides quicker adjustment
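For intuition about what the sliders control, here is a minimal NumPy sketch of the architecture described above. The weight and bias values are placeholders chosen so each unit switches on at x = 0, 1, and 2; they are not the lab's fixed Unit 0 parameters or the target solution.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def three_unit_network(x, w, b):
    """First layer: 3 ReLU units; output unit: fixed sum of their activations.
    x is a 1-D array of inputs; w and b are length-3 arrays of the first-layer
    weights and biases (the quantities the lab's sliders adjust)."""
    a1 = relu(np.outer(x, w) + b)   # shape (len(x), 3): one column per unit
    return a1.sum(axis=1)           # fixed output unit simply sums the columns

w = np.array([1.0, 1.0, 1.0])       # illustrative slopes added by each unit
b = np.array([0.0, -1.0, -2.0])     # units switch on at x = 0, 1, 2
x = np.linspace(0.0, 3.0, 7)
print(three_unit_network(x, w, b))  # slope increases by 1 at each transition
```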

Understanding How ReLU Enables Piecewise Functions

Key Insight
Unit 0
  • Responsible for the first segment, on the interval [0,1]
  • ReLU cuts off this unit's contribution after the interval [0,1]
  • Critical feature: prevents Unit 0 from interfering with the following segments

Unit 1
  • Responsible for the 2nd segment
  • ReLU keeps this unit inactive (output zero) until x > 1
  • Since Unit 0 no longer contributes there, w₁ alone sets the target line's slope
  • The bias b₁ must keep the pre-activation negative (so the output stays zero) until x reaches 1
  • Note: this unit's contribution extends into the 3rd segment

Unit 2
  • Responsible for the 3rd segment
  • ReLU zeros this unit's output until x reaches 2
  • w₂ must be set so that the slopes of Units 1 and 2 sum to the desired slope
  • The bias b₂ must keep the pre-activation negative until x reaches 2
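As a quick numerical check of the bias condition described above (assuming, for illustration, w₁ = 2 and b₁ = −w₁ · 1 so the pre-activation crosses zero exactly at x = 1):

```python
import numpy as np

w1 = 2.0                           # illustrative slope, not the lab's value
b1 = -w1 * 1.0                     # pre-activation w1*x + b1 is negative for x < 1
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
print(np.maximum(0, w1 * x + b1))  # [0. 0. 0. 1. 2.] -> unit is off until x = 1
```

Under the same assumption (a positive w₂), Unit 2's bias would be −2·w₂, keeping that unit off until x reaches 2.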

ReLU’s non-linear behavior provides neural networks with the critical ability to selectively activate different parts of the network depending on the input. This capability allows networks to model complex functions by combining simpler linear segments, creating piecewise linear approximations of a target function.