Binary Classification
- When y is either 0 or 1
- Use sigmoid activation in the output layer
- The neural network learns to predict the probability that y equals 1 (see the sigmoid sketch below)
- Similar to logistic regression
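A minimal sketch of that probability interpretation, assuming NumPy; the specific value of z is just an illustrative pre-activation, not from the original notes:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so the output reads as P(y = 1)
    return 1.0 / (1.0 + np.exp(-z))

z = 2.3              # illustrative pre-activation value from the output unit
a = sigmoid(z)       # a ≈ 0.91 -> predict y = 1 with ~91% estimated probability
print(a, a >= 0.5)
```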
Three output types to distinguish:
- Binary classification (y is 0 or 1)
- Regression where y can be positive or negative
- Non-negative regression (y ≥ 0)
For hidden layers:
- activation='relu' (recommended default)

For output layer:
- Binary classification: activation='sigmoid'
- Regression (positive or negative values): activation='linear'
- Non-negative outputs: activation='relu'
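A minimal sketch of how these choices typically map onto code, assuming TensorFlow's Keras API; the layer sizes and the Adam learning rate are illustrative assumptions, not part of the original notes:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Hidden layers use ReLU (the recommended default);
# the output activation depends on what y represents.
model = Sequential([
    Dense(units=25, activation='relu'),
    Dense(units=15, activation='relu'),
    # Pick ONE output layer depending on the task:
    Dense(units=1, activation='sigmoid'),    # binary classification: predict P(y = 1)
    # Dense(units=1, activation='linear'),   # regression, y may be positive or negative
    # Dense(units=1, activation='relu'),     # regression, y is never negative
])

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),      # matches the sigmoid output choice
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
)
```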
Research literature mentions other activation functions:
- tanh (hyperbolic tangent)
- LeakyReLU
- Swish
- New activation functions emerge periodically and sometimes perform slightly better in specific cases
- Example: “I’ve used the LeakyReLU activation function a few times in my work, and sometimes it works a little bit better than the ReLU” (a small LeakyReLU sketch follows below)
For most applications, sigmoid/ReLU/linear are sufficient.
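For reference, a minimal NumPy sketch of LeakyReLU next to plain ReLU; the slope value alpha=0.1 is an illustrative assumption, and frameworks expose it as a configurable parameter:

```python
import numpy as np

def relu(z):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.1):
    # Like ReLU, but keeps a small slope (alpha) for negative inputs,
    # so the gradient there is not exactly zero
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))        # [0.  0.  0.  1.5]
print(leaky_relu(z))  # [-0.2  -0.05  0.    1.5 ]
```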
Choosing the right activation function is essential for neural network performance. For the output layer, select based on the type of target you are predicting: binary (sigmoid), unbounded (linear), or non-negative (ReLU). For hidden layers, ReLU is the standard choice because it is cheap to compute and, unlike sigmoid, goes flat in only one direction, which helps gradient descent converge faster.