Building Complex Neural Networks
Neural networks are built from layers, each of which takes a vector of inputs and produces a vector of outputs.
Example of a complex neural network:
- Has 4 layers (not counting the input layer)
- Layers 1, 2, 3 are hidden layers
- Layer 4 is the output layer
- Layer 0 is the input layer (not counted by convention)
- “By convention, when we say that a neural network has four layers, that includes all the hidden layers and the output layer, but we don’t count the input layer”
Layer 3 Computation Example (Third Hidden Layer):
- Inputs: vector a^[2] (output from previous layer)
- Outputs: vector a^[3]
- For 3 neurons/hidden units:
- Parameters: w₁^[3], b₁^[3], w₂^[3], b₂^[3], w₃^[3], b₃^[3]
- Computes:
- a₁^[3] = sigmoid(w₁^[3]·a^[2] + b₁^[3])
- a₂^[3] = sigmoid(w₂^[3]·a^[2] + b₂^[3])
- a₃^[3] = sigmoid(w₃^[3]·a^[2] + b₃^[3])
- Output: vector [a₁^[3], a₂^[3], a₃^[3]]
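As a sketch, the layer-3 computation above can be written in NumPy; the weight and bias values here are made-up placeholders, not values from the lecture:

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid activation
    return 1.0 / (1.0 + np.exp(-z))

# Output of the previous layer (layer 2): a^[2] (placeholder values)
a2 = np.array([0.7, 0.3, 0.2])

# Parameters for the 3 neurons of layer 3, one row per neuron (placeholders)
w3 = np.array([[ 1.0, -2.0,  0.5],   # w_1^[3]
               [-0.5,  1.5,  1.0],   # w_2^[3]
               [ 2.0,  0.3, -1.0]])  # w_3^[3]
b3 = np.array([0.1, -0.2, 0.3])      # b_1^[3], b_2^[3], b_3^[3]

# a_j^[3] = sigmoid(w_j^[3] . a^[2] + b_j^[3]) for each neuron j
a3 = sigmoid(w3 @ a2 + b3)
print(a3.shape)  # one activation per neuron: (3,)
```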
Notation Structure:
- Superscript [l] denotes layer number
- Subscript j denotes neuron/unit number
- w^[l] and b^[l] are parameters for layer l
- a^[l] are activations from layer l
General Formula:
- a_j^[l] = g(w_j^[l]·a^[l-1] + b_j^[l])
- Where:
- g is the “activation function” (sigmoid in this case)
- a^[l-1] is output from previous layer
- j is the neuron number
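The general formula maps directly onto a small helper function. This is a sketch: `g` defaults to sigmoid here, and the example parameter values and shapes are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_layer(a_prev, W, b, g=sigmoid):
    """Compute a^[l] from a^[l-1].

    Each row j of W is w_j^[l], so entry j of the result is
    a_j^[l] = g(w_j^[l] . a^[l-1] + b_j^[l]).
    """
    return g(W @ a_prev + b)

# Example: a layer with 2 units receiving 3 inputs (placeholder values)
a_prev = np.array([0.5, -1.0, 2.0])
W = np.array([[0.2, 0.4, -0.1],
              [1.0, -0.3, 0.5]])
b = np.array([0.0, 0.1])
a = dense_layer(a_prev, W, b)
print(a)  # 2 activations, each strictly between 0 and 1
```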
Additional Notation:
- Input vector X also named a^[0]
- Makes formula consistent for first layer:
- a^[1] = sigmoid(w^[1]·a^[0] + b^[1])
- where a^[0] = X (input features)
Note: The activation function g outputs activation values. Sigmoid is the only activation function shown so far, but others will be introduced later.
With this notation, you can compute activations for any layer in a neural network using the activations from the previous layer and the layer’s parameters. This forms the foundation for the neural network inference algorithm that will be covered next.
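Putting the pieces together, a forward pass through a small 4-layer network might look like the sketch below. The layer sizes and randomly drawn parameters are made up for illustration; the actual inference algorithm is covered next:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Compute the network output: a^[0] = x, then
    a^[l] = sigmoid(W^[l] a^[l-1] + b^[l]) for each layer l."""
    a = x  # a^[0] is the input vector X
    for W, b in params:
        a = sigmoid(W @ a + b)
    return a  # activation of the final (output) layer

rng = np.random.default_rng(0)
# A 4-layer network: 5 inputs -> 4 -> 3 -> 2 -> 1 (sizes are placeholders)
sizes = [5, 4, 3, 2, 1]
params = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = rng.standard_normal(5)   # input features X = a^[0]
y_hat = forward(x, params)
print(y_hat.shape)  # (1,): a single output activation
```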