Training Neural Networks in TensorFlow
The 3-Step Process
Step 1: Define Model Architecture
- Specify how to compute the output given input x and parameters w, b
- Similar to logistic regression where:
- f(x) = g(w·x + b), where g is the sigmoid function (see the sketch after this list)
- g(z) = 1/(1+e^(-z))
- For neural networks, architecture defined with:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='sigmoid'),
    tf.keras.layers.Dense(15, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
- Specifies complete architecture including:
- 25 hidden units in first layer
- 15 hidden units in second layer
- 1 output unit in final layer
- All using sigmoid activation
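To connect this architecture back to the logistic-regression formula, here is a minimal sketch (the input values are made up for illustration) checking that a single Dense unit with a sigmoid activation computes f(x) = g(w·x + b):

import numpy as np
import tensorflow as tf

# A single Dense unit with a sigmoid activation computes f(x) = g(w·x + b),
# the same form as logistic regression.
layer = tf.keras.layers.Dense(1, activation='sigmoid')

x = np.array([[2.0, 3.0]], dtype=np.float32)   # one example with two features (illustrative values)
y_layer = layer(x)                              # builds the layer and runs a forward pass

# Recompute the same output by hand from the layer's weights.
w, b = layer.get_weights()
z = x @ w + b
y_manual = 1.0 / (1.0 + np.exp(-z))             # g(z) = 1 / (1 + e^(-z))

print(y_layer.numpy(), y_manual)                # the two results should match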
Step 2: Specify Loss and Cost Functions
- Define the loss function (measures error on a single training example)
- Define cost function (average loss over entire training set)
- For binary classification:
- Loss: -y log(f(x)) - (1-y)log(1-f(x))
- In TensorFlow: “binary cross-entropy loss” (see the numerical check below)
model.compile(loss=tf.keras.losses.BinaryCrossentropy())
- For regression problems:
- Loss: (1/2)(f(x) - y)²
- In TensorFlow: Mean Squared Error
model.compile(loss=tf.keras.losses.MeanSquaredError())
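As a quick sanity check (labels and predicted probabilities below are made up), the Keras binary cross-entropy loss matches the formula above averaged over the training examples:

import numpy as np
import tensorflow as tf

# Made-up labels and predicted probabilities, purely for illustration.
y_true = np.array([1.0, 0.0, 1.0])
f_x    = np.array([0.9, 0.2, 0.6])

keras_loss  = tf.keras.losses.BinaryCrossentropy()(y_true, f_x).numpy()
manual_loss = np.mean(-y_true * np.log(f_x) - (1 - y_true) * np.log(1 - f_x))

print(keras_loss, manual_loss)   # agree up to the small clipping Keras applies for numerical stability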
Step 3: Minimize Cost Function
- Use gradient descent to minimize cost J(W,B)
- Update rule: W = W - α·∂J/∂W, B = B - α·∂J/∂B
- TensorFlow handles backpropagation for computing derivatives
- Train the model with:
model.fit(X, y, epochs=100)
- TensorFlow also implements optimization algorithms (such as Adam) that can be even faster than standard gradient descent
- The fit() method handles the entire optimization process (a rough manual equivalent is sketched below)
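As a rough illustration of what fit() automates (the data, layer sizes, and learning rate here are made up), the same training loop can be written by hand, with tf.GradientTape computing the derivatives:

import numpy as np
import tensorflow as tf

# Tiny made-up dataset, purely for illustration.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)
y = np.array([[0.0], [0.0], [1.0]], dtype=np.float32)

model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation='sigmoid'),
                             tf.keras.layers.Dense(1, activation='sigmoid')])
loss_fn = tf.keras.losses.BinaryCrossentropy()
alpha = 0.1                                     # learning rate

model(X)                                        # build the model (initialize W, B) before training

for epoch in range(100):
    with tf.GradientTape() as tape:
        cost = loss_fn(y, model(X))             # J(W, B): average loss over the training set
    grads = tape.gradient(cost, model.trainable_variables)
    for param, grad in zip(model.trainable_variables, grads):
        param.assign_sub(alpha * grad)          # W = W - α·∂J/∂W,  B = B - α·∂J/∂B

In practice, model.compile() plus model.fit() replaces this loop (and can swap in faster optimizers).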
Modern Deep Learning Development
- Most implementations use libraries like TensorFlow or PyTorch
- Similar to how developers now use libraries for:
- Sorting algorithms
- Mathematical operations (square roots)
- Matrix operations
- Understanding underlying mechanisms still valuable for debugging
Knowledge Check Answers
Question 1
Q: For which type of task would you use the binary cross-entropy loss function?
A: Binary classification (classification with exactly 2 classes)
Question 2
Q: Which line of code updates the network parameters in order to reduce the cost?
A: model.fit(X, y, epochs=100)
TensorFlow simplifies neural network training into three clear steps: architecture definition, loss specification, and cost minimization. Modern deep learning relies heavily on libraries, but understanding the fundamentals remains essential for effective development and troubleshooting.