Training Neural Networks in TensorFlow
The 3-Step Process
Step 1: Define Model Architecture
- Specify how to compute the output given input x and parameters w, b
- Similar to logistic regression where:
- f(x) = g(w·x + b), where g is the sigmoid function (see the sketch after this list)
- g(z) = 1/(1+e^(-z))
- For neural networks, architecture defined with:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='sigmoid'),
    tf.keras.layers.Dense(15, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
- Specifies complete architecture including:
- 25 hidden units in first layer
- 15 hidden units in second layer
- 1 output unit in final layer
- All using sigmoid activation
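To connect this architecture back to the logistic-regression formula, here is a minimal sketch (the input values are made up for illustration) checking that a single Dense unit with a sigmoid activation computes f(x) = g(w·x + b):

import numpy as np
import tensorflow as tf

# A single Dense unit with a sigmoid activation computes f(x) = g(w·x + b),
# the same form as logistic regression.
layer = tf.keras.layers.Dense(1, activation='sigmoid')

x = np.array([[2.0, 3.0]], dtype=np.float32)   # one example with two features (illustrative values)
y_layer = layer(x)                              # builds the layer and runs a forward pass

# Recompute the same output by hand from the layer's weights.
w, b = layer.get_weights()
z = x @ w + b
y_manual = 1.0 / (1.0 + np.exp(-z))             # g(z) = 1 / (1 + e^(-z))

print(y_layer.numpy(), y_manual)                # the two results should match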
Step 2: Specify Loss and Cost Functions
- Define the loss function (measures error on a single training example)
- Define cost function (average loss over entire training set)
- For binary classification:
- Loss: -y log(f(x)) - (1-y)log(1-f(x))
- In TensorFlow: “binary cross-entropy loss” (see the numerical check below)
model.compile(loss=tf.keras.losses.BinaryCrossentropy())
- For regression problems:
- Loss: (1/2)(f(x) - y)²
- In TensorFlow: Mean Squared Error
model.compile(loss=tf.keras.losses.MeanSquaredError())
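As a quick sanity check (labels and predicted probabilities below are made up), the Keras binary cross-entropy loss matches the formula above averaged over the training examples:

import numpy as np
import tensorflow as tf

# Made-up labels and predicted probabilities, purely for illustration.
y_true = np.array([1.0, 0.0, 1.0])
f_x    = np.array([0.9, 0.2, 0.6])

keras_loss  = tf.keras.losses.BinaryCrossentropy()(y_true, f_x).numpy()
manual_loss = np.mean(-y_true * np.log(f_x) - (1 - y_true) * np.log(1 - f_x))

print(keras_loss, manual_loss)   # agree up to the small clipping Keras applies for numerical stability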
Step 3: Minimize Cost Function
- Use gradient descent to minimize cost J(W,B)
- Update rule: W = W - α·∂J/∂W, B = B - α·∂J/∂B
- TensorFlow handles backpropagation for computing derivatives
- Train the model with:
model.fit(X, y, epochs=100)
- TensorFlow also implements optimization algorithms (such as Adam) that can be even faster than standard gradient descent
- The fit() method handles the entire optimization process (a rough manual equivalent is sketched below)
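As a rough illustration of what fit() automates (the data, layer sizes, and learning rate here are made up), the same training loop can be written by hand, with tf.GradientTape computing the derivatives:

import numpy as np
import tensorflow as tf

# Tiny made-up dataset, purely for illustration.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)
y = np.array([[0.0], [0.0], [1.0]], dtype=np.float32)

model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation='sigmoid'),
                             tf.keras.layers.Dense(1, activation='sigmoid')])
loss_fn = tf.keras.losses.BinaryCrossentropy()
alpha = 0.1                                     # learning rate

model(X)                                        # build the model (initialize W, B) before training

for epoch in range(100):
    with tf.GradientTape() as tape:
        cost = loss_fn(y, model(X))             # J(W, B): average loss over the training set
    grads = tape.gradient(cost, model.trainable_variables)
    for param, grad in zip(model.trainable_variables, grads):
        param.assign_sub(alpha * grad)          # W = W - α·∂J/∂W,  B = B - α·∂J/∂B

In practice, model.compile() plus model.fit() replaces this loop (and can swap in faster optimizers).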
Modern Deep Learning Development
- Most implementations use libraries like TensorFlow or PyTorch
- Similar to how developers now use libraries for:
- Sorting algorithms
- Mathematical operations (square roots)
- Matrix operations
- Understanding underlying mechanisms still valuable for debugging
Knowledge Check Answers
Question 1
Q: For which type of task would you use the binary cross-entropy loss function?
A: Binary classification (classification with exactly 2 classes)
Question 2
Q: Which line of code updates the network parameters in order to reduce the cost?
A: model.fit(X, y, epochs=100)
TensorFlow simplifies neural network training into three clear steps: architecture definition, loss specification, and cost minimization. Modern deep learning relies heavily on libraries, but understanding the fundamentals remains essential for effective development and troubleshooting.