Pablo Rodriguez

Questions

  • Specify how to compute the output given an input x and parameters w, b
  • Similar to logistic regression, where:
  • f(x) = g(w·x + b), where g is the sigmoid function
  • g(z) = 1/(1+e^(-z))
  • For neural networks, the architecture is defined with:
Neural Network Architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='sigmoid'),
    tf.keras.layers.Dense(15, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
  • Specifies complete architecture including:
  • 25 hidden units in first layer
  • 15 hidden units in second layer
  • 1 output unit in final layer
  • All using sigmoid activation
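To make the per-unit computation concrete, here is a minimal pure-Python sketch (no TensorFlow needed) of what a single sigmoid unit in a Dense layer computes; the helper names `sigmoid` and `dense_unit` are illustrative, not TensorFlow API:

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def dense_unit(x, w, b):
    # One unit of a Dense layer: f(x) = g(w·x + b)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# With zero weights and bias, z = 0 and the sigmoid of 0 is 0.5
print(dense_unit([1.0, 2.0], [0.0, 0.0], 0.0))  # → 0.5
```

A Dense layer with 25 units simply runs 25 such computations, each with its own w and b, on the same input vector.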
  • Define loss function (single training example)
  • Define cost function (average loss over entire training set)
  • For binary classification:
  • Loss: -y log(f(x)) - (1-y)log(1-f(x))
  • In TensorFlow: "binary cross-entropy" loss
Compile Model
model.compile(loss=tf.keras.losses.BinaryCrossentropy())
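As a sanity check on the formula above, the single-example loss can be computed by hand; this is a sketch of the math, not TensorFlow's implementation (which also averages over the batch and handles numerical edge cases):

```python
import math

def binary_cross_entropy(y, f):
    # Loss for one example: -y*log(f(x)) - (1-y)*log(1-f(x))
    return -y * math.log(f) - (1 - y) * math.log(1 - f)

# A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one
print(binary_cross_entropy(1, 0.99))  # small
print(binary_cross_entropy(1, 0.01))  # large
```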
  • For regression problems:
  • Loss: (1/2)(f(x) - y)²
  • In TensorFlow: Mean Squared Error
Regression Loss
model.compile(loss=tf.keras.losses.MeanSquaredError())
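The regression loss from the notes is equally simple to verify by hand. One caveat worth hedging: the course formula carries a 1/2 factor for convenience when differentiating, while Keras's MeanSquaredError computes the plain mean of (f(x) - y)² without it; the constant factor does not change which parameters minimize the cost.

```python
def squared_error(f, y):
    # Course convention for one example: (1/2) * (f(x) - y)^2
    return 0.5 * (f - y) ** 2

print(squared_error(3.0, 1.0))  # → 2.0
```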
  • Use gradient descent to minimize cost J(W,B)
  • Update rule: W = W - α·∂J/∂W, B = B - α·∂J/∂B
  • TensorFlow handles backpropagation for computing derivatives
  • Train the model with:
Train Model
model.fit(X, y, epochs=100)
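Conceptually, each epoch of fit() applies the update rule above. Here is a hedged one-parameter sketch of gradient descent on a squared-error cost, with the derivative written out by hand; in TensorFlow the derivatives come from backpropagation instead:

```python
# Minimize J(w) = (1/2) * (w*x - y)^2 for one example via gradient descent
x, y = 2.0, 4.0            # single training example
w = 0.0                    # initial parameter
alpha = 0.1                # learning rate

for _ in range(100):       # analogous to epochs=100
    f = w * x              # model output
    dJ_dw = (f - y) * x    # derivative of the cost w.r.t. w
    w = w - alpha * dJ_dw  # update rule: w := w - α·∂J/∂w

print(round(w, 4))  # → 2.0, since 2.0 * x == y
```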
  • TensorFlow also implements optimization algorithms that converge faster than plain gradient descent (e.g. Adam)
  • The fit() method handles the entire optimization process
  • Most implementations use libraries like TensorFlow or PyTorch
  • Similar to how developers now use libraries for:
  • Sorting algorithms
  • Mathematical operations (square roots)
  • Matrix operations
  • Understanding underlying mechanisms still valuable for debugging

Q: For which type of task would you use the binary cross entropy loss function? A: binary classification (classification with exactly 2 classes)

Q: Which line of code updates the network parameters in order to reduce the cost? A: model.fit(X, y, epochs=100)

TensorFlow simplifies neural network training into three clear steps: architecture definition, loss specification, and cost minimization. Modern deep learning relies heavily on libraries, but understanding the fundamentals remains essential for effective development and troubleshooting.