Course contents #1
Module 1: Introduction to Deep Learning and Neural Networks
This will include:
1. Mathematical Foundations
o Perceptron and Multi-Layer Perceptron (MLP) derivations
o Activation functions and their gradients
o Loss functions and their mathematical formulations
2. Theoretical Explanations
o Basics of deep learning and neural networks
o Importance of non-linearity and backpropagation
3. Code Exercises
o Implementing a simple neural network in TensorFlow/Keras
o Exploring activation functions and their effects
4. Reading Materials
o Key papers and foundational books on deep learning
1. Perceptron
A perceptron is the simplest type of artificial neural network. It consists of a single layer of neurons with adjustable weights.
2. Multi-Layer Perceptron (MLP)
An MLP extends the perceptron by introducing multiple layers of neurons, allowing it to learn complex patterns.
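As a minimal illustration (a sketch, not part of the original exercises), a perceptron's forward pass is just a weighted sum of the inputs followed by an activation, here a simple step function; the values below are arbitrary:
import numpy as np

# Toy perceptron forward pass: y = step(w . x + b)
x = np.array([0.5, -1.2, 3.0])   # one input sample with 3 features
w = np.array([0.4, 0.7, -0.2])   # adjustable weights
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum
y = 1 if z > 0 else 0            # step activation -> class 0 or 1
print(f"weighted sum = {z:.2f}, prediction = {y}")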
3. Activation Functions
Activation functions introduce non-linearity into neural networks, allowing them to learn
complex patterns.
1. Sigmoid Function
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
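A short sketch (assuming the imports above) that evaluates the sigmoid function sigma(x) = 1 / (1 + exp(-x)) and its gradient sigma'(x) = sigma(x) * (1 - sigma(x)) with TensorFlow:
x = tf.linspace(-6.0, 6.0, 121)      # sample points on the x-axis

with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.nn.sigmoid(x)             # sigmoid(x) = 1 / (1 + exp(-x))

grad = tape.gradient(y, x)           # equals sigmoid(x) * (1 - sigmoid(x)) elementwise
print(grad.numpy().max())            # maximum slope is 0.25, reached at x = 0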
7. Further Reading
Books:
o "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
o "Neural Networks and Deep Learning" by Michael Nielsen.
Papers:
o "Gradient-Based Learning Applied to Document Recognition" by Yann
LeCun et al.
o "Efficient Backprop" by Yann LeCun et al.
Momentum-based Optimization
RMSProp
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])
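The snippet above uses Adam; as a sketch of how the optimizers named in this section would be swapped in (the learning rates here are illustrative choices, not taken from the notes):
from tensorflow import keras

# SGD with momentum: accumulates a velocity term to damp oscillations
sgd_momentum = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)

# RMSProp: scales each update by a running average of squared gradients
rmsprop = keras.optimizers.RMSprop(learning_rate=0.001)

# Either optimizer can be passed to compile() in place of Adam, e.g.:
# model.compile(optimizer=sgd_momentum, loss='categorical_crossentropy', metrics=['accuracy'])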
For Module 2: Optimization Techniques, we need a case study that effectively demonstrates
the impact of different optimizers on deep learning performance.
Module 1: Introduction to Deep Learning and Neural Networks, with extended depth. This covers the same outline as above (mathematical foundations, theoretical explanations, code exercises, and reading materials) in greater detail.
The Improve Code Implementation section includes a small example dataset for testing the MLP model; I'll generate it synthetically using sklearn.datasets for better reproducibility and clarity. A complete sketch of the workflow follows the explanation below.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
Explanation:
1. Dataset Generation:
o We use make_classification() from sklearn.datasets to generate a
binary classification dataset.
o The dataset consists of 1000 samples with 10 features.
o The labels (y) are binary (0 or 1).
2. Preprocessing:
o The dataset is split into training (80%) and testing (20%) using
train_test_split().
o Standardization is applied using StandardScaler() to improve training
stability.
3. Model Definition:
o The MLP consists of:
 - An input layer with 10 neurons.
 - Two hidden layers with 64 and 32 neurons, respectively, using ReLU activation.
 - A final output layer with a single neuron and a sigmoid activation function (suitable for binary classification).
o The model is compiled with the Adam optimizer and binary cross-entropy loss.
4. Training and Evaluation:
o The model is trained for 10 epochs with batch size 32.
o The validation dataset (validation_data=(X_test, y_test)) helps monitor
overfitting.
o Finally, the test accuracy is printed.
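The code block itself survives here only as the imports above, so the following is a minimal end-to-end sketch reconstructed from the explanation (the random seeds and the 0.001 learning rate are assumptions):
# Generate a synthetic binary classification dataset (1000 samples, 10 features)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Split into 80% training / 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Standardize features to improve training stability
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Define the MLP: 10 inputs -> 64 -> 32 -> 1 (sigmoid)
model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(10,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy', metrics=['accuracy'])

# Train for 10 epochs with batch size 32, monitoring the test set for overfitting
history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                    validation_data=(X_test, y_test))

# Evaluate and print the test accuracy
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.3f}")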
Key Enhancements: batch normalization and dropout layers for regularization, plus training-history plots and a confusion matrix on the test set.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
# Standardize dataset (X_train, X_test, y_train, y_test come from the split above)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Enhanced MLP with batch normalization and dropout for regularization
def build_model():
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(10,)),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(32, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

mlp_model = build_model()
history = mlp_model.fit(X_train, y_train, epochs=10, batch_size=32,
                        validation_data=(X_test, y_test))

# Plot training history
fig, ax = plt.subplots(1, 2, figsize=(12, 5))

# Loss plot
ax[0].plot(history.history['loss'], label='Train Loss')
ax[0].plot(history.history['val_loss'], label='Validation Loss')
ax[0].set_title("Loss Over Epochs")
ax[0].set_xlabel("Epochs")
ax[0].set_ylabel("Loss")
ax[0].legend()

# Accuracy plot
ax[1].plot(history.history['accuracy'], label='Train Accuracy')
ax[1].plot(history.history['val_accuracy'], label='Validation Accuracy')
ax[1].set_title("Accuracy Over Epochs")
ax[1].set_xlabel("Epochs")
ax[1].set_ylabel("Accuracy")
ax[1].legend()
plt.show()
# Confusion matrix on the test set (0.5 threshold on the sigmoid output)
y_pred = (mlp_model.predict(X_test) > 0.5).astype(int).ravel()
conf_matrix = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(6, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues",
            xticklabels=["Class 0", "Class 1"], yticklabels=["Class 0", "Class 1"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
Extending the code to include Precision, Recall, F1-score, and ROC Curve visualization
for a more comprehensive performance evaluation.
1. Classification Report
o Shows Precision, Recall, and F1-score for each class.
2. ROC Curve & AUC Score
o Helps visualize the model's trade-off between sensitivity and specificity.
1. Confusion Matrix Visualization
The confusion matrix helps analyze how well the model is classifying each class.
The heatmap visually represents the true positive, false positive, false negative, and
true negative values.
Code Snippet
conf_matrix = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues",
            xticklabels=["Class 0", "Class 1"], yticklabels=["Class 0", "Class 1"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
Training Loss and Accuracy Curves
Code Snippet
fig, ax = plt.subplots(1, 2, figsize=(12, 5))
# Loss plot
ax[0].plot(history.history['loss'], label='Train Loss')
ax[0].plot(history.history['val_loss'], label='Validation Loss')
ax[0].set_title("Loss Over Epochs")
ax[0].set_xlabel("Epochs")
ax[0].set_ylabel("Loss")
ax[0].legend()
# Accuracy plot
ax[1].plot(history.history['accuracy'], label='Train Accuracy')
ax[1].plot(history.history['val_accuracy'], label='Validation Accuracy')
ax[1].set_title("Accuracy Over Epochs")
ax[1].set_xlabel("Epochs")
ax[1].set_ylabel("Accuracy")
ax[1].legend()
plt.show()
4. Classification Report
Code Snippet
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=["Class 0", "Class
1"]))
The ROC curve visualizes the trade-off between the True Positive Rate (sensitivity) and the False Positive Rate (1 - specificity).
AUC Score quantifies model performance (1.0 = perfect, 0.5 = random guessing).
Code Snippet
from sklearn.metrics import roc_curve, auc

y_probs = mlp_model.predict(X_test).ravel()  # probability scores from the sigmoid output
fpr, tpr, _ = roc_curve(y_test, y_probs)
roc_auc = auc(fpr, tpr)
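A sketch of plotting the curve from these values (matplotlib is already imported above):
plt.figure(figsize=(6, 6))
plt.plot(fpr, tpr, label=f"ROC curve (AUC = {roc_auc:.3f})")
plt.plot([0, 1], [0, 1], linestyle='--', color='gray', label="Random guessing")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()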
1. Precision-Recall Curve
Code Snippet
from sklearn.metrics import precision_recall_curve, average_precision_score
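Only the import survived here; a minimal sketch of how the curve might be computed and plotted, reusing the y_probs scores from the ROC section above:
precision, recall, _ = precision_recall_curve(y_test, y_probs)
avg_precision = average_precision_score(y_test, y_probs)

plt.figure(figsize=(6, 6))
plt.plot(recall, precision, label=f"AP = {avg_precision:.3f}")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.legend()
plt.show()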
Model Explainability with SHAP
Code Snippet
import shap
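Only the import remains; one common pattern for explaining a Keras model's predictions with SHAP is sketched below. The background and sample sizes are arbitrary choices, and mlp_model is the trained model from earlier:
# Wrap predict so it returns a flat vector of probabilities
predict_fn = lambda x: mlp_model.predict(x).ravel()

# Use a small background sample to keep KernelExplainer tractable
background = X_train[:100]
explainer = shap.KernelExplainer(predict_fn, background)

# Explain a handful of test instances and summarize feature contributions
shap_values = explainer.shap_values(X_test[:20])
shap.summary_plot(shap_values, X_test[:20])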
Feature Importance via Permutation
Code Snippet
from sklearn.inspection import permutation_importance

# 'fitted_clf' is assumed to be a fitted, scikit-learn-compatible wrapper around
# the trained MLP (e.g. a KerasClassifier, as in the grid-search snippet below)
result = permutation_importance(fitted_clf, X_test, y_test, n_repeats=10, random_state=42)
sorted_idx = result.importances_mean.argsort()

plt.figure(figsize=(8, 6))
plt.barh([f'Feature {i+1}' for i in sorted_idx],
         result.importances_mean[sorted_idx], color='teal')
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Feature Importance via Permutation")
plt.show()
Hyperparameter Tuning with GridSearchCV
Code Snippet
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
# (tail of the model-building function used by the grid search: compile and return)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model
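The fragment above is the end of a model-builder function; a fuller sketch of the grid search might look like this. The builder name, parameter grid, and epoch/batch settings are assumptions, and on newer TensorFlow versions the KerasClassifier wrapper has moved to the separate scikeras package:
def build_model(learning_rate=0.001):
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(10,)),
        layers.Dense(32, activation='relu'),
        layers.Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Wrap the Keras model so scikit-learn's GridSearchCV can drive it
keras_clf = KerasClassifier(build_fn=build_model, epochs=10, batch_size=32, verbose=0)

param_grid = {'learning_rate': [0.01, 0.001, 0.0001]}
grid = GridSearchCV(estimator=keras_clf, param_grid=param_grid, cv=3)
grid_result = grid.fit(X_train, y_train)

print("Best params:", grid_result.best_params_)
print("Best CV accuracy:", grid_result.best_score_)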
Hyperparameter Tuning with Keras Tuner
pip install keras-tuner
Code Snippet
import keras_tuner as kt
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', [0.01, 0.001, 0.0001])),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model
# Execute Search
tuner.search(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
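For completeness, here is a sketch of the full workflow the fragment above belongs to. The choice of RandomSearch, the trial count, and the fixed 64/32-unit architecture are assumptions; only the learning rate is searched, as in the hp.Choice call above:
def build_model(hp):
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(10,)),
        layers.Dense(32, activation='relu'),
        layers.Dense(1, activation='sigmoid')
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', [0.01, 0.001, 0.0001])),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model

# Random search over the learning-rate choices defined in build_model
tuner = kt.RandomSearch(build_model, objective='val_accuracy', max_trials=5,
                        directory='tuner_logs', project_name='mlp_tuning')

tuner.search(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print("Best learning rate:", best_hps.get('learning_rate'))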
Code Snippet
# Select a random test instance
idx = np.random.randint(0, len(X_test))
X_sample = X_test[idx:idx+1]
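Presumably the selected instance is then passed to the trained model for an individual prediction (and could likewise be fed to the SHAP explainer above); a minimal sketch:
prob = mlp_model.predict(X_sample)[0][0]   # sigmoid output in [0, 1]
predicted_class = int(prob > 0.5)          # threshold at 0.5
print(f"Instance {idx}: P(class 1) = {prob:.3f}, predicted class = {predicted_class}")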
Code Snippet
from sklearn.metrics import classification_report
Probability Calibration
Code Snippet
from sklearn.calibration import calibration_curve
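Only the import survived; a sketch of a reliability (calibration) plot built from the predicted probabilities (y_probs from the ROC section above):
prob_true, prob_pred = calibration_curve(y_test, y_probs, n_bins=10)

plt.figure(figsize=(6, 6))
plt.plot(prob_pred, prob_true, marker='o', label='MLP')
plt.plot([0, 1], [0, 1], linestyle='--', color='gray', label='Perfectly calibrated')
plt.xlabel("Mean Predicted Probability")
plt.ylabel("Fraction of Positives")
plt.title("Calibration Curve")
plt.legend()
plt.show()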
Learning Rate Scheduling
Code Snippet
from tensorflow.keras.callbacks import LearningRateScheduler
lr_callback = LearningRateScheduler(scheduler)
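The scheduler function referenced above is not defined in the notes; a minimal sketch of the complete pattern, where the decay schedule itself is an assumption (constant learning rate for five epochs, then exponential decay):
import math

def scheduler(epoch, lr):
    # Keep the initial learning rate for the first 5 epochs, then decay exponentially
    if epoch < 5:
        return lr
    return lr * math.exp(-0.1)

lr_callback = LearningRateScheduler(scheduler)

# Pass the callback to model.fit, e.g.:
# history = model.fit(X_train, y_train, epochs=20, batch_size=32,
#                     validation_data=(X_test, y_test), callbacks=[lr_callback])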