DL Lab Manual Full - Pagenumber
Science, Media & Entertainment, Autonomous Cars, etc.
What is CNN?
Three Layers of CNN
Convolutional Neural Networks are specialized for applications in image and video
recognition. CNNs are mainly used in image analysis tasks such as image recognition,
object detection and segmentation.

# Define the input shape and number of classes
input_shape = (32, 32, 3)
num_classes = 10

# Normalize the images and shuffle the training data
def preprocess_image(image, label):
    image = tf.image.resize(image, (32, 32))
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

BATCH_SIZE = 32

Output:
Test accuracy: 0.6983000040054321
Introduction
Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural
nets) are a branch of machine learning models that are built using principles of
neuronal organization discovered by connectionism in the biological neural networks
constituting animal brains.

An ANN is based on a collection of connected units or nodes called artificial
neurons, which loosely model the neurons in a biological brain. Each connection, like
the synapses in a biological brain, can transmit a signal to other neurons. An
artificial neuron receives signals, processes them, and can signal the neurons
connected to it. The "signal" at a connection is a real number, and the output of
each neuron is computed by some non-linear function of the sum of its inputs. The
connections are called edges. Neurons and edges typically have a weight that adjusts
as learning proceeds. The weight increases or decreases the strength of the signal at
a connection. Neurons may have a threshold such that a signal is sent only if the
aggregate signal crosses that threshold.

Typically, neurons are aggregated into layers. Different layers may perform different
transformations on their inputs. Signals travel from the first layer (the input
layer) to the last layer (the output layer), possibly after traversing the layers
multiple times.

How artificial neural network is used for classification?
Classification ANNs seek to classify an observation as belonging to some discrete
class as a function of the inputs. The input features (independent variables) can be
categorical or numeric; however, the dependent variable must be a categorical
feature.

Create Simple Deep Learning Neural Network for Classification

process data in a way that is inspired by the human brain. It is a type of machine
learning process, called deep learning, that uses interconnected nodes or neurons in
a layered structure that resembles the human brain.

• We have worked on predicting insurance cost based on the features provided in the
  data set
• We wanted to test the dataset with different regression models
• We have also done feature engineering initially and then exploratory analysis to
  build an understanding of the relationship between variables.

Source code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno

# Load the insurance dataset
df = pd.read_csv('/content/sample_data/insurance (1).csv')

# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(df.head())

# Set the figure size for the missing values matrix plot
plt.figure(figsize=(8,5))

# Visualize the missing values in the dataset using a matrix plot
print("\nVisualizing missing values in the dataset:")
msno.matrix(df)
plt.show()

# Count the number of missing values in each column (if any)
print("\nCounting missing values in each column:")
print(df.isnull().sum())

# Print a summary of the dataset (data types and non-null counts)
print("\nSummary of the dataset (data types and non-null counts):")
print(df.info())

# Print summary statistics of the numerical columns
print("\nSummary statistics of numerical columns:")
print(df.describe())

# Display the first few rows again to check data after initial analysis
print("\nFirst few rows of the dataset after initial analysis:")
print(df.head())

# Convert the 'sex' column into a dummy variable and drop the first category ('female')
Male = pd.get_dummies(df['sex'], drop_first=True)

# Add the dummy variable to the original dataframe
df = pd.concat([df, Male], axis=1)

# Convert the 'smoker' column into a dummy variable and drop the first category ('no')
Smoker = pd.get_dummies(df['smoker'], drop_first=True)

# Add the dummy variable to the original dataframe
df = pd.concat([df, Smoker], axis=1)

# Rename the 'yes' column to 'Smoker' for clarity
df = df.rename(columns={'yes':'Smoker'})

# Get the unique values from the 'region' column
print("\nUnique values in the 'region' column:")
print(df['region'].unique())

# Convert the 'region' column into dummy variables for each region
region = pd.get_dummies(df['region'])

# Add the dummy variables to the original dataframe
df = pd.concat([df, region], axis=1)

# Display the first few rows again to check the dataframe after encoding
print("\nFirst few rows of the dataset after encoding categorical variables:")
print(df.head())

# Set the figure size for the plot
plt.figure(figsize=(8,4))

# Create a count plot for the 'sex' column with a specified color palette
print("\nCount plot for the 'sex' column:")
sns.set_style('white')
sns.countplot(x='sex', data=df, palette='GnBu')
sns.despine(left=True)
plt.show()

# Set the figure size for the next plot
plt.figure(figsize=(8,4))

# Create a boxplot for 'charges' based on 'sex' and 'Smoker' status, with a specified color palette
print("\nBoxplot of 'charges' by 'sex' and 'Smoker' status:")
sns.set_style('white')
sns.boxplot(x='sex', y='charges', data=df, palette='OrRd', hue='Smoker')
sns.despine(left=True)
plt.show()

# Create subplots for multiple scatter plots
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(12,5))

# Scatter plot for 'age' vs 'charges', colored by 'sex'
print("\nScatter plot of 'age' vs 'charges' by 'sex':")
sns.scatterplot(x='age', y='charges', data=df, palette='coolwarm', hue='sex', ax=ax[0])

# Scatter plot for 'age' vs 'charges', colored by 'Smoker'
print("\nScatter plot of 'age' vs 'charges' by 'Smoker' status:")
sns.scatterplot(x='age', y='charges', data=df, palette='GnBu', hue='Smoker', ax=ax[1])

# Scatter plot for 'age' vs 'charges', colored by 'region'
print("\nScatter plot of 'age' vs 'charges' by 'region':")
sns.scatterplot(x='age', y='charges', data=df, palette='magma_r', hue='region', ax=ax[2])

# Set the style for the seaborn plots to 'dark'
sns.set_style('dark')
sns.despine(left=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()

# Create subplots for multiple boxplots
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12,5))

# Boxplot for 'region' vs 'charges', colored by 'Smoker' status
print("\nBoxplot of 'charges' by 'region' and 'Smoker' status:")
sns.boxplot(x='region', y='charges', data=df, palette='GnBu', hue='Smoker', ax=ax[0])

# Boxplot for 'region' vs 'charges', colored by 'sex'
print("\nBoxplot of 'charges' by 'region' and 'sex':")
sns.boxplot(x='region', y='charges', data=df, palette='coolwarm', hue='sex', ax=ax[1])
plt.show()

# Create subplots for multiple scatter plots
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12,5))

# Scatter plot for 'bmi' vs 'charges', colored by 'sex'
print("\nScatter plot of 'bmi' vs 'charges' by 'sex':")
sns.scatterplot(x='bmi', y='charges', data=df, palette='GnBu_r', hue='sex', ax=ax[0])

# Scatter plot for 'bmi' vs 'charges', colored by 'Smoker' status
print("\nScatter plot of 'bmi' vs 'charges' by 'Smoker' status:")
sns.scatterplot(x='bmi', y='charges', data=df, palette='magma', hue='Smoker', ax=ax[1])
sns.set_style('dark')
sns.despine(left=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()

# Drop the columns 'sex', 'region', 'smoker', and 'southwest' from the dataframe
df.drop(['sex', 'region', 'smoker', 'southwest'], axis=1, inplace=True)

# Display the first few rows again to check the dataframe after dropping columns
print("\nFirst few rows of the dataset after dropping unnecessary columns:")
print(df.head())

# Set the figure size for the heatmap plot
plt.figure(figsize=(10,4))

# Create a heatmap of the correlation matrix of the dataframe, with a specified color map
print("\nHeatmap of the correlation matrix:")
sns.heatmap(df.corr(), cmap='OrRd')
plt.show()

# Define the feature matrix X (all columns except 'charges') and target variable y ('charges')
X = df.drop('charges', axis=1)
y = df['charges']

# Import the train_test_split function to split the dataset into training and testing sets
from sklearn.model_selection import train_test_split

# Split the dataset into training and testing sets (75% training, 25% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Import the MinMaxScaler to scale the features
from sklearn.preprocessing import MinMaxScaler

# Initialize the scaler and fit it on the training data
scaler = MinMaxScaler()
scaler.fit(X_train)

# Transform the training data using the fitted scaler
X_train = scaler.transform(X_train)

# Transform the test data using the fitted scaler, so that validation and
# prediction see features on the same scale as the training data
X_test = scaler.transform(X_test)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Initialize a sequential model
model = Sequential()

# Add the first hidden layer with 8 units and ReLU activation
model.add(Dense(units = 8, activation = 'relu'))

# Add the second hidden layer with 3 units and ReLU activation
model.add(Dense(units = 3, activation = 'relu'))

# Add the output layer with 1 unit (since we're predicting a single value)
model.add(Dense(units = 1))

# Compile the model using the Adam optimizer and Mean Squared Error as the loss function
model.compile(optimizer = 'adam', loss = 'mse')

# Define early stopping to prevent overfitting, stopping if validation loss doesn't
# improve for 15 epochs
early_stop = EarlyStopping(monitor='val_loss', mode= 'min', verbose= 0, patience=15)

# Train the model with the training data, validate on the test data, and use early stopping
model.fit(x=X_train, y=y_train, epochs = 2000, validation_data=(X_test, y_test),
          batch_size=128, callbacks=[early_stop])

# Convert the training history to a DataFrame and plot the loss over epochs
loss = pd.DataFrame(model.history.history)
loss.plot()
plt.title("Model Loss Over Epochs")  # Add a title to the plot
plt.xlabel("Epochs")  # Add label for the x-axis
plt.ylabel("Loss")  # Add label for the y-axis
plt.show()

from sklearn.metrics import mean_squared_error
# Make predictions on the test data
pred = model.predict(X_test)

# Calculate and print the Root Mean Squared Error (RMSE) for the test set predictions
rmse_test = np.sqrt(mean_squared_error(y_test, pred))
print(f"Root Mean Squared Error(RMSE) on Test Data: {rmse_test}")

# Select a subset of the data for further predictions (dropping the 'charges' column)
entry_1 = df[257:477].drop('charges', axis=1)

# Make predictions on the selected subset, scaling it with the scaler that was
# fitted on the training data (the model was trained on scaled features)
pred = model.predict(scaler.transform(entry_1))

# Calculate and print the RMSE for this subset of data
rmse_entry_1 = np.sqrt(mean_squared_error(df[257:477]['charges'], pred))
print(f"Root Mean Squared Error(RMSE) on Subset Data: {rmse_entry_1}")

Output:

First few rows of the dataset:
   age     sex     bmi  children smoker     region      charges
0   19  female  27.900         0    yes  southwest  16884.92400
1   18    male  33.770         1     no  southeast   1725.55230
2   28    male  33.000         3     no  southeast   4449.46200
3   33    male  22.705         0     no  northwest  21984.47061
4   32    male  28.880         0     no  northwest   3866.85520

Visualizing missing values in the dataset:
<Figure size 800x500 with 0 Axes>

dtype: int64

Summary of the dataset (data types and non-null counts):
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1338 entries, 0 to 1337
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   age       1338 non-null   int64
 1   sex       1338 non-null   object
 2   bmi       1338 non-null   float64
 3   children  1338 non-null   int64
 4   smoker    1338 non-null   object
 5   region    1338 non-null   object
 6   charges   1338 non-null   float64
dtypes: float64(2), int64(2), object(3)
memory usage: 73.3+ KB
None

Summary statistics of numerical columns:
               age          bmi     children       charges
count  1338.000000  1338.000000  1338.000000   1338.000000
mean     39.207025    30.663397     1.094918  13270.422265
std      14.049960     6.098187     1.205493  12110.011237
min      18.000000    15.960000     0.000000   1121.873900
25%      27.000000    26.296250     0.000000   4740.287150
50%      39.000000    30.400000     1.000000   9382.033000
75%      51.000000    34.693750     2.000000  16639.912515
max      64.000000    53.130000     5.000000  63770.428010

First few rows of the dataset after initial analysis:
   age     sex     bmi  children smoker     region      charges
0   19  female  27.900         0    yes  southwest  16884.92400
1   18    male  33.770         1     no  southeast   1725.55230
2   28    male  33.000         3     no  southeast   4449.46200
3   33    male  22.705         0     no  northwest  21984.47061
4   32    male  28.880         0     no  northwest   3866.85520

Unique values in the 'region' column:
['southwest' 'southeast' 'northwest' 'northeast']

Boxplot of 'charges' by 'sex' and 'Smoker' status:
Scatter plot of 'age' vs 'charges' by 'sex':
Scatter plot of 'age' vs 'charges' by 'Smoker' status:
Scatter plot of 'age' vs 'charges' by 'region':
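The listing for this exercise fits `MinMaxScaler` on the training split only and reports the root mean squared error (RMSE). As a rough illustration of what those two steps compute, here is a minimal pure-Python sketch. The helper names (`fit_min_max`, `transform`, `rmse`) and the toy numbers are assumptions made for this example; they are not part of the lab code or of scikit-learn.

```python
import math


def fit_min_max(rows):
    """Learn per-column minimum and maximum from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform(rows, mins, maxs):
    """Rescale every column to [0, 1] using the statistics learned on the training rows."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy (age, bmi) rows, invented for illustration
train = [[18, 20.0], [40, 30.0], [64, 40.0]]
test = [[29, 25.0]]

mins, maxs = fit_min_max(train)             # statistics come from the training split
train_scaled = transform(train, mins, maxs)
test_scaled = transform(test, mins, maxs)   # the test split reuses them unchanged

print(train_scaled)
print(test_scaled)
print(rmse([1, 2], [1, 4]))
```

The point of the design is that the test rows never influence the scaling statistics, and that any data passed to the trained model later has to go through the same transformation.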
[Figures: scatter plots of 'bmi' vs 'charges' by 'sex' and by 'Smoker' status]

Exercise-3
Module name : Understanding and Using CNN : Image recognition
Design a CNN for Image Recognition which includes hyperparameter tuning.
Introduction
Deep Learning models have important applications in image processing. However,
one of the challenges in this field is the definition of hyperparameters. Thus, the
objective of this work is to propose a rigorous methodology for hyperparameter
tuning of Convolutional Neural Network for building construction image
classification.
Neural network hyperparameters are like settings you choose before teaching a
neural network to do a task. They control things like how many layers the network
has, how quickly it learns, and how it adjusts its internal values. Picking the right
hyperparameters is important to help the network learn effectively and solve the
task accurately. It’s a bit like adjusting the knobs on a machine to make it work just
right for a particular job.
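To make the idea of "settings chosen before training" concrete, the following minimal sketch treats a set of hyperparameters as a plain dictionary and tries a few random combinations. Everything here is an assumption made for illustration: the search space values, the helper names, and in particular `score`, which is only a toy stand-in for "train the network and measure validation accuracy".

```python
import random

# Hypothetical search space: each hyperparameter has a list of candidate values
search_space = {
    "num_layers": [1, 2, 3, 4],
    "units": list(range(10, 101, 10)),
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
}


def sample_config(rng):
    """Pick one candidate value for every hyperparameter."""
    return {name: rng.choice(values) for name, values in search_space.items()}


def score(config):
    """Toy stand-in for training a model and returning its validation accuracy."""
    return 1.0 / (1.0 + abs(config["num_layers"] - 2) + abs(config["units"] - 50) / 50)


rng = random.Random(0)                               # fixed seed for repeatability
trials = [sample_config(rng) for _ in range(5)]      # five random combinations
best = max(trials, key=score)                        # keep the best-scoring one

print(best)
```

In a real experiment the scoring step is the expensive part, which is why the number of trials is usually kept small.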
to build the model from a specific dataset.

Here we will demonstrate the process of tuning two things in a neural network: (1)
the hyperparameters and (2) the layers. I find tutorials on the latter harder to find
than tutorials on the former. The first one is the same as for other conventional
Machine Learning algorithms. The hyperparameters to tune are the number of neurons,
activation function, optimizer, learning rate, batch size, and epochs. The second
step is to tune the number of layers. This is what other conventional algorithms do
not have. The number of layers can affect the accuracy: too few layers may underfit,
while too many layers may overfit.

For the hyperparameter-tuning demonstration, I use a dataset provided by Kaggle. I
build a simple Multilayer Perceptron (MLP) neural network to do a binary
classification task with prediction probability. The Python package used is Keras,
built on top of TensorFlow. The dataset has an input dimension of 10. There are two
hidden layers, followed by one output layer. The evaluation metric is the accuracy
score. The EarlyStopping callback is used to stop the learning process if there is no
accuracy improvement in 20 epochs. Below is the illustration.

Hyperparameter Tuning in Deep Learning
The first hyperparameter to tune is the number of neurons in each hidden layer. In
this case, the number of neurons in every layer is set to be the same, although it
can also be made different. The number of neurons should be adjusted to the
complexity of the problem: a more complex prediction task needs more neurons. The
range for the number of neurons is set from 10 to 100.

An activation function is a parameter in each layer. Input data are fed to the input
layer, followed by hidden layers, and the final output layer. The output layer
contains the output value. The values moving from one layer to the next keep changing
according to the activation function. The activation function decides how the input
values of a layer are turned into output values. The output values of a layer are
then passed to the next layer as input values, and the next layer again computes
output values for the layer after it. There are 9 activation functions to tune in
this demonstration. Each activation function has its own formula (and graph) to
compute the input values.

The layers of a neural network are compiled and an optimizer is assigned. The
optimizer is responsible for changing the learning rate and the weights of the
neurons in the neural network to reach the minimum of the loss function. The
optimizer is very important for achieving the highest possible accuracy or the
minimum loss. There are 7 optimizers to choose from. Each has a different concept
behind it.

One of the hyperparameters in the optimizer is the learning rate. We will also tune
the learning rate. The learning rate controls the step size a model takes to reach
the minimum of the loss function. A higher learning rate makes the model learn
faster, but it may miss the minimum and only reach its surroundings. A lower learning
rate gives a better chance of finding the minimum. As a tradeoff, a lower learning
rate needs more epochs, and therefore more time and memory.

Source code:
For installing keras-tuner (run in another cell)
!pip install keras-tuner

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10
from keras_tuner import RandomSearch  # the package is imported as keras_tuner

# Step 1: Load and preprocess the dataset (CIFAR-10)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, 10)  # 10 classes in CIFAR10
y_test = keras.utils.to_categorical(y_test, 10)

# Step 2: Define the CNN architecture
def build_model(hp):
    model = keras.Sequential()

    # Tune the number of convolutional layers and filters
    for i in range(hp.Int('num_conv_layers', 2, 4)):
        model.add(layers.Conv2D(hp.Int(f'conv_{i}_filters', 32, 256, 32), (3, 3),
                                activation='relu', padding='same'))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())

    # Tune the number of dense layers and units
    for i in range(hp.Int('num_dense_layers', 1, 3)):
        model.add(layers.Dense(units=hp.Int(f'dense_{i}_units', 64, 512, 32),
                               activation='relu'))

    # Output layer
    model.add(layers.Dense(10, activation='softmax'))

    # Compile the model
    model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', [1e-3, 1e-4])),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Step 3: Initialize the Keras Tuner RandomSearch and search for the best hyperparameters
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,  # Number of different hyperparameter combinations to try
    directory='hyperparameter_tuning',  # Directory to store the results
    project_name='cifar10_tuner'  # Name of the project
)

# Search for the best hyperparameter configuration
tuner.search(x_train, y_train, epochs=1, validation_data=(x_test, y_test))

# Step 4: Get the best hyperparameters and build the final model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
final_model = build_model(best_hps)

# Step 5: Train and evaluate the final model
final_model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))

# Evaluate the final model on the test set
loss, accuracy = final_model.evaluate(x_test, y_test)
print(f'Test accuracy: {accuracy}')

Output:
Test accuracy: 0.5800999999046326

Exercise-4
Module name: Predicting Sequential Data
Exercise: Implement a Recurrent Neural Network for Predicting Sequential Data.
Introduction
A Deep Learning approach for modelling sequential data is Recurrent Neural
Networks (RNN). RNNs were the standard suggestion for working with sequential
data before the advent of attention models. A deep feedforward model may require
specific parameters for each element of the sequence. It may also be unable to
generalize to variable-length sequences.

Recurrent Neural Networks use the same weights for each element of the sequence,
decreasing the number of parameters and allowing the model to generalize to
sequences of varying lengths. Because of their design, RNNs also generalize to
structured data other than sequential data, such as geographical or graphical data.

Recurrent neural networks, like many other deep learning techniques, are relatively
old. They were first developed in the 1980s, but we didn’t appreciate their full
potential until lately. The advent of long short-term memory (LSTM) in the 1990s,
combined with an increase in computational power and the vast amounts of data
that we now have to deal with, has really pushed RNNs to the forefront.

Neural networks imitate the function of the human brain in the fields of AI,
machine learning, and deep learning, allowing computer programs to recognize
patterns and solve common issues.

RNNs are a type of neural network that can be used to model sequence data. RNNs,
which are formed from feedforward networks, are similar to human brains in their
behaviour. Simply said, recurrent neural networks can anticipate sequential data in a
way that other algorithms can’t.

All of the inputs and outputs in standard neural networks are independent of one
another. However, in some circumstances, such as when predicting the next word of a
phrase, the prior words are necessary, and so the previous words must be remembered.
As a result, the RNN was created, which uses a hidden layer to overcome the problem.
The most important component of an RNN is the hidden state, which remembers specific
information about a sequence.

RNNs have a memory that stores all information about the calculations. An RNN uses
the same parameters for every input, because it performs the same task on all inputs
or hidden layers to produce the output.

The Architecture of a Traditional RNN
RNNs are a type of neural network that has hidden states and allows past
outputs to be used as inputs. They usually go like this:
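The text in this exercise describes an RNN as reusing the same weights at every time step while a hidden state carries information forward. A minimal pure-Python sketch of that recurrence is given below, assuming the common form h_t = tanh(W_xh · x_t + W_hh · h_(t-1) + b_h). The weight values, sizes and function names are invented for illustration and are not taken from the lab code.

```python
import math

# Hypothetical parameters: input size 1, hidden size 2
W_xh = [0.5, -0.3]                  # input -> hidden weights
W_hh = [[0.1, 0.4], [-0.2, 0.3]]    # hidden -> hidden weights
b_h = [0.0, 0.1]                    # hidden bias


def rnn_step(x_t, h_prev):
    """One time step: combine the current input with the previous hidden state."""
    return [
        math.tanh(
            W_xh[i] * x_t
            + sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
            + b_h[i]
        )
        for i in range(len(b_h))
    ]


def run_rnn(sequence):
    """Apply the same step (same weights) to every element of the sequence."""
    h = [0.0] * len(b_h)            # initial hidden state
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h)
        states.append(h)
    return states


short = run_rnn([1.0, 0.5])
longer = run_rnn([1.0, 0.5, -0.5, 2.0])
print(short[-1])
print(longer[-1])
```

Because the parameters do not depend on the position in the sequence, the same three small arrays handle both the two-step and the four-step input.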
• Many To One: In this scenario, a single output is produced by combining many
  inputs from distinct time steps. Sentiment analysis and emotion identification use
  such networks, in which the class label is determined by a sequence of words.
• Many To Many: For many to many, there are numerous options. Two inputs
  yield three outputs. Machine translation systems, such as English to French or vice
  versa translation systems, use many to many networks.

Common Activation Functions
A neuron’s activation function dictates whether it should be turned on or off.
Nonlinear functions usually transform a neuron’s output to a number between
0 and 1 or -1 and 1.

num_samples = 1000
time_steps = 10  # Number of time steps in each input sequence
data = generate_data(num_samples, time_steps)

# Split the data into training and testing sets
train_ratio = 0.8
train_size = int(train_ratio * num_samples)
train_data = data[:train_size]
test_data = data[train_size:]

# Prepare the data for training and testing
train_input = train_data[:, :-1, :]   # Input data for training (all time steps except the last one)
train_target = train_data[:, 1:, :]   # Target data for training (all time steps except the first one)
test_input = test_data[:, :-1, :]     # Input data for testing (all time steps except the last one)
test_target = test_data[:, 1:, :]     # Target data for testing (all time steps except the first one)

[...] output unit)
initial_input = np.concatenate((initial_input[:, 1:, :], prediction[:, np.newaxis, :]), axis=1)
print("Predictions:", predictions)

Output:
Test loss: 0.03264236077666283
Predictions: [0.55553997, 0.42471844, 0.5136012, 0.57316655, 0.4701597,
0.71725094, 0.5853208, 0.44183043, 0.7576916, 0.41853464]
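The listing for this exercise relies on two slicing idioms: shifting a sequence by one step to obtain input/target pairs, and sliding the input window forward by dropping the oldest value and appending the newest prediction. The sketch below shows both on plain Python lists; the toy values and the name `roll_window` are assumptions made for this illustration only.

```python
# A toy sequence of five time steps
sequence = [10, 11, 12, 13, 14]

# Next-step prediction: the input is every step except the last,
# the target is every step except the first
inputs = sequence[:-1]
targets = sequence[1:]
pairs = list(zip(inputs, targets))
print(pairs)


def roll_window(window, new_value):
    """Drop the oldest element and append the newest prediction."""
    return window[1:] + [new_value]


# Feeding a (hypothetical) prediction back in as the newest input
window = [1, 2, 3]
window = roll_window(window, 4)
print(window)
```

Each input step is paired with the value that follows it, which is what lets a network trained on these pairs be used for step-by-step forecasting.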
Exercise-5
Module name: Removing noise from the images
Exercise: Implement Multi-Layer Perceptron algorithm for Image denoising
hyperparameter tuning.
Introduction
A multi-layer perceptron is also known as an MLP. It consists of fully connected
dense layers, which transform any input dimension to the desired dimension. A
multi-layer perceptron is a neural network that has multiple layers. To create a
neural network we combine neurons together so that the outputs of some neurons are
inputs of other neurons.

A gentle introduction to neural networks and TensorFlow can be found here:
• Neural Networks
• Introduction to TensorFlow

A multi-layer perceptron has one input layer with one neuron (or node) for each
input, one output layer with a single node for each output, and any number of hidden
layers, each of which can have any number of nodes. A schematic diagram of a
Multi-Layer Perceptron (MLP) is depicted below.

and thus three input nodes and the hidden layer has three nodes. The output layer
gives two outputs, therefore there are two output nodes. The nodes in the input layer
take input and forward it for further processing. In the diagram above, the nodes in
the input layer forward their output to each of the three nodes in the hidden layer,
and in the same way, the hidden layer processes the information and passes it to the
output layer.

Every node in the multi-layer perceptron uses a sigmoid activation function. The
sigmoid activation function takes real values as input and converts them to numbers
between 0 and 1 using the sigmoid formula.

σ(x) = 1 / (1 + exp(−x))

Now that we are done with the theory part of the multi-layer perceptron, let’s go
ahead and implement some code in Python using scikit-learn's MLPRegressor.

Source code:
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error

# Generate synthetic noisy image data for demonstration purposes
np.random.seed(42)
image_size = (100, 100)
num_samples = 1000
noise_level = 0.1
clean_image = np.random.random(image_size)
noisy_image = clean_image + noise_level * np.random.random(image_size)

[...]
}

# Create the GridSearchCV object to perform hyperparameter tuning
grid_search = GridSearchCV(mlp, param_grid, cv=3,
                           scoring='neg_mean_squared_error', n_jobs=-1)

# Perform hyperparameter tuning
grid_search.fit(X_train, y_train)

# Get the best hyperparameters and the best MLP model
best_params = grid_search.best_params_
best_mlp = grid_search.best_estimator_

# Predict the denoised image using the best model
denoised_image = best_mlp.predict(X_train).reshape(image_size)

# Calculate mean squared error between the denoised image and the clean image
mse = mean_squared_error(clean_image, denoised_image)

print("Best Hyperparameters:", best_params)
print("Mean Squared Error:", mse)

Output:
Best Hyperparameters: {'activation': 'relu', 'alpha': 0.001,
'hidden_layer_sizes': (50, 50)}
Mean Squared Error: 0.0007964874470608723
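As a complement to the description in this exercise of a network with three input nodes, three hidden nodes and two output nodes, all using the sigmoid activation, here is a minimal pure-Python sketch of a single forward pass. The weights, biases and input values are invented for illustration, and the helper names (`sigmoid`, `dense`, `forward`) are assumptions, not part of the lab code.

```python
import math


def sigmoid(x):
    """Sigmoid activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer: weighted sum per node, followed by the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters for a 3-3-2 network (one weight row per node)
W_hidden = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2], [-0.3, 0.8, 0.6]]
b_hidden = [0.0, 0.1, -0.1]
W_out = [[0.7, -0.5, 0.2], [-0.6, 0.4, 0.9]]
b_out = [0.05, -0.05]


def forward(x):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(x, W_hidden, b_hidden)
    return dense(hidden, W_out, b_out)


outputs = forward([1.0, 0.5, -1.5])
print(outputs)
```

Every hidden node receives all three inputs and every output node receives all three hidden activations, which is what "fully connected" means here.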
Exercise-6 Object detection algorithms are broadly classified into two categories based on how state-of-theart results, beating other real-time object detection algorithms by a large
many times the same input image is passed through a network. margin.
Module name : Advanced Deep Learning Architectures
Exercise: Implement Object Detection Using YOLO. Single-shot object detection While algorithms like Faster RCNN work by detecting possible regions of interest
using the Region Proposal Network and then performing recognition on those
Introduction Single-shot object detection uses a single pass of the input image to make regions separately, YOLO performs all of its predictions with the help of a single
predictions about the presence and location of objects in the image. It processes an fully connected layer.
Object detection is a popular task in computer vision. entire image in a single pass, making them computationally efficient.
It deals with localizing a region of interest within an image and classifying this Methods that use Region Proposal Networks perform multiple iterations for the
region like a typical image classifier. One image can include several regions of However, single-shot object detection is generally less accurate than other methods, same image, while YOLO gets away with a single iteration.
interest pointing to different objects. This makes object detection a more advanced and it’s less effective in detecting small objects. Such algorithms can be used to
problem of image classification. detect objects in real time in resource-constrained environments. Several new versions of the same model have been proposed since the initial release
of YOLO in 2015, each building on and improving its predecessor. Here's a
YOLO (You Only Look Once) is a popular object detection model known for its YOLO is a single-shot detector that uses a fully convolutional neural network timeline showcasing YOLO's development in recent years.
speed and accuracy. It was first introduced by Joseph Redmon et al. in 2016 and has (CNN) to process an image. We will dive deeper into the YOLO model in the next
since undergone several iterations, the latest being YOLO v7. section.
What is YOLO?
You Only Look Once (YOLO) proposes using an end-to-end neural network that
makes predictions of bounding boxes and class probabilities all at once. It differs
from the approach taken by previous object detection algorithms, which repurposed
classifiers to perform detection.
Following a fundamentally different approach to object detection, YOLO achieved The first 20 convolution layers of the model are pre-trained using ImageNet by
28 29 30
plugging in a temporary average pooling and fully connected layer. Then, this pre- Source code: center_x = int(detection[0] * width)
center_y = int(detection[1] * height)
trained model is converted to perform detection since previous research showcased w = int(detection[2] * width)
that adding convolution and connected layers to a pre-trained network improves import cv2 h = int(detection[3] * height)
performance. YOLO’s final fully connected layer predicts both class probabilities import numpy as np
from google.colab.patches import cv2_imshow # Rectangle coordinates
and bounding box coordinates.
x = int(center_x - w / 2)
# Load Yolo y = int(center_y - h / 2)
YOLO divides an input image into an S × S grid. If the center of an object falls into a grid cell, that grid cell is responsible for detecting that object. Each grid cell predicts B bounding boxes and confidence scores for those boxes. These confidence scores reflect how confident the model is that the box contains an object and how accurate it thinks the predicted box is.

YOLO predicts multiple bounding boxes per grid cell. At training time, we only want one bounding box predictor to be responsible for each object. YOLO assigns one predictor to be “responsible” for predicting an object based on which prediction has the highest current IOU with the ground truth. This leads to specialization between the bounding box predictors. Each predictor gets better at forecasting certain sizes, aspect ratios, or classes of objects, improving the overall recall score.

One key technique used in the YOLO models is non-maximum suppression (NMS). NMS is a post-processing step that is used to improve the accuracy and efficiency of object detection. In object detection, it is common for multiple bounding boxes to be generated for a single object in an image. These bounding boxes may overlap or be located at different positions, but they all represent the same object. NMS is used to identify and remove redundant or incorrect bounding boxes and to output a single bounding box for each object in the image. Now, let us look into the improvements that the later versions of YOLO have brought to the parent model.

print("LOADING YOLO")
net = cv2.dnn.readNet("/content/drive/MyDrive/4-1 DL lab/yolov3.weights",
                      "/content/drive/MyDrive/4-1 DL lab/yolov3.cfg")

# Save all the names in file to the list classes
classes = []
with open("/content/drive/MyDrive/4-1 DL lab/coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

# Get layers of the network
layer_names = net.getLayerNames()

# Determine the output layer names from the YOLO model
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
print("YOLO LOADED")

# Capture frame-by-frame
img = cv2.imread("/content/drive/MyDrive/4-1 DL lab/sample2.jpg")
img = cv2.resize(img, None, fx=0.4, fy=0.4)
height, width, channels = img.shape
print("Input Image:")
cv2_imshow(img)

# Using blob function of opencv to preprocess image
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)

# Detecting objects
net.setInput(blob)
outs = net.forward(output_layers)

            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# We use NMS function in opencv to perform Non-maximum Suppression
# we give it score threshold and nms threshold as arguments.
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
colors = np.random.uniform(0, 255, size=(len(classes), 3))
for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = colors[class_ids[i]]

        # Draw the bounding box
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)

        # Draw the label
        cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 1/2, color, 2)
        print(f"Detected object: {label}, confidence: {confidences[i]}, box: {x}, {y}, {w}, {h}")

# Display the image
cv2_imshow(img)
cv2.waitKey(0)
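The IOU measure and the non-maximum suppression step described in this exercise can be sketched in plain NumPy. This is only a minimal illustration of the greedy idea, not a description of what cv2.dnn.NMSBoxes does internally. The helper names iou and nms, the [x, y, w, h] box layout, and the sample boxes and scores are assumptions made for the example.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, w, h]."""
    ax2, ay2 = box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx2, by2 = box_b[0] + box_b[2], box_b[1] + box_b[3]
    # Width and height of the overlapping rectangle (zero if disjoint)
    inter_w = max(0, min(ax2, bx2) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(ay2, by2) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much."""
    # Candidate indices sorted by descending score, low scores filtered out
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= score_threshold]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        # Discard remaining boxes whose overlap with the kept box is too high
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# Hypothetical detections: the first two boxes cover the same object
boxes = [[100, 100, 50, 50], [105, 102, 50, 50], [300, 200, 40, 60]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print("Kept box indices:", kept)
```

In this example the second box overlaps the first with an IOU of about 0.76, which is above the 0.4 threshold, so only the first and third boxes survive.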
Output:

Detected object: car, confidence: 0.9670625925064087, box: 144, 92, 87, 100
Detected object: person, confidence: 0.975329577922821, box: 69, 76, 131, 213
Detected object: bicycle, confidence: 0.9977372884750366, box: 68, 189, 163, 212

Exercise-7

Module name : Optimization of Training in Deep Learning
Exercise: Design a Deep Learning Network for Robust Bi-Tempered Logistic Loss.

Introduction

Logistic loss
In order to understand the bi-tempered logistic loss, let's recap the standard version of this loss function. Logistic loss is the most common loss function used for training binary classifiers. You can look up the exact formula and other details here. The plot below displays logistic loss responses for different network predictions:

[Plot: logistic loss as a function of the network prediction]

As can be seen, the loss is zero if the response is correct (the activation value is equal to 1), and grows steeply as the error increases, tending to infinity as the activation approaches zero (i.e. the loss is an unbounded function of error). This shape results in large errors being much more heavily penalized than relatively small ones (e.g. the error for 0.45 is seven times greater than the one for 0.9).

In the context of noisy data, it's reasonable to expect that the network will produce huge errors for incorrectly labelled (i.e. noisy) data. As the majority of data annotations are correct, the network progressively learns the general function underlying the data. When encountering a mislabelled sample, a well-trained network will produce a correct label that doesn't match the wrong annotation, which causes the error to be very large. If the losses in such scenarios are allowed to grow infinitely large, the training may become unstable, as the network will tend to "overreact" to such misclassifications (in extreme cases, this might lead to exploding gradients).

An important note on naming: logistic loss is the same as binary cross-entropy. Cross-entropy is a general term applicable to multiclass classification, but you might encounter the same function called logistic loss in the literature, which is not entirely correct as the logistic loss is for binary classification only.

Softmax
Logistic loss works by comparing the expected probability distribution to the network response. In order for this to be feasible, it has to be ensured that the network outputs can be interpreted as a probability distribution over multiple classes, i.e. individual elements are from the range of [0,1] and add up to one. To meet this requirement, at the end of a typical classification network, a softmax layer is used.

Softmax is a pretty straightforward function that normalizes a vector of numbers in such a way as to conform to the aforementioned probability axioms. The normalization uses the exponential function, so it doesn't preserve relative ratios between individual elements. For the exact formula and more details, you can refer to this article.

The plots below show the results of passing a set of neuron activations through the softmax function.

[Plots: neuron activations before and after the softmax transformation]

Notice how all the values became positive, they sum up to 1, and the relative ordering of the values is preserved, i.e. if a given value is the 3rd largest in the
input vector, it stays at this position after the transformation. If we consider this set of 10 neurons to be an output layer of a classifier network, we can read from the second plot that the network considers the fourth class (denoted with "3") to be the most probable, with the probability of around 30%. The ninth class, with the lowest value prior to the transformation, stays the least probable.

Bi-tempered logistic loss
Bi-tempered logistic loss is a modification of the standard logistic loss.

The authors propose a slightly altered, parametric version of the typical logistic loss. The new function has some desirable features which make it useful for training neural networks that are more robust against noisy data. This means that imperfections in the training data don't impact the model quality as much as they would normally if the network was trained with the typical loss.

The function differs from its standard counterpart by being parametric. Under the hood, it uses parametric alternatives to the logarithm and softmax functions, called tempered logarithm and tempered exponential functions. Let's look into each of them to build an understanding of their impact on the final loss function.

Procedure:

Designing a deep learning network involves several steps, and incorporating a custom loss function like the Bi-Tempered Logistic Loss requires careful consideration. Below, I'll outline the steps you can take to design a deep learning network using the Bi-Tempered Logistic Loss:

### Step 1: Define the Network Architecture
Choose an architecture suitable for your specific task. This could be a Convolutional Neural Network (CNN) for computer vision tasks, a Recurrent Neural Network (RNN) for sequential data, or a feedforward neural network for simpler tasks. Make sure to tailor the architecture to the requirements of your problem.

### Step 2: Import Libraries and Modules

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.models import Sequential
```

### Step 3: Create the Network
```python
def create_network(input_shape, num_classes):
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax')
    ])
    return model
```
Replace this architecture with one that suits your specific problem.

### Step 4: Define Bi-Tempered Logistic Loss
You'll need to define the Bi-Tempered Logistic Loss function. Make sure to define the loss and the associated gradient.

```python
def bi_tempered_logistic_loss(y_true, y_pred, t1=0.8, t2=1.2, label_smoothing=0.1):
    """
    Bi-Tempered Logistic Loss.
    """
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
    temp1 = (1 - y_true) * (y_pred ** (t1 - 1)) / t1
    temp2 = (1 - y_true) * ((1 - y_pred) ** (t2 - 1)) / t2
    loss_values = temp1 + temp2

    loss = tf.reduce_sum(loss_values, axis=-1)
    loss = tf.reduce_mean(loss)
    return loss

```
### Step 5: Compile the Model
```python
model = create_network(input_shape, num_classes)
model.compile(optimizer='adam', loss=bi_tempered_logistic_loss,
              metrics=['accuracy'])
```

### Step 6: Train the Model
```python
model.fit(x_train, y_train, epochs=num_epochs, batch_size=batch_size,
          validation_data=(x_val, y_val))
```
### Step 7: Evaluate the Model
```python
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_accuracy*100:.2f}%')
```

### Step 8: Fine-tuning (Optional)
Depending on the performance of your model, you may need to fine-tune the architecture, hyperparameters, or even experiment with different variants of the Bi-Tempered Logistic Loss.

Remember to preprocess your data appropriately, use data augmentation if necessary, and adjust hyperparameters like learning rate, batch size, etc., based on your specific problem.

This is a basic template to get you started. You might need to customize and adapt these steps based on the specific details of your problem, dataset, and requirements.

Source code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Step 1: Define a simple network architecture
def create_network(input_shape):
    model = Sequential([
        Dense(64, activation='relu', input_shape=(input_shape,)),
        Dense(1, activation='sigmoid')  # Added the sigmoid activation here
    ])
    return model
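The behaviour of the logistic loss and of softmax described in the introduction to this exercise can be checked numerically. The sketch below is a minimal NumPy illustration; the function names logistic_loss and softmax and the sample activations are assumptions chosen for the example and are not part of the lab code.

```python
import numpy as np

def logistic_loss(p):
    """Loss for a sample whose true class received probability p."""
    return -np.log(p)

def softmax(z):
    """Normalize a vector of activations into a probability distribution."""
    e = np.exp(z - np.max(z))  # subtracting the max improves numerical stability
    return e / e.sum()

# Large errors are penalized far more heavily than small ones
ratio = logistic_loss(0.45) / logistic_loss(0.9)
print(f"loss(0.45) / loss(0.9) = {ratio:.2f}")

# Hypothetical activations, including negative values
activations = np.array([-1.0, 0.5, 2.0, 0.0])
probs = softmax(activations)
print("probabilities:", probs, "sum:", probs.sum())

# The ranking of the elements is the same before and after softmax
print("same ordering:", np.array_equal(np.argsort(activations), np.argsort(probs)))
```

The ratio comes out at roughly 7.6, in line with the "seven times" figure quoted in the text, and the softmax output is positive, sums to one, and keeps the ordering of the inputs.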
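As a reference point for Step 4, the tempered logarithm and tempered exponential mentioned in the introduction are usually defined as log_t(x) = (x^(1-t) - 1) / (1 - t) and exp_t(x) = max(0, 1 + (1 - t) x)^(1 / (1 - t)), both reducing to the ordinary logarithm and exponential as t approaches 1. The sketch below is a minimal NumPy illustration under these definitions. The names log_t, exp_t and tempered_loss are assumptions, and for simplicity the loss takes already-normalized probabilities, so it covers only the first temperature (t1) and does not compute the tempered softmax controlled by t2.

```python
import numpy as np

def log_t(x, t):
    """Tempered logarithm; equals np.log(x) when t == 1."""
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential; equals np.exp(x) when t == 1."""
    if t == 1.0:
        return np.exp(x)
    return np.maximum(0.0, 1.0 + (1.0 - t) * x) ** (1.0 / (1.0 - t))

def tempered_loss(y_true, probs, t1):
    """Tempered loss between one-hot labels and predicted probabilities."""
    eps = 1e-10  # avoids log_t(0) for the zero entries of a one-hot label
    term1 = y_true * (log_t(y_true + eps, t1) - log_t(probs, t1))
    term2 = (y_true ** (2.0 - t1) - probs ** (2.0 - t1)) / (2.0 - t1)
    return np.sum(term1 - term2, axis=-1)

# A badly misclassified (or mislabelled) sample: true class gets ~0 probability
y = np.array([1.0, 0.0])
p = np.array([1e-6, 1.0 - 1e-6])

standard = tempered_loss(y, p, t1=1.0)   # ordinary cross-entropy
tempered = tempered_loss(y, p, t1=0.8)   # bounded variant
print(f"t1=1.0: {standard:.3f}   t1=0.8: {tempered:.3f}")
```

With t1 = 1 the function reduces to the standard cross-entropy, while t1 < 1 keeps the loss bounded for such samples, which is the property that makes the tempered version less sensitive to noisy labels.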
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def alexnet_model(input_shape, num_classes):
    model = Sequential([
        # First Convolutional Layer
        Conv2D(96, (11,11), strides=(4,4), activation='relu',
               input_shape=input_shape, padding='valid'),
        MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'),

# Create an instance of the model
model = alexnet_model(input_shape, num_classes)

# Print the model summary
model.summary()
Exercise-10

Module name : Advanced GANs
Exercise: Demonstration of GAN

Introduction

Generative Adversarial Networks (GANs) are a type of deep learning model used for generating new data samples that are similar to a given dataset. GANs consist of two neural networks, a generator and a discriminator, which are trained simultaneously through adversarial training.

In this example, I'll demonstrate how to create a simple GAN to generate handwritten digits (similar to those in the MNIST dataset).

In this example, we:
• Load the MNIST dataset and preprocess it.
• Define the generator, discriminator, and the GAN.
• Compile the discriminator and the GAN.
• Train the GAN in a loop.

After training, you'll see generated images saved in the current directory with filenames like gan_generated_image_epoch_X.png.

Keep in mind that this is a very basic GAN example. Real-world GAN applications require more advanced architectures, training techniques, and careful hyperparameter tuning.

Source code:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, LeakyReLU, BatchNormalization, Reshape, Flatten, Input
from tensorflow.keras.optimizers import Adam

# Load MNIST data
(X_train, _), (_, _) = mnist.load_data()
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_train = X_train.reshape(X_train.shape[0], 784)

# Define GAN architecture
def build_generator(latent_dim):
    model = Sequential([
        Dense(128, input_dim=latent_dim),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(256),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(512),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(784, activation='tanh')
    ])
    return model

def build_discriminator(image_shape):
    model = Sequential([
        Dense(512, input_dim=image_shape),
        LeakyReLU(alpha=0.2),
        Dense(256),
        LeakyReLU(alpha=0.2),
        Dense(1, activation='sigmoid')
    ])
    return model

def build_gan(generator, discriminator):
    model = Sequential([generator, discriminator])
    return model

# Define hyperparameters
latent_dim = 100
image_shape = 784

# Build and compile the discriminator
discriminator = build_discriminator(image_shape)
discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5),
                      metrics=['accuracy'])

# Build and compile the generator
generator = build_generator(latent_dim)
discriminator.trainable = False
gan = build_gan(generator, discriminator)
gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

# Define training loop
def train_gan(epochs=1, batch_size=128):
    batch_count = X_train.shape[0] // batch_size

    for e in range(epochs):
        for _ in range(batch_count):
            noise = np.random.normal(0, 1, size=[batch_size, latent_dim])
            generated_images = generator.predict(noise)
            image_batch = X_train[np.random.randint(0, X_train.shape[0], size=batch_size)]

            # Real images first (smoothed label 0.9), generated images second (label 0)
            X = np.concatenate([image_batch, generated_images])
            y_dis = np.zeros(2 * batch_size)
            y_dis[:batch_size] = 0.9

            discriminator.trainable = True
            d_loss = discriminator.train_on_batch(X, y_dis)

            noise = np.random.normal(0, 1, size=[batch_size, latent_dim])
            y_gen = np.ones(batch_size)
            discriminator.trainable = False
            g_loss = gan.train_on_batch(noise, y_gen)

        print(f"Epoch {e} - D loss: {d_loss[0]} | D accuracy: {100 * d_loss[1]} | G loss: {g_loss}")

        if e % 1 == 0:
            plot_generated_images(e, generator, latent_dim)

# Function to plot generated images
def plot_generated_images(epoch, generator, latent_dim, examples=10, dim=(1, 10),
                          figsize=(10, 1)):
    noise = np.random.normal(0, 1, size=[examples, latent_dim])
    generated_images = generator.predict(noise)
    generated_images = generated_images.reshape(examples, 28, 28)

    plt.figure(figsize=figsize)
    for i in range(generated_images.shape[0]):
        plt.subplot(dim[0], dim[1], i+1)
        # Draw each generated digit in its own subplot, without axes
        plt.imshow(generated_images[i], interpolation='nearest', cmap='gray_r')
        plt.axis('off')
    plt.tight_layout()
    plt.savefig(f'gan_generated_image_epoch_{epoch}.png')

# Run the training loop with the default settings
train_gan(epochs=1, batch_size=128)
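The preprocessing step in this exercise maps pixel values from [0, 255] to [-1, 1], which matches the range of the generator's tanh output. A minimal NumPy sketch of that mapping and its inverse (useful when generated images should be stored as ordinary 8-bit pictures) is shown below. The helper names to_model_range and to_pixel_range and the sample pixel values are assumptions introduced for illustration.

```python
import numpy as np

def to_model_range(pixels):
    """Scale 8-bit pixel values from [0, 255] to [-1, 1]."""
    return (pixels.astype(np.float32) - 127.5) / 127.5

def to_pixel_range(values):
    """Map values in [-1, 1] (e.g. tanh outputs) back to 8-bit pixels."""
    restored = np.rint(values * 127.5 + 127.5)
    return np.clip(restored, 0, 255).astype(np.uint8)

# Hypothetical pixel values covering the full range
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)
scaled = to_model_range(pixels)
restored = to_pixel_range(scaled)

print("scaled:", scaled)
print("restored:", restored)
```

Rounding before the cast keeps the round trip exact for 8-bit inputs, and the clip guards against generator outputs that drift slightly outside [-1, 1] after further processing.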
Output:
B.Tony (21HP1A0562)
ALIET, Vijayawada