DL Lab Manual Full - Pagenumber

Uploaded by

durainainar6789
Copyright
© © All Rights Reserved

DEEP LEARNING LAB MANUAL

Exercise-1

Build a Convolution Neural Network for Image Recognition

Introduction

The various deep learning methods use data to train neural network algorithms to do a variety of machine learning tasks, such as the classification of different classes of objects. Convolutional neural networks are deep learning algorithms that are very powerful for the analysis of images. This exercise explains how to construct, train and evaluate convolutional neural networks.

You will also learn how to improve their ability to learn from data, and how to interpret the results of the training. Deep Learning has various applications like image processing, natural language processing, etc. It is also used in Medical Science, Media & Entertainment, Autonomous Cars, etc.

What is CNN?

CNN is a powerful algorithm for image processing. These algorithms are currently the best algorithms we have for the automated processing of images. Many companies use these algorithms to do things like identifying the objects in an image. Color images contain a combination of red, green, and blue (RGB) data. Matplotlib can be used to import an image into memory from a file. The computer doesn't see an image; all it sees is an array of numbers. Color images are stored in 3-dimensional arrays. The first two dimensions correspond to the height and width of the image (the number of pixels). The last dimension corresponds to the red, green, and blue colors present in each pixel.

Three Layers of CNN

Convolutional Neural Networks are specialized for applications in image & video recognition. CNN is mainly used in image analysis tasks like Image recognition, Object detection & Segmentation.

There are three types of layers in Convolutional Neural Networks:

1) Convolutional Layer: In a typical neural network each input neuron is connected to the next hidden layer. In CNN, only a small region of the input layer neurons connects to each neuron of the hidden layer.

2) Pooling Layer: The pooling layer is used to reduce the dimensionality of the feature map. There will be multiple activation & pooling layers inside the hidden layer of the CNN.

3) Fully-Connected layer: Fully Connected Layers form the last few layers in the network. The input to the fully connected layer is the output from the final Pooling or Convolutional Layer, which is flattened and then fed into the fully connected layer.

Source code:

import tensorflow as tf
import tensorflow_datasets as tfds

# Load the CIFAR-10 dataset
dataset, info = tfds.load('cifar10', as_supervised=True, with_info=True)
train_dataset, test_dataset = dataset['train'], dataset['test']

# Normalize the images and shuffle the training data
def preprocess_image(image, label):
    image = tf.image.resize(image, (32, 32))
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

BATCH_SIZE = 32
train_dataset = train_dataset.map(preprocess_image).shuffle(50000).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
test_dataset = test_dataset.map(preprocess_image).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

# Define the CNN model
def build_cnn_model(input_shape, num_classes):
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(num_classes, activation='softmax')
    ])
    return model

# Define the input shape and number of classes
input_shape = (32, 32, 3)
num_classes = 10

# Build the CNN model
model = build_cnn_model(input_shape, num_classes)

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
epochs = 10
model.fit(train_dataset, epochs=epochs)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc}")

Output:

Test accuracy: 0.6983000040054321
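To see how the convolution and pooling layers of Exercise-1 shrink the feature map before the fully connected layers, the arithmetic can be traced by hand. The sketch below is only an illustration: the helper names conv_out and pool_out are made up here, and it assumes the Keras defaults that the listing relies on (3x3 kernels, stride 1, no padding for Conv2D, and non-overlapping 2x2 windows for MaxPooling2D).

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution (assumed 'valid' padding, stride 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max pooling."""
    return size // window


size = 32                      # CIFAR-10 images are 32 x 32 pixels
trace = [size]
for filters in (32, 64, 128):  # filter counts of the three Conv2D layers
    size = conv_out(size)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

# The last Conv2D layer has 128 filters, so Flatten() yields size * size * 128 values
flat_features = size * size * 128

print(trace)          # [32, 30, 15, 13, 6, 4, 2]
print(flat_features)  # 512
```

Under these assumptions the Dense(128) layer receives 512 inputs per image, which is why most of the trainable weights in this small network sit in the convolutional filters rather than in the fully connected part.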

Exercise-2

Design Artificial Neural Networks for Identifying and Classifying an actor using Kaggle Dataset.

Link for Kaggle Dataset : insurance.csv

Introduction

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains.

An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals then processes them and can signal neurons connected to it. The "signal" at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold.

Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.

How artificial neural network is used for classification?

Classification ANNs seek to classify an observation as belonging to some discrete class as a function of the inputs. The input features (independent variables) can be categorical or numeric types, however, we require a categorical feature as the dependent variable.

Create Simple Deep Learning Neural Network for Classification

1. Load and explore image data.
2. Define the neural network architecture.
3. Specify training options.
4. Train the neural network.
5. Predict the labels of new data and calculate the classification accuracy.

A neural network is a method in artificial intelligence that teaches computers to process data in a way that is inspired by the human brain. It is a type of machine learning process, called deep learning, that uses interconnected nodes or neurons in a layered structure that resembles the human brain.

• We have worked on predicting insurance cost based on the features provided in the data set
• We wanted to test the dataset with different regression models
• We have also done feature engineering initially and then exploratory analysis to build an understanding of the relationship between variables.

Source code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno

# Load the insurance dataset
df = pd.read_csv('/content/sample_data/insurance (1).csv')

# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(df.head())

# Set the figure size for the missing values matrix plot
plt.figure(figsize=(8,5))

# Visualize the missing values in the dataset using a matrix plot
print("\nVisualizing missing values in the dataset:")
msno.matrix(df)
plt.show()

# Count the number of missing values in each column (if any)
print("\nCounting missing values in each column:")
print(df.isnull().sum())

# Print a summary of the dataset, including data types and non-null counts
print("\nSummary of the dataset (data types and non-null counts):")
print(df.info())

# Get summary statistics for numerical columns in the dataset
print("\nSummary statistics of numerical columns:")
print(df.describe())

# Display the first few rows again to check data after initial analysis
print("\nFirst few rows of the dataset after initial analysis:")
print(df.head())

# Convert the 'sex' column into a dummy variable and drop the first category ('female')
Male = pd.get_dummies(df['sex'], drop_first=True)

# Add the dummy variable to the original dataframe
df = pd.concat([df, Male], axis=1)

# Convert the 'smoker' column into a dummy variable and drop the first category ('no')
Smoker = pd.get_dummies(df['smoker'], drop_first=True)

# Add the dummy variable to the original dataframe
df = pd.concat([df, Smoker], axis=1)

# Rename the 'yes' column to 'Smoker' for clarity
df = df.rename(columns={'yes':'Smoker'})

# Get the unique values from the 'region' column
print("\nUnique values in the 'region' column:")
print(df['region'].unique())

# Convert the 'region' column into dummy variables for each region
region = pd.get_dummies(df['region'])

# Add the dummy variables to the original dataframe
df = pd.concat([df, region], axis=1)

# Display the first few rows again to check the dataframe after encoding
print("\nFirst few rows of the dataset after encoding categorical variables:")
print(df.head())

# Set the figure size for the plot
plt.figure(figsize=(8,4))

# Create a count plot for the 'sex' column with a specified color palette
print("\nCount plot for the 'sex' column:")

sns.set_style('white')
sns.countplot(x='sex', data=df, palette='GnBu')
sns.despine(left=True)
plt.show()

# Set the figure size for the next plot
plt.figure(figsize=(8,4))

# Create a boxplot for 'charges' based on 'sex' and 'Smoker' status, with a specified color palette
print("\nBoxplot of 'charges' by 'sex' and 'Smoker' status:")
sns.set_style('white')
sns.boxplot(x='sex', y='charges', data=df, palette='OrRd', hue='Smoker')
sns.despine(left=True)
plt.show()

# Create subplots for multiple scatter plots
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(12,5))

# Scatter plot for 'age' vs 'charges', colored by 'sex'
print("\nScatter plot of 'age' vs 'charges' by 'sex':")
sns.scatterplot(x='age', y='charges', data=df, palette='coolwarm', hue='sex', ax=ax[0])

# Scatter plot for 'age' vs 'charges', colored by 'Smoker'
print("\nScatter plot of 'age' vs 'charges' by 'Smoker' status:")
sns.scatterplot(x='age', y='charges', data=df, palette='GnBu', hue='Smoker', ax=ax[1])

# Scatter plot for 'age' vs 'charges', colored by 'region'
print("\nScatter plot of 'age' vs 'charges' by 'region':")
sns.scatterplot(x='age', y='charges', data=df, palette='magma_r', hue='region', ax=ax[2])

# Set the style for the seaborn plots to 'dark'
sns.set_style('dark')
sns.despine(left=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()

# Create subplots for multiple boxplots
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12,5))

# Boxplot for 'region' vs 'charges', colored by 'Smoker' status
print("\nBoxplot of 'charges' by 'region' and 'Smoker' status:")
sns.boxplot(x='region', y='charges', data=df, palette='GnBu', hue='Smoker', ax=ax[0])

# Boxplot for 'region' vs 'charges', colored by 'sex'
print("\nBoxplot of 'charges' by 'region' and 'sex':")
sns.boxplot(x='region', y='charges', data=df, palette='coolwarm', hue='sex', ax=ax[1])
plt.show()

# Create subplots for multiple scatter plots
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12,5))

# Scatter plot for 'bmi' vs 'charges', colored by 'sex'
print("\nScatter plot of 'bmi' vs 'charges' by 'sex':")
sns.scatterplot(x='bmi', y='charges', data=df, palette='GnBu_r', hue='sex', ax=ax[0])

# Scatter plot for 'bmi' vs 'charges', colored by 'Smoker' status
print("\nScatter plot of 'bmi' vs 'charges' by 'Smoker' status:")
sns.scatterplot(x='bmi', y='charges', data=df, palette='magma', hue='Smoker', ax=ax[1])

sns.set_style('dark')
sns.despine(left=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show()

# Drop the columns 'sex', 'region', 'smoker', and 'southwest' from the dataframe
df.drop(['sex', 'region', 'smoker', 'southwest'], axis=1, inplace=True)

# Display the first few rows again to check the dataframe after dropping columns
print("\nFirst few rows of the dataset after dropping unnecessary columns:")
print(df.head())

# Set the figure size for the heatmap plot
plt.figure(figsize=(10,4))

# Create a heatmap of the correlation matrix of the dataframe, with a specified color map
print("\nHeatmap of the correlation matrix:")
sns.heatmap(df.corr(), cmap='OrRd')
plt.show()

# Define the feature matrix X (all columns except 'charges') and target variable y ('charges')
X = df.drop('charges', axis=1)
y = df['charges']

# Import the train_test_split function to split the dataset into training and testing sets
from sklearn.model_selection import train_test_split

# Split the dataset into training and testing sets (75% training, 25% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Import the MinMaxScaler to scale the features
from sklearn.preprocessing import MinMaxScaler

# Initialize the scaler and fit it on the training data
scaler = MinMaxScaler()
scaler.fit(X_train)

# Transform the training data using the fitted scaler
X_train = scaler.transform(X_train)

# Transform the test data using the fitted scaler
X_validate = scaler.transform(X_test)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Initialize a sequential model
model = Sequential()

# Add the first hidden layer with 8 units and ReLU activation
model.add(Dense(units = 8, activation = 'relu'))

# Add the second hidden layer with 3 units and ReLU activation
model.add(Dense(units = 3, activation = 'relu'))

# Add the output layer with 1 unit (since we're predicting a single value)
model.add(Dense(units = 1))

# Compile the model using the Adam optimizer and Mean Squared Error as the loss function
model.compile(optimizer = 'adam', loss = 'mse')

# Define early stopping to prevent overfitting, stopping if validation loss doesn't improve for 15 epochs
early_stop = EarlyStopping(monitor='val_loss', mode= 'min', verbose= 0, patience=15)

# Train the model with the training data, validate on the scaled test data
# (X_validate, so that validation uses the same feature scale as training), and use early stopping
model.fit(x=X_train, y=y_train, epochs = 2000, validation_data=(X_validate, y_test),
          batch_size=128, callbacks=[early_stop])

# Convert the training history to a DataFrame and plot the loss over epochs
loss = pd.DataFrame(model.history.history)
loss.plot()
plt.title("Model Loss Over Epochs") # Add a title to the plot
plt.xlabel("Epochs") # Add label for the x-axis
plt.ylabel("Loss") # Add label for the y-axis
plt.show()

from sklearn.metrics import mean_squared_error

# Make predictions on the test data (scaled with the same scaler as the training data)
pred = model.predict(X_validate)

# Calculate and print the Root Mean Squared Error (RMSE) for the test set predictions
rmse_test = np.sqrt(mean_squared_error(y_test, pred))
print(f"Root Mean Squared Error(RMSE) on Test Data: {rmse_test}")

# Select a subset of the data for further predictions (dropping the 'charges' column)
# and scale it the same way as the training features
entry_1 = scaler.transform(df.iloc[257:477].drop('charges', axis=1))

# Make predictions on the selected subset
pred = model.predict(entry_1)

# Calculate and print the RMSE for this subset of data
rmse_entry_1 = np.sqrt(mean_squared_error(df.iloc[257:477]['charges'], pred))
print(f"Root Mean Squared Error(RMSE) on Subset Data: {rmse_entry_1}")

Output:

First few rows of the dataset:
   age     sex     bmi  children smoker     region      charges
0   19  female  27.900         0    yes  southwest  16884.92400
1   18    male  33.770         1     no  southeast   1725.55230
2   28    male  33.000         3     no  southeast   4449.46200
3   33    male  22.705         0     no  northwest  21984.47061
4   32    male  28.880         0     no  northwest   3866.85520

Visualizing missing values in the dataset:
<Figure size 800x500 with 0 Axes>

Counting missing values in each column:
age         0
sex         0
bmi         0
children    0
smoker      0
region      0
charges     0
dtype: int64

Summary of the dataset (data types and non-null counts):
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1338 entries, 0 to 1337
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   age       1338 non-null   int64
 1   sex       1338 non-null   object
 2   bmi       1338 non-null   float64
 3   children  1338 non-null   int64
 4   smoker    1338 non-null   object
 5   region    1338 non-null   object
 6   charges   1338 non-null   float64
dtypes: float64(2), int64(2), object(3)
memory usage: 73.3+ KB
None

Summary statistics of numerical columns:
               age          bmi     children       charges
count  1338.000000  1338.000000  1338.000000   1338.000000
mean     39.207025    30.663397     1.094918  13270.422265
std      14.049960     6.098187     1.205493  12110.011237
min      18.000000    15.960000     0.000000   1121.873900
25%      27.000000    26.296250     0.000000   4740.287150
50%      39.000000    30.400000     1.000000   9382.033000
75%      51.000000    34.693750     2.000000  16639.912515
max      64.000000    53.130000     5.000000  63770.428010

First few rows of the dataset after initial analysis:
   age     sex     bmi  children smoker     region      charges
0   19  female  27.900         0    yes  southwest  16884.92400
1   18    male  33.770         1     no  southeast   1725.55230
2   28    male  33.000         3     no  southeast   4449.46200
3   33    male  22.705         0     no  northwest  21984.47061
4   32    male  28.880         0     no  northwest   3866.85520

Unique values in the 'region' column:
['southwest' 'southeast' 'northwest' 'northeast']

First few rows of the dataset after encoding categorical variables:
   age     sex     bmi  children smoker     region      charges   male  \
0   19  female  27.900         0    yes  southwest  16884.92400  False
1   18    male  33.770         1     no  southeast   1725.55230   True
2   28    male  33.000         3     no  southeast   4449.46200   True
3   33    male  22.705         0     no  northwest  21984.47061   True
4   32    male  28.880         0     no  northwest   3866.85520   True

   Smoker  northeast  northwest  southeast  southwest
0    True      False      False      False       True
1   False      False      False       True      False
2   False      False      False       True      False
3   False      False       True      False      False
4   False      False       True      False      False

Count plot for the 'sex' column:
<ipython-input-13-b2a8a110e2e3>:73: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.countplot(x='sex', data=df, palette='GnBu')

Boxplot of 'charges' by 'sex' and 'Smoker' status:

Scatter plot of 'age' vs 'charges' by 'sex':

Scatter plot of 'age' vs 'charges' by 'Smoker' status:

Scatter plot of 'age' vs 'charges' by 'region':

Boxplot of 'charges' by 'region' and 'Smoker' status:

Boxplot of 'charges' by 'region' and 'sex':

Scatter plot of 'bmi' vs 'charges' by 'sex':

Scatter plot of 'bmi' vs 'charges' by 'Smoker' status:

First few rows of the dataset after dropping unnecessary columns:
   age     bmi  children      charges   male  Smoker  northeast  northwest  \
0   19  27.900         0  16884.92400  False    True      False      False
1   18  33.770         1   1725.55230   True   False      False      False
2   28  33.000         3   4449.46200   True   False      False      False
3   33  22.705         0  21984.47061   True   False      False       True
4   32  28.880         0   3866.85520   True   False      False       True

   southeast
0      False
1       True
2       True
3      False
4      False

Heatmap of the correlation matrix:

Root Mean Squared Error(RMSE) on Test Data: 12304.159184071554

Root Mean Squared Error(RMSE) on Subset Data: 11313.821472559966

Exercise-3

Module name : Understanding and Using CNN : Image recognition

Design a CNN for Image Recognition which includes hyperparameter tuning.

Introduction

Deep Learning models have important applications in image processing. However, one of the challenges in this field is the definition of hyperparameters. Thus, the objective of this work is to propose a rigorous methodology for hyperparameter tuning of Convolutional Neural Network for building construction image classification.

Neural Network Hyperparameters (Deep Learning)

Neural network hyperparameters are like settings you choose before teaching a neural network to do a task. They control things like how many layers the network has, how quickly it learns, and how it adjusts its internal values. Picking the right hyperparameters is important to help the network learn effectively and solve the task accurately. It's a bit like adjusting the knobs on a machine to make it work just right for a particular job.

A Neural Network is a Deep Learning technique that builds a model from training data in order to predict unseen data, using many layers consisting of neurons. This is similar to other Machine Learning algorithms, except for the use of multiple layers. The use of multiple layers is what makes it Deep Learning. Instead of building the model in one line, as with many Machine Learning algorithms, a Neural Network requires users to build the architecture before compiling it into a model. Users have to decide how many layers and how many nodes or neurons to build. This is not found in other conventional Machine Learning algorithms.

Different datasets require different sets of hyperparameters to predict accurately. But the large number of hyperparameters makes it difficult for users to decide which ones to choose. There is no answer to how many layers are the most suitable, how many neurons are the best, or which optimizer suits all datasets best. Hyperparameter tuning is important to find the best possible sets of hyperparameters
to build the model from a specific dataset.

Here we will demonstrate the process of tuning two things in a Neural Network: (1) the hyperparameters and (2) the layers. I find tutorials on the latter harder to find than tutorials on the former. The first one is the same as for other conventional Machine Learning algorithms. The hyperparameters to tune are the number of neurons, activation function, optimizer, learning rate, batch size, and epochs. The second step is to tune the number of layers. This is what other conventional algorithms do not have. Different numbers of layers can affect the accuracy. Too few layers may give an underfitting result, while too many layers may lead to overfitting.

For the hyperparameter-tuning demonstration, I use a dataset provided by Kaggle. I build a simple Multilayer Perceptron (MLP) neural network to do a binary classification task with prediction probability. The Python package used is Keras, built on top of TensorFlow. The dataset has an input dimension of 10. There are two hidden layers, followed by one output layer. The accuracy metric is the accuracy score. The EarlyStopping callback is used to stop the learning process if there is no accuracy improvement in 20 epochs. Below is the illustration.

Hyperparameter Tuning in Deep Learning

The first hyperparameter to tune is the number of neurons in each hidden layer. In this case, the number of neurons in every layer is set to be the same. It can also be made different. The number of neurons should be adjusted to the complexity of the solution. A task that is more complex to predict needs more neurons. The number of neurons is set to range from 10 to 100.

An activation function is a parameter in each layer. Input data are fed to the input layer, followed by hidden layers, and the final output layer. The output layer contains the output value. The input values moving from one layer to another keep changing according to the activation function. The activation function decides how to compute the input values of a layer into output values. The output values of a layer are then passed to the next layer as input values again. The next layer then computes the values into output values for another layer again. There are 9 activation functions to tune in this demonstration. Each activation function has its own formula (and graph) to compute the input values.

The layers of a neural network are compiled and an optimizer is assigned. The optimizer is responsible for changing the learning rate and weights of neurons in the neural network to reach the minimum of the loss function. The optimizer is very important for achieving the highest possible accuracy or minimum loss. There are 7 optimizers to choose from. Each has a different concept behind it.

One of the hyperparameters in the optimizer is the learning rate. We will also tune the learning rate. The learning rate controls the step size for a model to reach the minimum of the loss function. A higher learning rate makes the model learn faster, but it may miss the minimum and only reach its surroundings. A lower learning rate gives a better chance of finding a minimum of the loss function. As a tradeoff, a lower learning rate needs more epochs, and therefore more time and memory resources.

Source code:

For installing keras-tuner (run in another cell)
!pip install keras-tuner

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10
from keras_tuner import RandomSearch  # the package installed above is imported as keras_tuner

# Step 1: Load and preprocess the dataset (CIFAR-10)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, 10) # 10 classes in CIFAR10
y_test = keras.utils.to_categorical(y_test, 10)

# Step 2: Define the CNN architecture
def build_model(hp):
    model = keras.Sequential()

    # Tune the number of convolutional layers and filters
    for i in range(hp.Int('num_conv_layers', 2, 4)):
        model.add(layers.Conv2D(hp.Int(f'conv_{i}_filters', 32, 256, 32), (3, 3),
                                activation='relu', padding='same'))
        model.add(layers.MaxPooling2D((2, 2)))

    model.add(layers.Flatten())

    # Tune the number of dense layers and units
    for i in range(hp.Int('num_dense_layers', 1, 3)):
        model.add(layers.Dense(units=hp.Int(f'dense_{i}_units', 64, 512, 32),
                               activation='relu'))

    # Output layer
    model.add(layers.Dense(10, activation='softmax'))

    # Compile the model
    model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', [1e-3, 1e-4])),
                  loss='categorical_crossentropy', metrics=['accuracy'])

    return model

# Step 3: Initialize the Keras Tuner RandomSearch and search for the best hyperparameters
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,  # Number of different hyperparameter combinations to try
    directory='hyperparameter_tuning',  # Directory to store the results
    project_name='cifar10_tuner'  # Name of the project
)

# Search for the best hyperparameter configuration
tuner.search(x_train, y_train, epochs=1, validation_data=(x_test, y_test))

# Step 4: Get the best hyperparameters and build the final model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
final_model = build_model(best_hps)

# Step 5: Train and evaluate the final model
final_model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))

# Evaluate the final model on the test set
loss, accuracy = final_model.evaluate(x_test, y_test)
print(f'Test accuracy: {accuracy}')

Output:

Test accuracy: 0.5800999999046326

Exercise-4

Module name: Predicting Sequential Data

Exercise: Implement a Recurrent Neural Network for Predicting Sequential Data.

Introduction

A Deep Learning approach for modelling sequential data is Recurrent Neural Networks (RNN). RNNs were the standard suggestion for working with sequential data before the advent of attention models. A deep feedforward model may require specific parameters for each element of the sequence. It may also be unable to generalize to variable-length sequences.

Recurrent Neural Networks use the same weights for each element of the sequence, decreasing the number of parameters and allowing the model to generalize to sequences of varying lengths. Because of their design, RNNs also generalize to structured data other than sequential data, such as geographical or graphical data.

Recurrent neural networks, like many other deep learning techniques, are relatively old. They were first developed in the 1980s, but we didn't appreciate their full potential until lately. The advent of long short-term memory (LSTM) in the 1990s, combined with an increase in computational power and the vast amounts of data that we now have to deal with, has really pushed RNNs to the forefront.

What is a Recurrent Neural Network (RNN)?

Neural networks imitate the function of the human brain in the fields of AI, machine learning, and deep learning, allowing computer programs to recognize patterns and solve common issues.

RNNs are a type of neural network that can be used to model sequence data. RNNs, which are formed from feedforward networks, are similar to human brains in their behaviour. Simply said, recurrent neural networks can anticipate sequential data in a way that other algorithms can't.

All of the inputs and outputs in standard neural networks are independent of one another. However, in some circumstances, such as when predicting the next word of a phrase, the prior words are necessary, and so the previous words must be remembered. As a result, the RNN was created, which uses a Hidden Layer to overcome the problem. The most important component of an RNN is the Hidden state, which remembers specific information about a sequence.

RNNs have a Memory that stores information about the calculations so far. An RNN employs the same settings for each input, since it performs the same task on all inputs or hidden layers to produce the output.

The Architecture of a Traditional RNN

RNNs are a type of neural network that has hidden states and allows past outputs to be used as inputs. They usually go like this:
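The hidden-state idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used by Keras: it assumes the common update rule h_t = tanh(W_x x_t + W_h h_(t-1) + b), and the function name rnn_forward, the weight names w_x, w_h, b and the layer sizes are made up for the example.

```python
import numpy as np


def rnn_forward(inputs, w_x, w_h, b):
    """Run a simple RNN over one sequence and return every hidden state.

    inputs has shape (time_steps, input_dim); the same w_x, w_h and b
    are reused at every time step.
    """
    hidden = np.zeros(w_h.shape[0])  # initial hidden state
    states = []
    for x_t in inputs:
        # The new state depends on the current input and the previous state
        hidden = np.tanh(w_x @ x_t + w_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)


rng = np.random.default_rng(0)
w_x = rng.normal(size=(4, 1)) * 0.5   # input-to-hidden weights (4 hidden units)
w_h = rng.normal(size=(4, 4)) * 0.5   # hidden-to-hidden weights
b = np.zeros(4)

# Two sine-wave sequences of different lengths share the same parameters
short_seq = np.sin(np.arange(5) * 0.1).reshape(5, 1)
long_seq = np.sin(np.arange(12) * 0.1).reshape(12, 1)

short_states = rnn_forward(short_seq, w_x, w_h, b)
long_states = rnn_forward(long_seq, w_x, w_h, b)
print(short_states.shape, long_states.shape)
```

Because the weights do not depend on the position in the sequence, the same function handles a 5-step and a 12-step input without any extra parameters, which is the property the text refers to when it says RNNs generalize to sequences of varying lengths.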

# Compile the model


model.compile(loss='mean_squared_error', optimizer='adam')

# Train the model


epochs = 50
batch_size = 32
model.fit(train_input, train_target, epochs=epochs, batch_size=batch_size)

# Evaluate the model


RNN architecture can vary depending on the problem you’re trying to solve. From Source code: loss = model.evaluate(test_input, test_target)
import numpy as np print("Test loss:", loss)
those with a single input and output to those with many (with variations between).

Below are some examples of RNN architectures that can help you better understand this.

• One To One: There is only one input-output pair here. A one-to-one architecture is used in traditional neural networks.
• One To Many: A single input in a one-to-many network can produce several outputs. One-to-many networks are used in music generation, for example.
• Many To One: In this scenario, a single output is produced by combining many inputs from distinct time steps. Sentiment analysis and emotion identification use such networks, in which the class label is determined by a sequence of words.
• Many To Many: A sequence of inputs yields a sequence of outputs, and the two lengths need not match; for example, two inputs may yield three outputs. Machine translation systems, such as English-to-French translators, use many-to-many networks.

Common Activation Functions

A neuron's activation function determines how strongly the neuron responds to its input. Nonlinear functions usually transform a neuron's output to a number between 0 and 1 or between -1 and 1.

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN

# Generate some sample sequential data
def generate_data(num_samples, time_steps):
    data = np.arange(0, num_samples * time_steps)
    data = np.sin(data * 0.1)  # A simple sine wave as sequential data
    data = data.reshape(num_samples, time_steps, 1)
    return data

num_samples = 1000
time_steps = 10  # Number of time steps in each input sequence
data = generate_data(num_samples, time_steps)

# Split the data into training and testing sets
train_ratio = 0.8
train_size = int(train_ratio * num_samples)
train_data = data[:train_size]
test_data = data[train_size:]

# Prepare the data for training and testing
train_input = train_data[:, :-1, :]   # Input data for training (all time steps except the last one)
train_target = train_data[:, 1:, :]   # Target data for training (all time steps except the first one)
test_input = test_data[:, :-1, :]     # Input data for testing (all time steps except the last one)
test_target = test_data[:, 1:, :]     # Target data for testing (all time steps except the first one)

# Build the RNN model
model = Sequential()
model.add(SimpleRNN(32, input_shape=(time_steps - 1, 1)))  # 32 units in the hidden layer
model.add(Dense(1))  # Output layer with 1 unit

# Make predictions
num_predictions = 10
initial_input = test_input[0:1, :, :]  # Take the first sequence from the test set and keep the batch dimension
predictions = []
for _ in range(num_predictions):
    # Predict the next value and update the input sequence
    prediction = model.predict(initial_input)
    predictions.append(prediction[0, 0])  # Store the predicted value (assuming 1 output unit)
    # Drop the oldest time step and append the new prediction as the latest one
    initial_input = np.concatenate((initial_input[:, 1:, :], prediction[:, np.newaxis, :]), axis=1)
print("Predictions:", predictions)

Output:

Test loss: 0.03264236077666283
Predictions: [0.55553997, 0.42471844, 0.5136012, 0.57316655, 0.4701597, 0.71725094, 0.5853208, 0.44183043, 0.7576916, 0.41853464]
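As a quick check of the output ranges mentioned under Common Activation Functions in this exercise, the following sketch evaluates the sigmoid and tanh functions with NumPy. The sample inputs are arbitrary values chosen only for illustration, and the helper name `sigmoid` is introduced here.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: maps real inputs into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary sample pre-activations (assumed values, for illustration only)
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

sig_out = sigmoid(z)      # values strictly between 0 and 1
tanh_out = np.tanh(z)     # values strictly between -1 and 1

print("sigmoid:", np.round(sig_out, 4))
print("tanh:   ", np.round(tanh_out, 4))
```

Both functions are monotonic, so larger inputs always give larger outputs; they differ in where they are centred (0.5 for sigmoid, 0 for tanh).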


Exercise-5

Module name: Removing noise from the images
Exercise: Implement Multi-Layer Perceptron algorithm for image denoising with hyperparameter tuning.

Introduction

A multi-layer perceptron (MLP) is a neural network made of fully connected (dense) layers, which transform an input of any dimension to the desired output dimension. To create a neural network we combine neurons so that the outputs of some neurons are the inputs of other neurons.

A gentle introduction to neural networks and TensorFlow can be found here:
• Neural Networks
• Introduction to TensorFlow

A multi-layer perceptron has one input layer with one neuron (or node) for each input, one output layer with a single node for each output, and any number of hidden layers, each of which can have any number of nodes. A schematic diagram of a Multi-Layer Perceptron (MLP) is depicted below.

[Figure: schematic diagram of a multi-layer perceptron]

In the multi-layer perceptron diagram above, we can see that there are three inputs and thus three input nodes, and the hidden layer has three nodes. The output layer gives two outputs, therefore there are two output nodes. The nodes in the input layer take the input and forward it for further processing: in the diagram above, each node in the input layer forwards its output to each of the three nodes in the hidden layer, and in the same way the hidden layer processes the information and passes it to the output layer.

Every node in this multi-layer perceptron uses a sigmoid activation function. The sigmoid activation function takes real values as input and converts them to numbers between 0 and 1 using the sigmoid formula:

$\sigma(x) = \dfrac{1}{1 + \exp(-x)}$

Now that we are done with the theory part of the multi-layer perceptron, let's go ahead and implement some code in Python using the scikit-learn library.

Source code:

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error

# Generate synthetic noisy image data for demonstration purposes
np.random.seed(42)
image_size = (100, 100)
noise_level = 0.1
clean_image = np.random.random(image_size)
noisy_image = clean_image + noise_level * np.random.random(image_size)

# Flatten the image data for MLP input (one noisy pixel per sample)
X_train = noisy_image.flatten().reshape(-1, 1)
y_train = clean_image.flatten()

# Create the MLP regressor
mlp = MLPRegressor(max_iter=200, random_state=42)

# Define the hyperparameters and their values to search
param_grid = {
    'hidden_layer_sizes': [(100,), (50, 50), (25, 25, 25)],
    'activation': ['relu', 'tanh'],
    'alpha': [0.0001, 0.001, 0.01],
}

# Create the GridSearchCV object to perform hyperparameter tuning
grid_search = GridSearchCV(mlp, param_grid, cv=3, scoring='neg_mean_squared_error', n_jobs=-1)

# Perform hyperparameter tuning
grid_search.fit(X_train, y_train)

# Get the best hyperparameters and the best MLP model
best_params = grid_search.best_params_
best_mlp = grid_search.best_estimator_

# Predict the denoised image using the best model
denoised_image = best_mlp.predict(X_train).reshape(image_size)

# Calculate mean squared error between the denoised image and the clean image
mse = mean_squared_error(clean_image, denoised_image)

print("Best Hyperparameters:", best_params)
print("Mean Squared Error:", mse)

Output:

Best Hyperparameters: {'activation': 'relu', 'alpha': 0.001, 'hidden_layer_sizes': (50, 50)}
Mean Squared Error: 0.0007964874470608723
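To get a feel for the cost of the grid search in this exercise, the sketch below enumerates the same hyperparameter grid using only the standard library and counts the model fits. It assumes the grid and the `cv=3` setting exactly as they appear in the listing; the names `n_folds`, `candidates`, and `n_fits` are introduced here for illustration.

```python
from itertools import product

param_grid = {
    'hidden_layer_sizes': [(100,), (50, 50), (25, 25, 25)],
    'activation': ['relu', 'tanh'],
    'alpha': [0.0001, 0.001, 0.01],
}
n_folds = 3  # matches cv=3 in the GridSearchCV call

# Build every combination of one value per hyperparameter
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_candidates = len(candidates)
n_fits = n_candidates * n_folds  # cross-validation fits, excluding the final refit

print("Candidates:", n_candidates)        # 3 * 2 * 3 = 18
print("Cross-validation fits:", n_fits)   # 18 * 3 = 54
```

Because the number of fits is the product of all list lengths and the fold count, adding one more value to any hyperparameter noticeably increases the search time.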

Exercise-6

Module name : Advanced Deep Learning Architectures
Exercise: Implement Object Detection Using YOLO.

Introduction

Object detection is a popular task in computer vision. It deals with localizing a region of interest within an image and classifying this region like a typical image classifier. One image can include several regions of interest pointing to different objects. This makes object detection a more advanced problem than image classification.

YOLO (You Only Look Once) is a popular object detection model known for its speed and accuracy. It was first introduced by Joseph Redmon et al. (preprint in 2015, conference paper in 2016) and has since undergone several iterations, the latest at the time of writing being YOLO v7.

What is object detection?

Object detection is a computer vision task that involves identifying and locating objects in images or videos. It is an important part of many applications, such as surveillance, self-driving cars, or robotics. Object detection algorithms can be divided into two main categories: single-shot detectors and two-stage detectors.

One of the earliest successful attempts to address the object detection problem using deep learning was the R-CNN (Regions with CNN features) model, developed by Ross Girshick and colleagues at UC Berkeley in 2014. This model used a combination of region proposal algorithms and convolutional neural networks (CNNs) to detect and localize objects in images.

Object detection algorithms are broadly classified into two categories based on how many times the same input image is passed through a network.

Single-shot object detection

Single-shot object detection uses a single pass of the input image to make predictions about the presence and location of objects in the image. Because the entire image is processed in one pass, these detectors are computationally efficient.

However, single-shot object detection is generally less accurate than other methods, and it is less effective in detecting small objects. Such algorithms can be used to detect objects in real time in resource-constrained environments.

YOLO is a single-shot detector that uses a fully convolutional neural network (CNN) to process an image. We will dive deeper into the YOLO model in the next section.

Two-shot object detection

Two-shot (two-stage) object detection uses two passes of the input image to make predictions about the presence and location of objects. The first pass is used to generate a set of proposals or potential object locations, and the second pass is used to refine these proposals and make final predictions. This approach is more accurate than single-shot object detection but is also more computationally expensive.

Overall, the choice between single-shot and two-shot object detection depends on the specific requirements and constraints of the application.

Generally, single-shot object detection is better suited for real-time applications, while two-shot object detection is better for applications where accuracy is more important.

What is YOLO?

You Only Look Once (YOLO) proposes using an end-to-end neural network that makes predictions of bounding boxes and class probabilities all at once. It differs from the approach taken by previous object detection algorithms, which repurposed classifiers to perform detection.

Following a fundamentally different approach to object detection, YOLO achieved state-of-the-art results, beating other real-time object detection algorithms by a large margin.

While algorithms like Faster R-CNN work by detecting possible regions of interest using the Region Proposal Network and then performing recognition on those regions separately, YOLO performs all of its predictions with the help of a single fully connected layer.

Methods that use Region Proposal Networks perform multiple iterations for the same image, while YOLO gets away with a single iteration.

Several new versions of the same model have been proposed since the initial release of YOLO in 2015, each building on and improving its predecessor. Here's a timeline showcasing YOLO's development in recent years.

[Figure: timeline of YOLO versions]

How does YOLO work? YOLO Architecture

The YOLO algorithm takes an image as input and then uses a simple deep convolutional neural network to detect objects in the image. The architecture of the CNN model that forms the backbone of YOLO is shown below.

[Figure: architecture of the YOLO CNN backbone]

The first 20 convolution layers of the model are pre-trained using ImageNet by


plugging in a temporary average pooling and fully connected layer. Then, this pre-trained model is converted to perform detection, since previous research showed that adding convolution and connected layers to a pre-trained network improves performance. YOLO's final fully connected layer predicts both class probabilities and bounding box coordinates.

YOLO divides an input image into an S × S grid. If the center of an object falls into a grid cell, that grid cell is responsible for detecting that object. Each grid cell predicts B bounding boxes and confidence scores for those boxes. These confidence scores reflect how confident the model is that the box contains an object and how accurate it thinks the predicted box is.

YOLO predicts multiple bounding boxes per grid cell. At training time, we only want one bounding box predictor to be responsible for each object. YOLO assigns one predictor to be "responsible" for predicting an object based on which prediction has the highest current IOU (intersection over union) with the ground truth. This leads to specialization between the bounding box predictors. Each predictor gets better at forecasting certain sizes, aspect ratios, or classes of objects, improving the overall recall score.

One key technique used in the YOLO models is non-maximum suppression (NMS). NMS is a post-processing step that is used to improve the accuracy and efficiency of object detection. In object detection, it is common for multiple bounding boxes to be generated for a single object in an image. These bounding boxes may overlap or be located at different positions, but they all represent the same object. NMS is used to identify and remove redundant or incorrect bounding boxes and to output a single bounding box for each object in the image.

Download the following files for the code:
• yolov3.weights
• yolov3.cfg
• coco.names

Source code:

import cv2
import numpy as np
from google.colab.patches import cv2_imshow

# Load Yolo
print("LOADING YOLO")
net = cv2.dnn.readNet("/content/drive/MyDrive/4-1 DL lab/yolov3.weights", "/content/drive/MyDrive/4-1 DL lab/yolov3.cfg")

# Save all the names in file to the list classes
classes = []
with open("/content/drive/MyDrive/4-1 DL lab/coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

# Get layers of the network
layer_names = net.getLayerNames()

# Determine the output layer names from the YOLO model
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
print("YOLO LOADED")

# Load the input image and scale it down
img = cv2.imread("/content/drive/MyDrive/4-1 DL lab/sample2.jpg")
img = cv2.resize(img, None, fx=0.4, fy=0.4)
height, width, channels = img.shape
print("Input Image:")
cv2_imshow(img)

# Using blob function of opencv to preprocess image
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)

# Detecting objects
net.setInput(blob)
outs = net.forward(output_layers)

# Collect the detections to show on the screen
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.3:
            # Object detected: box centre and size are relative to the image
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            # Rectangle coordinates (top-left corner)
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)

            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# We use the NMS function in opencv to perform Non-maximum Suppression;
# we give it the score threshold and the nms threshold as arguments.
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
colors = np.random.uniform(0, 255, size=(len(classes), 3))
for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = colors[class_ids[i]]

        # Draw the bounding box
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)

        # Draw the label
        cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
        print(f"Detected object: {label}, confidence: {confidences[i]}, box: {x}, {y}, {w}, {h}")

# Display the image
cv2_imshow(img)
cv2.waitKey(0)

Output:

LOADING YOLO
YOLO LOADED
Input Image:


Detected object: car, confidence: 0.9670625925064087, box: 144, 92, 87, 100
Detected object: person, confidence: 0.975329577922821, box: 69, 76, 131, 213
Detected object: bicycle, confidence: 0.9977372884750366, box: 68, 189, 163, 212

Exercise-7

Module name : Optimization of Training in Deep Learning
Exercise: Design a Deep learning Network for Robust Bi-Tempered Logistic Loss.

Introduction

Logistic loss

In order to understand the bi-tempered logistic loss, let's recap the standard version of this loss function. Logistic loss is the most common loss function used for training binary classifiers. For a sample with true label $y \in \{0, 1\}$ and predicted probability $p$, it is $L = -[y \log p + (1 - y) \log(1 - p)]$. The plot below displays logistic loss responses for different network predictions:

[Figure: logistic loss as a function of the network prediction]

As can be seen, the loss is zero if the response is correct (the activation value is equal to 1), and grows ever more steeply as the error increases, becoming infinitely large when the activation is equal to zero (i.e. the loss is an unbounded function of the error). This shape results in large errors being much more heavily penalized than relatively small ones (e.g. the loss for 0.45 is more than seven times greater than the one for 0.9).

In the context of noisy data, it's reasonable to expect that the network will produce huge errors for incorrectly labelled (i.e. noisy) data. As the majority of data annotations are correct, the network progressively learns the general function underlying the data. When encountering a mislabelled sample, a well-trained network will produce a correct label that doesn't match the wrong annotation, which causes the error to be very large. By allowing the losses in such scenarios to grow infinitely large, the training may become unstable, as the network will tend to "overreact" to such misclassifications (in extreme cases, this might lead to exploding gradients).

An important note on naming: logistic loss is the same as binary cross-entropy. Cross-entropy is a general term applicable to multiclass classification, but you might encounter the same function called logistic loss in the literature, which is not entirely correct as the logistic loss is for binary classification only.

Softmax

Logistic loss works by comparing the expected probability distribution to the network response. In order for this to be feasible, it has to be ensured that the network outputs can be interpreted as a probability distribution over multiple classes, i.e. individual elements are from the range [0, 1] and add up to one. To meet this requirement, a softmax layer is used at the end of a typical classification network.

Softmax is a pretty straightforward function that normalizes a vector of numbers in such a way as to conform to the aforementioned probability axioms: $\mathrm{softmax}(a)_i = e^{a_i} / \sum_j e^{a_j}$. The normalization uses the exponential function, so it doesn't preserve relative ratios between individual elements.

The plots below show the results of passing a set of neuron activations through the softmax function.

[Figure: neuron activations before and after the softmax function]

Notice how all the values became positive, they sum up to 1, and the relative ordering of the values is preserved, i.e. if a given value is the 3rd largest in the
input vector, it stays at this position after the transformation. If we consider this set of 10 neurons to be an output layer of a classifier network, we can read from the second plot that the network considers the fourth class (denoted with "3") to be the most probable, with the probability of around 30%. The ninth class, with the lowest value prior to the transformation, stays the least probable.

Bi-tempered logistic loss

Bi-tempered logistic loss is a modification of the standard logistic loss.

Its authors (Amid et al., 2019) propose a slightly altered, parametric version of the typical logistic loss. The new function has some desirable features which make it useful for training neural networks that are more robust against noisy data. This means that imperfections in the training data don't impact the model quality as much as they would if the network was trained with the typical loss.

The function differs from its standard counterpart by being parametric. Under the hood, it uses parametric alternatives to the logarithm and softmax functions, built from the tempered logarithm and tempered exponential functions:

$\log_t(x) = \dfrac{x^{1-t} - 1}{1 - t}, \qquad \exp_t(x) = [1 + (1 - t)\,x]_+^{1/(1-t)}$

Both reduce to the ordinary logarithm and exponential as $t \to 1$. A temperature $t_1 < 1$ in the tempered logarithm makes the loss bounded, and a temperature $t_2 > 1$ in the tempered exponential gives the resulting softmax heavier tails.

Procedure :

Designing a deep learning network involves several steps, and incorporating a custom loss function like the Bi-Tempered Logistic Loss requires careful consideration. Below, I'll outline the steps you can take to design a deep learning network using the Bi-Tempered Logistic Loss:

### Step 1: Define the Network Architecture
Choose an architecture suitable for your specific task. This could be a Convolutional Neural Network (CNN) for computer vision tasks, a Recurrent Neural Network (RNN) for sequential data, or a feedforward neural network for simpler tasks. Make sure to tailor the architecture to the requirements of your problem.

### Step 2: Import Libraries and Modules

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.models import Sequential
```

### Step 3: Create the Network
```python
def create_network(input_shape, num_classes):
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax')
    ])
    return model
```
Replace this architecture with one that suits your specific problem.

### Step 4: Define Bi-Tempered Logistic Loss
You'll need to define the Bi-Tempered Logistic Loss function. TensorFlow derives the associated gradient automatically.

```python
def bi_tempered_logistic_loss(y_true, y_pred, t1=0.8, t2=1.2, label_smoothing=0.1):
    """
    Bi-Tempered Logistic Loss.
    """
    # NOTE: simplified stand-in used in this manual. The published bi-tempered
    # loss combines a tempered logarithm (t1) with a tempered softmax (t2);
    # label_smoothing is accepted but not used here.
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
    temp1 = (1 - y_true) * (y_pred ** (t1 - 1)) / t1
    temp2 = (1 - y_true) * ((1 - y_pred) ** (t2 - 1)) / t2
    loss_values = temp1 + temp2
    loss = tf.reduce_sum(loss_values, axis=-1)
    loss = tf.reduce_mean(loss)

    return loss
```
### Step 5: Compile the Model
```python
model = create_network(input_shape, num_classes)
model.compile(optimizer='adam', loss=bi_tempered_logistic_loss, metrics=['accuracy'])
```

### Step 6: Train the Model
```python
model.fit(x_train, y_train, epochs=num_epochs, batch_size=batch_size, validation_data=(x_val, y_val))
```
### Step 7: Evaluate the Model
```python
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_accuracy*100:.2f}%')
```

### Step 8: Fine-tuning (Optional)
Depending on the performance of your model, you may need to fine-tune the architecture, hyperparameters, or even experiment with different variants of the Bi-Tempered Logistic Loss.

Remember to preprocess your data appropriately, use data augmentation if necessary, and adjust hyperparameters like learning rate, batch size, etc., based on your specific problem.

This is a basic template to get you started. You might need to customize and adapt these steps based on the specific details of your problem, dataset, and requirements.

Source code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Step 1: Define a simple network architecture
def create_network(input_shape):
    model = Sequential([
        Dense(64, activation='relu', input_shape=(input_shape,)),
        Dense(1, activation='sigmoid')  # Sigmoid output for a single probability
    ])
    return model

# Step 4: Define Bi-Tempered Logistic Loss
# (simplified stand-in; the published loss uses a tempered logarithm and a tempered softmax)
def bi_tempered_logistic_loss(y_true, y_pred, t1=0.8, t2=1.2):
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
    temp1 = (1 - y_true) * (y_pred ** (t1 - 1)) / t1
    temp2 = (1 - y_true) * ((1 - y_pred) ** (t2 - 1)) / t2
    loss_values = temp1 + temp2
    loss = tf.reduce_sum(loss_values, axis=-1)
    loss = tf.reduce_mean(loss)
    return loss

# Step 5: Compile the model
input_shape = 10  # Example input shape for a feature vector of size 10
model = create_network(input_shape)
model.compile(optimizer='adam', loss=bi_tempered_logistic_loss)

# Step 6: Generate synthetic data
np.random.seed(0)
X = np.random.rand(1000, input_shape)
y = np.random.rand(1000)  # Continuous targets in [0, 1), used here as soft labels

# Step 7: Train the model
model.fit(X, y, epochs=10, batch_size=32)

# Step 8: Evaluate the model (on the training data, for demonstration only)
test_loss = model.evaluate(X, y)
print(f'Test loss: {test_loss}')

Output:

Test loss: 0.6651512384414673

Exercise-8

Module name : Advanced CNN
Exercise: Build AlexNet using Advanced CNN

Introduction

This exercise presents how the AlexNet Convolutional Neural Network (CNN) architecture is implemented using TensorFlow and Keras.

Here are some of the key learning objectives of this exercise:
1. Introduction to neural network implementation with Keras and TensorFlow
2. Data preprocessing with TensorFlow
3. Training visualization with TensorBoard
4. Description of standard machine learning terms and terminologies
5. AlexNet Implementation

AlexNet is probably one of the simplest architectures through which to approach deep learning concepts and techniques.

AlexNet is not a complicated architecture when compared with some of the state-of-the-art CNN architectures that have emerged in more recent years.

AlexNet is simple enough for beginners and intermediate deep learning practitioners to pick up some good practices on model implementation techniques.

Source code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def alexnet_model(input_shape, num_classes):
    model = Sequential([
        # First Convolutional Layer
        Conv2D(96, (11,11), strides=(4,4), activation='relu', input_shape=input_shape, padding='valid'),
        MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'),

        # Second Convolutional Layer
        Conv2D(256, (5,5), activation='relu', padding='same'),
        MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'),

        # Third Convolutional Layer
        Conv2D(384, (3,3), activation='relu', padding='same'),

        # Fourth Convolutional Layer
        Conv2D(384, (3,3), activation='relu', padding='same'),

        # Fifth Convolutional Layer
        Conv2D(256, (3,3), activation='relu', padding='same'),
        MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'),

        # Flatten the CNN output to feed it into a Fully-Connected Dense Layer
        Flatten(),

        # First Dense Layer
        Dense(4096, activation='relu'),

        # Add a Dropout layer to reduce overfitting
        Dropout(0.5),

        # Second Dense Layer
        Dense(4096, activation='relu'),

        # Add another Dropout layer
        Dropout(0.5),

        # Output Layer
        Dense(num_classes, activation='softmax')
    ])

    return model

# Define the input shape and number of classes
input_shape = (227, 227, 3)  # Adjust input shape if necessary
num_classes = 1000  # Number of ImageNet classes

# Create an instance of the model
model = alexnet_model(input_shape, num_classes)

# Print the model summary
model.summary()
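The layer shapes reported by `model.summary()` can be worked out by hand. The sketch below applies the standard output-size formula for 'valid' padding, floor((n - k) / s) + 1, and the convolution parameter count (k·k·c_in + 1)·c_out to the layers defined in the listing. The helper names `valid_out` and `conv_params` are introduced here for illustration only.

```python
def valid_out(n, k, s):
    """Spatial output size of a conv/pool layer with 'valid' padding."""
    return (n - k) // s + 1

def conv_params(k, c_in, c_out):
    """Weights plus one bias per filter for a k x k convolution."""
    return (k * k * c_in + 1) * c_out

conv1_size = valid_out(227, 11, 4)        # 11x11 conv, stride 4
pool1_size = valid_out(conv1_size, 3, 2)  # 3x3 max pool, stride 2
# conv2 uses 'same' padding with stride 1, so the spatial size is unchanged
pool2_size = valid_out(pool1_size, 3, 2)
# conv3 to conv5 also use 'same' padding, so the size is unchanged again
pool3_size = valid_out(pool2_size, 3, 2)
flat_size = pool3_size * pool3_size * 256  # input width of the first Dense layer

print(conv1_size, pool1_size, pool2_size, pool3_size, flat_size)  # 55 27 13 6 9216
print(conv_params(11, 3, 96), conv_params(5, 96, 256))            # 34944 614656
```

These values agree with the output shapes and parameter counts of the first layers in the model summary for this exercise.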


Output:

Model: "sequential_2"

Layer (type)                      Output Shape            Param #
conv2d (Conv2D)                   (None, 55, 55, 96)      34,944
max_pooling2d (MaxPooling2D)      (None, 27, 27, 96)      0
conv2d_1 (Conv2D)                 (None, 27, 27, 256)     614,656
max_pooling2d_1 (MaxPooling2D)    (None, 13, 13, 256)     0
conv2d_2 (Conv2D)                 (None, 13, 13, 384)     885,120
conv2d_3 (Conv2D)                 (None, 13, 13, 384)     1,327,488
conv2d_4 (Conv2D)                 (None, 13, 13, 256)     884,992
max_pooling2d_2 (MaxPooling2D)    (None, 6, 6, 256)       0
flatten (Flatten)                 (None, 9216)            0
dense_3 (Dense)                   (None, 4096)            37,752,832
dropout (Dropout)                 (None, 4096)            0
dense_4 (Dense)                   (None, 4096)            16,781,312
dropout_1 (Dropout)               (None, 4096)            0
dense_5 (Dense)                   (None, 1000)            4,097,000

Total params: 62,378,344 (237.95 MB)
Trainable params: 62,378,344 (237.95 MB)
Non-trainable params: 0 (0.00 B)

Exercise-9

Module name : Autoencoders Advanced
Exercise: Demonstration of Application of Autoencoders

Introduction

Autoencoders are a type of neural network that can be used for various tasks, such as dimensionality reduction, anomaly detection, and even generation of new data. I'll demonstrate how autoencoders can be applied for dimensionality reduction using a simple example.

Example: Dimensionality Reduction with Autoencoders

Let's say you have a dataset with high-dimensional features, and you want to reduce the dimensionality while retaining as much information as possible.

In this example, we:
1. Generate synthetic high-dimensional data (100 features per sample).
2. Define an autoencoder architecture with one hidden layer in both the encoder and decoder.
3. Compile the autoencoder with Mean Squared Error (MSE) loss.
4. Train the autoencoder on the same data as input and output (unsupervised learning).
5. Extract the encoder part as it represents the learned low-dimensional representation of the data.

Keep in mind that in a real-world scenario, you'd replace the synthetic data with your actual dataset and adapt the architecture and hyperparameters based on your specific problem.

The encoded_features will contain the reduced-dimensional representations of the input data. You can use these for downstream tasks like clustering, classification, or visualization.

Source code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Generate synthetic high-dimensional data
np.random.seed(0)
X = np.random.rand(1000, 100)

# Define the autoencoder architecture
input_dim = X.shape[1]
encoding_dim = 10

# Encoder
input_layer = Input(shape=(input_dim,))
encoded = Dense(encoding_dim, activation='relu')(input_layer)

# Decoder
decoded = Dense(input_dim, activation='sigmoid')(encoded)

# Autoencoder model
autoencoder = Model(input_layer, decoded)

# Compile the autoencoder
autoencoder.compile(optimizer='adam', loss='mse')

# Train the autoencoder
autoencoder.fit(X, X, epochs=50, batch_size=32)

# Use the encoder for dimensionality reduction
encoder = Model(input_layer, encoded)
encoded_features = encoder.predict(X)

print(f"Original shape: {X.shape}")
print(f"Encoded shape: {encoded_features.shape}")

Output:

Original shape: (1000, 100)
Encoded shape: (1000, 10)
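A useful sanity check for an autoencoder of this kind is a linear baseline. The sketch below compresses the same kind of synthetic data to 10 dimensions with PCA (computed via SVD) and measures the reconstruction error; a purely linear autoencoder trained with MSE loss can at best match this error. The comparison and the helper name `pca_reconstruct` are assumptions introduced here for illustration.

```python
import numpy as np

def pca_reconstruct(X, k):
    """Project X onto its top-k principal components and map back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by explained variance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                    # shape (k, n_features)
    encoded = Xc @ components.T            # shape (n_samples, k)
    decoded = encoded @ components + mean  # back to the original feature space
    return encoded, decoded

np.random.seed(0)
X = np.random.rand(1000, 100)  # same shape as the synthetic data in the exercise

encoded, decoded = pca_reconstruct(X, k=10)
mse = np.mean((X - decoded) ** 2)

print("Encoded shape:", encoded.shape)
print("Reconstruction MSE:", mse)
```

Because the synthetic features are independent uniform noise, there is little structure to compress, so a sizeable reconstruction error is expected for both PCA and the autoencoder; real datasets with correlated features usually compress much better.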

Exercise-10

Module name : Advanced GANs
Exercise: Demonstration of GAN

Introduction

Generative Adversarial Networks (GANs) are a type of deep learning model used for generating new data samples that are similar to a given dataset. GANs consist of two neural networks, a generator and a discriminator, which are trained simultaneously through adversarial training.

In this example, I'll demonstrate how to create a simple GAN to generate handwritten digits (similar to those in the MNIST dataset).

In this example, we:
1. Load the MNIST dataset and preprocess it.
2. Define the generator, discriminator, and the GAN.
3. Compile the discriminator and the GAN.
4. Train the GAN in a loop.

After training, you'll see generated images saved in the current directory with filenames like gan_generated_image_epoch_X.png.

Keep in mind that this is a very basic GAN example. Real-world GAN applications require more advanced architectures, training techniques, and careful hyperparameter tuning.

Source code:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, LeakyReLU, BatchNormalization, Reshape, Flatten, Input
from tensorflow.keras.optimizers import Adam

# Load MNIST data and scale pixel values to [-1, 1]
(X_train, _), (_, _) = mnist.load_data()
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_train = X_train.reshape(X_train.shape[0], 784)

# Define GAN architecture
def build_generator(latent_dim):
    model = Sequential([
        Dense(128, input_dim=latent_dim),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(256),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(512),
        LeakyReLU(alpha=0.2),
        BatchNormalization(momentum=0.8),
        Dense(784, activation='tanh')
    ])
    return model

def build_discriminator(image_shape):
    model = Sequential([
        Dense(512, input_dim=image_shape),
        LeakyReLU(alpha=0.2),
        Dense(256),
        LeakyReLU(alpha=0.2),
        Dense(1, activation='sigmoid')
    ])
    return model

def build_gan(generator, discriminator):
    model = Sequential([generator, discriminator])
    return model

# Define hyperparameters
latent_dim = 100
image_shape = 784

# Build and compile the discriminator
discriminator = build_discriminator(image_shape)
discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5), metrics=['accuracy'])

# Build the generator and compile the combined GAN (discriminator frozen)
generator = build_generator(latent_dim)
discriminator.trainable = False
gan = build_gan(generator, discriminator)
gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

# Define training loop
def train_gan(epochs=1, batch_size=128):
    batch_count = X_train.shape[0] // batch_size

    for e in range(epochs):
        for _ in range(batch_count):
            # Train the discriminator on a mix of real and generated images
            noise = np.random.normal(0, 1, size=[batch_size, latent_dim])
            generated_images = generator.predict(noise)
            image_batch = X_train[np.random.randint(0, X_train.shape[0], size=batch_size)]

            X = np.concatenate([image_batch, generated_images])
            y_dis = np.zeros(2 * batch_size)
            y_dis[:batch_size] = 0.9  # Smoothed label for real images

            discriminator.trainable = True
            d_loss = discriminator.train_on_batch(X, y_dis)

            # Train the generator to make the discriminator output "real"
            noise = np.random.normal(0, 1, size=[batch_size, latent_dim])
            y_gen = np.ones(batch_size)
            discriminator.trainable = False
            g_loss = gan.train_on_batch(noise, y_gen)

        print(f"Epoch {e} - D loss: {d_loss[0]} | D accuracy: {100 * d_loss[1]} | G loss: {g_loss}")

        if e % 1 == 0:
            plot_generated_images(e, generator, latent_dim)

# Function to plot generated images
def plot_generated_images(epoch, generator, latent_dim, examples=10, dim=(1, 10), figsize=(10, 1)):
    noise = np.random.normal(0, 1, size=[examples, latent_dim])
    generated_images = generator.predict(noise)
    generated_images = generated_images.reshape(examples, 28, 28)

    plt.figure(figsize=figsize)
    for i in range(generated_images.shape[0]):
        plt.subplot(dim[0], dim[1], i+1)


        plt.imshow(generated_images[i], interpolation='nearest', cmap='gray_r')
        plt.axis('off')

    plt.tight_layout()
    plt.savefig(f'gan_generated_image_epoch_{epoch}.png')

# Train the GAN
train_gan(epochs=1, batch_size=128)

Output:

Epoch 0 - D loss: 0.3723898231983185 | D accuracy: 44.20572817325592 | G loss: [array(0.37238982, dtype=float32), array(0.37238982, dtype=float32), array(0.44205728, dtype=float32)]
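The training loop in this exercise relies on binary cross-entropy for both networks: the discriminator is trained to output 0 for generated images, while the generator is trained to make the discriminator output 1 for them. The NumPy sketch below evaluates that loss on made-up discriminator outputs to show how the two objectives pull in opposite directions. The probabilities are assumed values, not results from the model, and `binary_crossentropy` is a helper written here rather than the Keras function.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy, with clipping for numerical stability."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Hypothetical discriminator outputs on a batch of generated (fake) images
d_on_fake_confident = np.array([0.05, 0.10, 0.02])  # discriminator spots the fakes
d_on_fake_fooled = np.array([0.90, 0.85, 0.95])     # discriminator is fooled

fake_labels = np.zeros(3)  # discriminator target for generated images
gen_labels = np.ones(3)    # generator target: have fakes classified as real

d_loss_confident = binary_crossentropy(fake_labels, d_on_fake_confident)
g_loss_confident = binary_crossentropy(gen_labels, d_on_fake_confident)
d_loss_fooled = binary_crossentropy(fake_labels, d_on_fake_fooled)
g_loss_fooled = binary_crossentropy(gen_labels, d_on_fake_fooled)

print(f"Discriminator confident: D loss {d_loss_confident:.3f}, G loss {g_loss_confident:.3f}")
print(f"Discriminator fooled:    D loss {d_loss_fooled:.3f}, G loss {g_loss_fooled:.3f}")
```

When the discriminator is confident, its own loss is low and the generator's is high; when it is fooled, the situation reverses, which is the adversarial balance the training loop tries to maintain.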

B.Tony (21HP1A0562)
ALIET, Vijayawada
