Training of Recurrent Neural Networks (RNN) in TensorFlow

Recurrent Neural Networks (RNNs) are neural networks designed to process sequential data by maintaining hidden states that store information from previous steps. In this implementation, TensorFlow is used to build and train an RNN model for sequence learning tasks.
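
As a minimal sketch of the hidden-state idea, the NumPy snippet below runs a single tanh recurrence, h_t = tanh(x_t W_x + h_(t-1) W_h + b), over one short sequence. The function name rnn_forward, the layer sizes and the random weights are illustrative assumptions and are not part of the model built later in this article.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple tanh recurrence over inputs of shape (timesteps, input_dim)."""
    hidden = np.zeros(W_h.shape[0])          # initial hidden state h_0
    states = []
    for x_t in inputs:                       # one update per sequence element
        # The new state mixes the current input with the previous state
        hidden = np.tanh(x_t @ W_x + hidden @ W_h + b)
        states.append(hidden)
    return np.stack(states)                  # shape: (timesteps, hidden_dim)

rng = np.random.default_rng(0)
timesteps, input_dim, hidden_dim = 5, 3, 4   # hypothetical sizes
inputs = rng.normal(size=(timesteps, input_dim))
W_x = rng.normal(size=(input_dim, hidden_dim))
W_h = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

states = rnn_forward(inputs, W_x, W_h, b)
print(states.shape)
```

Each row of states is the hidden state after one time step; the last row is what a layer such as SimpleRNN returns when it is not asked for the full sequence.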

Implementation

1. Importing Libraries

We import Pandas, NumPy, Matplotlib, Seaborn, Plotly, TensorFlow (with Keras), NLTK and Scikit-learn for the implementation.

Python

import warnings
from tensorflow.keras.utils import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import numpy as np

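import re
import nltk
# Download only the NLTK resources used below rather than the full 'all' collection
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('omw-1.4')
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
lemm = WordNetLemmatizer()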

warnings.filterwarnings("ignore")

2. Loading the Dataset

The dataset is loaded using pd.read_csv() and cleaned by removing rows with null values in the Class Name column.

Loads dataset using Pandas
Displays first 7 rows using data.head(7)
Removes null values from Class Name column

Python

data = pd.read_csv("Clothing Review.csv")
data.head(7)

# Keep only the rows where 'Class Name' is present
data = data.dropna(subset=['Class Name'])

Output:

3. Performing Exploratory Data Analysis

EDA uses visualizations to reveal the distributions and patterns in the dataset before the model is built.

Count Plot of Class Name Distribution

sns.countplot() is used to visualize the count of each category in the Class Name column. The x-axis labels are rotated using plt.xticks(rotation=90) for better readability.
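
A count plot draws one bar per category, with the bar height equal to the number of rows in that category. The same counts can be sketched in plain Python; the category values below are a small hypothetical sample, not rows taken from the dataset.

```python
from collections import Counter

# Hypothetical sample of 'Class Name' values, for illustration only
class_names = ["Dresses", "Knits", "Dresses", "Blouses", "Knits", "Dresses"]

counts = Counter(class_names)             # category -> number of rows
for name, count in counts.most_common():  # most frequent category first
    print(f"{name}: {count}")
```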

Python

sns.countplot(data=data, x='Class Name', palette='rainbow')
plt.xticks(rotation=90)
plt.show()

Output:

Count Plot of Rating and Recommendation Distribution

A figure of size 12×5 is created using plt.subplots() to visualize the distribution of ratings and recommendation indicators.

Python

# Create one figure with two side-by-side axes
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
sns.countplot(data=data, x='Rating', palette="deep", ax=axes[0])
sns.countplot(data=data, x="Recommended IND", palette="deep", ax=axes[1])
plt.show()

Output:

Countplot for the Rating and Recommended IND category

Histogram of Age Distribution

A histogram is created using px.histogram() to visualize the frequency distribution of age. The plot also includes a box plot to show spread and outliers.

Python

fig = px.histogram(data, marginal='box',
                   x="Age", title="Age Group",
                   color="Recommended IND",
                   nbins=65-18,
                   color_discrete_sequence=['green', 'red'])
fig.update_layout(bargap=0.2)

Output:

Histogram of age distribution, colored by Recommended IND

Interpretation of Age Distribution Plot

The histogram shows the age distribution of reviewers, split by whether they recommended the product, while the marginal box plots display the spread and outliers of age for each group.

Green bars represent reviews that recommend the product
Red bars represent reviews that do not recommend the product
Box plots show the spread and outliers of age values
Helps compare the age distribution between the two groups
The same plot can be colored by Rating to compare age distribution across ratings

Python

fig = px.histogram(data,
                   x="Age",
                   marginal='box',
                   title="Age Group",
                   color="Rating",
                   nbins=65-18,
                   color_discrete_sequence=['black', 'green', 'blue',
                                            'red', 'yellow'])
fig.update_layout(bargap=0.2)

Output:

4. Preparing the Data to Build the Model

Since the dataset is NLP-based, the text columns are used as features and the Rating column provides the sentiment label. Because the rating distribution is skewed, the five-level rating is reduced to a binary target: ratings above 3 are converted to 1 (positive) and ratings of 3 or below are converted to 0 (negative).

Uses text columns as input features
Uses Rating column as the sentiment label
Reduces the skewed five-level rating to two classes
Converts ratings >3 to positive class (1)
Converts ratings <=3 to negative class (0)

Python

def filter_score(rating):
    return int(rating > 3)

features = ['Class Name', 'Title', 'Review Text']

X = data[features].copy()  # copy so later column assignments do not act on a view of data
y = data['Rating']
y = y.apply(filter_score)
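
To make the threshold concrete, the short sketch below applies the same filter_score function to a handful of ratings and counts the resulting classes. The ratings list is a hypothetical sample used only for illustration.

```python
from collections import Counter

def filter_score(rating):
    # Ratings 4 and 5 become 1 (positive); ratings 1, 2 and 3 become 0 (negative)
    return int(rating > 3)

ratings = [1, 2, 3, 4, 5, 5, 4, 2]          # hypothetical sample ratings
labels = [filter_score(r) for r in ratings]
class_counts = Counter(labels)              # number of reviews per class

print(labels)
print(class_counts)
```

Note that a rating of exactly 3 falls into the negative class, because the comparison is strictly greater than 3.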

5. Text Preprocessing

Text preprocessing is performed to clean and standardize the text data before training the model. The text is converted to lowercase, lemmatized and cleaned by removing stopwords and punctuation.

Converts text to lowercase for consistency
Applies lemmatization to normalize words
Removes stopwords and punctuation
Reduces noise and improves text quality for training

Python

def toLower(data):
    # Missing values are read as float NaN; replace them with a placeholder token
    if isinstance(data, float):
        return '<UNK>'
    else:
        return data.lower()

stop_words = set(stopwords.words("english"))  # a set gives fast membership tests

def remove_stopwords(text):
    no_stop = []
    for word in text.split(' '):
        if word not in stop_words:
            no_stop.append(word)
    return " ".join(no_stop)

def lemmatize_text(text):
    # WordNetLemmatizer works on single words, so lemmatize word by word
    return " ".join(lemm.lemmatize(word) for word in text.split())

def remove_punctuation_func(text):
    return re.sub(r'[^a-zA-Z0-9]', ' ', text)

X['Title'] = X['Title'].apply(toLower)
X['Review Text'] = X['Review Text'].apply(toLower)

X['Title'] = X['Title'].apply(remove_stopwords)
X['Review Text'] = X['Review Text'].apply(remove_stopwords)

X['Title'] = X['Title'].apply(lemmatize_text)
X['Review Text'] = X['Review Text'].apply(lemmatize_text)

X['Title'] = X['Title'].apply(remove_punctuation_func)
X['Review Text'] = X['Review Text'].apply(remove_punctuation_func)

# Join the columns with spaces so words from different columns do not merge
X['Text'] = X['Title'] + ' ' + X['Review Text'] + ' ' + X['Class Name']

X_train, X_test, y_train, y_test = train_test_split(
    X['Text'], y, test_size=0.25, random_state=42)
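
To see what the cleaning steps in this section do to a single review, here is a dependency-free sketch. It assumes a tiny hand-written stopword set in place of NLTK's English list, strips punctuation before filtering stopwords, and leaves out lemmatization; the helper name clean_text and the sample sentence are hypothetical.

```python
import re

# Small hypothetical stopword set; the tutorial uses NLTK's English list instead
STOP_WORDS = {"the", "is", "and", "it", "a", "this", "i"}

def clean_text(text):
    text = text.lower()                                  # lowercase
    text = re.sub(r"[^a-z0-9]", " ", text)               # punctuation -> spaces
    words = [w for w in text.split() if w not in STOP_WORDS]  # drop stopwords
    return " ".join(words)

sample = "This dress is PERFECT, and I love the fit!"
cleaned = clean_text(sample)
print(cleaned)
```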

6. Tokenization

Tokenization converts text into sequences of integer word indices that can be processed by the neural network. Keras provides a Tokenizer API to build the word index from the text data.

Converts text into numerical sequences
Uses Keras Tokenizer for preprocessing
num_words limits the vocabulary to the most frequent words
oov_token is the placeholder used for out-of-vocabulary words
fit_on_texts() is applied only on training data, so the vocabulary is not influenced by the test set

Python

tokenizer = Tokenizer(num_words=10000, oov_token='<OOV>')
tokenizer.fit_on_texts(X_train)
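
The idea behind the word index can be sketched in plain Python. This is a simplified illustration and not Keras's implementation: it only splits on whitespace, whereas the Keras Tokenizer also filters punctuation. The helper names build_word_index and texts_to_ids and the sample texts are hypothetical.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Map words to integer ids by descending frequency; id 1 is the OOV token."""
    counts = Counter(word for text in texts for word in text.lower().split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def texts_to_ids(texts, word_index, num_words, oov_token="<OOV>"):
    """Replace unknown words, and words with id >= num_words, by the OOV id."""
    oov_id = word_index[oov_token]
    sequences = []
    for text in texts:
        ids = [word_index.get(word, oov_id) for word in text.lower().split()]
        sequences.append([i if i < num_words else oov_id for i in ids])
    return sequences

train_texts = ["great dress great fit", "poor fit"]   # hypothetical reviews
word_index = build_word_index(train_texts)
print(word_index)
print(texts_to_ids(["great fabric"], word_index, num_words=10000))
```

A word that never appeared in the training texts, such as "fabric" here, is mapped to the OOV id instead of being dropped, which keeps the sequence length unchanged.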

7. Padding the Text Data

Padding makes all text sequences the same length before they are fed into the neural network. Zeros are appended to shorter sequences, while sequences longer than maxlen (40 tokens here) are truncated.

Makes all text sequences equal in length
Adds zeros to the end of shorter sequences (padding="post")
Cuts longer sequences at the end (truncating="post")
Padding and tokenization are general NLP preprocessing techniques
Lets sequences be batched into fixed-shape arrays for efficient training

Python

train_seq = tokenizer.texts_to_sequences(X_train)
test_seq = tokenizer.texts_to_sequences(X_test)

train_pad = pad_sequences(train_seq,
                          maxlen=40,
                          truncating="post",
                          padding="post")
test_pad = pad_sequences(test_seq,
                         maxlen=40,
                         truncating="post",
                         padding="post")
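
For a single sequence, the effect of padding="post" and truncating="post" can be sketched in a few lines of plain Python. The helper name pad_post and the sample sequences are hypothetical; pad_sequences itself works on a whole batch and returns a NumPy array.

```python
def pad_post(sequence, maxlen, value=0):
    """Cut the sequence at maxlen, then fill the end with `value` up to maxlen."""
    trimmed = sequence[:maxlen]                          # truncating="post"
    return trimmed + [value] * (maxlen - len(trimmed))   # padding="post"

short = pad_post([5, 8, 2], maxlen=5)
long = pad_post([5, 8, 2, 9, 4, 7, 1], maxlen=5)
print(short)  # [5, 8, 2, 0, 0]
print(long)   # [5, 8, 2, 9, 4]
```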

8. Building a Recurrent Neural Network (RNN) in TensorFlow

After preprocessing the data, a Simple Recurrent Neural Network (SimpleRNN) is built for training. Before entering the RNN layer, the text data is passed through an Embedding layer to generate fixed-size word vectors.

Builds a SimpleRNN model using TensorFlow
Uses an Embedding layer before the RNN layer
Embedding converts words into dense vector representations
Each word is mapped to a fixed-size 128-dimensional vector, giving the RNN a consistent input at every time step

Python

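model = keras.models.Sequential()
# Map each of the 10,000 word indices to a 128-dimensional vector
model.add(keras.layers.Embedding(input_dim=10000, output_dim=128))
# The first RNN layer returns the full sequence so a second RNN layer can be stacked
model.add(keras.layers.SimpleRNN(64, return_sequences=True))
model.add(keras.layers.SimpleRNN(64))
model.add(keras.layers.Dense(128, activation="relu"))
model.add(keras.layers.Dropout(0.4))
# Sigmoid output gives the probability of the positive class
model.add(keras.layers.Dense(1, activation="sigmoid"))

# The sequence length (40) matches the maxlen used for padding
model.build(input_shape=(None, 40))

model.summary()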

Output:

Summary of the architecture of the model
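
The parameter counts reported by model.summary() can be checked by hand. The sketch below assumes the standard formulas for Embedding, SimpleRNN and Dense layers with biases enabled; the dictionary keys are descriptive labels chosen here, not the layer names Keras generates.

```python
def embedding_params(vocab_size, embed_dim):
    return vocab_size * embed_dim              # one vector per word index

def simple_rnn_params(input_dim, units):
    # input weights + recurrent weights + bias
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    return units * (input_dim + 1)             # weights + bias

layer_params = {
    "embedding": embedding_params(10000, 128),
    "simple_rnn_1": simple_rnn_params(128, 64),
    "simple_rnn_2": simple_rnn_params(64, 64),
    "dense_hidden": dense_params(64, 128),
    "dense_output": dense_params(128, 1),
}
total = sum(layer_params.values())             # Dropout adds no parameters

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")
```

Almost all of the parameters sit in the Embedding layer, which is typical for text models with a large vocabulary and small recurrent layers.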

9. Training the Model

After building the model, it is compiled using an optimizer, loss function and evaluation metric. The model is then trained on the preprocessed training data for multiple epochs.

Compiles model using optimizer, loss function and evaluation metric
Trains the model on train_pad data
Uses y_train as target labels
Runs training for 5 epochs and reports the loss and accuracy after each epoch
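
The code in this step uses binary cross-entropy as the loss and accuracy as the metric. As a rough sketch of what those two numbers measure, both can be computed by hand in plain Python. The labels and predicted probabilities below are hypothetical values, and the clipping constant eps is an assumption made here to avoid taking log(0).

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that land on the correct side of the threshold."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

y_true = [1, 0, 1, 0]                 # hypothetical labels
y_prob = [0.9, 0.2, 0.4, 0.1]         # hypothetical sigmoid outputs

loss = binary_cross_entropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss)
print(acc)
```

The loss penalizes confident wrong predictions more heavily than accuracy does, which is why the two values can move differently from epoch to epoch.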

Python

model.compile(loss="binary_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])

history = model.fit(train_pad,
                    y_train,
                    epochs=5)

Output:
