
Implementing Recurrent Neural Networks in PyTorch

Last Updated : 27 Feb, 2025

Recurrent Neural Networks (RNNs) are a class of neural networks that are particularly effective for sequential data. Unlike traditional feedforward neural networks, RNNs have connections that form loops, allowing them to maintain a hidden state that captures information from previous inputs. This makes them suitable for tasks such as time series prediction, natural language processing and many other sequence tasks. In this article we will explore how to implement RNNs using PyTorch.
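
Concretely, at each time step a vanilla RNN combines the current input with the previous hidden state. The snippet below is a minimal sketch of this update rule (it matches the tanh recurrence used by PyTorch's nn.RNN); the tensor shapes are illustrative, not taken from the examples that follow:

Python
import torch

# One step of the vanilla RNN recurrence:
# h_t = tanh(x_t @ W_ih.T + b_ih + h_prev @ W_hh.T + b_hh)
def rnn_step(x_t, h_prev, W_ih, W_hh, b_ih, b_hh):
    return torch.tanh(x_t @ W_ih.T + b_ih + h_prev @ W_hh.T + b_hh)

x_t = torch.randn(1, 3)                           # current input (batch of 1, 3 features)
h_prev = torch.zeros(1, 5)                        # previous hidden state (5 hidden units)
W_ih, W_hh = torch.randn(5, 3), torch.randn(5, 5)
b_ih, b_hh = torch.zeros(5), torch.zeros(5)
h_t = rnn_step(x_t, h_prev, W_ih, W_hh, b_ih, b_hh)  # new hidden state, shape (1, 5)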

Building an RNN from Scratch in PyTorch

Setting Up the Environment

Before we start implementing the RNN, we need to set up our environment. Ensure you have PyTorch installed; you can install it using pip:

pip install torch
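
To confirm the installation worked, a quick sanity check prints the installed version and whether a GPU is visible:

Python
import torch
print(torch.__version__)          # installed PyTorch version, e.g. 2.x
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable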

Predicting Sequential Data

We will build a basic synthetic dataset and train an RNN to predict the next value in a series of numbers. This will help us understand the fundamentals of how an RNN operates.

Step 1: Import Libraries

First we need to import the necessary libraries.

Python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

Step 2: Create Synthetic Dataset

We will create a simple sine wave dataset. The goal is to predict the next value in the sine wave sequence.

Python
def generate_data(seq_length, num_samples):
    X = []
    y = []
    for i in range(num_samples):
        # one full period of a sine wave, sampled at seq_length + 1 points
        x = np.linspace(i * 2 * np.pi, (i + 1) * 2 * np.pi, seq_length + 1)
        sine_wave = np.sin(x)
        X.append(sine_wave[:-1])  # inputs: all but the last value
        y.append(sine_wave[1:])   # targets: the same sequence shifted one step ahead
    return np.array(X), np.array(y)

seq_length = 50
num_samples = 1000
X, y = generate_data(seq_length, num_samples)

X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32)

print(X.shape, y.shape)

Output:

torch.Size([1000, 50]) torch.Size([1000, 50])
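
Optionally, before defining the model, plot one input/target pair. Since each target sequence is the input shifted one step ahead, the two curves should nearly coincide; this is just a visual check:

Python
plt.figure(figsize=(10, 4))
plt.plot(X[0].numpy(), label='input sequence')
plt.plot(y[0].numpy(), label='target (shifted by one step)')
plt.legend()
plt.show()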

Step 3: Define the RNN Model

Next we will define the RNN model.

Python
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleRNN, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        # initial hidden state: (num_layers, batch, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out)  # map each hidden state to an output value
        return out

input_size = 1
hidden_size = 20
output_size = 1
model = SimpleRNN(input_size, hidden_size, output_size)
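
A cheap way to verify the wiring is to push a dummy batch through the untrained model and check the output shape; with batch_first=True the model maps (batch, seq_len, 1) to (batch, seq_len, 1):

Python
dummy = torch.randn(4, seq_length, input_size)  # fake batch: (batch, seq_len, features)
print(model(dummy).shape)                        # expected: torch.Size([4, 50, 1])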

Step 4: Train the Model

Now we will train the model using Mean Squared Error (MSE) loss and the Adam optimizer. Because the dataset is small, we train on the full batch each epoch.

Python
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    outputs = model(X.unsqueeze(2))           # add a feature dimension: (1000, 50, 1)
    loss = criterion(outputs, y.unsqueeze(2))
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

Output:

Epoch [10/100], Loss: 0.3548
Epoch [20/100], Loss: 0.2653
Epoch [30/100], Loss: 0.1757
Epoch [40/100], Loss: 0.0921
Epoch [50/100], Loss: 0.0592
Epoch [60/100], Loss: 0.0421
Epoch [70/100], Loss: 0.0306
Epoch [80/100], Loss: 0.0222
Epoch [90/100], Loss: 0.0151
Epoch [100/100], Loss: 0.0093

Step 5: Visualize the Results

Finally, we will visualize the predictions made by the model.

Python
model.eval()
with torch.no_grad():
    predictions = model(X.unsqueeze(2)).squeeze(2).numpy()

plt.figure(figsize=(10, 6))
plt.plot(y[0].numpy(), label='True')
plt.plot(predictions[0], label='Predicted')
plt.legend()
plt.show()

Output:

[Figure: Predicting Sequential Data (true vs. predicted sine wave)]

The plot shows how well the model's predictions (orange curve) match the true values (blue curve). The closeness of the two curves suggests that the RNN is capturing the sequential patterns in the data effectively.
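
Because the model was trained to predict one step ahead, it can also be rolled forward autoregressively by feeding its own predictions back in. This is a minimal sketch, not part of the tutorial's required code; variable names are illustrative:

Python
model.eval()
window = X[0].unsqueeze(0).unsqueeze(2)  # seed sequence, shape (1, 50, 1)
generated = []
with torch.no_grad():
    for _ in range(50):                  # roll the sequence forward 50 steps
        out = model(window)              # (1, 50, 1)
        next_val = out[:, -1:, :]        # last time step = prediction of the next value
        generated.append(next_val.item())
        window = torch.cat([window[:, 1:, :], next_val], dim=1)  # slide the window

plt.plot(generated, label='autoregressive forecast')
plt.legend()
plt.show()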

Now that we have worked with a synthetic dataset, we will use a real-world dataset for the next example.

Classifying Text Messages Using an RNN

In this example we will use a public dataset, the SMS Spam Collection, to perform binary text classification with an RNN. The goal is to classify each message as spam or ham (legitimate). The class names below keep a generic Sentiment prefix; the same setup applies to any binary text classification task, such as movie review sentiment. Step-by-Step Implementation:

Step 1: Import Libraries

Python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader

Step 2: Load and Preprocess the Dataset

We will load the SMS Spam Collection dataset directly from a URL and preprocess it. Each row contains a label (ham or spam) and the raw message text.

Python
url = "https://raw.githubusercontent.com/justmarkham/DAT8/master/data/sms.tsv"
df = pd.read_csv(url, delimiter='\t', header=None, names=['label', 'text'])

def preprocess_text(text):
    return text.lower().split()

df['text'] = df['text'].apply(preprocess_text)
df = df[['text', 'label']]

le = LabelEncoder()
df['label'] = le.fit_transform(df['label'])  # encode labels: ham -> 0, spam -> 1

train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)
train_data, test_data = train_data.copy(), test_data.copy()  # avoid SettingWithCopyWarning on later column assignments

# build the vocabulary over all messages; indices start at 1 so 0 is reserved for padding
vocab = set([word for phrase in df['text'] for word in phrase])
word_to_idx = {word: idx for idx, word in enumerate(vocab, 1)}

def encode_phrase(phrase):
    return [word_to_idx[word] for word in phrase]

train_data['text'] = train_data['text'].apply(encode_phrase)
test_data['text'] = test_data['text'].apply(encode_phrase)

# pad every message to the length of the longest one, using 0 as the padding index
max_length = max(df['text'].apply(len))

def pad_sequence(seq, max_length):
    return seq + [0] * (max_length - len(seq))

train_data['text'] = train_data['text'].apply(lambda x: pad_sequence(x, max_length))
test_data['text'] = test_data['text'].apply(lambda x: pad_sequence(x, max_length))
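
As a quick sanity check, inspect one encoded, padded message; each row is now a fixed-length list of integer token indices with 0 as the padding value:

Python
sample = train_data['text'].iloc[0]
print(len(sample))   # equals max_length after padding
print(sample[:10])   # first ten token indices of this message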

Step 3: Create Dataset and Data Loader

Python
class SentimentDataset(Dataset):
    def __init__(self, data):
        self.texts = data['text'].values
        self.labels = data['label'].values
    
    def __len__(self):
        return len(self.texts)
    
    def __getitem__(self, idx):
        text = self.texts[idx]
        label = self.labels[idx]
        return torch.tensor(text, dtype=torch.long), torch.tensor(label, dtype=torch.long)

train_dataset = SentimentDataset(train_data)
test_dataset = SentimentDataset(test_data)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
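
Pulling one batch from the loader confirms the shapes the model will receive:

Python
texts, labels = next(iter(train_loader))
print(texts.shape)   # torch.Size([32, max_length]): a batch of token-index sequences
print(labels.shape)  # torch.Size([32]): one label per message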

Step 4: Define the RNN Model

The SentimentRNN embeds each token into a dense vector, runs the embedded sequence through an RNN, and classifies the whole message from the hidden state at the final time step.

Python
class SentimentRNN(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, output_size):
        super(SentimentRNN, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        x = self.embedding(x)  # (batch, seq_len) -> (batch, seq_len, embed_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])  # classify from the last time step
        return out

vocab_size = len(vocab) + 1  # +1 for the padding index 0
embed_size = 128
hidden_size = 128
output_size = 2  # two classes: ham and spam
model = SentimentRNN(vocab_size, embed_size, hidden_size, output_size)
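
A dummy forward pass is again a cheap wiring check, and counting parameters shows that the embedding table (one vector per vocabulary word) dominates the model size:

Python
dummy = torch.randint(0, vocab_size, (4, max_length))  # fake batch of 4 messages
print(model(dummy).shape)                              # expected: torch.Size([4, 2])
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'{num_params:,} trainable parameters')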

Step 5: Train the Model

Python
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        epoch_loss += loss.item()
    
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

Output:

Epoch [1/10], Loss: 0.4016
Epoch [2/10], Loss: 0.3999
Epoch [3/10], Loss: 0.4004
Epoch [4/10], Loss: 0.3954
Epoch [5/10], Loss: 0.3969
Epoch [6/10], Loss: 0.3978
Epoch [7/10], Loss: 0.3960
Epoch [8/10], Loss: 0.3959
Epoch [9/10], Loss: 0.3967
Epoch [10/10], Loss: 0.3953

Step 6: Evaluate the Model

Python
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for texts, labels in test_loader:
        outputs = model(texts)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f'Accuracy: {accuracy:.2f}%')

Output:

Accuracy: 86.64%
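
To classify a new raw message, apply the same preprocess/encode/pad pipeline before calling the model. This helper is a sketch rather than part of the original article; note the .get(word, 0) fallback, which maps words unseen during training to the padding index:

Python
def predict_message(message):
    tokens = preprocess_text(message)
    encoded = [word_to_idx.get(word, 0) for word in tokens]  # unknown words -> 0
    padded = pad_sequence(encoded[:max_length], max_length)  # truncate, then pad
    tensor = torch.tensor([padded], dtype=torch.long)        # batch of 1
    model.eval()
    with torch.no_grad():
        output = model(tensor)
        predicted = torch.argmax(output, dim=1).item()
    return le.inverse_transform([predicted])[0]              # back to 'ham' / 'spam'

print(predict_message("WINNER!! You have won a free prize, call now!"))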

Step 7: Visualize Training Loss

Here we continue training the already-trained model for ten more epochs, this time recording the average loss per epoch so it can be plotted.

Python
losses = []

for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        epoch_loss += loss.item()
    
    losses.append(epoch_loss / len(train_loader))
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

plt.figure(figsize=(10, 6))
plt.plot(range(1, num_epochs + 1), losses, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.show()

Output:

Epoch [1/10], Loss: 0.3946
Epoch [2/10], Loss: 0.3990
Epoch [3/10], Loss: 0.3968
Epoch [4/10], Loss: 0.3988
Epoch [5/10], Loss: 0.3949
Epoch [6/10], Loss: 0.3983
Epoch [7/10], Loss: 0.3997
Epoch [8/10], Loss: 0.3991
Epoch [9/10], Loss: 0.3991
Epoch [10/10], Loss: 0.3956

[Figure: Training loss per epoch]

The training loss plot shows the loss fluctuating in a narrow band around 0.40 rather than decreasing steadily, which suggests the vanilla RNN has largely stopped improving on these long, padded sequences. This is a known limitation of plain RNNs, which struggle with long-range dependencies; gated architectures such as LSTM or GRU typically cope better, as sketched below.
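
A natural next step is therefore to swap the vanilla RNN for an LSTM. The class below is a minimal variant of the model above (nn.LSTM defaults to zero initial hidden and cell states when none are passed); the training and evaluation loops can be reused unchanged:

Python
class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, output_size):
        super(SentimentLSTM, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.embedding(x)
        out, (h_n, c_n) = self.lstm(x)  # LSTM returns hidden and cell states
        return self.fc(out[:, -1, :])   # classify from the last time step

lstm_model = SentimentLSTM(vocab_size, embed_size, hidden_size, output_size)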

You can also build an RNN model using TensorFlow; for that, refer to this article: Training of Recurrent Neural Networks (RNN) in TensorFlow

