
Time Series Prediction with Recurrent

Neural Networks
Shubhayan
Dept. of Mathematics, Jadavpur University
12 Nov 2023

1 Introduction
Time series prediction is a crucial task in various domains, from finance to
weather forecasting. Accurate predictions enable better decision-making. This
document explores the use of Recurrent Neural Networks (RNNs) for effective
time series forecasting. We’ll delve into the mathematical foundation, present
algorithmic steps, and provide practical Python code examples.

2 Advantages over Classical Regression Analysis


Time series analysis is the process of modeling and forecasting the behavior of
a sequence of data points over time. Regression analysis is a statistical method
of estimating the relationship between a dependent variable and one or more
independent variables.
Neural networks are a type of machine learning model that can learn com-
plex nonlinear patterns from data. Regression models are usually linear or
polynomial, which means they can only capture simple relationships.
One advantage of neural networks over regression models for time series
analysis is that they can handle sequence dependence, which means that the
current value of the data depends on the previous values. Neural networks can
do this by using recurrent layers, such as Long Short-Term Memory (LSTM)
or Gated Recurrent Unit (GRU), which have a memory mechanism that allows
them to store and access information from the past. Regression models, on
the other hand, assume that the data points are independent and identically
distributed, which is often not true for time series data.
Another advantage of neural networks over regression models for time series
analysis is that they can learn multiple features and interactions from the data,
without requiring prior knowledge or assumptions. Neural networks can do this
by using convolutional layers, which can extract local and global patterns from
the data, or attention mechanisms, which can focus on the most relevant parts
of the data. Regression models, on the other hand, require the user to specify

the features and interactions to include in the model, which can be difficult and
time-consuming.
Mathematically, a neural network for time series analysis can be described
as a function that maps an input sequence x_{1:T} = (x_1, x_2, ..., x_T) to an output
sequence y_{1:T} = (y_1, y_2, ..., y_T), where T is the length of the sequence. The
function can be composed of multiple layers, each with a different activation
function and parameters. For example, a recurrent neural network can be defined as:

h_t = f(W_{hh} h_{t−1} + W_{xh} x_t + b_h)
y_t = g(W_{hy} h_t + b_y)

where h_t is the hidden state at time t, f and g are activation functions, and
W_{hh}, W_{xh}, b_h, W_{hy}, b_y are the parameters of the network.
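
As a concrete illustration, a single recurrent step can be computed directly in NumPy. The sketch below assumes small, arbitrary dimensions (a 3-dimensional input, a 4-dimensional hidden state, a scalar output) and takes f = tanh and g as the identity; these choices are illustrative, not prescribed by the formulation above.

import numpy as np

# Assumed sizes for this sketch: 3-dim input, 4-dim hidden state, scalar output
input_dim, hidden_dim, output_dim = 3, 4, 1

rng = np.random.default_rng(0)
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1   # hidden-to-hidden weights
W_xh = rng.standard_normal((hidden_dim, input_dim)) * 0.1    # input-to-hidden weights
W_hy = rng.standard_normal((output_dim, hidden_dim)) * 0.1   # hidden-to-output weights
b_h = np.zeros(hidden_dim)
b_y = np.zeros(output_dim)

h_prev = np.zeros(hidden_dim)           # h_{t-1}, the previous hidden state
x_t = rng.standard_normal(input_dim)    # x_t, the current input

h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)   # f taken as tanh
y_t = W_hy @ h_t + b_y                            # g taken as the identity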
Algorithmically, a neural network for time series analysis can be trained using
a variant of gradient descent, which is an optimization method that iteratively
updates the parameters of the network to minimize a loss function that measures
the difference between the predicted output and the true output. For example,
a common loss function for time series prediction is the mean squared error
(MSE), which is defined as:
MSE = \frac{1}{T} \sum_{t=1}^{T} (y_t − ŷ_t)^2

where ŷt is the predicted output at time t. The gradient descent algorithm can
be summarized as:
1. Initialize the parameters of the network randomly.
2. For each epoch (a full pass over the data):
(a) For each input-output pair (x1:T , y1:T ) in the data:
i. Feed the input sequence x1:T to the network and compute the
output sequence ŷ1:T .
ii. Compute the loss function MSE(y1:T , ŷ1:T ) and its gradient with
respect to the parameters of the network.
iii. Update the parameters of the network using a learning rate α and the
gradient: W ← W − α ∂MSE/∂W, where W represents any parameter
of the network.
(b) Evaluate the performance of the network on a validation set and
adjust the learning rate or stop the training if necessary.
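
As a minimal sketch of the update rule in step 2(a)iii, the following NumPy code fits a single-parameter linear predictor ŷ_t = w·x_t by gradient descent on the MSE. It only illustrates the loop structure (forward pass, loss, gradient, update) on the simplest possible model; a full RNN additionally requires backpropagation through time, which the library code in Section 6 handles automatically.

import numpy as np

rng = np.random.default_rng(0)
T = 50
x = rng.standard_normal(T)
y = 2.0 * x + 0.1 * rng.standard_normal(T)    # synthetic data with true parameter w = 2

w = 0.0        # 1. initialize the parameter
alpha = 0.1    # learning rate

for epoch in range(100):                      # 2. full passes over the data
    y_hat = w * x                             # i. forward pass: predicted outputs
    mse = np.mean((y - y_hat) ** 2)           # ii. loss
    grad = -2.0 * np.mean((y - y_hat) * x)    #     gradient dMSE/dw
    w = w - alpha * grad                      # iii. parameter update

print(w)   # converges towards 2.0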

3 Mathematical Formulation
3.1 RNN Update Equation
In an RNN, the hidden state ht at time t evolves through the following update
equation:

h_t = σ(W_{hh} h_{t−1} + W_{xh} X_t + b_h)

Here, h_t is the hidden state, σ is the activation function, W_{hh} and W_{xh} are
weight matrices, X_t is the input, and b_h is the bias for the hidden layer.

3.2 Output Calculation

The predicted output Ŷ_t is computed by applying weights W_{hy} to the hidden
state and adding a bias term b_y:

Ŷ_t = W_{hy} h_t + b_y

This formulation represents the core of how information is processed in an RNN.
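
Putting the update equation and the output calculation together, the forward pass over a whole sequence is a loop that carries the hidden state from one step to the next. The following NumPy sketch uses assumed sizes (scalar inputs and outputs, a 4-dimensional hidden state) and σ = tanh purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
hidden_dim = 4                                   # assumed hidden-state size
T = 20
X = np.sin(0.3 * np.arange(T)).reshape(T, 1)     # toy input sequence, one feature per step

W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
W_xh = rng.standard_normal((hidden_dim, 1)) * 0.1
W_hy = rng.standard_normal((1, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)
b_y = np.zeros(1)

h = np.zeros(hidden_dim)                         # initial hidden state h_0
Y_hat = np.zeros((T, 1))
for t in range(T):
    h = np.tanh(W_hh @ h + W_xh @ X[t] + b_h)    # hidden-state update (Section 3.1)
    Y_hat[t] = W_hy @ h + b_y                    # output calculation (Section 3.2)

print(Y_hat.shape)                               # (20, 1): one prediction per time step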

4 Algorithmic Overview
4.1 RNN Training Algorithm
The training of an RNN involves iterative steps:

1. Initialization: Set the initial hidden state h_0 and configure the model
parameters.
2. Forward Pass: For each time step t,
(a) update the hidden state using the RNN update equation;
(b) compute the predicted output using the updated hidden state.
3. Backward Pass: Calculate the loss between the predicted and actual values
and propagate its gradient backward through the unrolled time steps.
4. Gradient Descent: Update the weights and biases to minimize the loss.

This iterative process fine-tunes the model parameters to make more accurate
predictions over time.
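
A hedged sketch of one such training iteration, written with TensorFlow (the library used in the code example of Section 6), is shown below; the batch shapes, layer sizes, and random data are placeholder assumptions.

import numpy as np
import tensorflow as tf

# Placeholder batch: 8 sequences of 10 time steps with one feature each
X = np.random.randn(8, 10, 1).astype("float32")
Y = np.random.randn(8, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 1)),
    tf.keras.layers.SimpleRNN(16),   # recurrent layer carrying the hidden state
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()

# One iteration: forward pass, backward pass, gradient-descent update
with tf.GradientTape() as tape:
    Y_hat = model(X, training=True)   # forward pass over all time steps
    loss = loss_fn(Y, Y_hat)          # loss between predicted and actual values
grads = tape.gradient(loss, model.trainable_variables)             # backward pass
optimizer.apply_gradients(zip(grads, model.trainable_variables))   # update weights and biases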

5 Graphical Representation
5.1 Recurrent Structure
The graphical representation in Figure 1 illustrates the recurrent connec-
tions within an RNN. Each block corresponds to a time step, and arrows
depict the flow of information through time. These connections allow the
RNN to capture dependencies in the time series data.

Figure 1: Recurrent structure of an RNN. Successive inputs X_t, X_{t+1}, X_{t+2}
feed a chain of RNN blocks that produce outputs Y_t, Y_{t+1}, Y_{t+2}, with the
hidden state passed from each block to the next.

6 Python Code Example


Here is a sample Python code for time series prediction using an RNN:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Generate synthetic time series data
np.random.seed(42)
timesteps = 100
series = np.sin(0.1 * np.arange(timesteps)) + np.random.randn(timesteps) * 0.1

# Prepare data for RNN: windows of 10 past values predict the next value
X = []
Y = []
for i in range(timesteps - 10):
    X.append(series[i:i + 10])
    Y.append(series[i + 10])

X = np.array(X).reshape(-1, 10, 1)  # Reshape for RNN input: (samples, timesteps, features)
Y = np.array(Y)

# Build the RNN model
model = Sequential([
    SimpleRNN(20, activation='relu', input_shape=[10, 1]),
    Dense(1)
])

# Compile the model
model.compile(optimizer='adam', loss='mse')

# Train the model
model.fit(X, Y, epochs=20)

# Generate predictions
future_steps = 10
future_data = series[-10:]  # Use the last 10 steps as input for predicting the next steps

for _ in range(future_steps):
    X_future = future_data[-10:].reshape(1, 10, 1)
    Y_pred = model.predict(X_future)
    future_data = np.append(future_data, Y_pred[0, -1])

# Plot the original and predicted time series
plt.plot(series, label='Original Data')
plt.plot(np.arange(timesteps, timesteps + future_steps),
         future_data[-future_steps:], label='Predicted Data')
plt.legend()
plt.show()

7 Appendix
7.1 Introduction to Neural Networks
Neural networks are a class of machine learning models inspired by the
human brain. They consist of interconnected nodes, or neurons, organized
into layers: input, hidden, and output layers. The network learns to make
predictions or decisions based on input data.

7.2 Mathematics of Neurons


7.3 Neuron Activation
A neuron’s output is determined by an activation function applied to the
weighted sum of its inputs and a bias:

Output = Activation( \sum_{i=1}^{n} input_i × weight_i + bias )

The activation function introduces non-linearity, crucial for the network to
learn complex patterns.
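
For example, a single neuron with a sigmoid activation can be evaluated in a few lines of NumPy; the input values, weights, and bias below are arbitrary illustrative numbers.

import numpy as np

inputs = np.array([0.5, -1.2, 3.0])    # example input values
weights = np.array([0.4, 0.1, -0.6])   # example weights
bias = 0.2

z = np.dot(inputs, weights) + bias     # weighted sum of inputs plus bias
output = 1.0 / (1.0 + np.exp(-z))      # sigmoid activation introduces non-linearity
print(output)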

7.4 Neural Network Structure

7.5 Feedforward Architecture


In a feedforward neural network, information flows in one direction, from
the input layer to the output layer. Each layer’s output becomes the input
for the next layer. The entire process can be represented mathematically
as a composition of functions.

7.6 Backpropagation
Training a neural network involves minimizing a loss function, typically
calculated using the mean squared error for regression or cross-entropy
for classification. Backpropagation is used to adjust weights and biases to
minimize this loss. It consists of a forward pass to calculate outputs and
a backward pass to update weights using the chain rule.

7.6.1 Forward Pass

During the forward pass, the output of each neuron is calculated layer by
layer until the final output is obtained. For a given layer l, the weighted
sum z_i^{(l)} and the activated output a_i^{(l)} are computed as:

z_i^{(l)} = \sum_{j=1}^{n^{(l−1)}} w_{ij}^{(l)} a_j^{(l−1)} + b_i^{(l)}

a_i^{(l)} = Activation(z_i^{(l)})

Here, w_{ij}^{(l)} represents the weight from neuron j in layer (l − 1) to neuron
i in layer l, and b_i^{(l)} is the bias for neuron i in layer l.
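
A vectorized NumPy sketch of these per-layer computations is given below; the layer sizes are assumed for illustration and ReLU stands in for the activation function.

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
layer_sizes = [3, 5, 2]   # assumed widths of the input, hidden, and output layers
weights = [rng.standard_normal((layer_sizes[l + 1], layer_sizes[l])) * 0.1
           for l in range(len(layer_sizes) - 1)]
biases = [np.zeros(layer_sizes[l + 1]) for l in range(len(layer_sizes) - 1)]

a = rng.standard_normal(layer_sizes[0])   # a^(0): the input vector
for W, b in zip(weights, biases):
    z = W @ a + b                         # z^(l) = W^(l) a^(l-1) + b^(l)
    a = relu(z)                           # a^(l) = Activation(z^(l))
print(a)                                  # activations of the final layer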

7.6.2 Backward Pass

The backward pass involves computing the gradients of the loss with re-
spect to the weights and biases, starting from the output layer and mov-
ing backward through the network. The gradients are used to update the
weights and biases to minimize the loss.
The update rule for weights using gradient descent is given by:

Δw_{ij}^{(l)} = −η \frac{∂L}{∂w_{ij}^{(l)}}

Here, η is the learning rate, and ∂L/∂w_{ij}^{(l)} represents the partial derivative of
the loss with respect to the weight w_{ij}^{(l)}.
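
To make the update rule concrete, the following NumPy sketch applies it to a single linear output neuron trained with a squared-error loss, where ∂L/∂w can be written in closed form; the sizes, data, and learning rate are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
a_prev = rng.standard_normal(4)   # activations a^(l-1) from the previous layer (assumed)
W = 0.1 * rng.standard_normal(4)  # weights of a single linear output neuron
b = 0.0
y_true = 1.5                      # target value
eta = 0.05                        # learning rate η

# Forward pass through the output neuron
y_hat = W @ a_prev + b
loss = (y_true - y_hat) ** 2      # squared-error loss L

# Backward pass: ∂L/∂W = ∂L/∂ŷ · ∂ŷ/∂W = -2 (y_true - ŷ) · a_prev
grad_W = -2.0 * (y_true - y_hat) * a_prev
grad_b = -2.0 * (y_true - y_hat)

# Gradient-descent update: Δw = -η ∂L/∂w
W = W - eta * grad_W
b = b - eta * grad_b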

7.7 Algorithm

Algorithm 1 Neural Network Training (Stochastic Gradient Descent)

1: Input: Training data {(X^{(i)}, Y^{(i)})}, learning rate η, number of epochs N_epochs
2: Initialize: Randomly initialize the weights and biases of the network
3: for epoch = 1 to N_epochs do
4:   for each training example (X, Y) in {(X^{(i)}, Y^{(i)})} do
5:     // Forward pass
6:     for each layer l do
7:       Compute z^{(l)} = W^{(l)} · a^{(l−1)} + b^{(l)}
8:       Compute a^{(l)} = Activation(z^{(l)})
9:     // Compute loss
10:    Compute L = Loss(Y, a^{(output)})
11:    // Backward pass (backpropagation)
12:    for each layer l in reverse order do
13:      Compute δ^{(l)} = ∂L/∂z^{(l)} ⊙ Activation′(z^{(l)})
14:      Update weights: W^{(l)} = W^{(l)} − η · δ^{(l)} · (a^{(l−1)})^T
15:      Update biases: b^{(l)} = b^{(l)} − η · δ^{(l)}
