
Time Series Forecasting Using TensorFlow in R

Last Updated : 20 Aug, 2024

Time series forecasting involves using past data collected at regular intervals to predict future values of a variable that changes over time. By analyzing historical data, we can understand trends, seasonal patterns, and cyclical behaviors, which helps in making more informed decisions.

Applications and Use Cases

Time series forecasting helps organizations plan better, manage inventory, and allocate resources efficiently. By anticipating future demand, companies can streamline their supply chains and operations.

  • Retail and E-commerce: Forecasting sales and demand helps manage inventory levels and plan promotions effectively.
  • Finance: Predicting stock prices, currency exchange rates, and market trends supports investment decisions and financial planning.
  • Energy Sector: Forecasting energy consumption and production helps in efficient energy management and maintaining grid stability.
  • Transportation and Logistics: Predicting travel demand, shipment arrivals, and fleet needs enhances operational efficiency.
  • Weather Forecasting: Accurate weather predictions are crucial for planning and preparing for weather-related events and disasters.
  • Financial Analysis: Forecasting stock prices, interest rates, and economic indicators supports risk assessment and informed decisions by investors and policymakers.
  • Economic and Business Insights: Forecasting identifies economic trends and cycles, supporting budgeting, policy-making, and economic analysis.
  • Health and Medicine: It helps predict patient outcomes, disease outbreaks, and resource needs in healthcare settings, improving planning and response.

Methods for Time Series Forecasting

1. Statistical Methods:

  • ARIMA (AutoRegressive Integrated Moving Average): Combines autoregressive and moving average components with differencing to make the time series stationary.
  • Exponential Smoothing Methods: Includes Simple Exponential Smoothing, Holt’s Linear Trend Model, and Holt-Winters Seasonal Model, which use weighted averages of past observations to forecast future values.
  • Seasonal Decomposition of Time Series (STL): Decomposes the time series into seasonal, trend, and residual components to understand and forecast based on these components.
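
As a quick illustration of the statistical approach, the sketch below fits an automatically selected ARIMA model to the built-in AirPassengers series; it is illustrative only and assumes the forecast package is installed:

R
# Illustrative ARIMA baseline using the forecast package (assumed installed)
library(forecast)

fit <- auto.arima(AirPassengers)   # automatically selects the ARIMA orders
fc  <- forecast(fit, h = 12)       # forecast 12 periods ahead
plot(fc)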

2. Machine Learning Methods:

  • Regression Models: Linear regression and its variants (e.g., Lasso, Ridge) can be used with time lags as features to forecast future values.
  • Decision Trees and Random Forests: Use tree-based methods to capture non-linear relationships in time series data.
  • Support Vector Machines (SVM): Used for regression tasks with time series data to forecast future values.
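
As a quick illustration of the machine-learning approach, the sketch below builds lagged features from a numeric series and fits an ordinary linear regression with base R; AirPassengers is used purely as a placeholder for your own series:

R
# Illustrative lag-feature regression with base R
y <- as.numeric(AirPassengers)      # placeholder series

lagged <- data.frame(
  target = y[3:length(y)],
  lag1   = y[2:(length(y) - 1)],    # value one step back
  lag2   = y[1:(length(y) - 2)]     # value two steps back
)

fit <- lm(target ~ lag1 + lag2, data = lagged)
predict(fit, newdata = data.frame(lag1 = tail(y, 1), lag2 = tail(y, 2)[1]))  # one-step forecast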

3. Deep Learning Methods:

  • Recurrent Neural Networks (RNNs): RNNs, especially Long Short-Term Memory (LSTM) networks, are designed to learn sequences and temporal dependencies in time series data.
  • Convolutional Neural Networks (CNNs): Used for extracting features from time series data, often combined with RNNs for improved forecasting performance.
  • Transformers: Modern architectures that handle long-range dependencies and have shown success in various time series forecasting tasks.

Now we will walk through a step-by-step implementation of time series forecasting using TensorFlow in the R programming language.

1. Installing TensorFlow and Required Packages in R

  • Install R and RStudio: Ensure you have R and RStudio installed on your system. You can download R from CRAN and RStudio from RStudio's website.
  • Install the tensorflow R Package: Open RStudio or your R console and install the tensorflow package using the following command:
R
install.packages("tensorflow")

  • Install TensorFlow: After installing the tensorflow package, you need to install TensorFlow itself. You can do this with the following commands in R, which install TensorFlow and its dependencies, including Keras, a high-level neural networks API that runs on top of TensorFlow.
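
A minimal sketch (the exact steps may vary with your Python setup and package versions):

R
library(tensorflow)
install_tensorflow()        # installs the TensorFlow Python backend

# The keras R package provides the high-level modeling API used below
install.packages("keras")
library(keras)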

2. Load and Inspect the Dataset

  • First, you need to load and prepare your time series data.
  • Dataset link: yahoo.csv
R
library(tidyverse)
library(keras)

# Load the data (adjust the path to wherever you saved yahoo_stock.csv)
data_path <- 'C:/Users/hp/Desktop/Shalini/yahoo_stock.csv'
data <- read_csv(data_path)

# Inspect the data
head(data)

Output:

# A tibble: 6 × 7
  Date        High   Low  Open Close     Volume `Adj Close`
  <date>     <dbl> <dbl> <dbl> <dbl>      <dbl>       <dbl>
1 2015-11-23 2096. 2081. 2089. 2087. 3587980000       2087.
2 2015-11-24 2094. 2070. 2084. 2089. 3884930000       2089.
3 2015-11-25 2093  2086. 2089. 2089. 2852940000       2089.
4 2015-11-26 2093  2086. 2089. 2089. 2852940000       2089.
5 2015-11-27 2093. 2084. 2089. 2090. 1466840000       2090.
6 2015-11-28 2093. 2084. 2089. 2090. 1466840000       2090.

3. Preprocessing the Data

Remove (or impute) any missing values. The Date column is dropped because it will not be used as a feature during training; keep a copy first if you need it for plotting later. Finally, scale the numeric columns, which helps the model converge during training.

R
# Remove rows with missing values
data <- na.omit(data)

# Drop the Date column; it is not used as a model feature
data <- data %>% select(-Date)

# Normalize the data
data_scaled <- scale(data)
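
Note that scale() stores the per-column means and standard deviations as attributes of the returned matrix; you can read them back later to invert the normalization (step 7 below does the equivalent manually with mean() and sd() of the Close column):

R
# Centering and scaling values saved by scale(); useful for undoing the normalization later
centers <- attr(data_scaled, "scaled:center")
scales  <- attr(data_scaled, "scaled:scale")
centers["Close"]
scales["Close"]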

4. Transform Data for TensorFlow

Reshape the data into overlapping sequences (sliding windows) of length window_size so that TensorFlow can learn from fixed-length input windows, each paired with the value that follows it.

R
# Parameters
window_size <- 10  # Length of each sequence
num_samples <- nrow(data_scaled) - window_size
n_features <- ncol(data_scaled)

# Initialize arrays
X_train <- array(0, dim = c(num_samples, window_size, n_features))
y_train <- numeric(num_samples)

# Populate the arrays
for (i in 1:num_samples) {
  X_train[i, , ] <- data_scaled[i:(i + window_size - 1), , drop = FALSE]
  y_train[i] <- data_scaled[i + window_size, "Close"]  # target: next day's scaled Close price
}
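
A quick sanity check of the resulting shapes (the exact first dimension depends on how many rows the CSV contains):

R
dim(X_train)     # num_samples x window_size x n_features
length(y_train)  # num_samples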

Split Data into Training and Validation Sets

R
# Split data
split_index <- floor(0.8 * num_samples)
X_train_split <- X_train[1:split_index, , ]
y_train_split <- y_train[1:split_index]
X_val_split <- X_train[(split_index + 1):num_samples, , ]
y_val_split <- y_train[(split_index + 1):num_samples]

5. Build and Train the Model

Define and compile the LSTM model for forecasting.

R
# Define the LSTM model
model <- keras_model_sequential() %>%
  layer_lstm(units = 50, input_shape = c(window_size, n_features)) %>%
  layer_dense(units = 1)

# Compile the model
model %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam()
)

# Train the model
history <- model %>% fit(
  x = X_train_split,
  y = y_train_split,
  epochs = 15,
  batch_size = 32,
  validation_data = list(X_val_split, y_val_split)
)

Output:

Epoch 1/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 16ms/step - loss: 9.9943e-04 - val_loss: 0.0176
Epoch 2/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 16ms/step - loss: 8.7285e-04 - val_loss: 0.0170
Epoch 3/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 9.3312e-04 - val_loss: 0.0180
Epoch 4/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 14ms/step - loss: 9.9692e-04 - val_loss: 0.0191
Epoch 5/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 0.0010 - val_loss: 0.0165
Epoch 6/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 9.5749e-04 - val_loss: 0.0156
Epoch 7/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 0.0010 - val_loss: 0.0150
Epoch 8/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 0.0010 - val_loss: 0.0178
Epoch 9/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 8.8574e-04 - val_loss: 0.0175
Epoch 10/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 0.0010 - val_loss: 0.0166
Epoch 11/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - loss: 9.6707e-04 - val_loss: 0.0153
Epoch 12/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 9.3460e-04 - val_loss: 0.0164
Epoch 13/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 9.3045e-04 - val_loss: 0.0148
Epoch 14/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 9.1485e-04 - val_loss: 0.0203
Epoch 15/15
46/46 ━━━━━━━━━━━━━━━━━━━━ 1s 14ms/step - loss: 8.7287e-04 - val_loss: 0.0145
Loss graph
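
The loss curves shown above can be reproduced directly from the history object returned by fit(), which has a plot method:

R
# Plot training and validation loss per epoch
plot(history)
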
  • keras_model_sequential(): Creates a sequential (layer-by-layer) model in Keras.
  • layer_lstm(units = 50, input_shape = c(window_size, n_features)): Adds an LSTM (Long Short-Term Memory) layer with 50 units. input_shape tells the layer to expect sequences of window_size time steps with n_features features each; because return_sequences is left at its default (FALSE), the layer outputs only the final hidden state of each sequence.
  • layer_dense(units = 1): Adds a Dense layer with a single unit that produces the final forecast value.
  • compile(): Configures the model for training by specifying the optimizer and loss function.
  • optimizer = optimizer_adam(): Uses the Adam optimizer, which adapts the learning rate during training.
  • loss = 'mean_squared_error': The mean squared error between predicted and actual (scaled) values is minimized during training.
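
If you want a deeper network, LSTM layers can be stacked; every LSTM layer except the last must set return_sequences = TRUE so that it passes the full sequence to the next layer. The following is a minimal sketch of such a variant (not the model used for the results above):

R
# Optional variant: a stacked (two-layer) LSTM
model_stacked <- keras_model_sequential() %>%
  layer_lstm(units = 50, return_sequences = TRUE,        # pass the full sequence to the next LSTM
             input_shape = c(window_size, n_features)) %>%
  layer_lstm(units = 50) %>%                             # final LSTM returns only its last state
  layer_dense(units = 1)

model_stacked %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam()
)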

6. Evaluate and Make Predictions

After training, evaluate the model and make predictions.

R
# Evaluate the model
model %>% evaluate(X_val_split, y_val_split)

# Make predictions
predictions <- model %>% predict(X_val_split)

Output:

$loss
[1] 0.0145312

7. Post-Processing

Convert predictions back to the original scale and visualize results.

R
# Assuming `predictions` is a matrix or array
predictions <- as.array(predictions)

# Check if predictions are in the right format
if (is.list(predictions)) {
  predictions <- as.numeric(unlist(predictions))
}

# Calculate mean and standard deviation from the original data
mean_value <- mean(data$Close, na.rm = TRUE)
sd_value <- sd(data$Close, na.rm = TRUE)

# Denormalize predictions
predictions_denorm <- predictions * sd_value + mean_value

# Check the result
head(predictions_denorm)

# Plot results
library(ggplot2)

# Assuming you have actual values for validation
actual_values <- y_val_split * sd_value + mean_value

# Create a comparison dataframe
comparison <- data.frame(
  Actual = actual_values,
  Predicted = predictions_denorm
)

# Plot the comparison
ggplot(comparison, aes(x = 1:nrow(comparison))) +
  geom_line(aes(y = Actual, color = "Actual")) +
  geom_line(aes(y = Predicted, color = "Predicted")) +
  labs(title = "Actual vs Predicted Stock Prices", x = "Time", y = "Price")

Output:

         [,1]
[1,] 3103.118
[2,] 3102.912
[3,] 3119.219
[4,] 3131.339
[5,] 3141.476
[6,] 3139.734
Actual vs predicted graph
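
Beyond the visual comparison, simple error metrics on the denormalized values give a quantitative view of accuracy; a minimal sketch:

R
# RMSE and MAE on the original price scale
pred_vec <- as.numeric(predictions_denorm)
rmse <- sqrt(mean((actual_values - pred_vec)^2))
mae  <- mean(abs(actual_values - pred_vec))
rmse
mae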

Conclusion

TensorFlow offers several advantages for time series forecasting in R. It provides flexibility to build complex and customized neural network architectures, such as LSTM and GRU models, which are particularly effective for capturing temporal dependencies. TensorFlow's scalability allows it to handle large datasets efficiently, and its integration with Keras simplifies the process of defining, training, and evaluating models. The comprehensive ecosystem of TensorFlow also includes various tools and libraries for model optimization and deployment.

