R Time Series Modeling on Weekly Data Using ts() Object

Last Updated : 22 Aug, 2024

Time series modeling refers to the analysis and forecasting of data points collected or recorded at specific time intervals. This type of modeling focuses on identifying patterns, trends, and seasonal variations within a dataset that is sequentially ordered in time. It is important because many real-world processes and phenomena change over time. It helps in forecasting future events, optimizing resources, and making strategic decisions in various fields. For example, it can predict stock prices, weather conditions, sales performance, and economic indicators, which are all essential for planning and decision-making.

What is ts() Object?

A ts object in R, created with the ts() function, represents a regularly spaced time series. It stores sequential observations recorded at equal intervals, together with the time index that describes them.

  • Time Series Data: The data in a ts() object is expected to be ordered and equally spaced in time, such as daily, monthly, quarterly, or yearly data.
  • Attributes:
    • Start: Indicates the time at which the series begins.
    • End: Specifies when the series ends (derived automatically from the start, the number of observations, and the frequency).
    • Frequency: Refers to the number of observations per unit of time. For example, 12 for monthly data, 4 for quarterly data, and 52 for weekly data.
  • Methods: Various functions are available in R for analyzing, visualizing, and forecasting time series data stored in a ts object. Examples include plot(), summary(), window(), and diff() from base R, and forecast() from the forecast package.
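
The attributes above can be inspected directly on a small series. The following is a minimal sketch using base R only; the values are simulated and the name weekly_ts is an assumption made for illustration, not part of the dataset used later.

```r
# Hypothetical weekly values: 104 simulated observations (two years of weeks)
set.seed(42)
weekly_values <- 100 + cumsum(rnorm(104))

# Series starting in week 1 of 2020, with 52 observations per year
weekly_ts <- ts(weekly_values, start = c(2020, 1), frequency = 52)

start(weekly_ts)      # 2020 1
end(weekly_ts)        # 2021 52
frequency(weekly_ts)  # 52

# window() extracts a sub-period; here, the second year only
second_year <- window(weekly_ts, start = c(2021, 1))
length(second_year)   # 52
```

Note that end() is never supplied by hand here: R derives it from the start, the frequency, and the number of observations.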

Time Series Components

  1. Trend: The trend component represents the long-term movement or direction in the data. Trends can be upward, downward, or flat.
  2. Seasonality: Seasonality refers to regular and predictable variations that occur within specific time intervals, such as daily, monthly, or quarterly. These variations repeat over a consistent period and are influenced by external factors like holidays, weather, or business cycles.
  3. Noise: Noise represents the random fluctuations or irregularities in the data that cannot be attributed to trend or seasonality. It occurs due to random events or errors.
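
To make the three components concrete, the sketch below builds a synthetic weekly series from a known trend, a yearly seasonal cycle, and random noise, then separates them again with decompose() from base R. All numbers are simulated assumptions chosen for illustration.

```r
# Simulate three years of weekly data from known components
set.seed(1)
n <- 156
t <- seq_len(n)
trend    <- 50 + 0.1 * t               # steady upward trend
seasonal <- 5 * sin(2 * pi * t / 52)   # pattern repeating every 52 weeks
noise    <- rnorm(n, sd = 1)           # random fluctuations

y <- ts(trend + seasonal + noise, start = c(2020, 1), frequency = 52)

# Classical additive decomposition: y = trend + seasonal + random
parts <- decompose(y, type = "additive")

# parts$trend, parts$seasonal, and parts$random hold the estimated components;
# parts$figure holds one seasonal effect per week of the year
length(parts$figure)
plot(parts)
```

decompose() needs at least two full seasonal periods, which is why the sketch simulates three years rather than a few weeks.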

The following steps walk through time series modeling on weekly data using a ts object in the R programming language.

Step 1: Loading the Packages and Dataset

To build a weekly time series in R, first import the data from its source. The example below reads a CSV file directly from a URL:

Dataset link: timeseries.csv

R
# Install the packages once (skip these lines if they are already installed)
install.packages("readr")
install.packages("forecast")

# Load the packages
library(readr)
library(forecast)

# Load data from the URL
url <- "https://github.com/plotly/datasets/raw/master/timeseries.csv"
data <- read_csv(url)

# Display the first few rows of the dataset
head(data)

Output:

Rows: 11 Columns: 8
── Column specification ────────────────────────────────────────────────────
Delimiter: ","
dbl  (7): A, B, C, D, E, F, G
date (1): Date

# A tibble: 6 × 8
  Date           A     B     C     D     E     F     G
  <date>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2008-03-18  24.7  165.  115.  26.3  19.2  28.9  63.4
2 2008-03-19  24.2  165.  115.  26.2  19.1  27.8  60.0
3 2008-03-20  24.0  165.  115.  25.8  19.0  27.0  59.6
4 2008-03-25  24.1  164.  115.  27.4  19.6  27.8  59.4
5 2008-03-26  24.4  163.  115.  26.9  19.5  28.0  60.1
6 2008-03-27  24.4  163.  115.  27.1  19.7  28.2  59.6
  • install.packages("readr") and install.packages("forecast"): Install the readr package for reading CSV files and the forecast package for time series modeling.
  • library(readr) and library(forecast): Load the installed packages into the R session.
  • read_csv(url): Read the CSV file directly from the URL into a data frame called data.
  • head(data): Display the first few rows of the dataset to understand its structure and contents.

Step 2: Preparing the Data

Convert the Date column and extract values from column A for time series analysis.

R
# Convert Date column to Date type
data$Date <- as.Date(data$Date, format = "%Y-%m-%d")

# Extract the 'A' column and ensure it's numeric
values <- as.numeric(data$A)

# Check for any NA values in the 'A' column
sum(is.na(values))

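# Determine the start date and its position within the year
start_date <- min(data$Date)
start_year <- as.numeric(format(start_date, "%Y"))
# %U is the week of the year (00-53, weeks starting on Sunday); add 1 for a 1-based week
start_week <- as.numeric(format(start_date, "%U")) + 1

# Define frequency (assuming weekly data: 52 observations per year)
frequency <- 52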

# Create the time series object
ts_data <- ts(values, start = c(start_year, start_week), frequency = frequency)

Output:

[1] 0
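  • Use min(data$Date) to find the start date and extract the year and week.
  • Define the frequency (52 for weekly data).
  • Create the ts object with the values, start, and frequency.
  • Note that ts() does not read the Date column: it assumes each value follows the previous one by exactly one period. The preview in Step 1 shows consecutive calendar dates, so these rows are daily rather than weekly observations, and frequency = 52 treats each row as one week purely for illustration.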
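
When the source rows are daily, one option is to aggregate them to weekly values before calling ts(). The sketch below uses base R only and a hypothetical data frame named daily with simulated values; weekly means are one reasonable choice, and sums or last observations may suit other data better.

```r
# Hypothetical daily data: 70 consecutive days starting on a Monday
daily <- data.frame(
  Date = seq(as.Date("2023-01-02"), by = "day", length.out = 70),
  A    = seq(10, by = 0.5, length.out = 70)
)

# Label each row with the Monday that starts its week
week_start <- cut(daily$Date, breaks = "week")

# Average the daily values within each week
weekly_mean <- tapply(daily$A, week_start, mean)

# Build the weekly series, reusing the year/week start convention from Step 2
first_week <- as.Date(names(weekly_mean)[1])
weekly_ts <- ts(
  as.numeric(weekly_mean),
  start = c(as.numeric(format(first_week, "%Y")),
            as.numeric(format(first_week, "%U")) + 1),
  frequency = 52
)

length(weekly_ts)  # one value per week: 10
```

After aggregation, each element of weekly_ts really does represent one week, so frequency = 52 describes the data rather than relabeling it.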

Step 3: Performing Basic Modeling

  • Fit ARIMA and Exponential Smoothing models to the ts_data.
  • Generate forecasts for the next 10 periods using both models.
R
# Fit ARIMA model
arima_model <- auto.arima(ts_data)
summary(arima_model)

# Forecast the next 10 periods and print the result
forecast_arima <- forecast(arima_model, h = 10)
forecast_arima

# Fit Exponential Smoothing model
ets_model <- ets(ts_data)

# Forecast the next 10 periods
forecast_ets <- forecast(ets_model, h = 10)

Output:

Series: ts_data
ARIMA(2,0,0) with non-zero mean

Coefficients:
         ar1      ar2     mean
      0.6441  -0.7652  24.2385
s.e.  0.2781   0.1941   0.0476

sigma^2 = 0.03232:  log likelihood = 4.07
AIC=-0.13   AICc=6.53   BIC=1.46

Training set error measures:
                      ME      RMSE       MAE         MPE      MAPE MASE       ACF1
Training set 0.003029856 0.1533215 0.1179522 0.008829293 0.4871729  NaN -0.3090926

         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2008.423       24.46344 24.23304 24.69385 24.11107 24.81582
2008.442       24.30574 24.03167 24.57980 23.88659 24.72489
2008.462       24.10969 23.82398 24.39539 23.67274 24.54664
2008.481       24.10409 23.77388 24.43430 23.59907 24.60910
2008.500       24.25050 23.91725 24.58375 23.74085 24.76016
2008.519       24.34910 24.00180 24.69640 23.81796 24.88024
2008.538       24.30057 23.93989 24.66125 23.74896 24.85218
2008.558       24.19386 23.83298 24.55474 23.64194 24.74578
2008.577       24.16226 23.79211 24.53240 23.59617 24.72835
2008.596       24.22356 23.85084 24.59628 23.65354 24.79359
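  • auto.arima(ts_data): Searches over candidate ARIMA orders and returns the model with the lowest information criterion (AICc by default). Here it selects a non-seasonal ARIMA(2,0,0) with a non-zero mean.
  • forecast(arima_model, h = 10): Forecasts the next 10 periods using the ARIMA model, returning point forecasts with 80% and 95% prediction intervals.
  • ets(ts_data): Fits an Exponential Smoothing state space model, choosing the error, trend, and seasonal components automatically.
  • forecast(ets_model, h = 10): Forecasts the next 10 periods using the Exponential Smoothing model.
  • With only 11 observations, far fewer than one full cycle of 52, neither model can estimate a seasonal pattern. This is likely also why MASE is reported as NaN, since its scaling relies on differences one seasonal period apart.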
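
To show where the interval columns in a forecast table come from, the sketch below fits the same model form, an AR(2) with a mean, using arima() and predict() from base R instead of the forecast package. The series is simulated, so its coefficients and forecasts are assumptions for illustration and will not match the output above.

```r
# Simulate a stationary AR(2) series around a mean of 24 (hypothetical data)
set.seed(123)
sim <- arima.sim(model = list(ar = c(0.6, -0.7)), n = 120, sd = 0.2)
y <- ts(24 + as.numeric(sim), start = c(2020, 1), frequency = 52)

# Fit ARIMA(2,0,0) with a mean term, the form auto.arima() selected above
fit <- arima(y, order = c(2, 0, 0))

# Point forecasts and their standard errors for the next 10 periods
fc <- predict(fit, n.ahead = 10)

# A 95% prediction interval is the point forecast plus or minus
# the normal quantile times the forecast standard error
z <- qnorm(0.975)
lower95 <- fc$pred - z * fc$se
upper95 <- fc$pred + z * fc$se

cbind(forecast = fc$pred, lower95, upper95)
```

The standard errors grow with the horizon, which is why prediction intervals widen the further ahead the forecast reaches.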

Step 4: Visualizing Results

Plot the original time series data and forecasts to visualize the results.

R
# Define a 3-row layout
par(mfrow = c(3, 1))  # 3 rows, 1 column

# Plot the time series data
plot(ts_data, main = "Time Series Data", ylab = "Values", xlab = "Time")

# Plot ARIMA forecast
plot(forecast_arima, main = "ARIMA Forecast")

# Plot Exponential Smoothing forecast
plot(forecast_ets, main = "Exponential Smoothing Forecast")

# Reset to default layout
par(mfrow = c(1, 1))  # Back to 1 plot per page

Output:

[Figure: three stacked panels titled "Time Series Data", "ARIMA Forecast", and "Exponential Smoothing Forecast"]
  • plot(ts_data, main = "Time Series Data", ylab = "Values", xlab = "Time"): Plots the original time series data with appropriate labels for the title, y-axis, and x-axis.
  • plot(forecast_arima, main = "ARIMA Forecast"): Plots the forecast results from the ARIMA model, including point forecasts and shaded 80% and 95% prediction intervals.
  • plot(forecast_ets, main = "Exponential Smoothing Forecast"): Plots the forecast results from the Exponential Smoothing model, including point forecasts and shaded 80% and 95% prediction intervals.

Conclusion

When working with time series data in R, it is essential to ensure that the data is properly cleaned and formatted before modeling. Always verify the column names and data types to avoid errors during analysis. Use appropriate frequency settings to accurately capture the temporal structure of the data, and choose the modeling technique that best fits the characteristics of your time series. Additionally, regularly visualize your data and forecasts to interpret results effectively and identify any anomalies or patterns.
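
One common way to compare modeling techniques is a holdout test: fit each model on the early part of the series and score its forecasts against the final observations. The sketch below uses base R only, with HoltWinters() simple exponential smoothing standing in for ets(); the data are simulated and the names train, test, and scores are assumptions for illustration.

```r
# Hypothetical weekly series: 130 simulated observations
set.seed(7)
sim <- arima.sim(model = list(ar = 0.7), n = 130)
y <- ts(50 + as.numeric(sim), start = c(2021, 1), frequency = 52)

# Hold out the last 10 observations for evaluation
n <- length(y)
h <- 10
train <- ts(y[1:(n - h)], start = start(y), frequency = frequency(y))
test  <- as.numeric(y[(n - h + 1):n])

# Candidate 1: AR(1) model with a mean term
arima_fit <- arima(train, order = c(1, 0, 0))
arima_fc  <- as.numeric(predict(arima_fit, n.ahead = h)$pred)

# Candidate 2: simple exponential smoothing (no trend, no seasonality)
ses_fit <- HoltWinters(train, beta = FALSE, gamma = FALSE)
ses_fc  <- as.numeric(predict(ses_fit, n.ahead = h))

# Root mean squared error on the holdout period; lower is better
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
scores <- c(arima = rmse(test, arima_fc), ses = rmse(test, ses_fc))
scores
```

A single holdout split is a rough guide only; with short series, the ranking can change depending on which observations are held out.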

