Root-Mean-Square Error in R Programming
Last Updated: 15 Jul, 2025
Root Mean Squared Error (RMSE) is the square root of the mean of the squared errors. It is a useful error metric for numerical predictions. Because RMSE is scale-dependent, it is mainly used to compare the prediction errors of different models or configurations for the same variable, not across variables measured on different scales. In regression, RMSE measures how closely the fitted values match the observed data.
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2}
Where:
- y_i = actual value
- \hat{y_i} = predicted value
- n = number of observations
Note: The differences between the actual values and the predicted values are known as residuals.
Significance of RMSE
The following properties explain why RMSE matters and how to use it.
- Scale-Dependent: RMSE has the same units as the target variable. A lower RMSE indicates better model performance, but the value must be compared with the scale of the target variable to make sense.
- Sensitive to Outliers: Since RMSE squares the error terms, larger errors have a disproportionately large effect, making RMSE sensitive to outliers.
- Comparing Models: RMSE can be used to compare models. A model with a lower RMSE value is generally considered better at predicting the target variable.
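The outlier sensitivity described above can be illustrated with a short sketch. The vectors below are made-up values chosen for easy arithmetic (they do not come from any dataset), and the variable names are only illustrative.

```r
# Made-up values for illustration only
y      <- c(10, 12, 14, 16, 18)
yhat_a <- c(11, 11, 15, 15, 19)   # every prediction is off by 1 unit
yhat_b <- c(11, 11, 15, 15, 28)   # same, except the last one is off by 10 units

# RMSE computed directly from the formula
rmse_a <- sqrt(mean((y - yhat_a)^2))
rmse_b <- sqrt(mean((y - yhat_b)^2))

rmse_a
rmse_b
```

In the second case the single large error contributes 100 of the 104 units of total squared error, so it dominates the RMSE even though the other four predictions are unchanged.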
Computing RMSE in R
Now we will discuss different methods to compute RMSE in the R programming language.
1. Simple RMSE Calculation
Let’s first compute the RMSE between two vectors (actual and predicted values) manually.
R
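# Observed values and the corresponding model predictions
actual <- c(1.5, 1.0, 2.0, 7.4, 5.8, 6.6)
predicted <- c(1.0, 1.1, 2.5, 7.3, 6.0, 6.2)

# Square the residuals, average them, then take the square root
rmse <- sqrt(mean((actual - predicted)^2))
rmse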
Output:
[1] 0.3464102
The above code calculates the RMSE between the actual and predicted values manually by following the RMSE formula.
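To make the link to the formula explicit, the same calculation can be broken into its intermediate steps. This is only a sketch using the same two vectors; the intermediate variable names are chosen here for illustration.

```r
actual    <- c(1.5, 1.0, 2.0, 7.4, 5.8, 6.6)
predicted <- c(1.0, 1.1, 2.5, 7.3, 6.0, 6.2)

residuals_vec  <- actual - predicted     # y_i - y_hat_i for each observation
squared_errors <- residuals_vec^2        # squaring removes the sign
mse            <- mean(squared_errors)   # (1/n) * sum of squared errors
rmse_steps     <- sqrt(mse)              # back to the units of the target

residuals_vec
mse
rmse_steps
```

The mean squared error here is approximately 0.12, and its square root matches the value obtained with the one-line version.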
2. Calculating RMSE Using the Metrics Package
The Metrics package offers a convenient rmse() function. First, install and load the package.
R
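# Install once; this line can be skipped on later runs
install.packages("Metrics")
library(Metrics)

# Uses the actual and predicted vectors defined in the previous example
rmse_value <- rmse(actual, predicted)
rmse_value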
Output:
[1] 0.3464102
3. Calculating RMSE Using the caret Package
The caret package is a popular package for machine learning and model evaluation. It provides a similar RMSE() function, which takes the predicted values as its first argument and the observed values as its second.
R
install.packages("caret")
library(caret)
rmse_value <- RMSE(predicted, actual)
rmse_value
Output:
[1] 0.3464102
4. Calculating RMSE for Regression Models
In regression models, RMSE is used to evaluate the performance of the model. Let’s fit a linear regression model in R and compute the RMSE for the predicted values.
R
data(mtcars)
model <- lm(mpg ~ hp, data = mtcars)
predicted_values <- predict(model, mtcars)
actual_values <- mtcars$mpg
rmse_regression <- sqrt(mean((actual_values - predicted_values)^2))
rmse_regression
Output:
[1] 3.740297
This example fits a linear regression model predicting the miles per gallon (mpg) of cars based on horsepower (hp) and computes the RMSE to evaluate the model's prediction accuracy.
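Since RMSE is often used to compare models for the same target, the sketch below fits a second model that also uses weight (wt) and compares the two RMSE values. The second model and the helper name rmse_of are assumptions added here for illustration; they are not part of the example above.

```r
data(mtcars)

# Hypothetical helper: RMSE computed directly from the formula
rmse_of <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Two candidate models for the same target variable (mpg)
model_hp    <- lm(mpg ~ hp, data = mtcars)
model_hp_wt <- lm(mpg ~ hp + wt, data = mtcars)

rmse_hp    <- rmse_of(mtcars$mpg, predict(model_hp, mtcars))
rmse_hp_wt <- rmse_of(mtcars$mpg, predict(model_hp_wt, mtcars))

c(hp = rmse_hp, hp_wt = rmse_hp_wt)
```

Both values are computed on the same data used for fitting. For least-squares models, adding a predictor cannot increase this in-sample RMSE, so a fairer comparison would evaluate both models on data held out from training.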
Interpreting RMSE requires relating its value to the scale of the target variable.
- Low RMSE: Indicates that the model's predictions are close to the actual values.
- High RMSE: Indicates large errors in prediction.
However, the RMSE value should always be interpreted in the context of the data. For example, an RMSE of 10 might be considered good for a dataset where the target variable ranges between 100 and 500, but it could indicate poor performance if the target variable ranges between 0 and 20.
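The two scenarios in this example can be put into numbers by dividing the RMSE by the range of the target variable. This ratio is only one possible way to express the error relative to scale and is used here as an illustration; the ranges are those quoted in the paragraph above.

```r
rmse_value <- 10

# Target ranges taken from the example in the text
range_wide   <- c(100, 500)
range_narrow <- c(0, 20)

# RMSE as a fraction of the target's range (illustrative ratio)
ratio_wide   <- rmse_value / diff(range_wide)
ratio_narrow <- rmse_value / diff(range_narrow)

ratio_wide    # 0.025
ratio_narrow  # 0.5
```

The same RMSE of 10 is 2.5% of the range in the first case but 50% of the range in the second, which is why it reads as a small error for one dataset and a large error for the other.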
5. Visualizing RMSE
Visualizing the performance of our model can help in understanding where the model is underperforming. A scatter plot of actual vs. predicted values can provide insights into how well the model fits the data.
R
plot(actual_values, predicted_values, xlab = "Actual", ylab = "Predicted",
main = "Actual vs Predicted Values")
abline(0, 1, col = "red")
Output:
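[Figure: scatter plot titled "Actual vs Predicted Values", with actual values on the x-axis, predicted values on the y-axis, and a red line where actual = predicted]
The closer the points are to the red line (where actual = predicted), the better the model's predictions.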