
How to Plot a Correlation Matrix into a Graph Using R

Last Updated : 31 Jul, 2024

A correlation matrix is a table of correlation coefficients between pairs of variables. It is a compact way to see which variables in a dataset move together and how strongly. Plotting the matrix as a graph makes those patterns much easier to spot than reading raw numbers. This article walks through the steps to plot a correlation matrix in the R programming language.

Introduction to Correlation Matrix

A correlation matrix is a symmetric matrix of correlation coefficients, each measuring the strength of the linear relationship between a pair of variables. The values range from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear correlation. The diagonal entries are always 1, because every variable is perfectly correlated with itself.
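To make the definition concrete, here is a minimal sketch that computes one Pearson coefficient by hand and compares it with R's built-in cor(). The variable names (manual_r, builtin_r, m) and the choice of the mpg and wt columns are illustrative assumptions, not part of the original walkthrough.

```r
# Pearson correlation computed by hand and checked against cor()
x <- mtcars$mpg
y <- mtcars$wt

# r = covariance divided by the product of the standard deviations
manual_r  <- cov(x, y) / (sd(x) * sd(y))
builtin_r <- cor(x, y)

# Both should agree up to floating-point error
all.equal(manual_r, builtin_r)

# Properties of a correlation matrix, checked on three columns
m <- cor(mtcars[, c("mpg", "wt", "hp")])
isSymmetric(m)                           # mirrored across the diagonal
all.equal(unname(diag(m)), c(1, 1, 1))   # ones on the diagonal
```

The same symmetry is why the plots later in the article can show only the upper or lower triangle without losing information.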

Libraries for Correlation Matrix Visualization

Several R packages can help you visualize a correlation matrix:

  • corrplot: Specialized for plotting correlation matrices.
  • ggcorrplot: Based on ggplot2, providing an elegant and customizable way to plot correlation matrices.
  • ggplot2: A general-purpose plotting package; combined with a reshaping step (reshape2 in this article), it can draw a fully customized correlation heatmap.
  • PerformanceAnalytics: Contains functions to create charts of correlations.

We'll use R's built-in mtcars dataset, which has 32 cars and 11 numeric variables, for this demonstration.

R
# Load necessary libraries
library(ggplot2)
library(corrplot)
library(ggcorrplot)
library(PerformanceAnalytics)

# Load the mtcars dataset
data(mtcars)
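Every column of mtcars is numeric, which is what cor() expects. With your own data that is often not the case, so a small filtering step may be needed first. The sketch below uses the built-in iris dataset purely as an illustration, because it contains a factor column; the names is_num and numeric_iris are assumptions made for this example.

```r
# cor() only accepts numeric input, so keep the numeric columns only.
# iris is used here purely as an illustration: Species is a factor.
data(iris)
is_num <- sapply(iris, is.numeric)
numeric_iris <- iris[, is_num, drop = FALSE]
names(numeric_iris)

# mtcars needs no filtering: every column is numeric
all(sapply(mtcars, is.numeric))
```

Passing a data frame that still contains factor or character columns to cor() stops with an error, so filtering up front is usually the simplest fix.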

Step 1: Calculate the Correlation Matrix

First, compute the correlation matrix using the cor() function, which uses the Pearson method by default.

R
# Calculate the correlation matrix
cor_matrix <- cor(mtcars)
print(cor_matrix)

Output:

            mpg        cyl       disp         hp        drat         wt        qsec
mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.68117191 -0.8676594  0.41868403
cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.69993811  0.7824958 -0.59124207
disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.71021393  0.8879799 -0.43369788
hp   -0.7761684  0.8324475  0.7909486  1.0000000 -0.44875912  0.6587479 -0.70822339
drat  0.6811719 -0.6999381 -0.7102139 -0.4487591  1.00000000 -0.7124406  0.09120476
wt   -0.8676594  0.7824958  0.8879799  0.6587479 -0.71244065  1.0000000 -0.17471588
qsec  0.4186840 -0.5912421 -0.4336979 -0.7082234  0.09120476 -0.1747159  1.00000000
vs    0.6640389 -0.8108118 -0.7104159 -0.7230967  0.44027846 -0.5549157  0.74453544
am    0.5998324 -0.5226070 -0.5912270 -0.2432043  0.71271113 -0.6924953 -0.22986086
gear  0.4802848 -0.4926866 -0.5555692 -0.1257043  0.69961013 -0.5832870 -0.21268223
carb -0.5509251  0.5269883  0.3949769  0.7498125 -0.09078980  0.4276059 -0.65624923
             vs          am       gear        carb
mpg   0.6640389  0.59983243  0.4802848 -0.55092507
cyl  -0.8108118 -0.52260705 -0.4926866  0.52698829
disp -0.7104159 -0.59122704 -0.5555692  0.39497686
hp   -0.7230967 -0.24320426 -0.1257043  0.74981247
drat  0.4402785  0.71271113  0.6996101 -0.09078980
wt   -0.5549157 -0.69249526 -0.5832870  0.42760594
qsec  0.7445354 -0.22986086 -0.2126822 -0.65624923
vs    1.0000000  0.16834512  0.2060233 -0.56960714
am    0.1683451  1.00000000  0.7940588  0.05753435
gear  0.2060233  0.79405876  1.0000000  0.27407284
carb -0.5696071  0.05753435  0.2740728  1.00000000
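An 11 by 11 table is hard to scan by eye, so it can help to list the strongest pairs before plotting anything. The following is a minimal sketch under a few assumptions: the helper names idx and pairs_df are made up for this example, and use = "complete.obs" is added only to show how rows with missing values could be dropped (mtcars itself has none).

```r
# Recompute with explicit arguments; "pearson" is the default method,
# and use = "complete.obs" drops rows containing NA values
cor_matrix <- cor(mtcars, method = "pearson", use = "complete.obs")

# Keep each pair once by taking the upper triangle only
idx <- which(upper.tri(cor_matrix), arr.ind = TRUE)
pairs_df <- data.frame(
  var1 = rownames(cor_matrix)[idx[, "row"]],
  var2 = colnames(cor_matrix)[idx[, "col"]],
  r    = cor_matrix[idx],
  stringsAsFactors = FALSE
)

# Sort by absolute correlation, strongest first
pairs_df <- pairs_df[order(-abs(pairs_df$r)), ]
head(pairs_df, 5)
```

On mtcars, the top of this list matches the table above: cyl and disp (about 0.90), followed by disp and wt (about 0.89) and mpg and wt (about -0.87).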

Step 2: Visualize with corrplot

The corrplot package provides a straightforward way to visualize a correlation matrix.

R
# Install and load the corrplot package
# install.packages("corrplot")
library(corrplot)

# Plot the correlation matrix
corrplot(cor_matrix, method = "circle", type = "upper", 
         tl.col = "black", tl.srt = 45, addCoef.col = "black")

Output:

[Figure: correlation matrix of mtcars plotted with corrplot]
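corrplot can also reorder the variables and hide correlations that are not statistically significant. The sketch below is one possible variation, not the only way to do it: it assumes the corrplot package is installed, uses its cor.mtest() helper for p-values, and writes to a temporary PNG (the names test_res and out_file are illustrative).

```r
library(corrplot)

cor_matrix <- cor(mtcars)

# cor.mtest() runs cor.test() on every pair of columns and returns
# matrices of p-values and confidence bounds
test_res <- cor.mtest(mtcars, conf.level = 0.95)

# Write the plot to a temporary PNG so the sketch also runs without a display
out_file <- tempfile(fileext = ".png")
png(out_file, width = 800, height = 800)
corrplot(cor_matrix, method = "color", type = "upper",
         order = "hclust",              # group similar variables together
         p.mat = test_res$p,            # p-values from cor.mtest()
         sig.level = 0.05, insig = "blank",
         tl.col = "black", tl.srt = 45, diag = FALSE)
dev.off()
```

Hierarchical ordering tends to place strongly related variables next to each other, which makes blocks of correlated variables visible at a glance.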

Step 3: Visualize with ggcorrplot

The ggcorrplot package offers a ggplot2-based approach for plotting correlation matrices.

R
# Install and load the ggcorrplot package
# install.packages("ggcorrplot")
library(ggcorrplot)

# Plot the correlation matrix
ggcorrplot(cor_matrix, method = "circle", type = "lower", 
           lab = TRUE, lab_size = 3, colors = c("red", "white", "blue"))

Output:

[Figure: correlation matrix of mtcars plotted with ggcorrplot]
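Because ggcorrplot returns a regular ggplot object, it can be extended with ggplot2 layers and saved with ggsave(). Here is a hedged sketch of that workflow, assuming ggplot2 and ggcorrplot are installed; the title text, the object names (p_mat, p, out_file), and the output size are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggcorrplot)

cor_matrix <- cor(mtcars)
p_mat <- cor_pmat(mtcars)   # matrix of correlation p-values

# Cluster the variables and blank out non-significant correlations
p <- ggcorrplot(cor_matrix, hc.order = TRUE, type = "lower",
                p.mat = p_mat, insig = "blank",
                lab = TRUE, lab_size = 3) +
  labs(title = "mtcars correlations") +
  theme(plot.title = element_text(hjust = 0.5))

# Save the plot to a temporary file
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = p, width = 6, height = 6, dpi = 150)
```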

Step 4: Visualize with ggplot2

For full control over the plot, you can build the heatmap directly in ggplot2. ggplot2 expects data in long format (one row per cell), so we first reshape the correlation matrix with melt() from the reshape2 package.

R
# Install and load the reshape2 package
# install.packages("reshape2")
library(reshape2)

# Melt the correlation matrix
melted_cor_matrix <- melt(cor_matrix)

# Plot the correlation matrix using ggplot2
ggplot(data = melted_cor_matrix, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "red", high = "blue", mid = "white", 
                       midpoint = 0, limits = c(-1, 1), space = "Lab", 
                       name = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, 
                                   size = 12, hjust = 1)) +
  coord_fixed()

Output:

[Figure: correlation heatmap of mtcars plotted with ggplot2]
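If you would rather not depend on reshape2, base R can produce the same long format. The sketch below is an alternative under that assumption: it converts the matrix with as.table() and as.data.frame(), then adds the coefficient as a text label on each tile. The names long_cor and heatmap_plot are illustrative.

```r
library(ggplot2)

cor_matrix <- cor(mtcars)

# as.table() + as.data.frame() gives one row per cell: Var1, Var2, Freq
long_cor <- as.data.frame(as.table(cor_matrix))
names(long_cor) <- c("Var1", "Var2", "value")

heatmap_plot <- ggplot(long_cor, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile(color = "white") +
  # Print each coefficient, rounded to two decimals, on its tile
  geom_text(aes(label = sprintf("%.2f", value)), size = 2.5) +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue",
                       midpoint = 0, limits = c(-1, 1),
                       name = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  coord_fixed()

heatmap_plot
```

Labeling the tiles trades some visual cleanliness for exact values, which is often worthwhile when the matrix is small enough to read.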

Step 5: Visualize with PerformanceAnalytics

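The PerformanceAnalytics package provides chart.Correlation, which combines scatter plots, density histograms, and correlation values in a single chart. Unlike the previous steps, it takes the raw data (mtcars) rather than the precomputed correlation matrix.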

R
# Install and load the PerformanceAnalytics package
# install.packages("PerformanceAnalytics")
library(PerformanceAnalytics)

# Plot the correlation matrix using chart.Correlation
chart.Correlation(mtcars, histogram = TRUE, pch = 19)

Output:

[Figure: correlation chart of mtcars plotted with PerformanceAnalytics]
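With all 11 variables, each panel of the chart becomes quite small. One possible workaround, sketched here under the assumption that only a few variables are of interest, is to pass a subset of columns and write the result to a larger image. The column selection in vars and the name out_file are arbitrary choices for this example.

```r
library(PerformanceAnalytics)

# A smaller set of columns keeps each panel large enough to read
vars <- c("mpg", "disp", "hp", "wt")

# Draw into a temporary PNG with generous dimensions
out_file <- tempfile(fileext = ".png")
png(out_file, width = 900, height = 900)
chart.Correlation(mtcars[, vars], histogram = TRUE, pch = 19)
dev.off()
```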

Conclusion

Plotting a correlation matrix in R can provide valuable insights into the relationships between variables in your dataset. This article demonstrated how to calculate a correlation matrix and visualize it using four different packages: corrplot, ggcorrplot, ggplot2, and PerformanceAnalytics. Each method has its own strengths and customization options, allowing you to choose the best approach for your specific needs.

