How to Plot a Correlation Matrix into a Graph Using R
Last Updated: 31 Jul, 2024
A correlation matrix is a table of correlation coefficients, one for each pair of variables in a dataset. It is a compact way to see which variables move together and which move in opposite directions. Plotting the matrix as a graph makes these patterns easier to spot than reading a grid of numbers. This article walks through the steps to plot a correlation matrix using the R programming language.
Introduction to Correlation Matrix
A correlation matrix is a symmetric matrix of correlation coefficients, each of which measures the strength of the linear relationship between a pair of variables. The values range from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear correlation. The diagonal is always 1, because every variable is perfectly correlated with itself.
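These properties can be checked directly. The following is a minimal sketch using only base R and the built-in mtcars dataset; the variable name cm is just illustrative.

```r
# Correlation matrix of the built-in mtcars dataset (all columns are numeric)
cm <- cor(mtcars)

# Symmetric: cor(x, y) equals cor(y, x)
isSymmetric(cm)

# Each variable correlates perfectly with itself, so the diagonal is all 1s
all.equal(unname(diag(cm)), rep(1, ncol(cm)))

# Smallest and largest coefficient; both lie within [-1, 1]
range(cm)
```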
Libraries for Correlation Matrix Visualization
Several R packages can help you visualize a correlation matrix:
corrplot: Specialized for plotting correlation matrices.
ggcorrplot: Based on ggplot2, providing an elegant and customizable way to plot correlation matrices.
ggplot2: Used for creating complex and customizable visualizations.
PerformanceAnalytics: Contains functions to create charts of correlations.
We'll use the built-in mtcars dataset for this demonstration.
R
# Load necessary libraries
library(ggplot2)
library(corrplot)
library(ggcorrplot)
library(PerformanceAnalytics)
# Load the mtcars dataset
data(mtcars)
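If any of the library() calls above fails, the package is probably not installed. The sketch below is one possible way to check which packages are available before running the rest of the article; the names pkgs, installed, and missing_pkgs are illustrative, and the install line is left commented out so nothing is downloaded without your say-so.

```r
# Packages used in this article (reshape2 is needed later, in Step 4)
pkgs <- c("ggplot2", "corrplot", "ggcorrplot", "PerformanceAnalytics", "reshape2")

# requireNamespace() returns FALSE instead of raising an error when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All packages are available.")
}
```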
Step 1: Calculate the Correlation Matrix
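First, compute the correlation matrix using the cor function. By default, cor computes Pearson correlation coefficients and requires numeric input; every column of mtcars is numeric, so the whole data frame can be passed in.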
R
# Calculate the correlation matrix
cor_matrix <- cor(mtcars)
print(cor_matrix)
Output:
mpg cyl disp hp drat wt qsec
mpg 1.0000000 -0.8521620 -0.8475514 -0.7761684 0.68117191 -0.8676594 0.41868403
cyl -0.8521620 1.0000000 0.9020329 0.8324475 -0.69993811 0.7824958 -0.59124207
disp -0.8475514 0.9020329 1.0000000 0.7909486 -0.71021393 0.8879799 -0.43369788
hp -0.7761684 0.8324475 0.7909486 1.0000000 -0.44875912 0.6587479 -0.70822339
drat 0.6811719 -0.6999381 -0.7102139 -0.4487591 1.00000000 -0.7124406 0.09120476
wt -0.8676594 0.7824958 0.8879799 0.6587479 -0.71244065 1.0000000 -0.17471588
qsec 0.4186840 -0.5912421 -0.4336979 -0.7082234 0.09120476 -0.1747159 1.00000000
vs 0.6640389 -0.8108118 -0.7104159 -0.7230967 0.44027846 -0.5549157 0.74453544
am 0.5998324 -0.5226070 -0.5912270 -0.2432043 0.71271113 -0.6924953 -0.22986086
gear 0.4802848 -0.4926866 -0.5555692 -0.1257043 0.69961013 -0.5832870 -0.21268223
carb -0.5509251 0.5269883 0.3949769 0.7498125 -0.09078980 0.4276059 -0.65624923
vs am gear carb
mpg 0.6640389 0.59983243 0.4802848 -0.55092507
cyl -0.8108118 -0.52260705 -0.4926866 0.52698829
disp -0.7104159 -0.59122704 -0.5555692 0.39497686
hp -0.7230967 -0.24320426 -0.1257043 0.74981247
drat 0.4402785 0.71271113 0.6996101 -0.09078980
wt -0.5549157 -0.69249526 -0.5832870 0.42760594
qsec 0.7445354 -0.22986086 -0.2126822 -0.65624923
vs 1.0000000 0.16834512 0.2060233 -0.56960714
am 0.1683451 1.00000000 0.7940588 0.05753435
gear 0.2060233 0.79405876 1.0000000 0.27407284
carb -0.5696071 0.05753435 0.2740728 1.00000000
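With 11 variables the printed matrix is hard to scan. As a rough sketch (base R only; the names upper and pairs_df are illustrative), the matrix can be flattened into a list of unique variable pairs sorted by the strength of their correlation:

```r
cor_matrix <- cor(mtcars)

# Keep each pair once: blank out the lower triangle and the diagonal
upper <- cor_matrix
upper[lower.tri(upper, diag = TRUE)] <- NA

# Convert to long format (one row per variable pair) and drop the blanks
pairs_df <- as.data.frame(as.table(upper), stringsAsFactors = FALSE)
names(pairs_df) <- c("var1", "var2", "r")
pairs_df <- pairs_df[!is.na(pairs_df$r), ]

# Sort by absolute correlation, strongest first
pairs_df <- pairs_df[order(-abs(pairs_df$r)), ]
head(pairs_df, 5)
```

The first row should be the cyl and disp pair, whose coefficient (0.9020329 in the output above) has the largest absolute value in the matrix.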
Step 2: Visualize with corrplot
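The corrplot package provides a straightforward way to visualize a correlation matrix. In the call below, method = "circle" draws one circle per coefficient, type = "upper" shows only the upper triangle, tl.col and tl.srt set the color and rotation of the variable labels, and addCoef.col prints the coefficient values in the given color.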
R
# Install and load the corrplot package
# install.packages("corrplot")
library(corrplot)
# Plot the correlation matrix
corrplot(cor_matrix, method = "circle", type = "upper",
         tl.col = "black", tl.srt = 45, addCoef.col = "black")
Output:
[Figure: correlation plot produced by corrplot]
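A common refinement is to group related variables and hide correlations that are not statistically significant. The sketch below is one possible variation, not part of the original walkthrough; it uses cor.mtest() from corrplot to obtain p-values, and the 0.05 threshold and the name test_res are illustrative choices.

```r
library(corrplot)

cor_matrix <- cor(mtcars)

# cor.mtest() runs a correlation test on every pair of columns and returns
# a list that includes a matrix of p-values (element "p")
test_res <- cor.mtest(mtcars, conf.level = 0.95)

corrplot(cor_matrix,
         method = "color",
         type = "upper",
         order = "hclust",       # reorder variables by hierarchical clustering
         p.mat = test_res$p,     # p-values matching cor_matrix
         sig.level = 0.05,
         insig = "blank",        # leave non-significant cells empty
         tl.col = "black", tl.srt = 45)
```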
Step 3: Visualize with ggcorrplot
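The ggcorrplot package offers a ggplot2-based approach for plotting correlation matrices. In the call below, type = "lower" shows only the lower triangle, lab = TRUE prints the coefficient values, and colors gives the colors for low, middle, and high correlations.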
R
# Install and load the ggcorrplot package
# install.packages("ggcorrplot")
library(ggcorrplot)
# Plot the correlation matrix
ggcorrplot(cor_matrix, method = "circle", type = "lower",
           lab = TRUE, lab_size = 3, colors = c("red", "white", "blue"))
Output:
[Figure: correlation plot produced by ggcorrplot]
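ggcorrplot can also reorder the variables and blank out non-significant coefficients. The following sketch shows one way this might look; it relies on cor_pmat() from ggcorrplot for the p-values, and the title text, the 0.05 threshold, and the names p_matrix and p are illustrative.

```r
library(ggcorrplot)

cor_matrix <- cor(mtcars)

# cor_pmat() computes a matrix of correlation p-values
p_matrix <- cor_pmat(mtcars)

p <- ggcorrplot(cor_matrix,
                hc.order = TRUE,     # order variables by hierarchical clustering
                type = "lower",
                p.mat = p_matrix,
                sig.level = 0.05,
                insig = "blank",     # hide non-significant coefficients
                lab = TRUE, lab_size = 3)

# ggcorrplot() returns a ggplot object, so ggplot2 layers can be added
p + ggplot2::ggtitle("mtcars correlations (p < 0.05)")
```

Because the result is an ordinary ggplot object, it can be themed, annotated, or saved like any other ggplot2 plot.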
Step 4: Visualize with ggplot2
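For more customized plots, you can use ggplot2 directly. Because ggplot2 expects data in long format, we first reshape the correlation matrix with the reshape2 package so that each row holds one pair of variables (Var1, Var2) and their coefficient (value).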
R
# Install and load the reshape2 package
# install.packages("reshape2")
library(reshape2)
# Melt the correlation matrix into long format (columns: Var1, Var2, value)
melted_cor_matrix <- melt(cor_matrix)
# Plot the correlation matrix as a heatmap using ggplot2
ggplot(data = melted_cor_matrix, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  # Diverging color scale centered on 0 and fixed to the full [-1, 1] range
  scale_fill_gradient2(low = "red", high = "blue", mid = "white",
                       midpoint = 0, limits = c(-1, 1), space = "Lab",
                       name = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 12, hjust = 1)) +
  # Keep the tiles square
  coord_fixed()
Output:
[Figure: correlation heatmap produced by ggplot2]
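The heatmap above shows color only. As a sketch of one possible extension, the version below prints each coefficient on its tile and uses base R's as.table() in place of reshape2::melt(), so it needs no extra package; the names long_cor and heatmap_plot and the label size are illustrative.

```r
library(ggplot2)

cor_matrix <- cor(mtcars)

# Base-R alternative to reshape2::melt(): one row per (Var1, Var2) pair
long_cor <- as.data.frame(as.table(cor_matrix))
names(long_cor) <- c("Var1", "Var2", "value")

heatmap_plot <- ggplot(long_cor, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile(color = "white") +
  # Print each coefficient, formatted to two decimals, on its tile
  geom_text(aes(label = sprintf("%.2f", value)), size = 2.5) +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue",
                       midpoint = 0, limits = c(-1, 1), name = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(x = NULL, y = NULL) +
  coord_fixed()

heatmap_plot
```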
Step 5: Visualize with PerformanceAnalytics
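The PerformanceAnalytics package can also be used to create charts of correlations with scatter plots, densities, and correlation values. Note that chart.Correlation takes the raw data (here, mtcars) rather than a precomputed correlation matrix.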
R
# Install and load the PerformanceAnalytics package
# install.packages("PerformanceAnalytics")
library(PerformanceAnalytics)
# Plot the correlation matrix using chart.Correlation
chart.Correlation(mtcars, histogram = TRUE, pch = 19)
Output:
[Figure: correlation chart produced by chart.Correlation]
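With all 11 columns, the chart has 121 small panels and can be hard to read. A minimal sketch, assuming you only care about a handful of variables, is to pass a subset of columns; the four chosen here are purely illustrative.

```r
library(PerformanceAnalytics)

# A hand-picked subset of mtcars columns (the choice is illustrative)
vars <- c("mpg", "disp", "hp", "wt")

# Same call as above, restricted to the selected columns
chart.Correlation(mtcars[, vars], histogram = TRUE, method = "pearson", pch = 19)
```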
Conclusion
Plotting a correlation matrix in R can provide valuable insights into the relationships between variables in your dataset. This article demonstrated how to calculate a correlation matrix and visualize it using four different packages: corrplot, ggcorrplot, ggplot2, and PerformanceAnalytics. Each method has its own strengths and customization options, allowing you to choose the best approach for your specific needs.