Visualize correlation matrix using correlogram in R Programming
Last Updated: 05 Sep, 2020
A correlogram is a graph of a correlation matrix. It is generally used to highlight the most strongly correlated variables in a data set or data table. Each correlation coefficient in the plot is colored according to its value, and the matrix can be reordered by the degree of association among the variables so that related variables sit next to each other.
Correlogram in R
In R, we shall use the “corrplot” package to implement a correlogram. Hence, to install the package from the R Console we should execute the following command:
install.packages("corrplot")
Once we have installed the package properly, we shall load the package in our R script using the library() function as follows:
library("corrplot")
The example below builds a correlogram in R step by step, with an explanation of each step.
Example:
Step 1: [Data for Correlation Analysis]: The first job is to select a proper dataset to implement the concept. For our example, we will be using the “mtcars” data set which is an inbuilt data set of R. We will see some of the data in this data set.
R
# Load corrplot and preview the first six rows of the built-in mtcars data set
library(corrplot)
head(mtcars)
Output:
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
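Since cor() accepts only numeric input, it can help to confirm the shape and column types of the data before going further. A minimal check using only base R (the variable names dims and all_numeric are illustrative):

```r
# mtcars ships with base R (package 'datasets'), so nothing needs installing
dims <- dim(mtcars)                              # number of rows and columns
all_numeric <- all(sapply(mtcars, is.numeric))   # TRUE if every column is numeric

print(dims)
print(all_numeric)
```

If any column were a factor or character vector, it would have to be dropped or converted before computing the correlation matrix.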
Step 2: [Computing Correlation Matrix]: We will now compute the correlation matrix that the correlogram will display, using the cor() function. By default, cor() returns Pearson correlation coefficients.
R
library(corrplot)
# Pairwise correlations between all columns of mtcars
M <- cor(mtcars)
# Show the first six rows, rounded to two decimals
head(round(M, 2))
Output:
       mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
mpg   1.00 -0.85 -0.85 -0.78  0.68 -0.87  0.42  0.66  0.60  0.48 -0.55
cyl  -0.85  1.00  0.90  0.83 -0.70  0.78 -0.59 -0.81 -0.52 -0.49  0.53
disp -0.85  0.90  1.00  0.79 -0.71  0.89 -0.43 -0.71 -0.59 -0.56  0.39
hp   -0.78  0.83  0.79  1.00 -0.45  0.66 -0.71 -0.72 -0.24 -0.13  0.75
drat  0.68 -0.70 -0.71 -0.45  1.00 -0.71  0.09  0.44  0.71  0.70 -0.09
wt   -0.87  0.78  0.89  0.66 -0.71  1.00 -0.17 -0.55 -0.69 -0.58  0.43
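To see what a single cell of this matrix holds, the following sketch recomputes the mpg/wt coefficient by hand with the Pearson formula and compares it with cor(). The names x, y, r_manual and r_builtin are illustrative only.

```r
x <- mtcars$mpg
y <- mtcars$wt

# Pearson's r: sum of cross-products of the deviations from the means,
# divided by the square root of the product of the sums of squared deviations
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

r_builtin <- cor(x, y)   # method = "pearson" is the default

print(round(c(manual = r_manual, builtin = r_builtin), 2))
```

Both values round to -0.87, the mpg/wt entry in the matrix above.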
Step 3: [Visualizing using Method argument]: First, we shall see how to draw the correlogram with different glyphs such as circles, pies, and ellipses. We pass the shape to the corrplot() function through its method argument.
R
library(corrplot)
M <- cor(mtcars)
# Each call draws the same matrix with a different glyph
corrplot(M, method = "circle")
corrplot(M, method = "pie")
corrplot(M, method = "color")
corrplot(M, method = "number")
Output:
[Plots: four correlograms, drawn with circle, pie, color, and number glyphs]
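The step text mentions ellipses, which the calls above do not show. A minimal sketch, assuming the corrplot package is installed; the requireNamespace() guard is an addition here that simply skips the plot if it is not:

```r
M <- cor(mtcars)

# Draw only if corrplot is available, so the script does not fail without it
if (requireNamespace("corrplot", quietly = TRUE)) {
  # "ellipse" is one of the glyph names accepted by the method argument
  corrplot::corrplot(M, method = "ellipse")
}
```

The method argument also accepts "square" and "shade".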
Step 4: [Visualizing using type argument]: Because a correlation matrix is symmetric, the correlogram can show just its upper or lower triangle. We choose the triangle with the type argument of the corrplot() function.
R
library(corrplot)
M <- cor(mtcars)
# Show only one triangle of the symmetric matrix
corrplot(M, type = "upper")
corrplot(M, type = "lower")
Output:
[Plots: two correlograms, showing the upper and the lower triangle]
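Plotting one triangle loses no information, which can be checked in base R. The sketch below confirms that the matrix is symmetric and counts the distinct variable pairs above the diagonal (symmetric and upper_vals are illustrative names):

```r
M <- cor(mtcars)

# A correlation matrix equals its own transpose
symmetric <- isSymmetric(M)

# Coefficients strictly above the diagonal: one per pair of variables
upper_vals <- M[upper.tri(M)]

print(symmetric)
print(length(upper_vals))   # choose(11, 2) pairs for 11 variables
```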
Step 5: [Reordering the correlogram]: We can reorder the variables with the order argument of the corrplot() function. Here we use the “hclust” ordering, which arranges the variables by hierarchical clustering so that strongly correlated variables are grouped together.
R
library(corrplot)
M <- cor(mtcars)
# Reorder the variables by hierarchical clustering
corrplot(M, type = "upper", order = "hclust")
# Same plot with a custom 20-color red-white-blue palette
col <- colorRampPalette(c("red", "white", "blue"))(20)
corrplot(M, type = "upper", order = "hclust", col = col)
# Two-color palette on a light blue background
corrplot(M, type = "upper", order = "hclust",
         col = c("black", "white"),
         bg = "lightblue")
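Output:
[Plots: three hclust-ordered upper-triangle correlograms, with the default palette, the red-white-blue palette, and the black-and-white palette on a light blue background]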
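The idea behind a clustering-based ordering can be sketched in base R. This is an illustration under two assumptions, that 1 - r serves as the distance between variables and that complete linkage is used; corrplot's internal procedure may differ in its details.

```r
M <- cor(mtcars)

# Assumption: use 1 - r as the distance between two variables,
# so strongly positively correlated variables are "close"
d <- as.dist(1 - M)

# Assumption: complete linkage (the default method of hclust())
hc <- hclust(d, method = "complete")

# hc$order is the left-to-right order of the leaves in the dendrogram
ord <- hc$order
M_ordered <- M[ord, ord]

print(colnames(M_ordered))
```

Applying the same permutation to rows and columns keeps the matrix symmetric while placing similar variables next to each other.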
Step 6: [Changing the color in correlogram]: We shall now see how to change the colors of the correlogram. For this purpose, we install the “RColorBrewer” package with install.packages("RColorBrewer") and load it in our R script to use its color palettes.
R
library(corrplot)
library(RColorBrewer)
M <- cor(mtcars)
# Three diverging ColorBrewer palettes with 8 colors each
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "RdBu"))
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "RdYlBu"))
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "PuOr"))
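Output:
[Plots: three hclust-ordered upper-triangle correlograms, using the RdBu, RdYlBu, and PuOr palettes]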
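How a palette turns a coefficient into a color can be illustrated in base R. The sketch below assumes the interval [-1, 1] is split into equal-width bins, one per palette color, with the first color at the negative end. It illustrates the idea and is not corrplot's own code; pal, breaks, bin and cell_colors are illustrative names.

```r
M <- cor(mtcars)

n_colors <- 20
pal <- colorRampPalette(c("red", "white", "blue"))(n_colors)

# Assumption: split [-1, 1] into n_colors equal-width bins;
# bin 1 holds the most negative values, bin n_colors the most positive
breaks <- seq(-1, 1, length.out = n_colors + 1)
bin <- cut(as.vector(M), breaks = breaks, include.lowest = TRUE, labels = FALSE)

# One color per cell, arranged like M
cell_colors <- matrix(pal[bin], nrow = nrow(M), dimnames = dimnames(M))

print(cell_colors["mpg", "mpg"])   # r = 1 falls in the last bin
```

A palette with more colors gives a smoother gradient, which is why a ramp function is often expanded to a large number of colors.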
Step 7: [Changing the color and rotation of the text labels]: For this purpose, we pass the tl.col (label color) and tl.srt (label rotation in degrees) arguments to the corrplot() function.
R
library(corrplot)
M <- cor(mtcars)
# Black labels, rotated by 45 degrees
corrplot(M, type = "upper", order = "hclust",
         tl.col = "black", tl.srt = 45)
Output:
[Plot: hclust-ordered upper-triangle correlogram with black labels rotated by 45 degrees]
Step 8: [Computing the p-value of correlations]: Before we can add a significance test to the correlogram, we compute the p-value of each pairwise correlation with a custom R function built on cor.test(), as follows:
R
library(corrplot)
M <- cor(mtcars)
# Return a symmetric matrix of p-values, one cor.test() per pair of columns.
# Extra arguments (...) are passed on to cor.test().
# Note: this definition masks corrplot's own cor.mtest(), which returns a list.
cor.mtest <- function(mat, ...) {
  mat <- as.matrix(mat)
  n <- ncol(mat)
  p.mat <- matrix(NA, n, n)
  diag(p.mat) <- 0
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      tmp <- cor.test(mat[, i], mat[, j], ...)
      p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
    }
  }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
  p.mat
}
p.mat <- cor.mtest(mtcars)
head(p.mat[, 1:5])
Output:
              mpg          cyl         disp           hp         drat
mpg  0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07 1.776240e-05
cyl  6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09 8.244636e-06
disp 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08 5.282022e-06
hp   1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00 9.988772e-03
drat 1.776240e-05 8.244636e-06 5.282022e-06 9.988772e-03 0.000000e+00
wt   1.293959e-10 1.217567e-07 1.222320e-11 4.145827e-05 4.784260e-06
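Each of these p-values comes from a t-test on the correlation coefficient. As a sketch, the mpg/cyl entry can be reproduced by hand for the default Pearson test and compared with cor.test(); the names t_stat, p_manual and p_test are illustrative.

```r
x <- mtcars$mpg
y <- mtcars$cyl
n <- length(x)

r <- cor(x, y)

# Under the null hypothesis of zero correlation, this statistic
# follows a t distribution with n - 2 degrees of freedom
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided p-value
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)
p_test <- cor.test(x, y)$p.value

print(c(manual = p_manual, cor.test = p_test))
```

Both values agree with the 6.112687e-10 shown for mpg and cyl in the table above.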
Step 9: [Add Significance Test]: We pass the p-value matrix through the p.mat argument and set the sig.level and insig arguments of the corrplot() function. With sig.level = 0.01, any correlation whose p-value is greater than 0.01 is treated as insignificant; its cell is marked with a cross by default, or left blank with insig = "blank".
R
library(corrplot)
M <- cor(mtcars)
# Matrix of p-values, one cor.test() per pair of columns (same helper as in Step 8)
cor.mtest <- function(mat, ...) {
  mat <- as.matrix(mat)
  n <- ncol(mat)
  p.mat <- matrix(NA, n, n)
  diag(p.mat) <- 0
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      tmp <- cor.test(mat[, i], mat[, j], ...)
      p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
    }
  }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
  p.mat
}
p.mat <- cor.mtest(mtcars)
# Insignificant cells (p > 0.01) are crossed out by default
corrplot(M, type = "upper", order = "hclust",
         p.mat = p.mat, sig.level = 0.01)
# Leave insignificant cells blank instead
corrplot(M, type = "upper", order = "hclust",
         p.mat = p.mat, sig.level = 0.01,
         insig = "blank")
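Output:
[Plots: two hclust-ordered upper-triangle correlograms, with insignificant cells crossed out in the first and left blank in the second]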
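To see which pairs the plots flag, the insignificant cells can be listed directly. The following base R sketch recomputes the Pearson p-values and filters the upper triangle at the same 0.01 threshold; the names p, sig_level, idx and insignificant are illustrative.

```r
vars <- colnames(mtcars)
k <- length(vars)

# p-value of the Pearson correlation test for every pair of columns
p <- matrix(0, k, k, dimnames = list(vars, vars))
for (i in 1:(k - 1)) {
  for (j in (i + 1):k) {
    p[i, j] <- p[j, i] <- cor.test(mtcars[[i]], mtcars[[j]])$p.value
  }
}

sig_level <- 0.01

# Pairs in the upper triangle whose p-value exceeds the threshold
idx <- which(p > sig_level & upper.tri(p), arr.ind = TRUE)
insignificant <- data.frame(
  var1 = vars[idx[, "row"]],
  var2 = vars[idx[, "col"]],
  p    = round(p[idx], 4)
)
print(insignificant)
```

These are the pairs whose cells are crossed out or blank in the correlogram.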
Step 10: [Customizing the Correlogram]: We can combine the arguments from the previous steps in a single corrplot() call and adjust their values. In addition, addCoef.col prints each coefficient in the given color and diag = FALSE hides the diagonal.
R
library(corrplot)
M <- cor(mtcars)
# p-value matrix, computed with the custom cor.mtest() helper defined in Step 8
p.mat <- cor.mtest(mtcars)
# Palette function; col(200) below expands it to 200 colors
col <- colorRampPalette(c("#BB4444", "#EE9988",
                          "#FFFFFF", "#77AADD",
                          "#4477AA"))
corrplot(M, method = "color", col = col(200),
         type = "upper", order = "hclust",
         addCoef.col = "black",           # print coefficients in black
         tl.col = "black", tl.srt = 45,   # label color and rotation
         p.mat = p.mat, sig.level = 0.01, insig = "blank",
         diag = FALSE)                    # hide the diagonal
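Output:
[Plot: customized upper-triangle correlogram with colored tiles, printed coefficients, rotated labels, blank insignificant cells, and no diagonal]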