
Find Group-wise Correlation Coefficient in R
If an R data frame has two continuous columns and one categorical column, we can compute the correlation coefficient between the continuous columns separately for each category. For this purpose, we can use the by function to split the data frame on the categorical column and pass it a function that calls cor, here with the spearman method, on the subset it receives, as shown in the examples below.
Example 1
Consider the below data frame:
> x1<-sample(c("A","B","C"),20,replace=TRUE)
> y1<-rnorm(20,1,0.24)
> z1<-rpois(20,2)
> df1<-data.frame(x1,y1,z1)
> df1
Output
   x1        y1 z1
1   A 1.1155324  2
2   C 0.9801564  3
3   B 0.9116162  1
4   A 0.8406772  3
5   C 0.8009355  2
6   A 0.9331637  2
7   B 1.0642089  1
8   B 1.1633515  0
9   B 1.1599037  5
10  B 1.0509981  2
11  B 0.7574267  1
12  B 0.8456225  1
13  B 0.8926751  2
14  B 0.6074419  3
15  C 0.7999792  0
16  A 1.0685236  2
17  B 0.9756677  3
18  A 0.9495342  0
19  C 1.0109747  2
20  A 0.9090985  4
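Finding the correlation between y1 and z1 for each category in x1:
Example
> by(df1,df1$x1,FUN=function(x) cor(x$y1,x$z1,method="spearman"))
The result lists one Spearman coefficient under each of the headers df1$x1: A, df1$x1: B and df1$x1: C. The function must use its argument x, which holds only the rows of the current category. If it refers to df1$y1 and df1$z1 instead, the grouping is ignored and every category reports the same overall coefficient, which for this data frame is 0.03567607. Because the data is generated randomly without a seed, the values differ from run to run.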
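The difference between a group-wise and a pooled coefficient can be seen on a small hand-made data frame. The sketch below is only an illustration: the data frame toy and the names group_rho and overall_rho are hypothetical and do not come from the example above. The values are chosen so that y rises with z in group A and falls with z in group B.

```r
# Hypothetical toy data: y increases with z in group A, decreases in group B.
toy <- data.frame(
  g = rep(c("A", "B"), each = 4),
  y = c(1.0, 2.5, 3.1, 4.8, 4.0, 3.2, 2.1, 0.5),
  z = c(1, 2, 3, 4, 1, 2, 3, 4)
)

# Group-wise: the function works on its argument d, the rows of one group.
group_rho <- by(toy, toy$g, function(d) cor(d$y, d$z, method = "spearman"))

# Pooled: a single coefficient for all rows, ignoring the grouping.
overall_rho <- cor(toy$y, toy$z, method = "spearman")

print(group_rho)    # 1 for group A, -1 for group B
print(overall_rho)  # one value that matches neither group
```

Here the two groups have opposite monotonic relationships, so the pooled coefficient describes neither of them, which is why the function passed to by has to work on the subset.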
Example 2
> x2<-sample(c("India","China","France"),20,replace=TRUE)
> y2<-rexp(20,0.335)
> z2<-runif(20,2,10)
> df2<-data.frame(x2,y2,z2)
> df2
Output
       x2          y2       z2
1  France  2.31790394 2.649538
2   China 10.61012173 8.340615
3  France  5.00085220 6.602884
4  France  1.67707140 7.722530
5   India  9.60663732 9.837268
6  France  1.46030289 5.370930
7  France 10.44614704 9.035748
8   India  0.39506766 6.318701
9   China  1.83071453 7.282782
10  China  0.23080001 7.210144
11  India  2.27763766 9.233019
12  China 18.21276888 9.928614
13 France  1.72085517 9.176826
14  India  4.77786071 8.899026
15  China  8.55501571 7.240147
16  China  0.19832026 5.641800
17  India  0.03113389 6.928705
18  China  0.56958471 3.496314
19  China  0.72728737 6.903436
20  India  8.73571474 5.286486
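Finding the correlation between y2 and z2 for each category in x2:
Example
> by(df2,df2$x2,FUN=function(x) cor(x$y2,x$z2,method="spearman"))
The result lists one Spearman coefficient under each of the headers df2$x2: China, df2$x2: France and df2$x2: India. As in the first example, the function must use its argument x. If it refers to df2$y2 and df2$z2 instead, every category reports the same overall coefficient, which for this data frame is 0.487218.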
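When the coefficients are needed for further computation rather than for printing, one possible alternative to by is to combine split and sapply, which returns a plain named numeric vector. This is a sketch of a different technique from the one used in the examples, and the data frame demo and the name rho_by_country are hypothetical.

```r
# Hypothetical data with three countries and three observations each.
demo <- data.frame(
  country = rep(c("China", "France", "India"), each = 3),
  y = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  z = c(5, 6, 7, 9, 8, 7, 4, 6, 5)
)

# split() yields one data frame per country; sapply() applies cor to each
# piece and collects the results in a named numeric vector.
rho_by_country <- sapply(
  split(demo, demo$country),
  function(d) cor(d$y, d$z, method = "spearman")
)

print(rho_by_country)  # China 1, France -1, India 0.5
```

The names of the resulting vector are the category labels, so a single coefficient can be picked out with, for example, rho_by_country["India"].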