
Find Two-Factor Interaction Variables in R Data Frame
If a data frame called df contains four columns, say x, y, z, and a, then the two-factor interaction columns are xy, xz, xa, yz, ya, and za. In general, n columns give choose(n, 2) = n(n-1)/2 such pairs. To list the two-factor interaction variables that can be created from a data frame's columns, we can apply the combn function to the column names, as shown in the examples below.
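As a minimal sketch of the four-column case described above, the following builds a small data frame with columns x, y, z, and a (the values are made up for illustration and are not from the original examples), lists the pairs with combn, and counts them with choose.

```r
# Hypothetical four-column data frame; the values are placeholders
df <- data.frame(x = 1:3, y = 4:6, z = 7:9, a = 10:12)

# All unordered pairs of column names, pasted together with no separator
pairs <- combn(colnames(df), 2, FUN = paste, collapse = "")
pairs

# Number of two-factor interactions for n columns: n choose 2
n_pairs <- choose(ncol(df), 2)
n_pairs
```

Here pairs holds "xy", "xz", "xa", "yz", "ya", "za", and n_pairs is 6, matching the list given above.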
Consider the below data frame −
Example
x1<-rpois(20,2)
x2<-rpois(20,2)
x3<-rpois(20,1)
x4<-rpois(20,2)
x5<-rpois(20,5)
x6<-rpois(20,2)
df1<-data.frame(x1,x2,x3,x4,x5,x6)
df1
Output
   x1 x2 x3 x4 x5 x6
1   3  1  1  2  5  0
2   1  2  3  4  6  0
3   3  2  1  4  5  1
4   1  2  0  2  3  3
5   0  0  2  1  4  3
6   4  1  0  8  3  0
7   3  2  1  0  8  3
8   3  2  1  2  6  3
9   4  4  0  1  5  0
10  1  1  1  3  3  2
11  3  2  0  4  3  1
12  0  0  2  1  4  2
13  4  4  0  2  3  3
14  2  3  0  3  3  1
15  1  4  3  1  8  2
16  2  3  1  1  4  2
17  2  3  0  2  4  3
18  2  5  1  1 10  3
19  0  2  0  1  9  3
20  0  3  0  1  4  2
Finding the two-factor interaction variables in df1 −
combn(colnames(df1),2,FUN=paste,collapse='_')
 [1] "x1_x2" "x1_x3" "x1_x4" "x1_x5" "x1_x6" "x2_x3" "x2_x4" "x2_x5" "x2_x6"
[10] "x3_x4" "x3_x5" "x3_x6" "x4_x5" "x4_x6" "x5_x6"
Example
y1<-round(rnorm(20),2)
y2<-round(rnorm(20),2)
y3<-round(rnorm(20),2)
y4<-round(rnorm(20),2)
y5<-round(rnorm(20),2)
y6<-round(rnorm(20),2)
df2<-data.frame(y1,y2,y3,y4,y5,y6)
df2
Output
      y1    y2    y3    y4    y5    y6
1   0.37 -0.25 -2.60  1.56 -0.64 -0.80
2   0.68  0.65  2.06 -0.54  0.16 -0.22
3   0.51 -0.37  0.16 -2.23 -0.42  0.52
4  -0.01 -0.32  1.65 -2.59  1.01 -1.86
5  -0.65 -0.56 -0.41 -0.88  0.50 -0.66
6  -0.42  0.55  0.26  0.02 -1.52 -0.34
7  -0.89 -0.91 -1.28  0.26 -1.27 -1.04
8   0.12  0.59 -0.80 -1.24  1.57 -0.53
9  -0.26 -1.09  0.65 -0.40  0.18  0.16
10 -1.10 -0.70  2.30  0.31 -0.46 -0.16
11 -0.42 -0.06 -0.76  0.45  0.28 -0.10
12 -0.07  2.08 -0.17 -0.16 -0.54  2.06
13 -0.91  0.37 -1.19 -2.44 -0.45  0.46
14  0.74  1.06  0.42  0.85 -0.12 -0.21
15  1.51  0.29 -0.14  0.28  0.76 -0.45
16  0.11 -0.66 -1.70  1.88 -1.16  1.05
17  0.49  0.44 -1.38 -0.39 -1.47 -1.12
18  0.67 -0.29  1.40  0.80 -0.25  1.23
19  0.45  1.57  1.34  1.75  0.25 -0.89
20  1.05  0.23 -0.06 -0.29  1.50  1.20
Finding the two-factor interaction variables in df2 −
combn(colnames(df2),2,FUN=paste,collapse='_')
 [1] "y1_y2" "y1_y3" "y1_y4" "y1_y5" "y1_y6" "y2_y3" "y2_y4" "y2_y5" "y2_y6"
[10] "y3_y4" "y3_y5" "y3_y6" "y4_y5" "y4_y6" "y5_y6"
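The calls above return only the names of the interaction variables. If the columns themselves are needed, one possible approach is sketched below. It assumes that a two-factor interaction means the element-wise product of the two columns, which is the usual regression sense but is not stated in the examples above. The data frame df, the seed, and the names products and df_int are illustrative choices.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# Small illustrative data frame with three count columns
df <- data.frame(x1 = rpois(5, 2), x2 = rpois(5, 2), x3 = rpois(5, 1))

# For each pair of column names, compute the element-wise product.
# Because FUN returns a vector of length nrow(df), combn simplifies the
# result to a matrix with one column per pair.
products <- combn(colnames(df), 2, FUN = function(p) df[[p[1]]] * df[[p[2]]])

# Name the product columns with the same pattern used above
colnames(products) <- combn(colnames(df), 2, FUN = paste, collapse = "_")

# Append the interaction columns to the original data frame
df_int <- cbind(df, products)
df_int
```

The resulting df_int has the original columns followed by x1_x2, x1_x3, and x2_x3, in the same order that combn lists the names.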