
Subset R Data Frame Based on Categorical Column Value
To subset an R data frame on a single value of a categorical column, follow these steps:
- First, create a data frame.
- Then subset the data frame with a condition, using the filter function of the dplyr package.
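The two steps can be sketched on a tiny data frame with fixed values, so the result is the same on every run. The name toy and its values are made up for illustration, and the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Step 1: create a small data frame with fixed (made-up) values
toy <- data.frame(
  Class = c("First", "Second", "First", "Third", "First"),
  x     = c(7, 9, 3, 8, 10)
)

# Step 2: keep rows where x > 5 and Class equals the single value "First"
toy_first <- toy %>% filter(x > 5 & Class == "First")
toy_first
```

Only the first and last rows satisfy both conditions, so toy_first has two rows.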
Create the data frame
Create a data frame as shown below:
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
y <- sample(1:10, 25, replace = TRUE)
z <- sample(1:10, 25, replace = TRUE)
df <- data.frame(Class, x, y, z)
df
Running the above script produces output like the following (your output will differ because the values are sampled randomly):
    Class  x  y  z
1  Fourth 10  6  7
2   First 10  1  5
3   Third  3  5  9
4   First  2  8  5
5   Third  4  9  9
6   First  2  5  3
7  Second  2  7  7
8   Third  6  4  4
9   First  2  9  3
10  First 10  7  4
11 Fourth  1  9  3
12  First  8  7  8
13  First  7  5  3
14  First 10  4  2
15  First  8  9  2
16  First  9  9 10
17  Third  1  1 10
18  Third  5  9  6
19  First  3  2  9
20  Third  8  5  4
21  Third  9  2  7
22 Second  5  9  3
23  Third 10  3  6
24  First 10  6  9
25  Third  1 10  4
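Because the values come from sample, each run gives a different data frame. One way to get repeatable results is to fix the random seed first. The sketch below uses an arbitrary seed of 123 and the hypothetical names df_a and df_b; it will not reproduce the values printed above, only show that two runs with the same seed agree with each other.

```r
# Fixing the seed makes sample() return the same values on every run
set.seed(123)  # 123 is an arbitrary choice, not taken from the article
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
df_a <- data.frame(Class, x)

# Resetting the same seed and repeating the calls gives the same data frame
set.seed(123)
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
df_b <- data.frame(Class, x)

identical(df_a, df_b)  # TRUE
```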
Subset the data frame with a condition on a categorical column
Use the filter function to keep the rows of df where x is greater than 5 and Class is First:
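# Recreate the data frame; without a fixed seed the sampled values change on every run
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
y <- sample(1:10, 25, replace = TRUE)
z <- sample(1:10, 25, replace = TRUE)
df <- data.frame(Class, x, y, z)

library(dplyr)

# Keep the rows where x > 5 and Class is "First"
df %>% group_by(Class) %>% filter(x > 5 & Class == "First")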
Output
# A tibble: 8 x 4
# Groups:   Class [1]
  Class     x     y     z
  <chr> <int> <int> <int>
1 First    10     1     5
2 First    10     7     4
3 First     8     7     8
4 First     7     5     3
5 First    10     4     2
6 First     8     9     2
7 First     9     9    10
8 First    10     6     9
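The group_by step is what turns the result into a grouped tibble; the row selection itself does not depend on it. As a sketch, filter also accepts several conditions separated by commas, which it combines with AND. The data frame df6 below is the first six rows of the data frame printed earlier, typed in by hand, and the names df6, ungrouped and grouped are chosen here for illustration.

```r
library(dplyr)

# First six rows of the example data frame, typed in by hand
df6 <- data.frame(
  Class = c("Fourth", "First", "Third", "First", "Third", "First"),
  x = c(10L, 10L, 3L, 2L, 4L, 2L),
  y = c(6L, 1L, 5L, 8L, 9L, 5L),
  z = c(7L, 5L, 9L, 5L, 9L, 3L)
)

# Comma-separated conditions are combined with AND, like `&`
ungrouped <- filter(df6, x > 5, Class == "First")

# The grouped form keeps the same rows but returns a grouped tibble
grouped <- df6 %>% group_by(Class) %>% filter(x > 5 & Class == "First")

ungrouped
```

Both forms keep the single row with Class First and x equal to 10. The ungrouped call returns a plain data frame, which may be more convenient when no per-group work follows.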
Subset the data frame with a condition on a categorical column
Use the filter function to keep the rows of df where y is greater than 5 and Class is First:
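# Recreate the data frame; without a fixed seed the sampled values change on every run
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
y <- sample(1:10, 25, replace = TRUE)
z <- sample(1:10, 25, replace = TRUE)
df <- data.frame(Class, x, y, z)

library(dplyr)

# Keep the rows where y > 5 and Class is "First"
df %>% group_by(Class) %>% filter(y > 5 & Class == "First")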
Output
# A tibble: 7 x 4
# Groups:   Class [1]
  Class     x     y     z
  <chr> <int> <int> <int>
1 First     2     8     5
2 First     2     9     3
3 First    10     7     4
4 First     8     7     8
5 First     8     9     2
6 First     9     9    10
7 First    10     6     9
Subset the data frame with a condition on a categorical column
Use the filter function to keep the rows of df where z is greater than 5 and Class is First:
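# Recreate the data frame; without a fixed seed the sampled values change on every run
Class <- sample(c("First", "Second", "Third", "Fourth"), 25, replace = TRUE)
x <- sample(1:10, 25, replace = TRUE)
y <- sample(1:10, 25, replace = TRUE)
z <- sample(1:10, 25, replace = TRUE)
df <- data.frame(Class, x, y, z)

library(dplyr)

# Keep the rows where z > 5 and Class is "First"
df %>% group_by(Class) %>% filter(z > 5 & Class == "First")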
Output
# A tibble: 4 x 4
# Groups:   Class [1]
  Class     x     y     z
  <chr> <int> <int> <int>
1 First     8     7     8
2 First     9     9    10
3 First     3     2     9
4 First    10     6     9
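The three examples differ only in which numeric column is compared. One possible way to avoid repeating the pipeline is a small helper that takes the column name as a string and looks it up through the .data pronoun that dplyr exports. The helper filter_by_class, its arguments, and the data frame df6 (the first six rows of the example data frame, typed in by hand) are hypothetical names introduced here for illustration.

```r
library(dplyr)

# Hypothetical helper: keep rows where column `col` exceeds `threshold`
# and Class equals `class_value`
filter_by_class <- function(data, col, class_value, threshold = 5) {
  data %>% filter(.data[[col]] > threshold, Class == class_value)
}

# First six rows of the example data frame, typed in by hand
df6 <- data.frame(
  Class = c("Fourth", "First", "Third", "First", "Third", "First"),
  x = c(10L, 10L, 3L, 2L, 4L, 2L),
  y = c(6L, 1L, 5L, 8L, 9L, 5L),
  z = c(7L, 5L, 9L, 5L, 9L, 3L)
)

filter_by_class(df6, "x", "First")  # one row: x = 10
filter_by_class(df6, "y", "First")  # one row: y = 8
filter_by_class(df6, "z", "First")  # no rows: no First row has z > 5
```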