
Add New Column in R Data Frame with Count Based on Factor Column
In data analysis we often work with factor data and want the frequency (count) of each combination of a factor level and another variable. These counts make it easier to compare values within and between factor levels. To add them as a new column, group the data frame by both columns with group_by and then call mutate with n(), both from the dplyr package; every row then receives the number of rows that share its combination.
Example
Consider the following data frame −
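> Group<-rep(c("A","B","C","D","E"),times=10)
> # Rating is sampled at random, so the values differ from run to run
> Rating<-sample(1:10,50,replace=TRUE)
> # stringsAsFactors=TRUE keeps Group a factor in R 4.0.0 and later
> df<-data.frame(Group,Rating,stringsAsFactors=TRUE)
> head(df,20)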
Output
   Group Rating
1      A      1
2      B      6
3      C      2
4      D      4
5      E      9
6      A      3
7      B      5
8      C      7
9      D      1
10     E      9
11     A      9
12     B      8
13     C      9
14     D      2
15     E      6
16     A      2
17     B      2
18     C      2
19     D      2
20     E      2
> tail(df,20)
Output
   Group Rating
31     A      1
32     B      7
33     C     10
34     D      8
35     E      6
36     A      8
37     B      4
38     C      4
39     D     10
40     E      4
41     A      6
42     B      4
43     C      3
44     D      7
45     E      5
46     A      1
47     B      6
48     C      7
49     D      1
50     E      6
Load the dplyr package and add the count column −
> library(dplyr)
> df_with_count<-df%>%group_by(Group,Rating)%>%mutate(count=n())
> head(df_with_count,20)
Output
# A tibble: 20 x 3
# Groups:   Group, Rating [17]
   Group Rating count
   <fct>  <int> <int>
 1 A          1     4
 2 B          6     3
 3 C          2     3
 4 D          4     1
 5 E          9     2
 6 A          3     1
 7 B          5     1
 8 C          7     2
 9 D          1     3
10 E          9     2
11 A          9     1
12 B          8     1
13 C          9     1
14 D          2     3
15 E          6     3
16 A          2     1
17 B          2     1
18 C          2     3
19 D          2     3
20 E          2     1
> tail(df_with_count,20)
Output
# A tibble: 20 x 3
# Groups:   Group, Rating [17]
   Group Rating count
   <fct>  <int> <int>
 1 A          1     4
 2 B          7     1
 3 C         10     2
 4 D          8     1
 5 E          6     3
 6 A          8     1
 7 B          4     2
 8 C          4     1
 9 D         10     1
10 E          4     1
11 A          6     1
12 B          4     2
13 C          3     1
14 D          7     1
15 E          5     2
16 A          1     4
17 B          6     3
18 C          7     2
19 D          1     3
20 E          6     3
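Because the data above is sampled at random, the counts are hard to check by eye. The sketch below repeats the same group_by and mutate technique on a small hand-written data frame (df_small is a hypothetical name, not part of the original example) so each count can be verified manually. It also shows two optional variations that the original example does not use: ungroup(), which drops the grouping once the count is added, and dplyr's add_count(), which should give the same column in a single call.

```r
library(dplyr)

# Hypothetical six-row data frame, small enough to count by hand.
df_small <- data.frame(
  Group  = factor(c("A", "A", "A", "B", "B", "C")),
  Rating = c(1L, 1L, 2L, 3L, 3L, 3L)
)

# Same technique as the example: group by both columns, then store
# the size of each group in a new column with n().
# ungroup() is an addition here so later steps are not applied per group.
with_count <- df_small %>%
  group_by(Group, Rating) %>%
  mutate(count = n()) %>%
  ungroup()

# Shortcut: add_count() groups, counts, and ungroups in one step.
with_add_count <- df_small %>%
  add_count(Group, Rating, name = "count")

print(with_count)
# count column: 2, 2, 1, 2, 2, 1
# (A,1) appears twice, (A,2) once, (B,3) twice, (C,3) once
```

Note that Rating 3 occurs three times overall, but the count is taken per Group and Rating combination, so the B rows get 2 and the C row gets 1.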