
Frequency of Unique and Missing Values in R Data Frame
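To find the frequency of each unique value, together with the number of missing values, for every column in an R data frame, we can use the apply function with the table function and the useNA argument set to "always". If every column yields the same number of categories, apply returns a matrix with one column of counts per data frame column; otherwise it returns a list with one table per column.
For example, if we have a data frame called df, we can find the frequency of unique values and missing values for each column in df by using the following command −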
apply(df,2,table,useNA="always")
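The command above can be tried on a small data frame with fixed values, so that the counts are easy to verify by hand. This is a minimal sketch; the data frame df and its columns a and b are hypothetical and chosen only for illustration.

```r
# Hypothetical data frame with fixed values (not from the original article)
df <- data.frame(a = c(1, 2, NA, 1, 2, 1),
                 b = c(NA, NA, 2, 1, 1, 2))

# MARGIN = 2 runs table() on each column;
# useNA = "always" is passed through to table() so the NA count is always reported
freq <- apply(df, 2, table, useNA = "always")

# Column a: the value 1 occurs 3 times, 2 occurs twice, NA once
# Column b: the values 1, 2 and NA each occur twice
print(freq)
```

Both columns have the same three categories here (1, 2 and NA), so the result is a matrix with one row per category and one column per data frame column.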
Example 1
The following snippet creates a sample data frame −
x1<-sample(c(NA,1,2),20,replace=TRUE)
x2<-sample(c(NA,1,2),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1
The following data frame is created (sample draws at random, so your values will differ) −
   x1 x2
1   1 NA
2   1  1
3   2  2
4   2  2
5  NA  1
6   1  1
7   1  1
8   1 NA
9  NA  1
10  1  2
11  2  1
12  2 NA
13  1  2
14  1 NA
15  1 NA
16 NA NA
17 NA  1
18  1  2
19  2  1
20 NA NA
To find the frequency of unique values and missing values for each column in df1, apply table to the data frame created above. Do not call sample again first, because that would replace df1 with new random values −
apply(df1,2,table,useNA="always")
Output
Running the snippets above as a single program produces the following output −
     x1 x2
1    10  8
2     5  5
<NA>  5  7
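Because sample draws at random, rerunning Example 1 gives different counts each time. One way to make such an example repeatable is to fix the random seed first. The sketch below assumes an arbitrary seed of 42, which is not part of the original example, so it will not reproduce the values shown above; it only demonstrates that the same seed yields the same data frame.

```r
# The seed value 42 is an arbitrary assumption, not taken from the original example
set.seed(42)
x1 <- sample(c(NA, 1, 2), 20, replace = TRUE)
x2 <- sample(c(NA, 1, 2), 20, replace = TRUE)
df1 <- data.frame(x1, x2)

# Resetting the same seed and repeating the draws gives the same data frame
set.seed(42)
x1_again <- sample(c(NA, 1, 2), 20, replace = TRUE)
x2_again <- sample(c(NA, 1, 2), 20, replace = TRUE)
df1_again <- data.frame(x1 = x1_again, x2 = x2_again)

same_data <- identical(df1, df1_again)
print(same_data)

# Frequencies of unique and missing values for the seeded data frame
print(apply(df1, 2, table, useNA = "always"))
```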
Example 2
The following snippet creates a sample data frame −
y1<-sample(c(NA,5,10),20,replace=TRUE)
y2<-sample(c(NA,5,10,20),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2
The following data frame is created −
   y1 y2
1   5 NA
2  NA NA
3  10 NA
4   5  5
5   5 NA
6   5  5
7   5 10
8  NA 10
9  NA 20
10  5 10
11 10 NA
12 NA  5
13 NA NA
14 10 10
15 10 10
16 10  5
17 NA 10
18 10 10
19  5 20
20 NA 10
To find the frequency of unique values and missing values for each column in df2, apply table to the data frame created above −
apply(df2,2,table,useNA="always")
Output
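Running the snippets above as a single program produces the following output. Here y2 has one more distinct value than y1, so apply cannot combine the tables into a matrix and returns a list instead −
$y1

   5   10 <NA>
   7    6    7

$y2

   5   10   20 <NA>
   4    8    2    6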
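When later code needs a predictable return type, one option is lapply, which returns a list with one table per column regardless of how many categories each column has. This is offered as an alternative sketch, not as the approach used in the examples here; the data frame df and its columns p and q are hypothetical.

```r
# Hypothetical data frame: p has 3 categories (5, 10, NA), q has 4 (5, 10, 20, NA)
df <- data.frame(p = c(5, 10, NA, 5),
                 q = c(5, 10, 20, NA))

# apply() falls back to a list here because the tables differ in length
by_apply <- apply(df, 2, table, useNA = "always")

# lapply() works column by column and returns a list in every case
by_lapply <- lapply(df, table, useNA = "always")

print(by_lapply)

# Counts for a single column can then be pulled out by name
p_counts <- by_lapply$p
```

A further difference is that apply first converts the data frame to a matrix, whereas lapply passes each column to table as it is.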
Example 3
The following snippet creates a sample data frame −
z1<-sample(c(NA,25,45),20,replace=TRUE)
z2<-sample(c(NA,25,45),20,replace=TRUE)
df3<-data.frame(z1,z2)
df3
The following data frame is created −
   z1 z2
1  45 NA
2  NA NA
3  25 25
4  25 25
5  NA NA
6  25 NA
7  NA 45
8  25 NA
9  25 25
10 NA 45
11 45 25
12 25 25
13 25 45
14 NA 25
15 45 NA
16 NA 45
17 25 45
18 25 NA
19 45 NA
20 NA 45
To find the frequency of unique values and missing values for each column in df3, apply table to the data frame created above −
apply(df3,2,table,useNA="always")
Output
Running the snippets above as a single program produces the following output −
     z1 z2
25    9  6
45    4  6
<NA>  7  8
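The result of Example 3 can be checked deterministically by typing in the values printed for df3 instead of sampling them again. The sketch below does that and, as an extra cross-check that is not part of the original example, compares the NA row of the frequency table with the counts from colSums(is.na(df3)).

```r
# Values copied from the df3 printout in Example 3
z1 <- c(45, NA, 25, 25, NA, 25, NA, 25, 25, NA,
        45, 25, 25, NA, 45, NA, 25, 25, 45, NA)
z2 <- c(NA, NA, 25, 25, NA, NA, 45, NA, 25, 45,
        25, 25, 45, 25, NA, 45, 45, NA, NA, 45)
df3 <- data.frame(z1, z2)

# Frequency of each unique value plus the NA count, per column
freq <- apply(df3, 2, table, useNA = "always")
print(freq)

# Independent count of missing values per column
na_counts <- colSums(is.na(df3))
print(na_counts)

# The last row of freq holds the NA counts, so the two should agree
na_row_matches <- all(freq[nrow(freq), ] == na_counts)
```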