Missing values represent data that is unknown or unavailable. In R, a missing value is represented by NA (Not Available). NaN (Not a Number), the result of an undefined numeric operation such as 0/0, is also treated as missing by is.na(). Handling missing values is an important step in data preprocessing because they can affect analysis results and model performance.
- Missing values can distort statistical calculations and visualizations.
- Proper detection and treatment of missing data improves data quality and reliability.
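As a small illustration of the points above, the following sketch shows how NA and NaN differ and how a single missing value affects a summary statistic. The vector v and the variable names are made up for this example.

```r
# NA marks a missing value; NaN is the result of an undefined numeric operation.
v <- c(1, NA, 0 / 0)        # 0 / 0 evaluates to NaN

na_flags  <- is.na(v)       # TRUE for both NA and NaN
nan_flags <- is.nan(v)      # TRUE only for NaN

# A missing value propagates through most summaries ...
plain_mean <- mean(v)       # the result is itself missing
# ... unless it is dropped explicitly with na.rm = TRUE.
clean_mean <- mean(v, na.rm = TRUE)

print(na_flags)
print(nan_flags)
print(clean_mean)
```

Here na.rm = TRUE drops both the NA and the NaN, so the mean is computed from the single remaining value.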
1. Detecting Missing Values Using is.na()
The is.na() function returns a logical vector of the same length as its input. Each element is TRUE where the corresponding value is NA (or NaN) and FALSE otherwise.
x <- c(NA, 3, 4, NA, NA, NA)
is.na(x)
Output:
[1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE
2. Extracting Non-Missing Values
To keep only the available values, negate the result of is.na() and use it as a logical index. This drops both NA and NaN values.
x <- c(1, 2, NA, 3, NA, 4)
d <- is.na(x)
x[!d]
Output:
[1] 1 2 3 4
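A few related base R helpers can answer common follow-up questions about the same vector: whether anything is missing, where, and how many values are affected. The sketch below is one possible way to combine them; the variable names are illustrative only.

```r
x <- c(1, 2, NA, 3, NA, 4)

has_missing <- anyNA(x)          # TRUE if at least one element is missing
missing_at  <- which(is.na(x))   # positions of the missing elements
n_missing   <- sum(is.na(x))     # each TRUE counts as 1
available   <- x[!is.na(x)]      # detection and extraction in one step

print(has_missing)
print(missing_at)
print(n_missing)
print(available)
```

Indexing with !is.na(x) directly avoids the intermediate variable and gives the same result as the two-step version.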
3. Missing Value Filter Functions (na.action)
Many R modeling functions include the na.action argument, which controls how missing values are handled.
Common options include:
- na.omit() : Removes rows containing any NA
- na.fail() : Stops execution if missing values are found
- na.exclude() : Removes missing rows but keeps their positions for predictions
- na.pass() : Leaves missing values unchanged
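df <- data.frame(c1 = 1:8,
                 c2 = factor(c("B", "A", "B", "C",
                               "A", "C", "B", "A")))
# Introduce one missing value in each column
df[4, 1] <- df[6, 2] <- NA
# Setting a factor element to NA does not remove any of its levels
levels(df$c2)
# na.fail() stops with an error because df contains missing values
na.fail(df)
# Not reached when run as a script; na.exclude() would drop rows 4 and 6
na.exclude(df)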
Output:
[1] "A" "B" "C"
Error in na.fail.default(df) : missing values in object
Calls: na.fail -> na.fail.default
Execution halted
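To see how the na.action argument behaves inside a modeling function, the sketch below fits the same linear model with na.omit and with na.exclude. The data frame d and its values are invented for illustration; lm(), residuals() and tryCatch() are base R.

```r
# Hypothetical data: y is missing in rows 2 and 5
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2.1, NA, 6.2, 7.9, NA, 12.1))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

# na.omit drops the incomplete rows, so only 4 residuals come back
res_omit <- residuals(fit_omit)
# na.exclude also fits on 4 rows, but pads the residuals with NA
# so that they line up with the 6 rows of the original data
res_exclude <- residuals(fit_exclude)

# na.fail raises an error instead; tryCatch() captures the message
fail_msg <- tryCatch(na.fail(d), error = function(e) conditionMessage(e))

print(length(res_omit))
print(length(res_exclude))
print(fail_msg)
```

Padding with NA is convenient when residuals or fitted values need to be attached back to the original data frame as a new column.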
Finding and Removing Missing Values in a Dataset
The same ideas apply to an entire dataset. The steps below show how to find, count, visualize and remove missing values in a data frame.
1. Creating Sample Data
First we will create one data frame containing NA values.
data <- data.frame(
A = c(1, 2, NA, 4, 5),
B = c(NA, 2, 3, NA, 5),
C = c(1, 2, 3, NA, NA)
)
data
Output:
   A  B  C
1  1 NA  1
2  2  2  2
3 NA  3  3
4  4 NA NA
5  5  5 NA
2. Count Total Missing Values
Applied to a data frame, is.na() returns a logical matrix. Summing it counts the missing values in the whole dataset, because each TRUE counts as 1.
sum(is.na(data))
Output:
[1] 5
3. Count Missing Values Column-Wise
To count missing values per column, apply colSums() to the result of is.na(). It returns the number of NA values in each column.
x <- colSums(is.na(data))
x <- as.data.frame(x)
x
Output:
  x
A 1
B 2
C 2
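The same pattern extends to rows and to proportions. As a sketch, rowSums() counts the missing values in each row and colMeans() gives the share of missing values in each column; the data frame is the one from step 1, repeated here so the example runs on its own.

```r
data <- data.frame(
  A = c(1, 2, NA, 4, 5),
  B = c(NA, 2, 3, NA, 5),
  C = c(1, 2, 3, NA, NA)
)

na_per_row <- rowSums(is.na(data))    # number of NA values in each row
na_share   <- colMeans(is.na(data))   # fraction of NA values in each column

print(na_per_row)
print(na_share)
```

Row-wise counts help identify records that are mostly empty, while column-wise shares help decide whether a variable is usable at all.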
4. Visualizing Missing Values
We will visualize the NA values using the visdat package in R. The vis_miss() function will plot a diagram representing the missing and present values.
install.packages("visdat")
library(visdat)
data <- data.frame(
A = c(1, NA, 3, NA, 5),
B = c(NA, 2, NA, 4, NA),
C = c(1, 2, 3, NA, NA)
)
vis_miss(data)
Output:

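5. Removing Missing Values
The na.omit() function removes every row that contains at least one missing value. Note that data was redefined in step 4 and every row of that version contains an NA, so the result here has zero rows. Applied to the data frame from step 1, only row 2 would remain.
data <- na.omit(data)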
data
Output:

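An alternative to na.omit() is complete.cases(), which returns a logical vector that can be inspected before any rows are dropped. The sketch below uses the data frame from step 1 and also shows how to drop rows based on a single column only; the variable names are illustrative.

```r
data <- data.frame(
  A = c(1, 2, NA, 4, 5),
  B = c(NA, 2, 3, NA, 5),
  C = c(1, 2, 3, NA, NA)
)

# TRUE for rows without any missing value
keep <- complete.cases(data)
complete_rows <- data[keep, ]

# Drop rows only when column A is missing; NA in B or C is kept
a_known <- data[!is.na(data$A), ]

print(keep)
print(complete_rows)
print(nrow(a_known))
```

Filtering on a single column keeps more rows than na.omit(), which can matter when only some variables are needed for an analysis.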