How to count the number of NaN values in Pandas?

Last Updated : 15 Nov, 2024

Let's discuss how to count the number of NaN values in a Pandas DataFrame. In Pandas, NaN (Not a Number) marks missing data. The detection methods used below, isnull() and isna(), also treat None and NaT as missing.

Counting NaN Values in Each Column of a Pandas DataFrame

To find the number of missing (NaN) values in each column, call isnull() followed by sum(). isnull() returns a boolean DataFrame of the same shape, with True wherever a value is missing, and sum() then counts the True values in each column.

Python

import pandas as pd
import numpy as np

# Example dataset
data = {
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan]
}
df = pd.DataFrame(data)

# Count NaNs in each column
column_nan_count = df.isnull().sum()
print("NaN count per column:")
print(column_nan_count)

Output:

Figure: Pandas DataFrame with NaN values

NaN count per column:
A    1
B    2
C    3
dtype: int64
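
The same pattern works on a single column, since selecting a column returns a Series with its own isna() method. A minimal sketch, rebuilding the example data so it runs on its own (the variable name nan_in_b is only illustrative):

```python
import numpy as np
import pandas as pd

# Same example data as above
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
})

# Select one column (a Series) and count its missing entries
nan_in_b = df['B'].isna().sum()
print("NaN count in column B:", nan_in_b)
```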

Counting NaN Values of Specific Rows

To count NaNs in a specific row, select it with loc (by label) or iloc (by position), then call isnull().sum() on the resulting Series.

Python

# Count NaNs in the first row
row_nan_count = df.iloc[0].isnull().sum()
print("NaN count in the first row:", row_nan_count)

Output:

NaN count in the first row: 1
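
When the count is needed for every row rather than one selected row, summing the boolean mask along axis=1 avoids a loop. A short sketch on the same example data (row_nan_counts is an illustrative name):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
})

# axis=1 sums across the columns, giving one NaN count per row
row_nan_counts = df.isna().sum(axis=1)
print("NaN count per row:")
print(row_nan_counts)
```

For this data the third row, which is entirely NaN, gets a count of 3, and every other row gets 1.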

Counting NaN Values in the Entire DataFrame

To get the total count of NaN values across the entire DataFrame, use isnull().sum().sum(). The first sum() counts the NaNs in each column, and the second adds those per-column counts into a single total.

Python

# Count total NaNs in the DataFrame
total_nan_count = df.isnull().sum().sum()
print("Total NaN count:", total_nan_count)

Output:

Total NaN count: 6
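
The total can also be obtained in other ways, which may be handy as a cross-check. The sketch below shows two alternatives on the same example data: summing the mask as a NumPy array, and subtracting the number of non-missing cells from the total number of cells. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
})

# Option 1: convert the boolean mask to a NumPy array and sum all of it at once
total_via_numpy = int(df.isna().to_numpy().sum())

# Option 2: total cells (df.size) minus non-missing cells (count() per column)
total_via_count = int(df.size - df.count().sum())

print("Total NaN count (NumPy):", total_via_numpy)
print("Total NaN count (size - count):", total_via_count)
```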

Using `isna()` as an Alternative

The isna() method detects missing values in exactly the same way as isnull(). In Pandas, isnull() is an alias of isna(), so the two can be used interchangeably with identical results.

Python

# Using isna() to count NaNs in each column
column_nan_count_isna = df.isna().sum()
print("NaN count per column using isna():") 
print(column_nan_count_isna)

Output:

NaN count per column using isna():
A    1
B    2
C    3
dtype: int64
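
To confirm that the two methods agree on a given DataFrame, their boolean masks can be compared directly with equals(). A minimal check on the example data (masks_match is an illustrative name):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
})

# equals() returns True only if both masks have the same shape and values
masks_match = df.isna().equals(df.isnull())
print("isna() and isnull() give the same mask:", masks_match)
```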

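Using describe() to Find Non-NaN Values in Each Column

The describe() method gives a quick summary of each column, and its count row holds the number of non-NaN values. Subtracting this count from the total number of rows gives the NaN count. Note that describe() covers only numeric columns by default, and that the counts come back as floats, as the output below shows.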

Python

# Using describe() for additional insights
non_nan_count = df.describe().loc['count']
nan_count_using_describe = len(df) - non_nan_count
print("NaN count per column using describe():")
print(nan_count_using_describe)

Output:

NaN count per column using describe():
A    1.0
B    2.0
C    3.0
Name: count, dtype: float64

Per-column NaN counts like these help you decide whether to drop rows, drop columns, or fill missing values, based on the proportion of NaNs in each feature.
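
If only the counts are needed, the same subtraction can be done with count() directly, which returns integers and also covers non-numeric columns. The proportion of NaNs mentioned above is the mean of the isna() mask. A minimal sketch on the example data (nan_count and nan_fraction are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
})

# count() returns the number of non-missing values per column
nan_count = len(df) - df.count()

# The mean of a boolean mask is the fraction of True values,
# i.e. the share of missing entries in each column
nan_fraction = df.isna().mean()

print("NaN count per column using count():")
print(nan_count)
print("Fraction of NaNs per column:")
print(nan_fraction)
```

Here the fractions come out as 0.25, 0.5 and 0.75 for columns A, B and C.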

Identifying Rows or Columns with NaN Values

Sometimes you might need to identify which rows or columns contain any NaN values, rather than counting them.

1. Check for Columns with Any NaN Values

To check which columns contain at least one NaN value, use isna().any() on the DataFrame. The result is a boolean Series indexed by column name.

Python

columns_with_nan = df.isna().any()
print("Columns with NaN values:") 
print(columns_with_nan)

Output:

Columns with NaN values:
A    True
B    True
C    True
dtype: bool
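
The boolean result can be used to pull out just the names of the affected columns. In the sketch below, a hypothetical column D with no missing values is added to the example data (it is not part of the original dataset) so that the filter has something to exclude.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, np.nan, 3],
    'C': [1, np.nan, np.nan, np.nan],
    'D': [1, 2, 3, 4],  # hypothetical complete column, added for illustration
})

# Index the column labels with the boolean mask to keep only columns with NaNs
nan_column_names = df.columns[df.isna().any()].tolist()
print("Columns containing NaN:", nan_column_names)
```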

2. Check for Rows with Any NaN Values

To check which rows contain at least one NaN, use isna().any(axis=1). With axis=1, any() is evaluated across the columns of each row, so the result holds one boolean per row.

Python

rows_with_nan = df.isna().any(axis=1)
print("Rows with NaN values") 
print(rows_with_nan)

Output:

Rows with NaN values
0    True
1    True
2    True
3    True
dtype: bool
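
The row mask can also be used to select the incomplete rows themselves. Because every row of the example data contains a NaN, the sketch below appends one hypothetical complete row (index 4, not part of the original dataset) to show the contrast with dropna().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, 5],            # last value is a hypothetical addition
    'B': [np.nan, 2, np.nan, 3, 6],       # last value is a hypothetical addition
    'C': [1, np.nan, np.nan, np.nan, 7],  # last value is a hypothetical addition
})

# Boolean indexing keeps only the rows with at least one NaN
incomplete_rows = df[df.isna().any(axis=1)]

# dropna() does the opposite and keeps only fully populated rows
complete_rows = df.dropna()

print("Rows with at least one NaN:")
print(incomplete_rows)
print("Rows without NaN:")
print(complete_rows)
```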

Knowing how to count and locate NaNs in your data is essential for cleaning and preprocessing.