How to Categorize Floating Values in Python Using Pandas Library

Last Updated : 05 Apr, 2025

Categorizing floating values is a common task when working with numerical data, especially when you need to divide a continuous variable into distinct groups or bins.

In pandas, pd.cut() is a common method for categorizing floating-point values.

Python

import pandas as pd
data = {
    'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]
}

df = pd.DataFrame(data)

# Define the bin edges
bins = [0, 1, 2, 3, 4, 5, 6]

# Categorize the 'Values' column into bins
df['Categories'] = pd.cut(df['Values'], bins)
print(df)

Output:

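The pd.cut() function assigns each value the interval it falls into. Intervals are closed on the right by default, so 0.1 and 0.5 land in (0, 1], 1.2 and 1.9 in (1, 2], 2.4 in (2, 3], 3.3 in (3, 4], 4.7 in (4, 5], and 5.6 in (5, 6].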
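Boundary handling is worth checking when a value sits exactly on a bin edge or outside the bins altogether. The sketch below uses a small hypothetical series (not the article's data) to compare the default behavior with the right and include_lowest parameters of pd.cut(). Treat it as an illustration under those assumptions.

```python
import pandas as pd

# Hypothetical values chosen to hit bin edges and fall outside the bins
values = pd.Series([0.0, 0.5, 1.0, 2.5, 7.0])
bins = [0, 1, 2, 3]

# Default: intervals are (a, b], so 0.0 and 7.0 match no bin and become NaN
default_cut = pd.cut(values, bins)

# right=False switches to [a, b), so 0.0 is included and 1.0 moves up a bin
left_closed = pd.cut(values, bins, right=False)

# include_lowest=True makes the first interval include its left edge
with_lowest = pd.cut(values, bins, include_lowest=True)

# .cat.codes gives the bin position for each value; -1 marks NaN (no bin)
print(default_cut.cat.codes.tolist())
print(left_closed.cat.codes.tolist())
print(with_lowest.cat.codes.tolist())
```

Values that match no bin are returned as NaN rather than raising an error, so it can help to check for missing categories after binning.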

Categorizing floating values can help in:

Simplifying complex data for better analysis.
Grouping continuous data into discrete intervals or bins.
Preparing data for machine learning models that require categorical input.
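As a rough illustration of these uses, the sketch below bins the same sample values and then derives integer codes, one-hot columns, and a per-bin summary. The bin edges, the label names, and the column names Bin and BinCode are assumptions made for this example only.

```python
import pandas as pd

df = pd.DataFrame({'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]})

# Hypothetical edges and labels, chosen only for illustration
df['Bin'] = pd.cut(df['Values'], bins=[0, 2, 4, 6], labels=['low', 'mid', 'high'])

# Integer codes (0, 1, 2) for models that accept ordinal input
df['BinCode'] = df['Bin'].cat.codes

# One-hot columns for models that need unordered categorical input
one_hot = pd.get_dummies(df['Bin'], prefix='Bin')

# Mean of the continuous values within each bin
summary = df.groupby('Bin', observed=True)['Values'].mean()

print(df)
print(one_hot)
print(summary)
```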

Beyond the basic pd.cut() call, there are other ways to categorize floating values:

1. Using pd.qcut() for Quantile-Based Categorization

pd.qcut() divides data into quantiles, so that each bin contains roughly the same number of data points. It is useful when you want the bins to follow the distribution of the data rather than fixed edges.

Python

import pandas as pd
data = {
    'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]
}
df = pd.DataFrame(data)

# Categorize using pd.qcut (3 quantiles)
df['Category'] = pd.qcut(df['Values'], q=3)
print(df)

Output:

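The pd.qcut() function calculates the bin edges automatically from the data distribution. With these eight values and q=3, the bins cannot be exactly equal in size: they hold 3, 2, and 3 values.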
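To see which edges pd.qcut() chose, you can pass retbins=True, which returns the edges alongside the categories. The following is a minimal sketch on the same sample values; the variable names are illustrative.

```python
import pandas as pd

values = pd.Series([0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6])

# retbins=True also returns the computed bin edges as a NumPy array
categories, edges = pd.qcut(values, q=3, retbins=True)

# Four edges define the three quantile bins
print(edges)

# Number of values per bin, listed in bin order
print(categories.value_counts().sort_index())
```

The returned edges can be passed to pd.cut() as bins if you want to apply the same intervals to another dataset.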

2. Creating Custom Categorization Logic with apply()

For more advanced categorization, you can use apply() with a custom function. This method gives you full control over how you categorize the data based on your own logic.

Python

import pandas as pd

data = {
    'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]
}
df = pd.DataFrame(data)

# Custom categorization function
def categorize(value):
    if value < 1:
        return 'Low'
    elif value < 3:
        return 'Medium'
    elif value < 5:
        return 'High'
    else:
        return 'Very High'

# Apply the custom function to categorize the values
df['Category'] = df['Values'].apply(categorize)
print(df)

Output:

The function checks each value against the thresholds in order and returns the first category that matches: 'Low' for values below 1, 'Medium' for values below 3, 'High' for values below 5, and 'Very High' otherwise.
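When the custom logic is a simple chain of thresholds like this one, the same result can usually be obtained without apply(). The sketch below is an alternative, not the approach shown above: it uses pd.cut() with right=False and infinite outer edges so that each interval is [a, b), matching the "less than" comparisons.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]})

# right=False gives [a, b) intervals, mirroring the `value < threshold` checks;
# the infinite edges cover everything below 1 and everything from 5 upward
df['Category'] = pd.cut(
    df['Values'],
    bins=[-np.inf, 1, 3, 5, np.inf],
    labels=['Low', 'Medium', 'High', 'Very High'],
    right=False,
)
print(df)
```

One difference to keep in mind: a missing value (NaN) stays NaN with pd.cut(), whereas a function written as a chain of `<` comparisons sends NaN to the final else branch, because every comparison with NaN is False.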

3. Labeling Data with Custom Categories

If you prefer more intuitive labels, you can use pd.cut() with custom labels. This makes the output more readable, especially when you want your categories to reflect a specific interpretation of the bins.

Python

import pandas as pd
data = {
    'Values': [0.1, 0.5, 1.2, 1.9, 2.4, 3.3, 4.7, 5.6]
}
df = pd.DataFrame(data)

# Define custom labels
labels = ['Very Low', 'Low', 'Medium', 'High']

# Categorize using pd.cut with custom labels
df['Category'] = pd.cut(df['Values'], bins=[0, 1, 2, 4, 6], labels=labels)
print(df)

Output:

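We used the custom labels 'Very Low', 'Low', 'Medium', and 'High' to describe the four bins. The pd.cut() function assigns each value the label of the bin it falls into, so 0.1 and 0.5 become 'Very Low', 1.2 and 1.9 become 'Low', 2.4 and 3.3 become 'Medium', and 4.7 and 5.6 become 'High'. The number of labels must be one less than the number of bin edges.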
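Labeled bins produced by pd.cut() are stored as an ordered categorical by default, which means they can be compared by bin order. The sketch below illustrates this on a shortened, hypothetical set of values.

```python
import pandas as pd

# Shortened sample, one value per bin, for illustration
values = pd.Series([0.1, 1.2, 2.4, 5.6])
bins = [0, 1, 2, 4, 6]
labels = ['Very Low', 'Low', 'Medium', 'High']

# pd.cut() needs exactly one label per interval
assert len(labels) == len(bins) - 1

categories = pd.cut(values, bins=bins, labels=labels)

# The categories keep the order of the bins
print(categories.cat.ordered)

# Ordered categories support comparisons against a label
at_least_medium = categories >= 'Medium'
print(at_least_medium.tolist())
```

This makes it possible to filter rows by category rank, for example selecting everything in 'Medium' or above, without mapping the labels back to numbers.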
