How to Perform Ordinal Encoding Using Sklearn

Last Updated : 5 Aug, 2025

Ordinal encoding assigns a unique integer to each category in a feature so that the integers reflect the order of the categories. For example, in a dataset containing shirt sizes (small, medium, large), it can assign 1, 2 and 3 respectively. The point is not just to encode but to preserve the inherent ranking among categories. It is used in machine learning because many algorithms require numerical input.
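
As a minimal sketch of the idea, independent of scikit-learn, the shirt-size example can be written as an explicit mapping. The DataFrame, the column name `size` and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column name and values are illustrative only
shirts = pd.DataFrame({'size': ['medium', 'small', 'large', 'small']})

# Explicit ranking: small < medium < large
size_order = {'small': 1, 'medium': 2, 'large': 3}

# Replace each label with its rank so the order is preserved numerically
shirts['size_encoded'] = shirts['size'].map(size_order)
print(shirts)
```

Note that scikit-learn's OrdinalEncoder starts counting at 0 rather than 1, so the same ranking would come out as 0, 1 and 2.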

Let's see the implementation of Ordinal Encoding using Sklearn with the help of examples.

Example 1: Using Custom Dataset

Step 1: Import libraries

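Install scikit-learn if needed, then import pandas and OrdinalEncoder.

Python
# Notebook only: install scikit-learn if it is not already available
!pip install scikit-learn
from sklearn.preprocessing import OrdinalEncoder
import pandas as pd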

Step 2: Creating a Dataset

  • Sets up a sample dataset with students and their grades.
  • Converts the data into a pandas DataFrame.
Python
data = {
    'Student': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Grade': ['A', 'B', 'C', 'A', 'B']
}
df = pd.DataFrame(data)
print(df)

Output:

[Figure: dataset]

Step 3: Initialize and apply Ordinal Encoder

  • Initializes OrdinalEncoder and explicitly sets the order: 'A' < 'B' < 'C'.
  • Transforms the 'Grade' column into numeric codes (0.0, 1.0, 2.0; the encoder returns floats by default).
  • Stores the result in a new column Grade_encoded.
Python
encoder = OrdinalEncoder(categories=[['A', 'B', 'C']])
df['Grade_encoded'] = encoder.fit_transform(df[['Grade']])
print(df)

Output:

[Figure: ordinal encoding]
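
The explicit `categories` argument matters because, when it is omitted, OrdinalEncoder sorts the categories it finds (alphabetically for strings), which may not match the intended ranking. The sketch below uses a hypothetical shirt-size column, not part of the example above, to show the difference. The fitted order can be inspected through the `categories_` attribute.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data for illustration
sizes = pd.DataFrame({'Size': ['small', 'medium', 'large', 'medium']})

# Default: categories are sorted, so 'large' < 'medium' < 'small'
default_encoder = OrdinalEncoder()
default_codes = default_encoder.fit_transform(sizes[['Size']]).ravel()
print(default_encoder.categories_)
print(default_codes)

# Explicit order: 'small' < 'medium' < 'large'
ordered_encoder = OrdinalEncoder(categories=[['small', 'medium', 'large']])
ordered_codes = ordered_encoder.fit_transform(sizes[['Size']]).ravel()
print(ordered_codes)
```

For the grades 'A', 'B', 'C' the sorted order happens to match the intended one, but spelling it out keeps the encoding from depending on how the labels are named.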

Example 2: Loading External Dataset

Step 1: Import the required Libraries

Python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
import matplotlib.pyplot as plt
import seaborn as sns

Step 2: Load the Titanic Dataset

  • Loads the Titanic dataset from a public URL.
  • Prints the first 5 rows to inspect the data.
Python
url = "https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv"
df = pd.read_csv(url)
print(df.head())

Step 3: Encode the "Sex" column.

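  • Creates an OrdinalEncoder with the order: 'female' = 0, 'male' = 1. 'Sex' has no natural ranking, so this order is arbitrary and serves only to demonstrate the encoder on real data.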
  • Adds a new column Sex_encoded with the results.
Python
encoder = OrdinalEncoder(categories=[['female', 'male']])
df['Sex_encoded'] = encoder.fit_transform(df[['Sex']])
df[['Sex', 'Sex_encoded']].head()
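
By default the encoder returns floating-point codes (0.0 and 1.0). If integer codes are preferred, the `dtype` parameter can be set, and `inverse_transform` maps codes back to the original labels. The sketch below uses a small made-up sample in place of the full Titanic file so that it runs without a network connection.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Small made-up sample standing in for the Titanic 'Sex' column
sample = pd.DataFrame({'Sex': ['male', 'female', 'female', 'male']})

# dtype=int requests integer codes instead of the default float64
encoder = OrdinalEncoder(categories=[['female', 'male']], dtype=int)
codes = encoder.fit_transform(sample[['Sex']])
sample['Sex_encoded'] = codes.ravel()
print(sample)

# Map the codes back to the original labels to verify the round trip
restored = encoder.inverse_transform(codes).ravel()
print(restored)
```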

Step 4: Visualize the Encoded Feature

Plots the number of rows for each encoded value (0.0 and 1.0), which confirms that the transformation produced only the two expected codes.

Python
sns.countplot(x='Sex_encoded', data=df)
plt.title('Encoded Sex Distribution')
plt.show()

Output:

[Figure: Encoded Result]
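
As a non-graphical cross-check, a cross-tabulation of the original labels against the encoded values should pair each label with exactly one code. This is a minimal sketch on made-up sample data rather than the Titanic file.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Made-up sample standing in for the Titanic 'Sex' column
sample = pd.DataFrame({'Sex': ['male', 'female', 'female', 'male', 'male']})

encoder = OrdinalEncoder(categories=[['female', 'male']])
sample['Sex_encoded'] = encoder.fit_transform(sample[['Sex']]).ravel()

# Rows are original labels, columns are encoded values, cells are counts
check = pd.crosstab(sample['Sex'], sample['Sex_encoded'])
print(check)
```

Any non-zero count off the expected label-to-code pairing would indicate that the encoding did not behave as intended.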

By using scikit-learn's OrdinalEncoder, we can easily encode features that have a natural hierarchy, ensuring our models interpret the underlying order correctly.
