Ordinal encoding assigns a unique integer to each category of a feature so that the integers follow the order of the categories. For example, in a dataset containing shirt sizes (small, medium, large), it can assign 1, 2 and 3 respectively. The point is not just to turn labels into numbers but to preserve the inherent ranking among categories. It is widely used in machine learning because many algorithms accept only numerical input.
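As a minimal sketch of the idea, the shirt-size mapping can be written by hand in plain Python. The names `size_order`, `size_to_code` and `encode_size` are illustrative only and not part of any library; codes start at 1 to match the example above.

```python
# Illustrative ranking, lowest to highest (assumed for this sketch)
size_order = ['small', 'medium', 'large']

# Map each category to its 1-based position in the ranking
size_to_code = {size: rank for rank, size in enumerate(size_order, start=1)}


def encode_size(size):
    """Return the integer code for a size (hypothetical helper)."""
    return size_to_code[size]


shirts = ['medium', 'small', 'large', 'small']
encoded = [encode_size(s) for s in shirts]
print(encoded)  # [2, 1, 3, 1]
```

Because the codes follow the ranking, a comparison such as `encode_size('small') < encode_size('large')` holds, which is the property ordinal encoding is meant to preserve.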
Let's look at how to implement ordinal encoding with scikit-learn through two examples.
Example 1: Using Custom Dataset
Step 1: Import libraries
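Install scikit-learn if it is not already available (the leading ! runs the command from a notebook cell), then import pandas and OrdinalEncoder.
!pip install scikit-learn
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder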
Step 2: Creating a Dataset
- Sets up a sample dataset with students and their grades.
- Converts the data into a pandas DataFrame.
data = {
    'Student': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Grade': ['A', 'B', 'C', 'A', 'B']
}
df = pd.DataFrame(data)
print(df)

Step 3: Initialize and apply Ordinal Encoder
- Initializes OrdinalEncoder and explicitly sets the order: 'A' < 'B' < 'C'.
- Transforms the 'Grade' column into numeric codes (0, 1, 2), which are returned as floats by default.
- Stores the result in a new column, Grade_encoded.
encoder = OrdinalEncoder(categories=[['A', 'B', 'C']])
df['Grade_encoded'] = encoder.fit_transform(df[['Grade']])
print(df)
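For grades, alphabetical order happens to match the intended ranking, so the effect of the explicit `categories` list is easy to miss. The sketch below uses a hypothetical shirt-size column (not part of the dataset above) to show what can happen when the list is left out, and how `inverse_transform` recovers the original labels.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data: intended order is small < medium < large
sizes = pd.DataFrame({'Size': ['small', 'large', 'medium', 'small']})

# Without `categories`, the encoder sorts the labels it sees,
# so strings end up in alphabetical order: large, medium, small
default_encoder = OrdinalEncoder()
default_codes = default_encoder.fit_transform(sizes[['Size']]).ravel()
print(default_encoder.categories_[0])
print(default_codes)

# With an explicit list, the codes follow the intended ranking
ordered_encoder = OrdinalEncoder(categories=[['small', 'medium', 'large']])
ordered_codes = ordered_encoder.fit_transform(sizes[['Size']]).ravel()
print(ordered_codes)

# inverse_transform expects a 2D array and maps codes back to labels
restored = ordered_encoder.inverse_transform(ordered_codes.reshape(-1, 1)).ravel()
print(restored)
```

With the default encoder, 'large' receives the lowest code, which contradicts the real ranking. Passing the order explicitly avoids this.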

Example 2: Loading External Dataset
Step 1: Import the required Libraries
- Imports pandas and sklearn as above.
- Imports matplotlib and seaborn for data visualization.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Load the Titanic Dataset
- Loads the Titanic dataset from a public URL.
- Prints the first 5 rows to inspect the data.
url = "https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv"
df = pd.read_csv(url)
print(df.head())

Step 3: Encode the "Sex" column.
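- Creates an OrdinalEncoder with the order: 'female' = 0, 'male' = 1. Sex has no natural ranking, so this order is arbitrary; with only two categories the result is simply a 0/1 indicator.
- Adds a new column, Sex_encoded, with the results.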
encoder = OrdinalEncoder(categories=[['female', 'male']])
df['Sex_encoded'] = encoder.fit_transform(df[['Sex']])
print(df[['Sex', 'Sex_encoded']].head())

Step 4: Visualize the Encoded Feature
Plots the count of each encoded value (0 and 1) to confirm the transformation.
sns.countplot(x='Sex_encoded', data=df)
plt.title('Encoded Sex Distribution')
plt.show()
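A numeric check can complement the plot. The sketch below uses a few made-up rows in place of the Titanic CSV so that it runs without a download; the `sample` DataFrame and its values are assumptions for illustration only. It compares the counts per original label with the counts per code.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Made-up rows standing in for the 'Sex' column of the Titanic data
sample = pd.DataFrame({'Sex': ['male', 'female', 'female', 'male', 'male']})

encoder = OrdinalEncoder(categories=[['female', 'male']])
sample['Sex_encoded'] = encoder.fit_transform(sample[['Sex']]).ravel()

# Counts per label and per code should line up one to one
label_counts = sample['Sex'].value_counts().to_dict()
code_counts = sample['Sex_encoded'].value_counts().to_dict()
print(label_counts)
print(code_counts)
```

If the counts for 'female' and 0.0 match, and likewise for 'male' and 1.0, the encoding was applied as intended.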

With scikit-learn's OrdinalEncoder, we can encode features that have a natural hierarchy so that the numeric codes passed to a model reflect the underlying order.