
Fill Missing Column Values with Median in Python Pandas
The median is the middle value of a sorted data set; it separates the higher half of the data from the lower half. To fill the missing values in a column with its median, compute the median with median() and pass it to the fillna() method. First, import the required libraries with their usual aliases −
import pandas as pd
import numpy as np
Create a DataFrame with 2 columns. The missing values are set using NumPy's np.nan (the older alias np.NaN was removed in NumPy 2.0) −
dataFrame = pd.DataFrame(
    {
        "Car": ['Lexus', 'BMW', 'Audi', 'Bentley', 'Mustang', 'Tesla'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)
Find the median of the column that contains NaN values, here the Units column, using median(). Then replace the NaNs in that column with the median using fillna(). Passing a dictionary keyed by column name restricts the fill to the Units column −
dataFrame.fillna({"Units": dataFrame["Units"].median()}, inplace=True)
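The call above fills missing values with the median of the Units column. When several numeric columns contain missing values, one possible approach is to pass the per-column medians to fillna(), so that each column is filled with its own median. The following is a minimal sketch under that assumption; the Price column and its values are hypothetical and are not part of the original example −

```python
import pandas as pd
import numpy as np

# Hypothetical DataFrame: the "Price" column is made up for illustration
df = pd.DataFrame(
    {
        "Car": ['Lexus', 'BMW', 'Audi', 'Bentley'],
        "Units": [100, 150, np.nan, 80],
        "Price": [40.0, np.nan, 30.0, 50.0]
    }
)

# Series of medians indexed by column name;
# numeric_only=True skips the non-numeric "Car" column
medians = df.median(numeric_only=True)

# fillna() with a Series fills each column with the value
# stored under the matching column label
filled = df.fillna(medians)
print(filled)
```

Here fillna() returns a new DataFrame instead of modifying df in place, which keeps the original data available for comparison.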
Example
Following is the code −
import pandas as pd
import numpy as np

# Create DataFrame
dataFrame = pd.DataFrame(
    {
        "Car": ['Lexus', 'BMW', 'Audi', 'Bentley', 'Mustang', 'Tesla'],
        "Units": [100, 150, np.nan, 80, np.nan, np.nan]
    }
)

print("DataFrame ...")
print(dataFrame)

# Find the median of the column that contains NaN values, i.e. Units,
# and replace the NaNs in that column with it
dataFrame.fillna({"Units": dataFrame["Units"].median()}, inplace=True)

print("\nUpdated Dataframe after filling NaN values with median...")
print(dataFrame)
Output
This will produce the following output −
DataFrame ...
       Car  Units
0    Lexus  100.0
1      BMW  150.0
2     Audi    NaN
3  Bentley   80.0
4  Mustang    NaN
5    Tesla    NaN

Updated Dataframe after filling NaN values with median...
       Car  Units
0    Lexus  100.0
1      BMW  150.0
2     Audi  100.0
3  Bentley   80.0
4  Mustang  100.0
5    Tesla  100.0
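The fill value 100.0 in the output above follows from how median() treats missing values: by default it skips NaN, so only the three known values 80, 100 and 150 are considered, and the middle one is 100. A minimal sketch to check this, using a standalone Series named units (a name introduced here only for illustration) −

```python
import pandas as pd
import numpy as np

# Same values as the Units column in the example
units = pd.Series([100, 150, np.nan, 80, np.nan, np.nan])

# skipna=True is the default, so NaN entries are ignored
median_value = units.median()
print(median_value)

# With skipna=False, the presence of any NaN makes the result NaN
median_with_nan = units.median(skipna=False)
print(median_with_nan)
```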