Count unique values per group with Pandas
Last Updated: 29 Jul, 2021
Prerequisites: Pandas
In this article, we find and count the unique values present in a group (column) of a Pandas DataFrame. Unique values are the distinct values in the data: when a value is duplicated, only its first occurrence is counted.
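To make the notion of a unique value concrete, here is a minimal sketch on a small hypothetical Series (the data is made up for illustration and is not the article's dataset). It contrasts the distinct values returned by unique() with the narrower set of values that occur exactly once.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
values = pd.Series([3, 4, 3, 5, 4])

# Distinct values, in order of first appearance
distinct = values.unique()

# Values that occur exactly once (every duplicated value is dropped)
singletons = values[~values.duplicated(keep=False)]

print(distinct.tolist())    # [3, 4, 5]
print(singletons.tolist())  # [5]
```

The rest of the article uses "unique" in the first sense, that is, the distinct values of a column.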
Approach:
- Import the pandas library.
- Create a dataframe with the DataFrame() function, passing the data as a parameter, and name it "df"; or import a dataset with the pandas.read_csv() function, passing the path and name of the dataset.
- Select the column in which you want to check or count the unique values.
- To find the unique values, use the unique() function provided by pandas and store the result in a variable, here named 'unique_values'.
Syntax: pandas.unique(df['column_name']) or df['column_name'].unique()
- It will give the unique values present in that group/column.
- To count the number of unique values, initialize a variable, here named 'count', to 0, then loop over 'unique_values' and increment 'count' by 1 on each iteration.
- Then print 'count'; the stored value is the number of unique values present in that group/column.
- To find how many times each unique value occurs in a column, use the value_counts() function provided by Pandas.
Syntax: pandas.value_counts(df['column_name']) or df['column_name'].value_counts()
- This gives the number of times each unique value occurs in that column. The top-level pandas.value_counts() form is deprecated in recent pandas releases, so the method form is the safer choice.
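The steps above can be sketched end to end on a toy dataframe. The column name 'grade' and its values are hypothetical, chosen only to keep the example short; len() is shown next to the loop as an equivalent shortcut.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({'grade': ['A', 'B', 'A', 'C', 'B', 'A']})

# Step 1: find the unique values of the column
unique_values = df['grade'].unique()

# Step 2: count them with an explicit loop, as described in the approach
count = 0
for _ in unique_values:
    count += 1

# Step 3: count how often each unique value occurs
frequencies = df['grade'].value_counts()

print(count)               # 3
print(len(unique_values))  # 3, same result without the loop
print(frequencies['A'])    # 3
```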
For a better understanding of the topic, let's take some examples and implement the functions discussed in the approach above.
Example 1: Creating DataFrame using pandas library.
Python
import pandas as pd

# Sample car data: model name, number of gears, number of cylinders
car_data = {
    'Model Name': ['Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
                   'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL',
                   'Merc 450SLC', 'Cadillac Fleetwood',
                   'Lincoln Continental', 'Chrysler Imperial',
                   'Fiat 128', 'Honda Civic', 'Toyota Corolla'],
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
}

car_df = pd.DataFrame(car_data)
car_df  # displays the dataframe in a notebook; use print(car_df) in a script
Output:
Example 2: Printing the unique values present in each group.
Python
import pandas as pd

# Sample car data: model name, number of gears, number of cylinders
car_data = {
    'Model Name': ['Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
                   'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL',
                   'Merc 450SLC', 'Cadillac Fleetwood',
                   'Lincoln Continental', 'Chrysler Imperial',
                   'Fiat 128', 'Honda Civic', 'Toyota Corolla'],
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
}

car_df = pd.DataFrame(car_data)

# Series.unique() returns the distinct values in order of first appearance
print(f"Unique values present in Gear column are: {car_df['Gear'].unique()}")
print(f"Unique values present in Cylinder column are: {car_df['Cylinder'].unique()}")
Output:
The output shows three unique values in each group: 3, 4 and 5 in Gear, and 6, 8 and 4 in Cylinder.
Example 3: Another way of finding the unique values present in each group.
Python
import pandas as pd

# Sample car data: model name, number of gears, number of cylinders
car_data = {
    'Model Name': ['Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
                   'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL',
                   'Merc 450SLC', 'Cadillac Fleetwood',
                   'Lincoln Continental', 'Chrysler Imperial',
                   'Fiat 128', 'Honda Civic', 'Toyota Corolla'],
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
}

car_df = pd.DataFrame(car_data)

# pd.unique() takes the column as an argument instead of being called on it
unique_gear = pd.unique(car_df.Gear)
unique_cyl = pd.unique(car_df.Cylinder)

print(f"Unique values present in Gear column are: {unique_gear}")
print(f"Unique values present in Cylinder column are: {unique_cyl}")
Output:
The output is the same as in the previous example; the difference is that here we found the unique values in each group with the pd.unique() function, passing it a dataframe column.
Example 4: Counting the number of times each unique value occurs.
Python
import pandas as pd

# Sample car data: model name, number of gears, number of cylinders
car_data = {
    'Model Name': ['Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
                   'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL',
                   'Merc 450SLC', 'Cadillac Fleetwood',
                   'Lincoln Continental', 'Chrysler Imperial',
                   'Fiat 128', 'Honda Civic', 'Toyota Corolla'],
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
}

car_df = pd.DataFrame(car_data)

# First method: top-level function. It is deprecated as of pandas 2.1 and
# may be unavailable in newer releases; prefer the second method there.
gear_count = pd.value_counts(car_df.Gear)
cyl_count = pd.value_counts(car_df.Cylinder)

# Second method: Series.value_counts()
g_count = car_df['Gear'].value_counts()
cy_count = car_df['Cylinder'].value_counts()

print('----Output from first method-----')
print(gear_count)
print(cyl_count)

print('----Output from second method----')
print(g_count)
print(cy_count)
Output:
Both methods of writing the code give the same result.
In the Gear column, the unique values 3, 4 and 5 occur 8, 6 and 1 times respectively, whereas in the Cylinder column the unique values 8, 4 and 6 occur 7, 5 and 3 times respectively.
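The Series returned by value_counts() is indexed by the unique values, so an individual count can be looked up directly. The sketch below reuses the article's Gear values in a standalone Series; the variable names are illustrative, and the normalize and sort_index() variations are optional extras rather than part of the approach described here.

```python
import pandas as pd

# Gear values from the article's dataset, as a standalone Series
gear = pd.Series([3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4], name='Gear')

# Counts are sorted from most to least frequent by default
counts = gear.value_counts()
print(counts.loc[3])  # 8

# normalize=True returns relative frequencies instead of raw counts
shares = gear.value_counts(normalize=True)

# sort_index() orders the result by the unique value rather than by count
by_value = gear.value_counts().sort_index()
print(by_value.tolist())  # [8, 6, 1]
```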
Example 5: Counting the number of unique values present in each group.
Python
import pandas as pd

# Sample car data: model name, number of gears, number of cylinders
car_data = {
    'Model Name': ['Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
                   'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL',
                   'Merc 450SLC', 'Cadillac Fleetwood',
                   'Lincoln Continental', 'Chrysler Imperial',
                   'Fiat 128', 'Honda Civic', 'Toyota Corolla'],
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
}

car_df = pd.DataFrame(car_data)

# Unique values of each column
name_count = pd.unique(car_df['Model Name'])
gear_count = pd.unique(car_df.Gear)
cyl_count = pd.unique(car_df.Cylinder)

# Counters for the number of unique values in each column
name_unique = 0
gear_unique = 0
cyl_unique = 0

# Increment each counter once per unique value
for item in name_count:
    name_unique += 1
for item in gear_count:
    gear_unique += 1
for item in cyl_count:
    cyl_unique += 1

print(f'Number of unique values present in Model Name: {name_unique}')
print(f'Number of unique values present in Gear: {gear_unique}')
print(f'Number of unique values present in Cylinder: {cyl_unique}')
Output:
The output shows 15, 3 and 3 unique values in the Model Name, Gear and Cylinder columns respectively.
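The counting loop can also be replaced by the built-in nunique() method, and combining it with groupby() counts unique values within each group of another column. The following is a minimal sketch that reuses the article's Gear and Cylinder values; grouping Cylinder by Gear is an illustrative choice, not something the examples above do.

```python
import pandas as pd

# Gear and Cylinder values from the article's dataset
car_df = pd.DataFrame({
    'Gear': [3, 3, 4, 4, 5, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4],
    'Cylinder': [6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4],
})

# Number of unique values in a single column
gear_unique = car_df['Gear'].nunique()
print(gear_unique)  # 3

# Number of unique values in every column at once
per_column = car_df.nunique()

# Number of distinct Cylinder values within each Gear group
cyl_per_gear = car_df.groupby('Gear')['Cylinder'].nunique()
print(cyl_per_gear.to_dict())  # {3: 2, 4: 2, 5: 1}
```

Note that nunique() ignores missing values by default, whereas unique() includes NaN as a value, so the two approaches can differ on columns that contain missing data.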