Binary operations on Pandas DataFrame and Series
Last Updated: 02 Jan, 2025
Binary operations involve applying mathematical or logical operations on two objects, typically DataFrames or Series, to produce a new result. Let's learn how binary operations work in Pandas, focusing on their usage with DataFrames and Series.
The most common binary operations include:
- Arithmetic operations: Addition, subtraction, multiplication, division, etc.
- Comparison operations: Equal to, not equal to, greater than, less than, etc.
- Logical operations: And, or, etc.
Pandas applies these operations element-wise, pairing each value in one object with the value at the same row and column label in the other. This vectorized behavior avoids explicit Python loops, which is particularly useful when working with large datasets.
Binary Operations on Pandas Series
1. Arithmetic Operations on Series
Arithmetic operations between two Series are applied element-wise. Pandas first aligns the two Series on their index labels, then applies the operation to each matching pair. If a label appears in only one of the Series, the result for that label is NaN.
Example: Adding Two Series
Python
import pandas as pd
s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
# Adding the two Series
result = s1 + s2
print(result)
Output
a    11
b    22
c    33
dtype: int64
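To illustrate the alignment rule described above, here is a minimal sketch with two Series whose indexes only partly overlap. The index labels and the variable name `misaligned` are chosen for illustration only.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([1, 2, 3], index=['b', 'c', 'd'])

# Labels 'a' and 'd' each appear in only one Series, so their results are NaN.
# The result index is the union of both indexes.
misaligned = s1 + s2
print(misaligned)
# a     NaN
# b    21.0
# c    32.0
# d     NaN
# dtype: float64
```

Note that the result dtype becomes float64, because NaN cannot be stored in an integer Series.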
2. Comparison Operations on Series
Comparison operations return a Series of boolean values, indicating whether the comparison is True or False for each corresponding element.
Example: Checking Equality
Python
import pandas as pd
s1 = pd.Series([10, 20, 30])
s2 = pd.Series([10, 25, 30])
# Comparing the two Series
result = s1 == s2
print(result)
Output
0     True
1    False
2     True
dtype: bool
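A boolean result like the one above is often used as a mask. The following sketch, with illustrative variable names, shows a comparison against a scalar, the method form `ne()` of the `!=` operator, and selection with a boolean mask.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30])
s2 = pd.Series([10, 25, 30])

# Comparing against a scalar applies the comparison to every element
above_15 = s1 > 15

# ne() is the method form of the != operator
differs = s1.ne(s2)

# A boolean Series can be used as a mask to keep only the matching elements
matching = s1[s1 == s2]
print(matching)
# 0    10
# 2    30
# dtype: int64
```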
Binary Operations on Pandas DataFrame
1. Arithmetic Operations on DataFrames
Similar to Series, DataFrame arithmetic operations apply element-wise between two DataFrames.
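Note: Pandas aligns the two DataFrames on both their row index and their column labels before applying the operation. Any position whose labels are not present in both DataFrames is filled with NaN, so the indexes and columns need to match for a complete result.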
Example: Subtracting DataFrames
Python
import pandas as pd
df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# Subtracting the DataFrames
result = df1 - df2
print(result)
Output
    A   B
0   9  36
1  18  45
2  27  54
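Binary operations also work between a DataFrame and a Series. The sketch below is one way to show the two alignment directions; the Series `row` and `col` and the result names are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})

# With the - operator, the Series index is matched against the column labels,
# so the Series is subtracted from every row
row = pd.Series({'A': 1, 'B': 2})
by_columns = df - row
print(by_columns)

# sub() with axis=0 matches the Series index against the row labels instead,
# so each row gets its own value subtracted
col = pd.Series([1, 2, 3])
by_rows = df.sub(col, axis=0)
print(by_rows)
```

The method forms (`add()`, `sub()`, `mul()`, `div()`) are useful when the default alignment of the operator is not the one you want.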
2. Comparison Operations on DataFrames
Like Series, comparison operations on DataFrames return a DataFrame of boolean values. These boolean values indicate whether the corresponding elements are equal or satisfy other comparison conditions.
Example: Checking Greater Than
Python
import pandas as pd
df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [5, 15, 35], 'B': [30, 60, 55]})
# Checking if elements of df1 are greater than df2
result = df1 > df2
print(result)
Output
       A      B
0   True   True
1   True  False
2  False   True
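As a small follow-up sketch, the comparison above can also be written with the method form `gt()`, and the boolean result can be summarized. The third DataFrame `df3` is a hypothetical one, added only to show that the comparison operators expect both sides to carry identical labels.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [5, 15, 35], 'B': [30, 60, 55]})

# gt() is the method form of the > operator
greater = df1.gt(df2)

# True counts as 1, so sum() counts the True values in each column
counts = greater.sum()
print(counts)
# A    2
# B    2
# dtype: int64

# df3 has a column 'C' instead of 'B', so its labels differ from df1
df3 = pd.DataFrame({'A': [5, 15, 35], 'C': [30, 60, 55]})
try:
    df1 > df3
    raised = False
except ValueError:
    # The > operator refuses to compare differently labeled DataFrames
    raised = True
print(raised)
```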
Logical Operations on DataFrame and Series
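Pandas also supports logical operations on DataFrames and Series through the element-wise operators & (AND), | (OR), and ~ (NOT). Python's and, or, and not keywords do not work element-wise on these objects. Logical operations are commonly used for filtering and applying conditions.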
Example: Logical AND on Series
Python
import pandas as pd
s1 = pd.Series([True, False, True])
s2 = pd.Series([False, False, True])
# Applying logical AND
result = s1 & s2
print(result)
Output
0    False
1    False
2     True
dtype: bool
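Since logical operations are mostly used for filtering, here is a minimal sketch that combines comparison results to select rows. The DataFrame and the thresholds are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})

# Each comparison needs its own parentheses because & and | bind
# more tightly than > and <
both = df[(df['A'] > 10) & (df['B'] < 60)]
print(both)
#     A   B
# 1  20  50

# Rows where at least one condition holds
either = df[(df['A'] > 20) | (df['B'] < 50)]

# ~ inverts a boolean Series (logical NOT)
not_both = df[~((df['A'] > 10) & (df['B'] < 60))]
```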
Handling Missing Data in Binary Operations
When performing binary operations on DataFrames or Series, missing data (NaN) can affect the results. Pandas handles missing data based on the operation:
- Arithmetic operations involving NaN will generally return NaN (e.g., NaN + 1 = NaN).
- Comparison and logical operations involving NaN return False or True, depending on the operation. For example, NaN == NaN evaluates to False, while NaN != NaN evaluates to True.
Example: Arithmetic with NaN
Python
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})
df2 = pd.DataFrame({'A': [1, None, 3], 'B': [None, 5, 6]})
# Adding the DataFrames
result = df1 + df2
print(result)
Output
     A     B
0  2.0   NaN
1  NaN   NaN
2  NaN  12.0
As seen above, where there is missing data (None or NaN), the result becomes NaN.
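If NaN results are not wanted, one option is the `fill_value` parameter of the method forms such as `add()`. The sketch below reuses the two DataFrames from the example and also checks how NaN behaves in comparisons. The Series `s` and the result names are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})
df2 = pd.DataFrame({'A': [1, None, 3], 'B': [None, 5, 6]})

# fill_value=0 substitutes 0 when a value is missing on one side only.
# A position that is missing on both sides would still be NaN.
filled = df1.add(df2, fill_value=0)
print(filled)
#      A     B
# 0  2.0   4.0
# 1  2.0   5.0
# 2  3.0  12.0

# NaN never compares equal, not even to itself
s = pd.Series([1.0, np.nan])
eq = s == s   # True, False
ne = s != s   # False, True
```

Whether 0 is a sensible substitute depends on the data. For multiplication, for instance, a `fill_value` of 1 may be the more natural choice.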
By leveraging these operations, you can perform complex calculations, comparisons, and transformations on your data, making Pandas a powerful tool for data analysis.