
Drop Rows from a Pandas DataFrame
The pandas library in Python is widely used for representing data in tabular form. A dataset is arranged as a 2-D structure of rows and columns, and pandas offers numerous functions that help the programmer analyze it.
This tabular data structure is known as a data frame and is created with the pandas DataFrame() constructor. In this article, we will perform a simple operation: removing (dropping) multiple rows from a pandas data frame.
First, we have to prepare a dataset and then generate a data frame with the pandas DataFrame() constructor.
Preparing the Dataset
The data from the dataset we pass in will be arranged in rows and columns.
Here, we import the pandas library as "pd" and create the dataset as a dictionary of lists.
Each key represents a student and is associated with a list of values representing the marks obtained in different subjects.
After this, we generate a data frame with the DataFrame() constructor. We don't specify the column names explicitly; the dictionary keys (the students' names) automatically become the column labels. The most important step is labelling the rows: we pass a list of subject names as the index argument, so each row is labelled with a subject.
Example
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)
print(dataframe)
Output
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
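Since the row labels matter for everything that follows, it can help to confirm them before dropping anything. The snippet below is a minimal sketch that rebuilds the same data frame and inspects its labels; the names "subjects", "row_labels" and "column_labels" are introduced here only for illustration.

```python
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
subjects = ["Physics", "Chemistry", "Maths", "English", "Biology"]
dataframe = pd.DataFrame(dataset, index=subjects)

# Row labels come from the list passed as the index argument
row_labels = list(dataframe.index)
# Column labels come from the dictionary keys
column_labels = list(dataframe.columns)

print(row_labels)
print(column_labels)
# shape is (number of rows, number of columns)
print(dataframe.shape)
```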
Dropping Rows through Index Values
To drop a row, we use the pandas drop() method. It is a simple way of removing rows from a data frame. Following is the syntax of this method:
dataframe.drop(labels=None, *, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
We don't need to pass all of these parameters to perform a drop operation; the default values are enough for most of them. There are two techniques for dropping rows: by index position and by row label.
In the first technique, we specify the index position of each row that needs to be dropped.
Example
Following is the implementation of this method. Here,
After creating the data frame, we used the drop() method to remove the 3rd and 4th rows.
We took the original data frame stored in the "dataframe" variable and looked up the labels of the rows at positions 2 and 3 (positions start at 0) with "dataframe.index[[2, 3]]".
A new data frame is created consisting of the remaining rows.
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)
print(dataframe)

# Positions 2 and 3 are the 3rd and 4th rows
drop_dataframe = dataframe.drop(dataframe.index[[2, 3]])
print("After dropping 3rd and 4th row")
print(drop_dataframe)
Output
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
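To see what the positional lookup actually hands to drop(), the following sketch prints the intermediate value. It rebuilds the same data frame; the names "rows_to_drop" and "remaining" are illustrative and not part of the original example.

```python
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)

# Indexing the Index object with a list of positions returns the labels
# stored at those positions
rows_to_drop = dataframe.index[[2, 3]]
print(list(rows_to_drop))

# drop() receives labels, not positions, and returns a new data frame
remaining = dataframe.drop(rows_to_drop)
print(list(remaining.index))

# The original data frame still has all five rows
print(len(dataframe))
```

In other words, the positions are translated into labels first, and drop() then removes the rows carrying those labels. This behaves as expected here because every row label is unique; with duplicated labels, drop() would remove every row that shares a dropped label.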
Dropping Rows Through Labels or Row Names
In this technique, we use the exact names (labels) of the rows that we want to drop from the data frame. We again use the drop() method. Here,
We used the same drop() method to remove the 3rd and 4th rows from the data frame, but this time we passed the row names that we assigned while constructing the data frame.
A new data frame is created and the original data frame remains unchanged.
Example
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)
print(dataframe)

# Drop rows by their labels
drop_dataframe = dataframe.drop(["Maths", "English"])
print("After dropping 3rd and 4th row")
print(drop_dataframe)
Output
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
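The syntax shown earlier also lists the "index" and "errors" parameters, which are relevant when dropping by label. The sketch below illustrates both under the same dataset; the label "History" is a made-up example of a row that does not exist, and the variable names are illustrative.

```python
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)

# labels with axis=0 and the index keyword both select rows to drop
by_labels = dataframe.drop(labels=["Maths", "English"], axis=0)
by_index = dataframe.drop(index=["Maths", "English"])
same_result = by_labels.equals(by_index)
print(same_result)

# With the default errors="raise", a missing label raises KeyError.
# "History" is a hypothetical label that is not in this data frame.
try:
    dataframe.drop(index=["History"])
    raised = False
except KeyError:
    raised = True
print(raised)

# errors="ignore" drops the labels that exist and skips the rest
ignored = dataframe.drop(index=["Maths", "History"], errors="ignore")
print(list(ignored.index))
```

Using the "index" keyword makes it explicit that rows, not columns, are being removed, which can make the call easier to read.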
We can also pass the "inplace" argument if we don't want to create another data frame. Its default value is "False"; when it is set to "True", drop() modifies the current data frame directly and returns None instead of a new data frame.
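A minimal sketch of the "inplace" behaviour, using the same dataset, could look like this. The variable "result" is introduced only to show what drop() returns in this mode.

```python
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)

# With inplace=True the rows are removed from dataframe itself
result = dataframe.drop(["Maths", "English"], inplace=True)

# Nothing is returned in this mode, so result is None
print(result)
# The original variable now holds only the remaining rows
print(dataframe)
```

Because nothing is returned, writing "dataframe = dataframe.drop(..., inplace=True)" would replace the data frame with None, so the two styles should not be mixed.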
Using Index Slicing
We can also drop a contiguous range of rows using index slicing. Following is an example:
Here, we sliced the index to select a range of rows to drop.
We printed the original data frame, then used the slice "dataframe.index[2:4]" to select the rows at positions 2 and 3 (the end position 4 is excluded) and passed the result to "dataframe.drop()" to drop these rows.
Finally, a new data frame is created consisting of the remaining rows.
Example
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)
print(dataframe)

# The slice 2:4 covers positions 2 and 3
drop_dataframe = dataframe.drop(dataframe.index[2:4])
print("After dropping 3rd and 4th row")
print(drop_dataframe)
Output
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
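Index slices follow the usual Python slicing rules, so other slice forms work the same way. The sketch below shows the slice used above next to a negative slice that selects the last two rows; the names "middle", "last_two" and "without_last_two" are illustrative.

```python
import pandas as pd

dataset = {
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
    "Saloni": [82, 52, 95, 98, 80],
}
dataframe = pd.DataFrame(
    dataset,
    index=["Physics", "Chemistry", "Maths", "English", "Biology"],
)

# 2:4 selects positions 2 and 3; the end position is excluded
middle = dataframe.index[2:4]
print(list(middle))

# A negative start counts from the end: the last two rows
last_two = dataframe.index[-2:]
print(list(last_two))

# Dropping the last two rows leaves the first three
without_last_two = dataframe.drop(last_two)
print(list(without_last_two.index))
```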
Conclusion
In this article, we covered the basics of the pandas data frame and looked at different ways to drop multiple rows from it. We discussed two ways of specifying the rows to remove: by index position and by row name. Finally, we discussed a simple index slicing method.