Removing Rows from a DataFrame


The drop function is used to remove rows or columns from a pandas DataFrame.

To explore how to remove rows using this function, we'll be looking at a DataFrame of foods:
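
The code that builds this DataFrame is not shown, so here is a minimal sketch of how it might be created. The variable name `foods` is an assumption; the values are copied from the output below.

```python
import pandas as pd

# Hypothetical variable name; the values come from the output shown below.
foods = pd.DataFrame({
    "food": ["cheesecake", "mochi", "donut", "churro", "cupcake", "flan", "egg tart"],
    "weight": [3.43, 2.00, 1.30, 2.60, 1.20, 7.10, 3.00],
    "length": [4.0, 1.2, 4.8, 10.0, 2.5, 7.0, 2.5],
    "price": [6.38, 0.75, 1.53, 3.00, 2.63, 8.10, 3.00],
})

# No index is given, so pandas assigns the default integer labels 0 through 6.
print(foods)
```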

Output:

             food  weight  length  price
    0  cheesecake    3.43     4.0   6.38
    1       mochi    2.00     1.2   0.75
    2       donut    1.30     4.8   1.53
    3      churro    2.60    10.0   3.00
    4     cupcake    1.20     2.5   2.63
    5        flan    7.10     7.0   8.10
    6    egg tart    3.00     2.5   3.00

Removing a Single Row from a DataFrame

We can drop a row by passing its row label to drop. It's important to note that this function treats integers as labels, not positions: passing 2 removes the row whose index label is 2, which is not necessarily the third row.
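
As a sketch of the call behind the output below, assuming the DataFrame is stored in a variable named `foods` (a hypothetical name), dropping the row labeled 2 might look like this:

```python
import pandas as pd

# Hypothetical variable name; same data as the foods DataFrame above.
foods = pd.DataFrame({
    "food": ["cheesecake", "mochi", "donut", "churro", "cupcake", "flan", "egg tart"],
    "weight": [3.43, 2.00, 1.30, 2.60, 1.20, 7.10, 3.00],
    "length": [4.0, 1.2, 4.8, 10.0, 2.5, 7.0, 2.5],
    "price": [6.38, 0.75, 1.53, 3.00, 2.63, 8.10, 3.00],
})

# 2 is matched against the index labels, not counted as a position.
without_donut = foods.drop(2)
print(without_donut)
```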

Output:

             food  weight  length  price
    0  cheesecake    3.43     4.0   6.38
    1       mochi    2.00     1.2   0.75
    3      churro    2.60    10.0   3.00
    4     cupcake    1.20     2.5   2.63
    5        flan    7.10     7.0   8.10
    6    egg tart    3.00     2.5   3.00
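
Dropping a different row from the same DataFrame, here the one labeled 4, might be written as follows. This is again a sketch that assumes the hypothetical variable name `foods`.

```python
import pandas as pd

# Hypothetical variable name; same data as the foods DataFrame above.
foods = pd.DataFrame({
    "food": ["cheesecake", "mochi", "donut", "churro", "cupcake", "flan", "egg tart"],
    "weight": [3.43, 2.00, 1.30, 2.60, 1.20, 7.10, 3.00],
    "length": [4.0, 1.2, 4.8, 10.0, 2.5, 7.0, 2.5],
    "price": [6.38, 0.75, 1.53, 3.00, 2.63, 8.10, 3.00],
})

foods.drop(2)                    # the returned copy is discarded; foods is unchanged
without_cupcake = foods.drop(4)  # so label 2 is still present in this result
print(without_cupcake)
```
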
Output:

             food  weight  length  price
    0  cheesecake    3.43     4.0   6.38
    1       mochi    2.00     1.2   0.75
    2       donut    1.30     4.8   1.53
    3      churro    2.60    10.0   3.00
    5        flan    7.10     7.0   8.10
    6    egg tart    3.00     2.5   3.00

Notice that in the output above, the row with label 2 that we dropped previously appears again. This is because, by default, the drop function returns a copy of the DataFrame without the dropped row. It doesn't remove the row from the original DataFrame.

To permanently drop rows from the original DataFrame, we need to set inplace=True:
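
The original calls are not shown. One sequence consistent with the output below, assuming the hypothetical variable name `foods`, drops labels 2 and 4 in place:

```python
import pandas as pd

# Hypothetical variable name; same data as the foods DataFrame above.
foods = pd.DataFrame({
    "food": ["cheesecake", "mochi", "donut", "churro", "cupcake", "flan", "egg tart"],
    "weight": [3.43, 2.00, 1.30, 2.60, 1.20, 7.10, 3.00],
    "length": [4.0, 1.2, 4.8, 10.0, 2.5, 7.0, 2.5],
    "price": [6.38, 0.75, 1.53, 3.00, 2.63, 8.10, 3.00],
})

# With inplace=True, drop modifies foods itself and returns None.
foods.drop(2, inplace=True)
foods.drop(4, inplace=True)
print(foods)
```

Reassigning the result, as in `foods = foods.drop(2)`, leaves `foods` in the same state and is a common alternative to `inplace=True`.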

Output:

             food  weight  length  price
    0  cheesecake    3.43     4.0   6.38
    1       mochi    2.00     1.2   0.75
    3      churro    2.60    10.0   3.00
    5        flan    7.10     7.0   8.10
    6    egg tart    3.00     2.5   3.00

Removing Multiple Rows from a DataFrame

We can drop multiple rows at a time by passing a list of labels. In the output below, labels 0 and 1 have been dropped from the DataFrame we modified in place above:
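
A sketch of a call that would produce the output below, assuming the hypothetical variable name `foods` and that rows 2 and 4 were already removed in place as in the previous section:

```python
import pandas as pd

# Hypothetical variable name; same data as the foods DataFrame above.
foods = pd.DataFrame({
    "food": ["cheesecake", "mochi", "donut", "churro", "cupcake", "flan", "egg tart"],
    "weight": [3.43, 2.00, 1.30, 2.60, 1.20, 7.10, 3.00],
    "length": [4.0, 1.2, 4.8, 10.0, 2.5, 7.0, 2.5],
    "price": [6.38, 0.75, 1.53, 3.00, 2.63, 8.10, 3.00],
})

# Recreate the state left by the previous section (an assumption about the original code).
foods.drop(2, inplace=True)
foods.drop(4, inplace=True)

# A list of labels removes several rows in one call.
remaining = foods.drop([0, 1])
print(remaining)
```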

Output:

           food  weight  length  price
    3    churro     2.6    10.0    3.0
    5      flan     7.1     7.0    8.1
    6  egg tart     3.0     2.5    3.0

Removing Rows from a DataFrame using Custom Labels

Let's say we're looking for a new social media platform to use. To compare our options, we create a DataFrame of the most popular platforms, using each platform's name as the row label:
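
One way to build such a DataFrame is sketched here. The variable name `socials` is an assumption, and `set_index` is just one of several ways to get the `name` index seen in the output below.

```python
import pandas as pd

# Hypothetical variable name; the values come from the output shown below.
socials = pd.DataFrame({
    "name": ["Snapchat", "Tiktok", "Instagram", "Twitter", "Facebook"],
    "users(mil.)": [493.7, 750.0, 1280.0, 229.0, 2100.0],
    "created": [2011, 2016, 2010, 2006, 2004],
}).set_index("name")  # the platform names become the row labels

print(socials)
```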

Output:

               users(mil.)  created
    name
    Snapchat         493.7     2011
    Tiktok           750.0     2016
    Instagram       1280.0     2010
    Twitter          229.0     2006
    Facebook        2100.0     2004

Now, let's say we hear on the news that Facebook is facing many data privacy lawsuits. We don't want our personal information leaked, so we make the decision to drop Facebook from our DataFrame:
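
With string labels the call has the same shape as before. This sketch assumes the hypothetical variable name `socials` and uses `inplace=True`, since Facebook is also absent from the later outputs in this section:

```python
import pandas as pd

# Hypothetical variable name; same data as the social media DataFrame above.
socials = pd.DataFrame({
    "name": ["Snapchat", "Tiktok", "Instagram", "Twitter", "Facebook"],
    "users(mil.)": [493.7, 750.0, 1280.0, 229.0, 2100.0],
    "created": [2011, 2016, 2010, 2006, 2004],
}).set_index("name")

# The label is now a string taken from the index, not an integer.
socials.drop("Facebook", inplace=True)
print(socials)
```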

Output:

               users(mil.)  created
    name
    Snapchat         493.7     2011
    Tiktok           750.0     2016
    Instagram       1280.0     2010
    Twitter          229.0     2006

As mentioned in the first section, the drop function matches labels rather than positions. By default it looks those labels up in the DataFrame's row index (axis 0), so passing values that are not index labels, such as names stored in an ordinary column, will result in an error:
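
The code that triggered this error is not shown. One situation that raises a `KeyError` of this kind, offered only as an assumption, is a DataFrame where the names sit in a regular column, so the index holds the integers 0 to 3. The variable name `by_position` is hypothetical.

```python
import pandas as pd

# Assumption: the names are an ordinary column, so the index labels are 0, 1, 2, 3.
by_position = pd.DataFrame({
    "name": ["Snapchat", "Tiktok", "Instagram", "Twitter"],
    "users(mil.)": [493.7, 750.0, 1280.0, 229.0],
    "created": [2011, 2016, 2010, 2006],
})

raised = False
try:
    # Neither string is an index label here, so the lookup fails.
    by_position.drop(["Tiktok", "Instagram"])
except KeyError as err:
    raised = True
    print("KeyError:", err.args[0])
```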

Output:
```
KeyError: ['Tiktok', 'Instagram'] not found in axis
```

Now, let's say that after some consideration we decide to drop Tiktok and Instagram because they're too addictive and time-consuming. We can drop both rows by passing a list of their labels:
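
A sketch of that call, assuming the hypothetical variable name `socials` and that Facebook was already dropped in place earlier:

```python
import pandas as pd

# Hypothetical variable name; same data as the social media DataFrame above.
socials = pd.DataFrame({
    "name": ["Snapchat", "Tiktok", "Instagram", "Twitter", "Facebook"],
    "users(mil.)": [493.7, 750.0, 1280.0, 229.0, 2100.0],
    "created": [2011, 2016, 2010, 2006, 2004],
}).set_index("name")

# Recreate the earlier state (an assumption about the original code).
socials.drop("Facebook", inplace=True)

# Both names are index labels, so the list is accepted.
kept = socials.drop(["Tiktok", "Instagram"])
print(kept)
```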

Output:

              users(mil.)  created
    name
    Snapchat        493.7     2011
    Twitter         229.0     2006

Pandas Documentation

For the full list of parameters, see the pandas documentation for the drop function (DataFrame.drop).