Notice that in the output above, the row with label 2, which we dropped previously, appears again. This is because, by default, the drop function returns a copy of the DataFrame without the dropped row. It doesn't remove the row from the original DataFrame.
To permanently drop rows from a DataFrame, we need to set inplace=True:

    import pandas as pd

    # Creates a DataFrame with 'food', 'weight', 'length', and 'price' columns
    # Note: all columns use US measurements: ounces, inches, USD
    df = pd.DataFrame([
        {'food': 'cheesecake', 'weight': 3.43, 'length': 4, 'price': 6.38},
        {'food': 'mochi', 'weight': 2, 'length': 1.2, 'price': 0.75},
        {'food': 'donut', 'weight': 1.3, 'length': 4.8, 'price': 1.53},
        {'food': 'churro', 'weight': 2.6, 'length': 10, 'price': 3.00},
        {'food': 'cupcake', 'weight': 1.2, 'length': 2.5, 'price': 2.63},
        {'food': 'flan', 'weight': 7.1, 'length': 7, 'price': 8.10},
        {'food': 'egg tart', 'weight': 3, 'length': 2.5, 'price': 3.00},
    ])

    # drops the 'cupcake' row
    df.drop(4, inplace=True)

    # drops the 'donut' row
    df.drop(2, inplace=True)

    # prints df
    df

Output:

|   | food | weight | length | price |
|---|---|---|---|---|
| 0 | cheesecake | 3.43 | 4.0 | 6.38 |
| 1 | mochi | 2.00 | 1.2 | 0.75 |
| 3 | churro | 2.60 | 10.0 | 3.00 |
| 5 | flan | 7.10 | 7.0 | 8.10 |
| 6 | egg tart | 3.00 | 2.5 | 3.00 |
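
The difference between getting a copy and modifying the original can be checked directly. The sketch below uses a small hypothetical frame named `demo` (not part of the original lesson) and shows that reassigning the result of drop is an alternative to inplace=True:

```python
import pandas as pd

# Hypothetical small frame, used only for this illustration
demo = pd.DataFrame({'food': ['cheesecake', 'mochi', 'donut'],
                     'price': [6.38, 0.75, 1.53]})

# drop returns a new DataFrame; demo itself still has all three rows
without_donut = demo.drop(2)
print(len(demo), len(without_donut))

# Reassigning the result keeps the change, much like inplace=True would
demo = demo.drop(2)
print(demo.index.tolist())
```

Either style works; reassignment simply makes the change explicit at the call site.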
Removing Multiple Rows from a DataFrame
We can drop multiple rows at a time by inputting a list of labels:
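
As a minimal sketch, assuming a shortened version of the dessert DataFrame from earlier, passing a list of labels might look like this. The two labels dropped here are an arbitrary choice for illustration:

```python
import pandas as pd

# Rebuild a shortened version of the dessert DataFrame
df = pd.DataFrame([
    {'food': 'cheesecake', 'weight': 3.43, 'length': 4, 'price': 6.38},
    {'food': 'mochi', 'weight': 2, 'length': 1.2, 'price': 0.75},
    {'food': 'donut', 'weight': 1.3, 'length': 4.8, 'price': 1.53},
    {'food': 'churro', 'weight': 2.6, 'length': 10, 'price': 3.00},
])

# Drop the 'mochi' and 'churro' rows (labels 1 and 3) in a single call
df.drop([1, 3], inplace=True)

# The remaining rows keep their original labels
print(df)
```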
Removing Rows from a DataFrame using Custom Labels
Let's say we're looking for a new social media platform to use. To help us decide, we create a DataFrame of the most popular social media platforms:

    import pandas as pd

    # Creates a DataFrame with 'name', 'users(mil.)', and 'created' columns
    df1 = pd.DataFrame([
        {'name': 'Snapchat', 'users(mil.)': 493.7, 'created': 2011},
        {'name': 'Tiktok', 'users(mil.)': 750, 'created': 2016},
        {'name': 'Instagram', 'users(mil.)': 1280, 'created': 2010},
        {'name': 'Twitter', 'users(mil.)': 229.0, 'created': 2006},
        {'name': 'Facebook', 'users(mil.)': 2100.0, 'created': 2004}])

    # Set the row labels to be the `name` column:
    df1.set_index('name', inplace=True)
    df1

Output:

| name | users(mil.) | created |
|---|---|---|
| Snapchat | 493.7 | 2011 |
| Tiktok | 750.0 | 2016 |
| Instagram | 1280.0 | 2010 |
| Twitter | 229.0 | 2006 |
| Facebook | 2100.0 | 2004 |
Now, let's say we hear on the news that Facebook is facing many data privacy lawsuits. We don't want our personal information leaked, so we make the decision to drop Facebook from our DataFrame:
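
A minimal sketch of that step, rebuilding the same social media DataFrame so the example runs on its own, might be:

```python
import pandas as pd

df1 = pd.DataFrame([
    {'name': 'Snapchat', 'users(mil.)': 493.7, 'created': 2011},
    {'name': 'Tiktok', 'users(mil.)': 750, 'created': 2016},
    {'name': 'Instagram', 'users(mil.)': 1280, 'created': 2010},
    {'name': 'Twitter', 'users(mil.)': 229.0, 'created': 2006},
    {'name': 'Facebook', 'users(mil.)': 2100.0, 'created': 2004}])
df1.set_index('name', inplace=True)

# The row labels are now names, so we drop by the label 'Facebook'
df1.drop('Facebook', inplace=True)
print(df1)
```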
As mentioned in the first section, drop treats integers as labels, not as positions. Since drop searches the DataFrame's index (the row labels) by default, passing a value that is not one of those labels will result in an error:
```
KeyError: ['Tiktok', 'Instagram'] not found in axis
```
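
To see this kind of error in isolation, the sketch below passes an integer position to a DataFrame whose labels are names. It uses a two-row subset of the data for brevity, and the exact wording of the message may vary between pandas versions:

```python
import pandas as pd

df1 = pd.DataFrame([
    {'name': 'Snapchat', 'users(mil.)': 493.7, 'created': 2011},
    {'name': 'Tiktok', 'users(mil.)': 750, 'created': 2016},
]).set_index('name')

# 0 is a row position, not one of the labels ('Snapchat', 'Tiktok'),
# so drop cannot find it in the index
try:
    df1.drop(0)
    raised = False
except KeyError as err:
    raised = True
    print('KeyError:', err)
```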
Now, let's say that after some consideration we decide to drop Tiktok and Instagram because they're too addictive and time-consuming. We can drop both rows by inputting a list of the labels:
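
One way this could look is sketched below. The DataFrame is rebuilt from scratch and Facebook is dropped first, following the earlier decision, so the example stands on its own:

```python
import pandas as pd

df1 = pd.DataFrame([
    {'name': 'Snapchat', 'users(mil.)': 493.7, 'created': 2011},
    {'name': 'Tiktok', 'users(mil.)': 750, 'created': 2016},
    {'name': 'Instagram', 'users(mil.)': 1280, 'created': 2010},
    {'name': 'Twitter', 'users(mil.)': 229.0, 'created': 2006},
    {'name': 'Facebook', 'users(mil.)': 2100.0, 'created': 2004}])
df1.set_index('name', inplace=True)

# Facebook was already removed in the previous step
df1.drop('Facebook', inplace=True)

# Drop both rows at once by passing a list of labels
df1.drop(['Tiktok', 'Instagram'], inplace=True)
print(df1)
```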