Select Rows From A DataFrame


There are numerous ways to select rows from a DataFrame. One method is to select rows based on the content of its columns. To do this, we can use conditions.

For our example, let's explore a DataFrame of different pets:
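
How `df` is created is not shown here, and it may well be loaded from a file. As a minimal sketch, the same table can be built by hand from the values in the output, assuming pandas is installed and using `df` as the variable name, as the conditions later in this guide do:

```python
import pandas as pd

# Hand-built copy of the pets table; the original data source is an unknown.
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

print(df)
```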

Output:
                name  weight(lb.)  lifespan(yr.)      group
0   golden retriever        70.00             11     mammal
1             ferret         4.40              7     mammal
2            axolotl         0.63             12  amphibian
3     bearded dragon         1.00             13    reptile
4               frog         0.80             11  amphibian
5           basilisk         0.43             10    reptile
6         salamander         0.44             16  amphibian
7         chinchilla         1.80             18     mammal
8           goldfish         8.00             12       fish
9                koi        12.00             30       fish
10             gecko         0.15             15    reptile

Condition Operators

When using conditions, there are six primary comparison operators:

  • < (strictly less than)
  • > (strictly greater than)
  • <= (less than or equal to)
  • >= (greater than or equal to)
  • == (exactly equal to)
  • != (not equal to)

Using a condition by itself returns a Series of True or False values, one per row, indicating whether that row satisfies the condition:
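
As an illustration, a condition on the weight(lb.) column could look like the sketch below. The threshold of 5 is an assumption: the original value is not shown, and any cutoff from 4.4 up to (but not including) 8 gives the same True/False pattern. The table is rebuilt first so the snippet runs on its own.

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# Compare every value in the column against 5 (assumed threshold).
# The result is a boolean Series with one entry per row, not a filtered DataFrame.
mask = df['weight(lb.)'] > 5
print(mask)
```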

Output:
```
0      True
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8      True
9      True
10    False
Name: weight(lb.), dtype: bool
```

Row Selection With a Single Condition

To select only rows that match one specific criterion, we can use a single condition.

For example, say we're only interested in looking at amphibian pets:
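
A minimal sketch of this selection, assuming `df` holds the pets table (rebuilt here so the snippet is self-contained). The variable name `amphibians` is only illustrative. Placing the boolean Series inside `df[...]` keeps the rows where it is True:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# Keep only the rows whose group column is exactly 'amphibian'.
amphibians = df[df['group'] == 'amphibian']
print(amphibians)
```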

Output:
         name  weight(lb.)  lifespan(yr.)      group
2     axolotl         0.63             12  amphibian
4        frog         0.80             11  amphibian
6  salamander         0.44             16  amphibian

Now, say we are only interested in small pets that weigh less than a pound:
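
One way to express this, sketched under the assumption that `df` holds the pets table (rebuilt here) and with `small_pets` as an illustrative name. Because `<` is strict, the bearded dragon at exactly 1.00 lb is excluded:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# Keep only the rows whose weight is strictly less than 1 lb.
small_pets = df[df['weight(lb.)'] < 1]
print(small_pets)
```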

Output:
          name  weight(lb.)  lifespan(yr.)      group
2      axolotl         0.63             12  amphibian
4         frog         0.80             11  amphibian
5     basilisk         0.43             10    reptile
6   salamander         0.44             16  amphibian
10       gecko         0.15             15    reptile

Additional explanations, videos, and example problems covering conditionals are part of the DISCOVERY course content.

Row Selection with Multiple Conditions

It is possible to select rows that meet several criteria at once by joining conditions with the & (AND) or | (OR) logical operators. (Note: each condition must be wrapped in parentheses, because in Python & and | bind more tightly than comparison operators such as > and ==.)

For example, say we want a pet that lives longer than 10 years but less than 15 years.
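
A sketch of this two-condition selection, assuming `df` holds the pets table (rebuilt here); `mid_lifespan` is an illustrative name. Each comparison sits in its own parentheses and the two are joined with `&`, so a row must satisfy both:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# Both comparisons are strict, so lifespans of exactly 10 or 15 years are excluded.
mid_lifespan = df[(df['lifespan(yr.)'] > 10) & (df['lifespan(yr.)'] < 15)]
print(mid_lifespan)
```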

Output:
               name  weight(lb.)  lifespan(yr.)      group
0  golden retriever        70.00             11     mammal
2           axolotl         0.63             12  amphibian
3    bearded dragon         1.00             13    reptile
4              frog         0.80             11  amphibian
8          goldfish         8.00             12       fish

Row Selection with Mixed Logical Operators

Now, say we want to look at pets that are either mammals or amphibians and that live more than 12 years.

We have 3 conditions:

  • df['group'] == 'amphibian'
  • df['group'] == 'mammal'
  • df['lifespan(yr.)'] > 12

But, notice the difference in output when these conditions are arranged differently:

This code first checks whether the row's group column equals 'amphibian'. If not, it checks whether the row's group column equals 'mammal' and its lifespan(yr.) column holds a value greater than 12.
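
One arrangement of the three conditions that matches this description is sketched below, assuming `df` holds the pets table (rebuilt here); `result` is an illustrative name. The inner parentheses group the mammal test with the lifespan test, so the lifespan limit does not apply to amphibians:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# amphibian OR (mammal AND lifespan > 12)
result = df[(df['group'] == 'amphibian') |
            ((df['group'] == 'mammal') & (df['lifespan(yr.)'] > 12))]
print(result)
```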
Output:
         name  weight(lb.)  lifespan(yr.)      group
2     axolotl         0.63             12  amphibian
4        frog         0.80             11  amphibian
6  salamander         0.44             16  amphibian
7  chinchilla         1.80             18     mammal

This code first checks whether the row's lifespan(yr.) column holds a value greater than 12 and its group column equals 'amphibian'. If not, it checks whether the row's group column equals 'mammal'.
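
A sketch of an arrangement matching this description, again assuming `df` holds the pets table (rebuilt here) and using `result` as an illustrative name. Here the lifespan test is grouped with the amphibian test, so every mammal passes regardless of lifespan:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# (lifespan > 12 AND amphibian) OR mammal
result = df[((df['lifespan(yr.)'] > 12) & (df['group'] == 'amphibian')) |
            (df['group'] == 'mammal')]
print(result)
```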
Output:
               name  weight(lb.)  lifespan(yr.)      group
0  golden retriever        70.00             11     mammal
1            ferret         4.40              7     mammal
6        salamander         0.44             16  amphibian
7        chinchilla         1.80             18     mammal

This code checks that each row has a value greater than 12 in the lifespan(yr.) column and that its group column equals either 'amphibian' or 'mammal'.
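
A sketch of an arrangement matching this description, assuming `df` holds the pets table (rebuilt here); `result` is an illustrative name. Grouping the two group tests together means the lifespan limit applies to both mammals and amphibians, which is the selection the section set out to make:

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# lifespan > 12 AND (amphibian OR mammal)
result = df[(df['lifespan(yr.)'] > 12) &
            ((df['group'] == 'amphibian') | (df['group'] == 'mammal'))]
print(result)
```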
Output:
         name  weight(lb.)  lifespan(yr.)      group
6  salamander         0.44             16  amphibian
7  chinchilla         1.80             18     mammal

Notice that the order of the conditions, the placement of parentheses, and the choice of logical operators all change the output.

Row Selection with Five Conditionals

Finally, when selecting rows from a DataFrame, we can add as many conditions as we want:
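
The five conditions used for this example are not shown, so the combination below is hypothetical: it is simply one set of five conditions that selects the golden retriever and the bearded dragon from the pets table. It assumes `df` holds that table (rebuilt here), and `result` is an illustrative name.

```python
import pandas as pd

# Hand-built copy of the pets table (assumed; the original data source is not shown).
df = pd.DataFrame({
    'name': ['golden retriever', 'ferret', 'axolotl', 'bearded dragon', 'frog', 'basilisk',
             'salamander', 'chinchilla', 'goldfish', 'koi', 'gecko'],
    'weight(lb.)': [70.0, 4.4, 0.63, 1.0, 0.8, 0.43, 0.44, 1.8, 8.0, 12.0, 0.15],
    'lifespan(yr.)': [11, 7, 12, 13, 11, 10, 16, 18, 12, 30, 15],
    'group': ['mammal', 'mammal', 'amphibian', 'reptile', 'amphibian', 'reptile',
              'amphibian', 'mammal', 'fish', 'fish', 'reptile'],
})

# Hypothetical five conditions (not the original ones):
# (mammal OR reptile) AND weight >= 1 AND lifespan > 10 AND lifespan < 15
result = df[((df['group'] == 'mammal') | (df['group'] == 'reptile')) &
            (df['weight(lb.)'] >= 1) &
            (df['lifespan(yr.)'] > 10) &
            (df['lifespan(yr.)'] < 15)]
print(result)
```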

Output:
               name  weight(lb.)  lifespan(yr.)    group
0  golden retriever         70.0             11   mammal
3    bearded dragon          1.0             13  reptile

An explanation of how the AND and OR operators work, including videos, example problems, and more details, is part of the DISCOVERY course content.