DataFrame Indexing: .loc[] vs .iloc[]


The loc and iloc functions (strictly speaking, they are indexers used with square brackets rather than called with parentheses) are commonly used to select groups of rows (and columns) of a pandas DataFrame.

To explore these two functions and their differences, we'll use a DataFrame of 7 drinks with different features and nutrition facts:
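
The code that builds this DataFrame is not preserved on the page. A minimal sketch of how it could be constructed follows; the variable name `drinks` is an assumption, while the labels and values are copied from the printed table in this section.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    # The drink names become the row labels, and the index is named "drink".
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

print(drinks)
```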

Output:
          carbonated? temperature  sugar(tsp)  calories
drink
soda             True        cold        10.5       150
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0
tea             False         hot         2.0        43
lemonade        False        cold         9.5       125
slushy          False        cold         8.0        99

Difference Between loc and iloc

The difference between the loc and iloc functions is that the loc function selects rows using row labels (e.g. tea), whereas the iloc function selects rows using their integer positions (starting from 0 and going up by one for each row).
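
To make the label/position distinction concrete, here is a small sketch. It assumes the table is stored in a variable named `drinks` (a hypothetical name) and uses `Index.get_loc` to look up the integer position that corresponds to a label.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Integer position (0-based) of the row labelled "tea".
position = drinks.index.get_loc("tea")

by_label = drinks.loc["tea"]         # selection by row label
by_position = drinks.iloc[position]  # selection by integer position

print(position)
print(by_label.equals(by_position))  # both selections refer to the same row
```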

Selecting a Single Row of Data

Selecting a Single Row by the Row Label (.loc)

For the loc function, specifying a single row label as the input selects only the row with that label.
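
The original code cell is not preserved. A minimal sketch that should produce a Series like the one in the output that follows, assuming the table is stored in a variable named `drinks` (a hypothetical name):

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# A single label returns that row as a Series whose name is the label.
row = drinks.loc["tea"]
print(row)
```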

Output:
```
carbonated?    False
temperature      hot
sugar(tsp)       2.0
calories          43
Name: tea, dtype: object
```

Selecting a Single Row by the Integer Index (.iloc)

On the other hand, for the iloc function, specifying a single integer as the input selects the row at that position. Because positions start at zero, index 2 refers to the 3rd row of the DataFrame (the smoothie row):
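
The original code cell is not preserved. A minimal sketch of the positional selection described above, assuming the table is stored in a variable named `drinks` (a hypothetical name):

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Position 2 is the third row, because positions start at 0.
row = drinks.iloc[2]
print(row)
```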

Output:
```
carbonated?    False
temperature     cold
sugar(tsp)       6.0
calories          85
Name: smoothie, dtype: object
```

Selecting a Single Row as a DataFrame

Note that the Python code above returns the row as a Series. If we want the row returned as a DataFrame, we pass a list with only one element:
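
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical) and picks a label and a position that would match the two outputs that follow, a `tea` row and then a `coffee` row.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Wrapping the label or position in a list keeps the result two-dimensional.
tea_df = drinks.loc[["tea"]]    # one-row DataFrame selected by label
coffee_df = drinks.iloc[[1]]    # one-row DataFrame selected by position

print(tea_df)
print(coffee_df)
```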

Output:
       carbonated? temperature  sugar(tsp)  calories
drink
tea          False         hot         2.0        43

Output:
        carbonated? temperature  sugar(tsp)  calories
drink
coffee        False         hot         3.0        31

Selecting Multiple Rows of Data

Selecting Multiple Rows using a List

For both the loc and iloc functions, we can use a list as our input to retrieve multiple rows. The rows of the output appear in the same order as the labels (or positions) in the input list.
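
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical); the labels and positions are chosen to match the two outputs that follow and to show that the output order follows the list order.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Rows come back in the order given in the list, not in table order.
by_label = drinks.loc[["smoothie", "tea", "soda"]]
by_position = drinks.iloc[[2, 4, 5]]

print(by_label)
print(by_position)
```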

Output:
          carbonated? temperature  sugar(tsp)  calories
drink
smoothie        False        cold         6.0        85
tea             False         hot         2.0        43
soda             True        cold        10.5       150

Output:
          carbonated? temperature  sugar(tsp)  calories
drink
smoothie        False        cold         6.0        85
tea             False         hot         2.0        43
lemonade        False        cold         9.5       125

Selecting Multiple Rows using a Slice

Alternatively, we can use a slice as our input to retrieve multiple rows.
However, it is important to note that for the loc function, both the start and the stop labels are included in the output, whereas for the iloc function the stop position is not included (as with ordinary Python slicing).
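
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical); the slice bounds are chosen to match the two outputs that follow and to show the inclusive versus exclusive stop.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Label slice: both "coffee" and "tea" are included.
by_label = drinks.loc["coffee":"tea"]

# Position slice: rows 0, 1, 2, 3 are returned; position 4 is excluded.
by_position = drinks.iloc[0:4]

print(by_label)
print(by_position)
```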

Output:
          carbonated? temperature  sugar(tsp)  calories
drink
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0
tea             False         hot         2.0        43

Output:
          carbonated? temperature  sugar(tsp)  calories
drink
soda             True        cold        10.5       150
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0

Selecting for BOTH Rows and Columns

Selecting a Single Cell Value

For the loc function, to find a specific cell, we can specify the row label (left) and the column label (right) separated by a comma.
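
The exact call is not preserved. The sketch below assumes a variable named `drinks` (hypothetical) and assumes the `tea` row was used; the `coffee` row would give the same value.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Row label first, column label second (the choice of "tea" is an assumption).
value = drinks.loc["tea", "temperature"]
print(repr(value))
```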

Output:
'hot'

For the iloc function, to find a specific cell, we can specify the row index (left) and column index (right) separated by a comma.
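
The exact call is not preserved. The sketch below assumes a variable named `drinks` (hypothetical); row position 0 and column position 2 are chosen because they select the only cell holding 10.5.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Row position 0 is soda; column position 2 is sugar(tsp).
value = drinks.iloc[0, 2]
print(value)
```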

Output:
10.5

Selecting Multiple Cell Values using a List

Similar to indexing only rows, we can retrieve multiple rows and columns. One way to do this is using two lists (one for the vertical axis labels and one for the horizontal axis labels) separated by a comma as the input:
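
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical); the lists are chosen to match the two outputs that follow.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# First list: rows. Second list: columns.
by_label = drinks.loc[["water", "lemonade"], ["sugar(tsp)", "calories"]]
by_position = drinks.iloc[[1, 0], [2, 3]]

print(by_label)
print(by_position)
```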

Output:
          sugar(tsp)  calories
drink
water            0.0         0
lemonade         9.5       125

Output:
        sugar(tsp)  calories
drink
coffee         3.0        31
soda          10.5       150

Selecting Multiple Cell Values using a Slice

Additionally, we can use slice objects:
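
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical); the slice bounds are chosen to match the two outputs that follow.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# Label slices include both endpoints on each axis.
by_label = drinks.loc["soda":"water", "carbonated?":"sugar(tsp)"]

# Position slices exclude the stop: rows 3 to 6 and columns 2 to 3.
by_position = drinks.iloc[3:7, 2:4]

print(by_label)
print(by_position)
```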

Output:
          carbonated? temperature  sugar(tsp)
drink
soda             True        cold        10.5
coffee          False         hot         3.0
smoothie        False        cold         6.0
water           False        cold         0.0

Output:
          sugar(tsp)  calories
drink
water            0.0         0
tea              2.0        43
lemonade         9.5       125
slushy           8.0        99

Selecting Multiple Cell Values using a Combination of Everything

We can also combine the two methods, using a list for one axis and a slice for the other:
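
The exact calls are not preserved. The sketch below assumes a variable named `drinks` (hypothetical); it mixes a list on one axis with a slice on the other, with values chosen to match the two outputs that follow.

```python
import pandas as pd

# Hypothetical reconstruction of the drinks table; the name `drinks` is assumed.
drinks = pd.DataFrame(
    {
        "carbonated?": [True, False, False, False, False, False, False],
        "temperature": ["cold", "hot", "cold", "cold", "hot", "cold", "cold"],
        "sugar(tsp)": [10.5, 3.0, 6.0, 0.0, 2.0, 9.5, 8.0],
        "calories": [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ["soda", "coffee", "smoothie", "water", "tea", "lemonade", "slushy"],
        name="drink",
    ),
)

# loc: a list of row labels combined with a (stop-inclusive) column slice.
by_label = drinks.loc[["coffee", "smoothie", "soda"], "carbonated?":"sugar(tsp)"]

# iloc: a (stop-exclusive) row slice combined with a list of column positions.
by_position = drinks.iloc[2:5, [0, 1, 3]]

print(by_label)
print(by_position)
```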

Output:
          carbonated? temperature  sugar(tsp)
drink
coffee          False         hot         3.0
smoothie        False        cold         6.0
soda             True        cold        10.5

Output:
          carbonated? temperature  calories
drink
smoothie        False        cold        85
water           False        cold         0
tea             False         hot        43

Pandas Documentation

See the official pandas documentation for DataFrame.loc for full details on the loc function.
See the official pandas documentation for DataFrame.iloc for full details on the iloc function.