The loc and iloc indexers (called functions below for simplicity, although they take square brackets rather than parentheses) are commonly used to select specific rows (and columns) of a pandas DataFrame.
To explore these two functions and their differences, we'll use a DataFrame of 7 drinks with different features and nutrition facts:
import pandas as pd

# Create a DataFrame with "drink", "carbonated?", "temperature", "sugar(tsp)", and "calories" columns:
df = pd.DataFrame([
    {'drink': 'soda', 'carbonated?': True, 'temperature': 'cold', 'sugar(tsp)': 10.5, 'calories': 150},
    {'drink': 'coffee', 'carbonated?': False, 'temperature': 'hot', 'sugar(tsp)': 3, 'calories': 31},
    {'drink': 'smoothie', 'carbonated?': False, 'temperature': 'cold', 'sugar(tsp)': 6, 'calories': 85},
    {'drink': 'water', 'carbonated?': False, 'temperature': 'cold', 'sugar(tsp)': 0, 'calories': 0},
    {'drink': 'tea', 'carbonated?': False, 'temperature': 'hot', 'sugar(tsp)': 2, 'calories': 43},
    {'drink': 'lemonade', 'carbonated?': False, 'temperature': 'cold', 'sugar(tsp)': 9.5, 'calories': 125},
    {'drink': 'slushy', 'carbonated?': False, 'temperature': 'cold', 'sugar(tsp)': 8, 'calories': 99},
])

# Use the "drink" column as the row labels (the index):
df.set_index('drink', inplace=True)
df
          carbonated? temperature  sugar(tsp)  calories
drink
soda             True        cold        10.5       150
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0
tea             False         hot         2.0        43
lemonade        False        cold         9.5       125
slushy          False        cold         8.0        99
Creating a DataFrame with a custom index column
Difference Between loc and iloc
The difference between the loc and iloc functions is that the loc function selects rows using row labels (e.g. tea) whereas the iloc function selects rows using their integer positions (starting from 0 and going up by one for each row).
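As a rough illustration of how a label and a position relate, the sketch below rebuilds the drinks DataFrame and looks up the same row both ways. The use of `Index.get_loc` to translate a label into its position is an addition for illustration and is not part of the original walkthrough.

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Translate the row label 'tea' into its integer position.
pos = df.index.get_loc('tea')
print(pos)  # 4

# The same row, selected once by label and once by position.
row_by_label = df.loc['tea']
row_by_position = df.iloc[pos]
same = row_by_label.equals(row_by_position)
print(same)  # True
```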
Selecting a Single Row of Data
Selecting a Single Row by the Row Label (.loc)
For the loc function, passing a single row label as the input selects only the row with that label.
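A minimal sketch of a single-label lookup, assuming the drinks DataFrame defined at the start of the article (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Select the row whose label is 'tea'.
tea = df.loc['tea']
print(type(tea).__name__)  # Series
print(tea['calories'])     # 43
```

The result is a Series whose index holds the column names and whose name is the row label.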
Selecting a Single Row by the Integer Index (.iloc)
On the other hand, for the iloc function, passing a single integer as the input selects the row at that position. For example, since positions start at zero, index 2 refers to the 3rd row of the DataFrame (the smoothie row):
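A minimal sketch of a single-position lookup, again assuming the drinks DataFrame from the start of the article (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Select the row at integer position 2 (the third row).
third_row = df.iloc[2]
print(third_row.name)         # smoothie
print(third_row['calories'])  # 85
```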
Selecting a Single Row as a DataFrame
Note that selecting a single row this way returns it as a Series. If we want the row returned as a DataFrame, we pass a list with only one element:
# loc function
df.loc[['tea']]
       carbonated? temperature  sugar(tsp)  calories
drink
tea          False         hot         2.0        43
Indexing Using a Single Label
# iloc function
df.iloc[[1]]
        carbonated? temperature  sugar(tsp)  calories
drink
coffee        False         hot         3.0        31
Indexing Using a Single Integer
Selecting Multiple Rows of Data
Selecting Multiple Rows using a List
For both the loc and iloc functions, we can use a list as our input to retrieve multiple rows. The rows in the output appear in the same order as the entries in the input list.
# loc function
# note that multiple labels along the vertical axis are specified
df.loc[['smoothie','tea','soda']]
          carbonated? temperature  sugar(tsp)  calories
drink
smoothie        False        cold         6.0        85
tea             False         hot         2.0        43
soda             True        cold        10.5       150
Indexing Multiple Labels Using a List
# iloc function
# note that valid positions run from 0 to len(df) - 1, inclusive
df.iloc[[2,4,5]]
          carbonated? temperature  sugar(tsp)  calories
drink
smoothie        False        cold         6.0        85
tea             False         hot         2.0        43
lemonade        False        cold         9.5       125
Indexing Multiple Integer Positions Using a List
Selecting Multiple Rows using a Slice
Alternatively, we can use a slice as our input to retrieve multiple rows. Note, however, that the loc function includes both the start and the stop labels in the output, whereas the iloc function includes the start position but excludes the stop position.
# loc function
# note that 'coffee' and 'tea' are included in the output
df.loc['coffee':'tea']
          carbonated? temperature  sugar(tsp)  calories
drink
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0
tea             False         hot         2.0        43
Indexing Multiple Labels Using a Slice Object
# iloc function
# note that the output includes the row at index 3 but not the row at index 4
df.iloc[0:4]
          carbonated? temperature  sugar(tsp)  calories
drink
soda             True        cold        10.5       150
coffee          False         hot         3.0        31
smoothie        False        cold         6.0        85
water           False        cold         0.0         0
Indexing Multiple Integer Positions Using a Slice Object
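To make the inclusive versus exclusive stop concrete, the sketch below selects the same four rows with both functions. It assumes the drinks DataFrame from the start of the article (rebuilt here so the snippet runs on its own), and the `iloc` bounds were chosen for this illustration rather than taken from the examples above.

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Label slice: both 'coffee' (position 1) and 'tea' (position 4) are included.
by_label = df.loc['coffee':'tea']

# Position slice: the stop is excluded, so it must be 5 to reach position 4.
by_position = df.iloc[1:5]

print(list(by_label.index))          # ['coffee', 'smoothie', 'water', 'tea']
print(by_label.equals(by_position))  # True
```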
Selecting for BOTH Rows and Columns
Selecting a Single Cell Value
For the loc function, to find a specific cell, we can specify the row label (left) and the column label (right) separated by a comma.
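A minimal sketch of a single-cell lookup by label, assuming the drinks DataFrame from the start of the article (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Row label on the left, column label on the right.
tea_calories = df.loc['tea', 'calories']
print(tea_calories)  # 43
```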
For the iloc function, to find a specific cell, we can specify the row index (left) and column index (right) separated by a comma.
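A minimal sketch of the same kind of lookup by position, assuming the drinks DataFrame from the start of the article (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Rebuild the drinks DataFrame from the start of the article.
df = pd.DataFrame(
    {
        'carbonated?': [True, False, False, False, False, False, False],
        'temperature': ['cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'cold'],
        'sugar(tsp)': [10.5, 3, 6, 0, 2, 9.5, 8],
        'calories': [150, 31, 85, 0, 43, 125, 99],
    },
    index=pd.Index(
        ['soda', 'coffee', 'smoothie', 'water', 'tea', 'lemonade', 'slushy'],
        name='drink',
    ),
)

# Row position 4 is 'tea'; column position 3 is 'calories'.
cell = df.iloc[4, 3]
print(cell)  # 43
```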
Selecting Multiple Cell Values using a List
Similar to indexing only rows, we can retrieve multiple rows and columns. One way to do this is using two lists (one for the vertical axis labels and one for the horizontal axis labels) separated by a comma as the input:
# loc function
df.loc[['water','lemonade'],['sugar(tsp)','calories']]
          sugar(tsp)  calories
drink
water            0.0         0
lemonade         9.5       125
Indexing Using Lists (Row and Column)
# iloc function
df.iloc[[1,0],[2,3]]
        sugar(tsp)  calories
drink
coffee         3.0        31
soda          10.5       150
Indexing Using Lists (Row and Column)
Selecting Multiple Cell Values using a Slice
Additionally, we can use a slice for the rows, the columns, or both:
# loc function
df.loc[:'water','carbonated?':'sugar(tsp)']
          carbonated? temperature  sugar(tsp)
drink
soda             True        cold        10.5
coffee          False         hot         3.0
smoothie        False        cold         6.0
water           False        cold         0.0
Indexing Using Slice Objects (Row and Column)
# iloc function
df.iloc[3:7,2:]
          sugar(tsp)  calories
drink
water            0.0         0
tea              2.0        43
lemonade         9.5       125
slushy           8.0        99
Indexing Using Slice Objects (Row and Column)
Selecting Multiple Cell Values using a Combination of Everything
We can also combine the two approaches, using a list for one axis and a slice for the other:
# loc function
df.loc[['coffee','smoothie','soda'],:'sugar(tsp)']
          carbonated? temperature  sugar(tsp)
drink
coffee          False         hot         3.0
smoothie        False        cold         6.0
soda             True        cold        10.5
Indexing Using a List and a Slice Object (Row and Column)
# iloc function
df.iloc[2:5,[0,1,3]]
          carbonated? temperature  calories
drink
smoothie        False        cold        85
water           False        cold         0
tea             False         hot        43
Indexing Using a List and a Slice Object (Row and Column)
Pandas Documentation
For full details, see the official pandas documentation for the loc function (DataFrame.loc) and the iloc function (DataFrame.iloc).