Finding Quantiles of a Column in a DataFrame


We can find many different quantiles for sets of numbers using the .quantile() function of a DataFrame. One specific quantiles, the 50% quantile, is almost universally known since it is the median!

If the numbers in a column are organized in ascending order, the median is the value that rests directly in the middle of the data, with 50% on the left side (and the right side, but we focus specifically on the left side when we think of quantiles). We can also find the 25% quantile, which is the value with 25% of the data to the left, and the 75% quantile, which is the value with 75% of the data to the left.

The Movie Dataset

Let's use a small DataFrame with information about movies to see this function in action!

Reset Code Run All to Here Python Output:
movie release date domestic box office worldwide box office personal rating international box office
0 The Truman Show 1996-06-05 125618201 264118201 10 138500000
1 Rogue One: A Star Wars Story 2016-12-16 532177324 1055135598 9 522958274
2 Iron Man 2008-05-02 318604126 585171547 7 266567421
3 Blade Runner 1982-06-25 32656328 39535837 8 6879509
4 Breakfast at Tiffany's 1961-10-05 9551904 9794721 7 242817

Choosing the Quantile

The usefulness of .quantile() function lies with its parameter. By default, the function calculates the 50% quantile (the median). This is kind of redundant, though, because we already have a .median() function that returns the same result.

Reset Code Run All to Here Python Output:
8.0
Reset Code Run All to Here Python Output:
8.0

We can change which quantile the function calculates by inputting our own decimal parameter. For example, to calculate the 25th percentile, type 0.25 in the parenthesis.

However, we are not limited to 0.25, 0.5, and 0.75. We can input any number between 0 and 1 to calculate more complicated quantiles.

Reset Code Run All to Here Python Output:
7

Additional explanations, videos, and example problems covering quantiles is part of the DISCOVERY course content found here:

Finding Multiple Quantiles

We can also input a list of decimals to get every quantile we want at once. The result will be in list format.

Reset Code Run All to Here Python Output:
0.25     32656328.0
0.50    125618201.0
0.75    318604126.0
Name: domestic box office, dtype: float64