We can find many different quantiles for a set of numbers using the .quantile() function of a DataFrame. One specific quantile, the 50% quantile, is almost universally known, since it is the median!
If the numbers in a column are organized in ascending order, the median is the value that rests directly in the middle of the data, with 50% of the data on the left side (and 50% on the right side, but we focus specifically on the left side when we think of quantiles). We can also find the 25% quantile, which is the value with 25% of the data to the left, and the 75% quantile, which is the value with 75% of the data to the left.
The Movie Dataset
Let's use a small DataFrame with information about movies to see this function in action!
The usefulness of the .quantile() function lies in its parameter. By default, the function calculates the 50% quantile (the median). This is somewhat redundant, though, because the .median() function already returns the same result.
import pandas as pd

# Creates a DataFrame with "movie", "release date", "domestic box office", "worldwide box office", "personal rating", and "international box office" columns
df = pd.DataFrame([
    {"movie": "The Truman Show", "release date": "1996-06-05", "domestic box office": 125618201, "worldwide box office": 264118201, "personal rating": 10, "international box office": 138500000},
    {"movie": "Rogue One: A Star Wars Story", "release date": "2016-12-16", "domestic box office": 532177324, "worldwide box office": 1055135598, "personal rating": 9, "international box office": 522958274},
    {"movie": "Iron Man", "release date": "2008-05-02", "domestic box office": 318604126, "worldwide box office": 585171547, "personal rating": 7, "international box office": 266567421},
    {"movie": "Blade Runner", "release date": "1982-06-25", "domestic box office": 32656328, "worldwide box office": 39535837, "personal rating": 8, "international box office": 6879509},
    {"movie": "Breakfast at Tiffany's", "release date": "1961-10-05", "domestic box office": 9551904, "worldwide box office": 9794721, "personal rating": 7, "international box office": 242817}
])

# Just as with any other descriptive statistic, specify the column in brackets.
# With no argument, .quantile() returns the 50% quantile (the median).
fifty_quant = df["personal rating"].quantile()
fifty_quant
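To check that the default really matches the median, here is a minimal sketch. It assumes a trimmed-down copy of the movie DataFrame that keeps only the "movie" and "personal rating" columns; the variable names default_quant and median_rating are just illustrative.

```python
import pandas as pd

# Trimmed-down copy of the movie DataFrame (an assumption for this sketch):
# only the columns needed for the comparison are kept.
df = pd.DataFrame({
    "movie": ["The Truman Show", "Rogue One: A Star Wars Story", "Iron Man",
              "Blade Runner", "Breakfast at Tiffany's"],
    "personal rating": [10, 9, 7, 8, 7],
})

# With no argument, .quantile() uses q=0.5, the 50% quantile.
default_quant = df["personal rating"].quantile()

# .median() computes the same statistic directly.
median_rating = df["personal rating"].median()

# Sorted ratings are 7, 7, 8, 9, 10, so both values are 8.0.
print(default_quant, median_rating)
```

Both calls give 8.0 here, the middle value of the five sorted ratings.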
We can change which quantile the function calculates by passing our own decimal value as the parameter. For example, to calculate the 25% quantile (the 25th percentile), type 0.25 inside the parentheses.
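As a short sketch of passing a decimal parameter, the example below computes the 25% and 75% quantiles of the personal ratings. It again assumes a trimmed-down copy of the movie DataFrame, and the names q25 and q75 are illustrative.

```python
import pandas as pd

# Trimmed-down copy of the movie DataFrame (an assumption for this sketch).
df = pd.DataFrame({
    "movie": ["The Truman Show", "Rogue One: A Star Wars Story", "Iron Man",
              "Blade Runner", "Breakfast at Tiffany's"],
    "personal rating": [10, 9, 7, 8, 7],
})

# Pass the quantile as a decimal between 0 and 1.
q25 = df["personal rating"].quantile(0.25)  # 25% quantile
q75 = df["personal rating"].quantile(0.75)  # 75% quantile

# Sorted ratings are 7, 7, 8, 9, 10, so q25 is 7.0 and q75 is 9.0.
print(q25, q75)
```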
However, we are not limited to 0.25, 0.5, and 0.75. We can input any number from 0 to 1 to calculate other quantiles.
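The sketch below tries a less common quantile, and also passes a list of decimals to get several quantiles in one call. It assumes the same trimmed-down copy of the movie DataFrame; the names q90 and many_quants are illustrative. Note that pandas interpolates linearly between neighboring values by default, so the result does not have to be a rating that actually appears in the column.

```python
import pandas as pd

# Trimmed-down copy of the movie DataFrame (an assumption for this sketch).
df = pd.DataFrame({
    "movie": ["The Truman Show", "Rogue One: A Star Wars Story", "Iron Man",
              "Blade Runner", "Breakfast at Tiffany's"],
    "personal rating": [10, 9, 7, 8, 7],
})

# Any decimal from 0 to 1 works, not just 0.25, 0.5, and 0.75.
# The 90% quantile falls between the ratings 9 and 10 (approximately 9.6).
q90 = df["personal rating"].quantile(0.9)

# A list of decimals returns a Series with one quantile per entry,
# indexed by the decimals that were passed in.
many_quants = df["personal rating"].quantile([0.1, 0.6, 0.9])

# The extremes 0 and 1 give the smallest and largest values in the column.
lowest = df["personal rating"].quantile(0)
highest = df["personal rating"].quantile(1)

print(q90)
print(many_quants)
print(lowest, highest)
```

Passing a list is convenient when several quantiles are needed at once, since it avoids calling .quantile() repeatedly.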