Python Functions for Random Distributions
The scipy.stats library in Python lets us represent random distributions in code. The library has dozens of distributions, including all of the commonly used ones. Three extremely common distributions are the normal, Bernoulli, and binomial distributions:
Distribution  Python Code 

Normal Distribution 

Bernoulli Distribution 

Binomial Distribution 
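As a minimal sketch of how these three distributions can be created, the snippet below uses the standard scipy.stats constructors norm, bernoulli, and binom. The parameter values (mean 0, standard deviation 1, p=0.5, n=10) and the variable names are illustrative assumptions, not values taken from the table.

```python
from scipy.stats import norm, bernoulli, binom

# Normal distribution: loc is the mean, scale is the standard deviation
# (0 and 1 are assumed values for illustration)
NORMAL = norm(loc=0, scale=1)

# Bernoulli distribution: p is the probability of success (assumed 0.5)
BERNOULLI = bernoulli(p=0.5)

# Binomial distribution: n trials, each with success probability p
# (assumed n=10, p=0.5)
BINOMIAL = binom(n=10, p=0.5)

# Each frozen distribution exposes the same methods, such as .mean()
print(NORMAL.mean(), BERNOULLI.mean(), BINOMIAL.mean())
```

Each variable is a "frozen" distribution object, so the same methods (such as .cdf() and .rvs()) can be called on any of them.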

Once you have a variable with a distribution, there are many Python functions we can use to perform calculations with the distribution. The functions are the same no matter what distribution you have, so let's discover them via examples!
Example Binomial Distribution
A simple binomial distribution that is easy to understand is a binomial distribution with n=2 and p=0.5 (two events, each with a 50% chance of success, like flipping a coin two times and finding out how many times we get heads). To create this distribution in Python:
from scipy.stats import binom
COIN = binom(n=2, p=0.5)
There are four possible outcomes: HH, HT, TH, and TT. The binomial distribution models these outcomes:
There is a 25% probability of the outcome having zero heads (TT). This is represented when COIN returns the value 0 (zero heads).
There is a 50% probability of the outcome having exactly one head (TH or HT). This is represented when COIN returns the value 1 (exactly one head).
There is a 25% probability of the outcome having two heads (HH). This is represented when COIN returns the value 2 (exactly two heads).
We can represent this distribution as a table and a graph:
x  P( COIN == x ) 

0 zero heads  0.25 
1 one head  0.5 
2 two heads  0.25 
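The probabilities in this table can be checked by listing the four equally likely outcomes and counting heads. The sketch below does that with the standard library and compares the result with COIN.pmf(), which is covered later in this document; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

from scipy.stats import binom

COIN = binom(n=2, p=0.5)

# All equally likely outcomes of two flips: HH, HT, TH, TT
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# Number of outcomes that contain 0, 1, or 2 heads
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

for x in range(3):
    by_counting = heads_counts[x] / len(outcomes)  # fraction of outcomes with x heads
    by_scipy = COIN.pmf(x)                         # probability from the distribution
    print(x, by_counting, by_scipy)
```

Both columns should agree with the table: one outcome in four has zero heads, two in four have one head, and one in four has two heads.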
CDF: Cumulative Distribution Function
The Cumulative Distribution Function or CDF is:
The probability of all outcomes less than or equal to a given value x.
Graphically, this is the total area of everything less than or equal to x (the total area to the left of x).
Using our two-coin flip example where COIN = binom(n=2, p=0.5), the CDF calls ask the following:
COIN.cdf(0.2) asks "what percentage of results have 0.2 or fewer heads?"
COIN.cdf(1) asks "what percentage of results have 1 or fewer heads?"
COIN.cdf(2) asks "what percentage of results have 2 or fewer heads?"
Python Example 1: COIN.cdf(0.2)
It's a bit strange to ask "what percentage of results have 0.2 or fewer heads?" since we cannot get a partial number of heads, but it's easy to see that the only number of heads less than or equal to 0.2 is zero heads. Since this only happens one out of four times, we expect the result to be 25%.
Running the code in Python:
0.25
Python Example 2: COIN.cdf(1)
Similar to the first example, we ask "what percentage of results have 1 or fewer heads?" In this case, we can have either zero heads or one head. Since three of our four outcomes have zero or one heads (TT, TH, and HT), the CDF should be 3/4 or 75%. Let's check with Python:
0.75
Python Example 3: COIN.cdf(2)
Similar to the first two examples, we ask "what percentage of results have 2 or fewer heads?" In this case, we can have either zero heads, one head, or two heads: that is every possible result! This means there should be a 100% chance of having two or fewer heads. Checking with Python:
1
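The three CDF examples can be reproduced with a short script. As a sketch, it also checks each value against a running total of .pmf() values (covered later in this document), since the CDF of a discrete distribution is the sum of the probabilities of every outcome at or below x. The helper name cdf_by_summing is hypothetical.

```python
import math

from scipy.stats import binom

COIN = binom(n=2, p=0.5)


def cdf_by_summing(x):
    """Sum the probability of every whole number of heads that is <= x."""
    return sum(COIN.pmf(k) for k in range(0, math.floor(x) + 1))


for x in (0.2, 1, 2):
    # The two printed values should match (up to floating-point rounding)
    print(x, COIN.cdf(x), cdf_by_summing(x))
```

Note that scipy returns floating-point numbers, so the last value prints as 1.0.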
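PPF: Percent Point Function
The Percent Point Function or PPF (the name scipy uses; it is also known as the quantile function) is the inverse of the CDF. Specifically, given a probability y, the PPF returns the point x where the probability of everything at or to the left of x reaches y. For a discrete distribution, this is the smallest value whose CDF is at least y. This can be thought of as the percentile function since the PPF tells us the value of a given percentile of the data.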
COIN.ppf(0.2) asks "what is the 20th percentile of heads?"
COIN.ppf(0.6) asks "what is the 60th percentile of heads?"
COIN.ppf(0.99) asks "what is the 99th percentile of heads?"
Examples
Examining the distribution for COIN, we can calculate the percentiles for each number of heads:
x  P( COIN == x )  Percentile Range 

0 zero heads  0.25  0% to 25% 
1 one head  0.5  25% to 75% 
2 two heads  0.25  75% to 100% 
Therefore, we expect that:
COIN.ppf(0.2), the 20th percentile, falls within 0 heads, and we expect the output to be 0.
COIN.ppf(0.6), the 60th percentile, falls within 1 head, and we expect the output to be 1.
COIN.ppf(0.99), the 99th percentile, falls within 2 heads, and we expect the output to be 2.
Verifying with Python:
0 1 2
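The verification above can be reproduced with the sketch below. It also confirms the inverse relationship with the CDF: for each percentile q, the value returned by .ppf(q) has a CDF of at least q. The variable names are illustrative.

```python
from scipy.stats import binom

COIN = binom(n=2, p=0.5)

for q in (0.2, 0.6, 0.99):
    x = COIN.ppf(q)  # number of heads at percentile q (returned as a float)
    # COIN.cdf(x) >= q shows that x is far enough right to cover percentile q
    print(q, x, COIN.cdf(x) >= q)
```

scipy returns the results as floating-point numbers, so the values equal 0, 1, and 2 but may be displayed as 0.0, 1.0, and 2.0.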
PDF / PMF: Probability {Density/Mass} Functions
The .pmf() and .pdf() functions describe how likely the distribution is to take a value at a specific point.
The Probability Mass Function (PMF), or .pmf(), is only defined on discrete distributions, where each value has a fixed probability of occurring.
The Probability Density Function (PDF), or .pdf(), is only defined on continuous distributions. It returns the probability density at a point rather than a probability: the probability of an event occurring within a small window around the point is approximately the density multiplied by the width of the window.
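To illustrate how .pdf() relates to the probability of a window around a point, here is a minimal sketch using a standard normal distribution. The distribution, the window width of 0.1, and the variable names are assumptions chosen for illustration; they do not come from the COIN example, which is discrete and has no .pdf().

```python
from scipy.stats import norm

# Standard normal distribution: mean 0, standard deviation 1 (assumed for illustration)
Z = norm(loc=0, scale=1)

density = Z.pdf(0)  # probability density at x = 0, about 0.3989
width = 0.1

# Exact probability of landing in the window [-0.05, 0.05], from the CDF
window_prob = Z.cdf(width / 2) - Z.cdf(-width / 2)

# Approximation: density at the center multiplied by the window width
approx = density * width

# A narrow normal distribution has a density above 1 at its peak,
# which shows that a density is not itself a probability
NARROW = norm(loc=0, scale=0.1)
peak = NARROW.pdf(0)  # about 3.989

print(density, window_prob, approx, peak)
```

The exact window probability and the density-times-width approximation agree closely because the window is small; the approximation degrades as the window widens.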
Probability Mass Function (PMF)
Earlier, we discussed that the probability of zero heads is 25% in our COIN binomial random variable. Therefore, we expect that COIN.pmf(0) should be 0.25:
0.25
Likewise, we expect pmf(1) to be 50% (for the 50% chance of flipping exactly one head) and pmf(2) to be 25% (for the 25% chance of flipping two heads):
0.25 0.5 0.25
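The PMF values above can be reproduced with the sketch below. It also shows two properties that follow from the definition: the probabilities of all possible values sum to 1, and a value that cannot occur (such as half a head) has probability 0. The variable names are illustrative.

```python
from scipy.stats import binom

COIN = binom(n=2, p=0.5)

# Probability of exactly 0, 1, and 2 heads
probabilities = [COIN.pmf(x) for x in (0, 1, 2)]
print(probabilities)

# The probabilities of every possible outcome add up to 1
total = sum(probabilities)
print(total)

# A non-integer number of heads is impossible, so its probability is 0
impossible = COIN.pmf(0.5)
print(impossible)
```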
RVS: Random Value Sample
The .rvs() function returns a random sample drawn from the distribution, where each value is drawn with the probability the distribution assigns to it: if something is 80% likely, that value will be sampled about 80% of the time on average. In COIN, we expect more results with 1 (50% occurrence of one head) than 0 or 2 (25% occurrence of either zero heads or two heads).
Generating a sample of 50 values:
array([2, 1, 2, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 2, 2, 0, 0, 1, 1, 0, 1, 2, 1, 0, 2, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 1, 1, 2, 2, 2, 0, 1, 2, 1, 1, 1, 1, 1, 1, 1])
We can insert this data into a DataFrame and count the number of occurrences:
Zero Heads: 10 One Head: 29 Two Heads: 11
In this small simulation, we observe far more results of 1 than 0 or 2. This is the expected result discussed earlier.
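A simulation like this one can be sketched as follows. Two details are assumptions rather than part of the original example: random_state=42 is an arbitrary seed added so the sample is reproducible, and the counting uses NumPy's np.bincount in place of a DataFrame to keep the example short. Because the draws are random, the counts will generally differ from the ones shown above.

```python
import numpy as np
from scipy.stats import binom

COIN = binom(n=2, p=0.5)

# Draw 50 random values; random_state fixes the seed (42 is an arbitrary choice)
sample = COIN.rvs(size=50, random_state=42)

# counts[k] is the number of draws with exactly k heads (k = 0, 1, 2)
counts = np.bincount(sample, minlength=3)

print("Zero Heads:", counts[0], "One Head:", counts[1], "Two Heads:", counts[2])
```

Increasing size makes the observed proportions settle closer to 25%, 50%, and 25%.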
Example Walk-Throughs with Worksheets
Video 1: Cumulative Distribution Function (CDF) in Python
Video 2: Cumulative Distribution Function (CDF) in Python
Video 3: Probability Point Function (PPF) in Python
Video 4: Probability Mass and Density Functions (PMF/PDF) in Python
Video 5: Random Variable Sample (RVS) in Python
Video 6: Summary of Functions on Distributions in Python
Practice Questions
Q1: Suppose that you have a fair coin. What is the Python code for the distribution of this coin?
Q2: Which of the following is the appropriate function for a biased die?