Custom Discrete Distribution in Python

MicroProject Overview

In statistics and data science, random variables are used to model events that have uncertain outcomes. For example, in DISCOVERY, we explore the binomial distribution to model flipping a coin, drawing from a deck of cards, guessing on a multiple choice exam, and many other events with a single, fixed probability of success. But what if an event has several possible outcomes, each with its own probability? This MicroProject explores creating custom discrete distributions in Python to model these more complex events!
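
As a quick reminder of the binomial case described above, here is a minimal sketch using `scipy.stats.binom`. The exam setup (10 questions, 4 choices each) and the variable names are assumptions made up for illustration; they are not taken from the starter notebook.

```python
from scipy.stats import binom

# Assumed example: guessing on a 10-question multiple choice exam
# where each question has 4 choices, so P(correct guess) = 0.25.
n_questions = 10
p_correct = 0.25

# A "frozen" binomial distribution with fixed n and p
exam = binom(n_questions, p_correct)

expected_correct = exam.mean()   # expected number correct, n * p
prob_exactly_3 = exam.pmf(3)     # P(X = 3)
prob_at_most_3 = exam.cdf(3)     # P(X <= 3)

print(expected_correct, prob_exactly_3, prob_at_most_3)
```

Every question here shares the same probability of success, which is exactly the restriction a custom discrete distribution removes.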

Data Science Skills

In this MicroProject, you will strengthen the following Data Science skills:

  • Using the scipy.stats library to create a binomial distribution
  • Using the rv_discrete class from scipy.stats to create a custom discrete distribution
  • Connecting the mathematical definition of the expected value to the mean() method of a distribution
  • Using mean(), cdf(), and pmf() on a custom distribution to find properties of the distribution
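
To make these skills concrete, here is a minimal sketch of a custom discrete distribution built with `rv_discrete`. The prize spinner, its outcomes, and its probabilities are hypothetical values chosen only for illustration; the notebook may use a different scenario. Note that `rv_discrete` expects integer outcomes and probabilities that sum to 1.

```python
from scipy.stats import rv_discrete

# Hypothetical prize spinner: the outcomes (prize values) and their
# probabilities are illustrative assumptions, not data from the notebook.
outcomes = [0, 5, 10, 50]
probabilities = [0.5, 0.3, 0.15, 0.05]

# Build the custom distribution from (outcomes, probabilities)
spinner = rv_discrete(name="spinner", values=(outcomes, probabilities))

# Expected value from its definition: E[X] = sum of x * P(X = x)
expected_by_hand = sum(x * p for x, p in zip(outcomes, probabilities))

# The distribution's mean() should agree with the hand calculation
print(expected_by_hand, spinner.mean())

# pmf(k) gives P(X = k); cdf(k) gives P(X <= k)
print(spinner.pmf(10))
print(spinner.cdf(5))
```

The hand-computed sum and `mean()` match because `mean()` evaluates the same weighted sum over the outcomes.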

Let's nerd out!

First Time Doing a MicroProject?

Each MicroProject starts with a notebook that we provide to get you started! Your git repository will need to be connected to our `microprojects` remote, where we release the starter notebook.

Fetch the Initial Files

In your terminal, navigate to your GitHub repository and merge the initial files by running the following commands:

git fetch microprojects
git merge microprojects/microproject-custom-discrete-distributions --allow-unrelated-histories -m "Merging initial files"

Complete the Notebook

If the commands above were successful, you have merged in the initial files to start on the MicroProject.

  • Find the new microproject-custom-discrete-distributions folder.
  • Open microproject-custom-discrete-distributions.ipynb and complete the MicroProject!

Commit and Grade Your Notebook

Once you have finished your notebook, you must use the built-in GitHub Action to perform automated grading of your MicroProject notebook! You will need to commit your work and then manually run the GitHub Action.

Commit Your Work

To commit your notebook, run the standard git commands in your terminal:

git add -u
git commit -m "microproject completed"
git push

Grade Your Notebook

To grade your notebook, run the grading workflow from your GitHub repository:

  • Visit your GitHub repository in your browser
  • Click on the "Actions" tab
  • Under "Workflows", find the workflow for this MicroProject
  • Click the "Run Workflow" button in the blue box, and then click the green "Run Workflow" button
  • After about 10 seconds, you should see a new job that has started running
    • You can click on the job to watch it run in real-time
    • It will take ~1 minute to run and grade
  • Once the run is complete, the autograding summary will be available!