Exploring COVID-19 Data from GitHub
MicroProject Overview
Since before COVID-19 was detected in the United States, the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University has provided daily updates of COVID-19 case data as clean, structured CSV files on GitHub as a free public service to the world.
In this MicroProject, you will explore how to find a dataset on GitHub and use that for Data Science analysis! At the end of the MicroProject, you will check if the Pareto Principle applies to confirmed cases of COVID-19. Let's nerd out!
Data Science Skills
In this MicroProject, you will explore COVID-19 data from Johns Hopkins University by finding the raw CSV data on GitHub, and you will strengthen the following Data Science skills:
- Importing data directly from GitHub with `pd.read_csv`
- Grouping unique variables together using `df.groupby`
- Finding the cumulative sum of data using `df.cumsum`
- Checking if the Pareto Principle applies to confirmed cases of COVID-19
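As a rough preview of how these skills fit together, here is a minimal sketch. It is not the MicroProject solution: the CSV text, column names, and numbers below are made up for illustration, and an in-memory string stands in for the raw GitHub URL you would pass to `pd.read_csv` in the notebook. The "top 20% hold at least 80%" threshold is one common reading of the Pareto Principle, assumed here.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a daily report CSV. The column names
# and numbers are invented for illustration and are not real case counts.
SAMPLE_CSV = """Country_Region,Province_State,Confirmed
A,North,550
A,South,300
B,,80
C,,40
D,,20
E,,10
"""

# pd.read_csv accepts a file path, a file-like object, or a URL,
# so a raw GitHub CSV link can be passed in place of the StringIO object.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Group rows that share a country and total their confirmed cases.
by_country = (
    df.groupby("Country_Region")["Confirmed"]
    .sum()
    .sort_values(ascending=False)
)

# Running share of all confirmed cases, largest country first.
cumulative_share = by_country.cumsum() / by_country.sum()

# Pareto check (assumed threshold): do the top 20% of countries
# account for at least 80% of confirmed cases?
top_n = max(1, round(0.2 * len(by_country)))
top_share = cumulative_share.iloc[top_n - 1]
pareto_holds = top_share >= 0.8

print(f"Top {top_n} of {len(by_country)} countries hold {top_share:.0%} of cases")
print("Pareto Principle holds for this sample:", bool(pareto_holds))
```

Sorting before taking the cumulative sum matters: `cumsum` accumulates in row order, so the largest groups must come first for the running share to answer the Pareto question.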
First Time Doing a MicroProject?
Each MicroProject starts with a notebook that we provide to you to get started! You will need to configure a git repository to connect to our `microprojects` remote where we release the starter notebook.
- Follow our Guide: "First Time Setup for MicroProjects" to get set up!
Fetch the Initial Files
In your terminal, navigate to your GitHub repository and merge the initial files by running the following commands:
git fetch microprojects
git merge microprojects/microproject-covid-data-from-github --allow-unrelated-histories -m "Merging initial files"
Complete the Notebook
If the commands above were successful, you have merged in the initial files to start on the MicroProject.
- Find the new `microproject-covid-data-from-github` folder.
- Open `microproject-covid-data-from-github.ipynb` and complete the MicroProject!
Commit and Grade Your Notebook
Once you have finished your notebook, you can use the built-in GitHub Action to perform automated grading of your MicroProject notebook! You will need to commit your work and then manually run the GitHub Action.
Commit Your Work
To commit your notebook, run the standard git commands in your terminal:
git add -u
git commit -m "microproject completed"
git push
Grade Your Notebook
To grade your notebook, run the grading workflow from your GitHub repository:
- Visit your GitHub repository in your browser
- Click on the "Actions" tab
- Under "Workflows", find the workflow for this microproject
- Click "Run Workflow" in the blue box, and then click the green "Run Workflow" button
- After about 10 seconds, you should see a new job that has started running
- You can click on the job to watch it run in real-time
- It will take ~1 minute to run and grade
- Once the run is complete, the autograding summary will be available!