Data Science MicroProjects
MicroProjects are detailed but small projects designed to do real Data Science in under an hour!
MicroProject: DataFrame of Your NWS Weather Forecast 🌩️
The National Weather Service allows, for free, "developers access to critical forecasts, alerts, and observations, along with other weather data." In this MicroProject, you will explore using the weather.gov API service to get the weather for your location.
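As a rough sketch of what this might look like, the code below assumes the two-step flow of the weather.gov API: a `points` lookup for a latitude and longitude returns a forecast URL, and that forecast contains a list of `periods`. The helper names (`get_json`, `periods_to_dataframe`, `forecast_dataframe`), the User-Agent string, and the sample payload are all made up for illustration, and the real response contains more fields than shown.

```python
import json
from urllib.request import Request, urlopen

import pandas as pd

API = "https://api.weather.gov"


def get_json(url):
    """Fetch a URL and parse the JSON body (hypothetical helper)."""
    # weather.gov asks clients to identify themselves with a User-Agent header;
    # the value below is a placeholder, not a real contact.
    req = Request(url, headers={"User-Agent": "microproject-example (you@example.com)"})
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)


def periods_to_dataframe(forecast):
    """Turn the forecast's list of periods into one DataFrame row per period."""
    return pd.DataFrame(forecast["properties"]["periods"])


def forecast_dataframe(lat, lon):
    """Look up the forecast URL for a point, then load that forecast."""
    point = get_json(f"{API}/points/{lat},{lon}")
    return periods_to_dataframe(get_json(point["properties"]["forecast"]))


# Offline demonstration with a made-up payload shaped like a forecast response.
sample = {"properties": {"periods": [
    {"name": "Tonight", "temperature": 41, "temperatureUnit": "F", "shortForecast": "Clear"},
    {"name": "Friday", "temperature": 63, "temperatureUnit": "F", "shortForecast": "Sunny"},
]}}
df = periods_to_dataframe(sample)
print(df[["name", "temperature", "shortForecast"]])
```

With network access, calling `forecast_dataframe(lat, lon)` with your own coordinates should return the same kind of DataFrame for your location.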
MicroProject: Highest Mountains in the World ⛰️
Wikipedia is an absolutely amazing source of information about almost every topic you can imagine! In this MicroProject, you will explore how to easily use data in Wikipedia tables as datasets, and perform row selection based on the contents of strings in your DataFrame.
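Row selection by string contents might look something like the sketch below. The column names are assumptions and the handful of rows (heights rounded to the nearest meter) are typed in by hand so the example runs offline. The project itself would load the full table from Wikipedia.

```python
import pandas as pd

# The real project would likely load the table from Wikipedia, for example with
# pd.read_html(url) (an assumption; it needs network access and an HTML parser).
# A few rows are typed in by hand here so the example runs offline.
df = pd.DataFrame({
    "Mountain": ["Mount Everest", "K2", "Kangchenjunga", "Lhotse"],
    "Range": ["Mahalangur Himalaya", "Baltoro Karakoram",
              "Kangchenjunga Himalaya", "Mahalangur Himalaya"],
    "Height (m)": [8849, 8611, 8586, 8516],
})

# Keep only the rows whose "Range" string contains "Himalaya".
himalaya = df[df["Range"].str.contains("Himalaya")]

# Keep only the rows whose "Range" string starts with "Mahalangur".
mahalangur = df[df["Range"].str.startswith("Mahalangur")]

print(himalaya)
print(mahalangur)
```

Both selections build a boolean mask from the string column and use it to filter the DataFrame, so the same pattern works for any text column.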
MicroProject: Illini Football
The University of Illinois' Fighting Illini Historical Football Scores Dataset contains a "collection of final scores of every known Fighting Illini football game since 1892, with data on location, homecoming, and national bowl games." In this MicroProject, you will explore the history of Illini football games through row selection, data groups, and creating basic data visualization.
MicroProject: World University Rankings
There are hundreds of organizations that rank universities, including US News and World Report, QS World University Rankings, Times Higher Education (THE), and many others.
The Times Higher Education (THE) provides a clean, well-documented CSV that includes their rankings based on the "performance data on universities for students and their families, academics, university leaders, governments and industry". Their 2020 dataset includes almost 1,400 universities across 92 countries and includes 13 performance indicators that measure an institution’s performance across teaching, research, knowledge transfer and international outlook. Their website with additional details on this dataset is found here: https://www.timeshighereducation.com/content/world-university-rankings
In this MicroProject, you will explore basic DataFrame operations on the Times Higher Education university rankings.
MicroProject: United States Congress 🏛️
The @unitedstates project (https://theunitedstates.io/) maintains various high-quality datasets about the United States government. Specifically, the congress-legislators dataset contains every member "of the United States Congress (1789-Present), congressional committees (1973-Present), committee membership (current only), and presidents and vice presidents of the United States in YAML, JSON, and CSV format."
MicroProject: Building a Scene Recognition Model from Video Frames
Visual images are an important part of all media, and Data Scientists often use images as data sources. In this MicroProject, you will create a simple model to detect the amount of time spent in two different "scenes" we used when creating office-hour style videos for Data Science DISCOVERY. To do this, you will learn how to import an entire folder of images, perform image analysis, and create your own model without using a pre-built library.
MicroProject: United Nations (UNHCR) Refugee Data
The United Nations High Commissioner for Refugees (UNHCR) is a United Nations agency mandated to aid and protect refugees, forcibly displaced communities, and stateless people, and to assist in their voluntary repatriation, local integration or resettlement to a third country.
The UNHCR has a database of refugees and internally displaced persons (IDPs) around the world. The data is updated daily and includes information on the number of refugees and IDPs, the countries they are from, the countries they are in, and the number of people who have been displaced by conflict or natural disasters.
MicroProject: Custom Discrete Distribution in Python
In statistics and data science, random variables are used to model events that have uncertain outcomes. For example, in DISCOVERY, we explore the binomial distribution to model flipping a coin, drawing from a deck of cards, guessing on a multiple choice exam, and many other events with a single, fixed probability of success. However, what if there are multiple different outcomes? This MicroProject will explore creating custom discrete distributions in Python to model complex events!
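One possible way to sketch a custom discrete distribution is with weighted random sampling from Python's standard library. The loaded four-sided die below, with its outcomes and probabilities, is a made-up example and not taken from the MicroProject itself.

```python
import random
from collections import Counter

# Hypothetical distribution: a loaded four-sided die (values are made up).
outcomes = [1, 2, 3, 4]
probabilities = [0.1, 0.2, 0.3, 0.4]


def sample(n, seed=None):
    """Draw n independent outcomes using the probabilities as weights."""
    rng = random.Random(seed)
    return rng.choices(outcomes, weights=probabilities, k=n)


def expected_value():
    """Expected value of the distribution: sum of outcome * probability."""
    return sum(x * p for x, p in zip(outcomes, probabilities))


draws = sample(100_000, seed=42)
counts = Counter(draws)

# Observed proportion of each outcome, to compare against `probabilities`.
observed = {x: counts[x] / len(draws) for x in outcomes}

print("expected value:", expected_value())
print("sample mean:   ", sum(draws) / len(draws))
print("observed:      ", observed)
```

With enough draws, the observed proportions should land close to the probabilities that define the distribution, and the sample mean close to the expected value.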
MicroProject: Simulation for Ten Heads in a Row
Simulation is a powerful tool that allows us to run an event with a probabilistic outcome millions of times in under a second. In this MicroProject, you will use simple simulation to flip a coin a million times and discover how to find trends in the simulated data. After writing this simulation, you will do analysis that compiles data over multiple observations -- a simple form of time-series analysis -- to find whether the statistical probability of events matches the simulated probability.
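A minimal sketch of such a simulation, using only the standard library, could look like the following. The function name `simulate_streaks` and its parameters are hypothetical, and treating every window of ten consecutive flips as one observation is an assumption about how the analysis is set up.

```python
import random


def simulate_streaks(n_flips=1_000_000, streak=10, seed=0):
    """Return the fraction of `streak`-flip windows that are all heads."""
    rng = random.Random(seed)
    run = 0   # current number of consecutive heads
    hits = 0  # windows of `streak` consecutive flips that are all heads
    for _ in range(n_flips):
        if rng.random() < 0.5:  # heads
            run += 1
        else:                   # tails resets the run
            run = 0
        if run >= streak:       # the last `streak` flips were all heads
            hits += 1
    windows = n_flips - streak + 1  # number of possible windows
    return hits / windows


simulated = simulate_streaks()
theoretical = 0.5 ** 10  # probability that ten fair flips are all heads

print("simulated:  ", simulated)
print("theoretical:", theoretical)
```

Because overlapping windows share flips, the simulated value varies a little more from run to run than independent trials would, but it should still land near the calculated probability.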
MicroProject: Choropleth Maps from DataFrames 🗺️
Geographical data visualizations are some of the most impactful forms of visualization since they easily allow users to locate places familiar to them. One popular geographical visualization is a choropleth map: a visualization of data on a map where geographical regions are shaded to visually encode data about the region as a whole. For example, population density maps and per-capita income maps are common choropleth maps.
In this MicroProject, you will learn about the folium Python library to create choropleth maps from a DataFrame! Let's nerd out! :)
MicroProject: Exploring COVID-19 Data from GitHub
Since before COVID-19 was detected in the United States, the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University has provided daily updates of COVID-19 case data as clean, structured CSV files on GitHub as a free public service to the world. In this MicroProject, you will explore how to find a dataset on GitHub and use that for Data Science analysis!
MicroProject: Bechdel Test 🎥
The Bechdel Test, or Bechdel-Wallace Test, is a simple way of measuring the representation of women in a film or other work of fiction. To pass the Bechdel Test, a work must meet all three criteria: (1) the work must have at least two women in it, (2) who talk to each other, (3) about something other than a man.
The test was popularized by Alison Bechdel's comic, in a 1985 strip called "The Rule". The website BechdelTest.com provides a searchable database of films and their Bechdel Test results, allowing users to explore and analyze patterns in gender representation in cinema.
MicroProject: Valentine's Day 💗
The NRF (National Retail Federation) is the world's largest retail trade association. Its members include department stores, specialty, discount, catalog, Internet, and independent retailers, chain restaurants, grocery stores, and multi-level marketing companies. For over a decade, NRF has annually surveyed consumers about how they plan to celebrate Valentine’s Day. This includes consumer spending, gifts purchased, and more!
MicroProject: FIFA World Cup
The FIFA World Cup is a global football (soccer) competition contested by the senior men's national teams that occurs every four years. It is likely the most popular sporting event in the world, drawing billions of television viewers every tournament. This MicroProject explores thousands of football matches, including World Cup games through 2022.