MicroProject #11: Learning Handwritten Digits with AI



MicroProject Overview

The National Institute of Standards and Technology (NIST), an institute run by the government of the United States, provides a collection of "Special Databases" that "contain digital data objects such as images, software, and videos". The collection can be found at https://www.nist.gov/srd/related-data-products-and-links/special-databases-and-special-software. Two early "Special Databases" have become some of the most famous datasets in machine learning and artificial intelligence:

  • SD-3 contains handwriting samples from 2,100 United States census workers during the 1990 census.
  • SD-7 contains handwriting samples from 500 high school students in Maryland.

In the 1994 paper "Comparison of classifier methods: a case study in handwritten digit recognition", Léon Bottou et al. describe the creation of the "Modified NIST" (MNIST) dataset, which:

  • Included samples from 500 unique writers (250 from each NIST dataset), and
  • Normalized all images to 28×28 pixel grayscale images

This MNIST dataset is now one of the most famous datasets in machine learning and artificial intelligence. It contains 70,000 images of handwritten digits, each labeled with the digit it shows (for example, a picture of a "0" is labeled 0). In this MicroProject, you'll explore the MNIST dataset and learn about various clustering algorithms to help a computer learn to recognize handwritten digits. Let's nerd out! 🎉
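To give a feel for what "clustering images" means before you open the notebook, here is a minimal sketch of one clustering algorithm, k-means, in plain Python. It does not load MNIST: it builds six toy 28×28 images (vertical bars and hollow boxes) so it can run anywhere. The helper names `make_image`, `sq_dist`, and `kmeans` are hypothetical and chosen for illustration; the starter notebook may use different libraries, algorithms, and names.

```python
SIDE = 28  # MNIST images are 28x28 pixels, i.e. 784 values once flattened


def make_image(kind, shift):
    """Return a toy 28x28 image, flattened to 784 values in row-major order."""
    img = [[0.0] * SIDE for _ in range(SIDE)]
    for r in range(4, 24):
        if kind == "bar":              # vertical stroke, loosely "1"-like
            cols = range(13, 16)
        elif r in (4, 5, 22, 23):      # top and bottom edges of a box
            cols = range(6, 22)
        else:                          # left and right edges of a box
            cols = (6, 7, 20, 21)
        for c in cols:
            img[r][c + shift] = 1.0    # shift moves the shape left or right
    return [pixel for row in img for pixel in row]


def sq_dist(a, b):
    """Squared Euclidean distance between two flattened images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assignments = []
    for _ in range(iters):
        # Assignment step: each image joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the pixel-wise mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data only (an assumption for illustration, not real MNIST samples).
labels = ["bar", "box"] * 3
shifts = [-1, -1, 0, 0, 1, 1]
images = [make_image(kind, s) for kind, s in zip(labels, shifts)]

assignments, centroids = kmeans(images, k=2)
for label, cluster in zip(labels, assignments):
    print(label, "-> cluster", cluster)
```

Note that k-means never sees the labels: it groups images purely by pixel similarity, and the labels are only used afterwards to check whether the groups make sense. Real handwritten digits vary far more than these toy shapes, so clusters on MNIST will be much less clean.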


First Time Doing a MicroProject?

Each MicroProject starts with a notebook that we provide. You will need to configure your git repository to connect to our `microprojects` remote, where we release the starter notebook.


Fetch the Initial Files

In your terminal, navigate to your GitHub repository and merge the initial files by running the following commands:

git fetch microprojects
git merge microprojects/microproject-11-learning-handwritten-digits-with-ai --allow-unrelated-histories -m "Merging initial files"

Complete the Notebook

If the commands above were successful, you have merged in the initial files to start on the MicroProject.

  • Find the new microproject-11-learning-handwritten-digits-with-ai folder.
  • Open microproject-11-learning-handwritten-digits-with-ai.ipynb and complete the MicroProject!

Commit and Grade Your Notebook

Once you have finished your notebook, you must use the built-in GitHub Action to perform automated grading of your MicroProject notebook! You will need to commit your work and then manually run the GitHub Action.

Commit Your Work

To commit your notebook, run the standard git commands in your terminal:

git add -u
git commit -m "microproject completed"
git push

Grade Your Notebook

Grading is started from the GitHub website rather than from your terminal:

  • Visit your GitHub repository in your browser
  • Click on the "Actions" tab
  • Under "Workflows", find the workflow for this microproject
  • Click "Run Workflow" in the blue box, and then click the green "Run Workflow" button
  • After about 10 seconds, you should see a new job that has started running
    • You can click on the job to watch it run in real-time
    • It will take ~1 minute to run and grade
  • Once the run is complete, the autograding summary will be available!