Upcoming Deadlines
Final Grades are Posted!

Final grades are posted in Canvas. Have an amazing summer!!
Join the DISCOVERY Team!

Interested in helping develop DISCOVERY?
Apply to be a CA!
Week 15: Regression, Clustering, and Machine Learning
This week, we continue our journey with machine learning and you’ll begin to write your first code to have a machine learn! The prelectures and topics for this week’s lectures are listed below:
Monday, April 25: Machine Learning Models with sklearn
Wednesday, April 27: kMeans Clustering
Friday, April 29: Towards Machine Learning in Python
You will also complete your final lab this week, lab_regression. :)
Week 14: Towards Machine Learning
This week we will wrap up hypothesis testing and go into one of the most exciting areas of data science: machine learning! Here’s the plan:
Monday, April 18:
 We will finish up Hypothesis Testing from last week
 PreLecture: Overview of Machine Learning
 No new homeworks on Monday!
Wednesday, April 20:
 PreLecture: Correlation
 Homework m602, due April 22 by 11:59pm
Friday, April 22:
 PreLecture: Linear Regression
 Homework m603, due April 25 by 11:59pm
In addition, remember that you have your lab section every week and we have office hours every weekday (Monday-Friday)! :)
Data Science and Related Courses
Last week, Wade and Karle discussed Data Science courses offered at Illinois and related courses. Here are the slides from the talk:
Week 13: Confidence Intervals and Hypothesis Testing
This week you will explore two important statistical topics: Confidence Intervals and Hypothesis Testing! We will continue our discussion of random variables as we explore doing these both by hand and in Python!
 Monday’s Prelecture: Confidence Intervals
 Wednesday’s Prelecture: Hypothesis Testing
 Friday’s Lecture: More Hypothesis Testing (no prelecture for today, but the HW released today will be a review of what we covered this week)
You have 3 PL homeworks due this week at 11:59pm on Monday, Wednesday, and Friday! You’ll also get to nerd out more with lab_clt :)
Week 12: Distributions and Random Variables
This week we will spend some time on Monday talking about classes you can consider taking if you want to dive deeper into Data Science! We’ll talk about our favorite courses on campus and then continue talking about distributions and random variables! The prelectures and homeworks for this coming week:
Monday, April 4:
 We will discuss Data Science at Illinois, including courses you can take to further your knowledge in Data Science, Computer Science, and Statistics.
 We will review Bernoulli & Binomial Random Variables (covered in detail last week)
 Homework #24: Bernoulli and Binomial Distributions (m502), Due Wednesday, April 6 at 11:59pm
Wednesday, April 6:
 PreLecture: Central Limit Theorem
 Homework #25: Statistical Distributions in Python (m503), Due Friday, April 8 at 11:59pm
Friday, April 8:
 PreLecture: Polling and Sampling
 Homework #26: Central Limit Theorem (m504), Due Monday, April 11 at 11:59pm
In addition, remember that you have your lab section every week and we have office hours every weekday (Monday-Friday)! :)
Week 11: Midterm 2, Project, and Distributions
This week you will work on Project #1 (Image Mosaic) and dive more into distributions:
 Monday’s Lecture: Bernoulli and Binomial Distributions
 Wednesday’s Lecture: Distributions in Python
 Friday’s Lecture: No Class Friday due to the midterm :)
You have one PL homework due this week – the practice exam!! During lab, you should work on your project. If you come to lab and work on your project, you’ll get the 20 points for this week! If you finish your project before your lab, send a copy of your image mosaic to your lab TA before your lab starts to get the 20 points!
Week 10: Normal Distribution
This week you will work on Project #1 (Image Mosaic) and dive more into the normal distribution:
 Monday’s Lecture: Normal Distribution
 Wednesday’s Lecture: Law of Large Numbers
 Friday’s Lecture: Random Variables
You have no PL homeworks this week – you should instead work on your project and prepare for the upcoming Midterm exam! In lab, you will work on lab_justice. :)
Week 8: More Python Conditionals and Functions
This week we will nerd out with more Python! Specifically, we will revisit conditionals and introduce the idea of functions. On Friday, we will introduce our first project of the semester. There are three prelectures for this week on Monday and Wednesday:
 Monday’s Lectures: Sample Space and More Conditionals in Python
 Wednesday’s Lecture: Functions in Python
 Friday’s Lecture: No Prelecture – PROJECT INTRO DAY!
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll work on a new lab to practice the Python programming we have learned so far. :)
Week 7: Simulation
We are just two weeks away from Spring Break and beginning my favorite part of the class!! Next week we will finish up probability and begin to talk about simulation – we’ll use Python to simulate real-world events including guessing on multiple-choice exams, playing casino games with various strategies, and much, much more!
 Monday’s Lecture: Bayes’ Theorem
 Wednesday’s Lecture: Overview of Simulation
 Friday’s Lecture: For-Loops in Python
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll work on a new lab to practice the Python programming we have learned so far. :)
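If you want a preview of what simulating a multiple-choice exam can look like, here is a minimal sketch using only Python's built-in `random` module and a for-loop. The exam size (10 questions, 4 choices each) and the function names `simulate_exam` and `average_score` are made up for illustration; they are not taken from the course materials.

```python
import random


def simulate_exam(rng, num_questions=10, num_choices=4):
    """Guess randomly on every question; return how many guesses were correct."""
    correct = 0
    for _ in range(num_questions):
        answer = rng.randrange(num_choices)  # the (hypothetical) right answer
        guess = rng.randrange(num_choices)   # our random guess
        if guess == answer:
            correct += 1
    return correct


def average_score(num_trials=10_000, seed=0):
    """Repeat the simulated exam many times and average the scores."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    total = 0
    for _ in range(num_trials):
        total += simulate_exam(rng)
    return total / num_trials


avg = average_score()
print(avg)
```

With 4 choices per question, each guess is right with probability 1/4, so the average over many simulated exams should land close to 10 × 1/4 = 2.5 correct answers.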
Week 6: Probability
This week is all about diving into probability – you’ll continue to learn about how to solve complex probability problems with the Multiplication Rule, the Addition Rule, and finally, we will begin to look at conditional probability! There are three lectures for this week:
 Monday’s Lecture: The Multiplication Rule
 Wednesday’s Lecture: The Addition Rule
 Friday’s Lecture: Conditional Probability
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll work on lab_gpa in lab to practice the Python programming we have learned so far :) The HW assignments will be released right after each lecture.
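As a quick illustration of this week's two rules, here is a small sketch that works through a standard 52-card deck with Python's built-in `fractions` module. The card example and the variable names are our own choices for illustration, not something from the lectures or homework.

```python
from fractions import Fraction

# Drawing one card from a standard 52-card deck
p_ace = Fraction(4, 52)            # 4 aces
p_heart = Fraction(13, 52)         # 13 hearts
p_ace_and_heart = Fraction(1, 52)  # only the ace of hearts is both

# Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
p_ace_or_heart = p_ace + p_heart - p_ace_and_heart

# Multiplication Rule: P(A and B) = P(A) * P(B given A)
# Drawing two aces in a row without replacement: after one ace is gone,
# 3 aces remain among the 51 cards left.
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_ace * p_second_ace_given_first

print(p_ace_or_heart)  # 4/13
print(p_two_aces)      # 1/221
```

Using `Fraction` keeps the answers exact, which makes it easier to compare against a by-hand calculation.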
Enjoy Week 6!
Week 5: Midterm Exam Week
This week you will take your first midterm exam! You must register for the time you will take your exam in the CBTF by visiting https://cbtf.engr.illinois.edu/.
Since we have an exam next week (more on that below), there’s only one prelecture next week and two lectures:
 Monday’s Lecture: No prelecture reading required
 Wednesday’s Lecture: Complete Random Numbers in Python
 Friday Lecture is Cancelled (Exam Week)
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll begin to work with grouping data in Python in lab_plots! :) These HW assignments will be released right after each lecture.
Good luck with Week 5!
Week 4: Grouping Data and Visualizations
This week you will learn about how we form complex groups of data with Python and begin to dive into Wade’s favorite topic (data visualizations)!
Before each class, make sure to watch the prelecture videos listed below:

Grouping Data in Python – watch before Monday’s Lecture on Feb. 7

Visualization – Histograms – watch before Wednesday’s Lecture on Feb. 9

Visualization – Boxplots – watch before Friday’s Lecture on Feb. 11
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll begin to work with grouping data in Python in lab_simpsons_paradox! :) These HW assignments will be released right after each lecture.
Good luck with Week 4!
Week 3: Conditionals and Descriptive Statistics
This week, you will learn how to filter data by the contents of the row using conditionals and learn about a practice called EDA or Exploratory Data Analysis.
Before each class, make sure to watch the prelecture videos listed below:

DataFrames with Conditionals – watch before Monday’s Lecture on Jan. 31

Software Version Control with git and Exploratory Data Analysis Overview – watch before Wednesday’s Lecture on Feb. 2

Descriptive Statistics – watch before Friday’s Lecture on Feb. 4
In addition to the prelectures, there is a quick, mastery-based homework after each lecture and you’ll begin to work with DataFrame conditionals in Python in lab_exp_design! :) These HW assignments will be released right after each lecture.
Good luck with Week 3!
Week 2: Working with Pandas + Experimental Design
This week, we will explore how to select a subset of rows from a DataFrame and different aspects of experimental design (observational studies, confounders, and Simpson’s Paradox!). Class will meet in person this week Mon/Wed/Fri from 12-12:50pm in Room 100 Noyes! We are so excited to meet you :)
Before each class, make sure to watch the prelecture videos listed below:
 Row Selection – watch before Monday’s Lecture on 1/24
 Observational Studies, Confounders, and Stratification – watch before Wednesday’s Lecture on 1/26
 Simpson’s Paradox – watch before Friday’s Lecture on 1/28
In addition to the prelectures, there is a quick, mastery-based homework for every lecture and you’ll begin to work with DataFrames in Python in lab_pandas! :) These HW assignments will be released right after each lecture.
Good luck with Week 2!
Week 1: The Beginning
We (Prof. Wade and Prof. Karle) sent out an email last week and again on Tuesday evening. Make sure to read them for important links to the course Discord and more! If you missed them, you can find them as Canvas announcements as well! :)
There will be an introduction to the course held on Zoom during the first meeting of DISCOVERY (Wednesday, Jan. 19). Join us on Zoom at 12:00 noon!
In addition to the introduction, you will explore the basics of data science through four microlectures:
 What is Data Science?
 Types of Data
 Experimental Design and Blocking
 Python for Data Science: Introduction to DataFrames
Finally, there are two small homeworks to complete and lab_intro (all due Monday, Jan 24 by 11:59pm):
 Homework 1: Experimental Design and Blocking (m102)
 Homework 2: Python for Data Science: Introduction to DataFrames (m103)
 lab_intro
We will see you on Zoom on Wednesday – and then IRL this coming Monday for the first lecture of DISCOVERY!
Welcome to Data Science Discovery!
Data Science DISCOVERY is an in-person course, except that all UIUC courses will meet online during the first week back.
More details as we get closer to the first day of class. :)