Syllabus for STAT/CS/IS 107

Data Science DISCOVERY is offered at The University of Illinois as STAT/CS/IS 107. The content of this website (https://discovery.cs.illinois.edu/) is used as the pre-lecture materials for the lecture where we do interactive activities, practice problems, and nerd out with data science.

The syllabus here provides the structure of the course, schedule, grades, course policies, and an overview of all assignments, labs, and exams. Information about your teaching assistants (TAs), office hours, lab sections, discussion forums, and other resources available to students enrolled in STAT 107 is available via the Canvas page.

Course Overview

Course Description: Data Science DISCOVERY, or simply "DISCOVERY", is UIUC's largest introductory data science course. This course focuses on the intersection of inferential thinking using statistics, computation using industry-standard tools like Python and git, and real-world relevance. As a data-driven course, students perform hands-on analysis of real-world datasets to analyze and discover the impact of data science. Throughout each experience, students reflect on the social issues surrounding data analysis and practice communicating their results.

Lecture Section: Synchronous in-person lectures that meet every Monday, Wednesday, and Friday for DISCOVERY! You must be enrolled in one lecture section. Lecture sections are listed as the "Lecture" sections under STAT 107 in Course Explorer.

Lab Section: Synchronous in-person lab sections that meet each week. Lab sections are led by TAs and focus on small group programming exercises and discussion. Sections are "Bring Your Own Device" and participation in your lab section makes up a third of your lab score each week. The labs are 80 minutes long and capped at 30 students each. You must attend the lab section you are officially registered for to earn credit. Lab sections are listed as the "Laboratory-Discussion" sections under STAT 107 in Course Explorer.

Prerequisites: None
General Education Credit at UIUC: Quantitative Reasoning I (QR1)
Credit Hours: 4 credit hours (3 lecture hours, 1 lab hour)

Course Contact

DISCOVERY is co-taught by two professors from two different departments:

Lab TA: Your primary point of contact with the course staff will be your Lab Teaching Assistant, based on the lab you are registered for in DISCOVERY. The TA contact e-mails by lab section for Spring 2024 are found on Canvas.

Course Discussion Forum: Discord is our class discussion forum. All of the course staff (instructors, TAs, and CAs) will be on Discord to answer any questions you have related to the course. Access to the Discord will be given to you via Canvas.

Even as an online and more casual space, the Discord space is an extension of the classroom, and we expect you to be kind, respect one another, and uphold the student code. Please do not DM or "@" any course staff, and do not share screenshots of solution code; the Discord is a community to help people learn how to master the material themselves. The Discord "Join Link" for Spring 2024 is found on Canvas.

Course Content

In DISCOVERY, there are 3 websites we use regularly:

  1. The DISCOVERY Website (https://discovery.cs.illinois.edu/): This website contains many useful resources for students, including data science guides. We will reference it frequently throughout the semester.
  2. Canvas (https://canvas.illinois.edu/): We use Canvas as a grade book in DISCOVERY! Grades are updated in Canvas each week.
  3. Mastery Platform (https://mastery.cs.illinois.edu/): All homework and CBTF exams will be done using the Illinois Mastery Platform.

Required Course Materials

Laptop Computer: You need a laptop running Windows, Mac OS, or Linux. Android Tablets, Chromebooks, and iPads are not supported. You will need to be able to install both Python and git to complete the labs (instructions provided).

Course Packet: DISCOVERY has a course packet (notebook) you can purchase from the Illini Union Bookstore (IUB) for under $10. This includes all of the course notes for the semester! You will use this notebook during each lecture. Filling out the entire notebook will earn you extra credit at the end of the semester.

Grades

Course grades are given in points, totaling 1,000 points throughout the semester. The breakdown of points is as follows:

  • Labs: 210 points (14 × 15 points; 5 for participation, 10 for Python notebook)
  • Homework: 210 points (33 total, 30 × 7 points and lowest 3 dropped)
  • Projects: 100 points
  • Midterm Exam 1: 120 points (Computer-based, in the CBTF)
  • Midterm Exam 2: 120 points (Computer-based, in the CBTF)
  • Comprehensive Final Exam: 240 points (Computer-based, in the CBTF)
    = 1,000 total points
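
As a quick sanity check, the breakdown above can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary name and labels are made up for this example, and the point values are copied from the list above.

```python
# Point categories as listed in the syllabus breakdown.
# (POINTS is a hypothetical name used only for this example.)
POINTS = {
    "Labs": 14 * 15,       # 14 labs, 15 points each
    "Homework": 30 * 7,    # 33 assigned, best 30 count, 7 points each
    "Projects": 100,
    "Midterm Exam 1": 120,
    "Midterm Exam 2": 120,
    "Final Exam": 240,
}

total = sum(POINTS.values())

# Print each category with its share of the course total.
for name, pts in POINTS.items():
    print(f"{name:<15} {pts:>4}  ({pts / total:.0%} of course)")
print(f"{'Total':<15} {total:>4}")
```

The three exams together account for 480 of the 1,000 points, and labs and homework account for 210 points each.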

Grade Letter Cutoffs

Course points will be translated into a course grade at the end of the semester.

Points Earned | Minimum Grade | Points Earned | Minimum Grade | Points Earned | Minimum Grade
Exceptional   | A+            | [930, 1000+)  | A             | [900, 930)    | A-
[870, 900)    | B+            | [830, 870)    | B             | [800, 830)    | B-
[770, 800)    | C+            | [730, 770)    | C             | [700, 730)    | C-
[670, 700)    | D+            | [630, 670)    | D             | [600, 630)    | D-
              |               | [0, 600)      | F             |               |

We might lower these cutoffs, but we won’t raise them. (In recent semesters these cutoffs have not moved significantly from these targets.)
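
The cutoff table can be read as a simple lookup from points to a minimum letter grade. The sketch below is not an official grade calculator: the function name is hypothetical, it uses the published target cutoffs (which may be lowered), and it leaves out A+ because the syllabus describes that grade only as "Exceptional" rather than giving a point range.

```python
# Minimum points for each letter grade, from the published target cutoffs.
# A+ is omitted: the syllabus lists it only as "Exceptional".
CUTOFFS = [
    (930, "A"), (900, "A-"),
    (870, "B+"), (830, "B"), (800, "B-"),
    (770, "C+"), (730, "C"), (700, "C-"),
    (670, "D+"), (630, "D"), (600, "D-"),
]

def letter_grade(points):
    """Return the minimum letter grade for a course point total."""
    # CUTOFFS is sorted from highest to lowest, so the first match wins.
    for minimum, letter in CUTOFFS:
        if points >= minimum:
            return letter
    return "F"

print(letter_grade(930))    # A
print(letter_grade(929.9))  # A-
print(letter_grade(599))    # F
```

Each range includes its lower bound and excludes its upper bound, so 929.9 points maps to A- in this sketch.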

Extra Credit

There are multiple opportunities for extra credit in DISCOVERY (usually called "+1 points"). Points for extra credit work will be assigned after grade cutoffs are determined, so they are a true bonus to your score. The total amount of extra credit you can earn varies each semester. We will announce extra credit opportunities throughout the semester! Some of them will include:

  • Surveys: These will be done using Google Forms with your @illinois.edu email address.
  • MicroProjects: After each lecture, there will be a MicroProject available for you to complete for +1 EC. These are designed for you to do "real data science in under an hour".
  • Notebook: At the end of the semester, if your entire notebook is filled out, you will earn +15 EC points. This is the largest opportunity for EC in the course.
    ...and more!

Course Assignments

Labs: Labs will be completed each week in your lab section. Labs are due Mondays at 11:59pm. There will be a lab every week, even during exam weeks.

Homeworks (HWs): In general, there will be one short mastery-based homework assignment for each lecture. Each homework is due at 11:59pm on the day of the next lecture after it is assigned (Monday, Wednesday, or Friday); for example, a homework assigned on Monday is due Wednesday at 11:59pm.
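
The "due at the next lecture" rule can be sketched with Python's `datetime` module. This is a rough illustration only: the function name is hypothetical, and it assumes lectures fall on every Monday, Wednesday, and Friday with no holidays, breaks, or exam days. The calendar on Canvas is always the authority for actual due dates.

```python
from datetime import date, datetime, time, timedelta

# date.weekday() values for Monday, Wednesday, Friday.
LECTURE_DAYS = {0, 2, 4}

def homework_due(assigned):
    """Return 11:59pm on the next lecture day after `assigned`.

    Assumption: every Mon/Wed/Fri is a lecture day (no holidays or breaks).
    """
    day = assigned + timedelta(days=1)
    while day.weekday() not in LECTURE_DAYS:
        day += timedelta(days=1)
    return datetime.combine(day, time(23, 59))

# Assigned Monday, January 22, 2024 -> due Wednesday, January 24 at 11:59pm.
print(homework_due(date(2024, 1, 22)))
# Assigned Friday, January 26, 2024 -> due Monday, January 29 at 11:59pm.
print(homework_due(date(2024, 1, 26)))
```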

Projects: There will be 2 projects throughout the semester, which are longer and more involved than the weekly labs. You will have at least one week (and usually more) to complete the project. The projects are an opportunity to apply everything you’ve learned in DISCOVERY and explore your individual interests and passions.

Exams: There will be 2 midterm exams and a cumulative final. Exams will be taken in the Computer Based Testing Facility (CBTF). See https://cbtf.engr.illinois.edu/ for details.

Course Calendar

(Note: For specific due dates for Spring 2024, make sure to check the Spring 2024 calendar on Canvas.)

Lecture | Topics | Assignment
Module 1: Basics of Data Science with Python
#1 | What is Data Science? and Types of Data | Homework #1: Party Words Survey
#2 | Experimental Design and Blocking and Python for Data Science: Introduction to DataFrames | Homework #2: Mastery Platform (m1-02)
#3 | Row Selection with DataFrames | Homework #3: Mastery Platform (m1-03)
#4 | Observational Studies, Confounders, and Stratification | Homework #4: Mastery Platform (m1-04)
#5 | Simpson's Paradox | Homework #5: Mastery Platform (m1-05)
#6 | DataFrames with Conditionals | Homework #6: Mastery Platform (m1-06/07)
#7 | Software Version Control with git and Exploratory Data Analysis (EDA) Overview | Homework #7: Mastery Platform (Module 1 Review)
Module 2: Exploratory Data Analysis
#8 | Descriptive Statistics | Homework #8: Mastery Platform (m2-02)
#9 | Grouping Data in Python | Homework #9: Mastery Platform (m2-03)
#10 | Histograms and Quartiles and Box Plots | Homework #10: Mastery Platform (m2-04/05)
#11 | Basic Data Visualization in Python | Homework #11: Practice Exam #1
Module 3: Simulation and Distributions
#12 | Overview of Simulation and Random Numbers in Python | Homework #12: Mastery Platform (m3-01/02)
#13 | For-Loops in Python and Simple Simulations in Python | No Homework (Midterm 1 Upcoming!)
#14 | No Lecture: Midterm #1 (Covers Module #1 and Module #2) |
#15 | Sample Space and Conditionals in Python | Homework #13: Mastery Platform (m3-03/04)
#16 | Day #1 of Functions in Python and Project #1 Released | Homework #14: Mastery Platform (m3-05)
#17 | Day #2 of Functions in Python | Homework #15: Mastery Platform (m3-06)
#18 | Normal Distribution | Homework #16: Mastery Platform (m3-07)
#19 | Law of Large Numbers | Homework #17: Mastery Platform (m3-08)
Module 4: Prediction and Probability
#20 | Probability Introduction and The Monty Hall Problem | Homework #18: Mastery Platform (Module 3 Review)
#21 | Multi-event Probability: Multiplication Rule | Homework #19: Mastery Platform (m4-03)
#22 | Multi-event Probability: Addition Rule | Homework #20: Mastery Platform (m4-04)
#23 | Conditional Probability | Homework #21: Mastery Platform (m4-05)
#24 | Bayes' Theorem | Homework #22: Mastery Platform (m4-06)
#25 | Review/Complex Examples: Conditional Probability and Bayes' Theorem | Homework #23: Midterm #2 Review
Module 5: Polling, Confidence Intervals, and the Normal Distribution
#26 | Random Variables | No Homework (Project #1 Due Soon!)
#27 | Bernoulli and Binomial Random Variables | No Homework (Project #1 Due!)
#28 | Beyond DISCOVERY: STAT 207, CS 277, CS 307 and more! | No Homework (Midterm #2 Upcoming!)
#29 | No Lecture: Midterm #2 (Covers Module #3 and #4) |
#30 | Python Functions for Random Distributions | Homework #24: Mastery Platform (m5-02/03)
#31 | Central Limit Theorem | Homework #25: Mastery Platform (m5-04)
#32 | Polling and Sampling | Homework #26: Mastery Platform (m5-05)
#33 | Confidence Intervals | Homework #27: Mastery Platform (m5-06)
#34 | Hypothesis Testing | Homework #28: Mastery Platform (m5-07)
Module 6: Towards Machine Learning
#35 | Overview of Machine Learning | Homework #29: Mastery Platform (Module 5 Review)
#36 | Correlation | Homework #30: Mastery Platform (m6-02)
#37 | Linear Regression | Homework #31: Mastery Platform (m6-03)
#38 | Machine Learning Models in Python with sk-learn | No Homework (Project #2 Due Soon!)
#39 | Clustering | No Homework (Project #2 Due Today!)
#40 | Towards Machine Learning in Python | Homework #32: Mastery Platform (m6-04/05)
#41 | Last Day of Lecture: Topics in Data Science | Homework #33: Final Exam Review
Comprehensive Final Exam

Office Hours and Getting Help

Open office hours are fantastic for getting help on understanding course concepts, getting help on assignments, debugging your code, and more! All open office hours will have multiple TAs and/or CAs available to help you out! We have open office hours every single weekday and Prof. Wade and Prof. Karle have professor office hours one day each week.

See the "Office Hours" section in the Canvas syllabus for Spring 2024 office hour times and locations.

Course Policies

Attendance: Attendance is strongly recommended for lecture but is not required. Attendance in your lab section makes up one third of the points of your total lab grade. If you cannot come to your lab due to illness, you can email your lab TA before your lab starts to discuss an alternative.

Late Work: No late submissions are accepted. However, we do drop your 3 lowest homework assignments at the end of the semester. So missing 3 homework assignments won’t hurt your grade. Late projects and late labs will not be accepted.
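
To see why missing a few homeworks does not cost points, here is a minimal sketch of the drop-lowest-3 rule. The function name and the example scores are hypothetical, and it assumes each homework is worth 7 points and a missed homework counts as 0.

```python
def homework_points(scores, drop=3):
    """Sum homework scores after dropping the `drop` lowest ones."""
    kept = sorted(scores)[drop:]  # sorted ascending, so the lowest come first
    return sum(kept)

# Hypothetical semester: 30 perfect homeworks (7 points each) and 3 missed.
scores = [7] * 30 + [0] * 3
print(homework_points(scores))  # 210
```

In this example the three zeros are the ones dropped, so the student still earns the full 210 homework points.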

Rounding Grades/Curves: Due to the large amount of extra credit offered in DISCOVERY, we do not round grades or have curves on exams.

DRES Accommodations: We are happy to offer accommodations for disabilities verified through DRES (http://www.disability.illinois.edu/). Please email us a copy of your DRES letter during Week 1. Since all of our exams will be in the CBTF, you can request extra time through this link: https://cbtf.illinois.edu/students/dres. If you have any other questions or need any other accommodations, don’t hesitate to reach out! :)

Learning Collaboratively: Data Science is a collaborative science. Do not try to tackle this course alone. We strongly encourage you to discuss all of your course activities (with the exception of exams) with your friends and classmates! You will learn more by talking through the problems, teaching others, and sharing ideas.

Continue reading under “Academic Integrity” to understand the difference between collaboration and giving an answer away.

Academic Integrity: Collaboration is about working together. Collaboration is not giving the direct answer to a friend or sharing the source code to an assignment. Collaboration requires you to make a serious attempt at every assignment and discuss your ideas and doubts with others so everyone gets more out of the discussion. Your answers must be your own words and your code must be typed (not copied/pasted) by you.

Academic dishonesty is taken very seriously in STAT 107 and all cases will be brought to the University, your college, and your department. You should understand how academic integrity applies specifically to STAT 107: the sanctions for cheating on an assignment include the loss of all points for the assignment, the loss of all extra credit in STAT 107, and the lowering of the final course grade by one whole letter grade (100 points). A second incident, or any cheating on an exam, results in an automatic F in the course.

Academic integrity includes protecting your work. If your work ends up submitted by someone else, we consider this a violation of academic integrity, just as though you had submitted someone else’s work.