The first half of this lab will be spent getting you all set up for the semester – you will only need to do this once. :)
To begin to do Data Science, you need a few basic tools installed on your computer. All of these tools are free, open-source and industry standard. We have prepared guides based on what type of computer you have:
Python 3
If you find you only have Python 2.x, and not Python 3.x, you can install Python 3 from python.org. You will need at least Python 3.7, and you should grab the latest version.
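One quick way to confirm which version you have is to ask Python itself. The snippet below is only a minimal sketch, not part of the official setup: the helper name `meets_minimum` is made up for illustration, and the 3.7 floor is taken from the requirement stated above.

```python
import sys

# Minimum version needed for the course, per the requirement above.
MINIMUM = (3, 7)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) version is at least `minimum`.

    `meets_minimum` is a hypothetical helper written for this example.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this script.
current = sys.version_info
if meets_minimum(current):
    print(f"Python {current.major}.{current.minor} is new enough.")
else:
    print(f"Python {current.major}.{current.minor} is too old; install Python 3.")
```

If you save this as a file and run it with the same command you plan to use for the course, the message tells you whether that particular Python install is recent enough.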
Part 1b: Creating your STAT 107 git repository
When working in Data Science, you will want to store all of your code and data together, in the cloud, in a “repository”. For Discovery, we will be using an Illinois-hosted repository service called GitHub Enterprise.
Part 1c: Set up your Python notebook
In Data Science, all of our programming will be done in “notebooks”. Your Python install will need a few libraries in order to run the notebooks. Using your command line (either by going to your system’s command line interface or pressing Ctrl + ` in Visual Studio Code), run the following:
This will take a bit. You will need to press [Enter] to confirm you want to install all of the packages (the option [y]/n shows that y is the default when you choose no option).
Other Commands
If the command above tells you that py is not found, you can try these alternatives:
- python3 -m pip install pandas (particularly on Macs/OS X)
- pip install pandas (particularly if you have installed Python before this course)
- python -m pip install pandas (has worked for a few people when nothing else has worked)
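Whichever command ends up working, you can check that the install succeeded by asking Python whether it can find the library. This is a minimal sketch rather than an official course step: `is_installed` is a hypothetical helper name, and it only checks pandas because that is the one library named in the commands above.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate a top-level package with this name.

    `is_installed` is a hypothetical helper written for this example.
    """
    return importlib.util.find_spec(package_name) is not None


# pandas is the library named in the install commands above.
status = "installed" if is_installed("pandas") else "NOT installed"
print(f"pandas is {status}")
```

Run this with the same Python you used for the install (for example, python3 if python3 -m pip was the command that worked), since each Python install keeps its own set of libraries.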
Part 2: Complete the “Lab: Introduction” Notebook
Using your command line, navigate to your stat107 repository (cd Desktop -> cd stat107 -> cd [NETID], unless you’re already there) and fetch the notebook from our release repository by running the following two git commands:
git fetch release
git merge release/lab_intro -m "Merging initial files"
ONLY IF you get an error related to unrelated histories, use:
git merge release/lab_intro --allow-unrelated-histories -m "Merging initial files"
Open the lab_intro folder in Visual Studio Code by choosing File -> Open Folder, then:
- Open up the lab_intro.ipynb notebook
- Follow the instructions inside of the notebook