Simple Simulations in Python


Let's start writing a simulation in Python! Simulations are used in everything from medical research to fashion to launching rockets. We're going to start with several very basic simulations, but the basic principles are the same! To write a simulation, we must identify all factors that might influence the outcome and write Python code to simulate each of these factors.

Simulation

The objective of the code we will develop is to store the results of every run of our simulation in a DataFrame. By storing the data in a DataFrame, you can use all the tools and techniques you already know to select a subset of rows in a DataFrame, to group data within a DataFrame, to find descriptive statistics about data in the DataFrame, and more!

Almost all simulations follow a similar "pattern", and we only need to modify the pattern in a few select areas to solve a variety of different problems.

Simulation Pattern

Every simulation we will write will follow a six-step pattern:

  1. We will create an initially empty Python list called data to accumulate the result of each run of our simulation. This will always be data = [].

  2. We will write a for-loop to run a block of code for each run of our simulation. For a 10,000-run simulation, this is for i in range(10000):.

  3. Inside of the for-loop, we will simulate all real-world factors. For a simple simulation of a six-sided die roll, roll = random.randint(1, 6) is the only real-world variable.

  4. Inside of the for-loop, we will accumulate all real-world factors we simulated in a Python dictionary called d. A dictionary is a collection of key-value pairs, enclosed in curly braces and separated by commas.

    • We will always name the key in our dictionary the same as our real-world factor, except the key must have quotes around it.

    • For example, if we have a single real-world variable roll, our dictionary d is: d = { "roll": roll }.

    • If we have two real-world variables red and blue, our dictionary d separates the two key-value pairs with a comma: d = { "red": red, "blue": blue }.

    • If the real-world variable is height, our dictionary d is: d = { "height": height }.

    • If we have two real-world variables one and two, our dictionary d is: d = { "one": one, "two": two }.

    • The value paired with each key is always the variable itself, written without quotes. (The effect of this is that we are creating a column in our DataFrame labeled with the name of our variable.)

  5. Inside of the for-loop, we will append our dictionary to our list data. This will always be: data.append(d).

  6. Finally, outside of the for-loop, we will save our data as a DataFrame df. This will always be: df = pd.DataFrame(data), which creates a DataFrame out of data.
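
To see how the dictionary keys from Step 4 become column labels in Step 6, here is a minimal sketch of the pattern with only three runs. The names red and blue come from the dictionary example above; treating each one as a six-sided die roll is an assumption made only for illustration.

```python
import random
import pandas as pd

data = []                            # Step 1: empty list
for i in range(3):                   # Step 2: only three runs, to keep the output small
    red = random.randint(1, 6)       # Step 3: randint includes both endpoints (1 and 6)
    blue = random.randint(1, 6)
    d = {"red": red, "blue": blue}   # Step 4: keys are quoted names, values are the variables
    data.append(d)                   # Step 5: one dictionary per run
df = pd.DataFrame(data)              # Step 6: one row per run, one column per key

print(data)               # a list of three dictionaries
print(list(df.columns))   # ['red', 'blue']
print(len(df))            # 3
```

Each dictionary in data becomes one row of df, so the number of rows always equals the number of runs.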

Simulate Rolling a Die

One of the simplest simulations we can write is rolling a fair, six-sided die.

Example: Simulating Rolling a Six-sided Die

Using the six-sided die example, the full code to simulate rolling a six-sided die 600 times and save the results is six lines of code, plus two import lines at the top:

import random                  # provides random.randint
import pandas as pd            # provides pd.DataFrame

data = []                      # Step 1: empty list `data`
for i in range(600):           # Step 2: for-loop
  roll = random.randint(1, 6)  # Step 3: simulate real-world factors
  d = { "roll": roll }         # Step 4: accumulate factors in dictionary `d`
  data.append(d)               # Step 5: append `d` to `data`
df = pd.DataFrame(data)        # Step 6: create the DataFrame (outside of the for-loop)
# In a second cell, we'll print out `df` (otherwise we would be re-running the simulation)
df
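
Once the results are in df, the usual DataFrame tools apply. The sketch below reruns the 600-roll simulation so that it is self-contained, then counts each face, selects the rows where a six was rolled, and computes the average roll. The names counts, sixes, and average are illustrative choices, and because the rolls are random the exact numbers will differ on every run.

```python
import random
import pandas as pd

# Rerun the simulation so this example stands on its own
data = []
for i in range(600):
    roll = random.randint(1, 6)
    data.append({"roll": roll})
df = pd.DataFrame(data)

counts = df.groupby("roll").size()   # how many times each face appeared
sixes = df[df["roll"] == 6]          # only the rows where a six was rolled
average = df["roll"].mean()          # the average of all 600 rolls

print(counts)
print(len(sixes) / len(df))          # fraction of sixes; should be near 1/6
print(average)                       # should be near 3.5
```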

Example: Simulating Rolling Two Six-sided Dice

If we want to roll two six-sided dice, there are now two real-world factors in every run of the simulation. Let's think of one die as the "white" die (variable white) and the other as the "black" die (variable black):

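# Imports (not needed again if they already ran in an earlier cell):
import random
import pandas as pd

# Step 1: empty list `data`:
data = []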

# Step 2: for-loop:
for i in range(600):
  # Step 3: simulate all real-world factors:
  black = random.randint(1, 6)  
  white = random.randint(1, 6)

  # Step 4: accumulate all factors in dictionary `d`:
  d = { "white": white, "black": black }

  # Step 5: append `d` to `data`
  data.append(d)

# Step 6: create the DataFrame (outside of the for-loop)
df = pd.DataFrame(data)
# In a second cell, we'll print out `df` (otherwise we would be re-running the simulation)
df
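
With both dice stored as columns, a question about the pair can be answered by combining the columns. As one hedged illustration, the sketch below adds a hypothetical total column and estimates how often two dice sum to 7. The column name total and the choice of 7 are assumptions for this example and are not part of the code above.

```python
import random
import pandas as pd

# Rerun the two-dice simulation so this example stands on its own
data = []
for i in range(600):
    white = random.randint(1, 6)
    black = random.randint(1, 6)
    data.append({"white": white, "black": black})
df = pd.DataFrame(data)

df["total"] = df["white"] + df["black"]   # sum of the two dice in each run
estimate = (df["total"] == 7).mean()      # fraction of runs where the sum is 7

print(estimate)   # varies from run to run; the exact probability is 6/36
```

Comparing with == produces a column of True/False values, and taking the mean of that column gives the fraction of True values.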

Example Walk-Throughs with Worksheets

Video 1: Writing Simulations in Python I

Follow along with the worksheet to work through the problem:

Video 2: Writing Simulations in Python II

Follow along with the worksheet to work through the problem:

Practice Questions

Q1: How do you append data in the variable `courses` to a Python list stored in the variable `schedule`?
Q2: If we have simulation data in three variables `math`, `english`, `science`, then the dictionary `d` containing these variables is...