Simple Simulations in Python
Let's start writing a simulation in Python! Simulations are used for everything from medical research to fashion to launching rockets, and more. We're going to start off with several very basic simulations, but the basic principles are the same! To write a simulation, we must identify all factors that might influence the outcome of the simulation and write Python code to simulate each of these factors.
Simulation
The objective of the code we will develop is to store the results of every run of our simulation in a DataFrame. By storing the data in a DataFrame, you can use all the tools and techniques you already know to select a subset of rows in a DataFrame, to group data within a DataFrame, to find descriptive statistics about data in the DataFrame, and more!
Almost all simulations follow a similar "pattern", and we only need to modify that pattern in a few select areas to create simulations that solve a variety of different problems.
Simulation Pattern
Every simulation we will write will follow a six-step pattern:
1. We will create an initially empty Python list called `data` to accumulate each run of our simulation. This will always be `data = []`.
2. We will write a for-loop to run a block of code for each run of our simulation. For a 10,000-run simulation, this is `for i in range(10000):`.
3. Inside of the for-loop, we will simulate all real-world factors. For a simple simulation of a six-sided die roll, `roll = random.randint(1, 6)` is the only real-world variable.
4. Inside of the for-loop, we will accumulate all real-world factors we simulated in a Python dictionary called `d`. A dictionary is a collection of key-value pairs, enclosed in curly braces and separated by commas.
   - We will always name the key in our dictionary the same as our real-world factor, except the key must have quotes around it.
     - For example, if you have a single real-world variable `roll`, our dictionary `d` is: `d = { "roll": roll }`.
     - If we have two real-world variables `red` and `blue`, our dictionary `d` separates the two variables with a comma: `d = { "red": red, "blue": blue }`.
     - If the real-world variable is `height`, our dictionary `d` is: `d = { "height": height }`.
     - If we have two real-world variables `one` and `two`, our dictionary `d` is: `d = { "one": one, "two": two }`.
   - For the value, we will always refer to our variable by the variable name itself, without quotes. (The effect of this is that we are creating a column in our DataFrame labeled with the name of our variable.)
5. Inside of the for-loop, we will append our dictionary to our list `data`. This will always be: `data.append(d)`.
6. Finally, outside of the for-loop, we will save our `data` as a DataFrame `df`. This will always be: `df = pd.DataFrame(data)`, which creates a DataFrame out of `data`.
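To see why the pattern stores each run as a dictionary, here is a minimal sketch of what pandas does with a list of dictionaries: each dictionary becomes one row, and each key becomes a column label. The three dictionaries below are hand-written stand-ins for three simulation runs, not real simulation output.

```python
import pandas as pd

# Hand-written stand-ins for three runs of a two-variable simulation
# (in a real simulation, these would be appended inside the for-loop)
data = [
    {"red": 3, "blue": 5},
    {"red": 1, "blue": 6},
    {"red": 4, "blue": 2},
]

# Each dictionary becomes a row; each key becomes a column
df = pd.DataFrame(data)

print(list(df.columns))  # ['red', 'blue']
print(len(df))           # 3
```

Because the keys match the variable names, the columns of `df` end up labeled with the same names used inside the for-loop.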
Simulate Rolling Die
One of the simplest simulations we can write is to simulate rolling a fair, six-sided die.
Example: Simulating Rolling a Six-sided Die
Using the six-sided die example, the full code to simulate rolling a six-sided die 600 times and save the results is six lines of code (after importing `random` and `pandas`):
import random
import pandas as pd

data = []                          # Step 1: empty list `data`
for i in range(600):               # Step 2: for-loop
    roll = random.randint(1, 6)    # Step 3: simulate real-world factors
    d = { "roll": roll }           # Step 4: accumulate factors in dictionary `d`
    data.append(d)                 # Step 5: append `d` to `data`
df = pd.DataFrame(data)            # Step 6: create the DataFrame (outside of the for-loop)

# In a second cell, we'll print out `df` (otherwise we would be re-running the simulation)
df
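Once the rolls are in a DataFrame, the usual DataFrame tools apply. The following is a minimal sketch, assuming the same six-step simulation; the call to `random.seed(42)` is an addition for illustration only, so that re-running the cell gives the same rolls.

```python
import random
import pandas as pd

random.seed(42)  # assumption: fixed seed so the run is repeatable

# The six-step pattern for a single six-sided die
data = []
for i in range(600):
    roll = random.randint(1, 6)
    d = {"roll": roll}
    data.append(d)
df = pd.DataFrame(data)

# How many times did each face come up?
# (with a fair die and 600 rolls, each count should be roughly 100)
counts = df["roll"].value_counts().sort_index()
print(counts)

# Select only the runs where a six was rolled
sixes = df[df["roll"] == 6]
print(len(sixes))

# Average roll; for a fair die this should be near 3.5
print(df["roll"].mean())
```

The exact counts depend on the random rolls, so they will differ from run to run unless a seed is set.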
Example: Simulating Rolling Two Six-sided Dice
If we want to roll two six-sided dice, there are now two real-world factors in every run of the simulation. Let's think of one die as a "white" die (variable `white`) and the other as the "black" die (variable `black`):
import random
import pandas as pd

# Step 1: empty list `data`:
data = []

# Step 2: for-loop:
for i in range(600):
    # Step 3: simulate all real-world factors:
    black = random.randint(1, 6)
    white = random.randint(1, 6)

    # Step 4: accumulate all factors in dictionary `d`:
    d = { "white": white, "black": black }

    # Step 5: append `d` to `data`
    data.append(d)

# Step 6: create the DataFrame (outside of the for-loop)
df = pd.DataFrame(data)

# In a second cell, we'll print out `df` (otherwise we would be re-running the simulation)
df
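With two columns in the DataFrame, one possible next step is to combine them. The sketch below is illustrative rather than part of the six-step pattern: it adds a column named `total` (a hypothetical name chosen here) and estimates how often the two dice sum to 7. For two fair dice, 6 of the 36 equally likely outcomes sum to 7, so the estimate should land near 6/36, or about 0.167.

```python
import random
import pandas as pd

# The six-step pattern for two six-sided dice
data = []
for i in range(600):
    black = random.randint(1, 6)
    white = random.randint(1, 6)
    d = {"white": white, "black": black}
    data.append(d)
df = pd.DataFrame(data)

# `total` is a hypothetical column name for the sum of both dice
df["total"] = df["white"] + df["black"]

# Fraction of runs where the dice summed to 7
# (the mean of a True/False column is the fraction of True values)
p_seven = (df["total"] == 7).mean()
print(p_seven)

# Group by the total to count how often each sum occurred
totals = df.groupby("total").size()
print(totals)
```

With only 600 runs the estimate will wobble around the theoretical value; increasing the number in `range(...)` should bring it closer.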
Example Walk-Throughs with Worksheets
Video 1: Writing Simulations in Python I
Video 2: Writing Simulations in Python II
Practice Questions
Q1: How do you append data in the variable `courses` to a Python list stored in the variable `schedule`?
Q2: If we have simulation data in three variables `math`, `english`, `science`, then the dictionary `d` containing these variables is...