Monty Hall Problem - Interactive Game and Three Intuitive Solutions


A classic probability problem that has confused many people since 1975 is the "Monty Hall Problem". It is named after Monty Hall, the original host of the game show "Let's Make a Deal". The problem poses the following situation:

Suppose you are on a game show, and you're given the choice of three doors. There is a prize (usually a new car) behind one of the doors and the other two doors have goats behind them. You pick a door (for example, Door #1).

The host, who knows what's behind the doors, opens a losing door that you did not choose. The host finally asks you "Do you want to keep or switch your door?"

The Monty Hall Problem asks whether it is to your advantage to switch doors.

Initial Intuitions

Many people's intuition tells them that it doesn't matter if you switch or stay and that the probability of getting the car is the same either way. Many people think of the problem as:

  • Initially, there was one winning door and two losing doors and my chance of winning is 1/3.
  • When the host opens a losing door, there is still one winning door but only one losing door. Therefore, my probability of winning has increased to 1/2.
  • However, I do not have any information on the winning door so I would not gain an advantage by switching and I have a 50% (1/2) probability of winning.

This guide provides three different explanations, each of which shows why this intuition is incorrect.

Explanation #1: Analysis of Initial Choices

One intuitive way to explain why the probability of winning increases if you switch is by considering the two cases that exist after Monty Hall has opened a door without the prize.

Case 1: Initially Choosing the Winning Door

  • 1/3 of the times that you play the game, you will initially choose the winning door! 🎉

  • When you initially choose the winning door, both of the other doors are losing doors. The host may choose either door to open.

  • In this case you will get the prize (the car!) simply by staying with your original choice -- your original choice gave you a 1/3 chance of winning.

Case 2: Initially Choosing a Losing Door

  • The other 2/3 of the times that you play the game, you will initially choose a losing door. :(

  • When you initially choose a losing door, the two remaining doors include the winning door and the other losing door.

  • Since the host MUST open a losing door, the host MUST reveal the remaining losing door (this is the key consideration!).

  • After the host opens the losing door, and because you initially picked a losing door, the door you would switch to MUST be the winning door.

  • In this case you will get the prize by simply switching to the unopened door -- in other words, if you pick the wrong door at the beginning, you should switch.

Conclusion

Since you're more likely to pick the wrong door at the beginning (there's a 2/3 chance of this), the probability of winning increases if you switch. Switching is the best decision! If you stay, you'll only get a win 1/3 of the time.
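The two cases above can be combined into a single calculation by weighting each case by how often it occurs. The following is a minimal sketch using Python's `fractions` module; the variable names (`p_pick_winner`, `p_win_if_stay`, and so on) are illustrative and not part of the original guide.

```python
from fractions import Fraction

# Probability of each case for the initial pick (three equally likely doors).
p_pick_winner = Fraction(1, 3)  # Case 1: the initial door hides the prize
p_pick_loser = Fraction(2, 3)   # Case 2: the initial door hides a goat

# Staying wins only in Case 1; switching wins only in Case 2.
p_win_if_stay = p_pick_winner * 1 + p_pick_loser * 0
p_win_if_switch = p_pick_winner * 0 + p_pick_loser * 1

print("P(win | stay)   =", p_win_if_stay)
print("P(win | switch) =", p_win_if_switch)
```

Using exact fractions rather than floats keeps the result in the same 1/3 and 2/3 form used in the text.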

Explanation #2: Analysis of Sample Space

In any probabilistic event, you can always define the sample space of all possible outcomes. In this problem, the sample space includes our decision to stay or swap doors and two additional variables:

  • The location of the prize (Door 1, Door 2, or Door 3)
  • The initial door you picked (Door 1, Door 2, or Door 3)

Sample Space Analysis of Staying with Your Initial Door

Initial Door Chosen
Prize Location | Your Choice: Door #1 | Your Choice: Door #2 | Your Choice: Door #3
Door #1        | WIN                  | LOSE                 | LOSE
Door #2        | LOSE                 | WIN                  | LOSE
Door #3        | LOSE                 | LOSE                 | WIN

Among the nine equally likely outcomes, you win in three of them (1/3 of the time) when you keep your original door.

Sample Space Analysis of Swapping Your Initial Door

Initial Door Chosen
Prize Location | Your Choice: Door #1 | Your Choice: Door #2 | Your Choice: Door #3
Door #1 | LOSE | WIN (host reveals Door #3; you swap from Door #2 to Door #1) | WIN (host reveals Door #2; you swap from Door #3 to Door #1)
Door #2 | WIN (host reveals Door #3; you swap from Door #1 to Door #2) | LOSE | WIN (host reveals Door #1; you swap from Door #3 to Door #2)
Door #3 | WIN (host reveals Door #2; you swap from Door #1 to Door #3) | WIN (host reveals Door #1; you swap from Door #2 to Door #3) | LOSE

Conclusion

Among the nine equally likely outcomes, you win in six of them (2/3 of the time) when you swap your original door. Therefore, switching doors is the best decision!
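The two tables can also be reproduced by enumerating the sample space in code. The sketch below is one possible way to do it, not the guide's own method: the helper names `host_reveal`, `final_door`, and `count_wins` are hypothetical. It also assumes that when the host has two goat doors to choose from, the host opens the lower-numbered one; this tie-break does not change the win counts.

```python
from itertools import product

DOORS = (1, 2, 3)

def host_reveal(prize, pick):
    """Return a goat door the host may open (lowest-numbered if two qualify)."""
    return min(d for d in DOORS if d != prize and d != pick)

def final_door(pick, revealed, swap):
    """Return the door the player holds after deciding to stay or swap."""
    if not swap:
        return pick
    # Swapping means taking the one door that is neither picked nor revealed.
    return next(d for d in DOORS if d not in (pick, revealed))

def count_wins(swap):
    """Count winning outcomes over all nine (prize, pick) combinations."""
    wins = 0
    for prize, pick in product(DOORS, repeat=2):
        revealed = host_reveal(prize, pick)
        if final_door(pick, revealed, swap) == prize:
            wins += 1
    return wins

print("stay wins:", count_wins(swap=False), "of 9")
print("swap wins:", count_wins(swap=True), "of 9")
```

Each of the nine (prize, pick) pairs is counted once, which matches the tables as long as the prize location and the initial pick are each chosen uniformly at random.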

Additional explanations, videos, and example problems covering sample spaces are part of the DISCOVERY course content.

Explanation #3: Simulation in Python

Using Python, we can write a simulation of playing the game 100,000 times (or more!).

import pandas as pd
import random

data = []
for i in range(100000):
  # == Simulate Real-World Variables ==
  prize_location = random.choice([1, 2, 3])
  initial_choice = random.choice([1, 2, 3])

  possible_reveal_doors = [1, 2, 3]
  # Host cannot reveal the winning door:
  possible_reveal_doors.remove(prize_location)
  # Host cannot reveal the door we picked (if it's still left in the list):
  if initial_choice in possible_reveal_doors:
    possible_reveal_doors.remove(initial_choice)
  # Host will randomly reveal one of the remaining doors:
  reveal_door = random.choice(possible_reveal_doors)

  stay_or_swap = random.choice(["stay", "swap"])

  if stay_or_swap == "swap":
    # The doors all add up to `6` (1 + 2 + 3).  If we subtract the
    # door number we initially picked and the one revealed, we're
    # left with the door that we swap to:
    final_choice = 6 - initial_choice - reveal_door
  else:
    final_choice = initial_choice

  # == Accumulate Variables ==
  d = {"prize_location": prize_location, "initial_choice": initial_choice, "reveal_door": reveal_door, "stay_or_swap": stay_or_swap, "final_choice": final_choice}
  data.append(d)

df = pd.DataFrame(data)
df
       prize_location  initial_choice  reveal_door stay_or_swap  final_choice
0                   1               2            3         swap             1
1                   3               1            2         swap             3
2                   3               2            1         stay             2
3                   2               1            3         stay             1
4                   1               3            2         stay             3
...               ...             ...          ...          ...           ...
99995               2               2            3         stay             2
99996               2               1            3         swap             2
99997               1               1            3         stay             1
99998               2               3            1         stay             3
99999               3               3            2         swap             1
Simulating the Monty Hall Problem 100,000 Times

Analysis of Simulation Results

In our simulation, the player wins when their final_choice is the same as the prize_location. Out of all 100,000 games played, nearly 50% were wins when we did not separate the games by strategy (staying with the initial door or swapping):

# Overall Win Percentage:
len(df[ df.final_choice == df.prize_location ]) / len(df)
0.49831
Overall Win Percentage

However, we find that if we look at only the games played where we stayed with our initial door, our win percentage drops all the way to 33%:

# Win Percentage when Staying with Initial Door:
df_stay = df[ df.stay_or_swap == "stay" ]
len(df_stay[ df_stay.final_choice == df_stay.prize_location ]) / len(df_stay)
0.3334530579666766
Win Percentage when Staying with Initial Door

Alternatively, we find that if we look at only the games played where we swapped our door, our win percentage jumps to over 66%:

# Win Percentage when Swapping Doors:
df_swap = df[ df.stay_or_swap == "swap" ]
len(df_swap[ df_swap.final_choice == df_swap.prize_location ]) / len(df_swap)
0.6639270321740002
Win Percentage when Swapping Doors

Conclusion

When the strategy is chosen at random, the simulation reproduces the intuitive 50% chance of winning. However, when controlling for the strategy used, the simulation shows the winning percentage is only about 1/3 when staying with your initial door and about 2/3 when swapping doors.
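The same comparison can be run without pandas by fixing the strategy in advance instead of filtering afterwards. The following is a minimal sketch using only the standard library; the function name `simulate` and its parameters `strategy`, `n_games`, and `seed` are assumptions made for illustration and do not appear in the simulation above.

```python
import random

def simulate(strategy, n_games=100_000, seed=0):
    """Return the fraction of games won when always using `strategy`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    wins = 0
    for _ in range(n_games):
        prize = rng.choice([1, 2, 3])
        pick = rng.choice([1, 2, 3])
        # Host opens a door that is neither the prize nor the player's pick.
        revealed = rng.choice([d for d in (1, 2, 3) if d not in (prize, pick)])
        if strategy == "swap":
            pick = 6 - pick - revealed  # the one door left unopened
        wins += (pick == prize)
    return wins / n_games

print("stay:", simulate("stay"))
print("swap:", simulate("swap"))
```

With 100,000 games the two printed rates should land close to 1/3 and 2/3, though the exact digits depend on the seed.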

Additional explanations, videos, and example problems covering simulations, and the pattern we used to create the simulation above, are part of the DISCOVERY course content.

Additional Explanations