# Perception of Probability Words Dataset

The "Probability Words Dataset" is a survey of primarily undergraduate students at the University of Illinois about their perception of words that describe the probability of rain on a given day.

• Dataset Format: Well-formatted CSV with column headers as the first row
• Dataset Size: 75 rows × 17 columns
• CSV File Location: https://waf.cs.illinois.edu/discovery/words.csv
• Dataset Variables:
  • Almost Certain : number ➜ The respondent's perception of the probability when they are told it is "almost certain it will rain tomorrow".
  • Highly Likely : number ➜ The respondent's perception of the probability when they are told it is "highly likely it will rain tomorrow".
  • Very Good Chance : number ➜ The respondent's perception of the probability when they are told there is a "very good chance it will rain tomorrow".
  • Probable : number ➜ The respondent's perception of the probability when they are told it is "probable it will rain tomorrow".
  • Likely : number ➜ The respondent's perception of the probability when they are told it is "likely it will rain tomorrow".
  • We Believe : number ➜ The respondent's perception of the probability when they are told that "we believe it will rain tomorrow".
  • Probably : number ➜ The respondent's perception of the probability when they are told it "probably will rain tomorrow".
  • Better than Even : number ➜ The respondent's perception of the probability when they are told it is "better than even it will rain tomorrow".
  • About Even : number ➜ The respondent's perception of the probability when they are told it is "about even it will rain tomorrow".
  • We Doubt : number ➜ The respondent's perception of the probability when they are told that "we doubt it will rain tomorrow".
  • Improbable : number ➜ The respondent's perception of the probability when they are told it is "improbable it will rain tomorrow".
  • Unlikely : number ➜ The respondent's perception of the probability when they are told it is "unlikely it will rain tomorrow".
  • Probably Not : number ➜ The respondent's perception of the probability when they are told it will "probably not rain tomorrow".
  • Little Chance : number ➜ The respondent's perception of the probability when they are told there is "little chance it will rain tomorrow".
  • Almost No Chance : number ➜ The respondent's perception of the probability when they are told there is "almost no chance it will rain tomorrow".
  • Highly Unlikely : number ➜ The respondent's perception of the probability when they are told it is "highly unlikely it will rain tomorrow".
  • Chances are Slight : number ➜ The respondent's perception of the probability when they are told that "chances are slight it will rain tomorrow".
• Data Collection: When the survey was presented to respondents, the phrases appeared in a random order as a measure to eliminate possible bias.
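
As a rough sketch of how the format described above might be checked after loading, the snippet below compares a DataFrame against the 17 variable names listed above. It assumes the CSV headers match those names exactly. The helper name `check_words_columns` and the two-row sample with made-up values are hypothetical illustrations, not part of the dataset; with the real file, the DataFrame read from the CSV location could be passed in instead.

```python
import pandas as pd

# Variable names as listed in the dataset description (assumed to match the CSV headers)
EXPECTED_COLUMNS = [
    "Almost Certain", "Highly Likely", "Very Good Chance", "Probable",
    "Likely", "We Believe", "Probably", "Better than Even", "About Even",
    "We Doubt", "Improbable", "Unlikely", "Probably Not", "Little Chance",
    "Almost No Chance", "Highly Unlikely", "Chances are Slight",
]

def check_words_columns(df):
    """Return (missing, unexpected) column names relative to the documented list."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    unexpected = [c for c in df.columns if c not in EXPECTED_COLUMNS]
    return missing, unexpected

# Hypothetical two-row sample with made-up values, used only to exercise the helper
sample = pd.DataFrame([[50] * 17, [60] * 17], columns=EXPECTED_COLUMNS)

missing, unexpected = check_words_columns(sample)
print("missing:", missing)
print("unexpected:", unexpected)
print("shape:", sample.shape)
```

For the full dataset, the shape would be expected to be 75 rows × 17 columns, as stated above.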

### Using the Probability Words Dataset in Python

The dataset can be loaded using the pandas library in Python:

    import pandas as pd
    df = pd.read_csv("https://waf.cs.illinois.edu/discovery/words.csv")
    df
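
| | Almost Certain | Highly Likely | Very Good Chance | Probable | Likely | We Believe | Probably | Better than Even | About Even | We Doubt | Improbable | Unlikely | Probably Not | Little Chance | Almost No Chance | Highly Unlikely | Chances are Slight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 90 | 90 | 80 | 70 | 70 | 70.0 | 70 | 60 | 50 | 20.0 | 20 | 20 | 20 | 20 | 10.0 | 10.0 | 10 |
| 1 | 10 | 90 | 90 | 65 | 75 | 85.0 | 70 | 60 | 50 | 5.0 | 5 | 15 | 10 | 20 | 5.0 | 5.0 | 5 |
| 2 | 90 | 90 | 75 | 70 | 70 | 60.0 | 70 | 60 | 50 | 30.0 | 5 | 10 | 30 | 20 | 2.0 | 10.0 | 30 |
| 3 | 95 | 90 | 85 | 70 | 85 | 90.0 | 75 | 60 | 50 | 10.0 | 2 | 20 | 5 | 15 | 1.0 | 5.0 | 12 |
| 4 | 95 | 90 | 85 | 50 | 65 | 75.0 | 50 | 65 | 50 | 65.0 | 0 | 30 | 25 | 5 | 5.0 | 10.0 | 20 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 70 | 95 | 90 | 55 | 80 | 50 | 85.0 | 60 | 80 | 50 | 5.0 | 2 | 10 | 10 | 45 | 1.0 | 15.0 | 5 |
| 71 | 95 | 90 | 80 | 60 | 60 | 75.0 | 70 | 60 | 50 | 20.0 | 20 | 30 | 30 | 10 | 5.0 | 15.0 | 15 |
| 72 | 100 | 90 | 90 | 80 | 80 | 50.0 | 80 | 60 | 50 | 5.0 | 30 | 10 | 70 | 30 | 5.0 | 40.0 | 30 |
| 73 | 75 | 10 | 90 | 50 | 75 | 50.0 | 50 | 75 | 50 | 25.0 | 0 | 25 | 25 | 5 | 5.0 | 5.0 | 30 |
| 74 | 90 | 70 | 70 | 20 | 95 | 50.0 | 65 | 35 | 50 | 30.0 | 0 | 10 | 10 | 8 | 2.0 | 5.0 | 10 |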

The full Probability Words Dataset stored in a DataFrame (75 rows).
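
Once the data is in a DataFrame, a natural first step is to summarize each phrase and look for unusual responses. The sketch below computes the median perceived probability per phrase and flags responses outside the common 1.5 × IQR fences. The helper name `iqr_outliers` and the sample values are hypothetical and made up for illustration; the 1.5 × IQR rule is one conventional choice of outlier definition, not something this dataset prescribes.

```python
import pandas as pd

def iqr_outliers(values):
    """Return the entries of a numeric Series that fall outside the 1.5 * IQR fences."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return values[(values < lower) | (values > upper)]

# Hypothetical, made-up responses standing in for two of the 17 columns
sample = pd.DataFrame({
    "Almost Certain": [90, 95, 95, 90, 100, 10],
    "About Even": [50, 50, 50, 50, 50, 50],
})

# Median perceived probability per phrase, highest first
print(sample.median().sort_values(ascending=False))

# Responses to "Almost Certain" that fall outside the 1.5 * IQR fences
print(iqr_outliers(sample["Almost Certain"]))
```

With the real data, `sample` would be replaced by the `df` loaded from the CSV file, and the same helper could be applied to any of its columns.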

### Pages Using the Probability Words Dataset

1. Video Walk-Through & Worksheet: Learn Page: Calculating Quartiles and Outliers
2. Video Walk-Through & Worksheet: Learn Page: Histograms, Bar Charts, and Box Plots