Perception of Probability Words Dataset

The "Probability Words Dataset" is a survey of primarily undergraduate students at The University of Illinois and their perception of words that describe the probability of rain on a given day.

Dataset Format: Well-formatted CSV with column headers as the first row
Dataset Size: 75 rows × 17 columns
CSV File Location: https://waf.cs.illinois.edu/discovery/words.csv
Dataset Variables:
- Almost Certain : number ➜ The respondent's perception of the probability when they are told it is "almost certain it will rain tomorrow".
- Highly Likely : number ➜ The respondent's perception of the probability when they are told it is "highly likely it will rain tomorrow".
- Very Good Chance : number ➜ The respondent's perception of the probability when they are told there is a "very good chance it will rain tomorrow".
- Probable : number ➜ The respondent's perception of the probability when they are told it is "probable it will rain tomorrow".
- Likely : number ➜ The respondent's perception of the probability when they are told it is "likely it will rain tomorrow".
- We Believe : number ➜ The respondent's perception of the probability when they are told that "we believe it will rain tomorrow".
- Probably : number ➜ The respondent's perception of the probability when they are told it "probably will rain tomorrow".
- Better than Even : number ➜ The respondent's perception of the probability when they are told it is "better than even it will rain tomorrow".
- About Even : number ➜ The respondent's perception of the probability when they are told it is "about even it will rain tomorrow".
- We Doubt : number ➜ The respondent's perception of the probability when they are told that "we doubt it will rain tomorrow".
- Improbable : number ➜ The respondent's perception of the probability when they are told it is "improbable it will rain tomorrow".
- Unlikely : number ➜ The respondent's perception of the probability when they are told it is "unlikely it will rain tomorrow".
- Probably Not : number ➜ The respondent's perception of the probability when they are told it will "probably not rain tomorrow".
- Little Chance : number ➜ The respondent's perception of the probability when they are told there is "little chance it will rain tomorrow".
- Almost No Chance : number ➜ The respondent's perception of the probability when they are told there is "almost no chance it will rain tomorrow".
- Highly Unlikely : number ➜ The respondent's perception of the probability when they are told it is "highly unlikely it will rain tomorrow".
- Chances are Slight : number ➜ The respondent's perception of the probability when they are told that "chances are slight it will rain tomorrow".
Data Collection: When the survey was presented to respondents, the phrases appeared in a random order to reduce possible bias from the order of presentation.
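
The column names listed above can be checked against a loaded DataFrame before any analysis. The following is a minimal sketch, not part of the original page: the helper name `check_columns` and the tiny in-memory `sample` (including its extra "Rain" column) are hypothetical and stand in for the real CSV so the example runs without network access.

```python
import pandas as pd

# The 17 column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "Almost Certain", "Highly Likely", "Very Good Chance", "Probable",
    "Likely", "We Believe", "Probably", "Better than Even", "About Even",
    "We Doubt", "Improbable", "Unlikely", "Probably Not", "Little Chance",
    "Almost No Chance", "Highly Unlikely", "Chances are Slight",
]

def check_columns(df):
    """Return (missing, unexpected) column names relative to the documented list."""
    actual = [str(c).strip() for c in df.columns]
    missing = [c for c in EXPECTED_COLUMNS if c not in actual]
    unexpected = [c for c in actual if c not in EXPECTED_COLUMNS]
    return missing, unexpected

# Hypothetical stand-in data: two documented columns plus one that is not documented.
sample = pd.DataFrame({"Almost Certain": [90, 95], "Likely": [70, 85], "Rain": [1, 0]})
missing, unexpected = check_columns(sample)
print(len(missing), unexpected)
```

With the real file, passing the DataFrame returned by `pd.read_csv` to `check_columns` should yield two empty lists if the CSV matches the description.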

Using the Probability Words Dataset in Python

The dataset can be loaded using the pandas library in Python:

import pandas as pd
df = pd.read_csv("https://waf.cs.illinois.edu/discovery/words.csv")
df

	Almost Certain	Highly Likely	Very Good Chance	Probable	Likely	We Believe	Probably	Better than Even	About Even	We Doubt	Improbable	Unlikely	Probably Not	Little Chance	Almost No Chance	Highly Unlikely	Chances are Slight
0	90	90	80	70	70	70.0	70	60	50	20.0	20	20	20	20	10.0	10.0	10
1	10	90	90	65	75	85.0	70	60	50	5.0	5	15	10	20	5.0	5.0	5
2	90	90	75	70	70	60.0	70	60	50	30.0	5	10	30	20	2.0	10.0	30
3	95	90	85	70	85	90.0	75	60	50	10.0	2	20	5	15	1.0	5.0	12
4	95	90	85	50	65	75.0	50	65	50	65.0	0	30	25	5	5.0	10.0	20
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
70	95	90	55	80	50	85.0	60	80	50	5.0	2	10	10	45	1.0	15.0	5
71	95	90	80	60	60	75.0	70	60	50	20.0	20	30	30	10	5.0	15.0	15
72	100	90	90	80	80	50.0	80	60	50	5.0	30	10	70	30	5.0	40.0	30
73	75	10	90	50	75	50.0	50	75	50	25.0	0	25	25	5	5.0	5.0	30
74	90	70	70	20	95	50.0	65	35	50	30.0	0	10	10	8	2.0	5.0	10

The full Probability Words Dataset stored in a DataFrame (75 rows).
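
One simple way to compare the phrases is to rank them by median response. The sketch below is an illustration under stated assumptions: it uses only the first five rows of four columns copied from the preview above, and the function name `rank_phrases` is hypothetical. A few columns in the preview display as floats (for example 70.0), which in pandas often signals missing entries; that is an inference from the preview, not something the dataset description states.

```python
import pandas as pd

# First five rows of four columns, copied from the preview above (not the full data).
sample = pd.DataFrame({
    "Almost Certain": [90, 10, 90, 95, 95],
    "Probable": [70, 65, 70, 70, 50],
    "About Even": [50, 50, 50, 50, 50],
    "Unlikely": [20, 15, 10, 20, 30],
})

def rank_phrases(df):
    """Median response per phrase, highest first. NaN values are skipped by default."""
    return df.median().sort_values(ascending=False)

medians = rank_phrases(sample)
print(medians)
```

The median is used here instead of the mean because single responses such as the 10 recorded for "Almost Certain" in row 1 would pull a mean noticeably downward. The same function can be applied to the full `df` loaded from the CSV.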

Pages Using the Probability Words Dataset

Video Walk-Through & Worksheet: Learn Page: Calculating Quartiles and Outliers
Video Walk-Through & Worksheet: Learn Page: Histograms, Bar Charts, and Box Plots
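
As a rough illustration of the quartile and outlier topic named above, the sketch below applies the common 1.5 × IQR rule to the ten "Almost Certain" values visible in the DataFrame preview shown earlier. The function name `iqr_outliers` is hypothetical, the quartiles use pandas' default linear interpolation, and the linked pages may define quartiles differently, so treat this as one possible approach.

```python
import pandas as pd

# "Almost Certain" values from the ten rows visible in the preview
# (rows 0-4 and 70-74), not the full 75-row dataset.
almost_certain = pd.Series(
    [90, 10, 90, 95, 95, 95, 95, 100, 75, 90], name="Almost Certain"
)

def iqr_outliers(values, k=1.5):
    """Return (q1, q3, outliers) using the k * IQR fences around the quartiles."""
    q1 = values.quantile(0.25)  # pandas default: linear interpolation
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = values[(values < lower) | (values > upper)]
    return q1, q3, outliers

q1, q3, outliers = iqr_outliers(almost_certain)
print(q1, q3, outliers.tolist())
```

On these ten values the quartiles are 90 and 95, so the fences fall at 82.5 and 102.5 and the responses 10 and 75 are flagged. Results on the full dataset will differ.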