Statistics
Describe data with summary statistics, reason about probability, and work with the normal distribution and regression.
Begin at I if you’re new to Statistics.
Mean, Median, Mode
Three ways to describe the middle of a data set. The mean averages the values, the median picks the middle value, and the mode is the value that occurs most often.
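As a quick illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up data, not taken from the course.

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 5, 7]  # made-up data for illustration

avg = mean(scores)         # sum of values divided by how many there are
middle = median(scores)    # middle value of the sorted list
most_common = mode(scores) # value that occurs most often

print(avg, middle, most_common)
```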
Range and IQR
Two simple measures of spread. Range covers the full extent of the data; IQR covers the middle 50% and is robust to outliers.
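The sketch below shows how one outlier inflates the range while leaving the IQR alone. The data is made up, and quartile conventions differ between textbooks; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, which may not match the convention the lesson uses.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 100]  # made-up data; 100 is a deliberate outlier

data_range = max(data) - min(data)  # full extent, pulled up by the outlier

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1  # width of the middle 50%

print(data_range, iqr)
```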
Standard Deviation
Variance measures the average squared deviation from the mean; standard deviation is its square root, in the original units.
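A minimal sketch of both quantities on made-up data. It uses the population versions (`pvariance`, `pstdev`), which divide by n and so match the "average squared deviation" wording above; the sample versions (`variance`, `stdev`) divide by n - 1 instead.

```python
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

var = pvariance(data)  # average of the squared deviations from the mean
sd = pstdev(data)      # square root of the variance, in the original units

print(var, sd)
```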
Five-Number Summary
Describe a data set with five values: min, Q1, median, Q3, max. The foundation of the box plot.
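One way to collect the five values is sketched here. The helper name `five_number_summary` and the data are hypothetical, and the quartiles assume the default "exclusive" method of `statistics.quantiles`; other quartile conventions can give slightly different Q1 and Q3.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, med, q3 = quantiles(values, n=4)  # three quartile cut points
    return (min(values), q1, med, q3, max(values))

data = [4, 8, 15, 16, 23, 42, 50]  # made-up data
summary = five_number_summary(data)
print(summary)
```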
Probability Fundamentals
When outcomes are equally likely, probability is a counting problem: favorable outcomes divided by total outcomes. Also covers the complement rule: P(not A) = 1 - P(A).
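To make the counting concrete, this sketch uses one roll of a fair six-sided die (an assumed example) and exact fractions.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair six-sided die
favorable = [face for face in outcomes if face % 2 == 0]  # even faces

# Favorable outcomes divided by total outcomes
p_even = Fraction(len(favorable), len(outcomes))

# Complement rule: P(not six) = 1 - P(six)
p_not_six = 1 - Fraction(1, 6)

print(p_even, p_not_six)
```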
Independent Events
When two events don't affect each other, the probability they both happen is the product of their probabilities. The multiplication rule.
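The multiplication rule can be checked by brute force. This sketch assumes a fair coin flip and a fair die roll, which do not affect each other, and compares the product of the two probabilities with a direct count over all combined outcomes.

```python
from fractions import Fraction
from itertools import product

coin = "HT"
die = range(1, 7)

# Multiplication rule: P(heads and six) = P(heads) * P(six)
p_rule = Fraction(1, 2) * Fraction(1, 6)

# Direct count over all 12 equally likely (coin, die) pairs
pairs = list(product(coin, die))
hits = [pair for pair in pairs if pair == ("H", 6)]
p_count = Fraction(len(hits), len(pairs))

print(p_rule, p_count)
```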
Conditional Probability
The probability of A given that B has already happened. Restrict the sample space to just the B outcomes, then count the favorable A outcomes within it.
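Here is a sketch of the restrict-then-count idea with two fair dice. The events are chosen for illustration: B is "the sum is at least 10" and A is "both dice show the same number".

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 outcomes for two dice

# Restrict the sample space to B: sum of at least 10
b_outcomes = [(a, b) for a, b in rolls if a + b >= 10]

# Count A (doubles) inside the restricted space
a_within_b = [(a, b) for a, b in b_outcomes if a == b]

p_a_given_b = Fraction(len(a_within_b), len(b_outcomes))
print(p_a_given_b)
```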
Permutations and Combinations
Count arrangements (order matters) and selections (order doesn't). Factorial, P(n,k), and C(n,k) are the three core tools.
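Python's `math` module (3.8 or later) has all three tools built in. The sketch below uses n = 5 and k = 2 as example values.

```python
import math

n, k = 5, 2  # example values

arrangements_of_all = math.factorial(n)  # 5! ways to order all five items
ordered_picks = math.perm(n, k)          # P(5, 2): pick 2, order matters
unordered_picks = math.comb(n, k)        # C(5, 2): pick 2, order ignored

print(arrangements_of_all, ordered_picks, unordered_picks)
```

Each unordered pair can be arranged in 2! = 2 ways, which is why P(5, 2) is twice C(5, 2).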
Expected Value
The long-run average of a discrete random variable. Multiply each value by its probability, then add. The key tool for analyzing games and decisions.
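A small sketch of the multiply-then-add recipe, assuming a hypothetical game that pays the face value of one fair die roll.

```python
from fractions import Fraction

# Each payout (the face value) paired with its probability
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Multiply each value by its probability, then add
expected = sum(value * prob for value, prob in distribution.items())

print(expected, float(expected))
```

Under these assumptions, paying more than 3.5 per play would lose money in the long run.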
Normal Distribution and z-scores
The bell curve, described by its mean and standard deviation. A z-score converts any raw value to the number of standard deviations it lies from the mean.
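The conversion is a single line of arithmetic. In this sketch the function name `z_score` and the numbers (mean 100, standard deviation 15) are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

above = z_score(130, mean=100, sd=15)  # 30 units above the mean
below = z_score(85, mean=100, sd=15)   # 15 units below the mean

print(above, below)
```

A positive z-score means the value is above the mean; a negative one means it is below.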
Empirical Rule (68-95-99.7)
Quick percentages for normal data: about 68% within 1 SD, 95% within 2 SD, and 99.7% within 3 SD. Combined with symmetry, these let you read off almost any region.
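The three percentages are rounded values, and they can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` (Python 3.8 or later); the helper name `within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def within(k):
    """Share of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

shares = {k: within(k) for k in (1, 2, 3)}

# Symmetry: whatever lies outside 1 SD is split evenly between the two tails
above_one_sd = (1 - shares[1]) / 2

for k, share in shares.items():
    print(k, round(share, 4))
print(round(above_one_sd, 4))
```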
Binomial Probability
Probability of exactly k successes in n independent trials with success probability p. P(X = k) = C(n, k) · p^k · (1-p)^(n-k).
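The formula translates directly into code. In this sketch the function name `binomial_pmf` is hypothetical, and the example asks for exactly 2 heads in 4 fair coin flips.

```python
import math

def binomial_pmf(n, k, p):
    """P(X = k): exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

p_two_heads = binomial_pmf(n=4, k=2, p=0.5)  # C(4, 2) * 0.5^2 * 0.5^2

# The probabilities over every possible k should add up to 1
total = sum(binomial_pmf(4, k, 0.5) for k in range(5))

print(p_two_heads, total)
```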
Correlation Coefficient
r measures the strength and direction of a linear relationship between two variables, from -1 to 1. In simple linear regression, r² is the fraction of the variance in y explained by x.
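For readers who want to see the arithmetic, this sketch computes Pearson's r from deviations about the means. The function name `pearson_r` and the data are made up; Python 3.10 and later also ship `statistics.correlation`, which should give the same result.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    s_xy = sum(a * b for a, b in zip(dx, dy))  # co-variation of x and y
    s_xx = sum(a * a for a in dx)              # variation in x
    s_yy = sum(b * b for b in dy)              # variation in y
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 4), round(r**2, 4))
```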
Linear Regression
The line of best fit y = mx + b. Compute slope and intercept from data; use the line to predict y from x within the range of the data.
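A minimal sketch of fitting and using the line, assuming "best fit" means ordinary least squares. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx            # slope
    b = mean_y - m * mean_x    # the line passes through (mean_x, mean_y)
    return m, b

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
predicted = m * 4 + b  # predicted y at x = 4

print(round(m, 4), round(b, 4), round(predicted, 4))
```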