The Normal Curve & the Empirical Rule
Many quantitative variables that pile up symmetrically in the middle and taper at both edges are well approximated by a normal model. The 68–95–99.7 rule (the empirical rule) says that in any normal distribution, about 68%, 95%, and 99.7% of the values fall within 1, 2, and 3 standard deviations of the mean, respectively. Change the spread (σ) and the curve gets wider or narrower, but the percentages never move.
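To check that the percentages really do not depend on the mean or the spread, here is a minimal sketch using Python's standard-library NormalDist. The function name within_k_sd and the three example (mean, σ) pairs are made up for illustration.

```python
from statistics import NormalDist

def within_k_sd(mu, sigma, k):
    """Probability that a Normal(mu, sigma) value lies within k SDs of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Hypothetical (mean, sigma) pairs: very different curves, same percentages.
for mu, sigma in [(0, 1), (100, 15), (-3, 0.25)]:
    shares = [round(within_k_sd(mu, sigma, k), 4) for k in (1, 2, 3)]
    print(f"mu={mu}, sigma={sigma}: {shares}")
```

Each row should print the same three proportions, which round to the familiar 68%, 95%, and 99.7%.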
Mean vs. Median: Who Gets Pulled?
A classic exam trap: which is bigger, the mean or the median? The answer lives in the shape. The median sits at the 50% mark and barely flinches, but the mean is a balance point, so a long tail drags it toward the tail. In a right-skewed distribution the mean is usually larger than the median; in a left-skewed distribution it is usually smaller. Slide from a left skew to a right skew and watch the two lines separate.
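A small numeric check makes the pull of the tail concrete. The data set below is invented for illustration: one large value creates a right tail, and mirroring the values creates a left tail.

```python
from statistics import mean, median

def center_summary(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)

# Made-up data: most values sit near 3, one value (20) forms a long right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20]
# Mirror the data around 10.5 so the long tail points left instead.
left_skewed = [21 - x for x in right_skewed]

for label, data in [("right skew", right_skewed), ("left skew", left_skewed)]:
    m, med = center_summary(data)
    print(f"{label}: mean={m:.2f}, median={med}")
```

In both cases the median stays with the bulk of the data while the mean moves toward the extreme value.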
Correlation & the Least-Squares Line
Click anywhere in the plot to drop a point. The correlation r and the least-squares regression line ŷ = a + bx recompute instantly. Notice how a single far-away point (an influential outlier) can swing the line, and remember that r measures only the strength and direction of a linear pattern: a strong curved relationship can still have r near 0.
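The recomputation can be sketched with the textbook formulas b = Sxy/Sxx, a = ȳ − b·x̄, and r = Sxy/√(Sxx·Syy). The function name least_squares and the data points are assumptions chosen for illustration, not the page's actual code.

```python
from math import sqrt

def least_squares(xs, ys):
    """Return (a, b, r): intercept, slope, and correlation for y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept: the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)  # correlation
    return a, b, r

# Made-up points lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
print("clean data:  ", least_squares(xs, ys))

# Add one far-away point and refit.
print("with outlier:", least_squares(xs + [15], ys + [2]))

# A perfect curve (y = x^2) on symmetric x values has no linear trend.
print("parabola:    ", least_squares([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))
```

With these particular points, the single added point is enough to turn the slope from positive to negative, and the parabola shows that r can be 0 even when y is completely determined by x.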
The Binomial Distribution
Count the successes in n independent trials, each with success probability p. The result is a binomial distribution with mean np and standard deviation √(np(1−p)). Slide p away from 0.5 to see it skew; turn on the normal overlay to see when the approximation is safe (the rule of thumb: np ≥ 10 and n(1−p) ≥ 10).
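The formulas above translate directly into a few lines of Python. This is a sketch with hypothetical helper names (binomial_pmf, binomial_summary); the n and p values are arbitrary examples.

```python
from math import comb, sqrt

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_summary(n, p):
    """Return (mean, sd, normal_ok) using np, sqrt(np(1-p)), and the rule of thumb."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok

# Arbitrary example settings: one symmetric case, one skewed case.
for n, p in [(100, 0.5), (20, 0.1)]:
    mean, sd, ok = binomial_summary(n, p)
    print(f"n={n}, p={p}: mean={mean:.2f}, sd={sd:.3f}, normal overlay safe? {ok}")
```

Summing k · binomial_pmf(n, p, k) over k = 0, …, n reproduces the mean np, which is a handy sanity check on the formula.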
The Law of Large Numbers
Probability is a long-run idea, not a short-run one. Flip a coin a few times and the proportion of heads bounces wildly; flip it thousands of times and it settles toward the true probability. This is why "I'm due for a win" is a fallacy — the average converges, but the coin has no memory.
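A quick simulation shows the settling-down behavior. This sketch assumes a fair coin (p = 0.5) and a fixed random seed so the run is repeatable; the function name running_proportion is invented for illustration.

```python
import random

def running_proportion(n_flips, p=0.5, seed=0):
    """Simulate n_flips independent flips; return the proportion of heads after each flip."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        if rng.random() < p:  # each flip ignores every flip before it
            heads += 1
        proportions.append(heads / i)
    return proportions

props = running_proportion(10_000)
for checkpoint in (10, 100, 1_000, 10_000):
    print(f"after {checkpoint:>6} flips: proportion of heads = {props[checkpoint - 1]:.4f}")
```

Nothing in the loop corrects for an earlier streak. The proportion converges only because early flips are swamped by the sheer number of later ones.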
The Central Limit Theorem
The crown jewel of the course. Start with almost any population shape, even a wildly skewed or bimodal one, as long as it has a finite standard deviation σ. Take a random sample of size n, record its mean, and repeat thousands of times. The distribution of those sample means (right) becomes approximately normal, centered at the population mean, with standard deviation σ/√n. Larger n → tighter, more normal.
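The repeat-and-record procedure can be sketched as a simulation. As an assumption for illustration, the population here is exponential with rate 1 (strongly right-skewed, with mean 1 and σ = 1), so the sample means should have a standard deviation close to 1/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(n, reps=4000, seed=1):
    """Draw `reps` samples of size n from an Exponential(1) population; return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

for n in (2, 25, 100):
    means = sample_means(n)
    print(f"n={n:>3}: mean of sample means={mean(means):.3f}, "
          f"sd={stdev(means):.3f}, sigma/sqrt(n)={1 / sqrt(n):.3f}")
```

The simulated standard deviations should track σ/√n closely. Plotting a histogram of each list would show the skew fading as n grows.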
What "95% Confidence" Really Means
A confidence level describes the method, not any single interval. Each horizontal bar below is one sample's interval for the true mean (the gold line). Build hundreds of them: about 95% will capture the truth and about 5% will miss. Raise the confidence level and the intervals get wider — the price of being right more often.
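The capture rate can be checked by simulation. This sketch makes simplifying assumptions that are not in the text above: the population is normal with a known σ, so each interval is a z-interval x̄ ± z*·σ/√n, and the values of μ, σ, and n are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(conf_level=0.95, mu=50.0, sigma=10.0, n=30, reps=2000, seed=2):
    """Return (fraction of intervals that capture mu, interval width)."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)  # critical value
    margin = z_star * sigma / sqrt(n)                    # margin of error
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps, 2 * margin

for level in (0.90, 0.95, 0.99):
    rate, width = coverage(level)
    print(f"{level:.0%} level: captured mu in {rate:.1%} of intervals, width={width:.2f}")
```

The capture rate should land near the stated confidence level each time, while the width grows as the level rises.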
Hypothesis Tests & the p-value
A significance test asks: if the null hypothesis were true, how surprising is my data? The curve below is the distribution of the test statistic assuming H₀ is true. The shaded tail area is the p-value: the probability, computed assuming H₀ is true, of a result at least as extreme as the one observed. Slide the observed statistic out toward the tail and watch the p-value shrink below your significance level α.
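The tail-area idea can be sketched in code for one common case. This sketch assumes the test statistic is a z statistic, that is, standard normal when H₀ is true. The function name p_value and its tails argument are hypothetical, and a t or chi-square statistic would need a different distribution.

```python
from statistics import NormalDist

def p_value(z, tails="two"):
    """Tail area under the standard normal curve at least as extreme as z."""
    std_normal = NormalDist()
    if tails == "upper":
        return 1 - std_normal.cdf(z)
    if tails == "lower":
        return std_normal.cdf(z)
    # Two-sided: count both tails beyond |z|.
    return 2 * (1 - std_normal.cdf(abs(z)))

alpha = 0.05
for z in (0.5, 1.0, 1.96, 2.5, 3.0):
    p = p_value(z)
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z={z:<4}: two-sided p-value={p:.4f} -> {verdict}")
```

As z moves farther into the tail the p-value falls, and the decision flips once it drops below α.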