Unit 4: Probability & Distributions

Probability Rules · Random Variables · Expected Value · Binomial · Geometric

📊 10–20% of Exam ⏱ ~3–4 weeks

Basic Probability

Probability measures how likely an event is to occur. It is always a number between 0 and 1 — where 0 means impossible and 1 means certain.

🔑 Key Vocabulary

Experiment: A process with uncertain outcomes (rolling a die, flipping a coin).

Sample Space (S): The set of all possible outcomes.

Event (A): Any subset of the sample space.

P(A): The probability that event A occurs.

Classical Probability (equally likely outcomes)
\[ P(A) = \frac{\text{number of outcomes in } A}{\text{total number of outcomes in } S} \]
Always: \(0 \leq P(A) \leq 1\)  |  \(P(S) = 1\)  |  \(P(\emptyset) = 0\)
The Probability Scale
[Figure: probability scale from 0 to 1. 0 = impossible (roll a 7 on a die); 0.5 = 50/50 (flip heads); 1 = certain (roll ≤ 6 on a die).]
💡 Simulation Approach

When exact probabilities are hard to calculate, we can simulate the experiment many times and use the relative frequency as an estimate. The Law of Large Numbers says that as the number of trials increases, the simulated probability gets closer to the true probability.
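A minimal sketch of the simulation approach, assuming a fair six-sided die and the event "roll a 6" (the event, the function name, and the trial counts are chosen here for illustration). It estimates the probability by relative frequency and shows the estimate settling near the true value of 1/6 as the number of trials grows.

```python
import random

def estimate_probability(trials, seed=0):
    """Estimate P(rolling a 6) on a fair die by relative frequency."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return hits / trials

true_p = 1 / 6
for trials in (100, 10_000, 200_000):
    estimate = estimate_probability(trials)
    print(f"{trials:>7} trials: estimate = {estimate:.4f}, true = {true_p:.4f}")
```

Small runs can land noticeably far from 1/6; the Law of Large Numbers only describes what happens as the number of trials becomes large.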

Probability Rules

Venn Diagram: Events A and B
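[Figure: Venn diagram of sample space S with overlapping circles A and B. Regions: A only, A∩B (both), B only, and neither (outside both circles).]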
Complement Rule
\[ P(A^c) = 1 - P(A) \]
\(A^c\) = "not A" = the complement of A
Addition Rule (General)
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
\(A \cup B\) = A or B (union)  |  \(A \cap B\) = A and B (intersection)
Subtract the overlap to avoid counting it twice.
Addition Rule for Mutually Exclusive Events
\[ P(A \cup B) = P(A) + P(B) \]
Mutually exclusive = disjoint = cannot both occur = \(P(A \cap B) = 0\)
⚠️ Mutually Exclusive ≠ Independent

Mutually exclusive: A and B cannot both happen. If A occurs, B cannot. P(A ∩ B) = 0.

Independent: Knowing A occurred tells you nothing about B. P(A ∩ B) = P(A) · P(B). These are different concepts: mutually exclusive events with nonzero probabilities are always dependent (if A happens, you know B didn't).

📌 Example: Addition Rule

P(drawing a heart) = 13/52 = 0.25   P(drawing a face card) = 12/52 ≈ 0.231

P(heart AND face card) = 3/52 ≈ 0.058   (Jack, Queen, King of hearts)

P(heart OR face card) = 0.25 + 0.231 − 0.058 = 0.423
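The card example can be checked by listing all 52 cards and counting. The sketch below assumes a standard deck; the names `DECK`, `prob`, `is_heart`, and `is_face` are illustrative. It compares the addition rule with a direct count of the union.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def is_heart(card):
    return card[1] == "hearts"

def is_face(card):
    return card[0] in {"J", "Q", "K"}

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    favorable = sum(1 for card in DECK if event(card))
    return Fraction(favorable, len(DECK))

p_heart = prob(is_heart)
p_face = prob(is_face)
p_both = prob(lambda c: is_heart(c) and is_face(c))

# Addition rule: subtract the overlap so it is not counted twice.
p_either_rule = p_heart + p_face - p_both
# Direct count of cards that are a heart, a face card, or both.
p_either_count = prob(lambda c: is_heart(c) or is_face(c))

print(p_either_rule, float(p_either_rule))
```

Using `Fraction` keeps the arithmetic exact, so the two routes can be compared without rounding.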

Conditional Probability & Independence

Conditional Probability
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]
Read as: "probability of A given B has already occurred"
Restricts the sample space to only the outcomes in B.
Multiplication Rule (General)
\[ P(A \cap B) = P(A) \cdot P(B \mid A) \]
If A and B are independent: \(P(A \cap B) = P(A) \cdot P(B)\)
🔑 Testing for Independence

A and B are independent if any of these equivalent conditions hold:

\(P(A \mid B) = P(A)\)   — knowing B happened doesn't change probability of A

\(P(B \mid A) = P(B)\)   — knowing A happened doesn't change probability of B

\(P(A \cap B) = P(A) \cdot P(B)\)   — the multiplication rule holds

Two-Way Tables and Conditional Probability

Two-way tables are a powerful tool for calculating conditional probabilities on the AP exam.

Grade | Plays Sport | Does Not Play | Total
Grade 11 | 45 | 30 | 75
Grade 12 | 35 | 40 | 75
Total | 80 | 70 | 150
📌 Example: Reading a Two-Way Table

Using the table above:

P(plays sport) = 80/150 = 0.533

P(Grade 11 | plays sport) = 45/80 = 0.5625  → restrict to the "plays sport" column

P(plays sport | Grade 12) = 35/75 = 0.467  → restrict to the "Grade 12" row

Are grade and sport independent? P(sport) = 0.533. P(sport | Grade 11) = 45/75 = 0.60 ≠ 0.533. So not independent — grade and sport participation are associated.
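The same table calculations can be sketched in code. The dictionary layout and the helper names `p` and `p_given` are assumptions made for this illustration; the counts come from the table above.

```python
from fractions import Fraction

# Cell counts from the two-way table: (grade, sport status) -> students
counts = {
    ("Grade 11", "plays"): 45,
    ("Grade 11", "does not play"): 30,
    ("Grade 12", "plays"): 35,
    ("Grade 12", "does not play"): 40,
}
total = sum(counts.values())  # 150 students

def p(grade=None, sport=None):
    """Probability a random student matches the given grade and/or sport status."""
    matching = sum(
        n for (g, s), n in counts.items()
        if (grade is None or g == grade) and (sport is None or s == sport)
    )
    return Fraction(matching, total)

def p_given(sport, grade):
    """P(sport status | grade) = P(sport status and grade) / P(grade)."""
    return p(grade=grade, sport=sport) / p(grade=grade)

p_sport = p(sport="plays")
p_sport_given_11 = p_given("plays", "Grade 11")
independent = p_sport_given_11 == p_sport

print(float(p_sport), float(p_sport_given_11), independent)
```

With real survey data an exact equality test is too strict; here it works only because the check uses exact fractions from a complete table.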

Random Variables & Probability Distributions

🔑 What is a Random Variable?

A random variable X assigns a numerical value to each outcome of a random process.

Discrete RV: Takes a countable number of values (0, 1, 2, 3, …). Think counts.

Continuous RV: Takes any value in an interval. Think measurements (height, time, weight).

Probability Distribution: Number of Heads in 3 Coin Flips
[Figure: bar chart of P(X = x) for X = number of heads. P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, P(X = 3) = 1/8. Sum of all probabilities = 1.]
💡 Valid Probability Distribution Rules

A table is a valid probability distribution if and only if:

1. Every probability is between 0 and 1: \(0 \leq P(X = x) \leq 1\)

2. All probabilities sum to exactly 1: \(\sum P(X = x) = 1\)
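As a rough illustration, the coin-flip distribution can be built by listing all 8 equally likely outcomes of 3 flips, and the two validity rules can be checked in a few lines. The function name `is_valid_distribution` is hypothetical.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 = 8 equally likely outcomes of three fair coin flips
outcomes = list(product("HT", repeat=3))
head_counts = Counter(flips.count("H") for flips in outcomes)
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}

def is_valid_distribution(dist):
    """Rule 1: each probability is in [0, 1]. Rule 2: probabilities sum to 1."""
    in_range = all(0 <= prob <= 1 for prob in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1)
    return in_range and sums_to_one

print(distribution)
print(is_valid_distribution(distribution))
```

`math.isclose` is used for the sum so the check also tolerates small floating-point error when probabilities are given as decimals.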

Expected Value & Standard Deviation of a Random Variable

Expected Value (Mean) of a Discrete Random Variable
\[ \mu_X = E(X) = \sum x_i \cdot P(X = x_i) \]
The long-run average value of X over many repetitions.
Also called the mean of the distribution.
Variance & Standard Deviation of a Discrete Random Variable
\[ \sigma_X^2 = \sum (x_i - \mu_X)^2 \cdot P(X = x_i) \] \[ \sigma_X = \sqrt{\sigma_X^2} \]
Measures how spread out the distribution is around its mean.
📌 Example: Expected Value

A game: roll a die. Win $10 if you roll a 6, lose $2 otherwise.

Outcome | x (winnings) | P(X = x) | x · P(x)
Roll a 6 | +$10 | 1/6 | +10/6 ≈ 1.667
Roll 1–5 | −$2 | 5/6 | −10/6 ≈ −1.667

\(\mu_X = \frac{10}{6} + \frac{-10}{6} = \mathbf{0}\)

Expected value = $0. This is a fair game — on average, neither player gains or loses money in the long run.
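The mean and standard deviation formulas translate directly into code. This sketch assumes the distribution is stored as a dictionary mapping each value to its probability; `mean_and_sd` is an illustrative name. It is applied to the die game above.

```python
import math

def mean_and_sd(distribution):
    """Return (mu, sigma) for a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in distribution.items())
    variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
    return mu, math.sqrt(variance)

# Die game: win $10 with probability 1/6, lose $2 with probability 5/6
die_game = {10: 1 / 6, -2: 5 / 6}
mu, sigma = mean_and_sd(die_game)

print(f"mean = {mu:.3f}, standard deviation = {sigma:.3f}")
```

Because the inputs are floats, the mean may print as a value extremely close to zero rather than exactly zero. The variance works out to 100(1/6) + 4(5/6) = 20, so a fair game can still have a large spread in individual results.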

Rules for Combining Random Variables

Linear Transformation & Combining Rules
\[ \mu_{X+Y} = \mu_X + \mu_Y \] \[ \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y \quad \text{(only if X and Y are independent)} \]
For \(Y = a + bX\):   \(\mu_Y = a + b\mu_X\)   and   \(\sigma_Y = |b|\sigma_X\)
⚠️ Variances Add — Standard Deviations Do NOT

When combining two independent random variables, variances add. You cannot add standard deviations directly. Always add variances first, then take the square root: \(\sigma_{X+Y} = \sqrt{\sigma^2_X + \sigma^2_Y}\)
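A quick numeric check makes the warning concrete. The standard deviations 3 and 4 below are hypothetical values chosen for illustration, and X and Y are assumed to be independent.

```python
import math

def sd_of_sum(sd_x, sd_y):
    """SD of X + Y for independent X and Y: add variances, then take the root."""
    return math.sqrt(sd_x ** 2 + sd_y ** 2)

sd_x, sd_y = 3.0, 4.0          # hypothetical standard deviations
correct = sd_of_sum(sd_x, sd_y)  # sqrt(9 + 16)
wrong = sd_x + sd_y              # adding SDs directly overstates the spread

print(f"correct: {correct}, adding SDs directly: {wrong}")
```

The combined standard deviation is smaller than the sum of the two standard deviations whenever both are positive.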

The Binomial Distribution

The binomial distribution models the number of successes in a fixed number of independent trials, each with the same probability of success.

📐 BINS Conditions for Binomial

B (Binary): Each trial has exactly two outcomes (success or failure).

I (Independent): Trials are independent of each other.

N (Number): Fixed number of trials \(n\).

S (Same probability): Each trial has the same probability of success \(p\).

Binomial Probability Formula
\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]
\(n\) = number of trials  |  \(k\) = number of successes  |  \(p\) = probability of success
\(\displaystyle\binom{n}{k} = \frac{n!}{k!(n-k)!}\) = number of ways to choose \(k\) successes from \(n\) trials
Binomial Mean & Standard Deviation
\[ \mu_X = np \qquad \sigma_X = \sqrt{np(1-p)} \]
These are the mean and standard deviation of the number of successes.
Binomial Distribution: n = 10, p = 0.3 vs p = 0.5 vs p = 0.7
[Figure: three bar charts of the binomial distribution with n = 10. p = 0.3 is skewed right, p = 0.5 is symmetric, p = 0.7 is skewed left.]
📌 Example: Binomial Calculation

A free-throw shooter makes 80% of shots. She attempts 5 free throws. What is the probability she makes exactly 3?

Check BINS: Binary (make/miss) ✓   Independent ✓   n = 5 (fixed) ✓   p = 0.80 (same) ✓

\(P(X = 3) = \binom{5}{3}(0.80)^3(0.20)^2 = 10 \times 0.512 \times 0.04 = \mathbf{0.2048}\)

Mean: \(\mu = np = 5(0.80) = 4\) makes

Std Dev: \(\sigma = \sqrt{5(0.80)(0.20)} = \sqrt{0.8} \approx 0.894\)
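The free-throw calculation can be reproduced with Python's standard library. The function names are illustrative; `math.comb(n, k)` computes the binomial coefficient.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1 - p)) of the number of successes."""
    return n * p, math.sqrt(n * p * (1 - p))

n, p = 5, 0.80
p_three = binomial_pmf(3, n, p)
mu, sigma = binomial_mean_sd(n, p)

print(f"P(X = 3) = {p_three:.4f}, mean = {mu:.1f}, sd = {sigma:.3f}")
```

Summing `binomial_pmf(k, n, p)` over k = 0, …, n should give 1, which is a useful sanity check on any hand calculation.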

💡 10% Condition

When sampling without replacement, trials are technically not independent. However, if the sample size \(n\) is less than 10% of the population size, we can proceed as if trials are independent and use the binomial distribution. This is called the 10% condition.
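To see why the 10% condition works, one can compare the exact probability for sampling without replacement (the hypergeometric calculation, which is beyond what this unit requires) with the binomial approximation. The population sizes below are hypothetical, and the function names are illustrative.

```python
import math

def hypergeom_pmf(k, population, successes, n):
    """Exact P(X = k) when n items are drawn without replacement."""
    return (
        math.comb(successes, k)
        * math.comb(population - successes, n - k)
        / math.comb(population, n)
    )

def binomial_pmf(k, n, p):
    """Binomial P(X = k), which treats the draws as independent."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, k, p = 10, 3, 0.3

# Large population: n = 10 is 1% of 1000, so the 10% condition holds.
large_exact = hypergeom_pmf(k, 1000, 300, n)
# Small population: n = 10 is 50% of 20, so the condition fails.
small_exact = hypergeom_pmf(k, 20, 6, n)
approx = binomial_pmf(k, n, p)

print(f"binomial: {approx:.4f}")
print(f"exact, population 1000: {large_exact:.4f}")
print(f"exact, population 20:   {small_exact:.4f}")
```

When the sample is a small fraction of the population, the exact and binomial answers nearly agree; when the sample is half the population, they differ substantially.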

The Geometric Distribution

The geometric distribution models the number of trials needed to get the first success. Unlike binomial, there is no fixed number of trials.

🔑 Binomial vs Geometric — The Key Difference

Binomial: Fixed \(n\) trials. Count the number of successes.

Geometric: No fixed \(n\). Count the number of trials until the first success.

Geometric Probability Formula
\[ P(X = k) = (1-p)^{k-1} \cdot p \]
\(k\) = trial on which first success occurs  |  \(p\) = probability of success on each trial
\((1-p)^{k-1}\) = probability of \((k-1)\) failures before the first success
Geometric Mean & Standard Deviation
\[ \mu_X = \frac{1}{p} \qquad \sigma_X = \frac{\sqrt{1-p}}{p} \]
On average, you need \(1/p\) trials to get the first success.
📌 Example: Geometric Distribution

A student guesses randomly on multiple-choice questions with 5 choices each. What is the probability the first correct answer is on the 3rd question?

\(p = 0.2\) (probability of correct guess)

\(P(X = 3) = (0.8)^2 \times 0.2 = 0.64 \times 0.2 = \mathbf{0.128}\)

Expected number of questions until first correct: \(\mu = 1/0.2 = 5\) questions

P(first correct by question 3): \(P(X \leq 3) = 1 - P(X > 3) = 1 - (0.8)^3 = 1 - 0.512 = \mathbf{0.488}\)
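The guessing example can be sketched as two small functions, one for "first success on trial k" and one for "first success by trial k". The function names are illustrative.

```python
def geometric_pmf(k, p):
    """P(X = k): k - 1 failures followed by the first success on trial k."""
    return (1 - p) ** (k - 1) * p

def geometric_cdf(k, p):
    """P(X <= k) = 1 - P(no success in the first k trials)."""
    return 1 - (1 - p) ** k

p = 0.2  # probability of guessing correctly with 5 choices
first_on_third = geometric_pmf(3, p)
by_third = geometric_cdf(3, p)
expected_trials = 1 / p

print(f"P(X = 3) = {first_on_third:.3f}")
print(f"P(X <= 3) = {by_third:.3f}")
print(f"expected number of questions = {expected_trials:.0f}")
```

The cumulative value equals the sum of `geometric_pmf(k, p)` for k = 1, 2, 3, which is a second way to reach the same answer.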

Feature | Binomial | Geometric
What we count | Number of successes | Number of trials until first success
Number of trials | Fixed: \(n\) | Not fixed; can go on indefinitely
Possible values | 0, 1, 2, …, n | 1, 2, 3, 4, …
Mean | \(\mu = np\) | \(\mu = 1/p\)
Formula | \(\binom{n}{k}p^k(1-p)^{n-k}\) | \((1-p)^{k-1}p\)

Multiple Choice Questions

Try each question, then reveal the answer and explanation.

MCQ · Q1 Basic Probability

A bag contains 4 red, 3 blue, and 5 green marbles. One marble is drawn at random. What is the probability that it is NOT green?

  • A 5/12
  • B 7/12
  • C 5/7
  • D 3/4
  • E 1/3
✓ Correct Answer: B — 7/12

Total marbles = 4 + 3 + 5 = 12. P(green) = 5/12.
P(not green) = 1 − 5/12 = 7/12. This uses the complement rule.

MCQ · Q2 Independence vs Mutual Exclusivity

Events A and B are mutually exclusive with P(A) = 0.3 and P(B) = 0.4. Which of the following is true?

  • A A and B are independent because they have no outcomes in common.
  • B P(A or B) = 0.12
  • C P(A or B) = 0.70 and A and B are dependent.
  • D P(A and B) = 0.12
  • E P(A | B) = 0.3
✓ Correct Answer: C

Mutually exclusive → P(A ∩ B) = 0, so P(A ∪ B) = 0.3 + 0.4 = 0.70.
They are dependent: P(A|B) = P(A ∩ B)/P(B) = 0/0.4 = 0 ≠ P(A) = 0.3. Knowing B occurred means A definitely didn't. Mutually exclusive events are always dependent (unless one has probability 0).

MCQ · Q3 Expected Value

A random variable X has the following distribution: P(X = 1) = 0.2, P(X = 3) = 0.5, P(X = 5) = 0.3. What is E(X)?

  • A 3.0
  • B 3.2
  • C 3.4
  • D 2.9
  • E 3.6
✓ Correct Answer: C — 3.4

\(E(X) = 1(0.2) + 3(0.5) + 5(0.3) = 0.2 + 1.5 + 1.5 = \mathbf{3.4}\)

MCQ · Q4 Binomial Distribution

A quality control inspector checks items from an assembly line. Each item has a 5% chance of being defective, independently. The inspector checks 20 items. What is the expected number of defective items and the standard deviation?

  • A μ = 1, σ = 0.95
  • B μ = 1, σ = 0.975
  • C μ = 1, σ ≈ 0.975
  • D μ = 5, σ ≈ 0.975
  • E μ = 1, σ ≈ 1.0
✓ Correct Answer: C

\(\mu = np = 20(0.05) = \mathbf{1}\)
\(\sigma = \sqrt{np(1-p)} = \sqrt{20(0.05)(0.95)} = \sqrt{0.95} \approx \mathbf{0.975}\)
This is a binomial setting: binary (defective/not), independent, n=20 fixed, p=0.05 constant.

MCQ · Q5 Geometric Distribution

A basketball player makes 70% of her free throws. She keeps shooting until she misses. What is the probability that her first miss occurs on the 4th shot?

  • A 0.700
  • B 0.343
  • C 0.103
  • D 0.0630
  • E 0.216
✓ Correct Answer: C

Here "success" = a miss, so p = 0.30 (probability of missing). A first miss on shot 4 means three makes followed by a miss.
\(P(X = 4) = (1-0.30)^{4-1} \times 0.30 = (0.70)^3 \times 0.30 = 0.343 \times 0.30 = 0.1029 \approx \mathbf{0.103}\)
Common slip: swapping the roles of make and miss gives \((0.30)^3(0.70) = 0.027 \times 0.70 = 0.0189\), which answers a different question. On the AP exam, always identify which outcome counts as the "success" before applying the geometric formula.

Free Response Questions

Write your full solution before checking. Show all work and use correct probability notation.

FRQ 1 — Random Variables & Expected Value

~12 minutes
A carnival game costs $3 to play. A player rolls a fair six-sided die once. If the result is 1 or 2, the player wins $8. If the result is 3 or 4, the player wins $2. If the result is 5 or 6, the player wins nothing.
(a) Let X = the player's net gain (winnings minus cost to play). Construct the probability distribution of X.
(b) Calculate E(X) and interpret it in context.
(c) Calculate the standard deviation of X. Show your work.
(d) Is this a fair game? Should a rational player play? Explain.
✓ Model Solution

(a) Probability Distribution of X:

Outcome | Net Gain X | P(X)
Roll 1 or 2 | $8 − $3 = +$5 | 2/6 = 1/3
Roll 3 or 4 | $2 − $3 = −$1 | 2/6 = 1/3
Roll 5 or 6 | $0 − $3 = −$3 | 2/6 = 1/3

(b) Expected Value:

\(E(X) = 5 \cdot \frac{1}{3} + (-1) \cdot \frac{1}{3} + (-3) \cdot \frac{1}{3} = \frac{5-1-3}{3} = \frac{1}{3} \approx \mathbf{\$0.33}\)

Interpretation: If the player plays this game many times, they can expect to gain about $0.33 per game on average. In the long run, the player comes out slightly ahead.


(c) Standard Deviation:

First find the variance: \(\sigma^2 = \sum(x_i - \mu)^2 P(x_i)\)

\(\sigma^2 = (5-\frac{1}{3})^2 \cdot \frac{1}{3} + (-1-\frac{1}{3})^2 \cdot \frac{1}{3} + (-3-\frac{1}{3})^2 \cdot \frac{1}{3}\)

\(= (4.667)^2/3 + (-1.333)^2/3 + (-3.333)^2/3 = 7.259 + 0.593 + 3.704 = \mathbf{11.556}\)

\(\sigma_X = \sqrt{11.556} \approx \mathbf{\$3.40}\)


(d) Is it a fair game?

This is not a fair game — a fair game would have E(X) = 0. Since E(X) = $0.33 > 0, the game actually favors the player slightly. A rational player who can afford the risk should play, since the expected gain is positive. However, the standard deviation of $3.40 means there is significant variability in outcomes.

✓ AP tip: Always interpret E(X) as a long-run average, not what happens in one play. Mention both E(X) and variability when discussing whether to play.

FRQ 2 — Binomial Distribution

~12 minutes
A pharmaceutical company claims that its new allergy medication is effective for 75% of patients. A doctor prescribes the medication to 8 patients with allergies.
(a) Verify that the binomial model is appropriate here. State all four conditions.
(b) Find the probability that exactly 6 of the 8 patients experience relief.
(c) Find the probability that at least 6 patients experience relief.
(d) Find the mean and standard deviation of the number of patients who experience relief. Interpret the mean in context.
✓ Model Solution

(a) Checking BINS conditions:

B (Binary): Each patient either experiences relief (success) or does not (failure). ✓

I (Independent): We assume one patient's response to the medication does not affect another patient's response. ✓

N (Number fixed): n = 8 patients. ✓

S (Same probability): p = 0.75 for each patient. ✓

All four conditions are met, so the binomial model is appropriate.


(b) P(X = 6):

\(P(X=6) = \binom{8}{6}(0.75)^6(0.25)^2 = 28 \times 0.17798 \times 0.0625 = \mathbf{0.3115}\)


(c) P(X ≥ 6) = P(X=6) + P(X=7) + P(X=8):

\(P(X=7) = \binom{8}{7}(0.75)^7(0.25)^1 = 8 \times 0.13348 \times 0.25 = 0.2670\)

\(P(X=8) = \binom{8}{8}(0.75)^8(0.25)^0 = 1 \times 0.10011 \times 1 = 0.1001\)

\(P(X \geq 6) = 0.3115 + 0.2670 + 0.1001 \approx \mathbf{0.679}\)


(d) Mean and Standard Deviation:

\(\mu = np = 8(0.75) = \mathbf{6}\) patients

\(\sigma = \sqrt{np(1-p)} = \sqrt{8(0.75)(0.25)} = \sqrt{1.5} \approx \mathbf{1.22}\) patients

Interpretation: If the doctor were to prescribe this medication to many groups of 8 patients, the average number experiencing relief would be 6 patients per group.

✓ AP tip: For (b) and (c), show the formula with numbers substituted. For (d), always interpret the mean using the context (not just "the mean is 6").
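The probabilities in parts (b) and (c) of FRQ 2 can be checked with a short script. This is a sketch for checking work, not a substitute for showing the formula on the exam; the variable names are illustrative. It computes P(X ≥ 6) two ways: by adding the three terms and by the complement rule.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 8, 0.75

# Direct sum: P(X = 6) + P(X = 7) + P(X = 8)
terms = {k: binomial_pmf(k, n, p) for k in range(6, n + 1)}
p_at_least_6 = sum(terms.values())

# Complement rule: 1 - P(X <= 5)
p_via_complement = 1 - sum(binomial_pmf(k, n, p) for k in range(0, 6))

for k, value in terms.items():
    print(f"P(X = {k}) = {value:.4f}")
print(f"P(X >= 6) = {p_at_least_6:.4f}")
```

Summing unrounded terms gives about 0.6785, while adding the three values already rounded to four decimals gives 0.6786. Carrying extra digits until the final step avoids this kind of small discrepancy.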
