Unit 7: Inference for Means

Section 7.1

Why t and Not z?

In Unit 6 we used z-procedures for proportions because we could compute the standard deviation of \(\hat{p}\) directly from \(p_0\). For means, the situation is different — the population standard deviation \(\sigma\) is almost never known in practice.

🔑 The Core Problem

The z-test for means requires knowing \(\sigma\). In real life we almost never know \(\sigma\), so we estimate it with the sample standard deviation \(s\).

But replacing \(\sigma\) with \(s\) introduces extra uncertainty. To account for this, we use the t-distribution instead of the Normal distribution — and the t-distribution has heavier tails to reflect that extra uncertainty.

Section 7.2

The t-Distribution

🔑 Key Facts About the t-Distribution

Shape: Symmetric and bell-shaped, like the Normal — but with heavier tails.

Degrees of freedom (df): One parameter that controls the shape. For one-sample procedures: \(df = n - 1\).

As df increases: The t-distribution approaches the Standard Normal. For df of about 30 or more the two are very close: at 95% confidence, t* = 2.042 for df = 30 versus z* = 1.960.

Critical value \(t^*\): Always larger than the corresponding z* for the same confidence level — because t-distributions have heavier tails.
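The heavier tails can be checked numerically. The sketch below is an illustration only, not something the AP exam asks for: it integrates the t density with Simpson's rule and compares P(T > 2) for a few df values with the Normal tail P(Z > 2). The helper names `t_pdf` and `t_upper_tail` are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the area lies above 0 by symmetry

# Normal tail P(Z > 2), from the complementary error function
normal_tail = 0.5 * math.erfc(2 / math.sqrt(2))

tails = {df: t_upper_tail(2.0, df) for df in (2, 5, 30)}
for df, p in tails.items():
    print(f"df = {df:>2}: P(T > 2) = {p:.4f}")
print(f"Normal: P(Z > 2) = {normal_tail:.4f}")
```

The tail areas shrink toward the Normal value as df grows, which is the same reason t* is larger than z* at any fixed confidence level.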

💡 On the AP Exam

The AP exam provides a t-table. To use it, find the row for your degrees of freedom (df = n − 1) and the column for your confidence level or tail probability.

Your calculator can also compute t* and p-values directly. Know both methods.

Section 7.3

One-Sample t Confidence Interval for \(\mu\)

One-Sample t-Interval for a Population Mean

\[ \bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}} \]

\(\bar{x}\) = sample mean | \(s\) = sample standard deviation | \(n\) = sample size
\(t^*\) = critical value from t-distribution with \(df = n-1\)
\(s/\sqrt{n}\) = standard error of the mean (SE)

Conditions for One-Sample t Procedures

📐 Three Conditions (RNI)

R — Random: Data from a random sample or randomized experiment.

N — Normal/Large Sample: Either:

• Population is Normally distributed, OR

• \(n \geq 30\) (CLT applies), OR

• Sample size is small BUT no strong skew or outliers in the data (check with a dotplot or histogram)

I — Independent: \(n \leq 10\%\) of population (if sampling without replacement).

⚠️ The Normal Condition for Means vs Proportions

Proportions (Unit 6): Large Counts — \(np \geq 10\) and \(n(1-p) \geq 10\)

Means (Unit 7): Normal/Large Sample — population Normal, OR \(n \geq 30\), OR small sample with no outliers/skew. These are completely different conditions for different procedures — don't mix them up.

📌 Example: One-Sample t CI

A random sample of 16 students has a mean study time of \(\bar{x} = 8.5\) hours/week with \(s = 2.4\) hours. Construct a 95% CI for the true mean study time. Assume the population is approximately Normal.

Conditions: Random ✓ | Normal (stated) ✓ | Independent (16 < 10% of all students) ✓

df = n − 1 = 15 | t* = 2.131 (from t-table, df=15, 95% CI)

\(SE = s/\sqrt{n} = 2.4/\sqrt{16} = 2.4/4 = 0.6\)

\(CI = 8.5 \pm 2.131(0.6) = 8.5 \pm 1.279\)

Interval: (7.221, 9.779) hours

✓ Interpretation: "We are 95% confident that the true mean study time for all students is between 7.22 and 9.78 hours per week."
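The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `t_interval` is hypothetical, and the critical value t* = 2.131 is typed in from the t-table rather than computed.

```python
import math

def t_interval(x_bar, s, n, t_star):
    """One-sample t interval from summary statistics and a table value t*."""
    se = s / math.sqrt(n)      # standard error of the mean
    margin = t_star * se       # margin of error
    return x_bar - margin, x_bar + margin

# Study-time example: n = 16, df = 15, t* = 2.131 from the t-table
low, high = t_interval(x_bar=8.5, s=2.4, n=16, t_star=2.131)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # 95% CI: (7.221, 9.779)
```

Swapping in a different t* (for example, the 90% or 99% column of the same row) shows how the confidence level changes the width of the interval.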

Section 7.4

One-Sample t Test for \(\mu\)

One-Sample t Test Statistic

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

\(\mu_0\) = the hypothesized value of \(\mu\) from \(H_0\)
\(df = n - 1\) | p-value found from the t-distribution

💡 z vs t: Which to Use?

Situation	Use	Why
Inference for a proportion	z-procedure	SE uses p₀ or p̂, which is known
Inference for a mean, σ known	z-procedure	Rare in practice — σ almost never known
Inference for a mean, σ unknown	t-procedure	Must estimate σ with s — adds uncertainty

📌 Example: One-Sample t Test

The label on a bottling machine claims it fills bottles with 500 mL on average. A quality inspector takes a random sample of 25 bottles and finds \(\bar{x} = 497.3\) mL and \(s = 6.8\) mL. Is there evidence that the machine is underfilling? Use α = 0.05.

Step 1 — Hypotheses: \(H_0: \mu = 500\) vs \(H_a: \mu < 500\) (one-sided left)

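Step 2 — Conditions: Random (stated) ✓ | Normal: n = 25 is below 30, so we assume the population of fill volumes is approximately Normal (with the raw data, check a dotplot for strong skew or outliers) ✓ | Independent: 25 bottles is less than 10% of all bottles the machine fills ✓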

Step 3 — Calculate:

\(t = \frac{497.3 - 500}{6.8/\sqrt{25}} = \frac{-2.7}{1.36} \approx -1.985\)

\(df = 24\) | p-value = P(t < −1.985) ≈ 0.029

Step 4 — Conclude: Since p-value (0.029) < α (0.05), we reject \(H_0\). There is convincing evidence that the true mean fill amount is less than 500 mL — the machine appears to be underfilling.
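As a check on Step 3, here is a sketch that computes the test statistic and approximates the left-tail p-value without a statistics library. It integrates the t density numerically with Simpson's rule; the helper names are made up for illustration, and on the exam a calculator's t-test or tcdf function does this job.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def one_sample_t(x_bar, mu_0, s, n):
    """One-sample t statistic from summary statistics."""
    return (x_bar - mu_0) / (s / math.sqrt(n))

# Bottling example: H0: mu = 500 vs Ha: mu < 500
t_stat = one_sample_t(x_bar=497.3, mu_0=500, s=6.8, n=25)
df = 25 - 1
# The t-distribution is symmetric, so P(T < -|t|) equals P(T > |t|)
p_value = t_upper_tail(abs(t_stat), df)

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```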

Section 7.5

Two-Sample t Procedures

When comparing means from two independent groups, we use two-sample t procedures. The two groups must be independent — observations in one group do not affect the other.

Two-Sample t Confidence Interval for \(\mu_1 - \mu_2\)

\[ (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]

Two-Sample t Test Statistic

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

Degrees of freedom: Use your calculator (Welch's df formula is complex).
On the AP exam, always use technology for df in two-sample t procedures — the formula is not required.

⚠️ Do NOT Pool Standard Deviations

Unlike the two-proportion z-test, which uses a pooled proportion, the AP Statistics course does not use a pooled standard deviation for two-sample t-tests. Always use the unpooled formula shown above, which uses \(s_1\) and \(s_2\) separately. Pooling requires the extra assumption that the two populations have equal standard deviations, which AP Statistics avoids.

📌 Example: Two-Sample t Test

Do males and females differ in sleep duration? Random samples: Males (n₁=20): \(\bar{x}_1=7.1\)h, \(s_1=1.2\)h. Females (n₂=25): \(\bar{x}_2=7.6\)h, \(s_2=0.9\)h. Test at α = 0.05.

Hypotheses: \(H_0: \mu_1 = \mu_2\) vs \(H_a: \mu_1 \neq \mu_2\)

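Conditions: Both random ✓ | Normal: both samples are below 30 (n₁ = 20, n₂ = 25), so we assume each population is approximately Normal; with the raw data, check each sample for strong skew or outliers ✓ | Independent groups ✓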

\(t = \frac{(7.1-7.6)-0}{\sqrt{1.44/20 + 0.81/25}} = \frac{-0.5}{\sqrt{0.072+0.0324}} = \frac{-0.5}{\sqrt{0.1044}} = \frac{-0.5}{0.323} \approx -1.55\)

Using calculator: df ≈ 34, p-value = 2P(t < −1.55) ≈ 0.130

Since p-value (0.130) > α (0.05), fail to reject \(H_0\). There is not convincing evidence of a difference in mean sleep duration between males and females.
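To see where the calculator's numbers come from, the sketch below computes the unpooled test statistic and Welch's degrees of freedom from the summary statistics. The Welch formula is not required on the AP exam; it appears here only to show what the calculator is doing. The function name `welch_t` is hypothetical.

```python
import math

def welch_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t statistic and Welch's approximate df."""
    v1, v2 = s1**2 / n1, s2**2 / n2   # each group's variance of the sample mean
    se = math.sqrt(v1 + v2)           # standard error of the difference in means
    t = (x1 - x2) / se
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Sleep example: males (group 1) vs females (group 2)
t_stat, df = welch_t(7.1, 1.2, 20, 7.6, 0.9, 25)
print(f"t = {t_stat:.2f}, Welch df = {df:.1f}")
```

Welch's df is usually not a whole number and falls between the smaller of n₁ − 1 and n₂ − 1 and the total n₁ + n₂ − 2.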

Section 7.6

Matched Pairs Design

When observations come in natural pairs — the same subject measured twice, or two subjects matched on relevant characteristics — we use a matched pairs t-test. This is actually just a one-sample t-test on the differences.

🔑 The Matched Pairs Trick

Calculate the difference for each pair: \(d_i = x_{1i} - x_{2i}\)

Then treat the differences as a single dataset and run a one-sample t-test on \(\{d_i\}\).

Hypotheses: \(H_0: \mu_d = 0\) vs \(H_a: \mu_d \neq 0\) (or >0 or <0)

💡 Matched Pairs vs Two-Sample — How to Tell

Matched pairs: Same subject measured twice (before/after), or subjects paired by a common characteristic. n pairs → one dataset of n differences.

Two-sample t: Two completely separate, independent groups of subjects. Two separate datasets. The AP exam frequently tests your ability to identify which design was used.

Section 7.7

Conditions Summary & Procedure Selection

Procedure	When to Use	df	Key Formula
One-sample t CI	Estimating single population mean μ	\(n-1\)	\(\bar{x} \pm t^* \cdot s/\sqrt{n}\)
One-sample t test	Testing claim about single μ	\(n-1\)	\(t = (\bar{x}-\mu_0)/(s/\sqrt{n})\)
Two-sample t CI	Estimating difference \(\mu_1-\mu_2\) for two independent groups	Use calculator	\((\bar{x}_1-\bar{x}_2) \pm t^*\sqrt{s_1^2/n_1+s_2^2/n_2}\)
Two-sample t test	Testing \(\mu_1=\mu_2\) for two independent groups	Use calculator	\(t = (\bar{x}_1-\bar{x}_2)/\sqrt{s_1^2/n_1+s_2^2/n_2}\)
Matched pairs t	Two paired measurements — same subjects or matched pairs	\(n-1\) (n = # pairs)	\(t = \bar{d}/(s_d/\sqrt{n})\)

💡 AP Exam Checklist — Every t Procedure

✓ State hypotheses (test) or confidence level (CI) with correct parameter notation (\(\mu\), not \(\bar{x}\))

✓ Check ALL three conditions: Random, Normal/Large Sample, Independent

✓ For Normal condition with small samples: mention "no strong skew or outliers"

✓ Show formula, substitute values, compute t and df

✓ Conclude in context: "convincing evidence" (reject \(H_0\)) or "not convincing evidence" (fail to reject \(H_0\))

Exam Practice

Multiple Choice Questions

Try each question before reading the answer and explanation that follow it.

MCQ · Q1 z vs t

A researcher wants to test whether the mean commute time for workers in a city exceeds 30 minutes. She takes a random sample of 22 workers and records their commute times. The population standard deviation is unknown. Which test is most appropriate?

(A) One-sample z-test for a mean
(B) One-sample t-test for a mean
(C) Two-sample t-test for a difference in means
(D) One-sample z-test for a proportion
(E) Matched pairs t-test

✓ Correct Answer: B

One group, one mean, σ unknown → one-sample t-test. We can't use z because σ is unknown. There is only one group (not two), and no pairing — so two-sample and matched pairs are ruled out.

MCQ · Q2 Degrees of Freedom

A one-sample t-test is conducted using a sample of size n = 18. What are the degrees of freedom, and how does the corresponding t* for 95% confidence compare to z* = 1.960?

(A) df = 18; t* = 1.960, same as z*
(B) df = 17; t* < 1.960
(C) df = 17; t* > 1.960
(D) df = 18; t* > 1.960
(E) df = 17; t* = 1.960

✓ Correct Answer: C

\(df = n - 1 = 17\). The t-distribution with df=17 has heavier tails than the Normal, so t* for 95% confidence is larger than z* = 1.960. Specifically, t* ≈ 2.110. This is always the case: t* > z* for finite degrees of freedom.

MCQ · Q3 Matched Pairs vs Two-Sample

Researchers study the effect of a new training program on employee productivity. They measure each employee's productivity score before the training and again after. Which analysis is most appropriate?

(A) Two-sample t-test comparing before scores to after scores
(B) One-sample z-test for the mean difference
(C) Matched pairs t-test on the differences (after − before)
(D) Two-sample t-test for the difference in means
(E) Chi-square test for independence

✓ Correct Answer: C

The same employees are measured twice (before and after) — this is a matched pairs design. The correct analysis is to compute the difference for each employee and run a one-sample t-test on those differences. A two-sample t-test would be wrong here because the two measurements are not independent — they come from the same people.

MCQ · Q4 Normal Condition for t

A researcher has a random sample of 8 measurements. A dotplot of the data shows a slight right skew and no outliers. Is it appropriate to use a one-sample t-procedure?

(A) No, because n < 30 so the CLT does not apply.
(B) No, because the data is skewed so we cannot use t-procedures.
(C) Yes, because the sample size is large enough.
(D) Yes, because with no strong skew or outliers, the Normal condition is satisfied even for small samples.
(E) Only if the population standard deviation is known.

✓ Correct Answer: D

For small samples, the Normal condition is met if the data shows no strong skew or outliers. The CLT (n ≥ 30) is only one way to satisfy the condition — it is not the only way. A slight skew with no outliers is acceptable. (A) is too strict — n ≥ 30 is a sufficient but not necessary condition.

MCQ · Q5 t CI Interpretation

A 95% confidence interval for the mean number of hours per week adults spend on social media is (8.2, 14.6). A researcher claims that adults spend more than 10 hours per week on average. Is this claim supported?

(A) Yes, because the interval is entirely above 8 hours.
(B) No, because 10 is contained in the interval, so the claim is refuted.
(C) The interval does not support or refute the claim; we need a hypothesis test.
(D) No, because we cannot conclude μ > 10 since values below 10 (like 8.2) are also plausible.
(E) Yes, because the midpoint of the interval (11.4) is above 10.

✓ Correct Answer: D

The interval (8.2, 14.6) contains values both below and above 10. Since the interval includes values like 8.2 and 9.5 which are below 10, we cannot conclude that μ > 10. The entire interval would need to be above 10 to support the claim that μ > 10. (C) is tempting but wrong — a CI can and does give information about one-sided claims.

Exam Practice

Free Response Questions

Always use the 4-step procedure. State conditions carefully — the Normal condition for means is different from proportions.

FRQ 1 — Matched Pairs t-Test

~15 minutes

A nutritionist wants to determine whether a new diet reduces systolic blood pressure. Eight patients have their blood pressure measured before and after 8 weeks on the diet. The results (in mmHg) are:

Before: 148, 162, 155, 171, 142, 166, 158, 175

After: 141, 153, 149, 160, 138, 157, 150, 164

(a) Explain why a matched pairs analysis is appropriate here rather than a two-sample t-test.

(b) Compute the differences (Before − After) for each patient. Find \(\bar{d}\) and \(s_d\).

(c) Perform a matched pairs t-test at α = 0.05 to determine if the diet reduces blood pressure. Show all four steps.

✓ Model Solution

(a) Why matched pairs:

The same 8 patients are measured twice — before and after the diet. The two measurements for each patient are not independent: a patient with naturally high blood pressure will tend to have high readings both before and after. By taking differences within each patient, we control for individual variation in baseline blood pressure. A two-sample test would be inappropriate because the two groups (before/after) are not independent samples.

(b) Differences (Before − After):

7, 9, 6, 11, 4, 9, 8, 11

\(\bar{d} = (7+9+6+11+4+9+8+11)/8 = 65/8 = \mathbf{8.125}\) mmHg

\(s_d\): deviations from mean: −1.125, 0.875, −2.125, 2.875, −4.125, 0.875, −0.125, 2.875

\(s_d = \sqrt{\frac{\sum(d_i-\bar{d})^2}{n-1}} = \sqrt{\frac{1.266+0.766+4.516+8.266+17.016+0.766+0.016+8.266}{7}} = \sqrt{40.875/7} \approx \mathbf{2.416}\)

(c) Matched Pairs t-Test — 4 Steps:

Step 1 — Hypotheses: Let \(\mu_d\) = true mean difference (Before − After) in blood pressure.

\(H_0: \mu_d = 0\) vs \(H_a: \mu_d > 0\) (one-sided: we're testing whether the diet reduces BP, i.e., differences are positive)

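Step 2 — Conditions: Random: the problem does not say how the 8 patients were chosen, so we proceed assuming they can be treated as a random sample of patients who might use the diet (state this assumption in your answer) | Normal: n = 8 is small, but the differences (7, 9, 6, 11, 4, 9, 8, 11) show no strong skew or outliers ✓ | Independent: patients are independent of each other ✓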

Step 3 — Calculate:

\(t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{8.125}{2.416/\sqrt{8}} = \frac{8.125}{0.854} \approx \mathbf{9.51}\)

\(df = n-1 = 7\) | p-value = P(t > 9.51) < 0.0001 (df = 7)

Step 4 — Conclude: Since p-value (< 0.0001) < α (0.05), we reject \(H_0\). There is very convincing evidence that the new diet reduces systolic blood pressure on average.

✓ AP tip: Part (a) must say "same subjects measured twice" — not just "they're paired." Part (c) hypothesis must use μ_d (the mean difference), not μ₁ and μ₂.
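The matched pairs calculation in parts (b) and (c) can be sketched in Python using only the standard library. The variable names are chosen for this illustration; `statistics.stdev` uses the n − 1 divisor, matching the formula for \(s_d\) above.

```python
import math
import statistics

before = [148, 162, 155, 171, 142, 166, 158, 175]
after = [141, 153, 149, 160, 138, 157, 150, 164]

# Reduce the paired data to a single list of differences (Before - After)
diffs = [b - a for b, a in zip(before, after)]

d_bar = statistics.mean(diffs)    # mean difference
s_d = statistics.stdev(diffs)     # sample SD of the differences (n - 1 divisor)
n = len(diffs)

# One-sample t statistic on the differences, testing H0: mu_d = 0
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

print(f"differences = {diffs}")
print(f"d_bar = {d_bar}, s_d = {s_d:.3f}, t = {t_stat:.2f}, df = {n - 1}")
```

Note that the two original lists never enter the t formula directly: once the differences are formed, the procedure is an ordinary one-sample t-test.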

FRQ 2 — Two-Sample t Confidence Interval

~12 minutes

A school district wants to compare reading scores between two independent schools. A random sample of 30 students from School A has \(\bar{x}_A = 74.2\) with \(s_A = 8.5\). A random sample of 35 students from School B has \(\bar{x}_B = 69.8\) with \(s_B = 11.2\).

(a) Check the conditions for constructing a two-sample t confidence interval.

(b) Construct a 95% confidence interval for \(\mu_A - \mu_B\). Use df = 58 and t* = 2.002.

(c) Interpret the interval in context and comment on whether there is convincing evidence of a difference.

✓ Model Solution

(a) Conditions:

Random: Both samples are random samples from their respective schools. ✓

Normal/Large Sample: \(n_A = 30 \geq 30\) ✓ and \(n_B = 35 \geq 30\) ✓ — CLT applies for both groups.

Independent: The two schools are separate, independent groups. 30 and 35 students are each less than 10% of their school's population. ✓

(b) 95% Confidence Interval:

\(SE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}} = \sqrt{\frac{72.25}{30} + \frac{125.44}{35}} = \sqrt{2.408 + 3.584} = \sqrt{5.992} \approx 2.448\)

\(CI = (74.2 - 69.8) \pm 2.002(2.448) = 4.4 \pm 4.901\)

Interval: (−0.501, 9.301)

(c) Interpretation and conclusion:

We are 95% confident that the true difference in mean reading scores (School A minus School B) is between −0.501 and 9.301 points.

Since the interval contains 0, we do not have convincing evidence of a difference in mean reading scores between the two schools at the 95% confidence level. Both positive and negative differences are plausible — we cannot conclude that one school outperforms the other.

✓ AP tip: Always check whether 0 is in the CI for two-sample problems. For a two-sided test at the matching significance level (α = 0.05 for a 95% CI): if 0 is in the interval → fail to reject H₀; if not → reject H₀. This connects CIs to hypothesis tests.
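The interval in part (b) can be reproduced with the short sketch below, which takes t* = 2.002 as supplied in the problem. For comparison only, it also evaluates Welch's df formula on the same summary statistics; that value comes out slightly higher than the df = 58 given in the problem, which would change t* only slightly and leaves the conclusion the same. Variable names are chosen for this illustration.

```python
import math

x_a, s_a, n_a = 74.2, 8.5, 30     # School A summary statistics
x_b, s_b, n_b = 69.8, 11.2, 35    # School B summary statistics
t_star = 2.002                    # given in the problem (df = 58)

v_a, v_b = s_a**2 / n_a, s_b**2 / n_b
se = math.sqrt(v_a + v_b)         # unpooled standard error
diff = x_a - x_b
low, high = diff - t_star * se, diff + t_star * se

# Welch's approximate df, shown for comparison with the supplied df
welch_df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))

print(f"SE = {se:.3f}")
print(f"95% CI for mu_A - mu_B: ({low:.3f}, {high:.3f})")
print(f"Welch df = {welch_df:.1f}")
print("0 is inside the interval" if low < 0 < high else "0 is outside the interval")
```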