Checking Conditions for a Difference in Population Proportions
When performing a significance test for a difference in proportions, we assume the null hypothesis is true.
Because the null assumes no difference, we must use a combined (pooled) sample proportion, p̂_c.
The combined proportion is calculated as:
p̂_c = (x₁ + x₂) / (n₁ + n₂)
The three conditions to check are Random, 10%, and Large Counts.
For the Large Counts condition, expected successes and failures are calculated using the combined proportion, not the individual sample proportions.
Hi everyone, Mr. Antonucci here. In this video, we’re going to talk about checking the conditions for a significance test about a difference between two population proportions.
When performing this test, we assume the null hypothesis is true, meaning we assume there is no difference between the two population proportions. Because of that assumption, this test is a little different from what we’ve done before.
Since we are assuming the population proportions are equal, we estimate that common value with what is called the combined, or pooled, sample proportion. We call this value p̂_c.
To calculate p̂_c, we combine the data from both samples and treat them as if they came from one larger sample.
The combined sample proportion is calculated as:
p̂_c = (x₁ + x₂) / (n₁ + n₂)
In other words:
x₁ and x₂ are the number of successes in each sample
n₁ and n₂ are the sample sizes
You can also think of x₁ and x₂ as:
x₁ = n₁ · p̂₁
x₂ = n₂ · p̂₂
So p̂_c gives us the overall proportion of successes across both samples combined.
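If it helps to see the formula as a calculation, here is a minimal Python sketch. The function name pooled_proportion and the counts in the example are made up for illustration; they are not from the video.

```python
def pooled_proportion(x1: int, n1: int, x2: int, n2: int) -> float:
    """Combined (pooled) sample proportion: total successes / total sample size."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    if not (0 <= x1 <= n1 and 0 <= x2 <= n2):
        raise ValueError("successes must be between 0 and the sample size")
    return (x1 + x2) / (n1 + n2)


# Illustrative counts only (hypothetical): 30 of 50 and 45 of 100.
p_c = pooled_proportion(30, 50, 45, 100)
print(p_c)  # 75 / 150 = 0.5
```

Notice that the result, 0.5, falls between the two individual sample proportions (0.60 and 0.45) and sits closer to the one from the larger sample.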
The AP Statistics formula sheet, which is available on both sections of the exam, includes this combined sample proportion, but the key is knowing when to use it. We use p̂_c when we are performing a significance test for a difference in proportions and assuming no difference under the null hypothesis.
The conditions for performing a significance test about a difference between two population proportions are the same three we’ve seen before:
Random
10%
Large Counts
However, we need to be careful about how we verify them.
For the random condition, we must verify that the data come from two independent random samples or from two groups in a randomized experiment.
This condition ensures that the samples are independent of each other.
The 10% condition states that if we are sampling without replacement, each sample must be less than 10% of its population.
This condition helps justify independence within each sample.
For the Large Counts condition, the expected number of successes and failures in each sample must be at least 10.
Important note:
To calculate these expected counts, we do not use the individual sample proportions. Instead, we use the combined (pooled) proportion, p̂_c.
Because we must check:
successes and failures for Sample 1, and
successes and failures for Sample 2,
we end up checking four expected counts in our work.
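As a rough sketch of that four-part check, the Python below computes each expected count from the pooled proportion and compares it to 10. The function name large_counts_check, the dictionary labels, and the example counts are assumptions made for illustration.

```python
def large_counts_check(x1, n1, x2, n2, minimum=10):
    """Return the four expected counts and whether all of them reach the minimum."""
    # Pooled proportion: treat both samples as one larger sample.
    p_c = (x1 + x2) / (n1 + n2)
    expected = {
        "sample 1 successes": n1 * p_c,
        "sample 1 failures": n1 * (1 - p_c),
        "sample 2 successes": n2 * p_c,
        "sample 2 failures": n2 * (1 - p_c),
    }
    all_large = all(count >= minimum for count in expected.values())
    return expected, all_large


# Illustrative counts only (hypothetical): 30 of 50 and 45 of 100.
counts, ok = large_counts_check(30, 50, 45, 100)
for label, count in counts.items():
    print(f"{label}: {count:.1f}")
print("Large Counts met:", ok)
```

With small samples the check can fail: 2 successes out of 20 and 3 out of 30 give a pooled proportion of 0.1, so the expected successes are only 2 and 3.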
Let’s return to the example comparing drive-through accuracy at McDonald’s and Wendy’s.
At McDonald’s, 145 of 159 drive-through orders were accurate.
At Wendy’s, 139 of 163 drive-through orders were accurate.
We are working toward determining whether there is convincing evidence of a difference in the population proportions. The next step is to check whether the conditions for performing the test are met.
Remember, you cannot simply write “Random, 10%, Large Counts” and check them off. You must justify each condition in context.
Independent random samples of 159 drive-through orders from McDonald’s and 163 drive-through orders from Wendy’s were taken. Because the samples come from two different restaurants, it is reasonable to treat them as independent.
It is reasonable to assume that:
159 orders is less than 10% of all McDonald’s drive-through orders, and
163 orders is less than 10% of all Wendy’s drive-through orders.
So the 10% condition is satisfied.
First, calculate the combined sample proportion:
p̂_c = (145 + 139) / (159 + 163) = 284 / 322 ≈ 0.882
Now calculate the expected counts:
McDonald’s
Expected successes: 159 · 0.882 ≈ 140.2
Expected failures: 159 · (1 − 0.882) ≈ 18.8
Wendy’s
Expected successes: 163 · 0.882 ≈ 143.8
Expected failures: 163 · (1 − 0.882) ≈ 19.2
All expected counts are at least 10, so the Large Counts condition is satisfied.
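To double-check the arithmetic for this example, here is a short Python sketch using the drive-through counts. It keeps the unrounded value of p̂_c; the variable names are my own choice.

```python
# Drive-through data from the example: accurate orders and total orders.
x_mcd, n_mcd = 145, 159
x_wen, n_wen = 139, 163

# Combined (pooled) sample proportion, kept unrounded.
p_c = (x_mcd + x_wen) / (n_mcd + n_wen)
print(f"p_c = {p_c:.3f}")  # p_c = 0.882

# Four expected counts: successes and failures for each restaurant.
drive_through_counts = [
    n_mcd * p_c,        # McDonald's expected successes
    n_mcd * (1 - p_c),  # McDonald's expected failures
    n_wen * p_c,        # Wendy's expected successes
    n_wen * (1 - p_c),  # Wendy's expected failures
]
print([round(c, 1) for c in drive_through_counts])  # [140.2, 18.8, 143.8, 19.2]
print(all(c >= 10 for c in drive_through_counts))   # True
```

A quick sanity check: the two expected success counts add up to 284, the total number of accurate orders across both samples.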
Because the Random, 10%, and Large Counts conditions are all met, it is appropriate to perform a significance test for a difference in population proportions for this situation.
That’s it for this video. I hope this was helpful in understanding how to check the conditions for this type of test. Feel free to reach out if you have any questions.