Topic 11: Hypothesis Tests and Confidence Intervals for Categorical Data

function ok_checkbox(response, n) {
  if (!response || response.length === 0) 
    return html`<span style="color:purple">You haven't answered yet.</span>`;
  if (response.toString() === n) 
    return html`<span style="color:green">Correct ✓</span>`;
  return html`<span style="color:red">Not Yet! ✗</span>`;
}

About

This activity formally introduces confidence intervals and hypothesis tests as tools for statistical inference. We work through two motivating problems — one confidence interval and one hypothesis test — in the context of real survey data about immigration policy.

Hypothesis Testing and Confidence Intervals for Categorical Data

In this activity, we continue our exploration of statistical inference. We’ll cover the basic hypothesis testing framework in addition to discussing confidence intervals more formally than we did in the previous activity.

We’ll motivate this activity by watching three short videos from volunteers at OpenIntro.org. The first two are from Dr. David Diez, a data scientist, and the last is from Dr. Shannon McClintock, a member of the statistics faculty at Cal Poly. After each video, you’ll walk through a hands-on application of the video content to a real scenario.

Variability in Point Estimates

Our first video discusses variability in point estimates. Some of the content will sound familiar from the previous activity. Watch the video, then we’ll engage with the ideas by walking through an example together.

An Example: A June 2020 Pew Research survey revealed that 74% of Americans support offering a path to citizenship for undocumented immigrants who were brought to the US illegally as children — often referred to as DREAMers.

We’ve discussed the impossibility of a true census, so the Pew study did not poll every single American. Instead, they surveyed 9,654 US adults between June 4 and June 10, 2020. You can find out more about the study logistics here if you are interested. This means that the 74% referenced in the article is the proportion of individuals from the study who were in favor of a path to citizenship for the DREAMers.

Check Your Understanding: Terminology I

The 74% from the Pew Research article is a/an (select all that apply):

viewof q1 = Inputs.checkbox(
  new Map([
    ["sample statistic.", 1],
    ["population parameter.", 2],
    ["point estimate.", 3],
    ["observation.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q1_selected") ?? "[]") ?? []}
);

{
  localStorage.setItem("q1_selected", JSON.stringify(q1));
  localStorage.setItem("q1_correct", "1,3");
  localStorage.setItem("q1_result", (!q1 || q1.length === 0) ? "unattempted" : (q1.toString() === "1,3" ? "correct" : "incorrect"));
}

ok_checkbox(q1.toString(), "1,3");

Note on Terminology

We often use sample statistic and point estimate interchangeably. A sample statistic serves as a point estimate for the corresponding population parameter.

Check Your Understanding: Terminology II

What is the parameter of interest here?

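// Feedback for single-answer (radio) questions; mirrors ok_checkbox.
function ok_response(response, n) {
  if (response === null || response === undefined)
    return html`<span style="color:purple">You haven't answered yet.</span>`;
  if (response.toString() === n)
    return html`<span style="color:green">Correct ✓</span>`;
  return html`<span style="color:red">Not Yet! ✗</span>`;
}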
viewof q2 = Inputs.radio(
  new Map([
    ["The sample proportion of American adults who are in favor of a citizenship option for DREAMers.", 1],
    ["The true proportion of American adults who are in favor of a citizenship option for DREAMers.", 2],
    ["74% of American adults.", 3],
    ["All DREAMers.", 4],
    ["All American adults.", 5]
  ]),
  {value: JSON.parse(localStorage.getItem("q2_selected") ?? "null")}
);

{
  localStorage.setItem("q2_selected", JSON.stringify(q2));
  localStorage.setItem("q2_correct", "2");
  localStorage.setItem("q2_result", q2 === null ? "unattempted" : (q2 == 2 ? "correct" : "incorrect"));
}

ok_response(q2, "2");

Check Your Understanding: Terminology III

According to the methodology document, the 9,654 participants were a random sample representative of the population of American adults. If the study were completed again with a new set of 9,654 participants, we would expect:

viewof q3 = Inputs.radio(
  new Map([
    ["A similar but slightly different result.", 1],
    ["Exactly the same result.", 2],
    ["A completely different result.", 3],
    ["It is impossible to determine.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q3_selected") ?? "null")}
);

{
  localStorage.setItem("q3_selected", JSON.stringify(q3));
  localStorage.setItem("q3_correct", "1");
  localStorage.setItem("q3_result", q3 === null ? "unattempted" : (q3 == 1 ? "correct" : "incorrect"));
}

ok_response(q3, "1");

The code below simulates a random sample of 9,654 individuals for which there is a 74% chance the individual is in support of a path to citizenship for DREAMers. This should look somewhat familiar if you completed the lab activity where we simulated shots taken by a basketball player in order to investigate the hot hand phenomenon. Run the code a few times to see the results.
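
A minimal sketch of such a simulation in base R follows. It assumes each simulated respondent independently supports a path to citizenship with probability 0.74. The variable names (`n`, `p`, `responses`, `p_hat`) are illustrative and may differ from those in the activity's own code cell.

```r
n <- 9654   # sample size from the Pew study
p <- 0.74   # assumed population proportion in support

# Draw n simulated responses; each is "support" with probability p
responses <- sample(c("support", "oppose"), size = n, replace = TRUE,
                    prob = c(p, 1 - p))

# Sample proportion in support for this simulated sample
p_hat <- mean(responses == "support")
p_hat
```

Because no seed is set, each run produces a slightly different value of `p_hat`.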

By running the code block above multiple times, you’ve probably seen that most of the samples resulted in a sample proportion within about one percentage point (0.01) of the assumed proportion \(p = 0.74\).
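
Rather than rerunning one simulation by hand, the same observation can be checked over many simulated samples at once. The sketch below is one possible approach, not the activity's own code: it uses `rbinom()` to draw the number of supporters in each sample directly, and the seed and the number of simulations (`n_sims`) are arbitrary choices.

```r
set.seed(11)     # arbitrary seed so the sketch is reproducible
n <- 9654        # sample size
p <- 0.74        # assumed population proportion
n_sims <- 1000   # number of simulated samples (arbitrary)

# rbinom() returns the count of supporters in each simulated sample
p_hats <- rbinom(n_sims, size = n, prob = p) / n

# Share of simulated sample proportions within 0.01 of p
mean(abs(p_hats - p) <= 0.01)
```

A share close to 1 is consistent with the claim that most samples land within about one percentage point of \(p = 0.74\).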

In the video, Dr. Diez discusses how we can use the Central Limit Theorem to quantify how much variability we should see in the point estimate from one sample to the next. In the case of a single proportion, the Central Limit Theorem states:

Central Limit Theorem for a Sample Proportion

When observations are independent and the sample size is sufficiently large, the sample proportion \(\hat{p}\) will tend to follow a normal distribution with mean \(\mu = p\) (the true population proportion) and standard error \(\displaystyle{S_E = \sqrt{\frac{p\left(1-p\right)}{n}}}\). That is:

\[\hat{p} \sim N\!\left(\mu = p,\ S_E = \sqrt{\frac{p\left(1-p\right)}{n}}\right)\]

A sample is typically considered sufficiently large when the success-failure condition is satisfied. The condition requires that the sample is large enough that we should expect at least 10 “successes” and at least 10 “failures”.
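
The condition can be checked by comparing the two expected counts against 10. The helper below, `success_failure_ok()`, is a hypothetical function written for illustration, and the inputs are made-up values rather than the Pew figures.

```r
# Hypothetical helper: TRUE when both expected counts are at least `threshold`
success_failure_ok <- function(n, p, threshold = 10) {
  expected_successes <- n * p
  expected_failures  <- n * (1 - p)
  expected_successes >= threshold && expected_failures >= threshold
}

# Illustrative values only (not from the Pew study)
success_failure_ok(n = 200, p = 0.5)   # expected counts: 100 and 100
success_failure_ok(n = 50,  p = 0.1)   # expected counts: 5 and 45
```

The same function can be applied to any sample size and proportion, including those in the questions that follow.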

Use the code block below to answer the questions that follow.

Check Your Understanding: Success-Failure Condition I

The sample size is sufficiently large if the success-failure condition is satisfied. What is the success-failure condition? Select all that apply.

viewof q4 = Inputs.checkbox(
  new Map([
    ["We should expect at least 10 observations in each group (here: in favor / not in favor).", 1],
    ["If the population proportion is p and the sample size is n, then n⋅p ≥ 10 and n⋅(1−p) ≥ 10.", 2],
    ["There should be a possibility that we succeed but also that we fail.", 3],
    ["There must be at least one success and one failure.", 4],
    ["Failure is an option.", 5]
  ]),
  {value: JSON.parse(localStorage.getItem("q4_selected") ?? "[]") ?? []}
);

{
  localStorage.setItem("q4_selected", JSON.stringify(q4));
  localStorage.setItem("q4_correct", "1,2");
  localStorage.setItem("q4_result", (!q4 || q4.length === 0) ? "unattempted" : (q4.toString() === "1,2" ? "correct" : "incorrect"));
}

ok_checkbox(q4.toString(), "1,2");

Check Your Understanding: Success-Failure Condition II

Is the success-failure condition satisfied for the Pew study with 9,654 participants?

viewof q5 = Inputs.radio(
  new Map([
    ["Yes. We should expect at least 10 participants in favor and at least 10 participants opposed.", 1],
    ["No. We cannot expect at least 10 participants to be in favor.", 2],
    ["No. We cannot expect at least 10 participants to oppose.", 3],
    ["No. We cannot expect at least 10 participants in either group.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q5_selected") ?? "null")}
);

{
  localStorage.setItem("q5_selected", JSON.stringify(q5));
  localStorage.setItem("q5_correct", "1");
  localStorage.setItem("q5_result", q5 === null ? "unattempted" : (q5 == 1 ? "correct" : "incorrect"));
}

ok_response(q5, "1");

Check Your Understanding: Success-Failure Condition III

Would the success-failure condition be satisfied for a small study with only 35 participants?

viewof q6 = Inputs.radio(
  new Map([
    ["Yes. We should expect at least 10 participants in favor and at least 10 participants opposed.", 1],
    ["No. We cannot expect at least 10 participants to be in favor.", 2],
    ["No. We cannot expect at least 10 participants to oppose.", 3],
    ["No. We cannot expect at least 10 participants in either group.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q6_selected") ?? "null")}
);

{
  localStorage.setItem("q6_selected", JSON.stringify(q6));
  localStorage.setItem("q6_correct", "3");
  localStorage.setItem("q6_result", q6 === null ? "unattempted" : (q6 == 3 ? "correct" : "incorrect"));
}

ok_response(q6, "3");

Check Your Understanding: Shape of the Sampling Distribution

What will be the shape of the sampling distribution for samples of size 9,654?

viewof q7 = Inputs.radio(
  new Map([
    ["Approximately normal.", 1],
    ["Skewed left.", 2],
    ["Skewed right.", 3]
  ]),
  {value: JSON.parse(localStorage.getItem("q7_selected") ?? "null")}
);

{
  localStorage.setItem("q7_selected", JSON.stringify(q7));
  localStorage.setItem("q7_correct", "1");
  localStorage.setItem("q7_result", q7 === null ? "unattempted" : (q7 == 1 ? "correct" : "incorrect"));
}

ok_response(q7, "1");

Check Your Understanding: Standard Error

Which of the following is the expected spread of the sampling distribution, measured by the standard error?

viewof q8 = Inputs.radio(
  new Map([
    ["Approximately 0.00002", 1],
    ["Approximately 0.0045", 2],
    ["Approximately 0.26", 3],
    ["Approximately 0.74", 4],
    ["Approximately 0.1924", 5]
  ]),
  {value: JSON.parse(localStorage.getItem("q8_selected") ?? "null")}
);

{
  localStorage.setItem("q8_selected", JSON.stringify(q8));
  localStorage.setItem("q8_correct", "2");
  localStorage.setItem("q8_result", q8 === null ? "unattempted" : (q8 == 2 ? "correct" : "incorrect"));
}

ok_response(q8, "2");
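
The standard error asked about above can be checked with a short R computation. This is a minimal sketch that applies the formula from the Central Limit Theorem statement to the Pew figures; the variable names are illustrative.

```r
n <- 9654   # sample size
p <- 0.74   # proportion used in the standard error formula

se <- sqrt(p * (1 - p) / n)
round(se, 4)       # 0.0045
round(2 * se, 4)   # 0.0089, twice the standard error
```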

Notice that the standard error is about half of a percentage point (close to 0.005). Doubling this estimate closely matches what we observed about the sampling error using our simulations. This brings us to our next topic — confidence intervals.

Intro to Confidence Intervals

Watch the next video from Dr. Diez. Once you’ve watched it, we’ll continue with our example about the 2020 Pew Research study.

As Dr. Diez mentions, a confidence interval can be used to capture a population parameter with some desired degree of certainty. In general, we construct a confidence interval using the following formula:

\[\left(\text{point estimate}\right) \pm \left(\text{critical value}\right) \cdot S_E\]

where the point estimate comes from the sample data, the critical value is related to the level of confidence, and the standard error (\(S_E\)) measures the spread of the sampling distribution.
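
As a sketch of how the formula translates into code, the hypothetical helper `confidence_interval()` below returns both bounds at once. The function name and the example inputs are assumptions made for illustration and do not come from the Pew study.

```r
# Hypothetical helper: (point estimate) +/- (critical value) * SE
confidence_interval <- function(point_estimate, critical_value, se) {
  margin_of_error <- critical_value * se
  c(lower = point_estimate - margin_of_error,
    upper = point_estimate + margin_of_error)
}

# Illustrative values only: estimate 0.50, critical value 1.96, SE 0.02
ci <- confidence_interval(0.50, 1.96, 0.02)
ci   # lower = 0.4608, upper = 0.5392
```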

Recall that we’ve been working with a 2020 Pew Research study of 9,654 participants. In the study, 74% of participants were in favor of a path to citizenship for the DREAMers, and we computed the standard error to be approximately 0.0045.

If the sampling distribution is well-modeled by a normal distribution, the following critical values are associated with several common levels of confidence:

Confidence Level	Critical Value
90%	1.65
95%	1.96
98%	2.33
99%	2.58
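
These critical values can be recovered from the standard normal distribution with `qnorm()`. The sketch below assumes a two-sided interval, so each tail holds half of the leftover probability; the table values are the same numbers rounded to two decimal places.

```r
confidence_levels <- c(0.90, 0.95, 0.98, 0.99)

# For a two-sided interval, (1 - level) / 2 is cut off in each tail
critical_values <- qnorm(1 - (1 - confidence_levels) / 2)
round(critical_values, 3)   # 1.645 1.960 2.326 2.576
```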

Use what you learned in the video and your knowledge of the Pew Research study to answer the following questions. You can use the code block below for any necessary computations.

Hint 3

To find the upper bound, add the margin of error to the point estimate. The margin of error is the critical value multiplied by the standard error.

# Upper Bound:
___ + (___ * ___)

Hint 6

The point estimate is 0.74, the critical value is 2.33, and the standard error is about 0.0045.

To find the lower bound, we subtract the margin of error from the point estimate.

# Upper Bound:
0.74 + (2.33 * 0.0045)

# Lower Bound:
___ - (___ * ___)

Hint 7

The point estimate is 0.74, the critical value is 2.33, and the standard error is about 0.0045.

To find the lower bound, we subtract the margin of error from the point estimate. The margin of error is still the critical value times the standard error, and the point estimate is still 0.74.

# Upper Bound:
0.74 + (2.33 * 0.0045)

# Lower Bound:
0.74 - (2.33 * 0.0045)

Check Your Understanding: Confidence Interval I

The point estimate for our confidence interval is:

viewof q9 = Inputs.radio(
  new Map([
    ["0.74", 1],
    ["74", 2],
    ["7,144", 3],
    ["9,654", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q9_selected") ?? "null")}
);

{
  localStorage.setItem("q9_selected", JSON.stringify(q9));
  localStorage.setItem("q9_correct", "1");
  localStorage.setItem("q9_result", q9 === null ? "unattempted" : (q9 == 1 ? "correct" : "incorrect"));
}

ok_response(q9, "1");

Check Your Understanding: Confidence Interval II

The standard error (\(S_E\)) is:

viewof q10 = Inputs.radio(
  new Map([
    ["0.0045", 1],
    ["0.5", 2],
    ["0.74", 3],
    ["9,653", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q10_selected") ?? "null")}
);

{
  localStorage.setItem("q10_selected", JSON.stringify(q10));
  localStorage.setItem("q10_correct", "1");
  localStorage.setItem("q10_result", q10 === null ? "unattempted" : (q10 == 1 ? "correct" : "incorrect"));
}

ok_response(q10, "1");

Check Your Understanding: Confidence Interval III

The appropriate critical value for a 98% confidence interval is:

viewof q11 = Inputs.radio(
  new Map([
    ["0.74", 1],
    ["1.65", 2],
    ["1.96", 3],
    ["2.33", 4],
    ["2.58", 5]
  ]),
  {value: JSON.parse(localStorage.getItem("q11_selected") ?? "null")}
);

{
  localStorage.setItem("q11_selected", JSON.stringify(q11));
  localStorage.setItem("q11_correct", "4");
  localStorage.setItem("q11_result", q11 === null ? "unattempted" : (q11 == 4 ? "correct" : "incorrect"));
}

ok_response(q11, "4");

Check Your Understanding: Confidence Interval IV

Which of the following are the bounds for a 98% confidence interval? Select all that apply.

viewof q12 = Inputs.checkbox(
  new Map([
    ["0.7295", 1],
    ["0.7312", 2],
    ["0.74", 3],
    ["0.7489", 4],
    ["0.7505", 5]
  ]),
  {value: JSON.parse(localStorage.getItem("q12_selected") ?? "[]") ?? []}
);

{
  localStorage.setItem("q12_selected", JSON.stringify(q12));
  localStorage.setItem("q12_correct", "1,5");
  localStorage.setItem("q12_result", (!q12 || q12.length === 0) ? "unattempted" : (q12.toString() === "1,5" ? "correct" : "incorrect"));
}

ok_checkbox(q12.toString(), "1,5");

Check Your Understanding: Confidence Interval V

Which of the following is the correct interpretation of the 98% confidence interval?

viewof q13 = Inputs.radio(
  new Map([
    ["We are 98% confident that the true population proportion of American adults supporting a path to citizenship for the DREAMers is between the lower bound and the upper bound.", 1],
    ["The true population proportion of American adults supporting a path to citizenship for the DREAMers is between the lower bound and the upper bound.", 2],
    ["The probability that the true population proportion of American adults supporting a path to citizenship for the DREAMers is between the lower bound and the upper bound is 98%.", 3],
    ["We are 98% confident that the sample proportion of American adults supporting a path to citizenship for the DREAMers is between the lower bound and the upper bound.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q13_selected") ?? "null")}
);

{
  localStorage.setItem("q13_selected", JSON.stringify(q13));
  localStorage.setItem("q13_correct", "1");
  localStorage.setItem("q13_result", q13 === null ? "unattempted" : (q13 == 1 ? "correct" : "incorrect"));
}

ok_response(q13, "1");

Check Your Understanding: Confidence Interval VI

Without computing the bounds, a 90% confidence interval would be:

viewof q14 = Inputs.radio(
  new Map([
    ["Wider than the 98% confidence interval.", 1],
    ["More narrow than the 98% confidence interval.", 2],
    ["Exactly the same as the 98% confidence interval.", 3],
    ["It is impossible to tell without computing the bounds.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q14_selected") ?? "null")}
);

{
  localStorage.setItem("q14_selected", JSON.stringify(q14));
  localStorage.setItem("q14_correct", "2");
  localStorage.setItem("q14_result", q14 === null ? "unattempted" : (q14 == 2 ? "correct" : "incorrect"));
}

ok_response(q14, "2");

So far, so good! There’s one more topic to go. Sometimes we’ll want to test a claim about a population parameter rather than build a confidence interval for it. Inferential statistics provides a formal framework called the hypothesis test for evaluating statistical claims such as:

Is a population mean or proportion larger, smaller, or different from some proposed value?
Do the population means or proportions differ across multiple groups?

Intro to Hypothesis Testing

Here’s one more video from Dr. Shannon McClintock introducing the notion of the hypothesis test.

A 2018 poll from NPR reported that 65% of Americans supported a path to citizenship for DREAMers. Does the 2020 Pew Research poll provide evidence that support for a pathway to citizenship has grown over the past two years? Use a significance level of \(\alpha = 0.05\).

Use what you learned from Dr. McClintock to answer the following questions and complete the hypothesis test. You can use the code block below for any calculations.

Hint 7 (Solved)

# Standard error (using null value p = 0.65)
se <- sqrt((0.65 * (1 - 0.65)) / 9654)

# Test statistic
z <- (0.74 - 0.65) / se

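# p-value (upper tail, since Ha: p > 0.65)
# lower.tail = FALSE keeps precision; 1 - pnorm(z) rounds to exactly 0 for z this large
pnorm(z, lower.tail = FALSE)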

The \(p\)-value will be extremely small — much less than \(\alpha = 0.05\) — leading us to reject the null hypothesis.

Check Your Understanding: Hypothesis Test I

Which of the following are the hypotheses used to test this claim?

viewof q15 = Inputs.radio(
  new Map([
    ["H₀: p = 0.65, Hₐ: p > 0.65", 1],
    ["H₀: p = 0.65, Hₐ: p < 0.65", 2],
    ["H₀: p = 0.65, Hₐ: p ≠ 0.65", 3],
    ["H₀: p = 0.74, Hₐ: p > 0.74", 4],
    ["H₀: p = 0.74, Hₐ: p < 0.74", 5],
    ["H₀: p = 0.74, Hₐ: p ≠ 0.74", 6]
  ]),
  {value: JSON.parse(localStorage.getItem("q15_selected") ?? "null")}
);

{
  localStorage.setItem("q15_selected", JSON.stringify(q15));
  localStorage.setItem("q15_correct", "1");
  localStorage.setItem("q15_result", q15 === null ? "unattempted" : (q15 == 1 ? "correct" : "incorrect"));
}

ok_response(q15, "1");

Check Your Understanding: Hypothesis Test II

What is the level of significance for the test?

viewof q16 = Inputs.radio(
  new Map([
    ["α = 0.10", 1],
    ["α = 0.05", 2],
    ["α = 0.02", 3],
    ["α = 0.01", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q16_selected") ?? "null")}
);

{
  localStorage.setItem("q16_selected", JSON.stringify(q16));
  localStorage.setItem("q16_correct", "2");
  localStorage.setItem("q16_result", q16 === null ? "unattempted" : (q16 == 2 ? "correct" : "incorrect"));
}

ok_response(q16, "2");

Check Your Understanding: Hypothesis Test III

What is the point estimate?

viewof q17 = Inputs.radio(
  new Map([
    ["p̂ = 0.65", 1],
    ["p̂ = 0.74", 2],
    ["p̂ = 0.50", 3],
    ["p̂ = 0.09", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q17_selected") ?? "null")}
);

{
  localStorage.setItem("q17_selected", JSON.stringify(q17));
  localStorage.setItem("q17_correct", "2");
  localStorage.setItem("q17_result", q17 === null ? "unattempted" : (q17 == 2 ? "correct" : "incorrect"));
}

ok_response(q17, "2");

Check Your Understanding: Hypothesis Test IV

What is the null value?

viewof q18 = Inputs.radio(
  new Map([
    ["p = 0.65", 1],
    ["p = 0.74", 2],
    ["p = 0.50", 3],
    ["p = 0.09", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q18_selected") ?? "null")}
);

{
  localStorage.setItem("q18_selected", JSON.stringify(q18));
  localStorage.setItem("q18_correct", "1");
  localStorage.setItem("q18_result", q18 === null ? "unattempted" : (q18 == 1 ? "correct" : "incorrect"));
}

ok_response(q18, "1");

Check Your Understanding: Hypothesis Test V

The standard error is computed as \(S_E = \sqrt{\frac{p(1-p)}{n}}\), where \(p\) is the null value. Which of the following is the standard error? (round to four decimal places)

viewof q19 = Inputs.radio(
  new Map([
    ["SE = 0.0042", 1],
    ["SE = 0.0045", 2],
    ["SE = 0.0049", 3],
    ["SE = 0.0052", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q19_selected") ?? "null")}
);

{
  localStorage.setItem("q19_selected", JSON.stringify(q19));
  localStorage.setItem("q19_correct", "3");
  localStorage.setItem("q19_result", q19 === null ? "unattempted" : (q19 == 3 ? "correct" : "incorrect"));
}

ok_response(q19, "3");

Why Use the Null Value in the Standard Error?

Notice that we use \(p = 0.65\) (the null value) rather than \(\hat{p} = 0.74\) (the sample proportion) when computing the standard error for the hypothesis test. This is because, during a hypothesis test, we assume the null hypothesis is true — we are asking: if the true proportion really is 0.65, how surprising is our observed sample?
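
The two choices give slightly different standard errors, which the short sketch below makes concrete. The variable names `se_null` and `se_sample` are illustrative.

```r
n <- 9654   # sample size

se_null   <- sqrt(0.65 * (1 - 0.65) / n)   # null value, used in the hypothesis test
se_sample <- sqrt(0.74 * (1 - 0.74) / n)   # sample proportion, used in the confidence interval

round(c(null = se_null, sample = se_sample), 4)   # 0.0049 and 0.0045
```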

Check Your Understanding: Hypothesis Test VI

The test statistic is computed as \(\displaystyle{z = \frac{(\text{point estimate}) - (\text{null value})}{S_E}}\). Which of the following is the test statistic? (round to two decimal places)

viewof q20 = Inputs.radio(
  new Map([
    ["z = 1.65", 1],
    ["z = 2.33", 2],
    ["z = 18.54", 3],
    ["z = 20.00", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q20_selected") ?? "null")}
);

{
  localStorage.setItem("q20_selected", JSON.stringify(q20));
  localStorage.setItem("q20_correct", "3");
  localStorage.setItem("q20_result", q20 === null ? "unattempted" : (q20 == 3 ? "correct" : "incorrect"));
}

ok_response(q20, "3");

Check Your Understanding: Hypothesis Test VII

Use pnorm() to compute the \(p\)-value associated with this test.

viewof q21 = Inputs.radio(
  new Map([
    ["p-value ≈ 0", 1],
    ["p-value = 0.05", 2],
    ["p-value = 1", 3],
    ["p-value = 0 (exactly)", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q21_selected") ?? "null")}
);

{
  localStorage.setItem("q21_selected", JSON.stringify(q21));
  localStorage.setItem("q21_correct", "1");
  localStorage.setItem("q21_result", q21 === null ? "unattempted" : (q21 == 1 ? "correct" : "incorrect"));
}

ok_response(q21, "1");

Note on Reported p-values of Zero

If software reports a \(p\)-value of exactly 0, this simply means the \(p\)-value is smaller than the precision the software can represent or display. It is more accurate to say that the \(p\)-value is very small (approximately 0, or rounded to 0) or to report it as \(p < 0.0001\).
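
The effect is easy to reproduce in R with the test statistic from this activity. In the sketch below, subtracting from 1 loses the tiny tail probability to floating-point rounding, while asking `pnorm()` for the upper tail directly does not.

```r
z <- 18.54   # test statistic from this activity

p_subtract <- 1 - pnorm(z)                   # pnorm(z) is stored as exactly 1
p_upper    <- pnorm(z, lower.tail = FALSE)   # upper tail computed directly

p_subtract == 0   # TRUE
p_upper > 0       # TRUE: tiny, but not zero
```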

Check Your Understanding: Hypothesis Test VIII

Which of the following is the conclusion of the hypothesis test?

viewof q22 = Inputs.radio(
  new Map([
    ["The p-value is less than α, so we accept the null hypothesis.", 1],
    ["The p-value is at least as large as α, so we reject the null hypothesis and accept the alternative.", 2],
    ["The p-value is at least as large as α, so we do not have enough evidence to reject the null hypothesis.", 3],
    ["The p-value is less than α, so we reject the null hypothesis and accept the alternative.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q22_selected") ?? "null")}
);

{
  localStorage.setItem("q22_selected", JSON.stringify(q22));
  localStorage.setItem("q22_correct", "4");
  localStorage.setItem("q22_result", q22 === null ? "unattempted" : (q22 == 4 ? "correct" : "incorrect"));
}

ok_response(q22, "4");

Check Your Understanding: Hypothesis Test IX

Which of the following is the result of the hypothesis test stated in context?

viewof q23 = Inputs.radio(
  new Map([
    ["We do not have evidence to suggest that the proportion of American adults in favor of a path to citizenship has increased since 2018.", 1],
    ["We have evidence to suggest that the proportion of American adults in favor of a path to citizenship has stayed the same since 2018.", 2],
    ["We have evidence to suggest that the proportion of American adults in favor of a path to citizenship has increased since 2018.", 3],
    ["We do not have evidence to suggest that the proportion of American adults in favor of a path to citizenship has stayed the same since 2018.", 4]
  ]),
  {value: JSON.parse(localStorage.getItem("q23_selected") ?? "null")}
);

{
  localStorage.setItem("q23_selected", JSON.stringify(q23));
  localStorage.setItem("q23_correct", "3");
  localStorage.setItem("q23_result", q23 === null ? "unattempted" : (q23 == 3 ? "correct" : "incorrect"));
}

ok_response(q23, "3");

Submit

If you are part of a course with an instructor who is grading your work on these activities, please copy and submit the question hash below using the method your instructor has requested. There is no exercise hash for this activity.

Question Hash

The hash below encodes your responses to the multiple choice and checkbox questions in this activity.

function buildQuestionResults() {
  return {
    notebook: "Topic 11: Hypothesis Tests and Confidence Intervals for Categorical Data",
    type: "questions",
    timestamp: new Date().toISOString(),
    questions: {
      q1_terminology_statistic_estimate: {
        selected: q1,
        correct_answer: "1,3",
        result: (!q1 || q1.length === 0) ? "unattempted" : (q1.toString() === "1,3" ? "correct" : "incorrect")
      },
      q2_parameter_of_interest: {
        selected: q2,
        correct_answer: "2",
        result: q2 === null ? "unattempted" : (q2 == 2 ? "correct" : "incorrect")
      },
      q3_repeated_study_result: {
        selected: q3,
        correct_answer: "1",
        result: q3 === null ? "unattempted" : (q3 == 1 ? "correct" : "incorrect")
      },
      q4_success_failure_condition: {
        selected: q4,
        correct_answer: "1,2",
        result: (!q4 || q4.length === 0) ? "unattempted" : (q4.toString() === "1,2" ? "correct" : "incorrect")
      },
      q5_success_failure_satisfied: {
        selected: q5,
        correct_answer: "1",
        result: q5 === null ? "unattempted" : (q5 == 1 ? "correct" : "incorrect")
      },
      q6_success_failure_small_study: {
        selected: q6,
        correct_answer: "3",
        result: q6 === null ? "unattempted" : (q6 == 3 ? "correct" : "incorrect")
      },
      q7_sampling_dist_shape: {
        selected: q7,
        correct_answer: "1",
        result: q7 === null ? "unattempted" : (q7 == 1 ? "correct" : "incorrect")
      },
      q8_standard_error: {
        selected: q8,
        correct_answer: "2",
        result: q8 === null ? "unattempted" : (q8 == 2 ? "correct" : "incorrect")
      },
      q9_ci_point_estimate: {
        selected: q9,
        correct_answer: "1",
        result: q9 === null ? "unattempted" : (q9 == 1 ? "correct" : "incorrect")
      },
      q10_ci_standard_error: {
        selected: q10,
        correct_answer: "1",
        result: q10 === null ? "unattempted" : (q10 == 1 ? "correct" : "incorrect")
      },
      q11_ci_critical_value: {
        selected: q11,
        correct_answer: "4",
        result: q11 === null ? "unattempted" : (q11 == 4 ? "correct" : "incorrect")
      },
      q12_ci_bounds: {
        selected: q12,
        correct_answer: "1,5",
        result: (!q12 || q12.length === 0) ? "unattempted" : (q12.toString() === "1,5" ? "correct" : "incorrect")
      },
      q13_ci_interpretation: {
        selected: q13,
        correct_answer: "1",
        result: q13 === null ? "unattempted" : (q13 == 1 ? "correct" : "incorrect")
      },
      q14_ci_width_comparison: {
        selected: q14,
        correct_answer: "2",
        result: q14 === null ? "unattempted" : (q14 == 2 ? "correct" : "incorrect")
      },
      q15_ht_hypotheses: {
        selected: q15,
        correct_answer: "1",
        result: q15 === null ? "unattempted" : (q15 == 1 ? "correct" : "incorrect")
      },
      q16_ht_significance_level: {
        selected: q16,
        correct_answer: "2",
        result: q16 === null ? "unattempted" : (q16 == 2 ? "correct" : "incorrect")
      },
      q17_ht_point_estimate: {
        selected: q17,
        correct_answer: "2",
        result: q17 === null ? "unattempted" : (q17 == 2 ? "correct" : "incorrect")
      },
      q18_ht_null_value: {
        selected: q18,
        correct_answer: "1",
        result: q18 === null ? "unattempted" : (q18 == 1 ? "correct" : "incorrect")
      },
      q19_ht_standard_error: {
        selected: q19,
        correct_answer: "3",
        result: q19 === null ? "unattempted" : (q19 == 3 ? "correct" : "incorrect")
      },
      q20_ht_test_statistic: {
        selected: q20,
        correct_answer: "3",
        result: q20 === null ? "unattempted" : (q20 == 3 ? "correct" : "incorrect")
      },
      q21_ht_pvalue: {
        selected: q21,
        correct_answer: "1",
        result: q21 === null ? "unattempted" : (q21 == 1 ? "correct" : "incorrect")
      },
      q22_ht_conclusion: {
        selected: q22,
        correct_answer: "4",
        result: q22 === null ? "unattempted" : (q22 == 4 ? "correct" : "incorrect")
      },
      q23_ht_conclusion_in_context: {
        selected: q23,
        correct_answer: "3",
        result: q23 === null ? "unattempted" : (q23 == 3 ? "correct" : "incorrect")
      }
    }
  };
}

function toBase64(str) {
  return btoa(unescape(encodeURIComponent(str)));
}

question_hash = {
  q1; q2; q3; q4; q5; q6; q7; q8; q9; q10; q11; q12;
  q13; q14; q15; q16; q17; q18; q19; q20; q21; q22; q23;
  return toBase64(JSON.stringify(buildQuestionResults()));
}

html`<div style="font-family: monospace; font-size: 0.85em; background: #f5f5f5; padding: 12px; border-radius: 6px; word-break: break-all; border: 1px solid #ddd; user-select: all; cursor: pointer;" onclick="navigator.clipboard.writeText(this.innerText)">
  ${question_hash}
</div>
<p style="margin-top: 8px; font-size: 0.9em; color: #555;">
  Click the box to copy to clipboard.
</p>`

Exercise Hash

Since there were no code cell exercises in this activity, there is no exercise hash to generate. You’ll see exercise hashes in future activities.

Summary

Main Takeaways

On point estimates and variability:

Sample statistics provide point estimates for their corresponding population parameters — a sample proportion estimates a population proportion, a sample mean estimates a population mean.
Sample statistics provide reliable point estimates only when the sample is representative of the population.
Every sample produces a slightly different statistic. Much of statistics is focused on quantifying this variability.

On confidence intervals:

A confidence interval captures a population parameter with a desired degree of confidence, computed as: \[(\text{point estimate}) \pm (\text{critical value}) \cdot S_E\]
The point estimate is a sample statistic. The critical value depends on the desired confidence level. The standard error (\(S_E\)) quantifies the expected variability in the point estimate.
A correct interpretation: “We are XX% confident that the true [population parameter] lies between [lower bound] and [upper bound].”
Higher confidence levels require larger critical values, which produce wider intervals.
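
The last point can be illustrated with the critical values and the standard error of about 0.0045 used earlier in this activity. This is a minimal sketch; the vector names are illustrative.

```r
se <- 0.0045   # approximate standard error from this activity

critical_values <- c("90%" = 1.65, "95%" = 1.96, "98%" = 2.33, "99%" = 2.58)

# Interval width is twice the margin of error
widths <- 2 * critical_values * se
widths   # increases with the confidence level
```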

On hypothesis tests:

A hypothesis test provides a formal framework for evaluating claims about a population parameter.
We begin with a null hypothesis (\(H_0\)) representing the status quo and an alternative hypothesis (\(H_a\)) representing the claim to be tested.
We set a significance level \(\alpha\) — the threshold below which a \(p\)-value is considered surprising enough to reject \(H_0\).
We compute a test statistic: \(\displaystyle{z = \frac{(\text{point estimate}) - (\text{null value})}{S_E}}\)
The \(p\)-value measures the probability of observing a sample at least as favorable to \(H_a\) as ours, assuming \(H_0\) is true. A \(p\)-value smaller than \(\alpha\) is taken as evidence against \(H_0\).
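
The steps above can be sketched as a single R function for an upper-tailed test of one proportion. The helper `one_prop_z_test_upper()` is hypothetical and written for illustration only; it is applied here to the figures from this activity.

```r
# Hypothetical helper: upper-tailed z-test for a single proportion
one_prop_z_test_upper <- function(p_hat, p_null, n, alpha = 0.05) {
  se <- sqrt(p_null * (1 - p_null) / n)       # standard error under H0
  z  <- (p_hat - p_null) / se                 # test statistic
  p_value <- pnorm(z, lower.tail = FALSE)     # P(Z >= z), assuming H0 is true
  list(se = se, z = z, p_value = p_value, reject_null = p_value < alpha)
}

result <- one_prop_z_test_upper(p_hat = 0.74, p_null = 0.65, n = 9654)
round(result$z, 2)    # 18.54
result$reject_null    # TRUE
```

A lower-tailed or two-sided alternative would need a different tail calculation, which this sketch does not cover.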

Looking Ahead

This activity introduced the general frameworks for confidence intervals and hypothesis tests using proportions. In the coming activities, we’ll continue to use these tools to estimate population parameters and to test claims about them.

As a preview of what’s coming, here’s a link to a Standard Error Decision Tree that we’ll use throughout the remainder of the course. It looks intimidating now, but look at the bottom-right corner — there’s the confidence interval formula you just used! And the lower-left corner shows the general test statistic formula. Everything else on the document will be explained in the coming activities.