Topic 14: Inference for One or More Categorical Variables With Many Levels
This activity introduces \(\chi^2\) (Chi-Squared) tests for Goodness of Fit and Independence — two methods for testing claims involving categorical data with more than two levels. Data used in this activity is simulated, based on the US Census Bureau, The Sentencing Project, and the 2019 Behavioral Risk Factor Surveillance System.
Chi-Squared Tests for Goodness of Fit and Independence
In this activity, we consider two methods for testing claims corresponding to categorical data with possibly more than two levels. The first is called the \(\chi^2\) Goodness of Fit Test, used to test whether a sample provides evidence that an assumed discrete distribution is not an appropriate model for a categorical variable. The second is called the \(\chi^2\) Test of Independence, used to test whether two categorical variables are associated with one another. You’ll be exposed to video explanations and worked examples before trying problems on your own.
Chi-Squared Goodness of Fit
Consider a sociologist interested in better understanding incarceration rates in the state of New Hampshire. The researcher wants to determine whether minority populations are disproportionately represented in State Penitentiaries. Using estimates from the United States Census Bureau, the population of New Hampshire had the following racial breakdown as of 2019: white (89.8%), Black (1.8%), Hispanic (4.0%), other (4.4%). A reasonable expectation is that the incarcerated population should roughly reflect this same distribution.
The researcher took a random sample of 300 inmates in State Penitentiaries across New Hampshire and observed the following results: 243 inmates were white, 24 were Black, 15 were Hispanic, and 18 were of another race. The researcher wants to determine whether this sample provides evidence that the state prison population does not reflect the racial demographics of the State.
We’ll come back to this example shortly, but first let’s watch Dr. Çetinkaya-Rundel introduce the \(\chi^2\) Goodness of Fit Test.
Dr. Çetinkaya-Rundel discussed an example of racial bias in jury selection. Before applying these techniques to our New Hampshire example, let’s recap a few key ideas.
The \(\chi^2\) test statistic does not follow a normal distribution — instead, it follows a \(\chi^2\)-distribution. Although this is a new distribution, the principles remain familiar. The test statistic measures how far our sample falls from what was expected under the null hypothesis, and the \(p\)-value is the corresponding tail probability. Since the \(\chi^2\) test statistic is always non-negative, we are always interested in the upper tail.
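To build intuition for this new distribution, here is a quick sketch in base R (the boundary value 7.81 is the standard \(\chi^2\) critical value for 3 degrees of freedom at \(\alpha = 0.05\); no packages required). It shows that the \(\chi^2\) distribution has no probability below zero and that the upper-tail area can be computed two equivalent ways:

```r
# The chi-squared distribution lives on [0, Inf): no area to the left of 0
pchisq(0, df = 3)

# Upper-tail (right-tail) area beyond a boundary value, two equivalent ways
1 - pchisq(7.81, df = 3)
pchisq(7.81, df = 3, lower.tail = FALSE)
```

The `lower.tail = FALSE` form avoids the subtraction and is slightly more accurate for very large test statistics, but `1 - pchisq()` is the pattern we'll use throughout this activity.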
Running a \(\chi^2\) Goodness of Fit test requires the following conditions:
- The sample must be taken randomly.
- If sampling without replacement, the sample should include less than 10% of the entire population.
- The sample must be large enough that each group has an expected count of at least 5 observations.
Now, back to our application. As a reminder, New Hampshire is estimated to be 89.8% white, 1.8% Black, 4.0% Hispanic, and 4.4% of other races. We have a sample of 300 inmates: 243 white, 24 Black, 15 Hispanic, and 18 of another race. We want to know whether this sample provides evidence that the prison population does not reflect the State’s racial demographics.
What is the variable of interest in this study?
The variable of interest in this study is:
How many measured levels of the variable of interest are there?
In order to answer the question as posed, we should:
Which of the following are the correct hypotheses associated with this test? Select all that apply.
Now compute the expected counts for each racial group in the sample of 300 inmates.
Expected count — white inmates:
What proportion of New Hampshire’s population is white?
The expected count is the total number of prisoners multiplied by the proportion of the State’s population in that racial group.
___ * ___
- There are 300 inmates in the sample.
300 * ___
- There are 300 inmates in the sample.
- According to the US Census Bureau, the proportion of white residents in NH was about 0.898.
300 * 0.898
Expected count — Black inmates:
Use the same approach as the previous question. What proportion of New Hampshire residents are Black?
There are still 300 inmates in the sample and the proportion of Black residents in NH, according to the US Census Bureau in 2019, was about 0.018.
300 * 0.018
Expected count — Hispanic inmates:
Same approach again. What proportion of New Hampshire residents are Hispanic?
300 * 0.040
Expected count — inmates of other races:
Same approach. What proportion of New Hampshire residents identify as another race?
300 * 0.044
Now we’re ready to compute the test statistic. Recall from Dr. Çetinkaya-Rundel’s video that the \(\chi^2\) statistic is:
\[\chi^2 = \sum_{i = 1}^{k}{\frac{\left(\text{observed} - \text{expected}\right)^2}{\text{expected}}}\]
where \(k\) is the number of groups. As a reminder, the \(\Sigma\) symbol indicates that we should add terms together. You’ll calculate \(\displaystyle{\frac{\left(\text{observed} - \text{expected}\right)^2}{\text{expected}}}\) for each group (white, Black, Hispanic, and other) and add those quantities together.
Use the code block below to compute the \(\chi^2\) test statistic. The block is pre-populated to get you started.
Fill in the expected vector with the four expected counts you just computed — in the same order as the observed counts: white, Black, Hispanic, other.
The expected counts are 269.4, 5.4, 12.0, and 13.2 for white, Black, Hispanic, and other respectively.
observed <- c(243, 24, 15, 18)
expected <- c(269.4, 5.4, 12.0, 13.2)
test_stat <- sum((observed - expected)^2 / expected)
test_stat
Now compute the degrees of freedom for the \(\chi^2\)-distribution associated with this test.
For a Goodness of Fit test, how are the degrees of freedom related to the number of groups?
The degrees of freedom for a Goodness of Fit test is one less than the number of groups.
There are 4 racial groups, so the degrees of freedom is \(4 - 1 = 3\).
4 - 1
Use the code block below to compute the \(p\)-value. The function pchisq(q, df) returns the probability to the left of the boundary value q under a \(\chi^2\) distribution with df degrees of freedom.
Use your test statistic and degrees of freedom with pchisq(). Which tail are you interested in?
The \(p\)-value is the area to the right of the test statistic. Since pchisq() gives the left-tail area, how do you obtain the right-tail area?
Just like with pnorm(), to obtain the area to the right of a boundary value in the \(\chi^2\) distribution, we’ll subtract from 1.
1 - pchisq(___, df = ___)
The second argument to pchisq() is the degrees of freedom (df). The degrees of freedom for this test is 3.
1 - pchisq(___, df = 3)
The first argument to pchisq() is the boundary value – that’s our test statistic. We calculated that test statistic to be about 69.15.
1 - pchisq(69.15, df = 3)
Assume the test was conducted at the \(\alpha = 0.05\) level of significance.
What is the result of the test?
The result of the test means that:
This application is based on 2019 data from the US Census Bureau and The Sentencing Project.
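As a check on the hand computation above (not part of the graded exercise), base R's chisq.test() runs the entire Goodness of Fit test in one call when given the observed counts and the hypothesized proportions. The proportions must sum to 1:

```r
# Observed counts: white, Black, Hispanic, other
observed <- c(243, 24, 15, 18)

# Hypothesized proportions from the 2019 Census estimates (sum to 1)
census_props <- c(0.898, 0.018, 0.040, 0.044)

# Reports the X-squared statistic, df = 3, and the p-value
chisq.test(x = observed, p = census_props)
```

You should see the same test statistic (about 69.15) and degrees of freedom (3) that we computed step by step.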
Chi-Squared Test of Independence
Now that you’ve completed a Goodness of Fit test, let’s consider another application of the \(\chi^2\)-distribution. We’ll work through an application in which we are interested in determining whether household income (IncomeLevel) and adolescent drug use (DrugUse) are independent. We’ll use a simulated dataset based on metrics from the 2019 Behavioral Risk Factor Surveillance System. The simulated data is stored in a data frame called BRFSSsim.
First, let’s watch Dr. Çetinkaya-Rundel introduce the \(\chi^2\) Test of Independence.
We return to our simulated BRFSS data. Each row of the dataset represents a response from a single adolescent in the USA. We are interested in determining, at the 5% level of significance, whether there is evidence to suggest an association between IncomeLevel, which has levels Poverty, LowIncome, MiddleIncome, and HighIncome, and DrugUse, a yes/no response indicating whether the individual reported using any illicit drug in the past year.
We assume the following marginal distributions: approximately 16% of households fall below the poverty line, 38% are low income, 35% are middle income, and 11% are high income. Additionally, approximately 14% of adolescents are estimated to have used illicit drugs in any given year. We’ll use these assumed percentages when computing our expected counts.
How many variables of interest are there in this researcher’s study?
In order to answer their question, the researcher should:
The variables of interest to the researcher are:
The hypotheses associated with the researcher’s test are:
Under the assumption of the null hypothesis (independence), compute the expected number of the 2,500 adolescents in each of the following drug-use groups.
Expected count — Poverty and drug use:
If two events A and B are independent, what is \(\mathbb{P}\left[A \text{ and } B\right]\)?
If events are independent, \(\mathbb{P}\left[A \text{ and } B\right] = \mathbb{P}\left[A\right] \times \mathbb{P}\left[B\right]\). What are the probabilities of coming from a poverty household and of having used drugs?
\(\mathbb{P}\left[\text{Poverty}\right] = 0.16\) and \(\mathbb{P}\left[\text{DrugUse}\right] = 0.14\). How do you go from a probability to an expected count?
Multiply the total number of adolescents (2,500) by the joint probability.
2500 * 0.16 * 0.14
Expected count — Low Income and drug use:
Use the same approach as the previous question, but update the probability of being in the Low Income group.
\[\text{Expected Count} = \mathbb{P}\left[\text{low income}\right]\cdot \mathbb{P}\left[\text{drug use}\right]\cdot n\]
2500 * 0.38 * 0.14
Expected count — Middle Income and drug use:
Use the same approach as for calculating the previous two expected counts.
Use the same approach as for calculating the previous two expected counts. The marginal probability of a randomly selected household being middle income is 0.35.
2500 * 0.35 * 0.14
Expected count — High Income and drug use:
Once more, use the same approach.
Once more, use the same approach. This time, the marginal probability of a randomly chosen household being high income is 0.11.
2500 * 0.11 * 0.14
You’ve now computed the expected counts for the drug-use row. We’ll build the no-drug-use row the same way: repeat the calculations above, but replace 0.14 (the probability of a randomly chosen individual having used illicit drugs in the last 12 months) with 0.86 (the probability that they have not).
The full expected and observed tables are shown below for reference.
Expected Results:
| | Poverty | Low Income | Middle Income | High Income |
|---|---|---|---|---|
| No Drug Use | 344 | 817 | 752.5 | 236.5 |
| Drug Use | 56 | 133 | 122.5 | 38.5 |
Observed Results:
| | Poverty | Low Income | Middle Income | High Income |
|---|---|---|---|---|
| No Drug Use | 341 | 752 | 796 | 221 |
| Drug Use | 78 | 153 | 116 | 43 |
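The expected table above can also be rebuilt programmatically. This is an optional sketch using base R's outer() (the variable names here are our own, not from the BRFSSsim dataset): under independence, each cell is \(n \cdot \mathbb{P}[\text{row}] \cdot \mathbb{P}[\text{column}]\), which outer() computes for every cell at once.

```r
n <- 2500

# Assumed marginal probabilities from the activity
income_probs <- c(Poverty = 0.16, LowIncome = 0.38, MiddleIncome = 0.35, HighIncome = 0.11)
drug_probs   <- c(NoDrugUse = 0.86, DrugUse = 0.14)

# Each cell of the 2 x 4 table is n * P[row] * P[column]
expected_table <- n * outer(drug_probs, income_probs)
expected_table
```

The first row reproduces the No Drug Use expected counts (344, 817, 752.5, 236.5) and the second row the Drug Use counts (56, 133, 122.5, 38.5).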
The \(\chi^2\) test statistic formula is the same as for Goodness of Fit:
\[\chi^2 = \sum_{i = 1}^{k}{\frac{\left(\text{observed} - \text{expected}\right)^2}{\text{expected}}}\]
where \(k\) is again the total number of groups. There are four income levels and two drug-use categories, so \(4\times 2 = 8\) total groups. Use the code block below to compute the \(\chi^2\) test statistic.
Start by copying and pasting the code we used to compute the test statistic in the previous example.
observed <- c(243, 24, 15, 18)
expected <- c(269.4, 5.4, 12.0, 13.2)
test_stat <- sum((observed - expected)^2 / expected)
test_stat
The observed and expected count vectors must be replaced.
observed <- c(___)
expected <- c(___)
test_stat <- sum((observed - expected)^2 / expected)
test_stat
Type in the observed counts from our scenario. Pay special attention to the order you use.
observed <- c(341, 752, 796, 221, 78, 153, 116, 43)
expected <- c(___)
test_stat <- sum((observed - expected)^2 / expected)
test_stat
Type in the expected counts you calculated earlier. Note that you must use the same ordering here as you did for the observed counts.
observed <- c(341, 752, 796, 221, 78, 153, 116, 43)
expected <- c(344, 817, 752.5, 236.5, 56, 133, 122.5, 38.5)
test_stat <- sum((observed - expected)^2 / expected)
test_stat
Run the code – no additional changes are required.
Now compute the degrees of freedom associated with this test for independence.
For a Test of Independence, the degrees of freedom depend on the number of groups in each of the two categorical variables. How many levels does IncomeLevel have? How many does DrugUse have?
There are \(k = 4\) levels for the income variable and there are \(\ell = 2\) levels for the drug use variable.
For the \(\chi^2\) test for independence, the degrees of freedom is the product \(\left(k - 1\right)\left(\ell - 1\right)\).
(4 - 1) * (2 - 1)
Now compute the \(p\)-value associated with this test.
We are using the \(\chi^2\) distribution again. What function computes probabilities from this distribution?
pchisq(q, df) gives the area to the left of q. We are always interested in the upper tail for \(\chi^2\) tests. How do you find the right-tail area?
We’ll use 1 - pchisq() to find the area to the right of our test statistic.
1 - pchisq(___, df = ___)
The test statistic you calculated is approximately 21.25 and the degrees of freedom is 3.
1 - pchisq(21.24924, df = 3)
Recall that the test was conducted at the \(\alpha = 0.05\) level of significance.
What is the result of the test?
The result of the test means that:
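As an aside, base R's chisq.test() can also run a test of independence directly on a contingency table. One caveat: chisq.test() estimates expected counts from the table's own row and column totals rather than from the assumed population percentages we used, so its test statistic will differ somewhat from the 21.25 we computed by hand. A sketch (the table and dimension names below are our own labels):

```r
# Observed contingency table: rows are drug use, columns are income level
observed_table <- matrix(
  c(341, 752, 796, 221,   # No Drug Use row
     78, 153, 116,  43),  # Drug Use row
  nrow = 2, byrow = TRUE,
  dimnames = list(DrugUse = c("No", "Yes"),
                  IncomeLevel = c("Poverty", "LowIncome", "MiddleIncome", "HighIncome"))
)

res <- chisq.test(observed_table)
res$parameter   # degrees of freedom: (2 - 1) * (4 - 1) = 3
```

Either way, the degrees of freedom match the \((k - 1)(\ell - 1) = 3\) we computed, and the conclusion at \(\alpha = 0.05\) is the same.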
Submit
If you are part of a course with an instructor who is grading your work on these activities, please copy and submit both of the hashes below using the method your instructor has requested.
The hash below encodes your responses to the multiple choice and checkbox questions in this activity.
Click the button below to generate your exercise submission code. This hash encodes your work on the graded code exercises in this activity.
You must have attempted the graded exercises before clicking — clicking generates a snapshot of your current results. If you have completed the activity over multiple sessions, please go back through and hit the Run Code button on each graded exercise before generating the hash below, to ensure your most recent results are recorded.
Summary
The \(\chi^2\) Goodness of Fit test is used when we have a single categorical variable with two or more levels and want to test whether a population follows an assumed distribution. The null hypothesis specifies the assumed proportions for each level; the alternative hypothesis simply states that the distribution is different.
The \(\chi^2\) Test of Independence is used when we have two categorical variables and want to test whether they are associated. The null hypothesis states that the variables are independent; under independence, the expected count for each cell is the total sample size times the product of the marginal probabilities.
Both tests use the same \(\chi^2\) test statistic: \[\chi^2 = \sum_{i=1}^{k} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}\] where \(k\) is the number of groups (GoF) or cells (independence).
The degrees of freedom differ between the two tests:
- Goodness of Fit: \(df = (\text{number of groups}) - 1\)
- Test of Independence: \(df = (\text{levels in first variable} - 1) \times (\text{levels in second variable} - 1)\)
The \(p\)-value is always the upper tail of the \(\chi^2\) distribution. We compute this area with
1 - pchisq(test_stat, df).
Conditions for inference require a random sample and expected counts of at least 5 in each group or cell.
With this activity, you’ve now developed tools for testing claims involving categorical data across a wide range of scenarios — single proportions, two-proportion comparisons, goodness of fit, and tests of independence. In the coming activities, we’ll make a significant shift and begin working with numerical data. This will introduce the \(t\)-distribution, which becomes necessary when we don’t know the population standard deviation — which is almost always. The core logic of hypothesis testing and confidence intervals remains unchanged; what changes is the distribution we use and the standard error formula we apply.