Topic 19: Linear Regression (Lab)
This lab introduces simple and multiple linear regression. We’ll analyze data from a study on student course evaluations at the University of Texas at Austin, exploring how instructor and course characteristics relate to evaluation scores.
This is a derivative of a product of OpenIntro released under a Creative Commons Attribution-ShareAlike 3.0 Unported license. This lab was adapted from a lab written by Mine Çetinkaya-Rundel and Andrew Bray.
An Introduction to Linear Regression
Consider the inference tasks we’ve worked through so far. We’ve compared numerical or categorical variables across one or two populations, and extended that to three or more groups with ANOVA and \(\chi^2\). In those cases, the grouping variable was always categorical. Now we ask: what if both variables are numerical? Can we ask whether there is an association between a numerical \(X\) and a numerical \(Y\)? The answer is yes — and the technique is called linear regression.
Let’s check in with a few short videos from OpenIntro to develop the idea.
Simple linear regression uses a single numerical predictor to predict a numerical response. The model takes the form of a straight line:
\[\mathbb{E}[y] = \beta_0 + \beta_1 x\]
where \(\beta_0\) is the intercept and \(\beta_1\) is the slope. The full model includes an error term \(\varepsilon\) representing unexplained noise, but since we assume \(\varepsilon \sim N(0, \sigma)\), we typically write the model in terms of the expected (average) response. Regression models are most reliable for interpolation — making predictions within the range of observed predictor values — and should be used with caution for extrapolation beyond that range.
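As a quick illustration of these ideas, here is a minimal sketch using simulated data (not the lab's dataset): fit a line with lm() and make predictions only within the observed range of the predictor. All values and names here are illustrative.

```r
# Minimal sketch with simulated data (not part of the lab).
set.seed(1)
x <- runif(50, min = 0, max = 10)           # predictor observed between 0 and 10
y <- 2 + 0.5 * x + rnorm(50, sd = 1)        # true line: beta0 = 2, beta1 = 0.5, plus noise
fit <- lm(y ~ x)                            # estimate the intercept and slope
coef(fit)                                   # estimated beta0 and beta1
predict(fit, newdata = data.frame(x = 5))   # interpolation: x = 5 lies inside [0, 10]
# predict(fit, newdata = data.frame(x = 50))  # extrapolation: far outside the data; use caution
```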
The Data
Many college courses conclude by giving students the opportunity to evaluate the course and instructor anonymously. However, the use of these evaluations as indicators of teaching effectiveness is often criticized because they may reflect non-teaching-related characteristics such as the physical appearance of the instructor. The article “Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity” (Hamermesh and Parker, 2005) found that instructors perceived as more attractive receive higher instructional ratings.
In this lab we analyze data from that study to understand what goes into a positive professor evaluation. The data were gathered from end-of-semester student evaluations for a large sample of professors at the University of Texas at Austin. Six students also rated each professor’s physical appearance. Each row of the dataset represents a different course.
The evals dataset contains the following variables:
| Variable | Description |
|---|---|
| `score` | average professor evaluation score: (1) very unsatisfactory – (5) excellent |
| `rank` | rank of professor: teaching, tenure track, tenured |
| `ethnicity` | ethnicity of professor: not minority, minority |
| `gender` | gender of professor: female, male |
| `language` | language of school where professor received education: english or non-english |
| `age` | age of professor |
| `cls_perc_eval` | percent of students in class who completed evaluation |
| `cls_did_eval` | number of students in class who completed evaluation |
| `cls_students` | total number of students in class |
| `cls_level` | class level: lower, upper |
| `cls_profs` | number of professors teaching sections in course in sample: single, multiple |
| `cls_credits` | number of credits of class: one credit (lab, PE, etc.), multi credit |
| `bty_f1lower` | beauty rating from lower-level female student: (1) lowest – (10) highest |
| `bty_f1upper` | beauty rating from upper-level female student |
| `bty_f2upper` | beauty rating from second upper-level female student |
| `bty_m1lower` | beauty rating from lower-level male student |
| `bty_m1upper` | beauty rating from upper-level male student |
| `bty_m2upper` | beauty rating from second upper-level male student |
| `bty_avg` | average beauty rating of professor |
| `pic_outfit` | outfit of professor in picture: not formal, formal |
| `pic_color` | color of professor’s picture: color, black & white |
What is the difference between an observational study and an experiment?
Is this an observational study or an experiment?
The original research question asks whether beauty leads directly to differences in course evaluations. Given the study design, is it possible to answer this question as phrased?
Exploratory Analysis
Use the code block below to draw a histogram of the score variable in the evals data frame. Include the following labels for the plot:
labs(
  title = "Distribution of Course Evaluation Scores",
  x = "Score",
  y = ""
)

Pipe the evals data frame into ggplot().
Add a geom_histogram() layer. For a histogram, only map x — the heights of the bars are computed automatically from the data.
evals |>
  ggplot() +
  geom_histogram(aes(x = ___))

Don’t forget to add the labels.
evals |>
  ggplot() +
  geom_histogram(aes(x = score)) +
  labs(
    title = "Distribution of Course Evaluation Scores",
    x = "Score",
    y = ""
  )
Describe the distribution of score.
What does this tell you about how students typically rate courses?
Use the code block below to explore relationships between other variables in the evals data frame. Try grouped summaries and additional plots as you see fit.
Try group_by() and summarize() to compare numerical summaries across groups, or use ggplot() to construct boxplots, scatterplots, or histograms for variables that interest you.
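For instance, one possible exploration (the variables chosen here are just an example; any pairing that interests you is fine) compares evaluation scores across professor rank:

```r
# Compare average evaluation score across professor rank (illustrative choice).
evals |>
  group_by(rank) |>
  summarize(
    mean_score = mean(score),
    n = n()
  )

# A boxplot of the same comparison.
evals |>
  ggplot() +
  geom_boxplot(aes(x = rank, y = score)) +
  labs(
    title = "Evaluation Score by Professor Rank",
    x = "Rank",
    y = "Course Evaluation Score"
  )
```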
Simple Linear Regression
The fundamental phenomenon suggested by Hamermesh and Parker is that better-looking instructors receive higher evaluation scores. Create a scatterplot with bty_avg on the horizontal axis and score on the vertical axis to see whether this appears to be the case. Include the following labels:
- Title: Association between Attractiveness and Evaluation Score
- x-axis: Average Beauty Rating
- y-axis: Course Evaluation Score
Pipe evals into ggplot() and use geom_point() for a scatterplot. You’ll need both x and y aesthetic mappings.
Since the researchers suspect attractiveness impacts evaluation score, use bty_avg as x and score as y.
evals |>
  ggplot() +
  geom_point(aes(x = ___, y = ___))

evals |>
  ggplot() +
  geom_point(aes(x = bty_avg, y = score)) +
  labs(
    title = "Association between Attractiveness and Evaluation Score",
    x = "Average Beauty Rating",
    y = "Course Evaluation Score"
  )
Before drawing conclusions, compare the number of observations in evals with the number of visible points in your scatterplot. Does something seem off? Use the code block below to replot using geom_jitter() instead of geom_point(). What was misleading about the original scatterplot?
Copy the code from the previous scatterplot and replace geom_point() with geom_jitter(). The geom_jitter() layer adds a small amount of random noise to each point’s position to prevent overplotting.
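To check your work, the jittered version should look something like the sketch below; the labels simply mirror the earlier plot.

```r
evals |>
  ggplot() +
  geom_jitter(aes(x = bty_avg, y = score)) +  # jitter reveals overplotted points
  labs(
    title = "Association between Attractiveness and Evaluation Score",
    x = "Average Beauty Rating",
    y = "Course Evaluation Score"
  )
```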
Now let’s fit a linear model. The code block below builds a simple linear regression model predicting evaluation score from average beauty rating. Run it and use the summary output to answer the questions that follow.
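The fitting code is not reproduced here, but it presumably looks something like the sketch below; the object name m_bty is an assumption, and the lab's code block may use a different name.

```r
# Fit a simple linear regression of evaluation score on average beauty rating.
# The model name m_bty is a placeholder.
m_bty <- lm(score ~ bty_avg, data = evals)
summary(m_bty)
```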
Is average beauty rating a statistically significant predictor of overall evaluation score?
What is the approximate value of the intercept?
What is the slope of the regression model with respect to average beauty rating?
Use the code block below to add the regression line to your jittered scatterplot using geom_abline() with the slope and intercept from the model summary.
Start with the jittered scatterplot you produced earlier and add a geom_abline() layer.
The slope and intercept arguments in geom_abline() are not mapped from data columns, so they go outside of any aes() call.
evals |>
  ggplot() +
  geom_jitter(aes(x = bty_avg, y = score)) +
  geom_abline(slope = ___, intercept = ___) +
  labs(
    title = "Association between Attractiveness and Evaluation Score",
    x = "Average Beauty Rating",
    y = "Course Evaluation Score"
  )

Use the slope and intercept values from the model summary.
evals |>
  ggplot() +
  geom_jitter(aes(x = bty_avg, y = score)) +
  geom_abline(slope = 0.06664, intercept = 3.88034) +
  labs(
    title = "Association between Attractiveness and Evaluation Score",
    x = "Average Beauty Rating",
    y = "Course Evaluation Score"
  )

Write out the equation for the linear model and interpret the slope in context.
From the plot, does average beauty rating seem to be a practically significant predictor of evaluation score?
What does it mean that average beauty rating is a statistically significant but not practically significant predictor of evaluation score?
Multiple Linear Regression
We now expand the model to include additional predictors so we can better understand which instructor and course characteristics, on average, lead to the highest evaluation scores. We’ll start with a large model and then use backward elimination — removing the least significant predictor one at a time — until all remaining predictors are statistically significant.
First, let’s check in with Dr. Çetinkaya-Rundel again for a brief introduction to multiple regression.
Multiple regression generalizes simple regression by allowing more than one predictor:
\[\mathbb{E}[y] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\]
Predictors may be numerical (each coefficient is a slope, describing the expected change in the response per one-unit increase in that predictor, holding all others constant) or categorical (each coefficient is a shift in the intercept relative to a reference level).
Model quality is often measured with adjusted \(R^2\), which captures the proportion of variability in the response explained by the model, while penalizing unnecessary complexity. Values closer to 1 are better.
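In R, adjusted \(R^2\) appears near the bottom of the summary() output for a fitted model; it can also be extracted directly, as in this sketch (the object name fit is a placeholder).

```r
# Adjusted R-squared of a fitted lm object (object name is a placeholder).
fit <- lm(score ~ bty_avg, data = evals)
summary(fit)$adj.r.squared
```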
You’ll notice that rather than including all six beauty rating variables, we include only bty_avg. This is because the individual beauty ratings are highly correlated with one another — they encode essentially the same information. Including highly correlated predictors can cause problems for regression. Look for a full course in regression analysis to learn more.
Run the code block below to build and inspect the full model.
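The full model's code is not reproduced here; based on the reduced model shown later in the lab, it presumably includes every candidate predictor (including cls_profs), so treat the exact formula below, and the object name m_full, as assumptions.

```r
# Full model with all candidate predictors (reconstructed; the lab's block may differ).
m_full <- lm(score ~ ethnicity + gender + language + age + cls_perc_eval +
               cls_students + cls_level + cls_profs + cls_credits +
               bty_avg + pic_color + pic_outfit,
             data = evals)
summary(m_full)
```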
Using \(\alpha = 0.05\), which predictor variables in the full model are not statistically significant? Select all that apply.
How many predictor variables should be removed before re-running the model?
When a predictor is removed, all of the remaining coefficients, standard errors, and \(p\)-values change. A predictor that appeared insignificant in the full model may become significant once another predictor is removed. Backward elimination removes predictors one at a time — always dropping the one with the highest \(p\)-value — to account for this.
Use the code block below to remove the predictor with the highest \(p\)-value from the full model, re-run it, and inspect the summary.
Which predictor had the highest \(p\)-value in the full model? Delete it from the formula in the lm() call.
cls_profs had the highest \(p\)-value in the full model. Remove it from the formula.
m_reduce1 <- lm(score ~ ethnicity + gender + language + age + cls_perc_eval +
cls_students + cls_level + cls_credits +
bty_avg + pic_color + pic_outfit,
data = evals)
summary(m_reduce1)
Continue the backward elimination process in the code block below, removing one predictor at a time until all remaining predictors are statistically significant.
Start with the model from the previous code block. Identify the predictor with the highest \(p\)-value, remove it, re-run, and inspect the output. Repeat until all predictors are significant.
After removing cls_profs, the next candidates to check are cls_students, cls_level, ethnicity, and pic_outfit. Which has the highest \(p\)-value?
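Each step of the process looks like the sketch below; which predictor to drop depends on the summary output at that step, so the one named here (cls_level) is purely a placeholder, not the confirmed answer.

```r
# Drop whichever remaining predictor has the highest p-value and refit.
# cls_level is named here only as a placeholder.
m_reduce2 <- update(m_reduce1, . ~ . - cls_level)
summary(m_reduce2)
```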
Once you’ve identified the final model, use the code block below to build it as m_final and print the summary.
Copy and paste the final version of your model from the previous code block, rename it m_final, and run summary(m_final).
m_final <- lm(score ~ ethnicity + gender + language + age + cls_perc_eval +
cls_credits + bty_avg + pic_color,
data = evals)
summary(m_final)
Based on the final model, select the characteristics associated with a higher predicted evaluation score. Select all that apply.
As with the simple regression model, we’ve identified several statistically significant predictors of course evaluation score — but statistical significance does not imply practical significance.
What we have replicated is the core finding of Hamermesh and Parker: there are meaningful implicit biases embedded in the instrument used to measure teaching quality. The fact that instructor ethnicity, gender, language background, attractiveness, and picture format all have explanatory value in predicting evaluation scores raises serious questions about what these evaluations actually measure.
If you found this lab interesting, linear regression is just the beginning of a powerful family of statistical modeling techniques. Look for full courses in regression analysis, predictive modeling, statistical learning, or machine learning to go deeper.
Submit
If you are part of a course with an instructor who is grading your work on these activities, please copy and submit both of the hashes below using the method your instructor has requested.
The hash below encodes your responses to the multiple choice and checkbox questions in this activity.
Click the button below to generate your exercise submission code. This hash encodes your work on the graded code exercises in this activity.
You must have attempted the graded exercises before clicking — clicking generates a snapshot of your current results. If you have completed the activity over multiple sessions, please go back through and hit the Run Code button on each graded exercise before generating the hash below, to ensure your most recent results are recorded.
Summary
- Simple linear regression models the relationship between a single numerical predictor \(x\) and a numerical response \(y\) using a straight line: \(\mathbb{E}[y] = \beta_0 + \beta_1 x\). The slope \(\beta_1\) describes the expected change in \(y\) per one-unit increase in \(x\).
- Statistical significance and practical significance are not the same thing. A predictor can be statistically significant — meaning the data provide evidence that the true coefficient is nonzero — while explaining very little of the variability in the response and yielding poor predictions.
- Multiple linear regression extends the simple model to include multiple predictors, which may be numerical or categorical. Each numerical predictor’s coefficient is a slope; categorical predictor coefficients shift the intercept.
- Backward elimination is one strategy for building a parsimonious model: start with all predictors, remove the least significant one at a time, and reassess after each removal. Removing only one predictor at a time is important because all coefficients and \(p\)-values change when a predictor is dropped.
- Observational data cannot establish causation. The finding that beauty rating predicts evaluation score does not mean that attractiveness causes higher ratings — it means there is an association. An experiment with random assignment would be needed to establish causality.
This lab completes the first pass through the full introductory statistics curriculum. If you’ve completed all of these activities, then you’ve traveled from data types and sampling all the way through linear regression — building a coherent framework for asking questions with data, quantifying uncertainty, and drawing defensible conclusions. The tools you’ve learned here are the foundation for more advanced work in regression modeling, machine learning, causal inference, and beyond. Well done!
I’m planning to develop similar series of activities for other courses. I hope you’ll check back in to see if I’ve got anything else that can be useful to you.