Topic 16: Inference Practice (Part A)

About

No new content is introduced in this activity. This first of three practice notebooks contains four scaffolded inference problems involving probabilities, hypothesis tests, and confidence intervals for means and proportions. Use these problems to build your independence before moving on to unscaffolded homework problems.

Inference Practice

In this activity, you’ll work through four practice problems. Guiding questions are still provided for each problem to help you identify the necessary steps. Think carefully about each step and how it relates to the larger inference process, since you’ll need to navigate these steps independently on upcoming homework problems.

For convenience, the reference documents are linked here:

For any problems involving hypothesis tests, assume \(\alpha = 0.05\) unless otherwise stated.

No Hints

There are no hints included in this activity. Use the reference documents linked above and your previous notebooks to guide you. These four problems include links to fully worked solutions for emergencies — but try to work through each problem on your own first. The next two practice activities won’t have those supports.


Problem 1

The distribution of the number of eggs laid by a certain species of hen during their breeding period has a mean of 35 eggs and a standard deviation of 18.2. Suppose a group of researchers randomly samples 45 hens of this species, counts the number of eggs laid during their breeding period, and records a sample mean of 37 eggs. Find the probability of observing a sample of 45 hens whose mean number of eggs laid during the breeding period is at least 37. (See worked solution)

Problem 1, Part I

To answer the question as asked, we should:

Problem 1, Part II

How many groups are sampled in this application?

Problem 1, Part III

Do we know the population standard deviation for number of eggs laid?

Problem 1, Part IV

How should the standard error of the sampling distribution be computed?

Problem 1, Part V

Which distribution best describes the sampling distribution?

Use the code block below to compute the probability that a random sample of 45 hens from this species lays an average of at least 37 eggs.

1 - pnorm(37, mean = 35, sd = 18.2/sqrt(45))
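
The same probability can be broken into the steps from Parts III through V. The sketch below is one way to lay it out; the variable names are illustrative and not part of the activity.

```r
# Problem 1 broken into named steps (variable names are illustrative)
mu    <- 35     # population mean number of eggs
sigma <- 18.2   # population standard deviation (known)
n     <- 45     # sample size
x_bar <- 37     # sample mean of interest

se <- sigma / sqrt(n)        # standard error of the sample mean
z  <- (x_bar - mu) / se      # standardized distance from the population mean

# Upper-tail probability P(sample mean >= 37), computed two equivalent ways
p_from_z    <- pnorm(z, lower.tail = FALSE)
p_from_xbar <- 1 - pnorm(x_bar, mean = mu, sd = se)

c(se = se, z = z, p_from_z = p_from_z, p_from_xbar = p_from_xbar)
```

Both approaches return the same value, so standardizing first is a matter of preference rather than a different method.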

Problem 2

A recent study looked at average ages of male and female inmates on Death Row. A random sample of 9 women and 14 men on Death Row resulted in the following:

              Men    Women
\(\bar{x}\)    39     44
\(s\)         4.5    6.2

Conduct a hypothesis test to determine whether there is evidence to suggest that men on Death Row are, on average, younger than women on Death Row. (See worked solution)

Problem 2, Part I

To answer the question as asked, we should:

Problem 2, Part II

What is the level of significance associated with this test?

Problem 2, Part III

Does this hypothesis test involve testing a statement about a mean (\(\mu\)), a proportion (\(p\)), or something else?

Problem 2, Part IV

How many groups are being compared in this test?

Problem 2, Part V

Which of the following are the hypotheses associated with this test?

Problem 2, Part VI

Do we know the population standard deviations (\(\sigma\)) for age in each group?

Problem 2, Part VII

Are the observations in the two groups paired?

Problem 2, Part VIII

Which standard error formula should be used?

Problem 2, Part IX

Which distribution does the test statistic follow?

Use the code block below to compute the point estimate.

39 - 44

Use the code block below to compute the null value.

0

Use the code block below to compute the standard error.

sqrt((4.5^2 / 14) + (6.2^2 / 9))

Use the code block below to compute the test statistic.

(39 - 44) / sqrt((4.5^2 / 14) + (6.2^2 / 9))

Use the code block below to compute the \(p\)-value.

ts <- (39 - 44) / sqrt((4.5^2 / 14) + (6.2^2 / 9))
pt(-abs(ts), df = 8)
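
For reference, the whole test can be written with named quantities. This is a sketch that assumes the degrees of freedom follow the conservative rule min(n1, n2) - 1, which matches the `df = 8` used above but is not stated explicitly in the activity. The lower tail is used because the question asks whether men are younger on average. All variable names are illustrative.

```r
# Problem 2 with named quantities (names are illustrative)
n_men   <- 14; x_bar_men   <- 39; s_men   <- 4.5
n_women <- 9;  x_bar_women <- 44; s_women <- 6.2
alpha   <- 0.05   # default level of significance for this activity

point_estimate <- x_bar_men - x_bar_women
null_value     <- 0
se <- sqrt(s_men^2 / n_men + s_women^2 / n_women)
ts <- (point_estimate - null_value) / se

# Assumed convention: conservative degrees of freedom, min(n1, n2) - 1
df_conservative <- min(n_men, n_women) - 1

# Lower-tail p-value for the one-sided alternative (men younger than women)
p_value <- pt(ts, df = df_conservative)

c(ts = ts, df = df_conservative, p_value = p_value)
p_value < alpha   # TRUE means the null hypothesis is rejected at this alpha
```

Because the test statistic is negative here, `pt(ts, df = 8)` and `pt(-abs(ts), df = 8)` return the same value.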

Problem 2, Part X

What is the result of the test?

Problem 2, Part XI

The result of the test means that:


Problem 3

A recent random sample of 30 customers who switched to a car insurance company boasting that new customers save an average of $534 per year resulted in an average savings of $525 with a standard deviation of $40. Find a 90% confidence interval for the true mean savings of customers switching to this company. (See worked solution)

Problem 3, Part I

To answer the question as asked, we should:

Problem 3, Part II

What is the desired level of confidence?

Problem 3, Part III

Is your confidence interval being built to capture a mean (\(\mu\)), a proportion (\(p\)), or something else?

Problem 3, Part IV

Does the population parameter belong to a single group or is it a comparison of multiple groups?

Problem 3, Part V

Do we know the population standard deviation (\(\sigma\)) for dollars saved?

Problem 3, Part VI

Which standard error formula should be used?

Problem 3, Part VII

Which distribution should be used to identify the critical value?

Use the code block below to compute the critical value.

qt(0.95, df = 29)
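
The `0.95` passed to `qt()` comes from splitting the remaining 10% evenly between the two tails of a 90% interval. A brief sketch of that arithmetic, with illustrative variable names:

```r
# Where the 0.95 comes from for a 90% confidence interval
conf_level <- 0.90
n          <- 30

tail_area  <- (1 - conf_level) / 2   # 0.05 in each tail
quantile_p <- 1 - tail_area          # 0.95, the cumulative probability for qt()

t_star <- qt(quantile_p, df = n - 1)
t_star
```

The same pattern gives `0.975` for a 95% interval and `0.995` for a 99% interval.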

Use the code block below to compute the point estimate.

525

Use the code block below to compute the standard error.

40 / sqrt(30)

Use the code block below to compute the lower bound of the confidence interval.

525 - qt(0.95, df = 29) * (40 / sqrt(30))

Use the code block below to compute the upper bound of the confidence interval.

525 + qt(0.95, df = 29) * (40 / sqrt(30))
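
Both bounds share the same margin of error, so it can help to compute that margin once and reuse it. A minimal sketch, with illustrative variable names:

```r
# Problem 3 interval assembled from named pieces (names are illustrative)
x_bar <- 525   # sample mean savings, in dollars
s     <- 40    # sample standard deviation, in dollars
n     <- 30    # sample size

t_star <- qt(0.95, df = n - 1)   # critical value for 90% confidence
se     <- s / sqrt(n)            # standard error of the sample mean
margin <- t_star * se            # margin of error

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
round(ci, 2)
```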

Problem 3, Part VIII

The correct interpretation of this confidence interval is:


Problem 4

In January 2011, The Marist Poll published a report stating that 66% of adults nationally think licensed drivers should be required to retake their road test once they reach 65 years of age. In that same year, 200 randomly selected citizens of New Hampshire were asked whether they were in favor of an additional road test for drivers at 65 years of age, and 118 responded that they were in favor. Construct a 95% confidence interval for the true proportion of New Hampshirites who are in favor of this proposal. (See worked solution)

Problem 4, Part I

To answer the question as asked, we should:

Problem 4, Part II

What is the desired level of confidence?

Problem 4, Part III

Is your confidence interval being built to capture a mean (\(\mu\)), a proportion (\(p\)), or something else?

Problem 4, Part IV

Does the population parameter belong to a single group or is it a comparison of multiple groups?

Problem 4, Part V

Which standard error formula should be used?

Problem 4, Part VI

Which distribution should be used to identify the critical value?

Use the code block below to compute the critical value.

qnorm(0.975)

Use the code block below to compute the point estimate.

118 / 200

Use the code block below to compute the standard error.

sqrt((118/200) * (1 - 118/200) / 200)

Use the code block below to compute the lower bound of the confidence interval.

p_hat <- 118/200
p_hat - qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / 200)

Use the code block below to compute the upper bound of the confidence interval.

p_hat <- 118/200
p_hat + qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / 200)
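
The two bounds can also be produced together. The sketch below additionally includes a commonly used large-sample check (at least 10 observed successes and 10 observed failures). That check is an assumption on my part and is not stated in the activity; the variable names are illustrative.

```r
# Problem 4 interval assembled from named pieces (names are illustrative)
successes <- 118
n         <- 200
p_hat     <- successes / n

# Assumed large-sample check (not stated in the activity):
# at least 10 observed successes and 10 observed failures
large_sample_ok <- successes >= 10 && (n - successes) >= 10

z_star <- qnorm(0.975)                    # 95% confidence, 2.5% in each tail
se     <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of the sample proportion
ci     <- c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)

large_sample_ok
round(ci, 4)
```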

Problem 4, Part VII

The correct interpretation of this confidence interval is:

Submit

If you are part of a course with an instructor who is grading your work on these activities, please copy and submit both of the hashes below using the method your instructor has requested.

Question Hash

The hash below encodes your responses to the multiple choice questions in this activity.

Exercise Hash

Click the button below to generate your exercise submission code. This hash encodes your work on the graded code exercises in this activity.

You must have attempted the graded exercises before clicking — clicking generates a snapshot of your current results. If you have completed the activity over multiple sessions, please go back through and hit the Run Code button on each graded exercise before generating the hash below, to ensure your most recent results are recorded.

Summary

Main Takeaways

These four problems covered the range of inference tasks you’ll encounter most frequently. A few reminders worth carrying forward:

  • Always identify the task first — probability, confidence interval, hypothesis test, or sample size. The task determines everything that follows.
  • The decision tree is your guide for identifying the correct standard error formula and distribution. The key questions: means or proportions? one group or two? population standard deviation known or unknown?
  • For means, when \(\sigma\) is unknown, use \(s\) in the standard error formula and the \(t\)-distribution for critical values and \(p\)-values. When \(\sigma\) is known, use the normal distribution. For proportions, use the normal distribution.
  • Interpretation matters. A confidence interval captures a population parameter with a stated level of confidence. It is not a probability statement about the parameter, and not a statement about individual observations.
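
The distribution choice described in these reminders can be written as a small function. This is a hypothetical helper for illustration only: `choose_distribution()` is not part of the course materials and covers just the cases that appear in this activity, not the full decision tree.

```r
# Hypothetical helper mirroring the reminders above; it is not the course's
# decision tree, only an illustration of two of its key questions.
choose_distribution <- function(parameter = c("mean", "proportion"),
                                sigma_known = FALSE) {
  parameter <- match.arg(parameter)
  if (parameter == "proportion") {
    return("normal")                   # proportions: normal (as in Problem 4)
  }
  if (sigma_known) "normal" else "t"   # means: depends on whether sigma is known
}

choose_distribution("mean", sigma_known = TRUE)    # setting of Problem 1
choose_distribution("mean", sigma_known = FALSE)   # setting of Problems 2 and 3
choose_distribution("proportion")                  # setting of Problem 4
```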

Looking Ahead

Part B continues with four more mixed inference problems. The scaffolding remains, but the worked solution links are removed — you’re increasingly on your own. Take stock of what is difficult for you and be sure to ask a teacher, mentor, or friend when you need help. Keep the decision tree and general strategy documents close at hand. You’ll find them to be helpful.