Reading Output from Statistical Software

Dr. Gilbert

November 21, 2024

The Highlights

Practitioners often use software to conduct statistical inference

We’ll just focus on navigating and interpreting the software output in these slides

After completing several examples, we’ll return to our long list of practice scenarios

Scenario I: Average Highway Fuel Economy

Scenario: Researchers are interested in understanding the fuel efficiency of cars on the highway. To investigate, they collect data on the highway gas mileage (in miles per gallon) of a random sample of 234 cars. The researchers want to determine whether the average highway gas mileage for all cars exceeds 22.5 mpg. They conduct a test at the \(\alpha = 0.10\) level of significance and the results appear below. Write the hypotheses for the test and determine the results of the test in the context of the scenario.

Single numerical variable
n = 234, y-bar = 23.4402, s = 5.9546
H0: mu = 22.5
HA: mu > 22.5
t = 2.4152, df = 233
p_value = 0.0082

FYI…

…the code to run this inference is

library(statsr)   # provides inference()
library(ggplot2)  # provides the mpg data set

inference(y = hwy, data = mpg, type = "ht", 
          statistic = "mean", method = "theoretical", 
          null = 22.5, alternative = "greater", 
          show_eda_plot = FALSE, show_inf_plot = FALSE)

The inference() function comes from the {statsr} package, so the package must be loaded first, which you can do by running library(statsr)
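As a check on how to read the Scenario I output, the test statistic and p-value can be reproduced from the reported summary statistics alone. The sketch below uses only base R; the variable names are made up for illustration, and small differences in the last decimal place are possible because the inputs are rounded.

```r
# Summary statistics copied from the Scenario I output
n     <- 234
y_bar <- 23.4402
s     <- 5.9546
mu_0  <- 22.5    # null value from H0: mu = 22.5
alpha <- 0.10    # significance level stated in the scenario

se       <- s / sqrt(n)            # standard error of the sample mean
t_stat   <- (y_bar - mu_0) / se    # test statistic
deg_free <- n - 1                  # degrees of freedom

# Upper-tail area because the alternative is HA: mu > 22.5
p_value <- pt(t_stat, df = deg_free, lower.tail = FALSE)

round(c(t = t_stat, df = deg_free, p_value = p_value), 4)
p_value < alpha   # TRUE means reject H0 at the 0.10 level
```

Since the p-value falls below 0.10, the sample provides evidence that the average highway gas mileage for all cars exceeds 22.5 mpg.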

Scenario II: Comparing Transmission Types

Scenario: The researchers from the previous scenario would like to determine whether the proportion of front-wheel drive vehicles with a manual transmission differs from the proportion of rear-wheel drive vehicles with a manual transmission. They use their random sample of 234 cars and conduct this test at the \(\alpha = 0.05\) level of significance. The results of the test appear below. Write the hypotheses for the test and state the conclusion in the context of the scenario.

Response variable: categorical (2 levels, success: manual)
Explanatory variable: categorical (2 levels) 
n_front = 106, p_hat_front = 0.3868
n_rear = 25, p_hat_rear = 0.32
H0: p_front =  p_rear
HA: p_front != p_rear
z = 0.6208
p_value = 0.5347

FYI…

…after a bit of data manipulation, the code to run this inference is

inference(y = trans, x = drv, data = drive_trans, 
          type = "ht", statistic = "proportion", 
          method = "theoretical", null = 0,
          alternative = "twosided", 
          success = "manual",
          show_eda_plot = FALSE, 
          show_inf_plot = FALSE)
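The z statistic and p-value in the Scenario II output can likewise be rebuilt by hand. This is a minimal base R sketch, assuming the success counts can be recovered by rounding n times p-hat (41 front-wheel and 8 rear-wheel manuals) and that the test pools the two samples to estimate the standard error, which is the usual approach when H0 states that the proportions are equal. The variable names are illustrative.

```r
# Sample sizes and sample proportions copied from the Scenario II output
n_front <- 106; p_hat_front <- 0.3868
n_rear  <- 25;  p_hat_rear  <- 0.32

# Assumption: success counts recovered by rounding n * p_hat
x_front <- round(n_front * p_hat_front)
x_rear  <- round(n_rear * p_hat_rear)

# Pooled proportion, used because H0 says the two proportions are equal
p_pool <- (x_front + x_rear) / (n_front + n_rear)
se     <- sqrt(p_pool * (1 - p_pool) * (1 / n_front + 1 / n_rear))

z_stat  <- (x_front / n_front - x_rear / n_rear) / se
p_value <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)   # two-sided alternative

round(c(z = z_stat, p_value = p_value), 4)
```

A p-value this far above 0.05 means the sample does not provide evidence of a difference in the proportion of manual transmissions between front-wheel and rear-wheel drive vehicles.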

Scenario III: City Gas Mileage for Four-Wheel Drive Vehicles

Scenario: The researchers are now interested in estimating the average city gas mileage for four-wheel drive vehicles. Using the four-wheel drive vehicles from their random sample, they construct a 95% confidence interval for the average city gas mileage. The results appear below. Interpret them in the context of the scenario.

Single numerical variable
n = 103, y-bar = 14.3301, s = 2.8745
95% CI: (13.7683 , 14.8919)

FYI…

…after a bit of data manipulation, the code to run this inference is

inference(y = cty, data = four_wd, type = "ci", 
          statistic = "mean", method = "theoretical", 
          conf_level = 0.95, show_eda_plot = FALSE, 
          show_inf_plot = FALSE)
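The Scenario III interval follows the familiar pattern of point estimate plus or minus critical value times standard error. A short base R sketch, using the summary statistics from the output and illustrative variable names, shows where the two endpoints come from.

```r
# Summary statistics copied from the Scenario III output
n          <- 103
y_bar      <- 14.3301
s          <- 2.8745
conf_level <- 0.95

se     <- s / sqrt(n)                                   # standard error of the mean
t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)      # critical value, df = 102
ci     <- y_bar + c(-1, 1) * t_star * se                # lower and upper bounds

round(ci, 4)
```

The result should agree with the reported interval up to rounding of the inputs.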

Scenario IV: Difference in Highway and City Fuel Economy

Scenario: As one final investigation, the researchers would like to estimate the difference between average highway gas mileage and average city gas mileage. They build a 90% confidence interval for the difference in average gas mileage and the results appear below. Interpret the results in context.

Response variable: numerical, Explanatory variable: categorical (2 levels)
n_hwy = 234, y_bar_hwy = 23.4402, s_hwy = 5.9546
n_cty = 234, y_bar_cty = 16.859, s_cty = 4.2559
90% CI (hwy - cty): (5.791 , 7.3714)

FYI…

…after a bit of data manipulation, the code to run this inference is

inference(y = mpg, x = environment, 
          data = cty_hwy_mpg, type = "ci", 
          statistic = "mean", method = "theoretical", 
          conf_level = 0.90, show_eda_plot = FALSE, 
          show_inf_plot = FALSE)
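The Scenario IV interval can be sketched the same way from the two sets of summary statistics. The version below assumes the conservative degrees of freedom, min(n_hwy, n_cty) - 1 = 233, which appears consistent with the reported bounds; software that uses a different degrees-of-freedom rule would give slightly different endpoints. Variable names are illustrative.

```r
# Summary statistics copied from the Scenario IV output
n_hwy <- 234; y_bar_hwy <- 23.4402; s_hwy <- 5.9546
n_cty <- 234; y_bar_cty <- 16.859;  s_cty <- 4.2559
conf_level <- 0.90

diff_means <- y_bar_hwy - y_bar_cty                      # point estimate (hwy - cty)
se         <- sqrt(s_hwy^2 / n_hwy + s_cty^2 / n_cty)    # SE of the difference

# Assumption: conservative degrees of freedom, the smaller sample size minus 1
deg_free <- min(n_hwy, n_cty) - 1
t_star   <- qt(1 - (1 - conf_level) / 2, df = deg_free)

ci <- diff_means + c(-1, 1) * t_star * se
round(ci, 4)
```

Because the whole interval lies above zero, the data suggest that average highway gas mileage is higher than average city gas mileage, by roughly 5.8 to 7.4 mpg.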

Next Up…

Let’s work through more examples from our list of practice scenarios