November 11, 2024
Cross-validation is a procedure that we can use to obtain more reliable performance assessments for our models
In \(k\)-fold cross validation, we create \(k\) models and obtain \(k\) performance estimates – the average of these performance estimates can be referred to as the cross-validation performance estimate
Our cross-validation procedure does NOT result in a fitted model, but results in the cross-validation performance estimate and an estimated standard error for that model performance
Cross-validation makes our choices and inferences less susceptible to random chance (the randomly chosen training and test observations)
Our approach to linear regression so far has perhaps led us to the intuition that we should start with a large model and then reduce it down to include only statistically significant terms
This approach, called backward elimination, is commonly utilized
There is also an opposite approach, called forward selection
We’ll switch to using the ames dataset for this discussion
That dataset contains features and selling prices for 2,930 homes sold in Ames, Iowa between 2006 and 2010
Open your MAT300 project in RStudio and create a new Quarto document
Use a setup chunk to load the {tidyverse} and {tidymodels} packages (a sketch of a possible setup chunk follows this list)
The ames data set is contained in the {modeldata} package, which is loaded with {tidymodels} – take a preliminary look at this dataset
Split your data into training and test sets
Create five or ten cross-validation folds
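Here is one possible version of that setup chunk – a minimal sketch assuming only that {tidyverse} and {tidymodels} are installed; the splitting and fold-creation code appears a bit later in these notes.

# load packages; {modeldata}, which contains the ames data, is attached with {tidymodels}
library(tidyverse)
library(tidymodels)

# take a preliminary look at the ames data
ames %>%
  glimpse()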
Consider the model (or us, as modelers) as a shopper in a market that sells predictors
(Backward elimination) Our model begins by putting every item in the store into its shopping cart, and then puts back the items it doesn’t need
(Forward selection) Our model begins with an empty cart and wanders the store, finding the items it needs most to add to its cart one-by-one
At first, these approaches may seem reasonable, if inefficient
Statistical Standpoint: We’re evaluating lots of \(t\)-tests in determining statistical significance of predictors, which inflates the chance that an unimportant predictor looks significant purely by accident (a multiple comparisons problem)
Model-Fit Perspective: The more predictors a model has access to, the more flexible it is, the better it will fit the training data, and the more likely it is to become overfit
By allowing a model to “shop” freely for its predictors, we are encouraging our model to become overfit
Giving our model a “budget” to spend on its shopping trip would force it to be more selective about the predictors it chooses and would lower the likelihood that it becomes overfit
We’ve hidden the math that fits our models up until this point, but it’s worth a look now
\[\mathbb{E}\left[y\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\]
The optimization procedure we’ve been using to find the \(\beta\)-coefficients is called Ordinary Least Squares
Ordinary Least Squares: Find \(\beta_0, \beta_1, \cdots, \beta_k\) in order to minimize
\[\sum_{i = 1}^{n}{\left(y_{\text{obs}_i} - y_{\text{pred}_i}\right)^2}\]
Writing each prediction out in terms of the model, this objective becomes
\[\sum_{i = 1}^{n}{\left(y_{\text{obs}_i} - \left(\beta_0 + \sum_{j = 1}^{k}{\beta_j x_{ij}}\right)\right)^2}\]
This is the procedure that allows our model to shop freely for predictors
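For reference, here is a minimal sketch of the kind of specification we’ve been using so far, which fits via OLS using the "lm" engine (the name ols_spec is just illustrative).

# ordinary linear regression, fit by OLS with the "lm" engine
ols_spec <- linear_reg() %>%
  set_engine("lm")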
Regularization refers to techniques designed to constrain models and reduce the likelihood of overfitting
For linear regression, there are two commonly used methods
Each of these methods makes an adjustment to the Ordinary Least Squares procedure we just saw
\[\mathbb{E}\left[y\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\]
Ridge Regression: Find \(\beta_0, \beta_1, \cdots, \beta_k\) in order to minimize
\[\sum_{i = 1}^{n}{\left(y_{\text{obs}_i} - \left(\beta_0 + \sum_{j = 1}^{k}{\beta_j x_{ij}}\right)\right)^2}\]
subject to the constraint
\[\sum_{j = 1}^{k}{\beta_j^2} \leq C\]
Note: \(C\) is a constant that can be thought of as our budget for coefficients – the squared coefficients, taken together, cannot exceed it
The Result: Ridge regression encourages very small coefficients on unimportant predictors
\[\mathbb{E}\left[y\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\]
LASSO: Find \(\beta_0, \beta_1, \cdots, \beta_k\) in order to minimize
\[\sum_{i = 1}^{n}{\left(y_{\text{obs}_i} - \left(\beta_0 + \sum_{j = 1}^{k}{\beta_j x_{ij}}\right)\right)^2}\]
subject to the constraint
\[\sum_{j = 1}^{k}{\left|\beta_j\right|} \leq C\]
Note: As with Ridge Regression, \(C\) is a constant that can be thought of as our budget for coefficients
The Result: The LASSO pushes coefficients of unimportant predictors to \(0\)
The choice between Ridge Regression and the LASSO depends on your goals
The LASSO is better at variable selection because it sends coefficients on unimportant predictors to exactly \(0\)
The LASSO won’t always outperform Ridge Regression, though, so you might try both and see which is better for your use case
If a predictor \(x_i\) is measured on a much larger scale than the response \(y\), then its coefficient will be numerically small, and that predictor becomes artificially cheap to include in a model
Similarly, if a predictor \(x_j\) is measured on a much smaller scale than the response, then its coefficient will be numerically large, and that predictor becomes artificially expensive to include in a model
We don’t want any of our predictors to be artificially advantaged or disadvantaged, so we must ensure that all of our numerical predictors are on the same scale as one another
Min-Max Scaling: Projects each numerical predictor down to the interval \(\left[0, 1\right]\) via \(\displaystyle{\frac{x - \min\left(x\right)}{\max\left(x\right) - \min\left(x\right)}}\) – use step_range() in a recipe to utilize min-max scaling
Standard Scaling: Converts observed measurements into standard deviations (\(z\)-scores) via \(\displaystyle{\frac{x - \text{mean}\left(x\right)}{\text{sd}\left(x\right)}}\) – use step_normalize() in a recipe to utilize standard scaling
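For example, a hypothetical recipe fragment applying standard scaling to every numeric predictor might look like the following, assuming a training set named ames_train from the warm-up split; swap in step_range() for min-max scaling.

# scale all numeric predictors; step_range() would min-max scale instead
scaled_rec <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_normalize(all_numeric_predictors())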
The {tidymodels} framework is great because it provides us with standardized structure for defining and fitting models
Ridge and the LASSO are still linear regression models, but they’re no longer fit using OLS
This puts them in a class of models called Generalized Linear Models . . .
We’ll need to change our fitting engine from "lm" to something that can fit these GLMs – we’ll use "glmnet"
Required parameters for "glmnet":
mixture can be set to any value between \(0\) and \(1\) – mixture = 0 results in Ridge Regression, while mixture = 1 results in the LASSO
penalty is the amount of regularization being applied – for now, we’ll just pick a value for penalty and see how it performs
We can experiment with several if we like, and then choose the one that results in the best performance
We’ll talk about a better strategy next time
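As an aside connecting the two pictures: the constrained problems from earlier are often rewritten in an equivalent “penalized” form, in which a multiplier \(\lambda\) plays the role of the budget (a larger \(\lambda\) corresponds to a smaller budget \(C\)), and penalty corresponds, roughly, to this \(\lambda\). A sketch of that form for the LASSO is
\[\sum_{i = 1}^{n}{\left(y_{\text{obs}_i} - \left(\beta_0 + \sum_{j = 1}^{k}{\beta_j x_{ij}}\right)\right)^2} + \lambda \sum_{j = 1}^{k}{\left|\beta_j\right|}\]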
The "glmnet" engine requires that no missing values are included in any fold – we can satisfy this requirement with a step_impute_*() function added to a recipe
We’ll use the ames housing data set that we’ve seen from time to time this semester
We’ll read it in, remove the rows with missing Sale_Price (our response), split the data into training and test sets, and create our cross-validation folds
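A minimal sketch of that setup follows, assuming the ames data is already available (for example, via {modeldata}); the object names match those used in the code below, and the seed is arbitrary.

# keep only rows with a recorded response
ames_known_price <- ames %>%
  filter(!is.na(Sale_Price))

# training/test split and five cross-validation folds
set.seed(123)
ames_split <- initial_split(ames_known_price, prop = 0.8)
ames_train <- training(ames_split)
ames_test <- testing(ames_split)
ames_folds <- vfold_cv(ames_train, v = 5)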
# model specification: mixture = 0 gives Ridge Regression, penalty sets the amount of regularization
ridge_reg_spec <- linear_reg(mixture = 0, penalty = 1e4) %>%
  set_engine("glmnet")

# recipe: impute missing values, scale numeric predictors, pool rare categories, then create dummy variables
ridge_reg_rec <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_impute_knn(all_predictors()) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_other(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors())

# bundle the model and recipe into a workflow
ridge_reg_wf <- workflow() %>%
  add_model(ridge_reg_spec) %>%
  add_recipe(ridge_reg_rec)

# cross-validate the workflow and summarize performance across the folds
ridge_reg_results <- ridge_reg_wf %>%
  fit_resamples(ames_folds)

ridge_reg_results %>%
  collect_metrics()
.metric | .estimator | mean | n | std_err | .config |
---|---|---|---|---|---|
rmse | standard | 33512.4783290 | 5 | 3528.8858182 | Preprocessor1_Model1 |
rsq | standard | 0.8285515 | 5 | 0.0232516 | Preprocessor1_Model1 |
While we wouldn’t generally fit the Ridge Regression model at this time, you can see how to do that and examine the estimated model below.
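A sketch of how that can be done with the workflow defined above, using the extract_fit_parsnip() and tidy() helpers from the tidymodels ecosystem; the tidied output should have the term, estimate, and penalty columns shown in the table.

# fit the ridge workflow on the full training set
ridge_reg_fit <- ridge_reg_wf %>%
  fit(ames_train)

# inspect the estimated coefficients
ridge_reg_fit %>%
  extract_fit_parsnip() %>%
  tidy()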
term | estimate | penalty |
---|---|---|
(Intercept) | 170590.91440 | 10000 |
Lot_Frontage | 1458.08999 | 10000 |
Lot_Area | 1919.73420 | 10000 |
Year_Built | 4443.44715 | 10000 |
Year_Remod_Add | 5652.52473 | 10000 |
Mas_Vnr_Area | 6469.12318 | 10000 |
BsmtFin_SF_1 | -109.24172 | 10000 |
BsmtFin_SF_2 | 619.57697 | 10000 |
Bsmt_Unf_SF | -1897.35902 | 10000 |
Total_Bsmt_SF | 7471.51817 | 10000 |
First_Flr_SF | 8231.08414 | 10000 |
Second_Flr_SF | 8576.05018 | 10000 |
Gr_Liv_Area | 13259.91733 | 10000 |
Bsmt_Full_Bath | 2910.80533 | 10000 |
Bsmt_Half_Bath | -872.28855 | 10000 |
Full_Bath | 4132.02383 | 10000 |
Half_Bath | 2520.03561 | 10000 |
Bedroom_AbvGr | -2600.41177 | 10000 |
Kitchen_AbvGr | -3311.98370 | 10000 |
TotRms_AbvGrd | 4547.30499 | 10000 |
Fireplaces | 4625.12646 | 10000 |
Garage_Cars | 5750.22932 | 10000 |
Garage_Area | 4248.50579 | 10000 |
Wood_Deck_SF | 1899.00078 | 10000 |
Open_Porch_SF | 224.51413 | 10000 |
Enclosed_Porch | 1142.26429 | 10000 |
Three_season_porch | 487.29049 | 10000 |
Screen_Porch | 3569.00946 | 10000 |
Pool_Area | -1114.39753 | 10000 |
Misc_Val | -5085.90458 | 10000 |
Mo_Sold | -203.12359 | 10000 |
Year_Sold | -835.49586 | 10000 |
Longitude | 650.40625 | 10000 |
Latitude | 5328.30675 | 10000 |
MS_SubClass_One_Story_1945_and_Older | -4126.27103 | 10000 |
MS_SubClass_One_and_Half_Story_Finished_All_Ages | -1861.67028 | 10000 |
MS_SubClass_Two_Story_1946_and_Newer | -1514.91540 | 10000 |
MS_SubClass_One_Story_PUD_1946_and_Newer | -4117.65295 | 10000 |
MS_SubClass_other | -1733.26832 | 10000 |
MS_Zoning_Residential_Medium_Density | -5316.31128 | 10000 |
MS_Zoning_other | -5176.60141 | 10000 |
Street_other | -13837.22636 | 10000 |
Alley_other | -1461.83621 | 10000 |
Lot_Shape_Slightly_Irregular | 3700.17083 | 10000 |
Lot_Shape_other | 961.05230 | 10000 |
Land_Contour_other | 3230.49668 | 10000 |
Utilities_other | -8536.41564 | 10000 |
Lot_Config_CulDSac | 11616.63150 | 10000 |
Lot_Config_Inside | 357.65609 | 10000 |
Lot_Config_other | -5147.76184 | 10000 |
Land_Slope_other | 3944.23456 | 10000 |
Neighborhood_College_Creek | 1894.52790 | 10000 |
Neighborhood_Old_Town | -1468.07869 | 10000 |
Neighborhood_Edwards | -7179.02156 | 10000 |
Neighborhood_Somerset | 12865.77564 | 10000 |
Neighborhood_Northridge_Heights | 35341.47415 | 10000 |
Neighborhood_Gilbert | -19167.19093 | 10000 |
Neighborhood_other | 8049.59728 | 10000 |
Condition_1_Norm | 10577.33452 | 10000 |
Condition_1_other | 2555.17042 | 10000 |
Condition_2_other | 7230.83188 | 10000 |
Bldg_Type_TwnhsE | -11776.64761 | 10000 |
Bldg_Type_other | -15016.92568 | 10000 |
House_Style_One_Story | 2517.46869 | 10000 |
House_Style_Two_Story | -1834.73368 | 10000 |
House_Style_other | -3583.23800 | 10000 |
Overall_Cond_Above_Average | 387.29301 | 10000 |
Overall_Cond_Good | 5453.63364 | 10000 |
Overall_Cond_other | -198.67916 | 10000 |
Roof_Style_Hip | 9826.95184 | 10000 |
Roof_Style_other | -6285.85166 | 10000 |
Roof_Matl_other | 2956.86795 | 10000 |
Exterior_1st_MetalSd | 2401.59277 | 10000 |
Exterior_1st_Plywood | -423.31025 | 10000 |
Exterior_1st_VinylSd | 430.87774 | 10000 |
Exterior_1st_Wd.Sdng | 364.15513 | 10000 |
Exterior_1st_other | 9216.05135 | 10000 |
Exterior_2nd_MetalSd | 2810.59106 | 10000 |
Exterior_2nd_Plywood | -4782.09685 | 10000 |
Exterior_2nd_VinylSd | 1979.06774 | 10000 |
Exterior_2nd_Wd.Sdng | 3586.14793 | 10000 |
Exterior_2nd_other | 1975.61474 | 10000 |
Mas_Vnr_Type_None | 7067.96660 | 10000 |
Mas_Vnr_Type_Stone | 6432.96754 | 10000 |
Mas_Vnr_Type_other | -12999.85758 | 10000 |
Exter_Cond_Typical | -1223.58374 | 10000 |
Exter_Cond_other | -8374.06611 | 10000 |
Foundation_CBlock | -3752.43518 | 10000 |
Foundation_PConc | 4777.91201 | 10000 |
Foundation_other | 12.67181 | 10000 |
Bsmt_Cond_other | -1096.68254 | 10000 |
Bsmt_Exposure_Gd | 18454.01931 | 10000 |
Bsmt_Exposure_Mn | -6292.37350 | 10000 |
Bsmt_Exposure_No | -9993.27851 | 10000 |
Bsmt_Exposure_other | -3349.21932 | 10000 |
BsmtFin_Type_1_BLQ | -979.66229 | 10000 |
BsmtFin_Type_1_GLQ | 9866.03456 | 10000 |
BsmtFin_Type_1_LwQ | -4371.15616 | 10000 |
BsmtFin_Type_1_Rec | -2205.86621 | 10000 |
BsmtFin_Type_1_Unf | -2137.50633 | 10000 |
BsmtFin_Type_1_other | -54.65570 | 10000 |
BsmtFin_Type_2_other | -1995.42781 | 10000 |
Heating_other | -1333.94062 | 10000 |
Heating_QC_Good | -4130.03350 | 10000 |
Heating_QC_Typical | -7280.75514 | 10000 |
Heating_QC_other | -10421.28767 | 10000 |
Central_Air_Y | 305.63110 | 10000 |
Electrical_SBrkr | -843.81162 | 10000 |
Electrical_other | 145.95150 | 10000 |
Functional_other | -16837.88549 | 10000 |
Garage_Type_BuiltIn | 2912.19245 | 10000 |
Garage_Type_Detchd | -1782.91691 | 10000 |
Garage_Type_No_Garage | 1405.71932 | 10000 |
Garage_Type_other | -13424.85604 | 10000 |
Garage_Finish_No_Garage | 2074.34345 | 10000 |
Garage_Finish_RFn | -7304.15017 | 10000 |
Garage_Finish_Unf | -3707.86157 | 10000 |
Garage_Cond_Typical | 587.45190 | 10000 |
Garage_Cond_other | -4325.25632 | 10000 |
Paved_Drive_Paved | 1759.89335 | 10000 |
Paved_Drive_other | 3091.17128 | 10000 |
Pool_QC_other | 12835.93973 | 10000 |
Fence_No_Fence | -1541.73821 | 10000 |
Fence_other | -1034.11743 | 10000 |
Misc_Feature_other | 6558.20291 | 10000 |
Sale_Type_WD. | -6374.52035 | 10000 |
Sale_Type_other | -7784.24102 | 10000 |
Sale_Condition_Normal | 6003.00979 | 10000 |
Sale_Condition_Partial | 13836.66751 | 10000 |
Sale_Condition_other | 3630.29900 | 10000 |
# the LASSO uses the same workflow structure as Ridge – only mixture changes (1 for the LASSO)
lasso_reg_spec <- linear_reg(mixture = 1, penalty = 1e4) %>%
  set_engine("glmnet")

lasso_reg_rec <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_impute_knn(all_predictors()) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_other(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors())

lasso_reg_wf <- workflow() %>%
  add_model(lasso_reg_spec) %>%
  add_recipe(lasso_reg_rec)

lasso_reg_results <- lasso_reg_wf %>%
  fit_resamples(ames_folds)

lasso_reg_results %>%
  collect_metrics()
.metric | .estimator | mean | n | std_err | .config |
---|---|---|---|---|---|
rmse | standard | 40537.9245852 | 5 | 3303.1924456 | Preprocessor1_Model1 |
rsq | standard | 0.7821723 | 5 | 0.0229465 | Preprocessor1_Model1 |
Again, we wouldn’t generally fit the LASSO model at this time; however, you can see how to do that and examine the estimated model below.
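As with the Ridge model, a sketch of fitting the LASSO workflow and tidying its coefficients:

# fit the LASSO workflow on the training set and inspect the coefficients
lasso_reg_fit <- lasso_reg_wf %>%
  fit(ames_train)

lasso_reg_fit %>%
  extract_fit_parsnip() %>%
  tidy()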
term | estimate | penalty |
---|---|---|
(Intercept) | 174537.449 | 10000 |
Lot_Frontage | 0.000 | 10000 |
Lot_Area | 0.000 | 10000 |
Year_Built | 7478.097 | 10000 |
Year_Remod_Add | 6606.480 | 10000 |
Mas_Vnr_Area | 3398.018 | 10000 |
BsmtFin_SF_1 | 0.000 | 10000 |
BsmtFin_SF_2 | 0.000 | 10000 |
Bsmt_Unf_SF | 0.000 | 10000 |
Total_Bsmt_SF | 12271.407 | 10000 |
First_Flr_SF | 0.000 | 10000 |
Second_Flr_SF | 0.000 | 10000 |
Gr_Liv_Area | 26200.798 | 10000 |
Bsmt_Full_Bath | 0.000 | 10000 |
Bsmt_Half_Bath | 0.000 | 10000 |
Full_Bath | 0.000 | 10000 |
Half_Bath | 0.000 | 10000 |
Bedroom_AbvGr | 0.000 | 10000 |
Kitchen_AbvGr | 0.000 | 10000 |
TotRms_AbvGrd | 0.000 | 10000 |
Fireplaces | 3226.727 | 10000 |
Garage_Cars | 6992.504 | 10000 |
Garage_Area | 4368.112 | 10000 |
Wood_Deck_SF | 0.000 | 10000 |
Open_Porch_SF | 0.000 | 10000 |
Enclosed_Porch | 0.000 | 10000 |
Three_season_porch | 0.000 | 10000 |
Screen_Porch | 0.000 | 10000 |
Pool_Area | 0.000 | 10000 |
Misc_Val | 0.000 | 10000 |
Mo_Sold | 0.000 | 10000 |
Year_Sold | 0.000 | 10000 |
Longitude | 0.000 | 10000 |
Latitude | 0.000 | 10000 |
MS_SubClass_One_Story_1945_and_Older | 0.000 | 10000 |
MS_SubClass_One_and_Half_Story_Finished_All_Ages | 0.000 | 10000 |
MS_SubClass_Two_Story_1946_and_Newer | 0.000 | 10000 |
MS_SubClass_One_Story_PUD_1946_and_Newer | 0.000 | 10000 |
MS_SubClass_other | 0.000 | 10000 |
MS_Zoning_Residential_Medium_Density | 0.000 | 10000 |
MS_Zoning_other | 0.000 | 10000 |
Street_other | 0.000 | 10000 |
Alley_other | 0.000 | 10000 |
Lot_Shape_Slightly_Irregular | 0.000 | 10000 |
Lot_Shape_other | 0.000 | 10000 |
Land_Contour_other | 0.000 | 10000 |
Utilities_other | 0.000 | 10000 |
Lot_Config_CulDSac | 0.000 | 10000 |
Lot_Config_Inside | 0.000 | 10000 |
Lot_Config_other | 0.000 | 10000 |
Land_Slope_other | 0.000 | 10000 |
Neighborhood_College_Creek | 0.000 | 10000 |
Neighborhood_Old_Town | 0.000 | 10000 |
Neighborhood_Edwards | 0.000 | 10000 |
Neighborhood_Somerset | 0.000 | 10000 |
Neighborhood_Northridge_Heights | 22598.017 | 10000 |
Neighborhood_Gilbert | 0.000 | 10000 |
Neighborhood_other | 0.000 | 10000 |
Condition_1_Norm | 0.000 | 10000 |
Condition_1_other | 0.000 | 10000 |
Condition_2_other | 0.000 | 10000 |
Bldg_Type_TwnhsE | 0.000 | 10000 |
Bldg_Type_other | 0.000 | 10000 |
House_Style_One_Story | 0.000 | 10000 |
House_Style_Two_Story | 0.000 | 10000 |
House_Style_other | 0.000 | 10000 |
Overall_Cond_Above_Average | 0.000 | 10000 |
Overall_Cond_Good | 0.000 | 10000 |
Overall_Cond_other | 0.000 | 10000 |
Roof_Style_Hip | 0.000 | 10000 |
Roof_Style_other | 0.000 | 10000 |
Roof_Matl_other | 0.000 | 10000 |
Exterior_1st_MetalSd | 0.000 | 10000 |
Exterior_1st_Plywood | 0.000 | 10000 |
Exterior_1st_VinylSd | 0.000 | 10000 |
Exterior_1st_Wd.Sdng | 0.000 | 10000 |
Exterior_1st_other | 0.000 | 10000 |
Exterior_2nd_MetalSd | 0.000 | 10000 |
Exterior_2nd_Plywood | 0.000 | 10000 |
Exterior_2nd_VinylSd | 0.000 | 10000 |
Exterior_2nd_Wd.Sdng | 0.000 | 10000 |
Exterior_2nd_other | 0.000 | 10000 |
Mas_Vnr_Type_None | 0.000 | 10000 |
Mas_Vnr_Type_Stone | 0.000 | 10000 |
Mas_Vnr_Type_other | 0.000 | 10000 |
Exter_Cond_Typical | 0.000 | 10000 |
Exter_Cond_other | 0.000 | 10000 |
Foundation_CBlock | 0.000 | 10000 |
Foundation_PConc | 3534.049 | 10000 |
Foundation_other | 0.000 | 10000 |
Bsmt_Cond_other | 0.000 | 10000 |
Bsmt_Exposure_Gd | 10297.300 | 10000 |
Bsmt_Exposure_Mn | 0.000 | 10000 |
Bsmt_Exposure_No | 0.000 | 10000 |
Bsmt_Exposure_other | 0.000 | 10000 |
BsmtFin_Type_1_BLQ | 0.000 | 10000 |
BsmtFin_Type_1_GLQ | 7603.684 | 10000 |
BsmtFin_Type_1_LwQ | 0.000 | 10000 |
BsmtFin_Type_1_Rec | 0.000 | 10000 |
BsmtFin_Type_1_Unf | 0.000 | 10000 |
BsmtFin_Type_1_other | 0.000 | 10000 |
BsmtFin_Type_2_other | 0.000 | 10000 |
Heating_other | 0.000 | 10000 |
Heating_QC_Good | 0.000 | 10000 |
Heating_QC_Typical | 0.000 | 10000 |
Heating_QC_other | 0.000 | 10000 |
Central_Air_Y | 0.000 | 10000 |
Electrical_SBrkr | 0.000 | 10000 |
Electrical_other | 0.000 | 10000 |
Functional_other | 0.000 | 10000 |
Garage_Type_BuiltIn | 0.000 | 10000 |
Garage_Type_Detchd | 0.000 | 10000 |
Garage_Type_No_Garage | 0.000 | 10000 |
Garage_Type_other | 0.000 | 10000 |
Garage_Finish_No_Garage | 0.000 | 10000 |
Garage_Finish_RFn | 0.000 | 10000 |
Garage_Finish_Unf | 0.000 | 10000 |
Garage_Cond_Typical | 0.000 | 10000 |
Garage_Cond_other | 0.000 | 10000 |
Paved_Drive_Paved | 0.000 | 10000 |
Paved_Drive_other | 0.000 | 10000 |
Pool_QC_other | 0.000 | 10000 |
Fence_No_Fence | 0.000 | 10000 |
Fence_other | 0.000 | 10000 |
Misc_Feature_other | 0.000 | 10000 |
Sale_Type_WD. | 0.000 | 10000 |
Sale_Type_other | 0.000 | 10000 |
Sale_Condition_Normal | 0.000 | 10000 |
Sale_Condition_Partial | 0.000 | 10000 |
Sale_Condition_other | 0.000 | 10000 |
Here are only the predictors with non-zero coefficients
term | estimate | penalty |
---|---|---|
(Intercept) | 174537.449 | 10000 |
Year_Built | 7478.097 | 10000 |
Year_Remod_Add | 6606.480 | 10000 |
Mas_Vnr_Area | 3398.018 | 10000 |
Total_Bsmt_SF | 12271.407 | 10000 |
Gr_Liv_Area | 26200.798 | 10000 |
Fireplaces | 3226.727 | 10000 |
Garage_Cars | 6992.504 | 10000 |
Garage_Area | 4368.112 | 10000 |
Neighborhood_Northridge_Heights | 22598.017 | 10000 |
Foundation_PConc | 3534.049 | 10000 |
Bsmt_Exposure_Gd | 10297.300 | 10000 |
BsmtFin_Type_1_GLQ | 7603.684 | 10000 |
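That filtered view can be produced by dropping the zero-coefficient rows from the tidied output – a sketch, continuing from the fitted LASSO workflow above:

# keep only the predictors the LASSO retained
lasso_reg_fit %>%
  extract_fit_parsnip() %>%
  tidy() %>%
  filter(estimate != 0)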
The more predictors we include in a model, the more flexible that model is
We can use regularization methods to constrain our models and make overfitting less likely
Two techniques commonly used with linear regression models are Ridge Regression and the LASSO
These methods alter the optimization problem that obtains the estimated \(\beta\)-coefficients for our model
Ridge Regression attaches very small coefficients to uninformative predictors, while the LASSO attaches coefficients of \(0\) to them
Both Ridge Regression and the LASSO require all numerical predictors to be scaled
We can fit/cross-validate these models in nearly the same way that we have been working with ordinary linear regression models
We use set_engine("glmnet") rather than set_engine("lm") for Ridge and the LASSO
We set mixture = 0 for Ridge Regression and mixture = 1 for the LASSO
We set the penalty parameter, which determines the amount of regularization (constraint) applied to the model
Other Classes of Regression Model