Linear Model Selection and Regularization
This programming assignment will use the Tidy Models platform. It will take a look at regularization models and hyperparameter tuning. These models contain a regularization term. This assignment will use parsnip for model fitting and recipes and workflows to perform the transformations, and tune and dials to tune the hyperparameters of the model.
You will be using the Hitters data set from the ISLR package. You wish to predict the baseball players' Salary based on several different characteristics which are included in the data set.
Since you wish to predict Salary, you need to remove any missing data from that column. Otherwise, you won't be able to run the models.
Set output as Hitters
library(tidymodels)
library(ISLR)
# Your code here
# Hitters
Hitters <- ISLR::Hitters
Hitters <- Hitters %>% drop_na(Salary)
# your code here
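As a quick sanity check (not part of the graded answer, but using only the standard ISLR Hitters data), you can confirm the rows with missing Salary were dropped:

```r
# Confirm no missing salaries remain after drop_na()
sum(is.na(Hitters$Salary))           # 0 after the drop
# ISLR::Hitters has 322 rows; 59 have a missing Salary, leaving 263
nrow(ISLR::Hitters) - nrow(Hitters)  # 59 rows removed
dim(Hitters)                         # 263 rows, 20 columns
```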
── Attaching packages ──────────────────────────── tidymodels ──
✔ broom        ✔ recipes
✔ dials        ✔ rsample
✔ dplyr        ✔ tibble
✔ ggplot2      ✔ tidyr
✔ infer        ✔ tune
✔ modeldata    ✔ workflows
✔ parsnip      ✔ workflowsets
✔ purrr        ✔ yardstick
── Conflicts ─────────────────────── tidymodels_conflicts() ──
✖ purrr::discard() masks scales::discard()
✖ dplyr::filter()  masks stats::filter()
✖ dplyr::lag()     masks stats::lag()
✖ recipes::step()  masks stats::step()
Use suppressPackageStartupMessages() to eliminate package startup messages.
# Hidden Tests
You will use the glmnet package to perform ridge regression. parsnip does not have a dedicated function to create a ridge regression model specification. You need to use linear_reg() and set mixture = 0 to specify a ridge model. The mixture argument specifies the amount of different types of regularization: mixture = 0 specifies only ridge regularization and mixture = 1 specifies only lasso regularization.
Setting mixture to a value between 0 and 1 lets us use both. When using the glmnet engine you also need to set a penalty to be able to fit the model. You will set this value to 0 for now; it is not the best value, but you will look at how to select the best value in a little bit.
ridge_spec <- linear_reg(mixture = 0, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet")
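For contrast (a sketch only; the assignment itself asks for ridge here), the same pipeline with a different mixture gives a lasso or elastic net specification:

```r
# Lasso: mixture = 1 keeps only the L1 penalty
lasso_spec <- linear_reg(mixture = 1, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

# Elastic net: a mixture between 0 and 1 blends the two penalties
enet_spec <- linear_reg(mixture = 0.5, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet")
```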
Once the specification is created you can fit it to your data. You will use all the predictors. Use the fit() function here.
ridge_fit <- fit(ridge_spec, Salary ~ ., data = Hitters)
The glmnet package will fit the model for all values of penalty at once, so you can now see what the parameter estimates for the model are now that you have penalty = 0. You can use the tidy() function to accomplish this specific task.
tidy(ridge_fit)
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

Loaded glmnet
# A tibble: 20 × 3
   term        estimate penalty
   (Intercept)   …           0
   AtBat         …           0
   Hits          …           0
   HmRun         …           0
   Runs          …           0
   RBI           …           0
   Walks         …           0
   Years         …           0
   CAtBat        …           0
   CHits         …           0
   CHmRun        …           0
   CRuns         …           0
   CRBI          …           0
   CWalks        …           0
   LeagueN       …           0
   DivisionW     …           0
   PutOuts       …           0
   Assists       …           0
   Errors        …           0
   NewLeagueN    …           0
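Because glmnet fits the full regularization path in one go, parsnip's tidy() method for a glmnet fit accepts a penalty argument, so you can read coefficients off the already-fitted path without refitting. The value 50 below is purely illustrative, not one the assignment asks for:

```r
# Coefficients at an arbitrary point on the fitted path
tidy(ridge_fit, penalty = 50)
```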
Let us instead see what the estimates would be if the penalty was … (the value is elided in this extract). Store your output to tidy. What do you notice?
# Your code here
# tidy
tidy <- linear_reg(penalty = …, mixture = 0) %>%  # penalty value elided in this extract
  set_mode("regression") %>%
  set_engine("glmnet") %>%
  fit(Salary ~ ., data = Hitters) %>%
  tidy()

# Print the parameter estimates for this penalty
tidy
# your code here
# A tibble: 20 × 3
   term        estimate penalty
   (Intercept)   …           …
   AtBat         …           …
   Hits          …           …
   HmRun         …           …
   Runs          …           …
   RBI           …           …
   Walks         …           …
   Years         …           …
   CAtBat        …           …
   CHits         …           …
   CHmRun        …           …
   CRuns         …           …
   CRBI          …           …
   CWalks        …           …
   LeagueN       …           …
   DivisionW     …           …
   PutOuts       …           …
   Assists       …           …
   Errors        …           …
   NewLeagueN    …           …
# Hidden Tests
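To visualize how the ridge penalty shrinks every coefficient toward zero along the whole path, one option (a sketch, not part of the graded answer) is to pull out the underlying glmnet object with parsnip's extract_fit_engine() and use glmnet's built-in path plot:

```r
# Coefficient paths as a function of log(lambda)
ridge_fit %>%
  extract_fit_engine() %>%
  plot(xvar = "lambda")
```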
Look below at the parameter estimates for penalty = … (the value is elided in this extract). Store your output to tidy. Once again, use the tidy() function to accomplish this task.
# Your code here
# tidy
tidy <- linear_reg(penalty = …, mixture = 0) %>%  # penalty value elided in this extract
  set_mode("regression") %>%
  set_engine("glmnet") %>%
  fit(Salary ~ ., data = Hitters) %>%
  tidy()

# Print the parameter estimates for this penalty
tidy
# your code here
# A tibble: 20 × 3
   term        estimate penalty
   …