
Linear Model Selection and Regularization
This programming assignment will use the tidymodels platform. It takes a look at regularization models and hyperparameter tuning; these models contain a regularization term. The assignment uses parsnip for model fitting, recipes and workflows to perform the transformations, and tune and dials to tune the hyperparameters of the model.
You will be using the Hitters data set from the ISLR2 package. You wish to predict baseball players' Salary based on several different characteristics which are included in the data set.
Since you wish to predict Salary, you need to remove any missing data from that column; otherwise, you won't be able to fit the models.
Store the result as Hitters.
library(tidymodels)
library(ISLR2)
# Your code here
# Hitters <-
# Load the data and remove rows with a missing Salary
Hitters <- ISLR2::Hitters %>%
  drop_na(Salary)
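A quick sanity check (a sketch; it assumes nothing beyond the assignment's own data) confirms that no missing Salary values remain after the drop:

```r
library(tidymodels)  # loads tidyr, which provides drop_na()
library(ISLR2)

Hitters <- ISLR2::Hitters %>% drop_na(Salary)

# Count remaining missing values in the response column
sum(is.na(Hitters$Salary))
stopifnot(!anyNA(Hitters$Salary))
```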
── Attaching packages ─────────────── tidymodels 1.0.0 ──
broom     1.0.4     recipes      1.0.5
dials     1.1.0     rsample      1.1.1
dplyr     1.1.0     tibble       3.2.0
ggplot2   3.4.1     tidyr        1.3.0
infer     1.0.4     tune         1.0.1
modeldata 1.1.0     workflows    1.1.3
parsnip   1.0.4     workflowsets 1.0.0
purrr     1.0.1     yardstick    1.1.0
── Conflicts ────────────────── tidymodels_conflicts() ──
purrr::discard() masks scales::discard()
dplyr::filter()  masks stats::filter()
dplyr::lag()     masks stats::lag()
recipes::step()  masks stats::step()
Use suppressPackageStartupMessages() to eliminate package startup messages
# Hidden Tests
You will use the glmnet package to perform ridge regression. parsnip does not have a dedicated function to create a ridge regression model specification, so you need to use linear_reg() and set mixture = 0 to specify a ridge model. The mixture argument specifies the mix between the two types of regularization: mixture = 0 specifies only ridge regularization and mixture = 1 specifies only lasso regularization.
Setting mixture to a value between 0 and 1 lets you use both. When using the glmnet engine you also need to set a penalty to be able to fit the model. You will set this value to 0 for now; it is not the best value, but you will look at how to select the best value in a little bit.
ridge_spec <- linear_reg(mixture = 0, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet")
Once the specification is created you can fit it to your data. You will use all the predictors. Use the fit() function here.
ridge_fit <- fit(ridge_spec, Salary ~ ., data = Hitters)
The glmnet package fits the model for all values of penalty at once, so you can now see what the parameter estimates for the model are with penalty = 0. You can use the tidy() function to accomplish this specific task.
tidy(ridge_fit)
Loading required package: Matrix
Attaching package: Matrix
The following objects are masked from package:tidyr:
expand, pack, unpack
Loaded glmnet 4.1-6
# A tibble: 20 × 3
term          estimate       penalty
(Intercept)    8.112693e+01  0
AtBat         -6.815959e-01  0
Hits           2.772312e+00  0
HmRun         -1.365680e+00  0
Runs           1.014826e+00  0
RBI            7.130224e-01  0
Walks          3.378558e+00  0
Years         -9.066800e+00  0
CAtBat        -1.199478e-03  0
CHits          1.361029e-01  0
CHmRun         6.979958e-01  0
CRuns          2.958896e-01  0
CRBI           2.570711e-01  0
CWalks        -2.789666e-01  0
LeagueN        5.321272e+01  0
DivisionW     -1.228345e+02  0
PutOuts        2.638876e-01  0
Assists        1.698796e-01  0
Errors        -3.685645e+00  0
NewLeagueN    -1.810510e+01  0
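Because glmnet fits the entire regularization path in one go, you do not have to refit the model to inspect coefficients at a different penalty: parsnip's tidy() method for glmnet fits accepts a penalty argument. A minimal sketch, assuming the ridge_fit object from above (the penalty value 50 is arbitrary, chosen only for illustration):

```r
library(tidymodels)
library(ISLR2)

Hitters <- ISLR2::Hitters %>% drop_na(Salary)

ridge_spec <- linear_reg(mixture = 0, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

ridge_fit <- fit(ridge_spec, Salary ~ ., data = Hitters)

# Coefficients along the already-fitted path at a different penalty,
# without refitting the model
tidy(ridge_fit, penalty = 50)
```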
Let us instead see what the estimates would be if the penalty were 11498. Store your output as tidy2. What do you notice?
# Your code here
# tidy2 <-
tidy2 <- tidy(
  linear_reg(penalty = 11498, mixture = 0) %>%
    set_mode("regression") %>%
    set_engine("glmnet") %>%
    fit(Salary ~ ., data = Hitters)
)
# Print the parameter estimates for penalty = 11498
tidy2
# A tibble: 20 × 3
term          estimate       penalty
(Intercept)   407.205936774  11498
AtBat           0.037003083  11498
Hits            0.138357552  11498
HmRun           0.525195508  11498
Runs            0.230978290  11498
RBI             0.240114775  11498
Walks           0.289971555  11498
Years           1.108832399  11498
CAtBat          0.003135215  11498
CHits           0.011666684  11498
CHmRun          0.087642789  11498
CRuns           0.023406258  11498
CRBI            0.024165723  11498
CWalks          0.025042117  11498
LeagueN         0.086629234  11498
DivisionW      -6.225431332  11498
PutOuts         0.016506596  11498
Assists         0.002616335  11498
Errors         -0.020564158  11498
NewLeagueN      0.302922899  11498
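The shrinkage you just observed between penalty = 0 and penalty = 11498 can also be visualized across the whole path. A sketch, assuming the same ridge fit as above: extract_fit_engine() returns the underlying glmnet object, whose base plot() method draws the coefficient paths.

```r
library(tidymodels)
library(ISLR2)

Hitters <- ISLR2::Hitters %>% drop_na(Salary)

ridge_fit <- linear_reg(mixture = 0, penalty = 0) %>%
  set_mode("regression") %>%
  set_engine("glmnet") %>%
  fit(Salary ~ ., data = Hitters)

# Plot coefficient estimates against log(lambda): as the penalty grows,
# every coefficient is shrunk toward zero
ridge_fit %>%
  extract_fit_engine() %>%
  plot(xvar = "lambda")
```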
# Hidden Tests
Look below at the parameter estimates for penalty = 705. Store your output as tidy3. Once again, use the tidy() function to accomplish this task.
# Your code here
# tidy3 <-
tidy3 <- tidy(
  linear_reg(penalty = 705, mixture = 0) %>%
    set_mode("regression") %>%
    set_engine("glmnet") %>%
    fit(Salary ~ ., data = Hitters)
)
# Print the parameter estimates for penalty = 705
tidy3
# A tibble: 20 × 3
term   estimate   penalty
…
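The assignment's introduction mentions tune and dials for selecting the penalty, and the text above promised a look at how to choose the best value. A hedged sketch of how that selection could proceed with cross-validation (names such as ridge_wf and penalty_grid are illustrative, and the grid range is an assumption, not part of the original assignment):

```r
library(tidymodels)
library(ISLR2)

Hitters <- ISLR2::Hitters %>% drop_na(Salary)

set.seed(1)
Hitters_folds <- vfold_cv(Hitters, v = 10)

# Dummy-code factors and normalize predictors so the penalty acts evenly
ridge_rec <- recipe(Salary ~ ., data = Hitters) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_numeric_predictors())

# Mark the penalty for tuning instead of fixing it
ridge_tune_spec <- linear_reg(mixture = 0, penalty = tune()) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

ridge_wf <- workflow() %>%
  add_recipe(ridge_rec) %>%
  add_model(ridge_tune_spec)

# penalty() from dials varies on a log10 scale; this range is an assumption
penalty_grid <- grid_regular(penalty(range = c(-2, 5)), levels = 50)

tune_res <- tune_grid(ridge_wf, resamples = Hitters_folds, grid = penalty_grid)

# Pick the penalty with the best cross-validated RMSE and refit on all data
best_penalty <- select_best(tune_res, metric = "rmse")
final_fit <- finalize_workflow(ridge_wf, best_penalty) %>%
  fit(data = Hitters)
tidy(final_fit)
```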
