Question

Posted on Sep 02, 2024

Linear Model Selection and Regularization

You use the glmnet package to perform lasso regression. parsnip does not have a dedicated function to create a lasso regression model specification. You need to use linear_reg() and set mixture = 1 to specify a lasso model. The mixture argument specifies the proportion of each type of regularization: mixture = 0 specifies only ridge regularization and mixture = 1 specifies only lasso regularization. Setting mixture to a value between 0 and 1 lets us use both.
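As a rough illustration of what mixture controls, the sketch below evaluates the elastic net penalty term in the form glmnet documents it, where alpha corresponds to parsnip's mixture and lambda to penalty. The helper name elastic_net_penalty and the example coefficients are assumptions made up for this illustration; they are not part of glmnet or parsnip.

```r
# Hypothetical helper (not part of glmnet or parsnip): evaluates the
# elastic net penalty term for a coefficient vector.
# alpha plays the role of parsnip's mixture; lambda plays the role of penalty.
elastic_net_penalty <- function(beta, lambda, alpha) {
  ridge_part <- (1 - alpha) / 2 * sum(beta^2)   # squared (L2) term
  lasso_part <- alpha * sum(abs(beta))          # absolute-value (L1) term
  lambda * (ridge_part + lasso_part)
}

# Made-up coefficients, used only to compare the three settings.
beta <- c(1, -2, 0.5)

ridge_only <- elastic_net_penalty(beta, lambda = 1, alpha = 0)
lasso_only <- elastic_net_penalty(beta, lambda = 1, alpha = 1)
mixed      <- elastic_net_penalty(beta, lambda = 1, alpha = 0.5)

print(c(ridge = ridge_only, lasso = lasso_only, mixed = mixed))
```

With alpha = 1 only the absolute-value term remains, which is what lets the lasso shrink some coefficients exactly to zero.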

The following procedure will be very similar to what we saw in the ridge regression section. The preprocessing needed is the same, but let us write it out again.

# Run this code from the previous assignment to get you properly started.

library(tidymodels)
library(ISLR2)

# Drop players with a missing Salary, since Salary is the response.
Hitters <- as_tibble(Hitters) %>%
  filter(!is.na(Salary))

# Stratified train/test split on the response.
Hitters_split <- initial_split(Hitters, strata = "Salary")

Hitters_train <- training(Hitters_split)
Hitters_test <- testing(Hitters_split)

# 10-fold cross-validation on the training set.
Hitters_fold <- vfold_cv(Hitters_train, v = 10)

Run the block of code below.

lasso_recipe <-
  recipe(formula = Salary ~ ., data = Hitters_train) %>%
  step_novel(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors())

Next, finish the lasso regression workflow. Name the two outputs lasso_spec and lasso_workflow. For lasso_spec, use the linear_reg, set_mode, and set_engine functions. For lasso_workflow, use the add_recipe and add_model functions.

lasso_spec <-
  linear_reg(penalty = tune(), mixture = 1) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

lasso_workflow <- workflow() %>%
  add_recipe(lasso_recipe) %>%
  add_model(lasso_spec)

Although this is a different kind of regularization, you still use the same penalty argument. I have picked a different range for the values of penalty since I know it will be a good range. In practice you would have to cast a wide net at first and then narrow in on the range of interest. Store the result in penalty_grid. Use the function grid_regular with 50 levels and a range of [-2, 2]. The range is on the log10 scale, so penalty runs from 0.01 to 100.

*your code here*

#penalty_grid <-
penalty_grid <- grid_regular(penalty(tune(), levels = 50, range = c(-2, 2)))
# your code here

Error in penalty(tune(), levels = 50, range = c(-2, 2)): unused argument (levels = 50)
Traceback:
1. grid_regular(penalty(tune(), levels = 50, range = c(-2, 2)))
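The error message points at the cause: levels is an argument of grid_regular(), not of penalty(), and penalty() does not take a tune() placeholder either. One way the call could be written is sketched below. The manual_grid cross-check is an addition for illustration only and assumes penalty() keeps its default log10 transformation.

```r
library(dials)

# levels belongs to grid_regular(); penalty() only needs the range,
# which is interpreted on the log10 scale by default.
penalty_grid <- grid_regular(penalty(range = c(-2, 2)), levels = 50)

# Illustrative base-R cross-check (an assumption of this sketch, not part of
# the assignment): 50 log-spaced values from 10^-2 to 10^2.
manual_grid <- 10^seq(-2, 2, length.out = 50)

# Inspect the three entries that the testthat checks look at.
print(penalty_grid$penalty[c(1, 25, 50)])
print(all.equal(penalty_grid$penalty, manual_grid))
```

Because the grid is regular on the log10 scale, the values are evenly spaced in the exponent rather than in the penalty itself.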

library(testthat)
expect_equal(penalty_grid$penalty[1], 0.01)
expect_equal(penalty_grid$penalty[25], 0.910298177991522)
expect_equal(penalty_grid$penalty[50], 100)

You can use tune_grid() again. Store the result in tune_res, then use autoplot to plot tune_res. Your output should resemble this plot.

# your code here

Next, you should select the best value of penalty using select_best(). Your output variable here is best_penalty. Use "rsq" as the metric.

*your code here*

# best_penalty <-
# your code here

You should now refit using the whole training data set. Your first output variable should be lasso_final, created with the function finalize_workflow, and your second output variable should be lasso_final_fit, created with the fit function.

# your code here

Finalize this by calculating the rsq value for the lasso model. You will see that for this data ridge regression does better than lasso regression. Verify this using augment and then the rsq function. Store the output to the variable rsq
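The remaining steps (tuning, choosing the penalty, finalizing, refitting, and scoring) can be sketched end to end as follows. This is one possible way to fill in the placeholders, not a verified reference solution. The seed, the use of Hitters_test for scoring, and the variable name lasso_rsq are assumptions, since the original name of the last output variable is cut off. The sketch also assumes the glmnet package is installed.

```r
library(tidymodels)
library(ISLR2)

set.seed(1234)  # assumption: any seed works, this only makes the split repeatable

# Setup repeated from the assignment so the sketch runs on its own.
Hitters <- as_tibble(Hitters) %>%
  filter(!is.na(Salary))
Hitters_split <- initial_split(Hitters, strata = "Salary")
Hitters_train <- training(Hitters_split)
Hitters_test <- testing(Hitters_split)
Hitters_fold <- vfold_cv(Hitters_train, v = 10)

lasso_recipe <- recipe(formula = Salary ~ ., data = Hitters_train) %>%
  step_novel(all_nominal_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors())

lasso_spec <- linear_reg(penalty = tune(), mixture = 1) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

lasso_workflow <- workflow() %>%
  add_recipe(lasso_recipe) %>%
  add_model(lasso_spec)

penalty_grid <- grid_regular(penalty(range = c(-2, 2)), levels = 50)

# Tune the penalty over the cross-validation folds and plot the metrics.
tune_res <- tune_grid(lasso_workflow, resamples = Hitters_fold, grid = penalty_grid)
autoplot(tune_res)

# Pick the penalty with the best cross-validated R-squared.
best_penalty <- select_best(tune_res, metric = "rsq")

# Plug the chosen penalty into the workflow and refit on all training data.
lasso_final <- finalize_workflow(lasso_workflow, best_penalty)
lasso_final_fit <- fit(lasso_final, data = Hitters_train)

# Score on the held-out test set; lasso_rsq is a placeholder name.
lasso_rsq <- augment(lasso_final_fit, new_data = Hitters_test) %>%
  rsq(truth = Salary, estimate = .pred)
print(lasso_rsq)
```

The exact R-squared depends on the random split, so the comparison with ridge regression is only meaningful when both models are evaluated on the same split.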