Question

Posted on Sep 23, 2024

Homework 4

Use the scikit-learn library for all the models except where another library is specified. Review the examples provided on Blackboard before attempting the homework. For most of the questions below you can modify the code in the examples provided. Please turn in a Jupyter notebook with the answers.

1. This homework is a continuation of HW 3. Use the same Auto.csv dataset as in HW 3 and the binary variable mpg_high_low you created in HW 3.

2. Split the dataset into 75% training and 25% test, and use 10-fold cross-validation for the models below.
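A minimal sketch of the split and the 10-fold setup might look like the following. Auto.csv is not available here, so the block uses a synthetic dataset from `make_classification` as a stand-in; the variable names, the `random_state` values, and the choice to stratify are assumptions, not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# 75% training / 25% test; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 10-fold splitter that can be passed as `cv=` to GridSearchCV or cross_val_score.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

print(X_train.shape, X_test.shape, cv.get_n_splits())
```

Passing the integer `cv=10` to scikit-learn's search utilities also gives stratified 10-fold splits for a classifier; the explicit splitter just makes shuffling and the seed visible.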

3. Fit an SVM model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an RBF kernel, a cost parameter found by tuning with a grid search over 10 evenly linearly spaced numbers in the range 0.1 to 100, and a gamma parameter found by searching 10 evenly logarithmically spaced numbers with a start value of -9 and a stop value of 3 (hint: use numpy logspace). Predict mpg_high_low on the test dataset and report the Accuracy, Precision, Recall, Specificity, and F1 measure.
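One way to set up the search described in this question is sketched below. The data is a synthetic stand-in for Auto.csv, and the `StandardScaler` step is an added assumption (the question does not ask for scaling). Note that `np.logspace(-9, 3, 10)` treats the start and stop values as base-10 exponents, so the gamma grid runs from 1e-9 to 1e3.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Grids from the question: 10 linear values for C, 10 log-spaced values for gamma.
param_grid = {
    "svc__C": np.linspace(0.1, 100, 10),
    "svc__gamma": np.logspace(-9, 3, 10),
}

# Scaling is an assumption added here; RBF kernels are sensitive to feature scale.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
search = GridSearchCV(pipe, param_grid, cv=10, scoring="accuracy")
search.fit(X_train, y_train)
y_pred = search.predict(X_test)

# Specificity is not a built-in scorer name, so derive it from the confusion matrix.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
svm_metrics = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred),
    "Recall": recall_score(y_test, y_pred),
    "Specificity": tn / (tn + fp),
    "F1": f1_score(y_test, y_pred),
}
print(search.best_params_)
print(svm_metrics)
```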

4. Fit a decision tree model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Predict mpg_high_low on the test dataset and report the Accuracy, Precision, Recall, Specificity, and F1 measure.
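A possible outline for the decision tree question is shown below, again on synthetic stand-in data rather than Auto.csv. It relies on the fact that specificity equals the recall of the negative class, which is assumed here to be label 0; the default tree settings and `random_state` are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

tree = DecisionTreeClassifier(random_state=0)

# 10-fold cross-validated accuracy on the training set only.
cv_accuracy = cross_val_score(tree, X_train, y_train, cv=10, scoring="accuracy")

tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

# Specificity = recall of the negative class (label 0 in this stand-in data).
specificity = recall_score(y_test, y_pred, pos_label=0)

print("Mean CV accuracy:", cv_accuracy.mean())
print("Specificity:", specificity)
# Per-class precision, recall, and F1, plus overall accuracy.
print(classification_report(y_test, y_pred, digits=3))
```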

5. Fit a Random Forest model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an n_estimators parameter found by searching among the values 50, 100, 200, 500 and a max_depth parameter found by searching over the values 2, 5, 10, and 15. Predict mpg_high_low on the test dataset.
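The search in this question could be sketched as follows. scikit-learn spells the tree-count parameter `n_estimators`. The data is a synthetic stand-in for Auto.csv, and `n_jobs=-1` and `random_state=0` are assumptions added to keep the run fast and repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate values taken from the question: 4 x 4 = 16 combinations.
param_grid = {
    "n_estimators": [50, 100, 200, 500],
    "max_depth": [2, 5, 10, 15],
}

# Each combination is scored with 10-fold cross-validation on the training set.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      cv=10, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)

# GridSearchCV refits the best combination on the full training set by default.
y_pred = search.predict(X_test)
print(search.best_params_)
print("Test accuracy:", accuracy_score(y_test, y_pred))
```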

6. Fit an XGBoost model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use a learning rate found by tuning with a grid search over 10 evenly linearly spaced numbers in the range 0.1 to 1. Report the accuracy, precision, recall, specificity, F1 score, and AUC.
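The learning-rate search and the extra AUC metric can be illustrated with the sketch below. To keep it dependent on scikit-learn only, it uses `GradientBoostingClassifier` as a stand-in for XGBoost; this is a substitution, not what the question asks for. `xgboost.XGBClassifier` follows the same scikit-learn estimator interface and also has a `learning_rate` parameter, so swapping it in should need only the import and the constructor to change. The data is again synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 10 evenly spaced learning rates from 0.1 to 1, as in the question.
param_grid = {"learning_rate": np.linspace(0.1, 1, 10)}

# Stand-in booster (assumption); replace with xgboost.XGBClassifier for the homework.
search = GridSearchCV(GradientBoostingClassifier(random_state=0), param_grid,
                      cv=10, scoring="accuracy")
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
# AUC needs a score per sample, so use the predicted probability of class 1.
y_score = search.predict_proba(X_test)[:, 1]

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
boost_metrics = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred),
    "Recall": recall_score(y_test, y_pred),
    "Specificity": tn / (tn + fp),
    "F1": f1_score(y_test, y_pred),
    "AUC": roc_auc_score(y_test, y_score),
}
print(search.best_params_)
print(boost_metrics)
```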

7. Fit a Stacked Classifier model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. The models you need to stack are SVM, decision tree, KNN, and Naive Bayes. Report the accuracy, precision, recall, specificity, and F1 score.
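A stacked model over the four base learners named in this question might be assembled as in the sketch below. The logistic-regression meta-learner, the scaling steps for SVM and KNN, and the default hyperparameters are all assumptions, since the question does not specify them; the data is a synthetic stand-in for Auto.csv.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the Auto.csv features and mpg_high_low label.
X, y = make_classification(n_samples=392, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners from the question; scaling for SVM and KNN is an added assumption.
estimators = [
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("nb", GaussianNB()),
]

# The meta-learner is trained on 10-fold out-of-fold predictions of the base models.
stack = StackingClassifier(estimators=estimators,
                           final_estimator=LogisticRegression(), cv=10)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

stack_metrics = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred),
    "Recall": recall_score(y_test, y_pred),
    # Specificity is the recall of the negative class (label 0 here).
    "Specificity": recall_score(y_test, y_pred, pos_label=0),
    "F1": f1_score(y_test, y_pred),
}
print(stack_metrics)
```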

8. Summarize the performance of all the above models by creating a dataframe with 6 columns: Model_Name, Accuracy, Precision, Recall, Specificity, F1 Score. The dataframe should contain one row for each model you built above, with each of the columns filled in with the appropriate metric. Print out the dataframe. Which model performed the best from an accuracy point of view, and which model performed best from a recall point of view? Of all the models you built in HW