Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 30, 2024

How to fix the attached error. Using the auto data set and using the scikit learn library 2 . Create and add a binary variable

How to fix the attached error.

Using the auto data set and using the scikit learn library

2 .

Create and add a binary variable column called mpg

_

high

_

low to the dataset that is set to High if mpg is a value above

30,

and a Low if mpg is a value less than or equal to

30 .

Make sure the mpg

_

high

_

low column is of type category.

3 .

Check if the auto data is imbalanced with respect to mpg

_

high

_

low. Report the percentage of the data that belong to the two classes

(

High and Low

) .

4 .

Split the dataset into

75 %

training and

25 %

test and use

10

fold cross validation for the models below

5 .

Fit a logistic regression model to the training set to predict mpg

_

high

_

low using all the other features

/

variables except mpg

,

year, origin, and name. Predict the mpg

_

high

_

low using the test dataset and report the Accuracy, Precision, Recall, Specificity, and F

1

measure.

6 .

Alter the threshold for classifying a Low to

0.6

and report the changes in the test performance metrics from those reported in Qn

5 .

7 .

Find the optimal threshold by drawing the ROC curve. Change the threshold to the optimal value you found from the ROC curve and report the changes in the test performance metrics from those reported in Qn

5 .

8 .

Fit a Na

ve Bayes model to the training data to predict mpg

_

high

_

low using all the other features

/

variables except mpg

,

year, origin, and name. Predict the mpg

_

high

_

low using the test dataset. Plot the ROC curve and report the best threshold on the ROC curve plot. Report the AUC on the curve plot as well. Report the accuracy, precision, recall, specificity and F

1

score.

9 .

Fit a KNN model to the training data to predict mpg

_

high

_

low using all the other features

/

variables except mpg

,

year, origin, and name. Use a grid search between

3

and

10

to find the best value of k

.

Report the accuracy, precision, recall, specificity, F

1

score and AUC.

10 .

Fit a LDA model to the training data to predict mpg

_

high

_

low using all the other features

/

variables except mpg

,

year, origin, and name. Report the accuracy, precision, recall, specificity and F

1

score.

11 .

Summarize the performance of the all the above models by creating a dataframe with

4

columns

Model

_

Name, Accuracy, Precision, Recall, Specificity, F

1

Score. The data frame should contain one row for each model you built above with each of the columns filled in with the appropriate metric. Print out the dataframe. Which model performed the best from an accuracy point of view and which model performed best from a recall point of view without adjusting for the threshold?Alter the threshold for classifying a Low to

0.6

and report the changes in the test performance metrics from those reported in On

5 .

O I Alar the threbheld for classlfyling a Lou to e

. 6

threshols

=

.

prist

("

Mioglstic Ragresiton with Thwobheld of

{0.62}^{7})

("

Accuracy:

",

Asc

_

log

_

rez.thresh

)

oriat

("

Prectisioni

",

prec

_

leg

_

ref

_

thresh

)

("

Recall:

",

rec

_

log

.

reg, threnh

)

DypeError Tracabuck

(

nowt recent call last

),

Alter the threshold fer classifying a tew to

6.0

Dbefrnor: dats 'vpe 'catesiry' not inserateas

Step by Step Solution

There are 3 Steps involved in it

Step: 1

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

Step: 3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Professional Microsoft SQL Server 2012 Administration

Authors: Adam Jorgensen, Steven Wort

1st Edition

Question 1 (13 pts) Read the following and use it to answer the following questions: James owns a factory that produces fuel from bio waste and sells the fuel to the agricultural industry. He holds a...

Answered: 1 week ago

Previous Question Next Question