Question

Homework 4
Use the scikit-learn library for all models unless another library is specified. Review the examples provided on Blackboard before attempting the homework. For most of the questions below, you can modify the code in those examples. Please turn in a Jupyter notebook with your answers.
1. This homework is a continuation of HW3. Use the same Auto.csv dataset as in HW3 and the binary variable mpg_high_low you created in HW3.
2. Split the dataset into 75% training and 25% test sets, and use 10-fold cross-validation for the models below.
3. Fit an SVM model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an RBF kernel. Tune the cost parameter by grid search over 10 evenly (linearly) spaced values in the range 0.1 to 100, and tune the gamma parameter by searching 10 evenly logarithmically spaced values with a start value of -9 and a stop value of 3 (hint: use numpy logspace). Predict mpg_high_low on the test dataset and report the accuracy, precision, recall, specificity, and F1 measure.
4. Fit a decision tree model to the training set to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Predict the mpg_high_low using the test dataset and report the Accuracy, Precision, Recall, Specificity, and F1 measure.
5. Fit a Random Forest model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use an n_estimators parameter found by searching among the values 50, 100, 200, and 500, and a max_depth parameter found by searching over the values 2, 5, 10, and 15. Predict mpg_high_low on the test dataset.
6. Fit an XGBoost model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. Use a learning rate found by grid search over 10 evenly (linearly) spaced values in the range 0.1 to 1. Report the accuracy, precision, recall, specificity, F1 score, and AUC.
7. Fit a stacked classifier model to the training data to predict mpg_high_low using all the other features/variables except mpg, year, origin, and name. The models you need to stack are SVM, decision tree, KNN, and Naive Bayes. Report the accuracy, precision, recall, specificity, and F1 score.
8. Summarize the performance of all the above models by creating a dataframe with 6 columns: Model_Name, Accuracy, Precision, Recall, Specificity, F1 Score. The dataframe should contain one row for each model you built above, with each column filled in with the appropriate metric. Print out the dataframe. Which model performed best in terms of accuracy, and which performed best in terms of recall? Of all the models you built in HW3 and HW4, which one performed best in terms of F1 score?
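
The split and tuning setup described in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not a reference solution: Auto.csv is not available here, so a synthetic dataset from make_classification stands in for the real features and for mpg_high_low, and the feature scaling step is an added assumption that the assignment does not ask for.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the Auto.csv features and the mpg_high_low label
# (assumption: the real notebook would load and filter Auto.csv instead).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Step 2: 75% training / 25% test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 3 grids: 10 linearly spaced C values in [0.1, 100] and
# 10 log-spaced gamma values from 10**-9 to 10**3.
param_grid = {
    "svc__C": np.linspace(0.1, 100, 10),
    "svc__gamma": np.logspace(-9, 3, 10),
}

# Scaling before the SVM is an assumption added here, not part of the task.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# cv=10 gives the 10-fold cross-validation requested in step 2.
search = GridSearchCV(model, param_grid, cv=10, scoring="accuracy")
search.fit(X_train, y_train)

# Predictions on the held-out test set, used later for the metrics.
y_pred = search.predict(X_test)
print(search.best_params_)
```

The same GridSearchCV pattern would apply to the Random Forest and XGBoost steps by swapping the estimator and the parameter grid.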
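
Several steps ask for specificity, which scikit-learn does not expose as a single scoring function, so it may help to compute all five metrics from the confusion-matrix counts directly. The helper below is a hypothetical sketch (the name binary_metrics and the toy labels are illustrative only); it treats label 1 as the positive class and returns 0.0 when a denominator is zero.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute the metrics from step 8 from true and predicted binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Guard against empty classes; 0.0 is a convention chosen here.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "Accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "Precision": precision,
        "Recall": recall,
        "Specificity": ratio(tn, tn + fp),  # true negative rate
        "F1 Score": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels for illustration only, not results from any model.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

# One summary row with the six column names listed in step 8.
row = {"Model_Name": "example", **binary_metrics(y_true, y_pred)}
print(row)
```

Collecting one such row per model in a list and passing that list to pandas.DataFrame would give the summary table requested in step 8.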
