Question
Using RStudio
5. In this problem, you will use support vector approaches to predict whether a given car gets high or low gas mileage, based on the Auto data set.

(a) Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median. Make sure to dispose of the original mpg value in the end (Auto$mpg = NULL).

(b) Run set.seed(1). Fit a support vector classifier to the data with various values of cost (similar to the lab), in order to predict whether a car gets high or low gas mileage. Report the cross-validation errors associated with different values of this parameter.

(c) Which cost value gave you the lowest misclassification error? Print the confusion matrix for this optimal cost. Describe the two types of errors your model can make. Which of the two does it commit more frequently?

(d) Plot the fitted boundary of the optimal support vector classifier for a couple of predictor pairings.

Hint: In the lab, we used the plot() function for svm objects only in cases with p = 2. When p > 2, you can use the plot() function to create plots displaying pairs of variables at a time. Essentially, instead of typing
> plot(svmfit, dat)
where svmfit contains your fitted model and dat is a data frame containing your data, you can type
> plot(svmfit, dat, x1 ~ x4)
in order to plot just the first and fourth variables. However, you must replace x1 and x4 with the correct variable names. To find out more, type ?plot.svm.
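One possible way to work through parts (a) to (d) is sketched below. It is not the expert solution from this page (none is shown), only a minimal outline under several assumptions: the Auto data comes from the ISLR package, the SVM functions come from e1071 (the package behind plot.svm), the cost grid is an arbitrary illustrative choice, the name column is dropped because it is a factor with hundreds of levels, and the confusion matrix is computed on the training data. The names auto, mpg01, tune_out, best_fit and conf_mat are placeholders introduced here.

```r
# Assumption: the ISLR and e1071 packages are installed.
library(ISLR)   # provides the Auto data set
library(e1071)  # provides svm(), tune() and plot.svm()

# (a) Binary response: 1 if mpg is above the median, 0 otherwise.
auto <- Auto
auto$mpg01 <- as.factor(ifelse(auto$mpg > median(auto$mpg), 1, 0))
auto$mpg <- NULL   # dispose of the original mpg value, as the question asks
auto$name <- NULL  # assumption: drop the car name, a factor with many levels

# (b) Cross-validate a linear-kernel SVM (a support vector classifier)
#     over a grid of cost values. The grid itself is an illustrative choice.
set.seed(1)
tune_out <- tune(svm, mpg01 ~ ., data = auto, kernel = "linear",
                 ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
summary(tune_out)        # cross-validation error for each cost value
tune_out$performances    # the same errors as a data frame

# (c) Cost with the lowest cross-validation error, and the confusion
#     matrix of the corresponding model on the training data.
tune_out$best.parameters
best_fit <- tune_out$best.model
conf_mat <- table(predicted = predict(best_fit, auto), actual = auto$mpg01)
conf_mat
# Off-diagonal cells are the two error types:
#   predicted = 1, actual = 0: a low-mileage car labelled high (false positive)
#   predicted = 0, actual = 1: a high-mileage car labelled low (false negative)
# Comparing these two counts shows which error is more frequent.

# (d) Decision boundary for a couple of predictor pairings. Predictors not
#     in the formula are held at plot.svm's defaults (0 for numeric ones).
plot(best_fit, auto, weight ~ horsepower)
plot(best_fit, auto, displacement ~ acceleration)
```

The pairings in part (d) are arbitrary; any two predictor names from the data frame can be placed on either side of the formula.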