Question
Auto
Consider the 'Auto' data set from the ISLR package in R. The dependent variable in the data set is 'mpg'.

##### R commands to load the data
library(ISLR)
Auto=na.omit(Auto) # remove observations having N.A. variable
#####

1. Obtain summary statistics for the variables and perform any data cleaning that you think is necessary.
2. Perform feature selection on the independent variables to determine which ones should be included in the model.
3. Create a linear regression model, robust regression model, and polynomial regression model to determine which provides the most accurate predictions.
   a. Make sure to split the data set into training/test sets. Estimate the model coefficients using the training set, and make predictions onto the test set.
   b. For the robust regression model, do not remove any outliers that might exist in the Auto data set.
   c. For the polynomial regression model, determine which degree of the polynomial is best to prevent overfitting the model onto the training data (i.e., good balance between accuracy in training and testing data sets).
4. Select the model with the best performance from Q3 and perform k-fold cross validation to observe how sensitive it is with respect to different training/test sets (feel free to try different values of k when performing the cross validation).
5. Build a neural network using the 'neuralnet' package and determine if it provides better approximations than the other models that you built in Q3.
   a. Try different neural network configurations by adjusting the number of hidden layers and the number of nodes within each hidden layer.
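For parts 1 and 2, one possible starting point is sketched below. It is not the posted solution. It assumes that backward stepwise selection by AIC is an acceptable feature-selection method (the question does not name one) and that dropping the non-numeric 'name' column is reasonable cleaning. The object names 'num', 'full', and 'selected' are illustrative.

```r
library(ISLR)                      # provides the Auto data set
Auto <- na.omit(Auto)              # same cleaning step as in the question

# Part 1: summary statistics for every variable
print(summary(Auto))

# Assumption: keep only numeric columns, which drops the 'name' factor
num <- Auto[, sapply(Auto, is.numeric)]

# Correlation of each numeric variable with mpg, as a quick screening aid
print(round(cor(num)[, "mpg"], 2))

# Part 2 (assumed method): backward stepwise selection by AIC
full     <- lm(mpg ~ ., data = num)
selected <- step(full, direction = "backward", trace = 0)
print(formula(selected))
```

The formula printed at the end is only as good as the selection criterion. Comparing it with the correlation table is a cheap sanity check before moving on to model fitting.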
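Part 3 can be approached along the following lines. This is a minimal sketch under stated assumptions, not the expert answer. The 70/30 split, the seed, the predictors (horsepower, weight, year), the use of RMSE as the accuracy measure, and the polynomial term on horsepower are all choices made here for illustration. The robust fit uses rlm() from the MASS package on the unmodified data, so no outliers are removed.

```r
library(ISLR)   # Auto data set
library(MASS)   # rlm() for robust regression

Auto <- na.omit(Auto)

# Assumption: 70/30 train/test split with a fixed seed for reproducibility
set.seed(1)
train_idx <- sample(nrow(Auto), size = round(0.7 * nrow(Auto)))
train <- Auto[train_idx, ]
test  <- Auto[-train_idx, ]

# Root mean squared error, the (assumed) accuracy measure
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Assumption: these predictors stand in for the result of feature selection
fit_lm  <- lm(mpg ~ horsepower + weight + year, data = train)
fit_rlm <- rlm(mpg ~ horsepower + weight + year, data = train)  # outliers kept

# Polynomial regression: compare train and test RMSE for degrees 1 to 5
degrees <- 1:5
poly_rmse <- sapply(degrees, function(d) {
  fit <- lm(mpg ~ poly(horsepower, d) + weight + year, data = train)
  c(train = rmse(train$mpg, predict(fit, newdata = train)),
    test  = rmse(test$mpg,  predict(fit, newdata = test)))
})
colnames(poly_rmse) <- paste0("degree_", degrees)

results <- c(
  linear    = rmse(test$mpg, predict(fit_lm,  newdata = test)),
  robust    = rmse(test$mpg, predict(fit_rlm, newdata = test)),
  poly_best = min(poly_rmse["test", ])
)

print(round(poly_rmse, 3))
print(round(results, 3))
```

Training RMSE can only fall as the degree rises, so the test row is the one to watch. A degree where test RMSE stops improving or starts rising is a sign of overfitting.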
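For part 4, k-fold cross-validation can be written by hand so that the per-fold errors stay visible. The sketch below is illustrative only. The helper 'kfold_rmse' is hypothetical (it is not part of any package), it assumes the response is 'mpg', and the degree-2 polynomial formula merely stands in for whichever model performs best in Q3.

```r
library(ISLR)
Auto <- na.omit(Auto)

# Hypothetical helper: test RMSE of an lm() fit on each of k folds.
# Assumes the response column is named 'mpg'.
kfold_rmse <- function(formula, data, k, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k roughly equal folds
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  sapply(seq_len(k), function(i) {
    fit  <- lm(formula, data = data[fold != i, ])
    pred <- predict(fit, newdata = data[fold == i, ])
    sqrt(mean((data$mpg[fold == i] - pred)^2))
  })
}

# Assumption: a degree-2 polynomial model stands in for the Q3 winner
form <- mpg ~ poly(horsepower, 2) + weight + year

# Try several values of k and report the spread of fold errors
for (k in c(5, 10)) {
  r <- kfold_rmse(form, Auto, k)
  cat(sprintf("k = %2d: mean RMSE = %.3f, sd = %.3f\n", k, mean(r), sd(r)))
}
```

The standard deviation across folds gives a rough measure of how sensitive the model is to the particular training/test split. A small spread relative to the mean suggests stable performance.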