Question
Using R
We now fit a GAM to predict Salary in the Hitters dataset. First, we remove the observations for whom the salary information is unknown, and then we split the data set into a training set and a test set by using the following command lines.
library(ISLR)
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary),]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]
Hitters.test <- Hitters[-train, ]
(a)
Using log(Salary) (log-transformation of Salary) as response and the other variables as the predictors, perform forward stepwise selection on the training set in order to identify a satisfactory model that uses just a subset of the predictors.
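One possible way to approach part (a) is sketched below. It assumes the `leaps` package is installed and uses BIC to choose the model size; both are assumptions, since the question does not name a package or a selection criterion (Cp or adjusted R^2 would be equally reasonable).

```r
library(ISLR)
library(leaps)   # assumption: leaps provides regsubsets()

# Same preparation as in the question
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Forward stepwise selection over all 19 predictors
fit.fwd <- regsubsets(log(Salary) ~ ., data = Hitters.train,
                      nvmax = 19, method = "forward")
fwd.summary <- summary(fit.fwd)

# Choose the model size with the lowest BIC (one criterion among several)
best.size <- which.min(fwd.summary$bic)

# Names of the selected terms, dropping the intercept
selected <- names(coef(fit.fwd, id = best.size))[-1]
print(best.size)
print(selected)
```

Plotting `fwd.summary$bic` against model size can help judge whether a smaller model performs almost as well as the minimum-BIC one.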
(b)
Fit a GAM on the training data, using log(Salary) as the response and the features selected in the previous step as the predictors. Plot the results, and explain your findings.
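A minimal sketch of part (b) follows, assuming the `gam` package. The predictors `Years`, `Hits`, `CRuns`, and `Walks` are placeholders chosen only for illustration; substitute whichever variables forward selection actually picks in part (a). The choice of `df = 4` is likewise an arbitrary starting point.

```r
library(ISLR)
library(gam)   # assumption: the gam package, which provides s() and lo()

# Same preparation as in the question
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Hypothetical predictor set: replace with the variables selected in (a)
gam.fit <- gam(log(Salary) ~ s(Years, df = 4) + s(Hits, df = 4) +
                 s(CRuns, df = 4) + s(Walks, df = 4),
               data = Hitters.train)

# One panel per smooth term, with standard-error bands
par(mfrow = c(2, 2))
plot(gam.fit, se = TRUE, col = "blue")
```

Each panel shows the fitted contribution of one predictor to log(Salary) with the others held fixed, so a clearly curved panel suggests that a straight line would not describe that predictor well.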
(c)
Evaluate the model obtained on the test set. Try different tuning parameters (if you are using smoothing splines s(), try different df values; if you are using local regression lo(), try different span values) and explain the results obtained.
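Part (c) could be sketched as a loop over candidate `df` values that records the test mean squared error on the log scale. As before, the `gam` package and the predictor names in `vars` are assumptions made for illustration, and the grid of `df` values is arbitrary.

```r
library(ISLR)
library(gam)   # assumption: the gam package

# Same preparation as in the question
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]
Hitters.test <- Hitters[-train, ]

# Hypothetical predictor set: replace with the variables selected in (a)
vars <- c("Years", "Hits", "CRuns", "Walks")
dfs <- c(2, 4, 6, 8)   # candidate degrees of freedom for every smooth term
test.mse <- numeric(length(dfs))

for (i in seq_along(dfs)) {
  # Build the formula with the numeric df written in, e.g. s(Years, df = 4)
  rhs <- paste0("s(", vars, ", df = ", dfs[i], ")", collapse = " + ")
  form <- as.formula(paste("log(Salary) ~", rhs))
  fit <- gam(form, data = Hitters.train)

  # Test error on the log(Salary) scale
  pred <- predict(fit, newdata = Hitters.test)
  test.mse[i] <- mean((log(Hitters.test$Salary) - pred)^2)
}

print(data.frame(df = dfs, test.mse = test.mse))
```

A test error that falls and then rises as `df` grows would indicate that the most flexible fits are overfitting the 200 training observations. The same loop can be adapted to `lo()` by varying `span` instead of `df`.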
(d)
For which variables, if any, is there evidence of a non-linear relationship with the response?
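For part (d), one common approach is to compare a model in which a predictor enters linearly against one in which it enters as a smooth term. The sketch below does this for `Years` only, again assuming the `gam` package and a placeholder set of predictors; the same comparison would be repeated for each variable of interest.

```r
library(ISLR)
library(gam)   # assumption: the gam package

# Same preparation as in the question
data("Hitters")
Hitters <- Hitters[!is.na(Hitters$Salary), ]
set.seed(10)
train <- sample(nrow(Hitters), 200)
Hitters.train <- Hitters[train, ]

# Hypothetical predictor set: replace with the variables selected in (a)
# Model 1: Years enters linearly
gam.lin <- gam(log(Salary) ~ Years + s(Hits, df = 4) +
                 s(CRuns, df = 4) + s(Walks, df = 4),
               data = Hitters.train)

# Model 2: Years enters as a smoothing spline
gam.smooth <- gam(log(Salary) ~ s(Years, df = 4) + s(Hits, df = 4) +
                    s(CRuns, df = 4) + s(Walks, df = 4),
                  data = Hitters.train)

# F-test of the linear fit against the smooth fit for Years
cmp <- anova(gam.lin, gam.smooth, test = "F")
print(cmp)

# The summary also reports a test for the nonparametric part of each smooth term
print(summary(gam.smooth))
```

A small p-value in the comparison suggests that the smooth term for `Years` fits better than a linear one; the "Anova for Nonparametric Effects" table in the summary gives the analogous test for every smooth term at once.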