Question
Fit a logistic regression model on the "Caravan" data set from the R package "ISLR". This data set, also analyzed in Sec 4.6.6 of ISLR, has 85 predictors; the response variable, "Purchase", takes the value "Yes" or "No".
We use the first 1000 observations as the test data and the remaining observations as the training data. In the test data, there are 941 "No" and 59 "Yes". For each of the approaches below, report the number of misclassified samples among the 941 "No" and the number of misclassified samples among the 59 "Yes", using 0.25 as the cut-off on the predicted probability. Also use the R package "pROC" to report the corresponding AUC. For the definition of AUC and ROC, read pp. 146-149 of ISLR.
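The cut-off rule above can be sketched in base R. The helper name `count_errors` is hypothetical, and treating a probability strictly greater than 0.25 as a predicted "Yes" is an assumption, since the question does not say how ties at the cut-off are handled. The toy vectors are made up for illustration and are not Caravan results.

```r
# Hypothetical helper: count misclassified "No" and "Yes" cases at a cut-off.
# Assumption: a predicted probability strictly above the cut-off means "Yes".
count_errors <- function(prob, truth, cutoff = 0.25) {
  pred <- ifelse(prob > cutoff, "Yes", "No")
  c(no_misclassified  = sum(truth == "No"  & pred == "Yes"),
    yes_misclassified = sum(truth == "Yes" & pred == "No"))
}

# Toy illustration with made-up probabilities (not Caravan output).
toy_truth <- c("No", "No", "No", "Yes", "Yes")
toy_prob  <- c(0.10, 0.30, 0.05, 0.40, 0.20)
toy_errors <- count_errors(toy_prob, toy_truth)
print(toy_errors)
```

In the toy data, one "No" case (probability 0.30) is flagged as "Yes" and one "Yes" case (probability 0.20) is missed, so both counts are 1.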
Fit a logistic regression model using all 85 predictors, and obtain the predicted probabilities on the test data.
- If we use 0.25 as the probability cut-off, we misclassify ________[a1] (an integer) samples among 941 "No" and misclassify ________[b1] (an integer) samples among 59 "Yes".
- The AUC for this classifier is ______[c1] (round to 3 digits after the decimal point).
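One way to set up the full-model fit is sketched below. It assumes the "ISLR" and "pROC" packages are installed; the object names (`train`, `test`, `full_fit`, `a1`, `b1`, `c1`) are illustrative, and a probability strictly above 0.25 is assumed to mean a predicted "Yes". No numerical results are shown here; run the code to obtain them.

```r
library(ISLR)   # provides the Caravan data frame
library(pROC)   # roc() and auc()

# First 1000 observations are the test set, the rest the training set.
test  <- Caravan[1:1000, ]
train <- Caravan[-(1:1000), ]

# Logistic regression on all 85 predictors; glm() models P(Purchase = "Yes")
# because "Yes" is the second factor level.
full_fit  <- glm(Purchase ~ ., data = train, family = binomial)
prob_full <- predict(full_fit, newdata = test, type = "response")

# Misclassification counts at the 0.25 cut-off.
pred_full <- ifelse(prob_full > 0.25, "Yes", "No")
a1 <- sum(test$Purchase == "No"  & pred_full == "Yes")
b1 <- sum(test$Purchase == "Yes" & pred_full == "No")

# AUC on the test data, rounded to 3 decimals.
roc_full <- roc(test$Purchase, prob_full,
                levels = c("No", "Yes"), direction = "<", quiet = TRUE)
c1 <- round(as.numeric(auc(roc_full)), 3)

print(c(a1 = a1, b1 = b1, c1 = c1))
```

glm() may warn about fitted probabilities numerically equal to 0 or 1, or about a rank-deficient fit; these warnings do not stop the predictions from being computed.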
Apply forward variable selection using AIC. Use the selected model to obtain the predicted probabilities on the test data.
- We use a model with ______[d2] (a non-negative integer) non-intercept predictors.
- If we use 0.25 as the probability cut-off, we misclassify ______ [a2] (an integer) samples among 941 "No" and misclassify ______ [b2] (an integer) samples among 59 "Yes".
- The AUC for this classifier is ______ [c2] (round to 3 digits after the decimal point).
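Forward selection by AIC could be run with `step()` from base R's stats package, as in the sketch below. Starting from the intercept-only model and using the full 85-predictor model as the upper scope is an assumption about what the question intends; object names are illustrative. The search fits many models and can take a minute or more.

```r
library(ISLR)
library(pROC)

test  <- Caravan[1:1000, ]
train <- Caravan[-(1:1000), ]

# Assumed search range: intercept-only model up to all 85 predictors.
null_fit <- glm(Purchase ~ 1, data = train, family = binomial)
full_fit <- glm(Purchase ~ ., data = train, family = binomial)

# Forward selection; the default penalty k = 2 corresponds to AIC.
aic_fit <- step(null_fit,
                scope = list(lower = formula(null_fit),
                             upper = formula(full_fit)),
                direction = "forward", trace = 0)

# All Caravan predictors are numeric, so each adds exactly one coefficient.
d2 <- length(coef(aic_fit)) - 1

prob_aic <- predict(aic_fit, newdata = test, type = "response")
pred_aic <- ifelse(prob_aic > 0.25, "Yes", "No")
a2 <- sum(test$Purchase == "No"  & pred_aic == "Yes")
b2 <- sum(test$Purchase == "Yes" & pred_aic == "No")
c2 <- round(as.numeric(auc(roc(test$Purchase, prob_aic,
                               levels = c("No", "Yes"),
                               direction = "<", quiet = TRUE))), 3)

print(c(d2 = d2, a2 = a2, b2 = b2, c2 = c2))
```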
Apply forward variable selection using BIC. Use the selected model to obtain the predicted probabilities on the test data.
- We use a model with ______ [d3] (a non-negative integer) non-intercept predictors.
- If we use 0.25 as the probability cut-off, we misclassify ______ [a3] (an integer) samples among 941 "No" and misclassify ______ [b3] (an integer) samples among 59 "Yes".
- The AUC for this classifier is ______ [c3] (round to 3 digits after the decimal point).
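For BIC, `step()` takes the penalty through its `k` argument, and setting `k = log(n)` with n the number of training observations gives the BIC criterion. The sketch below makes the same assumptions as any forward search from the intercept-only model (illustrative object names, strict 0.25 cut-off).

```r
library(ISLR)
library(pROC)

test  <- Caravan[1:1000, ]
train <- Caravan[-(1:1000), ]

null_fit <- glm(Purchase ~ 1, data = train, family = binomial)
full_fit <- glm(Purchase ~ ., data = train, family = binomial)

# k = log(n) turns the AIC-style penalty used by step() into BIC.
bic_fit <- step(null_fit,
                scope = list(lower = formula(null_fit),
                             upper = formula(full_fit)),
                direction = "forward", trace = 0,
                k = log(nrow(train)))

d3 <- length(coef(bic_fit)) - 1   # number of non-intercept predictors

prob_bic <- predict(bic_fit, newdata = test, type = "response")
pred_bic <- ifelse(prob_bic > 0.25, "Yes", "No")
a3 <- sum(test$Purchase == "No"  & pred_bic == "Yes")
b3 <- sum(test$Purchase == "Yes" & pred_bic == "No")
c3 <- round(as.numeric(auc(roc(test$Purchase, prob_bic,
                               levels = c("No", "Yes"),
                               direction = "<", quiet = TRUE))), 3)

print(c(d3 = d3, a3 = a3, b3 = b3, c3 = c3))
```

Because log(n) exceeds 2 for any n above 7, the BIC penalty is heavier than the AIC penalty here, so the BIC search generally stops with no more predictors than the AIC search.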
Use an L1 penalty to select a subset of the predictors. Use the glmnet package with lambda = 0.004, keeping the default options such as standardize = TRUE and intercept = TRUE. Use the selected model to obtain the predicted probabilities on the test data.
- We use a model with ______ [d4] (a non-negative integer) non-intercept predictors.
- If we use 0.25 as the probability cut-off, we misclassify ______ [a4] (an integer) samples among 941 "No" and misclassify ______ [b4] (an integer) samples among 59 "Yes".
- The AUC for this classifier is ______ [c4] (round to 3 digits after the decimal point).
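A possible glmnet setup for the L1-penalized fit is sketched below, assuming the "glmnet" package is installed. Counting the predictors with a nonzero coefficient at lambda = 0.004 as the "selected model", and predicting directly from the penalized fit rather than refitting, are assumptions about what the question intends; object names are illustrative.

```r
library(ISLR)
library(glmnet)
library(pROC)

test  <- Caravan[1:1000, ]
train <- Caravan[-(1:1000), ]

# glmnet needs a numeric matrix; all 85 Caravan predictors are numeric.
x_train <- as.matrix(train[, names(train) != "Purchase"])
x_test  <- as.matrix(test[,  names(test)  != "Purchase"])

# alpha = 1 is the lasso (L1) penalty; standardize and intercept keep
# their defaults (TRUE). For a factor response, the last level ("Yes")
# is the modelled class.
lasso_fit <- glmnet(x_train, train$Purchase, family = "binomial",
                    alpha = 1, lambda = 0.004)

# Number of non-intercept predictors with a nonzero coefficient.
beta <- as.matrix(coef(lasso_fit))
d4   <- sum(beta[-1, 1] != 0)

prob_lasso <- as.vector(predict(lasso_fit, newx = x_test, type = "response"))
pred_lasso <- ifelse(prob_lasso > 0.25, "Yes", "No")
a4 <- sum(test$Purchase == "No"  & pred_lasso == "Yes")
b4 <- sum(test$Purchase == "Yes" & pred_lasso == "No")
c4 <- round(as.numeric(auc(roc(test$Purchase, prob_lasso,
                               levels = c("No", "Yes"),
                               direction = "<", quiet = TRUE))), 3)

print(c(d4 = d4, a4 = a4, b4 = b4, c4 = c4))
```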
Result for:
a1:
b1:
c1:
d2:
a2:
b2:
c2:
d3:
a3:
b3:
c3:
d4:
a4:
b4:
c4: