Question

In this exercise, using RStudio, you will generate simulated data and then use this data to perform best subset selection.

1. Use the following code to generate a predictor X of length n = 100, as well as a noise vector ε of length n = 100.

set.seed(1)

x <- rnorm(100)

eps <- rnorm(100)

2. Generate a response vector Y of length n = 100 according to the model:

Y = β0 + β1 X + β2 X^2 + β3 X^3 + ε

where β0, β1, β2, and β3 are constants of your choice.

Sample code (replace b0, b1, b2, and b3 with values of your choice):

b0 <- 2

b1 <- 3

b2 <- -1

b3 <- 0.5

y <- b0 + b1 * x + b2 * x^2 + b3 * x^3 + eps
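One way to sanity-check the simulated response is to fit the true cubic model with lm() and compare the estimates with the chosen constants. The following is a minimal sketch using the sample values above; the names fit.true and est are introduced here for illustration only.

```r
# Regenerate the simulated data so this block runs on its own
set.seed(1)
x <- rnorm(100)
eps <- rnorm(100)
b0 <- 2; b1 <- 3; b2 <- -1; b3 <- 0.5
y <- b0 + b1 * x + b2 * x^2 + b3 * x^3 + eps

# Fit the true cubic model (hypothetical helper name: fit.true)
fit.true <- lm(y ~ x + I(x^2) + I(x^3))
est <- coef(fit.true)

# Side-by-side comparison of the chosen constants and their estimates
cbind(true = c(b0, b1, b2, b3), estimate = est)
```

If the estimates are far from the constants you chose, the response was probably not generated as intended.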

3. Use the regsubsets() function to perform best subset selection in order to choose the best model containing the predictors X, X^2, ..., X^10. What is the best model obtained according to Cp, BIC, and adjusted R^2? Show some plots to provide evidence for your answer, and report the coefficients of the best model obtained. Note that you will need to use the data.frame() function to create a single data set containing both X and Y (sample code is provided below).

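# Install leaps only if it is not already available
if (!requireNamespace("leaps", quietly = TRUE)) install.packages("leaps")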

library(leaps)

data.full <- data.frame(y = y, x = x)

regfit.full <- regsubsets(y ~ x + I(x^2) + I(x^3) + I(x^4) + I(x^5) + I(x^6) + I(x^7) + I(x^8) + I(x^9) + I(x^10), data = data.full, nvmax = 10)

reg.summary <- summary(regfit.full)

par(mfrow = c(2, 2))

plot(reg.summary$cp, xlab = "Number of variables", ylab = "C_p", type = "l")

points(which.min(reg.summary$cp), reg.summary$cp[which.min(reg.summary$cp)], col = "red", cex = 2, pch = 20)

plot(reg.summary$bic, xlab = "Number of variables", ylab = "BIC", type = "l")

points(which.min(reg.summary$bic), reg.summary$bic[which.min(reg.summary$bic)], col = "red", cex = 2, pch = 20)

plot(reg.summary$adjr2, xlab = "Number of variables", ylab = "Adjusted R^2", type = "l")

points(which.max(reg.summary$adjr2), reg.summary$adjr2[which.max(reg.summary$adjr2)], col = "red", cex = 2, pch = 20)

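# Coefficients of the model chosen by each criterion (Cp, BIC, adjusted R^2)
coef(regfit.full, which.min(reg.summary$cp))

coef(regfit.full, which.min(reg.summary$bic))

coef(regfit.full, which.max(reg.summary$adjr2))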
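To see what best subset selection means conceptually, the search can also be written out by brute force in base R: fit every non-empty subset of the ten powers of x and keep the one with the lowest BIC. This is only an illustrative sketch under the sample constants above, not how regsubsets() is implemented (the leaps package uses a more efficient search); the names powers, subsets, bic, and best are hypothetical.

```r
# Regenerate the simulated data so this block runs on its own
set.seed(1)
x <- rnorm(100)
eps <- rnorm(100)
y <- 2 + 3 * x - 1 * x^2 + 0.5 * x^3 + eps

# Candidate predictors: columns x^1, ..., x^10
powers <- sapply(1:10, function(k) x^k)
colnames(powers) <- paste0("x^", 1:10)

# Every non-empty subset of the 10 columns (2^10 - 1 = 1023 subsets)
subsets <- unlist(lapply(1:10, function(k) combn(10, k, simplify = FALSE)),
                  recursive = FALSE)

# BIC of the least-squares fit for each subset
bic <- sapply(subsets, function(idx) {
  BIC(lm(y ~ powers[, idx, drop = FALSE]))
})

# Predictors in the subset with the smallest BIC
best <- subsets[[which.min(bic)]]
colnames(powers)[best]
```

The values from BIC() are on a different scale from reg.summary$bic, but the two should differ only by an additive constant, so the selected subset would be expected to match the BIC choice from regsubsets().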
