Question
Q2. Please do the following in RStudio.
Assume the following multiple linear regression model:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \epsilon$,
$\epsilon \sim N(0, \sigma^2)$
First, simulate a dataset for multiple linear regression in R. The dataset will consist of one outcome variable (Y) and three predictor variables (X = (X1, X2, X3)). X has to be simulated from a multivariate normal distribution. You can use the following simulation code:
library(MASS)
## Simulation for correlated predictors ##
set.seed(12345)
nsample <- 10; nsim <- 100
sig2 <- rchisq(1, df = 1) ## The true error variance
bet <- c(rnorm(3, 0, 1), 0) ## 4 values of beta that is beta0, beta1, beta2, beta3 = 0
muvec <- rnorm(3, 0, 1)
sigmat <- diag(rchisq(3, df = 4))
X <- mvrnorm(nsample, mu = muvec, Sigma = sigmat)
Xmat <- cbind(1, X)
## Simulate the response ##
bets <- matrix(NA, ncol = length(bet), nrow = nsim)
for(i in 1:nsim){
  Y <- Xmat %*% bet + rnorm(nsample, 0, sqrt(sig2))
  model1 <- lm(Y ~ X)
  bets[i,] <- coef(model1)
}
There are a few things to be noted from these simulations. The vector $\beta$ has four values, $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$. You can see that $\beta_3 = 0$, i.e., the third predictor is not linearly related to the response. Here sigmat is the variance-covariance matrix for X, the predictors, where the diagonal elements are variances (not standard deviations) and the off-diagonal elements are covariances.
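The structure of sigmat described above can be checked directly. The following is a minimal sketch, not part of the provided code: the names sds, cormat, n_big and Xbig are introduced here for illustration only, and the large sample size is an arbitrary choice made for the check.

```r
library(MASS)

set.seed(12345)
muvec  <- rnorm(3, 0, 1)
sigmat <- diag(rchisq(3, df = 4))  # variances on the diagonal, zero covariances

# Standard deviations are the square roots of the diagonal entries
sds <- sqrt(diag(sigmat))

# With zero off-diagonals, the implied correlation matrix is the identity
cormat <- cov2cor(sigmat)
print(cormat)

# With a large sample, the empirical covariance of X should be close to sigmat
# (n_big is a hypothetical value used only for this check)
n_big <- 1e5
Xbig  <- mvrnorm(n_big, mu = muvec, Sigma = sigmat)
print(round(cov(Xbig), 2))
print(round(sigmat, 2))
```

To simulate correlated predictors instead, the off-diagonal entries of sigmat would need to be set to nonzero covariances while keeping the matrix symmetric and positive definite.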
(a). First assume that the correlations between the three predictors are zero, i.e., the off-diagonals of sigmat are zero, as in the code provided above. Set the number of simulations nsim = 100 and the sample size for each simulation to 10. Generate Y for each simulation. Then run a simple linear regression for each of the three predictors separately. Obtain the regression parameter estimates and their variances from the coefficients tables obtained from the lm function. Comment on whether the estimators are unbiased. That is, calculate the mean of all regression parameter estimates and check if the values are approximately equal to the true values.
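One possible way to organize step (a) is sketched below. It reuses the setup from the provided code; the storage matrices slope_est and slope_var and the temporary names xj and tab are assumptions made for this illustration. Only the slopes are stored here, and the variances are taken as the squared standard errors from the coefficients table.

```r
library(MASS)

set.seed(12345)
nsample <- 10; nsim <- 100
sig2   <- rchisq(1, df = 1)          # true error variance
bet    <- c(rnorm(3, 0, 1), 0)       # beta0, beta1, beta2, beta3 = 0
muvec  <- rnorm(3, 0, 1)
sigmat <- diag(rchisq(3, df = 4))    # uncorrelated predictors
X      <- mvrnorm(nsample, mu = muvec, Sigma = sigmat)
Xmat   <- cbind(1, X)

# One row per simulation, one column per predictor (hypothetical names)
slope_est <- matrix(NA, nrow = nsim, ncol = 3)
slope_var <- matrix(NA, nrow = nsim, ncol = 3)

for (i in 1:nsim) {
  Y <- drop(Xmat %*% bet) + rnorm(nsample, 0, sqrt(sig2))
  for (j in 1:3) {
    xj  <- X[, j]
    tab <- coef(summary(lm(Y ~ xj)))           # coefficients table
    slope_est[i, j] <- tab["xj", "Estimate"]
    slope_var[i, j] <- tab["xj", "Std. Error"]^2
  }
}

# Compare the average slope estimates with the true slopes
print(rbind(true = bet[2:4], mean_est = colMeans(slope_est)))
# Average estimated variance of each slope
print(colMeans(slope_var))
```

Because X is drawn once and held fixed across simulations, the sample correlations among its columns are generally not exactly zero even though sigmat is diagonal, so the averages from these one-predictor models may differ somewhat from the true slopes.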
(b). Now fit a multiple linear regression and obtain the regression parameter estimates along with their variances from each simulation. Again check the unbiasedness and the variances. Compare the results with step (a). Remember that in step (a) we are fitting incorrect models and in step (b) we are fitting the correct model.
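A sketch of step (b) under the same setup follows. The matrix betvars and the vector theo_var are names introduced here for illustration; theo_var uses the standard result that, for fixed X, the least squares estimator has variance sig2 * (X'X)^(-1), which gives a benchmark for the simulated variances.

```r
library(MASS)

set.seed(12345)
nsample <- 10; nsim <- 100
sig2   <- rchisq(1, df = 1)          # true error variance
bet    <- c(rnorm(3, 0, 1), 0)       # beta0, beta1, beta2, beta3 = 0
muvec  <- rnorm(3, 0, 1)
sigmat <- diag(rchisq(3, df = 4))
X      <- mvrnorm(nsample, mu = muvec, Sigma = sigmat)
Xmat   <- cbind(1, X)

bets    <- matrix(NA, nrow = nsim, ncol = length(bet))  # estimates
betvars <- matrix(NA, nrow = nsim, ncol = length(bet))  # estimated variances

for (i in 1:nsim) {
  Y      <- drop(Xmat %*% bet) + rnorm(nsample, 0, sqrt(sig2))
  model1 <- lm(Y ~ X)
  tab    <- coef(summary(model1))               # coefficients table
  bets[i, ]    <- tab[, "Estimate"]
  betvars[i, ] <- tab[, "Std. Error"]^2
}

# Unbiasedness check: average estimates against the true values
print(rbind(true = bet, mean_est = colMeans(bets)))

# Variance check: theoretical, average estimated, and empirical variances
theo_var <- sig2 * diag(solve(crossprod(Xmat)))
print(rbind(theoretical    = theo_var,
            mean_estimated = colMeans(betvars),
            empirical      = apply(bets, 2, var)))
```

The three rows of the variance table should be of similar size when the model is correctly specified; with only 100 simulations, some discrepancy is expected.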