Question
Hi, here is my question about stat 331.
My R code is
## RESIDUAL PLOTS V.S. FITTED VALUES
Selected_Model2=lm(formula = wt ~ gestation + parity + meth + mage + med + mht +
mwt + feth + fed + time + number + gestation:med + gestation:number +
gestation:mage + med:mht + mage:feth + gestation:feth + mwt:fed +
mage:mht, data = Original_Data)
## Models to compare
M1=Selected_Model1
M2=Selected_Model2
Mnames=expression(M[Selected_Model1],M[Selected_Model2])
## Cross-validation setup
nreps =2e3 # 2000 replications
ntot = nrow(Original_Data) # total number of observations
ntrain =1030 # randomly choose 1030 observations as training set
ntest = ntot-ntrain # size of test set
mspe1 = rep(NA, nreps) # sum-of-square errors for each CV replication
mspe2 = rep(NA, nreps)
logLambda = rep(NA, nreps) # log-likelihood ratio statistic for each replication
for(ii in 1:nreps) {
  if(ii%%400 == 0) message("ii = ", ii)
  # randomly select training observations
  train.ind = sample(ntot, ntrain) # training observations
  M1.cv = update(M1, subset = train.ind)
  M2.cv = update(M2, subset = train.ind)
  # out-of-sample residuals for both models
  # that is, testing data - predictions with training parameters
  M1.res = Original_Data$wt[-train.ind] -
    predict(M1.cv, newdata = Original_Data[-train.ind,])
  M2.res = Original_Data$wt[-train.ind] -
    predict(M2.cv, newdata = Original_Data[-train.ind,])
  # mean-square prediction errors
  mspe1[ii] = mean(M1.res^2)
  mspe2[ii] = mean(M2.res^2)
  # out-of-sample likelihood ratio
  M1.sigma = sqrt(sum(resid(M1.cv)^2)/ntrain) # MLE of sigma
  M2.sigma = sqrt(sum(resid(M2.cv)^2)/ntrain)
  # since res = y - pred, dnorm(y, pred, sd) = dnorm(res, 0, sd)
  logLambda[ii] = sum(dnorm(M1.res, mean = 0, sd = M1.sigma, log = TRUE))
  logLambda[ii] = logLambda[ii] -
    sum(dnorm(M2.res, mean = 0, sd = M2.sigma, log = TRUE))
}
and I got an error message:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
factor number has new levels more than 60 smokes per day
In addition: Warning message:
In predict.lm(M2.cv, newdata = Original_Data[-train.ind, ]) :
prediction from a rank-deficient fit may be misleading
How should I fix my data?
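The error says that the factor number has a level in the test rows that never occurred in the training rows for that replication, so predict() has no coefficient for it. One possible workaround, sketched below, is to redraw the split until every level of the factor that shows up in the test rows also shows up in the training rows. This is only a sketch under assumptions: sample_train is a hypothetical helper, not part of the original code, and toy is simulated data standing in for Original_Data (its level names are made up). The rank-deficiency message is a separate warning and is not addressed here.

```r
# Hypothetical helper (not in the original code): draw training indices so that
# every factor level found in the held-out rows also occurs in the training rows.
sample_train <- function(data, ntrain, factor_vars, max_tries = 1000) {
  for (try in seq_len(max_tries)) {
    idx <- sample(nrow(data), ntrain)
    ok <- all(vapply(factor_vars, function(v) {
      # levels seen in the test rows must all be seen in the training rows
      all(unique(data[[v]][-idx]) %in% unique(data[[v]][idx]))
    }, logical(1)))
    if (ok) return(idx)
  }
  stop("no valid split found in ", max_tries, " tries")
}

# Simulated stand-in for Original_Data (illustration only, not the real data)
set.seed(1)
toy <- data.frame(
  wt = rnorm(200, mean = 120, sd = 15),
  gestation = rnorm(200, mean = 280, sd = 10),
  number = factor(sample(c("never", "light", "heavy", "rare"), 200,
                         replace = TRUE, prob = c(0.60, 0.25, 0.13, 0.02)))
)

fit <- lm(wt ~ gestation + number, data = toy)

# Inside the cross-validation loop, this call would replace sample(ntot, ntrain)
train.ind <- sample_train(toy, ntrain = 150, factor_vars = "number")
fit.cv <- update(fit, subset = train.ind)
pred <- predict(fit.cv, newdata = toy[-train.ind, ])
```

Rejecting splits this way means the splits are no longer fully random, which may matter if a level has only one or two observations. In that case, merging the rare level into a neighbouring category before fitting might be a simpler option.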