
Question


Hi, here is my question about STAT 331.

My R code is

## RESIDUAL PLOTS V.S. FITTED VALUES
Selected_Model2 = lm(formula = wt ~ gestation + parity + meth + mage + med + mht +
    mwt + feth + fed + time + number + gestation:med + gestation:number +
    gestation:mage + med:mht + mage:feth + gestation:feth + mwt:fed +
    mage:mht, data = Original_Data)

## Models to compare
M1 = Selected_Model1  # Selected_Model1 is fitted earlier (not shown in this post)
M2 = Selected_Model2
Mnames = expression(M[Selected_Model1], M[Selected_Model2])

## Cross-validation setup
nreps = 2e3                   # 2000 replications
ntot = nrow(Original_Data)    # total number of observations
ntrain = 1030                 # randomly choose 1030 observations as training set
ntest = ntot - ntrain         # size of test set
mspe1 = rep(NA, nreps)        # mean-square prediction error for each CV replication
mspe2 = rep(NA, nreps)
logLambda = rep(NA, nreps)    # log-likelihood ratio statistic for each replication

for(ii in 1:nreps) {
  if(ii %% 400 == 0) message("ii = ", ii)
  # randomly select training observations
  train.ind = sample(ntot, ntrain)  # training observations
  M1.cv = update(M1, subset = train.ind)
  M2.cv = update(M2, subset = train.ind)
  # out-of-sample residuals for both models
  # that is, testing data - predictions with training parameters
  M1.res = Original_Data$wt[-train.ind] -
    predict(M1.cv, newdata = Original_Data[-train.ind,])
  M2.res = Original_Data$wt[-train.ind] -
    predict(M2.cv, newdata = Original_Data[-train.ind,])
  # mean-square prediction errors
  mspe1[ii] = mean(M1.res^2)
  mspe2[ii] = mean(M2.res^2)
  # out-of-sample likelihood ratio
  M1.sigma = sqrt(sum(resid(M1.cv)^2)/ntrain)  # MLE of sigma
  M2.sigma = sqrt(sum(resid(M2.cv)^2)/ntrain)
  # since res = y - pred, dnorm(y, pred, sd) = dnorm(res, 0, sd)
  logLambda[ii] = sum(dnorm(M1.res, mean = 0, sd = M1.sigma, log = TRUE))
  logLambda[ii] = logLambda[ii] -
    sum(dnorm(M2.res, mean = 0, sd = M2.sigma, log = TRUE))
}

and I got an error message:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  factor number has new levels more than 60 smokes per day
In addition: Warning message:
In predict.lm(M2.cv, newdata = Original_Data[-train.ind, ]) :
  prediction from a rank-deficient fit may be misleading

How should I fix my data?
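
One likely reading of the error above: in some replications the random training subset contains no row with number equal to "more than 60 smokes per day", so lm() drops that level when fitting, and predict() then meets it in the test rows. The sketch below reproduces that situation on a small synthetic data frame and shows one possible workaround, which is to score only the test rows whose levels were seen in training. The data frame toy, its values, and the level labels are hypothetical stand-ins, not the asker's data, and this is only one option among several.

```r
set.seed(1)

# Hypothetical stand-in for Original_Data: one rare factor level ("60+")
# that occurs in a single row.
toy <- data.frame(
  wt        = rnorm(40, mean = 120, sd = 15),
  gestation = rnorm(40, mean = 280, sd = 10),
  number    = factor(c(rep(c("never", "1-4", "5-9"), length.out = 39), "60+"))
)

# Deliberately leave the rare row (row 40) out of the training set.
train.ind <- 1:30
fit_cv <- lm(wt ~ gestation + number, data = toy[train.ind, ])
test   <- toy[-train.ind, ]

# Predicting on the full test set fails, because "60+" was never seen in training.
err_msg <- tryCatch({
  predict(fit_cv, newdata = test)
  NA_character_
}, error = function(e) conditionMessage(e))
message("predict() error: ", err_msg)

# Possible workaround: keep only the test rows whose level of `number`
# is among the levels stored in the fitted model.
seen <- test$number %in% fit_cv$xlevels$number
pred <- predict(fit_cv, newdata = droplevels(test[seen, ]))

# Residuals must be computed on the same filtered rows.
res <- test$wt[seen] - pred
mean(res^2)  # mean-square prediction error on the rows that could be scored
```

In the original loop, the same filter would have to be applied consistently to M1.res and M2.res so both models are scored on identical test rows. Dropping rows changes what the test set represents, so alternatives worth considering are merging very sparse levels of number into a neighbouring category before fitting, or drawing train.ind so that every level appears at least once in the training set. The rank-deficiency warning may have a related cause: with a sparse level and interactions such as gestation:number, some coefficients probably cannot be estimated from the training subset alone.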
