Question
This question uses R.
This project is to be completed individually. You should submit a project report and a source code file.

This project will use the diabetes data in Efron et al. (2003), which consists of ten baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements) obtained for n = 442 diabetes patients, as well as the response (a quantitative measure of disease progression one year after baseline). The data is available in the R package lars. You can load the data as follows:

library(lars)
## Loaded lars 1.2
data(diabetes)
data.all <- data.frame(cbind(diabetes$x, y = diabetes$y))  # change to normal formatting

Use the sample function in R to partition the patients into a training dataset (~70%) and a testing dataset (~30%). Please use the random number generator seed specified below before randomly splitting the data:

set.seed(312019)  # set random number generator seed to enable reproducibility of results

We used a similar random split on page 13 of Lecture 6. You may use it as a reference if you are not certain how to obtain the training dataset and testing dataset.

Project Requirements

Requirement 1: Exploratory Data Analysis (e.g., graphical and numerical descriptive statistics, correlation coefficients).

Requirement 2: Fit the following regression models on the training dataset. Summarize the fit of these models.
1. Fit the full model, called m1, which is the linear regression model using all ten predictors.
2. Fit the sub model, called m2, obtained by removing the predictors that are not significant at alpha = 0.05 in m1.
3. Use the step function to choose a sub model, called m3, by AIC in a stepwise algorithm: m3 <- step(m1). Please see https://stat.ethz.ch/R-manual/R-devel/library/stats/html/step.html for more details.
Compare the coefficient estimates among the three models. Compare the three models using F-tests.

Requirement 3: For each model, calculate the "mean prediction error" on the testing dataset.
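The workflow the question describes (seeded split, three fits, F-tests, test-set error) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the course's reference solution: the names train.idx, data.train, data.test, keep, and mpe are hypothetical; the training size is taken as round(0.7 * n), which may differ from the split used in Lecture 6; and "mean prediction error" is interpreted here as the mean squared prediction error, which the question does not define.

```r
library(lars)                      # provides the diabetes data
data(diabetes)
data.all <- data.frame(cbind(diabetes$x, y = diabetes$y))

# Seeded ~70/30 split (assumption: training size = round(0.7 * n))
set.seed(312019)
n <- nrow(data.all)
train.idx  <- sample(n, size = round(0.7 * n))
data.train <- data.all[train.idx, ]
data.test  <- data.all[-train.idx, ]

# m1: full model with all ten predictors
m1 <- lm(y ~ ., data = data.train)

# m2: keep only predictors with p < 0.05 in m1 (intercept row dropped).
# Assumes at least one predictor is significant; reformulate() fails otherwise.
pvals <- summary(m1)$coefficients[-1, "Pr(>|t|)"]
keep  <- names(pvals)[pvals < 0.05]
m2 <- lm(reformulate(keep, response = "y"), data = data.train)

# m3: stepwise selection by AIC, starting from the full model
m3 <- step(m1, trace = 0)

# Coefficient estimates of the three models
print(coef(m1)); print(coef(m2)); print(coef(m3))

# Partial F-tests: each sub model is nested in m1, so these are valid.
# m2 and m3 are only comparable by F-test if one is nested in the other.
print(anova(m2, m1))
print(anova(m3, m1))

# Assumption: "mean prediction error" = mean squared error on the test set
mpe <- function(model, newdata) {
  mean((newdata$y - predict(model, newdata = newdata))^2)
}
errs <- sapply(list(m1 = m1, m2 = m2, m3 = m3), mpe, newdata = data.test)
print(errs)
```

Selecting m2 programmatically from the p-values keeps the script reproducible if the seed or split changes; the report should still list which predictors were dropped and why.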