
Question


This question uses R.

This project is to be completed individually. You should submit a project report and a source code file.

This project will use the diabetes data in Efron et al. (2003), which consists of ten baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements) obtained for n = 442 diabetes patients, as well as the response (a quantitative measure of disease progression one year after baseline). The data is available in the R package lars. You can load the data as follows:

    library(lars)
    ## Loaded lars 1.2
    data(diabetes)
    data.all <- data.frame(cbind(diabetes$x, y = diabetes$y)) # change to normal formatting

Use the sample function in R to partition the patients into a training dataset (about 70%) and a testing dataset (about 30%). Please use the random number generator seed specified below before randomly splitting the data:

    set.seed(312019) # set random number generator seed to enable reproducibility of results

We used a similar random split on page 13 of Lecture 6. You may use it as a reference if you are not certain how to obtain the training dataset and testing dataset.

Project Requirements

Requirement 1: Exploratory Data Analysis (e.g., graphical and numerical descriptive statistics, correlation coefficients).

Requirement 2: Fit the following regression models on the training dataset. Summarize the fit of these models.
1. Fit the full model, called m1, which is the linear regression model using all ten predictors.
2. Fit the sub model, called m2, after removing the predictors that are not significant at alpha = 0.05 in m1.
3. Use the step function to choose a sub model, called m3, by AIC in a stepwise algorithm: m3 <- step(m1). Please see https://stat.ethz.ch/R-manual/R-devel/library/stats/html/step.html for more details.
Compare the coefficient estimates among the three models. Compare the three models using an F-test.

Requirement 3: For each model, calculate the "mean prediction error" in the testing dataset.
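The workflow described in the requirements can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the course's reference solution: the exact 70/30 split from Lecture 6 is not shown in the question, so the `sample()` call below is one plausible way to do it, and "mean prediction error" is interpreted here as the mean squared prediction error on the testing dataset. The names `train.idx`, `data.train`, `data.test`, `keep` and `mspe` are hypothetical helpers introduced for this sketch.

```r
library(lars)   # provides the diabetes data (Efron et al., 2003)

data(diabetes)
data.all <- data.frame(cbind(diabetes$x, y = diabetes$y))

# Assumption: draw about 70% of the row indices at random for training.
set.seed(312019)
n <- nrow(data.all)
train.idx  <- sample(n, size = round(0.7 * n))
data.train <- data.all[train.idx, ]
data.test  <- data.all[-train.idx, ]

# Requirement 1 (very brief): numerical summaries and correlations.
summary(data.train)
round(cor(data.train), 2)

# Requirement 2: full model, significance-based sub model, AIC-based sub model.
m1 <- lm(y ~ ., data = data.train)

# Keep only predictors whose t-test p-value in m1 is below 0.05.
p.vals <- summary(m1)$coefficients[-1, "Pr(>|t|)"]   # drop the intercept row
keep   <- names(p.vals)[p.vals < 0.05]
m2 <- lm(reformulate(keep, response = "y"), data = data.train)

# With no scope given, step() searches backward from m1 using AIC.
m3 <- step(m1, trace = 0)

# Coefficient estimates of the three models.
lapply(list(m1 = m1, m2 = m2, m3 = m3), coef)

# Partial F-tests: each sub model is nested in the full model m1.
anova(m2, m1)
anova(m3, m1)

# Requirement 3 (assumed definition): mean squared prediction error on test data.
mspe <- function(model, newdata) {
  mean((newdata$y - predict(model, newdata = newdata))^2)
}
test.errors <- sapply(list(m1 = m1, m2 = m2, m3 = m3), mspe, newdata = data.test)
test.errors
```

The F-tests compare each sub model against m1 because both are nested in it. m2 and m3 are only directly comparable with `anova()` if one happens to be nested in the other, which depends on which predictors each procedure retains.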
