Question

Consider the regression model $y = \beta_0 1_n + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \in \mathbb{R}^n$. As usual, assume $\varepsilon \sim N(0, \sigma^2 I_n)$. The following is a description of the variables:

y     medv   Median value of owner-occupied homes in \$1000's
x_1   age    Proportion of owner-occupied units built prior to 1940
x_2   rm     Average number of rooms per dwelling
x_3   dis    Weighted distances to five Boston employment centers

Consider fitting model $M_1$: $\beta_2 = \beta_3 = 0$, that is, we only include the intercept and $x_1$ in the model. This is a simple regression model. Below is the output of R summarizing the result of fitting the model to the data:

Coefficients:
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   28.49769   0.95917      29.711    ...

(b) Which case has the highest leverage?
(c) Which case has the highest influence on the regression line, as measured by Cook's distance?
(d) Which case has the highest PRESS residual?

Now, consider fitting the model $M_2$: $\beta_1 = \beta_3 = 0$. Only the intercept and $x_2$ are included in this model. Assume that we fit $M_2$ and obtain SSE = 1501.163.

(e) Find $R^2$ for model $M_2$. (4 points)
(f) Perform a significance test (F-test) for model $M_2$. Specify the F-statistic explicitly. (4 points)

Now, let us compare models $M_1$ and $M_2$.

(g) Based on $R^2$, which of the two models $M_1$ and $M_2$ should we pick? (2 points)
(h) Is model selection based on $R^2$ justified when choosing between $M_1$ and $M_2$? Argue for or against. (2 points)

Consider the setup of the previous problem (housing data with three covariates $x_1$, $x_2$ and $x_3$), but assume that we look at a different subset of the data, now with sample size $n = 100$. Consider the following five models. (Note that the models are NOT named the same way as in the previous problem.)

M1: $y = \beta_0 + \varepsilon$
M2: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
M3: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
M4: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
M5: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$

[M3 and M4 read identically in the transcription; any exponents or transformations that distinguished them in the original image are lost.]

Either treat these model expressions as those of the population level (i.e., $y$, $x_1$, $x_2$, etc. are all scalars), or, if you want to treat them as vector quantities, interpret the nonlinear operations elementwise, e.g. $x_1 x_2 \in \mathbb{R}^n$ is a vector with $i$th element $x_{i1} x_{i2}$, where $x_{i1}$ is the $i$th element of $x_1 \in \mathbb{R}^n$, and so on.

(a) Specify which of the following pairs of models we can directly compare using an F-test: (1 point each)

1. to 5. [the five model pairs were not transcribed from the image]
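The quantities asked about in the first problem (leverage, Cook's distance, PRESS residuals, $R^2$ and the overall F-statistic) all have closed forms for a simple regression with an intercept and one covariate. The sketch below computes them using only the Python standard library. It is an illustration under stated assumptions, not a solution to the problem: the function name `simple_ols_diagnostics` and the five-point toy data set are hypothetical, and the housing data and the full R output are not available here.

```python
def simple_ols_diagnostics(x, y):
    """Fit y = b0 + b1*x by least squares and return common diagnostics.

    Hypothetical helper for illustration; p = 2 parameters (intercept, slope).
    """
    n, p = len(x), 2
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)               # residual sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    mse = sse / (n - p)

    # Leverage h_i: diagonal of the hat matrix for simple regression.
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # PRESS (leave-one-out prediction) residual: e_i / (1 - h_i).
    press = [e / (1.0 - h) for e, h in zip(resid, leverage)]
    # Cook's distance: e_i^2 * h_i / (p * MSE * (1 - h_i)^2).
    cooks = [e * e * h / (p * mse * (1.0 - h) ** 2)
             for e, h in zip(resid, leverage)]

    r2 = 1.0 - sse / sst
    # Overall F-statistic with (p - 1, n - p) degrees of freedom.
    f_stat = ((sst - sse) / (p - 1)) / mse

    return {"b0": b0, "b1": b1, "sse": sse, "sst": sst, "r2": r2,
            "f_stat": f_stat, "leverage": leverage, "press": press,
            "cooks": cooks}


# Toy data (made up for this sketch, NOT the housing data).
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 12.0]
out = simple_ols_diagnostics(x, y)


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


print("highest leverage case:", argmax(out["leverage"]))
print("highest Cook's distance case:", argmax(out["cooks"]))
print("largest |PRESS residual| case:", argmax([abs(r) for r in out["press"]]))
print("R^2:", out["r2"], "F:", out["f_stat"])
```

For parts (e) and (f), the same two formulas apply to $M_2$: $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ and $F = \frac{(\mathrm{SST} - \mathrm{SSE})/(p-1)}{\mathrm{SSE}/(n-p)}$ with $p = 2$. Evaluating them requires SST and $n$, which would have to be read off the part of the R output that is cut off above.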
