Question

Q3. Consider the multiple linear regression model given by $Y = X\beta + \varepsilon$, where $Y$ is the $n \times 1$ vector of the dependent variable, $X$ is the $n \times (p+1)$ design matrix of full rank, $\beta$ is the $(p+1) \times 1$ vector of regression coefficients, $\varepsilon$ is the $n \times 1$ vector of random errors satisfying $\varepsilon \sim N(0, I_n\sigma^2)$, and $I_n$ is the $n \times n$ identity matrix. Given the vector $y$ of the $n$ observations, the least squares estimator of $\beta$ is given by $\hat{\beta} = (X^TX)^{-1}X^Ty$, leading to the fitted model $\hat{y} = X\hat{\beta}$.

Now consider removing observation $i$ from the data. Let $X_{(i)}$ be the $(n-1) \times (p+1)$ matrix $X$ with row $i$ deleted. Let $y_{(i)}$ be the $(n-1) \times 1$ vector $y$ with observation $y_i$ deleted. Let $\hat{\beta}_{(i)}$ be the estimate of $\beta$ with observation $i$ deleted, and let $x_i^T$ be the $i$th row of $X$. Thus, $X_{(i)}^TX_{(i)} = X^TX - x_ix_i^T$ is of order $(p+1) \times (p+1)$, and $X_{(i)}^Ty_{(i)} = X^Ty - x_iy_i$ is of order $(p+1) \times 1$. It can be shown that (you do not need to prove this)

$$\hat{\beta} - \hat{\beta}_{(i)} = \frac{(X^TX)^{-1}x_i(y_i - \hat{y}_i)}{1 - h_{ii}},$$

where $h_{ii}$ is the $i$th diagonal element of $H = X(X^TX)^{-1}X^T$. Let $SSE = y^T\{I_n - X(X^TX)^{-1}X^T\}y$ denote the residual sum of squares based on all $n$ data points. Further, let $SSE_{(i)} = y_{(i)}^T\{I_{n-1} - X_{(i)}(X_{(i)}^TX_{(i)})^{-1}X_{(i)}^T\}y_{(i)}$ denote the residual sum of squares when the $i$th data point is deleted.

(b) In a study of the effects of cystic fibrosis, data were collected from 25 patients on variables related to body size and lung function.

$Y$ = maximal static expiratory pressure (cm H2O), a measure of malnutrition;
$X_1$ = body mass (weight/height$^2$) as a percentage of the age-specific median in normal individuals;
$X_2$ = weight (kg);
$X_3$ = residual volume;
$X_4$ = forced expiratory volume in one second.

Using the statistical computing package R, a multiple linear regression model containing only the four variables, $X_1, X_2, X_3, X_4$, has been fitted to the cystic fibrosis data and has been followed by an analysis of the effect of deleting the $i$th observation in turn from the data for $i = 1, \ldots, 25$. The results are presented on the next page.
Here, "hat" and "resid" contain the values $h_{ii}$ from the matrix $H$, and the residuals, $e_i$ $(i = 1, \ldots, n)$, from the multiple linear regression model based on all $n = 25$ observations. Also, "sigmahat" is the residual standard error for the model with the $i$th observation deleted. The remaining five columns contain the change in $\hat{\beta}_j$ effected by deleting the $i$th observation for a model containing $X_1, X_2, X_3, X_4$. At the bottom of the table are $\hat{\beta}_j$ $(j = 0, 1, \ldots, 4)$, together with their respective standard errors, and finally, the residual standard error, "sigmahat", for the model based on all $n$ observations.

Comment on the hat values and the residuals for the individual observations. Which observations have most influence, that is, effect most change, on the residual standard error and the $\hat{\beta}_j$ $(j = 0, 1, \ldots, 4)$, and why? [8 marks]

(c) For any ONE row in the table below, show numerically how the "hat", "resid" and "sigmahat" values are related to "sigma_hat for all data". [3 marks]
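The deletion quantities in this question can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses synthetic data (not the cystic fibrosis data, whose table is not reproduced here), and it relies on the standard leave-one-out result $SSE_{(i)} = SSE - e_i^2/(1 - h_{ii})$, which the question text does not quote. Together with the residual degrees of freedom $n - p - 1$ for the full fit and $n - p - 2$ after one deletion, that result gives $(n - p - 2)\hat{\sigma}_{(i)}^2 = (n - p - 1)\hat{\sigma}^2 - e_i^2/(1 - h_{ii})$, the kind of relation part (c) asks to verify for one row. The names `refit_without`, `delta_beta` and `sigma_i_formula` are hypothetical helpers chosen for this example.

```python
import numpy as np

# Synthetic data only: NOT the cystic fibrosis data from the question.
rng = np.random.default_rng(0)
n, p = 25, 4  # same dimensions as part (b)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # n x (p+1) design
y = X @ rng.normal(size=p + 1) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                 # full-data least squares estimate
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages h_ii, diagonal of H
e = y - X @ beta_hat                         # residuals e_i ("resid")
sigma_hat = np.sqrt(e @ e / (n - p - 1))     # residual standard error, all data


def refit_without(i):
    """Brute-force refit with observation i deleted (hypothetical helper)."""
    keep = np.arange(n) != i
    b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    r = y[keep] - X[keep] @ b
    return b, np.sqrt(r @ r / (n - p - 2))


i = 3  # arbitrary row chosen for illustration
beta_i, sigma_i = refit_without(i)

# Closed-form versions that use only full-data output
delta_beta = XtX_inv @ X[i] * e[i] / (1 - h[i])
sigma_i_formula = np.sqrt(
    ((n - p - 1) * sigma_hat**2 - e[i] ** 2 / (1 - h[i])) / (n - p - 2)
)

print(np.allclose(beta_hat - beta_i, delta_beta))
print(np.isclose(sigma_i, sigma_i_formula))
```

With $n = 25$ and $p = 4$ the relation reads $19\,\hat{\sigma}_{(i)}^2 = 20\,\hat{\sigma}^2 - e_i^2/(1 - h_{ii})$, so a row of the table can be checked by substituting its "hat", "resid" and "sigmahat" entries together with the full-data residual standard error.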

