
Question

2. (14 points) In a simple linear regression, $r^2$, the squared correlation between $x$ and $y$, is often used as a measure of how well the linear model fits the data. That is, $r^2$ measures how strong the linear relationship is between the response $y$ and the single predictor variable $x$. It is natural to ask how we can generalize this concept to the setting of a regression on multiple predictor variables $X_1, \ldots, X_p$. That is, we would like to ask how strong the linear relationship is between the response $y$ and the full set of predictors $X_1, \ldots, X_p$. The usual way that people generalize this concept is the multiple $R^2$, which we defined in class as the variance of the fitted values divided by the variance of the response:

$$R^2 = \frac{\sigma_{\hat{y}}^2}{\sigma_y^2}.$$

We mentioned in lecture that it is often referred to as the fraction or percentage of "variance explained by the regression," but we did not explain why people call it that. In fact, the multiple $R^2$ is a popular measure of how well the regression fits the data, because it has several interpretations:

1. $R^2$ is the squared correlation of the prediction vector $\hat{y}$ with the response $y$, so if it is very high, the response lines up almost exactly with the predictions, but if it is very low, the response has almost no association with the predictions. This is shown in part (e).
2. $1 - R^2$ is the variance of the residuals as a fraction of the original $\sigma_y^2$, so if $R^2$ is large, the residuals are very small compared to the variation of $y$. This is shown in part (d).
3. In the special case of OLS regression with only a single predictor variable, we can still compute $R^2$, and it is exactly the same as the squared correlation $r^2$ between $x$ and $y$. This is shown in optional part (f).

For all parts of this question, you can freely assume that any vectors like $y$, $e$, $X_j$, and so on are non-constant if we are calculating correlations (correlations with constant vectors are undefined).
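The three interpretations listed above can be checked numerically before proving them. The following is a minimal sketch, not part of the original problem: the data values are made up, only a single predictor is used so that the fit needs nothing beyond the standard library, and the helper names (`mean`, `variance`, `correlation`, `fit_simple_ols`) are hypothetical. The variance divides by n, matching the "average squared deviation" definition used in this problem.

```python
import math

def mean(v):
    return sum(v) / len(v)

def variance(v):
    # Average squared deviation from the mean (divides by n, not n - 1).
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

def correlation(u, v):
    # Covariance divided by the product of the standard deviations.
    mu, mv = mean(u), mean(v)
    cov = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / len(u)
    return cov / math.sqrt(variance(u) * variance(v))

def fit_simple_ols(x, y):
    # Least-squares intercept a and slope b for y ~ a + b * x.
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)
    b = cov / variance(x)
    a = my - b * mx
    return a, b

# Made-up illustrative data (an assumption, not from the problem).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.7]

a, b = fit_simple_ols(x, y)
y_hat = [a + b * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, y_hat)]

r_squared = variance(y_hat) / variance(y)       # definition of multiple R^2
sq_corr_pred = correlation(y_hat, y) ** 2       # interpretation 1
resid_fraction = variance(resid) / variance(y)  # interpretation 2: equals 1 - R^2
sq_corr_x = correlation(x, y) ** 2              # interpretation 3 (single predictor)

print(r_squared, sq_corr_pred, 1 - resid_fraction, sq_corr_x)
```

All four printed values should agree up to floating-point rounding. A numerical agreement like this is only a sanity check; parts (d), (e), and (f) ask for the algebraic argument.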
(a) (3 points) We have been using the notation $\sigma_x^2$ and $\sigma_y^2$ in class to denote the variance of vectors $x$ and $y$, which are defined as the average squared deviation from the means $\bar{x}$ and $\bar{y}$, respectively:

$$\sigma_x^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2, \qquad \sigma_y^2 = \frac{1}{n}\sum_i (y_i - \bar{y})^2.$$

This is closely related to the idea of the variance of a random variable, which we will cover later in the course, but for this problem you do not need to think about random variables. We are only referring to this specific sum, which measures how "spread out" the coordinates of a vector are around the average. To get practice working with the variance, show that if $\hat{y}_i = a + b x_i$, then the mean is $\bar{\hat{y}} = a + b\bar{x}$ and the variance is $\sigma_{\hat{y}}^2 = b^2 \sigma_x^2$.
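The claim in part (a) can be sanity-checked on a small example before writing the proof. This is a sketch under stated assumptions: the constants `a` and `b` and the vector `x` are arbitrary made-up values, and `mean` and `variance` are hypothetical helpers that follow the average-squared-deviation definition (dividing by n).

```python
def mean(v):
    return sum(v) / len(v)

def variance(v):
    # Average squared deviation from the mean (divides by n).
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

# Arbitrary illustrative values (assumptions, not from the problem).
a, b = 2.0, -3.0
x = [0.5, 1.5, 4.0, 7.0]

# Apply the linear transformation coordinate by coordinate.
y_hat = [a + b * xi for xi in x]

# Gap between each side of the two claimed identities.
mean_gap = abs(mean(y_hat) - (a + b * mean(x)))
var_gap = abs(variance(y_hat) - b ** 2 * variance(x))

print(mean_gap, var_gap)
```

Both gaps should be zero up to rounding. A negative `b` is used on purpose: the shift `a` drops out of the variance and the slope enters only through `b ** 2`, so its sign does not matter.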
