Question:
4.6 Understanding r². Exercise 3.8 (page 81) described a study of emotional distress as a function of time since a breakup. The scatterplot you made showed a clear linear pattern. As a pedagogical exercise, we will find the value of r² in a step-by-step manner to understand what it represents. Here are the data from the study and some corresponding calculations:
x     y     ŷ        y − ŷ     y − ȳ
0     3.3   3.2763    0.0237    0.325
2     3.1   3.1424   −0.0424    0.125
6     2.9   2.8746    0.0254   −0.075
10    2.6   2.6068   −0.0068   −0.375
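The ŷ, y − ŷ, and y − ȳ columns can be reproduced by fitting a least-squares line to the four (x, y) pairs. The following is a minimal Python sketch, assuming NumPy is available. The variable names are illustrative and do not come from the exercise.

```python
import numpy as np

# Data from the table: time since breakup (x) and average distress (y).
x = np.array([0.0, 2.0, 6.0, 10.0])
y = np.array([3.3, 3.1, 2.9, 2.6])

# Least-squares line of y on x; for degree 1, polyfit returns (slope, intercept).
slope, intercept = np.polyfit(x, y, 1)

y_hat = intercept + slope * x      # fitted values (the y-hat column)
residuals = y - y_hat              # y minus y-hat
deviations = y - y.mean()          # y minus y-bar

print(np.round(y_hat, 4))
print(np.round(residuals, 4))
print(np.round(deviations, 3))
```

Rounded to the precision shown in the table, the three printed arrays should match the last three columns.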
a. The mean of the four average distress values is ȳ = 2.975. Square each of the four differences (y − ȳ) and add them up. The value you get, Σ(y − ȳ)², corresponds to the total variation in y. (Notice that this is the numerator of the equation for the variance of y from Chapter 2.)
b. Now square the regression residuals (y − ŷ) and add them up. The value you get, Σ(y − ŷ)², is the variation in y that is still unexplained after fitting the linear model. Notice that this value is smaller than the value you computed in part a.
c. What fraction of the variation in y is explained by the linear model? Subtract the value you obtained in part b from the value you obtained in part a, then divide this difference by the value you obtained in part a. As with an item on sale (10% off or 50% off), this calculation tells you how much lower the variability of the regression residuals is relative to the total variability of y. This value is r².
d. Use technology to obtain the value of r². The result should agree with your own computation, up to roundoff error. Interpret this value in the context of the problem.
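Parts a through d can be traced in a few lines of code. The sketch below is one possible way to do it, assuming NumPy stands in for the "technology" of part d. The variable names are illustrative, and the residuals are the rounded values from the table rather than freshly fitted ones.

```python
import numpy as np

x = np.array([0.0, 2.0, 6.0, 10.0])
y = np.array([3.3, 3.1, 2.9, 2.6])
# Residuals (y minus y-hat) as tabulated in the exercise, rounded to 4 decimals.
residuals = np.array([0.0237, -0.0424, 0.0254, -0.0068])

# Part a: total variation in y, the sum of squared deviations from the mean.
total_ss = np.sum((y - y.mean()) ** 2)

# Part b: variation still unexplained after fitting the line.
residual_ss = np.sum(residuals ** 2)

# Part c: fraction of the total variation explained by the linear model.
r2_by_hand = (total_ss - residual_ss) / total_ss

# Part d: square the correlation coefficient computed by NumPy.
r = np.corrcoef(x, y)[0, 1]
r2_numpy = r ** 2

print(f"total SS    = {total_ss:.4f}")     # 0.2675
print(f"residual SS = {residual_ss:.6f}")  # 0.003051
print(f"r^2 by hand = {r2_by_hand:.4f}")   # 0.9886
print(f"r^2 (numpy) = {r2_numpy:.4f}")     # 0.9886
```

The two r² values agree to four decimal places. Any difference beyond that comes from the rounding of the tabulated residuals. A value near 0.99 indicates that the linear relationship with time since the breakup accounts for roughly 99% of the variation in the average distress values.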
