Question
Consider having SAS perform a regression analysis of the hospital labor needs data using all 17 hospitals and the model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 D_L + \varepsilon$
where $D_L$ is a dummy variable for the large hospitals (hospitals 14, 15, 16, and 17). If we do this, we find that the least squares point estimates of the model parameters and their associated p-values (given in parentheses) are b0 = 2462.2164 (0.0004), b1 = 0.0482 (0.0016), b2 = 0.7843 (p-value lost in the source), b3 = -432.4095 (0.0006), and b4 = 2871.7828 (0.0003). In addition, Figure 5.39 (page 276) gives the SAS output of outlying and influential observation diagnostics for this model.
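To make the role of the dummy variable concrete, the following minimal sketch evaluates the fitted equation with the reported point estimates. It assumes that x1, x2, and x3 correspond to Xray, BedDays, and Length (the order in which part d lists them), and the function name `predict_labor` is hypothetical. The input values are those of the questionable large hospital described in part d.

```python
# Reported least squares point estimates from the exercise statement.
B0, B1, B2, B3, B4 = 2462.2164, 0.0482, 0.7843, -432.4095, 2871.7828


def predict_labor(xray, bed_days, length, large):
    """Point prediction of labor needs.

    Assumption: x1 = Xray, x2 = BedDays, x3 = Length.
    `large` is the dummy variable D_L (1 for a large hospital, else 0).
    """
    return B0 + B1 * xray + B2 * bed_days + B3 * length + B4 * large


# Same inputs, differing only in the value of the dummy variable.
with_dummy = predict_labor(56194, 14077.88, 6.89, 1)
without_dummy = predict_labor(56194, 14077.88, 6.89, 0)

print(f"prediction with D_L = 1: {with_dummy:.2f}")
print(f"prediction with D_L = 0: {without_dummy:.2f}")
print(f"difference (equals b4):  {with_dummy - without_dummy:.4f}")
```

Because the dummy enters the model additively, the two predictions differ by exactly b4, which is the quantity part a asks you to interpret.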
a. By interpreting b4 = 2871.7828 and its associated p-value of .0003, discuss why there seems to be an important effect due to the inefficiency of large hospitals.
b. Given the large-hospital inefficiency, is hospital 14 an outlier with respect to its y value? Explain your answer.
c. Identify the hospital having the largest Cook's D. Does this hospital seem less influential than hospital 17 did when we used all 17 observations (including hospital 14) and no dummy variable to perform the regression analysis? See the output in Figure 5.20 (page 256). Explain your answer.
d. Although the remedial actions taken in Exercise 5.15 (remove hospital 14) and in this exercise (use a dummy variable) have lessened the influence of the larger hospitals (hospitals 14, 15, 16, and 17), these larger hospitals generally seem more influential than the small to medium-sized hospitals. This probably implies that we need more data concerning large hospitals to develop a better regression model for evaluating hospitals whose efficiency the navy questions. Since we do not now have such data, we will use the data we have to choose a model for evaluating the questionable hospitals. To do this, first note that the s for the dummy variable model of this exercise is 363.8542 and the s for the model of Exercise 5.15 is 387.1598. Also, note that both of these values of s are substantially smaller than the s of 614.7794 for the model using all 17 hospitals and no dummy variable. Next, consider a questionable large hospital ($D_L = 1$) for which Xray = 56,194, BedDays = 14,077.88, and Length = 6.89. Such a hospital has the following 95% prediction intervals for labor needs: [15,175, 17,030] if using the dummy variable model of this exercise; [14,906, 16,886] if using the model of Exercise 5.15; and [14,511, 17,618] if using the model of Exercise 4.4 (page 199), which does not employ a dummy variable and uses all 17 hospitals. Which of the three models gives the shortest prediction interval?
e. Figure 5.40(a) presents the plot of the residuals versus the predicted values for the model in Exercise 5.15 when hospital 14 is omitted. Figure 5.40(b) presents the same plot for the dummy variable model in this exercise when using all 17 hospitals. Which residual plot has the most horizontal band appearance (that is, suggests constant variance)?
f. Combining all the available information in Exercises 4.4, 5.14, 5.15, and 5.16, and in Figure 5.20, which of the three models seems best for evaluating the efficiency of questionable hospitals?
Figure 5.39
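For part d, comparing the three prediction intervals comes down to subtracting each lower endpoint from its upper endpoint. A minimal sketch of that arithmetic, using only the endpoints quoted in the exercise (the dictionary labels are descriptive names chosen here, not names from the textbook):

```python
# 95% prediction intervals quoted in part d, as (lower, upper) endpoints.
intervals = {
    "dummy variable model (this exercise)": (15175, 17030),
    "Exercise 5.15 model (hospital 14 removed)": (14906, 16886),
    "Exercise 4.4 model (all 17 hospitals, no dummy)": (14511, 17618),
}

# Width of each interval: upper endpoint minus lower endpoint.
widths = {name: upper - lower for name, (lower, upper) in intervals.items()}

# The model whose interval is narrowest.
shortest = min(widths, key=widths.get)

for name, width in widths.items():
    print(f"{name}: width = {width}")
print(f"shortest interval: {shortest}")
```

The ordering of the widths mirrors the ordering of the three s values given in part d, as expected, since a smaller standard error generally produces a narrower prediction interval.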