Answer the following questions, as attached below.
1. McDonald and Ayers (1978) present data from an early study that examined the possible link between air pollution and mortality. (The data_table_B15.xlex file summarizes the data.) The data description is as follows:
MORT = total age-adjusted mortality from all causes, in deaths per 100,000 population
PRECIP = mean annual precipitation (in inches)
EDUC = median number of school years completed for persons of age 25 years or older
NONWHITE = percentage of the 1960 population that is nonwhite
NOX = relative pollution potential of oxides of nitrogen
SO2 = relative pollution potential of sulfur dioxide
"Relative pollution potential" is the product of the tons emitted per day per square kilometer and a factor correcting for the SMSA dimensions and exposure.
a. Fit a multiple linear regression model relating the mortality rate to these regressors.
b. Test for significance of regression. What conclusions can you draw?
c. Use t tests to assess the contribution of each regressor to the model. Discuss your findings.
d. Find a 95% CI for the regression coefficient for SO2.
e. Run all possible models and choose the best one, with justification. (You may not consider the PRESS statistic.)
f. Run forward, backward, and stepwise regression on the data.
g. Do all three procedures pick the same model? If yes: should this happen all the time? If no: why don't they pick the same model?
h. Perform the residual analysis of your final model and provide the final estimated model (you must describe the rule you applied).

3.4 Reconsider the National Football League data from Problem 3.1. Fit a model to these data using only x7 and x8 as the regressors.
a. Test for significance of regression.
b. Calculate R² and R²_Adj. How do these quantities compare to the values computed for the model in Problem 3.1, which included an additional regressor (x2)?
c. Calculate a 95% CI on β7. Also find a 95% CI on the mean number of games won by a team when x7 = 56.0 and x8 = 2100. Compare the lengths of these CIs to the lengths of the corresponding CIs from Problem 3.3.
d. What conclusions can you draw from this problem about the consequences of omitting an important regressor from a model?

3.1 Consider the National Football League data in Table B.1.
a. Fit a multiple linear regression model relating the number of games won to the team's passing yardage (x2), the percentage of rushing plays (x7), and the opponents' yards rushing (x8).
b. Construct the analysis-of-variance table and test for significance of regression.
c. Calculate t statistics for testing the hypotheses H0: β2 = 0, H0: β7 = 0, and H0: β8 = 0. What conclusions can you draw about the roles the variables x2, x7, and x8 play in the model?
d. Calculate R² and R²_Adj for this model.
e. Using the partial F test, determine the contribution of x7 to the model. How is this partial F statistic related to the t test for β7 calculated in part c above?

3.2 Using the results of Problem 3.1, show numerically that the square of the simple correlation coefficient between the observed values y_i and the fitted values ŷ_i equals R².

3.3 Refer to Problem 3.1.
a. Find a 95% CI on β7.
b. Find a 95% CI on the mean number of games won by a team when x2 = 2300, x7 = 56.0, and x8 = 2100.
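
The quantities these problems ask for (the overall F test, t statistics, R² and adjusted R², a partial F test, and a coefficient CI) can be sketched in plain Python as below. This is only a rough illustration, not a solution: the data are made up and are NOT Table B.1 or Table B.15, the helper names `solve`, `ols`, and `corr` are hypothetical, and the critical value `T_CRIT` is assumed to be read from a t table for the example's 6 residual degrees of freedom.

```python
# Made-up illustrative data (10 observations, 3 regressors); NOT the textbook tables.
X = [(2.0, 50.0, 1.0), (3.0, 45.0, 2.0), (5.0, 60.0, 1.5), (7.0, 55.0, 3.0),
     (8.0, 40.0, 2.5), (10.0, 65.0, 4.0), (11.0, 52.0, 3.5), (13.0, 48.0, 5.0),
     (14.0, 70.0, 4.5), (16.0, 58.0, 6.0)]
y = [12.1, 11.0, 16.2, 15.8, 12.9, 20.5, 17.6, 17.9, 24.3, 21.7]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented copy; A is not modified
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Least-squares fit of y on the columns of X, with an intercept added."""
    rows = [[1.0] + list(r) for r in X]
    n, p = len(rows), len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    ybar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    ms_res = ss_res / (n - p)
    # Diagonal of (X'X)^-1, obtained by solving against unit vectors.
    c_diag = [solve(XtX, [1.0 if i == j else 0.0 for i in range(p)])[j]
              for j in range(p)]
    se = [(ms_res * c) ** 0.5 for c in c_diag]
    return {
        "beta": beta, "se": se, "t": [b / s for b, s in zip(beta, se)],
        "fitted": fitted, "ss_res": ss_res, "ms_res": ms_res,
        "r2": 1 - ss_res / ss_tot,
        "r2_adj": 1 - (ss_res / (n - p)) / (ss_tot / (n - 1)),
        "f0": ((ss_tot - ss_res) / (p - 1)) / ms_res,  # overall F statistic
    }

def corr(a, b):
    """Simple (Pearson) correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

full = ols(X, y)
# Reduced model: drop the second regressor to measure its contribution.
reduced = ols([(r[0], r[2]) for r in X], y)
partial_f = (reduced["ss_res"] - full["ss_res"]) / full["ms_res"]
r_sq_corr = corr(y, full["fitted"]) ** 2

T_CRIT = 2.447  # assumed t(0.025, 6) from a t table; df = n - p = 10 - 4
j = 2           # index of the second regressor (index 0 is the intercept)
ci = (full["beta"][j] - T_CRIT * full["se"][j],
      full["beta"][j] + T_CRIT * full["se"][j])

print("F0:", full["f0"], "R2:", full["r2"], "adj R2:", full["r2_adj"])
print("t statistics:", full["t"])
print("partial F:", partial_f, "t^2:", full["t"][j] ** 2)
print("corr(y, fitted)^2:", r_sq_corr)
print("95% CI:", ci)
```

In this sketch the partial F statistic for a single regressor should match the square of its t statistic, and the squared correlation between observed and fitted values should match R², which are the relationships Problems 3.1e and 3.2 ask about. For the actual assignment, a statistics package would normally supply the critical values and p-values.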