1. Open the Excel worksheet containing your Team Project Data.
2. As you learned in Module 2, you will use the set of potentially meaningful numerical independent variables and the one selected "two-category" dummy variable in your study to develop a "best" multiple regression model for predicting your numerical response variable Y.
   A. Start with a visual assessment of the possible relationships between your numerical dependent variable Y and each potential predictor variable by developing the scatterplot matrix (use JMP), and paste it into your report.
   B. Then fit a preliminary multiple regression model using these potential numerical predictor variables and, at most, one categorical dummy variable.
   C. Then assess collinearity with VIF until you are satisfied that you have a final set of possible predictors that are "independent," i.e., not unduly correlated with each other. Note your observations.
   D. Use stepwise regression approaches to fit a multiple regression model with this set of potentially meaningful numerical independent variables (and, if appropriate, the one selected categorical dummy variable).
      - (1) Based on the forward selection modeling criterion, determine which independent variables should be included in your regression model.
      - (2) Based on the backward selection modeling criterion, determine which independent variables should be included in your regression model.
      - (3) Based on the mixed selection modeling criterion, determine which independent variables should be included in your regression model.
      - (4) Based on the adjusted r² criterion, determine which independent variables should be included in your regression model.
   E. Comment on the consistency of your findings in Step 2D (1)-(4). Whether or not they are the same, explain why (hint: see the VIFs).
   F. Paste screenshots of the (1), (2), and (3) outputs from Step 2D above into your report.
   G. Based on Step 2D (along with the principle of parsimony if necessary), select a "best" multiple regression model. Note your finding.
   H. Using the predictor variables from your selected "best" multiple regression model, rerun the multiple regression model in order to assess its assumptions. You may use Excel or JMP for this step.
   I. Look at the set of residual plots, cut and paste them into the report, and briefly comment on the appropriateness of your fitted model.
      - (1) If the assumptions are met and the fitted model is appropriate, continue to Step 2J.
      - (2) If the normality assumption is problematic, state this but continue to Step 2J. Note: You do not need to check the assumption of independence in your project. That assumption is met because your project is not time-dependent.
      - (3) If either the linearity or the equality of variance assumption is violated in one or two scatter plots of Y with individual predictors, then transform the particular independent variables involved (try log, square root, etc.) and rerun the multiple regression model as in Step 2H.
   J. Assess the significance of the overall fitted model. Note your observation.
   K. Assess the significance of each predictor variable. Note your observations.
3. Write the sample multiple regression equation for the "final best" model you have developed.
   A. Interpret the meaning of the Y intercept and interpret the meaning of all the slopes for your fitted model (but do this in whatever units you used for Y to build this model).
   B. Interpret and describe the meaning of the coefficient of multiple determination r².
   C. Interpret and describe the meaning of the standard error of the estimate S_YX (in the units you used to build this model).
   D. Determine the 95% confidence interval estimate for each coefficient estimate (that you find for the independent variables) and interpret their impact on the dependent variable (Y) accordingly.
   Explain why we need to study the confidence interval estimates of the coefficients (hint: sample data).
   E. Select one value for each of your independent variables within its relevant range.
   F. Predict ŷ and include the units in the result.
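
The project itself is carried out in JMP or Excel. Purely as a cross-check, the quantities named in Steps 2C, 2D (4), 3B, 3C, and 3F can be sketched in plain Python. This is a minimal sketch under stated assumptions: the data are made up, and the helper names `solve`, `fit_ols`, and `vif` are hypothetical, not part of JMP, Excel, or any course material. It uses the standard definitions VIF_j = 1 / (1 - R_j²), adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1), and S_YX = sqrt(SSE / (n - k - 1)), where n is the number of observations and k the number of predictors.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ols(X, y):
    """Least-squares fit with an intercept; X is a list of rows of predictor values."""
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    b = solve(XtX, Xty)                          # normal equations: (X'X) b = X'y
    fitted = [sum(bi * ri for bi, ri in zip(b, r)) for r in rows]
    n, k = len(y), p - 1
    ybar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    s_yx = (sse / (n - k - 1)) ** 0.5            # standard error of the estimate
    return {"b": b, "r2": r2, "adj_r2": adj_r2, "s_yx": s_yx}


def vif(X, j):
    """VIF for predictor j: regress it on the remaining predictors."""
    target = [r[j] for r in X]
    others = [[v for i, v in enumerate(r) if i != j] for r in X]
    return 1 / (1 - fit_ols(others, target)["r2"])


# Made-up illustration data (two predictors, eight observations).
X_demo = [[1, 5], [2, 3], [3, 8], [4, 1], [5, 7], [6, 2], [7, 9], [8, 4]]
y_demo = [12.1, 11.8, 20.3, 11.0, 22.4, 15.9, 28.7, 21.2]

model = fit_ols(X_demo, y_demo)
print("coefficients (b0, b1, b2):", model["b"])
print("r2:", model["r2"], "adjusted r2:", model["adj_r2"], "S_YX:", model["s_yx"])
print("VIFs:", [vif(X_demo, j) for j in range(2)])

# Step 3F: a point prediction at one chosen value per predictor (within range).
x_new = [4, 6]
y_hat = model["b"][0] + sum(bj * xj for bj, xj in zip(model["b"][1:], x_new))
print("predicted Y:", y_hat)
```

The sketch leaves out the 95% confidence intervals of Step 3D, because they need a t critical value that the Python standard library does not provide; read those from the JMP or Excel regression output.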