Question

Posted on Sep 21, 2024

Hello, this is for BAS 320, Regression and Modeling with R. Can you please tell me what code/formulas you would use for these questions? I need help answering them and would highly appreciate any help.

#2: Use the dataframe Q2. First, fit a multiple regression model predicting 'grade' in Q2 from all remaining variables in Q2. Copy/paste the p-value of the test of equal spread into Google Form.
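A minimal base-R sketch for #2. The real Q2 is not shown, so the data below are made up, and the column names q1 to q5 are assumptions. Which test the course means by "test of equal spread" is also not stated here; the sketch uses a studentized Breusch-Pagan check computed by hand, so its p-value may not match the one your class function prints.

```r
# Made-up stand-in for Q2 (the real dataframe is not shown): grade plus q1..q5.
set.seed(1)
Q2 <- data.frame(q1 = runif(100, 50, 100), q2 = runif(100, 50, 100),
                 q3 = runif(100, 50, 100), q4 = runif(100, 50, 100),
                 q5 = runif(100, 50, 100))
Q2$grade <- 10 + 0.4 * Q2$q3 + 0.5 * Q2$q5 + rnorm(100, sd = 5)

# 'grade ~ .' predicts grade from every other column of Q2.
M <- lm(grade ~ ., data = Q2)
summary(M)

# Studentized Breusch-Pagan check: regress the squared residuals on the same
# predictors; n * R^2 is approximately chi-squared when the spread is equal.
AUX <- Q2[, names(Q2) != "grade"]
AUX$e2 <- residuals(M)^2
aux.fit <- lm(e2 ~ ., data = AUX)
bp.stat <- nrow(Q2) * summary(aux.fit)$r.squared
bp.pvalue <- pchisq(bp.stat, df = ncol(AUX) - 1, lower.tail = FALSE)  # df = number of predictors
bp.pvalue
```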

#3: Conduct a partial F test to determine if the two least significant predictors (i.e., those with the largest p-values) in your model from #2 can be dropped. Refer to the previous activity for guidance and an example of the partial F test. Copy/paste the p-value of the partial F test into Google Form after using anova().
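A sketch of the partial F test in #3 using anova() on a reduced and a full model. The data are again a made-up stand-in for Q2, and picking the predictors by matching coefficient names to column names only works here because every predictor is numeric.

```r
# Made-up stand-in for Q2 (the real dataframe is not shown).
set.seed(1)
Q2 <- data.frame(q1 = runif(100, 50, 100), q2 = runif(100, 50, 100),
                 q3 = runif(100, 50, 100), q4 = runif(100, 50, 100),
                 q5 = runif(100, 50, 100))
Q2$grade <- 10 + 0.4 * Q2$q3 + 0.5 * Q2$q5 + rnorm(100, sd = 5)
M <- lm(grade ~ ., data = Q2)

# P-values of the predictors (intercept row dropped), largest first.
pvals <- summary(M)$coefficients[-1, "Pr(>|t|)"]
drop.these <- names(sort(pvals, decreasing = TRUE))[1:2]

# Reduced model: same response, without the two least significant predictors.
keep <- setdiff(names(Q2), c("grade", drop.these))
M.reduced <- lm(reformulate(keep, response = "grade"), data = Q2)

# Partial F test: the simpler model goes first, the full model second.
partial.F <- anova(M.reduced, M)
partial.F
partial.F[2, "Pr(>F)"]
```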

#4: Using the regression model from #2, copy/paste into Google Form the lower bound of a 90% prediction interval when each predictor is equal to 80. Note: A confidence interval refers to the average 'grade' at those predictor values, while a prediction interval refers to one particular value of 'grade'. For guidance on creating the TO.PREDICT dataframe, refer to the in-class activity on the matter.
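A sketch of #4 with predict(), assuming the predictors are the made-up q1 to q5 columns used as a stand-in for Q2. The TO.PREDICT dataframe needs one column per predictor, with the same names as in the model.

```r
# Made-up stand-in for Q2 (the real dataframe is not shown).
set.seed(1)
Q2 <- data.frame(q1 = runif(100, 50, 100), q2 = runif(100, 50, 100),
                 q3 = runif(100, 50, 100), q4 = runif(100, 50, 100),
                 q5 = runif(100, 50, 100))
Q2$grade <- 10 + 0.4 * Q2$q3 + 0.5 * Q2$q5 + rnorm(100, sd = 5)
M <- lm(grade ~ ., data = Q2)

# One row, every predictor set to 80.
TO.PREDICT <- data.frame(q1 = 80, q2 = 80, q3 = 80, q4 = 80, q5 = 80)

# 90% prediction interval for an individual grade; 'lwr' is the lower bound.
PI <- predict(M, newdata = TO.PREDICT, interval = "prediction", level = 0.90)
PI[1, "lwr"]

# For comparison: 90% confidence interval for the average grade (narrower).
CI <- predict(M, newdata = TO.PREDICT, interval = "confidence", level = 0.90)
CI[1, "lwr"]
```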

#5: Fit a regression model using the data Q2 predicting 'grade' from q5 and q3, including the interaction. Copy/paste into Google Form the coefficient of the interaction term.
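A sketch of #5 on made-up stand-in data. In an R formula, q5 * q3 expands to q5 + q3 + q5:q3, so the interaction coefficient is the one named "q5:q3".

```r
# Made-up stand-in for Q2 (the real dataframe is not shown).
set.seed(1)
Q2 <- data.frame(q3 = runif(100, 50, 100), q5 = runif(100, 50, 100))
Q2$grade <- 10 + 0.4 * Q2$q3 + 0.5 * Q2$q5 + rnorm(100, sd = 5)

# Main effects of q5 and q3 plus their interaction.
M <- lm(grade ~ q5 * q3, data = Q2)
summary(M)

# Coefficient of the interaction term.
coef(M)["q5:q3"]
```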

#6: Use the dataframe Q6. Fit a model predicting 'Donation.Amount' from LIFETIME_MAX_GIFT_AMT and URBANICITY (no interaction). Copy/paste the p-value of the test for significance of the predictor URBANICITY. If it is <2.2e-16 enter 0. Note: The predictor URBANICITY is a categorical variable, so the test should be done using drop1(). Check the previous activity and make sure to specify the correct test when using drop1().
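A sketch of #6. The data and the URBANICITY level labels below are made up. For a linear regression, drop1() with test = "F" tests each predictor as a whole, which is what is needed for a categorical variable with several levels.

```r
# Made-up stand-in for Q6; the URBANICITY level labels are hypothetical.
set.seed(2)
Q6 <- data.frame(
  LIFETIME_MAX_GIFT_AMT = runif(300, 5, 100),
  URBANICITY = factor(sample(c("C", "R", "S", "T", "U"), 300, replace = TRUE))
)
Q6$Donation.Amount <- 5 + 0.3 * Q6$LIFETIME_MAX_GIFT_AMT +
  ifelse(Q6$URBANICITY == "U", 4, 0) + rnorm(300, sd = 6)

# No interaction: predictors joined with '+'.
M <- lm(Donation.Amount ~ LIFETIME_MAX_GIFT_AMT + URBANICITY, data = Q6)

# One F test per predictor; URBANICITY is tested across all its levels at once.
D <- drop1(M, test = "F")
D
D["URBANICITY", "Pr(>F)"]
```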

#7: Use the dataframe Q7. Fit a model predicting 'Donation.Amount' from RECENT_RESPONSE_PROP and OVERLAY_SOURCE (with interaction). For which level of OVERLAY_SOURCE is the relationship between 'Donation.Amount' and RECENT_RESPONSE_PROP the weakest? Answer B, M, N, or P on Google Form (regardless of whether the difference in strengths is statistically significant or not).
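A sketch of #7 on made-up data. It assumes "weakest" means the slope of RECENT_RESPONSE_PROP that is closest to zero. With an interaction, the slope for the reference level (B, the first alphabetically) is the RECENT_RESPONSE_PROP coefficient, and each other level's slope is that coefficient plus its interaction coefficient.

```r
# Made-up stand-in for Q7; the slopes per level are invented for illustration.
set.seed(7)
Q7 <- data.frame(
  RECENT_RESPONSE_PROP = runif(400),
  OVERLAY_SOURCE = factor(sample(c("B", "M", "N", "P"), 400, replace = TRUE))
)
true.slope <- c(B = 40, M = 5, N = 25, P = 60)
Q7$Donation.Amount <- 10 +
  unname(true.slope[as.character(Q7$OVERLAY_SOURCE)]) * Q7$RECENT_RESPONSE_PROP +
  rnorm(400, sd = 2)

M <- lm(Donation.Amount ~ RECENT_RESPONSE_PROP * OVERLAY_SOURCE, data = Q7)
b <- coef(M)

# Slope for each level: reference slope, plus the interaction adjustment.
lv <- levels(Q7$OVERLAY_SOURCE)
base <- b["RECENT_RESPONSE_PROP"]
adj <- b[paste0("RECENT_RESPONSE_PROP:OVERLAY_SOURCE", lv[-1])]
slopes <- setNames(c(base, base + adj), lv)
slopes

# Level whose slope is closest to zero.
names(which.min(abs(slopes)))
```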

#8: Use the Q8 dataframe. Fit a logistic regression model predicting 'Smoking' from all remaining variables in Q8 (no interactions). Copy/paste the p-value of SpouseWeight (even if this variable is not statistically significant). If it is <2.2e-16 enter 0.

#9: Report the misclassification rate of the model you fit in #8. Note: your answer should be a number between 0 and 100.
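A sketch of #8 and #9 together, on made-up data with only two predictors (the real Q8 may have more). It assumes Smoking is a Yes/No factor and classifies with the usual 0.5 probability cutoff, which is an assumption about how the course computes the misclassification rate.

```r
# Made-up stand-in for Q8 (the real dataframe is not shown).
set.seed(3)
Q8 <- data.frame(SpouseAge = round(runif(300, 20, 70)),
                 SpouseWeight = round(rnorm(300, 170, 25)))
eta <- -2 + 0.03 * Q8$SpouseAge + 0.003 * Q8$SpouseWeight
Q8$Smoking <- factor(ifelse(runif(300) < plogis(eta), "Yes", "No"))

# Logistic regression on all remaining variables, no interactions.
# glm() models the probability of the second factor level ("Yes").
M <- glm(Smoking ~ ., data = Q8, family = binomial)

# #8: p-value of SpouseWeight.
summary(M)$coefficients["SpouseWeight", "Pr(>|z|)"]

# #9: misclassification rate in percent, using a 0.5 cutoff.
prob.yes <- fitted(M)
predicted <- ifelse(prob.yes >= 0.5, "Yes", "No")
misclass <- 100 * mean(predicted != as.character(Q8$Smoking))
misclass
```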

#10: Use dataframe Q10. Fit a logistic regression model predicting Smoking from SpouseAge and Race with an interaction. Copy/paste the p-value of the interaction. If it is <2.2e-16 enter 0. Note: The predictor Race is a categorical variable so the test should be done by using drop1(). Check the previous activity and make sure to specify the correct test when using drop1().
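A sketch of #10 on made-up data (the Race labels are hypothetical). For a logistic regression, drop1() with test = "Chisq" gives a likelihood ratio test. Because drop1() respects the model hierarchy, only the interaction is listed when one is present.

```r
# Made-up stand-in for Q10; the Race level labels are hypothetical.
set.seed(4)
Q10 <- data.frame(SpouseAge = round(runif(400, 20, 70)),
                  Race = factor(sample(c("A", "B", "C"), 400, replace = TRUE)))
eta <- -1 + 0.02 * Q10$SpouseAge +
  ifelse(Q10$Race == "B", 0.03 * (Q10$SpouseAge - 45), 0)
Q10$Smoking <- factor(ifelse(runif(400) < plogis(eta), "Yes", "No"))

# 'SpouseAge * Race' includes both main effects and their interaction.
M <- glm(Smoking ~ SpouseAge * Race, data = Q10, family = binomial)

# Likelihood ratio test for dropping the interaction as a whole.
D <- drop1(M, test = "Chisq")
D
D["SpouseAge:Race", "Pr(>Chi)"]
```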

#11: Use the Q11 dataframe. You want a 'good' model predicting 'Wins' from all remaining variables in Q11 (without interactions). Perform the 'all possible' approach, which produces a list of models whose AICs are within 4 of the overall lowest AIC of all considered models. From this list of models, identify the predictors in the model with the FEWEST predictors (it could be the case that your list has only one model). Redo the 'all possible' approach considering just those predictors ALONG WITH all two-way interactions between them to produce yet another list of models (your list might only have one). Report the AIC of the model at the top of this list, i.e., the one with the LOWEST AIC. Note: do not add ANY extra arguments (e.g., nbest or nvmax) to any of the commands.
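A sketch of the logic in #11 only. It is a brute-force enumeration in base R, not the course's own 'all possible' command, so the AIC values can differ from what that command reports (tools define AIC up to different constants), and unlike some tools it does not force main effects to accompany an interaction. The all_possible() helper and the x1 to x4 columns are hypothetical.

```r
# Made-up stand-in for Q11 (the real dataframe is not shown).
set.seed(5)
Q11 <- data.frame(x1 = rnorm(80), x2 = rnorm(80), x3 = rnorm(80), x4 = rnorm(80))
Q11$Wins <- 40 + 6 * Q11$x1 + 4 * Q11$x2 + rnorm(80, sd = 3)

# Hypothetical helper: fit every subset of 'terms' and keep the models whose
# AIC is within 'window' of the lowest AIC, sorted from lowest AIC upward.
all_possible <- function(terms, response, data, window = 4) {
  subsets <- unlist(lapply(seq_along(terms),
                           function(k) combn(terms, k, simplify = FALSE)),
                    recursive = FALSE)
  rhs <- c("1", vapply(subsets, paste, character(1), collapse = " + "))
  aic <- vapply(rhs, function(r) {
    AIC(lm(as.formula(paste(response, "~", r)), data = data))
  }, numeric(1))
  out <- data.frame(model = rhs, k = c(0, lengths(subsets)),
                    AIC = unname(aic), stringsAsFactors = FALSE)
  out <- out[order(out$AIC), ]
  out[out$AIC <= min(out$AIC) + window, ]
}

# Round 1: main effects only; take the listed model with the fewest predictors.
round1 <- all_possible(setdiff(names(Q11), "Wins"), "Wins", Q11)
smallest <- round1[which.min(round1$k), "model"]
keep <- strsplit(smallest, " + ", fixed = TRUE)[[1]]

# Round 2: those predictors plus all two-way interactions between them.
inter <- if (length(keep) >= 2) {
  combn(keep, 2, FUN = paste, collapse = ":")
} else {
  character(0)
}
round2 <- all_possible(c(keep, inter), "Wins", Q11)
round2
round2$AIC[1]  # lowest AIC, top of the list
```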

#12: Use dataframe Q12. First set the seed number to be 320. Split the dataframe Q12 into a training sample (75%) and a holdout sample (25%). Build a predictive logistic regression model predicting 'Buy' from all remaining predictors in Q12 (no interactions), choosing as your final model the one suggested by the one standard deviation rule (make sure to use 320 for the seed number). Fit the model on the training sample, then report its misclassification rate on your holdout sample. The answer should be a number between 0 and 100. Note: Make sure to set the random number seed to 320 both when splitting the data into training and holdout samples and when building the predictive logistic regression model.
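A partial sketch for #12 covering only the 75/25 split and the holdout misclassification rate. The model selection by the one standard deviation rule is not shown, since it depends on the course's own model-building command; here the full model simply stands in for the final model. The data are made up, and the exact rows in each sample depend on how the class activity draws them, so follow that activity's split code to reproduce the graded answer.

```r
# Made-up stand-in for Q12 (the real dataframe is not shown).
set.seed(12)
Q12 <- data.frame(Income = rnorm(400, 60, 15), Visits = rpois(400, 3))
eta <- -3 + 0.03 * Q12$Income + 0.3 * Q12$Visits
Q12$Buy <- factor(ifelse(runif(400) < plogis(eta), "Yes", "No"))

# Seed 320, then a random 75% of the rows for training, the rest for holdout.
set.seed(320)
train.rows <- sample(nrow(Q12), size = floor(0.75 * nrow(Q12)))
TRAIN <- Q12[train.rows, ]
HOLDOUT <- Q12[-train.rows, ]

# Stand-in for the final model: all predictors, fit on the training sample only.
M <- glm(Buy ~ ., data = TRAIN, family = binomial)

# Misclassification rate on the holdout sample, in percent (0.5 cutoff assumed).
prob.yes <- predict(M, newdata = HOLDOUT, type = "response")
predicted <- ifelse(prob.yes >= 0.5, "Yes", "No")
holdout.misclass <- 100 * mean(predicted != as.character(HOLDOUT$Buy))
holdout.misclass
```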