Question
Using the bankloan_cs.sav data set, do the following.
1. On an 80% training set, build a regression model for credit to debt ratio. Transform the dependent variable and the ones below as needed so that the residuals are as good as you can get them. Build a model on the training set that includes the following independent variables.
a. Age and the square of Age (original variable and a polynomial in your text)
b. The four quantiles (25%, 50%, 75%, and 100%) of time at employment, formulated as dichotomous indicators (step function in your text)
c. Income and the square of income when education level is 1, assigned to a single variable (piecewise polynomial)
d. Default and default times age (dichotomous and interaction)
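The split and the derived variables in (a) through (d) can be sketched as follows. This is only one possible reading of the task, not a definitive setup: the arrays are synthetic stand-ins for the real file, the names `age`, `employ`, `income`, `ed`, and `default` are hypothetical and may not match the columns in bankloan_cs.sav, and item (c) is interpreted here as income squared multiplied by an indicator for education level 1. The quartile cut points are taken from the training rows only, and one of the four indicators is dropped because all four together are collinear with an intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data; all names below are hypothetical, not the real columns.
age = rng.integers(20, 65, size=n).astype(float)
employ = rng.integers(0, 30, size=n).astype(float)   # years at current employer
income = rng.gamma(2.0, 25.0, size=n)
ed = rng.integers(1, 6, size=n)                      # education level 1..5
default = rng.integers(0, 2, size=n).astype(float)   # 0/1 indicator

# 80/20 split on shuffled row indices.
idx = rng.permutation(n)
n_train = int(0.8 * n)
train, test = idx[:n_train], idx[n_train:]

# (b) Quartile cut points estimated on the training rows only.
cuts = np.quantile(employ[train], [0.25, 0.50, 0.75])
bin_id = np.searchsorted(cuts, employ, side="left")  # 0..3, upper edge inclusive
employ_q = np.eye(4)[bin_id]                         # four 0/1 indicator columns

# Design matrix (no intercept column; add one when fitting).
X = np.column_stack([
    age, age ** 2,                # (a) age and its square
    employ_q[:, 1:],              # (b) step function, first quartile as reference
    income,                       # (c) income ...
    (income ** 2) * (ed == 1),    # ... and income squared where ed == 1 (assumed reading)
    default, default * age,       # (d) indicator and interaction with age
])

X_train, X_test = X[train], X[test]
```

With real data, the same columns would be built from the loaded file, and the dependent variable would be transformed (for example with a log) before fitting if the residual plots call for it.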
2. Interpret the coefficient and ANOVA table from the regression model. Be sure to state the overall significance of the model. Determine which variables should be retained (if any) and which should be removed. Investigate all assumptions of regression.
3. Without rebuilding the model, predict credit to debt ratio on the test set. What is the MSE? What is the R^2? What is the MAE? What is the ME?
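The four test-set metrics can be computed from the actual and predicted values alone. The sketch below assumes ME means the mean signed error with residuals defined as actual minus predicted (some texts use the opposite sign), and it computes R^2 as 1 - SS_res / SS_tot using the mean of the test-set responses. The function name `regression_metrics` and the small input vectors are illustrative only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, ME, and R^2 for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred                    # actual minus predicted (assumed sign)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MSE": ss_res / len(y_true),           # mean squared error
        "MAE": float(np.mean(np.abs(resid))),  # mean absolute error
        "ME": float(np.mean(resid)),           # mean (signed) error, shows bias
        "R2": 1.0 - ss_res / ss_tot,           # can be negative on a test set
    }

# Toy values, not results from the bank loan data.
m = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 10.0])
print(m)
```

If the model was fitted on a transformed dependent variable, the metrics differ depending on whether they are computed on the transformed scale or after back-transforming the predictions, so it helps to state which scale is reported and to use the same one when comparing the two models in part 4.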
4. Finally, build your own model on the training set, predict the test set, and assess its metrics in comparison with the first model. Which performed better and why? Which model would be better for explaining? Which would be better for predicting? Why?