Question:
Reconsider Problem 3.17. Partition the historical records into a training partition (60 percent of the 800 records) and a validation partition (the remaining 40 percent of the 800 records).
a. Using a maximum of seven splits as the stopping rule, generate the best pruned regression tree to predict the number of late payments based on the predictor variables of annual income and credit score.
b. Generate the lift chart when applied to the validation data.
c. Use the resulting regression tree to predict the number of late payments for each of the three applicants from Problem 3.17 (a, b, and c) under consideration.
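Parts a through c can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the textbook's spreadsheet procedure: the records are synthetic stand-ins for the Friendly Bank data, the helper names (`grow`, `best_pruned`, `lift_chart`, `predict`) are hypothetical, and "best pruned" is simplified to mean the number of splits (0 to 7) with the lowest squared error on the validation partition. The applicant shown is also hypothetical, since the three applicants are not listed here.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows):
    """Return (gain, feature, threshold) for the best split of a leaf, or None."""
    total, best = sse([y for _, y in rows]), None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold midway between neighbors
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            gain = total - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def make_leaf(rows):
    return {"rows": rows, "mean": sum(y for _, y in rows) / len(rows), "order": None}

def grow(rows, max_splits=7):
    """Best-first growth: always split the leaf with the largest SSE reduction."""
    root = make_leaf(rows)
    leaves = [root]
    for order in range(1, max_splits + 1):
        cands = [(best_split(n["rows"]), n) for n in leaves]
        cands = [(s, n) for s, n in cands if s is not None and s[0] > 0]
        if not cands:
            break
        (_, f, t), node = max(cands, key=lambda c: c[0][0])
        node.update(order=order, f=f, t=t,
                    left=make_leaf([r for r in node["rows"] if r[0][f] <= t]),
                    right=make_leaf([r for r in node["rows"] if r[0][f] > t]))
        leaves = [n for n in leaves if n is not node] + [node["left"], node["right"]]
    return root

def predict(node, x, splits):
    """Predict using only the first `splits` splits, i.e. a pruned tree."""
    while node["order"] is not None and node["order"] <= splits:
        node = node["left"] if x[node["f"]] <= node["t"] else node["right"]
    return node["mean"]

def best_pruned(root, valid, max_splits=7):
    """Smallest number of splits that minimizes validation squared error."""
    def val_sse(k):
        return sum((y - predict(root, x, k)) ** 2 for x, y in valid)
    return min(range(max_splits + 1), key=val_sse)

def lift_chart(root, valid, splits):
    """Points (n, cumulative actual, cumulative baseline), ranked by prediction."""
    ranked = sorted(valid, key=lambda r: predict(root, r[0], splits), reverse=True)
    avg = sum(y for _, y in valid) / len(valid)
    points, running = [], 0.0
    for i, (_, y) in enumerate(ranked, start=1):
        running += y
        points.append((i, running, i * avg))
    return points

# Synthetic stand-in for the 800 records (assumption, not the bank's data):
# ((income in $1,000s, credit score), number of late payments)
random.seed(1)
records = []
for _ in range(800):
    income, score = random.randrange(20, 151), random.randrange(300, 851, 10)
    late = max(0, round(9 - income / 30 - score / 120 + random.gauss(0, 1)))
    records.append(((income, score), late))

random.shuffle(records)
train, valid = records[:480], records[480:]  # 60 percent / 40 percent

tree = grow(train, max_splits=7)
k = best_pruned(tree, valid)
chart = lift_chart(tree, valid, k)
applicant = (55, 640)  # hypothetical applicant, not one from Problem 3.17
print("splits kept:", k)
print("predicted late payments:", round(predict(tree, applicant, k), 2))
```

Plotting the second and third values of each `chart` point against the first gives the lift curve and the average-based reference line. With the real data, the three applicants' (income, credit score) pairs would replace `applicant`.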
Data from Problem 3.17.
As first described in Problem 2.16, Friendly Bank is very active in making loans to deserving people in the local community. However, the bank needs to evaluate each loan carefully to make sure that the recipient is likely to repay it as scheduled. The bank therefore needs a prediction of whether repayment is likely and an estimate of the probability. It primarily uses the applicant's annual income and credit rating as the predictor variables for this prediction. The bank has compiled all of the historical records of substantial loans and their outcomes over recent years. This information is provided in the spreadsheet titled Friendly Bank Data, available at www.mhhe.com/Hillier7e. Only loans that have concluded (either paid off in full or ending in default) are included, resulting in 4,985 total records.
The bank currently is evaluating the three loan applications described below. Using all the data (unpartitioned) on the Clean Data worksheet tab, with the data rescaled using standardization, apply the KNN algorithm with k = 10 to classify each of the following applicants as either likely to default (defined as more than a 10 percent chance of default) or not likely to default (defined as a 10 percent chance of default or less). Also indicate the estimated probability of default for each.
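The KNN classification described in Problem 3.17 can be sketched as follows. This is an illustration only: the loan records and the applicant are synthetic, the function names are hypothetical, and it assumes z-score standardization with the sample standard deviation and Euclidean distance, which may differ from the settings of the spreadsheet tool used in the book.

```python
import math
import random

def standardize(rows):
    """Return z-scored rows plus the per-column (mean, std) that was used."""
    stats = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        s = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
        stats.append((m, s))
    scaled = [tuple((v - m) / s for v, (m, s) in zip(r, stats)) for r in rows]
    return scaled, stats

def knn_default_probability(train_x, train_y, stats, applicant, k=10):
    """Share of the k nearest historical loans (Euclidean distance) that defaulted."""
    z = tuple((v - m) / s for v, (m, s) in zip(applicant, stats))
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], z))[:k]
    return sum(y for _, y in nearest) / k

# Synthetic stand-in for the historical loans (assumption, not the bank's data):
# (annual income in $1,000s, credit score) with 1 = default, 0 = paid in full
random.seed(7)
raw_x, labels = [], []
for _ in range(500):
    income, score = random.randrange(20, 151), random.randrange(300, 851)
    p_default = min(0.9, max(0.01, 0.6 - income / 400 - (score - 300) / 1200))
    raw_x.append((income, score))
    labels.append(1 if random.random() < p_default else 0)

scaled_x, stats = standardize(raw_x)
applicant = (48, 610)  # hypothetical applicant, not one from the problem
prob = knn_default_probability(scaled_x, labels, stats, applicant, k=10)
label = "likely to default" if prob > 0.10 else "not likely to default"
print(f"estimated probability of default: {prob:.1f} ({label})")
```

With k = 10 the estimated probability is always a multiple of 0.1, so the "more than a 10 percent chance" cutoff amounts to at least two of the ten nearest neighbors having defaulted.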
Introduction to Management Science and Business Analytics: A Modeling and Case Studies Approach with Spreadsheets
ISBN: 9781260716290
7th Edition
Authors: Frederick S. Hillier, Mark S. Hillier