
Question

1. Personal Loan Acceptance. Universal Bank is a relatively young bank growing rapidly in terms of overall customer acquisition. Universal Bank wants to convert its liability customers (depositors) into personal loan customers (while retaining them as depositors). A campaign that the bank ran last year for liability customers showed a healthy conversion rate of over 9% success. This has encouraged the retail marketing department to devise smarter campaigns with better target marketing. The goal of our analysis is to model the previous campaign's customer behavior to analyze what combination of factors makes a customer more likely to take out a personal loan.

The file UniversalBank.xls contains data on 5,000 customers. The data include demographic information (age, income, etc.), the customer's relationship with the bank (mortgage, securities account, etc.), and the customer's response to the last personal loan campaign (variable = Personal Loan). Among the 5,000 customers, only 480 (9.6%) accepted the personal loan offer in the last campaign (textbook reference: 7.1).

Partition the data into training (60%) and validation (40%) sets.

a. Perform a k-NN classification with all input variables except ID and ZIP CODE using k = 1. (Remember to transform categorical variables into binary dummy variables.) Specify the success class as "1" (loan accepted), and use the default cutoff value of 0.5. How would the following new customer be classified using your model: Age = 40, Experience = 10, Income = 84, Family = 2, CCAvg = 2, Education_1 = 0, Education_2 = 1, Education_3 = 0, Mortgage = 0, Securities Account = 0, CD Account = 0, Online = 1, and Credit Card = 1?

b. What is the choice of k that balances between overfitting and ignoring the predictor information? (Hint: Run k-NN for k values 1 to 10.)

c. Using the Confusion Matrix for the validation data in Part b, how many customers were classified correctly? How many customers were classified incorrectly?

d. Classify the new customer using the best k.

e. Repartition the data, this time into training, validation, and test sets (50% : 30% : 20%). Apply the k-NN method with the k chosen above. Compare the Confusion Matrix of the test set with that of the training and validation sets. Comment on the differences and their reason. What is your assessment of the performance of this model?
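The workflow the question describes (dummy-code the categorical predictor, partition, classify by nearest neighbours with a 0.5 cutoff, then compare k = 1 to 10 on the validation set) might be sketched in plain Python as below. This is only an illustration under stated assumptions: the records come from a made-up generator, not from UniversalBank.xls, so the numbers it prints say nothing about the real data. The helper names (`education_dummies`, `standardize`, `knn_predict`, `confusion_matrix`, `sweep_k`, `make_toy_customers`) are hypothetical, predictors are z-scored using training-set statistics, and a neighbour share exactly equal to the cutoff is assigned to class 1.

```python
import math
import random
from collections import Counter


def education_dummies(level):
    """Turn Education (1, 2 or 3) into Education_1, Education_2, Education_3."""
    return [1 if level == j else 0 for j in (1, 2, 3)]


def standardize(train_rows, rows):
    """Z-score each column of `rows` using means and std devs of `train_rows`."""
    cols = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def knn_predict(train_x, train_y, query, k, cutoff=0.5):
    """Return 1 if the share of 1s among the k nearest neighbours is >= cutoff."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    share = sum(train_y[i] for i in order[:k]) / k
    return 1 if share >= cutoff else 0  # assumption: ties go to class 1


def confusion_matrix(actual, predicted):
    """Counts keyed by (actual, predicted) for the classes 0 and 1."""
    counts = Counter(zip(actual, predicted))
    return {(a, p): counts[(a, p)] for a in (0, 1) for p in (0, 1)}


def sweep_k(train_x, train_y, valid_x, valid_y, ks=range(1, 11)):
    """Validation accuracy for each candidate k."""
    results = {}
    for k in ks:
        preds = [knn_predict(train_x, train_y, q, k) for q in valid_x]
        cm = confusion_matrix(valid_y, preds)
        results[k] = (cm[(0, 0)] + cm[(1, 1)]) / len(valid_y)
    return results


def make_toy_customers(n, seed=0):
    """Made-up records (NOT UniversalBank.xls) with an invented labelling rule."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        income = rng.uniform(10, 200)
        ccavg = rng.uniform(0, 8)
        edu = rng.choice((1, 2, 3))
        xs.append([income, ccavg] + education_dummies(edu))
        ys.append(1 if income > 120 and edu >= 2 else 0)
    return xs, ys


# The toy records are already in random order, so slicing gives a 60/40 split.
xs, ys = make_toy_customers(200)
cut = int(0.6 * len(xs))
train_x, valid_x = xs[:cut], xs[cut:]
train_y, valid_y = ys[:cut], ys[cut:]

# Scale both partitions with the training-set means and standard deviations.
train_z = standardize(train_x, train_x)
valid_z = standardize(train_x, valid_x)

accuracy_by_k = sweep_k(train_z, train_y, valid_z, valid_y)
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
print("validation accuracy by k:", accuracy_by_k)
print("k with the highest validation accuracy:", best_k)
```

To classify the new customer from part a, the same steps would apply: build one row with the predictors in the training column order, pass it through `standardize` together with the training rows, and call `knn_predict` with k = 1 (part a) or the chosen k (part d). For part e, the only change is slicing the shuffled records three ways (50%, 30%, 20%) and computing `confusion_matrix` on each partition.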

