
Question


You have been tasked with performing an analysis on customers of a credit card company. Specifically, you will be developing a classification model to classify

You have been tasked with performing an analysis on customers of a credit card company. Specifically, you will be developing a classification model to classify whether or not specific customers will fail to pay their next credit card payment. You decide to approach this problem with a logistic regression classifier. The first 5 rows of our data are shown below.

education marriage age failed payment
28465 1 40 1
27622 1 2 23
28376 2 1 36 0
10917 1 54
27234 1 35 0

The numerical data in the education and marriage columns correspond to the following categories:

education: 1 - graduate school; 2 - university; 3 - high school; 4 - other
marriage: 1 - married; 2 - single; 3 - other

Our response variable, labeled as failed payment, can have values of 0 (makes their next payment) or 1 (fails to make their next payment). You use the logistic regression model $\hat{y} = P(Y = 1 \mid x) = \sigma(x^T \theta)$. Assume that the following value of $\theta$ minimizes un-regularized mean cross-entropy loss for this data set:

$\hat{\theta} = (-1.30, 0.08, -0.08, 0.001)$

Here, -1.30 is the intercept term, 0.08 corresponds to education, -0.08 corresponds to marriage status, and 0.001 corresponds to age.

(a) [2 Pts] Consider a customer who is 50 years old, married, and only has a high school education. Compute the chance that they fail to pay their next credit card payment. Give your answer as a probability in terms of $\sigma$.

(b) [2 Pts] This specific customer fortunately made their next payment on time! Compute the cross-entropy loss of the prediction in part (a). Leave your answer in terms of $\sigma$.

( ) [2 Pts] Suppose with the above threshold you achieve a training accuracy of 100%. Can you conclude your training data was linearly separable in the feature space? Answer yes or no, and explain in one sentence.

( ) [2 Pts] How does a one-unit increase in age impact the log-odds of making a failed payment? Give a precise, numerical answer, not just "increases" or "decreases."

( ) [2 Pts] To further your analysis, you also create a random forest classifier. To compare classifiers you generate a ROC curve for both models. Which of the two models would you choose to use, based on the ROC curve? Explain in one sentence. (Do not worry about the implementation details of how ROC curves are created for random forests.)

(d) [3 Pts] Let's consider all customers who are married and whose highest level of education is high school. What is the minimum age of such a customer, such that they are more likely to fail the next payment than make their next payment, under our logistic regression model?

( ) [...] we decide to [...] the education feature by [...] and age [...] data. What is the value of $\hat{\theta}$ that [...]

( ) Suppose you choose a threshold [...]. The decision boundary of the resulting classifier is of the form A · education + B · marriage + C · age + D = 0. What are the values of A, B, C, and D? [...] Show your work. If you don't believe it's possible to [...]

For your convenience, the value of $\hat{\theta}$ that minimized un-regularized mean cross-entropy loss on the original data was $\hat{\theta} = (-1.30, 0.08, -0.08, 0.001)$, where -1.30 is the intercept term, 0.08 corresponds to education, -0.08 corresponds to marriage, and 0.001 corresponds to age.

Note: This question is independent of [...]; do not assume that we achieved training accuracy of 100%.
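The arithmetic behind parts (a), (b), the log-odds part, and part (d) can be checked with a short script. This is a minimal sketch, not an official solution: it assumes the model is sigma(x^T theta) with the coefficient order (intercept, education, marriage, age) given in the question, and that "more likely to fail than pay" means a predicted probability above 0.5. The function names are illustrative.

```python
import math

# Coefficients as stated in the question: intercept, education, marriage, age.
THETA = (-1.30, 0.08, -0.08, 0.001)


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(education, marriage, age, theta=THETA):
    """Log-odds x^T theta for one customer (assumed model form)."""
    intercept, w_edu, w_mar, w_age = theta
    return intercept + w_edu * education + w_mar * marriage + w_age * age


def cross_entropy(y, p):
    """Cross-entropy loss of predicted probability p against label y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# Part (a): 50 years old, married (1), high school (3).
z = linear_score(education=3, marriage=1, age=50)
p_fail = sigmoid(z)

# Part (b): the customer paid on time, so the true label is y = 0.
loss = cross_entropy(0, p_fail)

# Log-odds part: change in log-odds for a one-year increase in age.
log_odds_delta = linear_score(3, 1, 51) - linear_score(3, 1, 50)

# Part (d): age at which the log-odds cross zero (probability 0.5)
# for a married customer with a high school education.
boundary_age = -linear_score(3, 1, 0) / THETA[3]

print(f"score z = {z:.4f}")
print(f"P(fail) = sigma(z) = {p_fail:.4f}")
print(f"cross-entropy loss = {loss:.4f}")
print(f"log-odds change per year of age = {log_odds_delta:.4f}")
print(f"age where P(fail) = 0.5: {boundary_age:.1f}")
```

Under these assumptions the score in part (a) is -1.30 + 0.08(3) - 0.08(1) + 0.001(50) = -1.09, so the probability is $\sigma(-1.09)$ and the loss in part (b) is $-\log(1 - \sigma(-1.09))$. Each additional year of age adds 0.001 to the log-odds, and the score reaches zero only at age 1140, so the predicted probability of failure exceeds 0.5 only for ages above that.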

