2. (30 marks) Banks often build learning models to predict the likelihood of customers defaulting on home loans, i.e. the probability that a customer cannot make loan repayments in time. If an applicant shows a high risk of defaulting, the bank can reject the home loan application upfront. These models are usually built based on:

- Customer profiles: gender, age, profession, family stage and education level.
- Credit history: number of credit card payments in arrears, credit movement, and credit limit.
- Financial situation: gross income, expenses and transactions, total taxable income, and salary.
- Asset information: house price, land area and future growth factors.

Suppose we have a set of 1442 past home loan applications and their corresponding information. You built an XGBoost classifier that correctly identifies 8 out of 15 true defaulters, and 1384 out of 1427 true non-defaulters.

(a) Show a complete confusion matrix. What are the sensitivity and specificity values? (4 marks)

(b) Suppose we incorporate the information that it is 5 times more costly to approve a loan that would potentially default than to deny a loan to a good customer. What are the possible ways that the sensitivity and specificity may be affected? (4 marks)

(c) How would you build an XGBoost model in this context? Include the following aspects in your brief description: (15 marks)

- preparation of the data attributes
- decisions regarding potential outliers, missing values and variable scaling
- fitting and regularisation of XGBoost
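As a rough check on the arithmetic behind parts (a) and (b), the counts given in the question can be turned into a confusion matrix with a few lines of Python. This is only a sketch under stated assumptions: "defaulter" is treated as the positive class, the cost units (1 for denying a good customer, 5 for approving a defaulter) are an assumed normalisation of the 5:1 ratio, and the threshold rule at the end holds only if the classifier outputs well-calibrated probabilities. All variable names are illustrative.

```python
from fractions import Fraction

# Counts stated in the question (assumption: positive class = defaulter).
true_defaulters = 15
true_non_defaulters = 1427
tp = 8       # defaulters correctly identified
tn = 1384    # non-defaulters correctly identified

# The two remaining cells follow by subtraction.
fn = true_defaulters - tp        # defaulters the model missed
fp = true_non_defaulters - tn    # good customers wrongly flagged

# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
sensitivity = Fraction(tp, tp + fn)
specificity = Fraction(tn, tn + fp)

# Assumed cost units reflecting the 5:1 ratio in part (b).
COST_FP = 1  # denying a loan to a good customer
COST_FN = 5  # approving a loan that later defaults
total_cost = COST_FP * fp + COST_FN * fn

# With calibrated probabilities, flagging an applicant as a defaulter when
# p >= COST_FP / (COST_FP + COST_FN) minimises the expected cost.
cost_threshold = Fraction(COST_FP, COST_FP + COST_FN)

print("                      predicted default   predicted non-default")
print(f"actual default        {tp:>17}   {fn:>21}")
print(f"actual non-default    {fp:>17}   {tn:>21}")
print(f"sensitivity = {sensitivity} ~ {float(sensitivity):.3f}")
print(f"specificity = {specificity} ~ {float(specificity):.3f}")
print(f"total misclassification cost = {total_cost}")
print(f"cost-based probability threshold = {cost_threshold}")
```

With the stated counts, the missing cells are 7 missed defaulters and 43 wrongly flagged good customers. A cost-based threshold below the default 0.5 means more applicants are flagged as defaulters, which would be expected to raise sensitivity and lower specificity.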