
Question


1. Load the LoanData.csv data set into R. It lists the outcomes of 5611 loans. The variables include loan status (current, late or in default), credit grade (from the best rating, AA, to the worst, HC for heavy risk), loan amount, loan age (in months), borrower's interest rate and debt-to-income ratio. Code loan status as a binary outcome (0 for current loans, 1 for late or default loans). Code debt-to-income ratio into three levels ('low' for a ratio below 10%, 'medium' for a ratio between 10% and 30%, 'high' for a ratio above 30%). [10 points]

2. Fit a logistic regression to the recoded data set. Use Credit.Grade, Amount, Age, Borrower.Rate and the recoded debt-to-income ratio as the explanatory variables. Copy the glm summary output from R and paste it below. [10 points]

3. Evaluate the in-sample fit of your logistic regression model using 0.5 as the cutoff probability. Display the confusion matrix below. [10 points]

4. The cutoff probability should be around 92.43% with symmetric costs of misclassification. Why? Display the confusion matrix using the updated cutoff probability below. What is the overall in-sample misclassification rate in this case? [10 points]

5. Randomly select 4611 of the 5611 loans as your training set. Fit the logistic model on the training set and apply it to the remaining 1000 loans, which form your test set. Choose the appropriate cutoff probability assuming symmetric costs of misclassification [see step 4]. What is your out-of-sample prediction accuracy rate based on the test set's confusion matrix? [10 points]

6. Sort the 1000 loans in your test set by predicted default probability in decreasing order. Use a FOR loop to calculate the lift. Then plot the lift chart for your test set. [10 points]

7. Calculate the out-of-sample prediction accuracy rate for 20 random test samples (sample size = 1000). Display the 20 accuracy rates and their mean below. [10 points]

8. Briefly explain why the Naive Bayes classifier is considered a naive implementation of Bayes' theorem. [10 points]

9. Load the packages textir and e1071 into R. Perform a Naive Bayes analysis using the political sentiment data set (as we did in lecture 10). Use 300 randomly selected observations as the training set and the remaining 100 as your test set. Display the test set's confusion matrix below. [10 points]

10. Calculate the out-of-sample prediction accuracy rates for ten random test samples (sample size = 100). Display the 10 accuracy rates and their mean below. [10 points]
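
The recoding in step 1, the cutoff-based confusion matrix in steps 3 to 5, and the FOR-loop lift in step 6 can be sketched on toy data as follows. This is only an illustration of the logic, written in plain Python rather than R so that it is self-contained; it is not the assignment's solution. All function names, the status strings, and the toy numbers are hypothetical. Two further assumptions: ratios are stored as fractions (0.25 means 25%) with the 10% and 30% boundaries falling in 'medium', and "lift" is taken to be the cumulative hit rate among the top-k loans divided by the overall hit rate, which is one common definition and may differ from the one used in the lectures.

```python
def recode_status(status):
    """Return 0 for current loans, 1 for late or default loans.
    Assumes the status is a string and current loans are labelled "Current"."""
    return 0 if status.strip().lower() == "current" else 1


def recode_dti(ratio):
    """Bucket a debt-to-income ratio given as a fraction (0.25 = 25%).
    Assumption: the boundary values 10% and 30% count as 'medium'."""
    if ratio < 0.10:
        return "low"
    if ratio <= 0.30:
        return "medium"
    return "high"


def confusion_matrix(actual, predicted_prob, cutoff):
    """Count outcomes keyed by (actual, predicted); predict 1 when prob > cutoff."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for y, p in zip(actual, predicted_prob):
        counts[(y, 1 if p > cutoff else 0)] += 1
    return counts


def misclassification_rate(counts):
    """Share of cases where the predicted class differs from the actual class."""
    total = sum(counts.values())
    return (counts[(0, 1)] + counts[(1, 0)]) / total


def cumulative_lift(actual, predicted_prob):
    """Sort cases by predicted probability (descending), then loop over them,
    accumulating lift = (hit rate in top k) / (overall hit rate).
    Assumes at least one actual positive, otherwise the base rate is zero."""
    order = sorted(range(len(actual)), key=lambda i: predicted_prob[i], reverse=True)
    base_rate = sum(actual) / len(actual)
    lift, hits = [], 0
    for k, i in enumerate(order, start=1):
        hits += actual[i]
        lift.append((hits / k) / base_rate)
    return lift


# Toy data (made up): five loans with actual outcomes and predicted probabilities.
actual = [1, 0, 1, 0, 0]
prob = [0.9, 0.8, 0.6, 0.3, 0.1]

cm = confusion_matrix(actual, prob, cutoff=0.5)
rate = misclassification_rate(cm)
lift = cumulative_lift(actual, prob)
```

Rerunning `confusion_matrix` with a different `cutoff` shows how the trade-off between the two error types shifts; the accuracy rate asked for in steps 5 and 7 is one minus the misclassification rate.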
