Question

Step 1: Read in the Data
Read the data into R
List the structure of the data (str)
Execute a summary of the data
Print the first six records
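
Step 1 might look like the minimal sketch below. The file name `loan_data.csv` and the `LOAN` column are hypothetical placeholders, since the assignment text does not name the data file or its predictors; the synthetic fallback exists only so the sketch runs without the real file.

```r
# Hypothetical file name; substitute the CSV supplied with the assignment.
csv_path <- "loan_data.csv"

if (file.exists(csv_path)) {
  df <- read.csv(csv_path)
} else {
  # Synthetic stand-in data (not the real data set); LOAN is a made-up column.
  set.seed(1)
  n <- 200
  df <- data.frame(
    TARGET_BAD_FLAG = rbinom(n, 1, 0.2),
    LOAN = round(runif(n, 1000, 50000))
  )
  df$TARGET_LOSS_AMT <- ifelse(df$TARGET_BAD_FLAG == 1,
                               round(df$LOAN * runif(n, 0.1, 0.9)), 0)
}

str(df)      # structure: column names, types, and leading values
summary(df)  # per-column summary statistics
head(df)     # first six records (head() defaults to n = 6)
```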
Step 2: Classification Decision Tree
Using the code discussed in the lecture, split the data into training and testing data sets.
Use the rpart library to predict the variable TARGET_BAD_FLAG
Develop two decision trees on the training data, one using Gini and the other using Entropy, and evaluate both on the testing data.
All other parameters such as tree depth are up to you.
Do not use TARGET_LOSS_AMT to predict TARGET_BAD_FLAG.
Plot both decision trees
List the important variables for both trees
Using the training data set, create a ROC curve for both trees
Using the testing data set, create a ROC curve for both trees
Write a brief summary of the decision trees discussing whether the trees are optimal, overfit, or underfit.
Rerun with different training and testing data at least three times.
Determine which of the two models performed better and explain why.
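
One possible shape for Step 2 is sketched below. It is not the lecture code, which is not reproduced here. The data are synthetic, the predictor names `X1` and `X2` are hypothetical, and the 70/30 split and `maxdepth = 5` are arbitrary choices. The ROC points and AUC are computed in base R to avoid extra packages; a package such as ROCR could be used instead. In rpart, the entropy criterion is requested with `split = "information"`.

```r
library(rpart)

# Synthetic stand-in data; X1 and X2 are hypothetical predictor names.
set.seed(1)
n <- 1000
df <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
df$TARGET_BAD_FLAG <- rbinom(n, 1, plogis(-1 + 1.5 * df$X1))
df$TARGET_LOSS_AMT <- ifelse(df$TARGET_BAD_FLAG == 1, round(1000 * exp(df$X2)), 0)

# ROC points evaluated at each distinct predicted probability.
roc_points <- function(p, y) {
  th <- sort(unique(p), decreasing = TRUE)
  data.frame(fpr = c(0, sapply(th, function(t) mean(p[y == 0] >= t))),
             tpr = c(0, sapply(th, function(t) mean(p[y == 1] >= t))))
}

# Area under the ROC curve by the trapezoid rule.
auc <- function(r) sum(diff(r$fpr) * (head(r$tpr, -1) + tail(r$tpr, -1)) / 2)

fit_trees <- function(seed, train_frac = 0.7) {
  set.seed(seed)
  in_train <- runif(nrow(df)) < train_frac
  # Drop TARGET_LOSS_AMT so it cannot be used to predict TARGET_BAD_FLAG.
  keep  <- setdiff(names(df), "TARGET_LOSS_AMT")
  train <- df[in_train, keep]
  test  <- df[!in_train, keep]
  ctrl  <- rpart.control(maxdepth = 5)  # arbitrary depth
  fits <- list(
    gini    = rpart(TARGET_BAD_FLAG ~ ., data = train, method = "class",
                    parms = list(split = "gini"), control = ctrl),
    entropy = rpart(TARGET_BAD_FLAG ~ ., data = train, method = "class",
                    parms = list(split = "information"), control = ctrl)
  )
  list(fits = fits, train = train, test = test)
}

for (seed in 1:3) {  # three different training/testing splits
  res <- fit_trees(seed)
  for (name in names(res$fits)) {
    f <- res$fits[[name]]
    roc_tr <- roc_points(predict(f, res$train)[, "1"], res$train$TARGET_BAD_FLAG)
    roc_te <- roc_points(predict(f, res$test)[, "1"],  res$test$TARGET_BAD_FLAG)
    cat(sprintf("seed %d %-7s train AUC %.3f  test AUC %.3f\n",
                seed, name, auc(roc_tr), auc(roc_te)))
  }
}

# Inspect the trees from the last split: plot, importance, and a test ROC curve.
for (name in names(res$fits)) {
  f <- res$fits[[name]]
  if (nrow(f$frame) > 1) {  # plot() fails on a root-only tree
    plot(f, main = name); text(f)
  }
  print(f$variable.importance)
  r <- roc_points(predict(f, res$test)[, "1"], res$test$TARGET_BAD_FLAG)
  plot(r$fpr, r$tpr, type = "l", xlab = "FPR", ylab = "TPR", main = name)
}
```

A training AUC well above the testing AUC suggests overfitting, while two similar but low values suggest underfitting.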
Step 3: Regression Decision Tree
Using the code discussed in the lecture, split the data into training and testing data sets.
Use the rpart library to predict the variable TARGET_LOSS_AMT
Do not use TARGET_BAD_FLAG to predict TARGET_LOSS_AMT.
Develop two decision trees, one using anova and the other using poisson
All other parameters such as tree depth are up to you.
Plot both decision trees
List the important variables for both trees
Using the training data set, calculate the Root Mean Square Error (RMSE) for both trees
Using the testing data set, calculate the Root Mean Square Error (RMSE) for both trees
Write a brief summary of the decision trees discussing whether the trees are optimal, overfit, or underfit.
Rerun with different training and testing data at least three times.
Determine which of the two models performed better and explain why.
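
Step 3 could be sketched as follows, again on synthetic data with hypothetical predictors `X1` and `X2`. The sketch assumes TARGET_LOSS_AMT is a non-negative amount recorded as 0 for loans that did not default, which the `poisson` method needs. The split fraction and tree depth are arbitrary.

```r
library(rpart)

# Synthetic stand-in data; X1 and X2 are hypothetical predictor names.
set.seed(1)
n <- 1000
df <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
df$TARGET_BAD_FLAG <- rbinom(n, 1, plogis(-1 + 1.5 * df$X1))
df$TARGET_LOSS_AMT <- ifelse(df$TARGET_BAD_FLAG == 1, round(1000 * exp(df$X2)), 0)

# Root Mean Square Error between actual and predicted values.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

compare_once <- function(seed, train_frac = 0.7) {
  set.seed(seed)
  in_train <- runif(nrow(df)) < train_frac
  # Drop TARGET_BAD_FLAG so it cannot be used to predict TARGET_LOSS_AMT.
  keep  <- setdiff(names(df), "TARGET_BAD_FLAG")
  train <- df[in_train, keep]
  test  <- df[!in_train, keep]
  ctrl  <- rpart.control(maxdepth = 5)  # arbitrary depth
  fits <- list(
    anova   = rpart(TARGET_LOSS_AMT ~ ., data = train, method = "anova",
                    control = ctrl),
    poisson = rpart(TARGET_LOSS_AMT ~ ., data = train, method = "poisson",
                    control = ctrl)
  )
  # Rows: training and testing RMSE; columns: one per tree.
  scores <- sapply(fits, function(f) c(
    train_rmse = rmse(train$TARGET_LOSS_AMT, predict(f, train)),
    test_rmse  = rmse(test$TARGET_LOSS_AMT,  predict(f, test))
  ))
  list(fits = fits, rmse = scores)
}

for (seed in 1:3) {  # three different training/testing splits
  out <- compare_once(seed)
  print(round(out$rmse, 1))
}

# Important variables for the trees from the last split.
print(out$fits$anova$variable.importance)
print(out$fits$poisson$variable.importance)
```

Comparing the two rows of each printed matrix shows how much each tree degrades on unseen data.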
Step 4: Probability / Severity Model Decision Tree (Push Yourself!)
Using the code discussed in the lecture, split the data into training and testing data sets.
Use the rpart library to predict the variable TARGET_BAD_FLAG
Use the rpart library to predict the variable TARGET_LOSS_AMT using only records where TARGET_BAD_FLAG is 1.
Plot both decision trees
List the important variables for both trees
Using your models, predict the probability of default and the loss given default.
Multiply the two values together for each record.
Calculate the RMSE value for the Probability / Severity model.
Rerun at least three times to be assured that the model is optimal and not overfit or underfit.
Comment on how this model compares to using the model from Step 3. Which one would you recommend using?
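
A rough sketch of the probability / severity approach is shown below, under the same assumptions as before: synthetic data, hypothetical predictors `X1` and `X2`, and an arbitrary 70/30 split. The severity tree uses `method = "anova"` here, which is one option rather than a requirement of the assignment. A direct anova tree, as in Step 3, is fitted alongside for comparison.

```r
library(rpart)

# Synthetic stand-in data; X1 and X2 are hypothetical predictor names.
set.seed(1)
n <- 1000
df <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
df$TARGET_BAD_FLAG <- rbinom(n, 1, plogis(-1 + 1.5 * df$X1))
df$TARGET_LOSS_AMT <- ifelse(df$TARGET_BAD_FLAG == 1, round(1000 * exp(df$X2)), 0)

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

prob_severity_once <- function(seed, train_frac = 0.7) {
  set.seed(seed)
  in_train <- runif(nrow(df)) < train_frac
  train <- df[in_train, ]
  test  <- df[!in_train, ]
  # Neither target may be used as a predictor for the other.
  preds <- setdiff(names(df), c("TARGET_BAD_FLAG", "TARGET_LOSS_AMT"))

  # Probability of default, fitted on all training records.
  p_fit <- rpart(reformulate(preds, "TARGET_BAD_FLAG"),
                 data = train, method = "class")
  # Loss given default, fitted only where TARGET_BAD_FLAG is 1.
  s_fit <- rpart(reformulate(preds, "TARGET_LOSS_AMT"),
                 data = train[train$TARGET_BAD_FLAG == 1, ], method = "anova")
  # Direct regression tree on all records, as in Step 3, for comparison.
  d_fit <- rpart(reformulate(preds, "TARGET_LOSS_AMT"),
                 data = train, method = "anova")

  p_default <- predict(p_fit, test)[, "1"]  # P(default) per record
  severity  <- predict(s_fit, test)         # predicted loss if default occurs
  expected  <- p_default * severity         # expected loss per record

  list(fits = list(probability = p_fit, severity = s_fit),
       rmse = c(prob_severity = rmse(test$TARGET_LOSS_AMT, expected),
                direct_anova  = rmse(test$TARGET_LOSS_AMT, predict(d_fit, test))))
}

for (seed in 1:3) {  # three different training/testing splits
  out <- prob_severity_once(seed)
  print(round(out$rmse, 1))
}

# Important variables for the two component trees from the last split.
print(out$fits$probability$variable.importance)
print(out$fits$severity$variable.importance)
```

If the testing RMSE values are stable across seeds, the comparison with the Step 3 model is less likely to be an artifact of a single split.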
