Question
I. Follow the instructions in the videos on LPM, logistic, and random forest models using the bank loan data. Write the R code and submit it as a .txt file to D2L. Do not copy and paste; write your own code for practice.

II. Fannie Mae Mortgage performance data

Let's consider the Fannie Mae Mortgage performance data. The origination and performance data are combined from Fannie Mae data. The data are significantly reduced and changed to a new format for use in a business analytics class. If anyone wants to access the full data, visit the http://www.fanniemae.com/portal/index.html site.

Variables Included

Variable      Description
orig_rt       ORIGINAL INTEREST RATE
orig_trm      ORIGINAL LOAN TERM
orig_amt      ORIGINAL LOAN AMOUNT
oltv          ORIGINAL LOAN-TO-VALUE (LTV)
ocltv         ORIGINAL COMBINED LOAN-TO-VALUE (CLTV)
num_bo        NUMBER OF BORROWERS
dti           ORIGINAL DEBT TO INCOME RATIO
cscore_b      BORROWER CREDIT SCORE AT ORIGINATION
num_ut        NUMBER OF UNITS
orgyear       ORIGINATION YEAR
delinq        HIGHER THAN 30 DAYS DELINQUENCY
fstimebuyer   FIRST TIME HOME BUYER INDICATOR
occ_stat      OCCUPANCY TYPE (P = Principal, S = Second, I = Investor)
purpose       P = Purchase, C = Cash-out Refinance, R = No Cash-out Refinance
relo_flg      RELOCATION MORTGAGE INDICATOR
state         LOAN ORIGINATED IN

R code in bigblue:

    # Adding some functions to use for analysis from server
    source("/var/www/html/jlee141/econdata/R/func_lib.R")
    # Read Data from bigblue server
    gse ...

Suppose you want to find the best-performing machine-learning algorithm to predict the delinquency of the mortgage loans bought by Fannie Mae. If a mortgage is not paid on time, it goes into delinquency. The data include a "delinq" column that is 1 if the mortgage payment was not made for more than 30 days and 0 otherwise. Use the train data set to estimate the best models, and compare their performance on the test data for the following models.
1) Linear Probability Model (minimum 2 models: your own model, stepwise)
2) Logistic Model (minimum 2 models: your own model, stepwise)
3) One best Random Forest model (the performance may vary with the mtry and ntree options; find the best-performing one). You need to include the following code prior to the random forest estimation:

    library(randomForest)
    train$delinq ...

b. Error Rate
c. True Positive Rate (TPR)
d. False Positive Rate (FPR)

Submit your R code as a text file with your answers in comments. The grade will depend on your R code and the prediction performance of your best model. The best-performing model will be recognized as the best ML model of the week.
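The comparison asked for above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the course's solution: the real `gse` data are not loaded here, so the block simulates a small stand-in data set; the column spellings `cscore_b`, `dti`, `oltv`, and `delinq` are assumed; the helper `class_metrics`, the 70/30 split, the 0.5 cutoff, and the `mtry`/`ntree` values are all hypothetical choices. The random forest part only runs if the `randomForest` package is installed.

```r
# Hypothetical stand-in data (NOT the Fannie Mae data).
set.seed(1)
n <- 2000
toy <- data.frame(
  cscore_b = round(rnorm(n, 720, 50)),  # credit score
  dti      = runif(n, 10, 50),          # debt-to-income ratio
  oltv     = runif(n, 40, 97)           # loan-to-value
)
# Simulated delinquency: more likely with low score, high DTI, high LTV.
p <- plogis(-1 - 0.02 * (toy$cscore_b - 720) +
              0.04 * (toy$dti - 30) + 0.02 * (toy$oltv - 70))
toy$delinq <- rbinom(n, 1, p)

# Assumed 70/30 train/test split.
idx   <- sample(n, size = 0.7 * n)
train <- toy[idx, ]
test  <- toy[-idx, ]

# Error rate, TPR, and FPR from predicted probabilities at a cutoff.
class_metrics <- function(actual, prob, cutoff = 0.5) {
  pred <- as.integer(prob > cutoff)
  tp <- sum(pred == 1 & actual == 1)
  fp <- sum(pred == 1 & actual == 0)
  tn <- sum(pred == 0 & actual == 0)
  fn <- sum(pred == 0 & actual == 1)
  c(error_rate = (fp + fn) / length(actual),
    TPR = tp / (tp + fn),
    FPR = fp / (fp + tn))
}

# 1) Linear probability models: hand-picked and stepwise.
lpm      <- lm(delinq ~ cscore_b + dti, data = train)
lpm_step <- step(lm(delinq ~ ., data = train), trace = 0)

# 2) Logistic models: hand-picked and stepwise.
logit      <- glm(delinq ~ cscore_b + dti, data = train, family = binomial)
logit_step <- step(glm(delinq ~ ., data = train, family = binomial), trace = 0)

results <- rbind(
  LPM        = class_metrics(test$delinq, predict(lpm, newdata = test)),
  LPM_step   = class_metrics(test$delinq, predict(lpm_step, newdata = test)),
  Logit      = class_metrics(test$delinq,
                             predict(logit, newdata = test, type = "response")),
  Logit_step = class_metrics(test$delinq,
                             predict(logit_step, newdata = test, type = "response"))
)

# 3) Random forest, only if the package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  # Classification needs a factor response, so convert it in a copy.
  train_rf <- transform(train, delinq = as.factor(delinq))
  rf <- randomForest::randomForest(delinq ~ ., data = train_rf,
                                   mtry = 2, ntree = 200)
  rf_prob <- predict(rf, newdata = test, type = "prob")[, "1"]
  results <- rbind(results, RF = class_metrics(test$delinq, rf_prob))
}

print(round(results, 3))
```

With the real data, the same `class_metrics` helper could be applied to each fitted model's test-set predictions; trying several `mtry` and `ntree` values in a loop and keeping the lowest test error rate would be one way to pick the random forest.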