Question
R programming:
It is important to have experience implementing one of the most common applications of regression currently used in business, finance, and healthcare. Questions such as whether a loan should be approved, whether a driver is entitled to a discount, or whether a patient will survive are all answered with a form of logistic regression (i.e., with a Yes/No answer).
Using a dataset representing applications for a bank loan, the task will be to build a logistic regression model that can predict whether or not a loan will be approved.
Useful R functions for this work are:
- Data exploration: is.na(), summary()
- Split data into train/test: sample()
- Build the model: glm(), summary()
- Model performance evaluation: predict()
- Model validation: library(gains), gains(), plot(), lines(), dim(), library(car), vif(), glm(), summary(), predict(), ifelse()
- Validate prediction: table(), mean()
- Results interpretation: library(ROCR), predict(), prediction(), performance()
For this activity, perform the following:
Load "application_record.csv". Download the dataset from https://www.kaggle.com/datasets/rikdifos/credit-card-approval-prediction and use the file "application_record.csv".
- Display representative portions of the data.
- Check for missing values and clean the data.
- Check for outliers and decide if and how to process them.
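The exploration and cleaning steps above can be sketched as follows. This is only an illustration: the data frame is simulated so the code runs without the download, and the column names (`income`, `age`, `children`, `occupation`) are hypothetical placeholders, not the actual columns of "application_record.csv". Capping outliers with the 1.5 × IQR rule is one possible choice among several.

```r
# Simulated stand-in for the real file. With the download in place, use:
#   apps <- read.csv("application_record.csv")
# All column names below are hypothetical.
set.seed(1)
n <- 200
apps <- data.frame(
  income     = round(rlnorm(n, meanlog = 11, sdlog = 0.5)),
  age        = sample(21:65, n, replace = TRUE),
  children   = rpois(n, 0.5),
  occupation = sample(c("Laborers", "Managers", NA), n, replace = TRUE),
  stringsAsFactors = FALSE
)

head(apps)      # representative rows
summary(apps)   # ranges and quartiles per column

# Count missing values per column
na_counts <- colSums(is.na(apps))
print(na_counts)

# One simple cleaning choice: label missing occupations explicitly
apps$occupation[is.na(apps$occupation)] <- "Unknown"

# Flag high-income outliers with the 1.5 * IQR rule, then cap them
# instead of dropping the rows
q <- quantile(apps$income, c(0.25, 0.75))
upper <- unname(q[2] + 1.5 * IQR(apps$income))
n_outliers <- sum(apps$income > upper)
apps$income_capped <- pmin(apps$income, upper)
```

Whether to cap, transform, or remove outliers depends on the variable; the decision should be stated and justified in the write-up.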
Formally state what your model will predict using the variables in the data.
Split the data into a training set and a testing set with a split ratio of 70:30.
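A minimal sketch of a 70:30 split with sample(), using a simulated data frame (the `id` and `income` columns are hypothetical). Setting a seed makes the split reproducible when the document is knitted again.

```r
set.seed(42)

# Simulated stand-in for the cleaned application data
apps <- data.frame(id = 1:1000, income = rnorm(1000, 50000, 10000))

# Draw 70% of the row indices without replacement for training
train_idx <- sample(nrow(apps), size = round(0.7 * nrow(apps)))
train <- apps[train_idx, ]
test  <- apps[-train_idx, ]   # the remaining 30%

dim(train)
dim(test)
```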
Build the Predictive Model:
- Define the formula for the glm().
- Run the model.
- Interpret the results, referring to the p-values.
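The model-building steps might look like the sketch below. It assumes a binary target column named `approved` and three hypothetical predictors; the data are simulated, so the coefficients say nothing about the real dataset.

```r
set.seed(7)
n <- 500

# Simulated training data; all column names are hypothetical
train <- data.frame(
  income        = rnorm(n, 50, 15),              # in thousands
  age           = sample(21:65, n, replace = TRUE),
  owns_property = rbinom(n, 1, 0.6)
)
train$approved <- rbinom(
  n, 1, plogis(-3 + 0.05 * train$income + 0.8 * train$owns_property)
)

# Define the formula, then fit a logistic regression (binomial family)
loan_formula <- approved ~ income + age + owns_property
fit <- glm(loan_formula, data = train, family = binomial)
summary(fit)

# Extract the p-values from the coefficient table
coef_table  <- coef(summary(fit))
p_values    <- coef_table[, "Pr(>|z|)"]
significant <- names(p_values)[p_values < 0.05]
print(significant)

# Exponentiated coefficients are odds ratios
exp(coef(fit))
```

Predictors with p-values above the chosen threshold (0.05 here) are candidates for removal, though that decision should also weigh subject-matter relevance.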
Evaluate the Model Performance:
- Compare the predicted versus actual values.
- Search for any predictions that differ significantly from the actual values.
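One way to compare predicted and actual values is sketched below, again on simulated data with hypothetical column names and an assumed `approved` target. The "largest misses" are simply the test rows where the predicted probability is farthest from the observed 0/1 outcome.

```r
set.seed(7)
n <- 1000

# Simulated data; column names and the target are hypothetical
apps <- data.frame(
  income        = rnorm(n, 50, 15),
  age           = sample(21:65, n, replace = TRUE),
  owns_property = rbinom(n, 1, 0.6)
)
apps$approved <- rbinom(
  n, 1, plogis(-3 + 0.05 * apps$income + 0.8 * apps$owns_property)
)

idx   <- sample(n, round(0.7 * n))
train <- apps[idx, ]
test  <- apps[-idx, ]
fit   <- glm(approved ~ income + age + owns_property,
             data = train, family = binomial)

# Predicted probabilities on the held-out test set
test$prob <- predict(fit, newdata = test, type = "response")

# Side-by-side view of actual outcome and predicted probability
comparison <- data.frame(actual = test$approved,
                         predicted = round(test$prob, 3))
head(comparison)

# Rows where the prediction differs most from the actual value
test$abs_error <- abs(test$approved - test$prob)
worst <- head(test[order(test$abs_error, decreasing = TRUE), ], 5)
print(worst)
```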
Validate the Model:
- Produce a Gain and Lift chart and use it to describe the performance of the model.
- Measure the Variance Inflation Factor (VIF) to test for multicollinearity. If changes to the model are necessary based on the VIF, state and implement them.
- Has the formula, as defined in the previous section, changed? Why or why not?
- If changes to the model occurred, repeat the validation steps on the new model.
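The validation steps can be sketched in base R as shown here, so the example runs without extra packages. The assignment names gains() from the gains package and vif() from the car package; the code below is a hand-rolled approximation of what those produce, not a substitute for them, and the VIF values (computed as 1 / (1 - R²) from regressing each predictor on the others) may differ somewhat from what car::vif() reports for a glm. Data and column names are simulated and hypothetical.

```r
set.seed(7)
n <- 1000

# Simulated data; column names and the target are hypothetical
apps <- data.frame(
  income        = rnorm(n, 50, 15),
  age           = sample(21:65, n, replace = TRUE),
  owns_property = rbinom(n, 1, 0.6)
)
apps$approved <- rbinom(
  n, 1, plogis(-3 + 0.05 * apps$income + 0.8 * apps$owns_property)
)

idx   <- sample(n, round(0.7 * n))
train <- apps[idx, ]
test  <- apps[-idx, ]
fit   <- glm(approved ~ income + age + owns_property,
             data = train, family = binomial)
test$prob <- predict(fit, newdata = test, type = "response")

# --- Cumulative gain and lift by decile ---
ord           <- order(test$prob, decreasing = TRUE)
actual_sorted <- test$approved[ord]
decile   <- ceiling(10 * seq_along(actual_sorted) / length(actual_sorted))
hits     <- tapply(actual_sorted, decile, sum)      # approvals per decile
cum_gain <- unname(cumsum(hits) / sum(actual_sorted))
lift     <- cum_gain / ((1:10) / 10)

plot(c(0, 1:10) * 10, c(0, cum_gain) * 100, type = "b",
     xlab = "% of applications (ranked by predicted probability)",
     ylab = "% of approvals captured", main = "Cumulative gain chart")
lines(c(0, 100), c(0, 100), lty = 2)                # random-model baseline

# --- VIF: regress each predictor on the remaining predictors ---
predictors <- c("income", "age", "owns_property")
vif_values <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = train))$r.squared
  1 / (1 - r2)
})
print(round(vif_values, 2))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity; if a predictor exceeds the chosen cutoff, drop or combine it, refit the model, and rerun the chart.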
Make Predictions:
- Demonstrate a few examples of predictions your model can make.
- Validate the predictions by calculating the misclassification error.
- Interpret the results.
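A sketch of the prediction and misclassification steps, using ifelse(), table(), and mean() as listed earlier. The data, the column names, the two example applicants, and the 0.5 cutoff are all assumptions made for illustration.

```r
set.seed(7)
n <- 1000

# Simulated data; column names and the target are hypothetical
apps <- data.frame(
  income        = rnorm(n, 50, 15),
  age           = sample(21:65, n, replace = TRUE),
  owns_property = rbinom(n, 1, 0.6)
)
apps$approved <- rbinom(
  n, 1, plogis(-3 + 0.05 * apps$income + 0.8 * apps$owns_property)
)

idx   <- sample(n, round(0.7 * n))
train <- apps[idx, ]
test  <- apps[-idx, ]
fit   <- glm(approved ~ income + age + owns_property,
             data = train, family = binomial)

# Example predictions for two made-up applicants
new_applicants <- data.frame(income = c(30, 80),
                             age = c(25, 45),
                             owns_property = c(0, 1))
new_prob  <- predict(fit, newdata = new_applicants, type = "response")
new_class <- ifelse(new_prob > 0.5, "Yes", "No")   # assumed 0.5 cutoff
print(data.frame(new_applicants, prob = round(new_prob, 3), new_class))

# Classify the test set and build the confusion matrix
test_prob   <- predict(fit, newdata = test, type = "response")
test_pred   <- ifelse(test_prob > 0.5, 1, 0)
conf_matrix <- table(actual = test$approved, predicted = test_pred)
print(conf_matrix)

# Misclassification error: share of test rows predicted incorrectly
misclass_error <- mean(test_pred != test$approved)
print(misclass_error)
```

The off-diagonal cells of the confusion matrix separate false approvals from false rejections, which usually carry different costs for a lender.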
State a few suggestions for improving the model.
Write a professionally written and formatted R Markdown document knitted to Word. Make sure the document contains the R code, relevant plots, your analysis, and the appropriate citations and references.