Question

R programming:

It is important to have experience implementing one of the most common applications of regression currently used in business, finance, and healthcare. Questions such as whether a loan should be approved, whether a driver is entitled to a discount, and whether a patient will survive are all answered with a form of logistic regression (i.e., with a Yes/No answer).

Using a dataset representing applications for a bank loan, the task will be to build a logistic regression model that can predict whether or not a loan will be approved.

Useful R functions for this work are:

  1. Data exploration: is.na(), summary()
  2. Split data into train/test: sample()
  3. Build the model: glm(), summary()
  4. Model performance evaluation: predict()
  5. Model validation: library(gains), gains(), plot(), lines(), dim(), library(car), vif(), glm(), summary(), predict(), ifelse()
  6. Validate prediction: table(), mean()
  7. Results interpretation: library(ROCR), predict(), prediction(), performance()

For this activity, perform the following:

Download the dataset from https://www.kaggle.com/datasets/rikdifos/credit-card-approval-prediction and load the file "application_record.csv". Then:
  1. Display representative portions of the data.
  2. Check for missing values and clean the data.
  3. Check for outliers and decide if and how to process them.
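
The three data-preparation steps above can be sketched as follows. This is a minimal illustration, not a definitive cleaning procedure: it runs on a small synthetic data frame so that it is self-contained, and the column names (AMT_INCOME_TOTAL, CNT_CHILDREN, OCCUPATION_TYPE) are assumed to resemble those in the real file. Dropping incomplete rows and capping at the 1.5 * IQR fence are just one possible choice for each step.

```r
# Synthetic stand-in for the real data (assumption). With the actual file:
# apps <- read.csv("application_record.csv", stringsAsFactors = TRUE)
set.seed(1)
n <- 100
apps <- data.frame(
  AMT_INCOME_TOTAL = round(rlnorm(n, meanlog = 12, sdlog = 0.4)),
  CNT_CHILDREN     = rpois(n, 0.5),
  OCCUPATION_TYPE  = sample(c("Laborers", "Managers"), n, replace = TRUE)
)
apps$AMT_INCOME_TOTAL[1] <- 5e6   # planted extreme income
apps$AMT_INCOME_TOTAL[2] <- NA    # planted missing values
apps$OCCUPATION_TYPE[3]  <- NA
apps$CNT_CHILDREN[4]     <- NA

# 1. Representative portions of the data
head(apps)
str(apps)
summary(apps)

# 2. Count missing values per column, then drop incomplete rows
na_counts <- colSums(is.na(apps))
print(na_counts)
apps_clean <- na.omit(apps)

# 3. Flag high-income outliers with the 1.5 * IQR rule
q <- quantile(apps_clean$AMT_INCOME_TOTAL, c(0.25, 0.75))
fence <- unname(q[2] + 1.5 * diff(q))
is_outlier <- apps_clean$AMT_INCOME_TOTAL > fence
table(is_outlier)

# One option: cap extreme incomes at the upper fence instead of deleting rows
apps_clean$AMT_INCOME_TOTAL <- pmin(apps_clean$AMT_INCOME_TOTAL, fence)
```

Whether to drop, cap, or keep outliers depends on the variable; the write-up should state which option was chosen and why.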

Formally state what your model will predict using the variables in the data.
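
One way to write the formal statement, assuming a binary outcome Y (1 = approved, 0 = not approved) and predictors x_1, ..., x_p chosen from the dataset columns (the symbols are illustrative, not given in the question):

```latex
\[
\Pr(Y = 1 \mid x_1, \dots, x_p)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}}
\quad\Longleftrightarrow\quad
\log\frac{\Pr(Y = 1 \mid x_1, \dots, x_p)}{1 - \Pr(Y = 1 \mid x_1, \dots, x_p)}
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_j
\]
```

The left form gives the predicted probability of approval; the right form shows that each coefficient is a change in the log-odds of approval per unit change in its predictor.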

Split the data into a training set and a testing set with a split ratio of 70:30.
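
A minimal sketch of the 70:30 split using sample(). The data frame here is a synthetic placeholder; with the real data, `apps` would be the cleaned application records.

```r
set.seed(42)                                   # makes the split reproducible
apps <- data.frame(id = 1:1000, x = rnorm(1000))  # placeholder data (assumption)

n <- nrow(apps)
train_idx <- sample(n, size = round(0.7 * n))  # row numbers for the 70% training set
train <- apps[train_idx, ]
test  <- apps[-train_idx, ]                    # remaining 30% for testing

dim(train)
dim(test)
```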

Build the Predictive Model:

  1. Define the formula for the glm().
  2. Run the model.
  3. Interpret the results, referring to the p-values.
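
The model-building steps above might look like the sketch below. The predictors (`income`, `children`, `own_car`) and the `approved` label are hypothetical and generated synthetically so the example runs on its own; the real formula depends on which columns survive cleaning and on how the approval label is defined.

```r
set.seed(123)
n <- 500
train <- data.frame(
  income   = rnorm(n, 50, 15),                       # hypothetical, in thousands
  children = rpois(n, 1),
  own_car  = factor(sample(c("N", "Y"), n, replace = TRUE))
)
# Hypothetical label drawn from a known logistic relationship
lin <- -3 + 0.06 * train$income - 0.3 * train$children
train$approved <- rbinom(n, 1, plogis(lin))

# 1. Define the formula
loan_formula <- approved ~ income + children + own_car

# 2. Run the model (binomial family gives logistic regression)
fit <- glm(loan_formula, data = train, family = binomial)
summary(fit)

# 3. Coefficient table: estimate, std. error, z value, p-value
coefs <- coef(summary(fit))
coefs[coefs[, "Pr(>|z|)"] < 0.05, , drop = FALSE]    # terms significant at 5%
exp(coef(fit))                                       # odds ratios
```

A small p-value suggests the predictor is associated with the log-odds of approval after adjusting for the other terms; the exponentiated coefficients are often easier to explain in a report because they read as odds ratios.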

Evaluate the Model Performance:

  1. Compare the predicted versus actual values.
  2. Search for any predictions that differ significantly from the actual values.
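
A sketch of comparing predicted and actual values on held-out data, again with synthetic data and hypothetical column names. Here "differ significantly" is read as a large gap between the observed 0/1 outcome and the predicted probability, which is one interpretation among several.

```r
set.seed(7)
n <- 600
d <- data.frame(income = rnorm(n, 50, 15), children = rpois(n, 1))
d$approved <- rbinom(n, 1, plogis(-3 + 0.06 * d$income - 0.3 * d$children))

idx   <- sample(n, round(0.7 * n))
train <- d[idx, ]
test  <- d[-idx, ]

fit <- glm(approved ~ income + children, data = train, family = binomial)

# 1. Predicted probability of approval next to the actual outcome
test$prob <- predict(fit, newdata = test, type = "response")
head(test[, c("approved", "prob")])

# 2. Rows where the prediction is furthest from the actual value
test$gap <- abs(test$approved - test$prob)
worst <- head(test[order(test$gap, decreasing = TRUE), ], 5)
worst
```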

Validate the Model:

  1. Produce a Gain and Lift chart and use it to describe the performance of the model.
  2. Measure the Variance Inflation Factor (VIF) to test for multicollinearity. If changes to the model are necessary based on the VIF, state and implement them.
  3. Has the formula, as defined in the previous section, changed? Why or why not?
  4. If changes to the model occurred, repeat the validation steps on the new model.
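
The validation steps above can be approximated in base R, as in the sketch below. The question names gains() and car::vif(); this example deliberately computes a decile gains table and predictor-based VIFs by hand so it runs without extra packages. The hand-computed VIF (1 / (1 - R^2) from regressing each predictor on the others) may differ slightly from what car::vif() reports for a glm. Data and column names are synthetic assumptions, with `family_size` built to be correlated with `children`.

```r
set.seed(11)
n <- 1000
d <- data.frame(income = rnorm(n, 50, 15), children = rpois(n, 1))
d$family_size <- d$children + sample(1:2, n, replace = TRUE)  # collinear by construction
d$approved <- rbinom(n, 1, plogis(-3 + 0.06 * d$income - 0.3 * d$children))

fit  <- glm(approved ~ income + children + family_size, data = d, family = binomial)
prob <- predict(fit, type = "response")

# Cumulative gains and lift by decile of predicted probability
ord      <- order(prob, decreasing = TRUE)
decile   <- ceiling(seq_len(n) / (n / 10))          # 1 = highest scores
hits     <- tapply(d$approved[ord], decile, sum)    # approvals captured per decile
cum_gain <- cumsum(hits) / sum(d$approved)
lift     <- cum_gain / (seq_len(10) / 10)

plot(seq(10, 100, by = 10), 100 * cum_gain, type = "b",
     xlim = c(0, 100), ylim = c(0, 100),
     xlab = "% of applications (ranked by score)",
     ylab = "% of approvals captured", main = "Cumulative gains")
lines(c(0, 100), c(0, 100), lty = 2)                # no-model baseline

# Hand-computed VIF for each predictor
predictors <- c("income", "children", "family_size")
vif_manual <- sapply(predictors, function(v) {
  f  <- reformulate(setdiff(predictors, v), response = v)
  r2 <- summary(lm(f, data = d))$r.squared
  1 / (1 - r2)
})
round(vif_manual, 2)
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity; if a predictor is dropped on that basis, the formula changes and the gains chart should be regenerated for the refitted model.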

Make Predictions:

  1. Demonstrate a few examples of predictions your model can make.
  2. Validate the predictions by calculating the misclassification error.
  3. Interpret the results.
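
A sketch of the prediction and misclassification steps, using synthetic data, hypothetical predictors, and an assumed 0.5 probability cutoff (the question does not specify a threshold).

```r
set.seed(21)
n <- 1000
d <- data.frame(income = rnorm(n, 50, 15), children = rpois(n, 1))
d$approved <- rbinom(n, 1, plogis(-3 + 0.06 * d$income - 0.3 * d$children))

idx   <- sample(n, round(0.7 * n))
train <- d[idx, ]
test  <- d[-idx, ]
fit   <- glm(approved ~ income + children, data = train, family = binomial)

# 1. Example predictions for three hypothetical applicants
new_apps <- data.frame(income = c(30, 60, 90), children = c(2, 1, 0))
new_apps$prob     <- predict(fit, newdata = new_apps, type = "response")
new_apps$decision <- ifelse(new_apps$prob > 0.5, "Yes", "No")
new_apps

# 2. Misclassification error on the test set
test_prob <- predict(fit, newdata = test, type = "response")
test_pred <- ifelse(test_prob > 0.5, 1, 0)
conf_mat  <- table(Predicted = test_pred, Actual = test$approved)
conf_mat
misclass  <- mean(test_pred != test$approved)
misclass
```

The misclassification error is the share of test applications whose predicted class differs from the actual one; the confusion matrix shows whether the errors are mostly false approvals or false rejections, which matters when interpreting the result for a lender.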

State a few suggestions for improving the model.

Write a professionally formatted R Markdown document and knit it to Word. Make sure the document contains the R code, relevant plots, your analysis, and the appropriate citations and references.
