
Question



For this exercise, the Default dataset from ISLR will be used. The Default dataset has 9667 instances with default == "No", yet only 333 instances with default == "Yes". A one-predictor logistic regression model will be constructed with default as the response variable and balance as the only predictor variable.

    library(ROCR)
    data(Default, package = "ISLR")
    str(Default)
    ## 'data.frame': 10000 obs. of 4 variables:
    ##  $ default: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
    ##  $ student: Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 2 1 1 ...
    ##  $ balance: num 730 817 1074 529 786 ...
    ##  $ income : num 44362 12106 31767 35704 38463 ...

(a) Selection of p_thr via F_1 score

Recall that in the previous homework assignment, we selected the best threshold for a logistic regression by 10-fold CV and used the misclassification rate as a measure of (in)accuracy, or error rate. Because the misclassification rate treats the positive and negative classes as equally important, it may not be suitable for analyzing imbalanced datasets like Default. In the extreme case, a classification model will simply predict the majority class label. The F_1 score (equation 5.76 in Tan) is an alternative score we can use when a dataset has class imbalance. To calculate F_1, regard the minority class as Positive and the majority class as Negative.

Fit a logistic regression model (with default as the response and balance as the predictor) and name it def.glm. [2]

Use prediction() in library(ROCR) to find TP, FP and FN for each cutoff threshold value; name them tp, fp and fn respectively. Then calculate the F_1 score and name it f1. Report f1, tp, fp and fn of the first 10 observations.

Plot the F_1 score as a function of threshold values. Use which.max() to select the p_thr yielding the maximal F_1 score. What's your p_thr?
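One way to approach part (a) is sketched below. It is a minimal sketch, not a verified solution, and assumes the ROCR and ISLR packages are installed. The names def.glm, tp, fp, fn, f1 and p_thr come from the question; prob, pred, cutoffs, ok and res are hypothetical helper names introduced here. The sketch reads the counts from the tp, fp, fn and cutoffs slots of the ROCR prediction object and uses the standard identity F_1 = 2TP / (2TP + FP + FN), which is assumed to match equation 5.76 in Tan.

```r
library(ROCR)
data(Default, package = "ISLR")

# Logistic regression with balance as the only predictor
def.glm <- glm(default ~ balance, data = Default, family = binomial)

# Fitted probabilities of default == "Yes" on the training data
prob <- predict(def.glm, type = "response")

# ROCR treats the higher factor level ("Yes") as the positive class
pred <- prediction(prob, Default$default)

# Each slot is a list with one element per run; there is a single run here
tp      <- pred@tp[[1]]
fp      <- pred@fp[[1]]
fn      <- pred@fn[[1]]
cutoffs <- pred@cutoffs[[1]]

# F1 = 2 * TP / (2 * TP + FP + FN)
f1 <- 2 * tp / (2 * tp + fp + fn)

# First 10 rows; the first cutoff is Inf (nothing is predicted positive)
res <- data.frame(cutoff = cutoffs, tp = tp, fp = fp, fn = fn, f1 = f1)
print(head(res, 10))

# Plot F1 against the finite cutoffs only
ok <- is.finite(cutoffs)
plot(cutoffs[ok], f1[ok], type = "l",
     xlab = "Threshold", ylab = "F1 score")

# Threshold with the maximal F1 score
p_thr <- cutoffs[which.max(f1)]
p_thr
```

Because the thresholds come from the fitted probabilities themselves, p_thr is one of the observed predicted values rather than a point on a regular grid. Note also that F_1 here is computed on the training data, not by cross-validation.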
