Question


In this sub-question, we discuss the common evaluation metrics for imbalanced datasets. Suppose we have a
validation dataset and, for some $\rho \in (0,1)$, we assume that a $\rho$ fraction of the validation examples are positive
examples (with label 1) and a $1-\rho$ fraction of them are negative examples (with label 0).
Define the accuracy as
\[
A = \frac{\#\text{examples that are predicted correctly by the classifier}}{\#\text{examples}}.
\]
(i) [1 point (Written)] Show that for any dataset with a $\rho$ fraction of positive examples and a $1-\rho$ fraction of
negative examples, there exists a (trivial) classifier with accuracy at least $1-\rho$.
The statement above suggests that accuracy is not an ideal evaluation metric when $\rho$ is close to 0. For example,
in spam detection $\rho$ can be smaller than 1%, and the statement implies that there is a trivial classifier
that gets more than 99% accuracy. This could be misleading: 99% seems almost perfect, but
you don't need to learn anything from the dataset to achieve it.
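To make the point about trivial classifiers concrete, here is a minimal sketch. It assumes a hypothetical validation set of 1000 examples with 10 positives (so $\rho = 0.01$); the function names `always_negative` and `accuracy` are illustrative and not part of the problem statement.

```python
def always_negative(example):
    """Trivial classifier: ignores its input and always predicts label 0."""
    return 0


def accuracy(labels, predictions):
    """Fraction of examples whose prediction matches the label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical validation set (an assumption): 10 positives, 990 negatives.
labels = [1] * 10 + [0] * 990
predictions = [always_negative(None) for _ in labels]

rho = sum(labels) / len(labels)            # fraction of positive examples
trivial_accuracy = accuracy(labels, predictions)

print(f"rho = {rho}, accuracy of always-negative = {trivial_accuracy}")
```

The always-negative classifier is correct on every negative example and wrong on every positive one, so its accuracy equals the fraction of negatives, $1-\rho$.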
Therefore, for imbalanced datasets, we need more informative evaluation metrics. We define the number of true
positive, true negative, false positive, and false negative examples as
\begin{align*}
TP &= \#\text{positive examples with a correct (positive) prediction} \\
TN &= \#\text{negative examples with a correct (negative) prediction} \\
FP &= \#\text{negative examples with an incorrect (positive) prediction} \\
FN &= \#\text{positive examples with an incorrect (negative) prediction}
\end{align*}
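The four counts can be tallied in one pass over the labels and predictions. The following is a sketch only; the function name `confusion_counts` and the eight-example toy data are assumptions made for illustration.

```python
def confusion_counts(labels, predictions):
    """Return (TP, TN, FP, FN) for binary labels and predictions in {0, 1}."""
    tp = tn = fp = fn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1          # positive example, correct (positive) prediction
        elif y == 0 and p == 0:
            tn += 1          # negative example, correct (negative) prediction
        elif y == 0 and p == 1:
            fp += 1          # negative example, incorrect (positive) prediction
        else:
            fn += 1          # positive example, incorrect (negative) prediction
    return tp, tn, fp, fn


# Toy data (hypothetical): 3 positives followed by 5 negatives.
labels      = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 0, 1, 0, 0, 1, 0, 0]

tp, tn, fp, fn = confusion_counts(labels, predictions)
print(tp, tn, fp, fn)
```

Every example falls into exactly one of the four cases, so the counts always sum to the number of examples.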
Define the accuracy of positive examples as
\[
A_1 = \frac{TP}{TP + FN} = \frac{\#\text{positive examples with a correct (positive) prediction}}{\#\text{positive examples}}.
\]
Define the accuracy of negative examples as
\[
A_0 = \frac{TN}{TN + FP} = \frac{\#\text{negative examples with a correct (negative) prediction}}{\#\text{negative examples}}.
\]
We define the balanced accuracy as
\[
\overline{A} = \frac{1}{2}(A_0 + A_1).
\]
With this notation, we can verify that the accuracy is equal to
\[
A = \frac{TP + TN}{TP + TN + FP + FN}. \tag{4}
\]
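As a rough illustration of how these metrics behave on an imbalanced dataset, the sketch below computes the per-class accuracies, the balanced accuracy (taken here as the mean of the two per-class accuracies), and the overall accuracy from the four counts. The function name `metrics` and the example counts are assumptions; the counts correspond to a hypothetical classifier that always predicts negative on 10 positive and 990 negative examples.

```python
def metrics(tp, tn, fp, fn):
    """Compute (A0, A1, balanced accuracy, accuracy) from confusion counts.

    Assumes both classes are present, i.e. tp + fn > 0 and tn + fp > 0,
    which holds whenever 0 < rho < 1.
    """
    a1 = tp / (tp + fn)                      # accuracy on positive examples
    a0 = tn / (tn + fp)                      # accuracy on negative examples
    balanced = 0.5 * (a0 + a1)               # mean of the per-class accuracies
    acc = (tp + tn) / (tp + tn + fp + fn)    # overall accuracy
    return a0, a1, balanced, acc


# Hypothetical always-negative classifier: every positive is missed.
a0, a1, balanced, acc = metrics(tp=0, tn=990, fp=0, fn=10)
print(f"A0={a0}, A1={a1}, balanced={balanced}, accuracy={acc}")
```

In this example the overall accuracy is 0.99 while the balanced accuracy is only 0.5, which is why the balanced accuracy is the more informative metric when $\rho$ is small.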
