Question

Please solve this Data Science problem.

Consider a data set whose instances belong to one of two classes: positive (+) and negative (-). A classifier was built using a training set consisting of an equal number of positive and negative instances. On the training instances, the classifier has an accuracy of m on the positive class and an accuracy of n on the negative class. (For example, if m = 0.7, then 70% of the training samples that are truly in the positive class are correctly classified.)

The trained classifier is now tested on two data sets, both with data characteristics similar to the training set. The first data set has 1000 positive and 1000 negative instances. The second data set has 100 positive and 1000 negative instances.

A. (10 Points) Draw the expected (i.e., expectation in the statistical sense) confusion matrix summarizing the expected classifier performance on each of the two data sets. Assume the model keeps the same per-class accuracies, m and n, on the test samples.

B. (10 Points) What is the accuracy of the classifier on the training set? Compute the precision, TPR and FPR for the two test data sets using the confusion matrices from part A. Also report the accuracy of the classifier on both data sets.

C. (Extra credit 6 points)
i) If the skew in the test data, i.e. the ratio of the number of positive instances to the number of negative instances, is 1:s, what is the accuracy of the classifier on this data set? Express your answer in terms of s, m and n.
ii) What value does the overall accuracy approach if s is very large (s >> 1)? And when s is very small (s << 1)?

D. (Extra credit 4 points) In the scenario where the class imbalance is very high (say, s >> 500 in part C), how are precision and recall better metrics than overall accuracy? What information does precision capture that recall doesn't?
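The quantities asked for in parts A to C follow directly from the per-class accuracies, so they can be checked numerically. With P positive and N negative test instances, the expected counts are TP = m*P, FN = (1-m)*P, TN = n*N and FP = (1-n)*N. For a skew of 1:s this gives an accuracy of (m + s*n) / (1 + s), which tends to n when s is very large and to m when s is very small. The Python sketch below is only an illustration of that arithmetic, not the site's withheld solution: the function names are made up for this example, and n = 0.9 is an assumed value (only m = 0.7 appears in the question).

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Expected confusion-matrix counts (may be fractional)."""
    tp: float
    fn: float
    fp: float
    tn: float


def expected_confusion(m, n, num_pos, num_neg):
    """Expected counts when a fraction m of positives and a fraction n
    of negatives are classified correctly."""
    tp = m * num_pos
    tn = n * num_neg
    return Confusion(tp=tp, fn=num_pos - tp, fp=num_neg - tn, tn=tn)


def precision(c):
    # Undefined (division by zero) if nothing is predicted positive.
    return c.tp / (c.tp + c.fp)


def tpr(c):
    # True positive rate, also called recall; equals m by construction.
    return c.tp / (c.tp + c.fn)


def fpr(c):
    # False positive rate; equals 1 - n by construction.
    return c.fp / (c.fp + c.tn)


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def accuracy_for_skew(m, n, s):
    """Accuracy when positives : negatives = 1 : s."""
    return (m + s * n) / (1 + s)


if __name__ == "__main__":
    m, n = 0.7, 0.9  # n = 0.9 is an assumed example value
    for num_pos, num_neg in [(1000, 1000), (100, 1000)]:
        c = expected_confusion(m, n, num_pos, num_neg)
        print(c)
        print("precision:", precision(c), "TPR:", tpr(c),
              "FPR:", fpr(c), "accuracy:", accuracy(c))
    # Training set is balanced (s = 1), so accuracy is (m + n) / 2.
    print("training accuracy:", accuracy_for_skew(m, n, 1))
```

With these example values, TPR and FPR are the same on both test sets (they depend only on m and n), while precision drops on the second, imbalanced set because the 100 expected false positives are now compared against only 70 expected true positives. This is the effect part D asks about: precision depends on the class ratio, whereas recall does not.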
