
Question

MIS 3210 - Home Work 4: Evaluating Classification Models

You need to understand and be able to calculate the following three terms to do this activity.

Precision is the fraction of positively predicted outcomes that are actually positive, i.e., if the output is predicted to be positive, what is the chance that it is actually positive?

Recall is the fraction of all actual positive data points in our sample that are predicted as positive, i.e., out of all the actual positive outcomes, how many have been predicted as positive?

Overall accuracy is the fraction of all data points that have been predicted correctly, i.e., out of all the data points, how many positives have been predicted as positive and how many negatives have been predicted as negative?

Note: You can leave all the answers as fractions, but putting them in decimals or percentages would make them more interpretable.

Before attempting the questions, you need to understand what is across the rows and what is across the columns. You also need to understand what the number in each cell means. Put them in plain English. For example, what is the number "2", what is the number "15"?

(Q1) We have built a new spam filter and want to evaluate how good it is. Given below is the confusion matrix for the spam filter for 100 e-mails.

                      Predicted
                      Spam    Not Spam
Actual    Spam        15      10
          Not Spam    5       70

(a) What is the number 15 here? Put it in plain English.
(b) Calculate the precision for the spam filter. What is the interpretation of having this value for precision, i.e., how would you explain this to someone who doesn't know how precision is calculated but still uses e-mail and gets spam e-mails?
(c) Calculate the recall for the spam filter. What is the interpretation of having this value for recall, i.e., how would you explain this to someone who doesn't know how recall is calculated but still uses e-mail and gets spam e-mails?
(d) You can see here that the precision is very good for this spam filter but the recall is not so good. What does it mean to have high precision and low recall? (Hint: Think about how you interpret precision and recall and apply it to the context of a spam filter.) What might be the possible reason you are seeing these results?
(e) What does it mean to have high recall and low precision for a spam filter? Which of the two do you think is better, i.e., high precision and low recall or high recall and low precision?
(f) What is the overall accuracy of the spam filter? What do you mean when you say this spam filter has this value of accuracy?

(Q2) You have the confusion matrix for the performance of a classifier I used to predict a student's grade in the class. These grades are based on the overall scores of the students across different assessments, including quizzes, homework, attendance, in-class activities, etc. The range for each of the grades is given below.

A    90-100
B    80-89
C    70-79
D    below 70

[Confusion matrix of predicted versus actual grades (A, B, C, D). The cell values did not survive extraction intact; the legible fragments are: 10 4 2 1, 3 6 9 3, 1.]

(a) There are two "4"s in this matrix. Provide a plain English description for each of them.
(b) How many students do I have in the class?
(c) In plain English, define what "Precision for Grade B" is in this context. What is the precision for Grade C and for Grade D?
(d) In plain English, define what "Recall for Grade C" is in this context. What is the recall for Grade A and for Grade B? What does it mean to have a higher recall for one grade and not the other?
(e) What is the overall accuracy?
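
The three definitions given in the question can be sketched in Python. This is a minimal illustration, not part of the original assignment: the function names `per_class_metrics` and `overall_accuracy` are made up for this example, and the code assumes that rows hold the actual classes and columns hold the predicted classes, as in the spam-filter matrix. Because it works on any square matrix, the same functions would also apply to a four-grade matrix like the one in the second question.

```python
from fractions import Fraction


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} for a square confusion matrix.

    Assumption: matrix[i][j] counts items whose actual class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted_total = sum(row[i] for row in matrix)  # column sum
        actual_total = sum(matrix[i])                    # row sum
        # A class that is never predicted (or never occurs) has no defined value.
        precision = Fraction(true_positives, predicted_total) if predicted_total else None
        recall = Fraction(true_positives, actual_total) if actual_total else None
        metrics[label] = (precision, recall)
    return metrics


def overall_accuracy(matrix):
    """Fraction of all items on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return Fraction(correct, total)


# Spam-filter confusion matrix from the question (100 e-mails).
spam_labels = ["Spam", "Not Spam"]
spam_matrix = [
    [15, 10],  # actual Spam:     15 predicted Spam, 10 predicted Not Spam
    [5, 70],   # actual Not Spam:  5 predicted Spam, 70 predicted Not Spam
]

spam_precision, spam_recall = per_class_metrics(spam_matrix, spam_labels)["Spam"]
spam_accuracy = overall_accuracy(spam_matrix)

print(f"precision={float(spam_precision):.2f} "
      f"recall={float(spam_recall):.2f} "
      f"accuracy={float(spam_accuracy):.2f}")
# prints: precision=0.75 recall=0.60 accuracy=0.85
```

Using `Fraction` keeps the exact values (15/20, 15/25, 85/100) available, which matches the note that answers may be left as fractions and converted to decimals only for interpretation.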

