Question
Suppose the five most common words appearing in spam emails are shipping!, today!, here!, available, and fingertips! Many spam filters separate spam from ham (email not considered to be spam) through application of Bayes' theorem. Suppose that for one email account, 1 in every 10 messages is spam and the proportions of spam messages that have the five most common words in spam email are given below.
shipping! | 0.051 |
today! | 0.046 |
here! | 0.032 |
available | 0.015 |
fingertips! | 0.014 |
Also suppose that the proportions of ham messages that have these words are the following.
shipping! | 0.0013 |
today! | 0.0021 |
here! | 0.0022 |
available | 0.0042 |
fingertips! | 0.0010 |
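The two tables give the conditional proportions P(word | spam) and P(word | ham), and the 1-in-10 spam rate gives the prior P(spam) = 0.1, so P(ham) = 0.9. As a reminder of how Bayes' theorem combines these quantities (the notation is an assumption for illustration, not taken from the problem statement):

```latex
P(\text{spam} \mid \text{word})
  = \frac{P(\text{word} \mid \text{spam})\, P(\text{spam})}
         {P(\text{word} \mid \text{spam})\, P(\text{spam})
          + P(\text{word} \mid \text{ham})\, P(\text{ham})},
\qquad
P(\text{ham} \mid \text{word}) = 1 - P(\text{spam} \mid \text{word})
```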
(a) If a message includes the word shipping!, what is the probability the message is spam? (Round your answer to three decimal places.)_______
If a message includes the word shipping!, what is the probability the message is ham? (Round your answer to three decimal places.)________
Should messages that include the word shipping! be flagged as spam?
They {---Select--- | should | should not} be flagged as spam because the probability that a message is spam if it includes the word shipping! is {---Select--- | high | low}.
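A minimal sketch of the part (a) calculation, assuming the prior P(spam) = 0.10 and the two shipping! proportions from the tables. The function name `posterior_spam` is hypothetical and chosen only for illustration.

```python
def posterior_spam(p_word_given_spam, p_word_given_ham, p_spam=0.10):
    """Return P(spam | word) using Bayes' theorem.

    p_spam is the prior from the problem (1 in every 10 messages is spam).
    """
    p_ham = 1.0 - p_spam
    joint_spam = p_word_given_spam * p_spam  # P(word and spam)
    joint_ham = p_word_given_ham * p_ham     # P(word and ham)
    return joint_spam / (joint_spam + joint_ham)


# Proportions for "shipping!" taken from the two tables in the question.
p_spam_shipping = posterior_spam(0.051, 0.0013)
p_ham_shipping = 1.0 - p_spam_shipping

print(f"P(spam | shipping!) = {p_spam_shipping:.3f}")
print(f"P(ham  | shipping!) = {p_ham_shipping:.3f}")
```

Under these assumptions the two printed values should come out near 0.813 and 0.187, which suggests that a message containing shipping! is far more likely to be spam than ham.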
(b) If a message includes the word today!, what is the probability the message is spam? (Round your answer to three decimal places.)______
If a message includes the word here!, what is the probability the message is spam? (Round your answer to three decimal places.)______
Which of these two words is a stronger indicator that a message is spam? Why?
A message that includes the word {---Select--- | today! | here!} is more likely to be spam because P(spam | today!) is {---Select--- | larger | smaller} than P(spam | here!).
(c) If a message includes the word available, what is the probability the message is spam? (Round your answer to three decimal places.)______
If a message includes the word fingertips!, what is the probability the message is spam? (Round your answer to three decimal places.)______
Which of these two words is a stronger indicator that a message is spam? Why?
A message that includes the word {---Select--- | fingertips! | available} is more likely to be spam because P(spam | available) is {---Select--- | larger | smaller} than P(spam | fingertips!).
(d) What insights do the results of parts (b) and (c) yield about what enables a spam filter that uses Bayes' theorem to work effectively?
A message containing a word is {---Select--- | less | more} likely to be spam when that word occurs more often in unwanted messages (spam) and less often in legitimate messages (ham).
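One way to see the point of part (d) is through the odds form of Bayes' theorem: posterior odds = prior odds × likelihood ratio, where the likelihood ratio is P(word | spam) / P(word | ham). The sketch below is illustrative only; the names `WORDS`, `likelihood_ratio`, and `posterior_from_ratio` are hypothetical, and the numbers are copied from the two tables in the question.

```python
P_SPAM = 0.10  # prior: 1 in every 10 messages is spam

# word -> (proportion of spam containing it, proportion of ham containing it)
WORDS = {
    "shipping!": (0.051, 0.0013),
    "today!": (0.046, 0.0021),
    "here!": (0.032, 0.0022),
    "available": (0.015, 0.0042),
    "fingertips!": (0.014, 0.0010),
}


def likelihood_ratio(p_word_given_spam, p_word_given_ham):
    """How many times more often the word shows up in spam than in ham."""
    return p_word_given_spam / p_word_given_ham


def posterior_from_ratio(ratio, p_spam=P_SPAM):
    """Convert a likelihood ratio into P(spam | word) via posterior odds."""
    prior_odds = p_spam / (1.0 - p_spam)
    posterior_odds = prior_odds * ratio
    return posterior_odds / (1.0 + posterior_odds)


results = {}
for word, (p_s, p_h) in WORDS.items():
    ratio = likelihood_ratio(p_s, p_h)
    results[word] = (ratio, posterior_from_ratio(ratio))

# List words from strongest to weakest spam indicator.
ranking = sorted(results, key=lambda w: results[w][1], reverse=True)
for word in ranking:
    ratio, post = results[word]
    print(f"{word:12s} ratio = {ratio:6.2f}  P(spam | word) = {post:.3f}")
```

Because the prior is the same for every word, the ranking depends only on the likelihood ratio. That is why fingertips!, the rarest word in spam, still appears to outrank available, which occurs comparatively often in ham.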