Question:
The five most common words appearing in spam emails are shipping!, today!, here!, available, and fingertips! (Andy Greenberg, “The Most Common Words In Spam Email,” Forbes website, March 17, 2010). Many spam filters separate spam from ham (e-mail not considered to be spam) through application of Bayes’ theorem. Suppose that for one e-mail account, 1 in every 10 messages is spam and the proportions of spam messages that have the five most common words in spam email are given below.
Shipping! ............ .051
Today! ............... .045
Here! ................ .034
Available ............ .014
Fingertips! .......... .014
Also suppose that the proportions of ham messages that have these words are
Shipping! ............ .0015
Today! ............... .0022
Here! ................ .0022
Available ............ .0041
Fingertips! .......... .0011
a. If a message includes the word shipping!, what is the probability the message is spam? If a message includes the word shipping!, what is the probability the message is ham? Should messages that include the word shipping! be flagged as spam?
b. If a message includes the word today!, what is the probability the message is spam? If a message includes the word here!, what is the probability the message is spam? Which of these two words is a stronger indicator that a message is spam? Why?
c. If a message includes the word available, what is the probability the message is spam? If a message includes the word fingertips!, what is the probability the message is spam? Which of these two words is a stronger indicator that a message is spam? Why?
d. What insights do the results of parts (b) and (c) yield about what enables a spam filter that uses Bayes’ theorem to work effectively?
Step by Step Answer:
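The posterior probabilities asked for in parts (a) through (c) all follow from Bayes' theorem: P(spam | word) = P(spam) P(word | spam) / [P(spam) P(word | spam) + P(ham) P(word | ham)]. The following is a minimal sketch of that arithmetic, not the textbook's own solution. It assumes the prior P(spam) = 0.10 from "1 in every 10 messages is spam" and takes the likelihoods from the two tables in the question. The names `LIKELIHOODS` and `posterior_spam` are made up for illustration.

```python
# Prior probabilities, from "1 in every 10 messages is spam".
P_SPAM = 0.10
P_HAM = 1 - P_SPAM

# word -> (P(word | spam), P(word | ham)), copied from the two tables above.
LIKELIHOODS = {
    "shipping!":   (0.051, 0.0015),
    "today!":      (0.045, 0.0022),
    "here!":       (0.034, 0.0022),
    "available":   (0.014, 0.0041),
    "fingertips!": (0.014, 0.0011),
}


def posterior_spam(word):
    """Return P(spam | word) using Bayes' theorem."""
    p_word_given_spam, p_word_given_ham = LIKELIHOODS[word]
    joint_spam = P_SPAM * p_word_given_spam  # P(spam and word)
    joint_ham = P_HAM * p_word_given_ham     # P(ham and word)
    return joint_spam / (joint_spam + joint_ham)


for word in LIKELIHOODS:
    p = posterior_spam(word)
    print(f"{word:<12} P(spam | word) = {p:.4f}   P(ham | word) = {1 - p:.4f}")
```

Under these inputs the posterior for shipping! comes out near 0.79, today! near 0.69, here! near 0.63, available near 0.28, and fingertips! near 0.59. Available and fingertips! have the same proportion among spam messages (.014), yet their posteriors differ sharply because available is much more common in ham. This suggests that what matters for the filter is how much more often a word appears in spam than in ham, not how often it appears in spam alone.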
Essentials Of Statistics For Business And Economics
ISBN: 9781305081598
7th Edition
Authors: David Anderson, Thomas Williams, Dennis Sweeney, Jeffrey Cam