
Question


PLEASE CODE IN C++

We want to classify this email as either spam or not spam. Typically, the filter will consider the entire email and look for multiple words that are common in spam emails. For our filter, we will consider a single word. For this example, we will classify the email based on the word "best". Assume the probability that any particular email is spam is 0.25, and the probability that any particular email is not spam is 0.75. To classify the mystery email (quoted in the lab handout below), we want to compute the probability that this email is spam given that it contains the word "best". Then we want to compute the probability that this email is NOT spam given that it contains the word "best". We then classify based on which probability is higher.

Let's define a couple of variables:

1. $C$: email contains the word "best"
2. $\bar{C}$: email does NOT contain the word "best"
3. $S$: email is spam
4. $\bar{S}$: email is NOT spam

Hence, we want to compute $P(S \mid C)$ and $P(\bar{S} \mid C)$.
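The two probabilities follow from Bayes' rule: $P(S \mid C) = P(C \mid S)\,P(S) / P(C)$, with $P(C) = P(C \mid S)\,P(S) + P(C \mid \bar{S})\,P(\bar{S})$. A minimal C++ sketch of that step is below. The names `Posterior` and `posteriorGivenWord` are hypothetical, and the likelihoods $P(C \mid S)$ and $P(C \mid \bar{S})$ are assumed to have been estimated separately from training emails; only the prior 0.25 comes from the question.

```cpp
#include <cmath>

// Posterior probabilities of the two classes, given that the word is present.
struct Posterior {
    double spam;     // P(S | C)
    double notSpam;  // P(not S | C)
};

// Hypothetical helper (not part of the assignment text): applies Bayes' rule.
//   pWordGivenSpam    = P(C | S)
//   pWordGivenNotSpam = P(C | not S)
//   pSpam             = P(S), e.g. 0.25 as stated in the question
Posterior posteriorGivenWord(double pWordGivenSpam,
                             double pWordGivenNotSpam,
                             double pSpam) {
    const double pNotSpam = 1.0 - pSpam;
    // Total probability of seeing the word: P(C).
    const double evidence = pWordGivenSpam * pSpam + pWordGivenNotSpam * pNotSpam;
    if (evidence == 0.0) {
        // Assumption: if the word never appears in training data,
        // fall back to the priors instead of dividing by zero.
        return {pSpam, pNotSpam};
    }
    return {pWordGivenSpam * pSpam / evidence,
            pWordGivenNotSpam * pNotSpam / evidence};
}

// Classification rule from the question: pick the larger posterior.
bool isSpam(const Posterior& p) {
    return p.spam > p.notSpam;
}
```

As an illustration with made-up likelihoods, `posteriorGivenWord(0.8, 0.2, 0.25)` gives about 0.571 for spam and 0.429 for not spam, so `isSpam` would return `true`.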

TEST FILES NEEDED TO STREAM (emailX.txt or spamX.txt, where X is a number):

email1.txt:

dear all,

on behalf of thrill company, i am glad to invite you for a luncheon party with all the senior employees, team members, and other staff members associated with the company. since according to associations policies, we have five working days, therefore we have planned to set a lunch party for saturday, 13th january 2012.

please mark your presence on this party. together, we would get an opportunity to interact with our boss, expand our contacts, learn more about our field and of course eat some mouth watering dishes. this luncheon would be held in a new york cafe situated at park lane. kindly be present by 12:00 noon so that your taste buds do not miss any of the tempting dishes being served!

i request you to confirm your presence latest by wednesday, 9th january 2012 so that we make appropriate bookings.

looking forward to see you on this thrills luncheon party!

sincerely, jacob thomas hr head thrill company

email2.txt:

dear all,

on behalf of thrill company, i am glad to invite you for a luncheon party with all the senior employees, team members, and other staff members associated with the company. since according to associations policies, we have five working days, therefore we have planned to set a lunch party for saturday, 13th january 2012.

please mark your presence on this party. together, we would get an opportunity to interact with our boss, expand our contacts, learn more about our field and of course eat some mouth watering dishes. this luncheon would be held in a new york cafe situated at park lane. kindly be present by 12:00 noon so that your taste buds do not miss any of the tempting dishes being served!

i request you to confirm your presence latest by wednesday, 9th january 2012 so that we make appropriate bookings.

looking forward to see you on this thrills luncheon party!

sincerely, jacob thomas hr head thrill company

email3.txt:

Dear all,

This email is germane to all the team members and other staff circle associated with Marine Company. We are writing this letter to notify you about some temporary changes in office timings.

As all of you are aware that you were supposed to arrive by 10:00 AM and depart by 5:00 PM. But, due to some ongoing transactions in the financial department, your timings have been shifted to 9:00 AM to 4:00 PM. A short lunch break of 15 minutes would now be provided to you at 1:30 PM. Late cases, if any would not be accepted. Employers with applications regarding medical, social or causal purpose leaves are requested to contact HR Department.

These changes in official working timings are affective from July, 23rd. We would further contact you with emails only, if there are any kinds of changes in the scheduled timings. For any kind of queries, you can leave a mail on marinecompany@hotmail.com.

Expecting a full cooperation from all the members of Marine.

Sincerely,

Smith Jacob HR Head Marine Company

spam1.txt:

it training tuition scholarships for college faculty, students and staff

national education foundation cyberlearning, a non-profit organization dedicated to bridging the digital divide since 1994, is offering "no excuse" tuition-free on-line training in information technology to the first 10,000 applicants.

nef cyberlearning courses, sponsored by the u.s. department of commerce in the federal learning exchange, have recently been acclaimed as "the best of the web" in online it training by the forbes magazine. two online course programs are available:

1) personal computing (300+ self-study and instructor-led courses including all microsoft office, web design, lotus notes, internet etc, tuition value of $3,000) for a $75 registration fee, the only cost.

2) information technology (650+ self-study and instructor-led courses, including the above and 350+ certification courses in microsoft, cisco, oracle, novell, web master etc, tuition value of $6,500) for a $270 registration fee, the only cost.

for either program, registration is valid through june 30, 2001 and there are no tuition costs for classes. the registrant receives free unlimited access to the courses, a vast online library, chat areas, certification skill tests and evaluations. this is an exceptional value and a great way for anyone to upgrade it skills and learn new skills.

to sign up, visit www.cyberlearning.org and click on "pc scholarships(300+ courses)" or click on "it scholarships (650+ courses)." then, complete the "teachers and others in education" application. manycolleges, schools and other educational organizations reimburse the training registration fee. over 5,000 educators, faculty and students have already registered.

to bridge the digital divide, nef also provides "no excuse" it training scholarships to disadvantaged school and college students and teachers throughout the nation.

please forward this information to all interested colleagues and others in education, colleges, universities and schools.

to unsubscribe, please reply with "unsubscibe" in the subject line.

about nef: the non-profit national education foundation cyberlearning has provided tuition-free it training to thousands of students, teachers, government and non-profit employees and disadvantaged individuals since 1994.

nef is well on its way to training 100,000 it professionals and a million disadvantaged students nationally through its "no excuse" it training program. nef has earned many distinctions including "the ivy league of it training," "1995 fairfax human rights award," and " a leader in bridging the digital divide."

"you are helping to empower america. i salute you for your ongoing commitment to creating a better america," --- president clinton

"congratulations on a wonderful program," --- congressional leader tom davis(r-va)

"this is an awesome opportunity. you are making a difference."-- washingtonjobs.com

"nef can make a positive difference in the lives of a great number of individuals." --- microsoft

" the best online it training program i have come across. i am using it to train my students in it certification," --- doug bertain, palo alto high school it teacher

" i just want to say thank you on behalf of the many people that benefit from your incredible benevolence." --- lilia nunez, a registrant and a digital divide program beneficiary

"i have found the cyberlearning online courses to be extremely easy and useful. i liked pre-course self-assessment and it books online and available 24/7. the course screens were interactive and made me feel as if i was in the application itself. the site looks and feels very professional. the list of courses is huge. it includes something for almost everyone. i find this to be a very worthy cause." --- ken horowitz, it training coordinator

spam2.txt:

one-pound-a-day diet (back by popular demand)

free delicious caesar salad recipe included in this email !

do you have an over-weight problem that you can't seem to beat? have you tried diet after diet with no results? are you too busy to buy special diet foods? do you want a simple, quick one-pound-a day diet that gets you really slim, really fast?

spam3.txt:

with the holidays approaching quickly... control your weight!! don't let your weight control you!

find out how you can lose weight naturally & effectively. enjoy a new level of energy. convert your fat into energy. new herbal formula that has virtually eliminated the need to diet! to find out more information on this incredible new product e-mail: products98@hotmail.com and type "request more info" in the subject line.

`\|||/ wishing you the best! (@@) don't let the holidays gobble you up! ooo_(_)_ooo___________happy holidays!_______________ _____|_____|_____|_____|_____|_____|_____|_____|______ ___|____|_____|_____|_____|_____|_____|_____|______|___ _____|_____please pardon the intrusion_|____|____|______ if this announcement is of no interest to you we are deeply sorry and apologize for any inconvenience. if you wish to be removed from our database please e-mail: getmeoff98@hotmail.com and type "remove" in the subject line and message body and press send!! it's that easy!! =====================================================

PROMPT:

PLEASE STREAM ALL OF THOSE FILES IN .TXT FORMAT.
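One possible way to stream the files in C++ is sketched below. It assumes the files sit in the working directory and follow the emailX.txt / spamX.txt naming shown above; the helper names `numberedFiles` and `readAll` are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds {"spam1.txt", ..., "spamN.txt"} from a prefix
// such as "spam" or "email" and a file count.
std::vector<std::string> numberedFiles(const std::string& prefix, int count) {
    std::vector<std::string> names;
    for (int i = 1; i <= count; ++i) {
        names.push_back(prefix + std::to_string(i) + ".txt");
    }
    return names;
}

// Reads everything remaining in a stream into one string.
// Works for std::ifstream as well as std::istringstream.
std::string readAll(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

To read one of the test files, open it with `std::ifstream file(name);` (from `<fstream>`), check `if (!file)` to detect a missing file, then pass `file` to `readAll`.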


Lab 2: Spam Filter

In this lab, you will implement part of a naive Bayes spam classifier. To illustrate how this filter works, consider the following email:

    Hey!
    This is the best link I found. I thought you would want to see it!
    www.somelink.com/example
    Best
    Sus

We want to classify this email as either spam or not spam. Typically, the filter will consider the entire email and look for multiple words that are common in spam emails. For our filter, we will consider a single word. For this example, we will classify the email based on the word "best". Assume the probability that any particular email is spam is 0.25, and the probability that any particular email is not spam is 0.75.

To classify the mystery email (above), we want to compute the probability that this email is spam given that it contains the word "best". Then we want to compute the probability that this email is NOT spam given that it contains the word "best". We then classify based on which probability is higher.

Let's define a couple of variables:

1. $C$: email contains the word "best"
2. $\bar{C}$: email does NOT contain the word "best"
3. $S$: email is spam
4. $\bar{S}$: email is NOT spam

Hence, we want to compute $P(S \mid C)$ and $P(\bar{S} \mid C)$.

Computing the Probability of "best"

First, we need to figure out how common "best" is in spam emails and how common "best" is in emails that are not spam. To do this, we have to use sample emails. This is called training data. For this example, we'll use the following emails.

These are the sample spam emails:

1. you've been selected as a winner! click now to get the best anti-virus scanner!
