Question

Please respond, it's urgent. The code is needed in C++.

The spam filter is shown in the picture below:

2 Spam filter: Basic version

The goal of this application is to implement a spam filter. Many of the emails we receive nowadays advertise various products or spread false information, and we would like to exclude these emails from our inbox. The first and simplest version of the spam filter is a task for Member 1.

The filter works by scanning a given email and looking for certain keywords or phrases that might indicate that the email is spam. The table below lists the keyword phrases. Each keyword phrase has an associated score, and every mention of a keyword phrase increases the total score by that amount. If the total score is under 10, you should output NOT SPAM. If the score is at least 10 and less than 50, you should output MAYBE SPAM, and if the score is at least 50, you should output DEFINITELY SPAM. See the table below:

Keyword                      Score
Earn per week                [illegible]
Double your                  [illegible]
Income [rest illegible]      [illegible]
Trial that lasts forever     15
Opportunity                  [illegible]
Month free trial             15

So, for example, the email that says "Double your amazing opportunity for a great income and lots of cash cash cash" should be classified as DEFINITELY SPAM because its score is 20 + 10 + 10 + 20 + 20 + 20 = 100. On the other hand, an email that says "Does your income matter more than your love life?" should be classified as MAYBE SPAM because its score is 10 + 25 = 35.

Please keep the following remarks in mind. Implementing this part of the functionality and making sure it works is the responsibility of Member 2:

1. A multi-word phrase counts toward the score only if the entire phrase appears. A phrase has to be whole to contribute to the total score.
2. The spam filter is not case-sensitive. This means that "earn per week" counts the same as "EARN PER WEEK" or any mixed-case spelling.
3. A keyword or phrase contributes to the score for every occurrence in the email. So if the keyword "cash", as in the example above, occurs three times, it contributes to the score every time.
4. Notice that some phrases end in the same way in which other phrases begin, so some phrases share words. Because we can take into account only entire phrases, you should always prefer to end the first phrase rather than begin the next phrase. For example, if we have an email that says "Month free Trial that lasts forever", the total score should be 15 (only counting "Month free trial").

There should always be at least one empty space between different phrases, so the email "$$$" should have the score 0 because the keyword "$$$" does not exist, and there is no space between the consecutive dollar signs to give credit to each occurrence of "$".

3.1 Fuzziness

In working with strings, fuzziness means tolerance for typos. For example, if you receive an email that reads "Duble your amezing oportunity for a gret incom and lats of cesh cesh cesh", it is still very likely spam, only written by someone very bad at spelling (or someone trying to circumvent simple filters like the one in our first task). An advanced filter should be able to catch this type of spam by computing something called the Levenshtein distance. Your advanced filter should tolerate words that are at Levenshtein distance 1 from the desired word. A Levenshtein distance of 1 between two strings can happen in the following three ways:

- A letter is substituted for another letter, for example (income -> incone)
- A letter is missing from the original word, for example (cash -> csh)
- There is an extra letter in the word, for example (love -> lorve)

Build a spam filter that allows a Levenshtein distance of 1 for each word in the email. For example, "Duble your amezing oportunity for a sgreat incom and lats of cash csh casch" will have the same score as the original phrase, as every single word has a Levenshtein distance that is not larger than 1. To make the concept clearer, here are some more examples:

- "Month free triels" should receive a score of 0 because "triels" is more than distance 1 away from the word "trial" (it has both a substitution and an extra letter), so we cannot count that word, and consequently the entire phrase.
- "Yours Love lifes opportuniti" should receive a score of 25 + 10 = 35.
- "$$ per week" should receive a score of 10 because "$$" is the string "$" with an extra letter.
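The basic version (section 2) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive solution: the names Keyword, keywordTable, tokenize, scoreEmail and classify are hypothetical, the keyword table is partial and uses scores inferred from the worked examples (20 for "double your", 10 for "opportunity" and "income", 20 for "cash", 25 for "love life"), and stripping punctuation other than '$' from word edges is an added assumption so that "life?" matches "life". Greedy longest-match scanning is one way to honor remark 4.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One keyword phrase, lower-cased and split into words, with its score.
struct Keyword {
    std::vector<std::string> words;
    int score;
};

// ASSUMPTION: partial table; scores not legible in the question are
// inferred from its worked examples and may differ from the real table.
const std::vector<Keyword>& keywordTable() {
    static const std::vector<Keyword> table = {
        {{"trial", "that", "lasts", "forever"}, 15},
        {{"month", "free", "trial"}, 15},
        {{"double", "your"}, 20},
        {{"love", "life"}, 25},
        {{"opportunity"}, 10},
        {{"income"}, 10},
        {{"cash"}, 20},
    };
    return table;
}

// Split on whitespace, lower-case each word (remark 2), and strip
// punctuation other than '$' from both ends (an assumption).
std::vector<std::string> tokenize(const std::string& email) {
    auto strippable = [](char c) {
        return std::ispunct(static_cast<unsigned char>(c)) != 0 && c != '$';
    };
    std::vector<std::string> tokens;
    std::istringstream in(email);
    std::string word;
    while (in >> word) {
        for (char& c : word)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        while (!word.empty() && strippable(word.back())) word.pop_back();
        while (!word.empty() && strippable(word.front())) word.erase(word.begin());
        if (!word.empty()) tokens.push_back(word);
    }
    return tokens;
}

// Scan left to right. At each position take the longest whole phrase that
// matches, add its score, and skip past it, so a finished phrase never
// shares words with the next one (remarks 1, 3 and 4).
int scoreEmail(const std::string& email) {
    const std::vector<std::string> tokens = tokenize(email);
    int total = 0;
    std::size_t i = 0;
    while (i < tokens.size()) {
        const Keyword* best = nullptr;
        for (const Keyword& k : keywordTable()) {
            if (k.words.size() > tokens.size() - i) continue;
            auto from = tokens.begin() + static_cast<std::ptrdiff_t>(i);
            if (!std::equal(k.words.begin(), k.words.end(), from)) continue;
            if (best == nullptr || k.words.size() > best->words.size()) best = &k;
        }
        if (best != nullptr) {
            total += best->score;
            i += best->words.size();
        } else {
            ++i;
        }
    }
    return total;
}

// Thresholds from the question: under 10, 10 to 49, 50 and above.
std::string classify(int score) {
    if (score < 10) return "NOT SPAM";
    if (score < 50) return "MAYBE SPAM";
    return "DEFINITELY SPAM";
}
```

With this assumed table, classify(scoreEmail(text)) gives DEFINITELY SPAM for the "Double your amazing opportunity ..." example and MAYBE SPAM for the "love life" example, and "Month free Trial that lasts forever" scores 15 because "trial" is consumed by the first phrase.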
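For the fuzzy version (section 3.1), the filter only needs to know whether two words are at Levenshtein distance 0 or 1, so a full dynamic-programming distance is not required. The sketch below is one possible check, assuming both words are already lower-cased; the names withinOneEdit and phraseMatchesAt are hypothetical and not part of the assignment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if `a` and `b` are at Levenshtein distance 0 or 1, i.e. they are equal
// or differ by exactly one substitution, one missing letter, or one extra letter.
bool withinOneEdit(const std::string& a, const std::string& b) {
    const std::string& shorter = a.size() <= b.size() ? a : b;
    const std::string& longer = a.size() <= b.size() ? b : a;
    if (longer.size() - shorter.size() > 1) return false;

    // Skip the common prefix.
    std::size_t i = 0;
    while (i < shorter.size() && shorter[i] == longer[i]) ++i;

    // The shorter word is a prefix of the longer one: equal, or one extra
    // letter at the end.
    if (i == shorter.size()) return true;

    if (shorter.size() == longer.size()) {
        // Same length: treat position i as a substitution, the rest must agree.
        return shorter.compare(i + 1, std::string::npos,
                               longer, i + 1, std::string::npos) == 0;
    }
    // Lengths differ by one: drop longer[i], the rest must agree.
    return shorter.compare(i, std::string::npos,
                           longer, i + 1, std::string::npos) == 0;
}

// True if every word of `phrase` is within one edit of the corresponding
// token starting at index `start`. A phrase counts only when all of its
// words pass, which mirrors the "Month free triels" example.
bool phraseMatchesAt(const std::vector<std::string>& tokens, std::size_t start,
                     const std::vector<std::string>& phrase) {
    if (start > tokens.size() || phrase.size() > tokens.size() - start) return false;
    for (std::size_t j = 0; j < phrase.size(); ++j) {
        if (!withinOneEdit(tokens[start + j], phrase[j])) return false;
    }
    return true;
}
```

A fuzzy scorer could call phraseMatchesAt in place of an exact word-by-word comparison. One caveat of this design: very short keywords become permissive, since any one-character word is within one edit of "$".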
