Question

Write a program that will read the contents of a series of emails and determine which emails should be considered spam. The analysis will be printed in a summary report that is written to a file.

Each email message will contain:

- Sender
- Recipients
- Subject
- Message body
- ---eom--- (this exact String will be on a line of its own and designates the end of the message)

This program will use line-based file processing to access each line in the email message, but will need to use token-based processing to analyze the message body for each email.
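As a rough illustration of mixing the two styles, the sketch below reads one email line by line and then splits each body line into tokens with a second Scanner. The class and method names (EmailReaderSketch, readBodyTokens) are hypothetical, and the code assumes the sender, recipients, and subject each occupy exactly one line, which the assignment implies but does not state outright.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; the names here are not part of the assignment.
class EmailReaderSketch {
    static final String END_OF_MESSAGE = "---eom---";

    // Consumes one email from a line-based Scanner and returns its body tokens.
    // Assumption: the first three lines are sender, recipients, and subject.
    static List<String> readBodyTokens(Scanner lines) {
        lines.nextLine(); // sender
        lines.nextLine(); // recipients
        lines.nextLine(); // subject
        List<String> tokens = new ArrayList<>();
        while (lines.hasNextLine()) {
            String line = lines.nextLine();
            if (line.equals(END_OF_MESSAGE)) {
                break; // end of this message
            }
            // Token-based processing: a second Scanner over the single line.
            Scanner words = new Scanner(line);
            while (words.hasNext()) {
                tokens.add(words.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "a@example.com\nb@example.com\nHello\n"
                + "Claim your lottery\nprize now\n---eom---\n";
        System.out.println(readBodyTokens(new Scanner(sample)));
    }
}
```

The outer Scanner keeps track of message boundaries, while the inner one only ever sees a single body line, so the end-of-message marker is never mistaken for a token.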

The contents of all emails (one after the other) will be stored in a file called emails.txt. Each email will be analyzed to determine how likely it is to be spam. Our program is not very smart, so it simply counts the number of times that a spam-like word appears in the email. Words to look for include:

- offer
- wire
- bank
- fund
- transfer
- lottery

This program will count the number of occurrences of these keywords in a single email. Note that keyword searching should be case-insensitive, and a keyword may appear as part of a larger word ("fund" in "Fundraising" counts as an occurrence). Create a class constant at the top of your program for the spam keyword threshold. If the number of spam keywords for an email is greater than or equal to the threshold, then that message should be considered spam.
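One possible way to do the counting is to lowercase each token and search it with String.indexOf, as sketched below. The names SpamKeywordSketch, countKeywords, and isSpam are hypothetical, and the threshold value of 3 is only a placeholder because the assignment does not give one. The sketch also assumes that a keyword repeated inside one token counts once per occurrence.

```java
import java.util.Locale;

// Hypothetical class; names and the threshold value are illustrative only.
class SpamKeywordSketch {
    // Class constant for the spam threshold (placeholder value, not from the assignment).
    static final int SPAM_THRESHOLD = 3;

    static final String[] KEYWORDS = {
        "offer", "wire", "bank", "fund", "transfer", "lottery"
    };

    // Counts keyword occurrences in one token, ignoring case and
    // allowing a keyword to appear inside a larger word.
    static int countKeywords(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String keyword : KEYWORDS) {
            int from = lower.indexOf(keyword);
            while (from >= 0) {
                count++;
                // Continue searching after the end of this match.
                from = lower.indexOf(keyword, from + keyword.length());
            }
        }
        return count;
    }

    // An email is spam when its count is greater than or equal to the threshold.
    static boolean isSpam(int keywordCount) {
        return keywordCount >= SPAM_THRESHOLD;
    }
}
```

For example, countKeywords("Fundraising") finds "fund" once, and countKeywords("WIRE-transfer") finds both "wire" and "transfer".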

As the program analyzes each email, it should print the summary to a new file called summary.txt using a PrintStream. The summary should include the subject of each email; however, if an email is deemed spam, the marker **SPAM** should appear in front of the subject.

In order to print the subject of each email, the program will need to "remember" this information from the beginning of the message until after the entire message is processed (the ---eom--- is reached). Lastly, it should print a count of the number of emails analyzed.
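Putting the pieces together, a minimal sketch of the whole flow might look like the following. It is not the official solution: the class and method names, the threshold value of 3, and the exact wording of the final count line are all assumptions. It also assumes well-formed input (three header lines per email, then the body, then ---eom---) and counts keywords only in the message body.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical end-to-end sketch; names and output wording are assumptions.
class SpamSummarySketch {
    static final int SPAM_THRESHOLD = 3; // placeholder value
    static final String END_OF_MESSAGE = "---eom---";
    static final String[] KEYWORDS = {
        "offer", "wire", "bank", "fund", "transfer", "lottery"
    };

    // Case-insensitive count of keyword occurrences inside one token.
    static int countKeywords(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String keyword : KEYWORDS) {
            int from = lower.indexOf(keyword);
            while (from >= 0) {
                count++;
                from = lower.indexOf(keyword, from + keyword.length());
            }
        }
        return count;
    }

    // Reads every email from input, writes the summary to out,
    // and returns the number of emails analyzed.
    static int writeSummary(Scanner input, PrintStream out) {
        int emailCount = 0;
        while (input.hasNextLine()) {
            input.nextLine();                  // sender (not needed for the summary)
            input.nextLine();                  // recipients
            String subject = input.nextLine(); // remembered until ---eom---
            int keywordCount = 0;
            while (input.hasNextLine()) {
                String line = input.nextLine();
                if (line.equals(END_OF_MESSAGE)) {
                    break;
                }
                Scanner words = new Scanner(line); // token-based pass over the body line
                while (words.hasNext()) {
                    keywordCount += countKeywords(words.next());
                }
            }
            String marker = keywordCount >= SPAM_THRESHOLD ? "**SPAM** " : "";
            out.println(marker + subject);
            emailCount++;
        }
        out.println("Emails analyzed: " + emailCount);
        return emailCount;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File emails = new File("emails.txt");
        if (!emails.exists()) {
            System.err.println("emails.txt not found");
            return;
        }
        try (Scanner input = new Scanner(emails);
             PrintStream out = new PrintStream(new File("summary.txt"))) {
            writeSummary(input, out);
        }
    }
}
```

Holding the subject in a local variable until the end-of-message marker is reached is what lets the spam marker be printed in front of it, since the verdict is only known after the whole body has been scanned.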

