Question: This exercise concerns the classification of spam email. Create a corpus of spam email and one of non-spam mail. Examine each corpus and decide what

This exercise concerns the classification of spam email. Create a corpus of spam email and one of non-spam mail. Examine each corpus and decide what features appear to be useful for classification: unigram words? bigrams? message length, sender, time of arrival? Then train a classification algorithm (decision tree, naive Bayes, SVM, logistic regression, or some other algorithm of your choosing) on a training set and report its accuracy on a test set.

Step by Step Solution

3.38 Rating (160 Votes )

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock

To produce a corpus of spam emails and nonspam emails we can either collect emails ourselves or use being datasets There are several intimately availa... View full answer

blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Artificial Intelligence A Modern approach Questions!