Question:
The database Spambase.xlsx contains detailed information on the words used in a large number of emails. The emails are classified as Spam or Not Spam. The goal is to create a Naïve Bayes model to determine whether individual emails are spam or not.
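For reference, the standard Naïve Bayes decision rule can be written as follows. The symbols are notation introduced here, not taken from the exercise: c is a class, x_1, ..., x_n are the word-based predictors for one email, and the predictors are assumed to be conditionally independent given the class.

```latex
\hat{y} \;=\; \arg\max_{c \,\in\, \{\text{Spam},\ \text{Not Spam}\}} \; P(c) \prod_{j=1}^{n} P(x_j \mid c)
```

The prior P(c) and each conditional probability P(x_j | c) are estimated from the training partition.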
In each of the following exercises, the goal is to develop a classification or prediction model for the given situation and data. Follow these steps:
1. Examine the data descriptions and explore the data.
2. Clean the data as needed.
3. Transform the data as needed (e.g., create dummy variables for categorical variables).
4. Partition the data.
5. Run the specified algorithm.
6. Interpret the results and choose the best model (e.g., choose the best k in the k-Nearest Neighbor method).
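The steps above can be sketched in code. The textbook works in spreadsheets, so the following Python is a swapped-in illustration rather than the book's method. It is a minimal sketch under several assumptions: the word-frequency columns are reduced to presence/absence flags, the partition is 60% training and 40% validation, and probabilities are Laplace-smoothed. All function names are hypothetical, the toy records are made up for illustration and are not from Spambase.xlsx, and loading the workbook is left out.

```python
import math
import random

def binarize(rows):
    """Step 3: turn word-frequency values into presence/absence flags."""
    return [[1 if value > 0 else 0 for value in row] for row in rows]

def partition(features, labels, train_fraction=0.6, seed=0):
    """Step 4: shuffle the records and split into training and validation sets."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[:cut]), pick(indices[cut:])

def fit_naive_bayes(features, labels, alpha=1.0):
    """Step 5: estimate log priors and smoothed word-presence probabilities."""
    n_features = len(features[0])
    model = {}
    for cls in sorted(set(labels)):
        rows = [f for f, y in zip(features, labels) if y == cls]
        prior = len(rows) / len(features)
        # Laplace smoothing (assumed) keeps each probability strictly in (0, 1).
        present = [
            (sum(row[j] for row in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (math.log(prior), present)
    return model

def predict(model, row):
    """Return the class with the highest log posterior score for one record."""
    best_cls, best_score = None, -math.inf
    for cls, (log_prior, present) in model.items():
        score = log_prior
        for x, p in zip(row, present):
            score += math.log(p if x else 1.0 - p)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls

def accuracy(model, features, labels):
    """Step 6: fraction of records classified correctly."""
    hits = sum(predict(model, f) == y for f, y in zip(features, labels))
    return hits / len(labels)

# Made-up toy records (columns: frequency of "free", frequency of "meeting").
toy_rows = [[0.5, 0.0], [1.2, 0.0], [0.3, 0.0],
            [0.0, 0.8], [0.0, 0.4], [0.0, 1.1]]
toy_labels = ["Spam", "Spam", "Spam", "Not Spam", "Not Spam", "Not Spam"]
toy_model = fit_naive_bayes(binarize(toy_rows), toy_labels)
```

With the real data, one would call partition on the binarized records, fit on the training part, and compare accuracy on the validation part across any modeling choices being considered (for example, different binning thresholds).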
Step by Step Answer:
Management Science The Art Of Modeling With Spreadsheets
ISBN: 1301
4th Edition
Authors: Stephen G. Powell, Kenneth R. Baker