Question
Naive Bayes Classifier Overview:
In this assignment we will classify an email as spam or legit. Specifically, emails from Enron that have been made publicly available have been recoded into a term-document matrix that shows each email and the words that appear in it (something we'll learn more about in text mining). I have provided you code to get started. Please follow the directions below (and the code) to create a naive Bayes classifier model to predict whether an email is legit or spam.

1. Download email.csv from Canvas and import it into a dataset named email (the provided code refers to it as email_df).
2. Review the term-document matrix. How many columns does it have?
3. Run the following code to see the top words used in the spam emails:

    spam_df = email_df.loc[email_df['message_label'] == 'spam']
    spam_totals = spam_df.groupby('message_label').sum()
    spam_totals = spam_totals.drop('message_index', axis=1)
    spam_totals.T.sort_values(by='spam', ascending=False).head(10)

4. Modify the code from step 3 to identify the top words in legit emails. What is the top appearing word in a spam email? What is the top appearing word in a legit email?
5. Convert the dependent variable message_label to a 1/0 categorical outcome.
6. Run the following code to transform the binary classification into categorical variables:

    word_list = email_df.columns
    for col in [word_list]:
        email_df[col] = email_df[col].astype('category')

7. Split the data into 75% training and 25% validation using random_state = 2 and stratify = y.
8. Is there a proportion imbalance?
9. Create a naive Bayes classifier to predict message_label on the training set.
10. Using the predict function and the confusion matrix/classification report, what is the overall accuracy of the model on the training and validation sets?
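The steps above can be sketched end to end as follows. This is a minimal illustration, not the assignment's answer: email.csv is not available here, so the code builds a small synthetic term-document matrix whose words, probabilities, and row counts are all made up. It also assumes scikit-learn and picks BernoulliNB as one reasonable naive Bayes variant for 0/1 word indicators; the assignment does not say which variant to use. Only the column names message_index and message_label come from the provided code.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# Synthetic stand-in for email.csv (hypothetical words and probabilities).
rng = np.random.default_rng(0)
n_per_class = 100
words = ["free", "money", "meeting", "report"]
spam_probs = [0.8, 0.7, 0.1, 0.2]    # chance each word appears in a spam email
legit_probs = [0.1, 0.2, 0.7, 0.8]   # chance each word appears in a legit email


def make_rows(probs, label):
    """Build n_per_class rows of 0/1 word indicators for one class."""
    frame = pd.DataFrame(
        {w: rng.binomial(1, p, n_per_class) for w, p in zip(words, probs)}
    )
    frame["message_label"] = label
    return frame


email_df = pd.concat(
    [make_rows(spam_probs, "spam"), make_rows(legit_probs, "legit")],
    ignore_index=True,
)
email_df.insert(0, "message_index", range(len(email_df)))

# Step 2: number of columns in the term-document matrix.
n_columns = email_df.shape[1]

# Steps 3-4: total word counts per class, then the most frequent word in each.
word_totals = email_df.drop(columns="message_index").groupby("message_label").sum()
top_spam_word = word_totals.loc["spam"].idxmax()
top_legit_word = word_totals.loc["legit"].idxmax()

# Step 5: encode the outcome as 1 (spam) / 0 (legit).
y = (email_df["message_label"] == "spam").astype(int)
X = email_df[words]

# Step 7: 75/25 split with the seed and stratification the assignment asks for.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=2, stratify=y
)

# Step 8: class proportions show whether the outcome is imbalanced.
class_share = y.value_counts(normalize=True)

# Step 9: fit the classifier on the training set only.
model = BernoulliNB()
model.fit(X_train, y_train)

# Step 10: accuracy on both sets, plus the validation confusion matrix and report.
train_acc = accuracy_score(y_train, model.predict(X_train))
valid_acc = accuracy_score(y_valid, model.predict(X_valid))
print(class_share)
print(confusion_matrix(y_valid, model.predict(X_valid)))
print(classification_report(y_valid, model.predict(X_valid), target_names=["legit", "spam"]))
print(f"train accuracy: {train_acc:.3f}, validation accuracy: {valid_acc:.3f}")
```

With the real data, replace the synthetic block with pd.read_csv('email.csv') and select every word column as X. If you keep the category conversion from step 6, CategoricalNB is another variant worth considering, though whether it is the intended one is not stated in the assignment.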