Question

Need help with Python code, please. Question 2: Implement a Naive Bayes classifier, naiveBayes_classify(word_probs, message), for classifying an email message as spam or non-spam.

Question 2

Implement a Naive Bayes classifier, naiveBayes_classify(word_probs, message), that classifies an email message as spam or non-spam using the word probability distributions, word_probs, learned from a set of training data. In this question, you are asked to implement the Naive Bayes method from scratch by implementing the following functions. To simplify the implementation, we assume that any message is equally likely to be spam or non-spam.

- tokenize(message): extracts the set of unique words from the given text message.
- count_words(training_set): creates a dictionary mapping each unique word to its frequencies in the spam and non-spam messages of the training set.
- word_probabilities(counts, total_spams, total_non_spams, k=0.5): turns the word counts into a list of triplets (w, p(w | spam), p(w | non-spam)).
- spam_probability(word_probs, message, total_spams, total_non_spams, k=0.5): computes the probability that the given message is spam.
- naiveBayes_classify(word_probs, message, total_spams, total_non_spams, k): classifies the message as spam or ham.

Use the data set spam.csv to evaluate the classifier in terms of accuracy, recall, precision, and F1-score.

Implement the following function:

    def spam_probability(word_probs, message, total_spams, total_non_spams, k=0.5):
        """
        Computes the probability of spam for the given message.

        INPUT:
            word_probs: a list of triplets (w, p(w | spam), p(w | non-spam))
            message: the message to classify
        OUTPUT:
            the probability that the message is spam
        HINTS:
            First, get the set of unique words in the message.
            Second, sum up the log probabilities of the unique words in the message.
            Third, get probabilities by taking exponentials of the summed log
            probabilities (for spam and non-spam).
            Finally, return the ratio of the probability of spam over the sum of
            the probability of spam and the probability of non-spam.
        """
        ###### YOUR CODE HERE ######
        return prob_spam / (prob_spam + prob_ham)
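
The hints can be written out as a short derivation, which shows why the final ratio is the spam probability. The symbols W(m) (the set of unique words in message m), \ell_s, and \ell_h are notation introduced here for illustration and do not appear in the assignment. The sketch follows the hints literally and sums only over the words present in the message.

```latex
\begin{align*}
\ell_s &= \sum_{w \in W(m)} \log p(w \mid \text{spam}) \\
\ell_h &= \sum_{w \in W(m)} \log p(w \mid \text{non-spam}) \\
P(\text{spam} \mid m)
  &= \frac{P(m \mid \text{spam})\, P(\text{spam})}
          {P(m \mid \text{spam})\, P(\text{spam}) + P(m \mid \text{non-spam})\, P(\text{non-spam})} \\
  &= \frac{e^{\ell_s}}{e^{\ell_s} + e^{\ell_h}}
\end{align*}
```

The last step uses the stated assumption that P(spam) = P(non-spam) = 1/2, so the priors cancel, and the naive independence assumption that P(m | spam) is the product of the per-word probabilities, i.e. e^{\ell_s}.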

Step by Step Solution
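
What follows is a minimal sketch of the functions the question asks for, not a definitive implementation. Several details are assumptions that the question does not specify: the training set is taken to be an iterable of (message, is_spam) pairs, tokenization is a simple lowercase regex, and a word never seen in training is given the smoothed zero-count probability k / (total + 2k). The helper evaluate and its positive parameter are hypothetical names added for the metrics part. Loading spam.csv is left out because its column layout is not given.

```python
import math
import re
from collections import defaultdict


def tokenize(message):
    """Return the set of unique lowercase words in the message."""
    return set(re.findall(r"[a-z0-9']+", message.lower()))


def count_words(training_set):
    """Map each word to [spam_count, non_spam_count].

    Assumption: training_set is an iterable of (message, is_spam) pairs.
    """
    counts = defaultdict(lambda: [0, 0])
    for message, is_spam in training_set:
        for word in tokenize(message):
            counts[word][0 if is_spam else 1] += 1
    return counts


def word_probabilities(counts, total_spams, total_non_spams, k=0.5):
    """Return triplets (w, p(w | spam), p(w | non-spam)) smoothed by k."""
    return [
        (w, (spam + k) / (total_spams + 2 * k), (ham + k) / (total_non_spams + 2 * k))
        for w, (spam, ham) in counts.items()
    ]


def spam_probability(word_probs, message, total_spams, total_non_spams, k=0.5):
    """Probability that the message is spam, assuming equal priors."""
    lookup = {w: (p_spam, p_ham) for w, p_spam, p_ham in word_probs}
    # Assumption: unseen words get the smoothed probability of a zero count.
    unseen = (k / (total_spams + 2 * k), k / (total_non_spams + 2 * k))
    log_spam = log_ham = 0.0
    for word in tokenize(message):
        p_spam, p_ham = lookup.get(word, unseen)
        log_spam += math.log(p_spam)
        log_ham += math.log(p_ham)
    # Shift both sums by their maximum before exponentiating; the ratio is
    # unchanged, but long messages no longer underflow to 0 / 0.
    shift = max(log_spam, log_ham)
    prob_spam = math.exp(log_spam - shift)
    prob_ham = math.exp(log_ham - shift)
    return prob_spam / (prob_spam + prob_ham)


def naiveBayes_classify(word_probs, message, total_spams, total_non_spams, k=0.5):
    """Label the message "spam" if its spam probability exceeds 0.5 (assumed cutoff)."""
    p = spam_probability(word_probs, message, total_spams, total_non_spams, k)
    return "spam" if p > 0.5 else "ham"


def evaluate(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall and F1 for non-empty label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up toy data for illustration only; it is not taken from spam.csv.
train = [
    ("win cash now", True),
    ("cheap cash prize", True),
    ("meeting at noon", False),
    ("lunch at noon tomorrow", False),
]
probs = word_probabilities(count_words(train), total_spams=2, total_non_spams=2)
print(naiveBayes_classify(probs, "cash prize", 2, 2))
print(naiveBayes_classify(probs, "lunch at noon", 2, 2))
```

For the real evaluation, one would split spam.csv into training and test messages, build word_probs from the training part only, classify each test message, and pass the true and predicted labels to evaluate. Note that k must be positive here, since k = 0 would make math.log fail on words seen in only one class.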

