Question

DATA ANALYTICS

[Image: table of the bag-of-words representation for the five emails; not transcribed]

(Exercise 2 of chapter 5) Email spam filtering models often use a bag-of-words representation for emails. In a bag-of-words representation, the descriptive features that describe a document (in our case, an email) each represent how many times a particular word occurs in the document. One descriptive feature is included for each word in a predefined dictionary. The dictionary is typically defined as the complete set of words that occur in the training dataset. The table below lists the bag-of-words representation for the following five emails and a target feature, SPAM, indicating whether they are spam emails or genuine emails:

(1) "money, money, money"
(2) "free money for free gambling fun"
(3) "gambling for fun"
(4) "machine learning for fun, fun, fun"
(5) "free machine learning"

(a) What target level would a nearest neighbor model using Euclidean distance return for the following email: "machine learning for free"?

(b) What target level would a k-NN model with k = 3 and using Euclidean distance return for the same query?

(c) What target level would a weighted k-NN model with k = 5, using a weighting scheme of the reciprocal of the squared Euclidean distance between the neighbor and the query, return for the query?

(d) What target level would a k-NN model with k = 3 and using Manhattan distance return for the same query?

(e) There are a lot of zero entries in the spam bag-of-words dataset. This is indicative of sparse data and is typical for text analytics. Cosine similarity is often a good choice when dealing with sparse non-binary data. What target level would a 3-NN model using cosine similarity return for the query?
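
The distance calculations the question asks for can be sketched in Python as follows. This is a minimal illustration, not the textbook's solution. It rebuilds the bag-of-words vectors from the five quoted emails, assuming email (3) reads "gambling for fun" and that the dictionary is simply the sorted set of training words. The helper names (tokenize, bag_of_words, knn_vote) are hypothetical. The SPAM values come from the table, which is not reproduced here, so the voting helper takes them as an argument rather than hard-coding them.

```python
import math
import re
from collections import Counter

# Training emails as quoted in the exercise
# (assumption: email 3 is read as "gambling for fun").
EMAILS = [
    "money, money, money",
    "free money for free gambling fun",
    "gambling for fun",
    "machine learning for fun, fun, fun",
    "free machine learning",
]
QUERY = "machine learning for free"


def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Dictionary: every word in the training emails, in a fixed (sorted) order.
VOCAB = sorted({word for email in EMAILS for word in tokenize(email)})


def bag_of_words(text):
    """Count how often each dictionary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in VOCAB]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def knn_vote(scores, labels, k, weight=None, largest=False):
    """Return the winning label among the k nearest neighbours.

    scores  -- one distance (or similarity) per training email
    labels  -- the target level of each training email (from the table)
    weight  -- optional function turning a score into a vote weight;
               1 / d**2 fails for d == 0, so guard exact matches yourself
    largest -- set True when a higher score means closer (cosine similarity)
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=largest)[:k]
    votes = Counter()
    for score, label in ranked:
        votes[label] += weight(score) if weight else 1.0
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    q = bag_of_words(QUERY)
    print("dictionary:", VOCAB)
    for i, email in enumerate(EMAILS, start=1):
        v = bag_of_words(email)
        print(i, v,
              round(euclidean(v, q), 3),
              manhattan(v, q),
              round(cosine_similarity(v, q), 3))
```

To answer each part, pass the SPAM column from the table as labels. For example, part (c) would be knn_vote(dists, labels, k=5, weight=lambda d: 1 / d**2) with dists holding the Euclidean distances, and part (e) would use the cosine similarities with k=3 and largest=True. Under these assumptions, email (5) is the closest to the query on all three measures, at Euclidean and Manhattan distance 1.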
