
Question



A collection of reviews about comedy movies (data D) contains the following keywords and binary labels for whether each movie was funny (+) or not funny (-). The data are shown below: for example, the cell at the intersection of "Review 1" and "laugh" indicates that the text of Review 1 contains 2 tokens of the word "laugh."

Review laugh hilarious awesome dull yawn bland | Y
1 1 1 0 + 2 0 0 0 + 0 0 0 1 + 0 2 1 0 - 1 2 0

You may find it easier to complete this problem if you copy the data into a spreadsheet and use formulas for calculations, rather than doing calculations by hand. Please report all scores as log-probabilities, with 3 significant figures. (10 pts)

(a) Assume that you have trained a Naive Bayes model on data D to detect funny vs. not funny movie reviews. Compute the model's predicted scores for funny and not funny for the following sentence S (i.e. P(+|S) and P(-|S)), and determine which label the model will apply to S. (4 pts)

S: "This film was hilarious! I didn't yawn once. Not a single bland moment. Every minute was a laugh."

(b) The counts in the original data are sparse and may lead to overfitting, e.g. a strong prior on assigning the "not funny" label to reviews that contain "yawn." What would happen if you applied smoothing? Apply add-1 smoothing and recompute the Naive Bayes model's predicted scores for S. Did the label change? (4 pts)

(c) What is an additional feature that you could extract from text to improve the classification of sentences like S, and how would it help improve the classification? (2 pts)
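The scoring asked for in parts (a) and (b) can be sketched in Python as a multinomial Naive Bayes computation. This is a minimal sketch under stated assumptions, not a worked answer: natural logarithms are assumed (the problem does not name a base), the scores are unnormalized, i.e. log P(label) + log P(S | label), and the helper names `keyword_counts` and `nb_log_scores` are hypothetical. The counts in the usage section are placeholders chosen for illustration and are NOT the values from the table in the problem; only the keyword list and the sentence S come from the question.

```python
import math
import re
from collections import Counter

# Keyword vocabulary and sentence S, both taken from the problem statement.
KEYWORDS = ["laugh", "hilarious", "awesome", "dull", "yawn", "bland"]
S = ("This film was hilarious! I didn't yawn once. "
     "Not a single bland moment. Every minute was a laugh.")


def keyword_counts(text, vocab=KEYWORDS):
    """Count how often each vocabulary word occurs in text (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in vocab)
    return {w: counts[w] for w in vocab}


def nb_log_scores(class_word_counts, class_doc_counts, query_counts, k=0.0):
    """Return {label: log P(label) + sum_w n_w * log P(w | label)}.

    class_word_counts: {label: {word: token count over that label's reviews}};
                       every label's dict is assumed to list the full vocabulary.
    class_doc_counts:  {label: number of reviews with that label}.
    query_counts:      {word: count in the sentence being scored}.
    k:                 add-k smoothing constant (k=0 is unsmoothed, k=1 is add-1).
    """
    total_docs = sum(class_doc_counts.values())
    scores = {}
    for label, word_counts in class_word_counts.items():
        vocab_size = len(word_counts)
        total_tokens = sum(word_counts.values())
        score = math.log(class_doc_counts[label] / total_docs)  # log prior
        for word, n in query_counts.items():
            if n == 0:
                continue
            p = (word_counts[word] + k) / (total_tokens + k * vocab_size)
            # A zero probability drives the whole score to -infinity.
            score += n * math.log(p) if p > 0 else float("-inf")
        scores[label] = score
    return scores


# Usage with PLACEHOLDER counts over a two-word toy vocabulary
# (hypothetical values, not the table from the problem).
toy_word_counts = {"+": {"a": 3, "b": 1}, "-": {"a": 0, "b": 2}}
toy_doc_counts = {"+": 2, "-": 2}
toy_query = {"a": 1, "b": 1}

unsmoothed = nb_log_scores(toy_word_counts, toy_doc_counts, toy_query, k=0.0)
smoothed = nb_log_scores(toy_word_counts, toy_doc_counts, toy_query, k=1.0)

print(keyword_counts(S))
print({label: f"{s:.3g}" for label, s in unsmoothed.items()})
print({label: f"{s:.3g}" for label, s in smoothed.items()})
```

With the actual table entered as `class_word_counts` and `class_doc_counts`, the predicted label is the one with the larger score. Comparing the `k=0.0` and `k=1.0` runs shows the effect that part (b) asks about: without smoothing, any keyword never seen with a label forces that label's score to negative infinity.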

