
Question

1. Consider the following product review sample from Amazon. The variable helpful indicates the total number of helpful votes each review has received so far; score refers to the rating of the product being reviewed. Suppose your goal is to use the review text to train a model for predicting the sales of a product (which is highly correlated with its average rating on Amazon).

[Image: product review sample, not transcribed]

(1) [6 points] Which one of the two terms, flavor or buck, is more informative about review #4 in the review corpus? Justify your answer by calculating their TF-IDF scores.

Note: The TF-IDF scores should be calculated after word-stemming and removing stopwords and non-words. Assume that stopwords = c(is, and, the, of) and word-stemming replaces nouns in their plural form with their singular form. Please specify your own list of non-words.

(2) [2 points] In this analysis, should we pre-process the review data by removing numbers from the text? Why or why not? Please explain.
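
For part (1), the calculation can be sketched in Python as below. The review sample itself is not available here, so the `reviews` list is a made-up stand-in and the resulting numbers say nothing about the real review #4. The sketch assumes one common TF-IDF variant (raw term count times the natural log of N/df); a course may define it with a different log base or a normalized term frequency. The `stem` helper is a crude, hypothetical approximation of the "plural to singular" rule, and the non-word list is assumed to be anything that is not purely alphabetic (digits, punctuation).

```python
import math
import re

# Stopword list given in the question.
STOPWORDS = {"is", "and", "the", "of"}


def stem(token):
    """Hypothetical plural-to-singular rule: drop a trailing 's'.

    Assumption: only applied to words longer than 3 letters that do not
    end in 'ss'. A real stemmer would handle many more cases.
    """
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())  # digits and punctuation are dropped
    return [stem(t) for t in tokens if t not in STOPWORDS]


def tf_idf(term, doc_index, corpus):
    """Raw-count TF times ln(N / df); returns 0.0 if the term never occurs."""
    docs = [preprocess(d) for d in corpus]
    tf = docs[doc_index].count(term)
    df = sum(1 for d in docs if term in d)
    if df == 0:
        return 0.0
    return tf * math.log(len(docs) / df)


# Made-up toy corpus, NOT the review sample from the question.
reviews = [
    "Great flavors and a good price",
    "The flavor is too sweet",
    "Best bang for the buck",
    "Nice flavor, 2 bucks well spent, more bucks saved",
]

for term in ("flavor", "buck"):
    # Index 3 is the fourth review in the toy corpus.
    print(term, round(tf_idf(term, 3, reviews), 4))
```

In this toy corpus the term that is both more frequent in the fourth review and rarer across the corpus gets the higher score, which is the comparison part (1) asks for. Note that the tokenizer here already discards numbers, which is exactly the pre-processing choice part (2) asks you to justify; change the regular expression if you decide numbers should be kept.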
