Question

Q1. You are performing text mining on a customer review dataset containing 200 customer reviews. Answer the following questions:
1. Suppose each review was limited to no more than 50 words. In the term-document matrix, which dimension is more likely to be larger, the number of documents or the number of terms? Explain your choice in one sentence.
2. You are considering using stemming or lemmatization to process the review text. The term 'increasing' appears in many reviews. What are the results of stemming and of lemmatizing this term, respectively?
3. In addition to the review text data, each customer also provided a rating score, with 1-star representing poor and 5-star representing excellent. Suppose your text mining task is to predict ratings based on the customer reviews. Which of the three techniques below is NOT appropriate for your task? Choose only one answer.
(i) J48 decision tree algorithm
(ii) support vector regression
(iii) k-means algorithm
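
To make the term-document matrix in part 1 concrete, here is a minimal sketch in plain Python. The three toy reviews, the whitespace tokenizer, and the helper name `build_term_document_matrix` are assumptions made for illustration only; they are not part of the original dataset or any particular library. The sketch builds a matrix with one row per distinct term and one column per document, so the two dimensions can be compared directly.

```python
from collections import Counter


def build_term_document_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] is the count of term i in document j.

    Hypothetical helper for illustration: tokenization is a simple
    lowercase whitespace split, with no stemming or stop-word removal.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary is the set of distinct tokens across all documents.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


# Toy stand-ins for customer reviews (assumed data, not from the question).
reviews = [
    "great product fast shipping",
    "poor quality slow shipping",
    "great quality great price",
]

vocabulary, matrix = build_term_document_matrix(reviews)
n_terms, n_docs = len(matrix), len(matrix[0])
print(f"terms: {n_terms}, documents: {n_docs}")
```

Even in this tiny example the number of distinct terms exceeds the number of documents, because every review can contribute several new words to the vocabulary. Whether the same holds for the 200 real reviews depends on how varied their wording is, so treat this as an illustration of the structure rather than a computed answer.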
