Question

In python, please.

Write a program that shall calculate word sentiment level, based on the user reviews from the Yelp academic dataset.

The attached file has 156,602 reviews written by Yelp members (the original dataset has 1,569,265 reviews). Each review has a text fragment and a star rating on a scale from 1 (worst) to 5 (best). We assume that the words predominantly used in "bad" reviews are "bad" and the words predominantly used in "good" reviews are "good." The measure of the sentiment level of a word, therefore, is the average star rating of all reviews where the word is used.
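To make the measure concrete, here is a small sketch on three invented reviews (the texts and ratings are made up for illustration and are not from the dataset). It assumes a word counts once per review, however many times it appears in that review.

```python
from collections import defaultdict

# Toy reviews invented for illustration: (text, stars)
reviews = [
    ("great food great service", 5),
    ("terrible service", 1),
    ("food was fine", 3),
]

# word -> [sum of stars, number of reviews containing the word]
totals = defaultdict(lambda: [0, 0])
for text, stars in reviews:
    # set() so a word repeated inside one review is counted once
    for word in set(text.split()):
        totals[word][0] += stars
        totals[word][1] += 1

# Sentiment level = average star rating of the reviews using the word
sentiment = {word: s / n for word, (s, n) in totals.items()}

print(sentiment["food"])     # (5 + 3) / 2
print(sentiment["service"])  # (5 + 1) / 2
```

Here "great" appears twice in the first review but still contributes a single 5-star observation.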

Processing steps:

Load the JSON data from the file and select a small subset for practicing. (The final run of the program shall include all reviews.) You must use a JSON reader.

Extract all review texts and star ratings.

Break each review into individual words using NLTK.

Lemmatize the words.

Filter out stopwords and words that are not in the words corpus.

For each lemma, calculate its average star rating. If a lemma is used in fewer than 10 reviews, discard it.

Save the 500 most negative lemmas and 500 most positive lemmas and their respective sentiment levels into a two-column CSV file. You must use a CSV writer.
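The steps above could be sketched roughly as follows. This is a minimal outline, not a verified solution. It assumes the file holds one JSON object per line with `text` and `stars` keys (the usual layout of the Yelp academic dataset, but check the attached file). It also assumes the NLTK resources `punkt` (or `punkt_tab` on newer NLTK releases), `wordnet`, `stopwords` and `words` have already been downloaded. The function names and the file names `reviews.json` and `sentiment.csv` are placeholders.

```python
import csv
import json
from collections import defaultdict
from itertools import islice


def load_reviews(path, limit=None):
    """Yield (text, stars) pairs. Assumes one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        # limit=None reads every line; a small limit gives a practice subset
        for line in islice(f, limit):
            review = json.loads(line)
            yield review["text"], review["stars"]


def make_lemmatizer():
    """Return a function mapping a review text to its set of kept lemmas."""
    # Imported here so the other functions can be used without NLTK data.
    from nltk.corpus import stopwords, words
    from nltk.stem import WordNetLemmatizer
    from nltk.tokenize import word_tokenize

    lemmatizer = WordNetLemmatizer()
    stop = set(stopwords.words("english"))
    vocabulary = {w.lower() for w in words.words()}

    def lemmas(text):
        tokens = (t.lower() for t in word_tokenize(text))
        # lemmatize() treats every token as a noun by default
        candidates = (lemmatizer.lemmatize(t) for t in tokens)
        # A set, so each lemma counts once per review
        return {c for c in candidates if c in vocabulary and c not in stop}

    return lemmas


def average_ratings(reviews, lemmas, min_reviews=10):
    """Average star rating per lemma, dropping rarely used lemmas."""
    totals = defaultdict(lambda: [0, 0])  # lemma -> [star sum, review count]
    for text, stars in reviews:
        for lemma in lemmas(text):
            totals[lemma][0] += stars
            totals[lemma][1] += 1
    return {l: s / n for l, (s, n) in totals.items() if n >= min_reviews}


def save_extremes(ratings, path, k=500):
    """Write the k lowest- and k highest-rated lemmas as two-column CSV."""
    ranked = sorted(ratings.items(), key=lambda item: item[1])
    # With fewer than 2*k lemmas the two slices would overlap, so write all
    selected = ranked if len(ranked) <= 2 * k else ranked[:k] + ranked[-k:]
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(selected)


def main():
    # Placeholder file names; drop limit for the final run on all reviews
    reviews = load_reviews("reviews.json", limit=1000)
    ratings = average_ratings(reviews, make_lemmatizer())
    save_extremes(ratings, "sentiment.csv")
```

Calling `main()` runs the whole pipeline on the practice subset. Building the stopword and vocabulary sets once, outside the per-review function, keeps the membership checks fast when the run covers all 156,602 reviews.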
