Question

Count up the ratings given for each movie.

All you need to do is change one thing in the mapper: we don't care about ratings now, we care about movie IDs!

Start with this and make sure you can do it.

You can use nano to edit the existing RatingsBreakdown.py script.
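As a rough illustration of the one-line change, here is a minimal sketch in plain Python (no mrjob or Hadoop involved) that mimics the map and reduce phases in memory. The sample lines and the helper names `mapper` and `count_by_key` are made up for illustration; only the tab-separated field order (userID, movieID, rating, timestamp) comes from the scripts in this question.

```python
from collections import defaultdict

# Hypothetical sample data in the userID \t movieID \t rating \t timestamp layout.
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t242\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t242\t5\t886397596",
]


def mapper(line, key_field):
    """Emit (key, 1), where the key is either the rating or the movie ID."""
    user_id, movie_id, rating, timestamp = line.split("\t")
    fields = {"rating": rating, "movieID": movie_id}
    yield fields[key_field], 1


def count_by_key(lines, key_field):
    """Group the mapper output by key and sum the ones, as the reducer would."""
    totals = defaultdict(int)
    for line in lines:
        for key, one in mapper(line, key_field):
            totals[key] += one
    return dict(totals)


# Original behaviour: how many times each rating value was given.
ratings_breakdown = count_by_key(SAMPLE_LINES, "rating")
# After the change: how many ratings each movie received.
ratings_per_movie = count_by_key(SAMPLE_LINES, "movieID")

print(ratings_breakdown)
print(ratings_per_movie)
```

The reducer logic is identical in both calls; only the key emitted by the mapper differs, which is the single change the question asks for.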

from mrjob.job import MRJob
from mrjob.step import MRStep


class MostPopularMovie(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings),
            MRStep(reducer=self.reducer_find_max)
        ]

    # Emit (movieID, 1) for every rating line.
    def mapper_get_ratings(self, _, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield movieID, 1

    # Send every (count, movieID) pair to the same key so one reducer sees them all.
    def reducer_count_ratings(self, key, values):
        yield None, (sum(values), key)

    # The largest pair is emitted as (count, movieID).
    def reducer_find_max(self, key, values):
        yield max(values)


if __name__ == '__main__':
    MostPopularMovie.run()
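The final reducer in MostPopularMovie depends on how Python compares sequences: they are compared element by element, so with the count placed first, `max` picks the pair with the highest count. A small sketch with made-up pairs (the counts and IDs here are illustrative, not results from any dataset):

```python
# Made-up (count, movieID) pairs, shaped like the values reducer_count_ratings emits.
pairs = [(3, "242"), (1, "302"), (1, "51")]

# Sequences compare element by element, so the count decides first;
# the movie ID only matters when two counts are equal.
most_rated = max(pairs)
print(most_rated)
```

This is why the count is put ahead of the movie ID in the pair: reversing the order would make `max` compare movie IDs instead.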

from mrjob.job import MRJob
from mrjob.step import MRStep


class RatingsBreakdown(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings)
        ]

    # Emit (rating, 1) for every rating line.
    def mapper_get_ratings(self, _, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield rating, 1

    # Sum the ones to get how many times each rating value was given.
    def reducer_count_ratings(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    RatingsBreakdown.run()

from mrjob.job import MRJob
from mrjob.step import MRStep


class RatingsBreakdown(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings),
            MRStep(reducer=self.reducer_sorted_output)
        ]

    # Emit (movieID, 1) for every rating line.
    def mapper_get_ratings(self, _, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield movieID, 1

    # Emit the zero-padded count as the key so the second step sorts by count.
    def reducer_count_ratings(self, key, values):
        yield str(sum(values)).zfill(5), key

    # Emit each movie with its count, in the order the counts arrive.
    def reducer_sorted_output(self, count, movies):
        for movie in movies:
            yield movie, count


if __name__ == '__main__':
    RatingsBreakdown.run()
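The `zfill(5)` call in the sorted version matters because the count becomes a string key, and strings sort character by character rather than numerically. The sketch below uses made-up counts and Python's built-in `sorted` as a stand-in for the sort between steps; it assumes the keys are compared as text, which is the situation the padding is meant to handle.

```python
# Made-up rating counts for illustration.
counts = [9, 10, 583, 27]

# Without padding, "10" sorts before "9" because "1" < "9" as characters.
as_plain_strings = sorted(str(c) for c in counts)

# Padding to a fixed width makes text order match numeric order.
as_padded_strings = sorted(str(c).zfill(5) for c in counts)

print(as_plain_strings)
print(as_padded_strings)
```

A width of 5 only keeps the order correct while every count stays below 100000; a larger dataset would need a wider pad.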
