Question
I need help writing Hive commands for these questions:
u.data table -- The dataset has 100000 ratings by 943 users on 1682 movies. The file has 4 tab ("\t") separated columns. The first column is the user id, the second column is the movie id, the third column is the rating, and the fourth column is a timestamp.
u.user table -- Demographic information about the users. The file has 5 pipe ("|") separated columns. The first column is the user id, the second column is the age, the third column is the gender (male denoted by 'M' and female denoted by 'F'), the fourth column is the occupation, and the fifth column is the zip code. The user ids are the same ones used in the u.data data set.
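Before any query can run, the two files have to be exposed as Hive tables. The following is a minimal sketch of one way to do that. The table names (u_data, u_user), the column names, and the '/path/to/...' file locations are assumptions made for illustration and do not come from the question.

```sql
-- Assumed table and column names; adjust to match your environment.
-- u.data: 4 tab-separated columns (user id, movie id, rating, timestamp).
CREATE TABLE IF NOT EXISTS u_data (
  user_id   INT,
  movie_id  INT,
  rating    INT,
  rating_ts BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- u.user: 5 pipe-separated columns (user id, age, gender, occupation, zip code).
-- zip_code is kept as STRING so that leading zeros are not lost.
CREATE TABLE IF NOT EXISTS u_user (
  user_id    INT,
  age        INT,
  gender     STRING,
  occupation STRING,
  zip_code   STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- The paths below are placeholders for wherever the files live locally.
LOAD DATA LOCAL INPATH '/path/to/u.data' OVERWRITE INTO TABLE u_data;
LOAD DATA LOCAL INPATH '/path/to/u.user' OVERWRITE INTO TABLE u_user;
```

If the files are already in HDFS, dropping the LOCAL keyword and pointing INPATH at the HDFS location should work the same way.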
1. Find the user id of the user who has rated the most movies.
2. Find the average rating received by the movie with id 178.
3. Which 3 occupations do the users who provided the most ratings belong to?
4. How many unique male users provided at least one rating of 5?
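The four questions can be sketched as the HiveQL queries below. They assume tables named u_data(user_id, movie_id, rating, rating_ts) and u_user(user_id, age, gender, occupation, zip_code); those names are illustrative, not given in the question. They also assume each row of u.data is one rating of one movie by one user, so counting rows counts ratings.

```sql
-- 1. User who rated the most movies.
--    LIMIT 1 returns a single user even if several are tied for the top count.
SELECT user_id, COUNT(*) AS num_ratings
FROM u_data
GROUP BY user_id
ORDER BY num_ratings DESC
LIMIT 1;

-- 2. Average rating received by the movie with id 178.
SELECT AVG(rating) AS avg_rating
FROM u_data
WHERE movie_id = 178;

-- 3. The 3 occupations whose users provided the most ratings.
--    Each rating row is joined to its user's occupation, then counted per occupation.
SELECT u.occupation, COUNT(*) AS num_ratings
FROM u_data d
JOIN u_user u ON d.user_id = u.user_id
GROUP BY u.occupation
ORDER BY num_ratings DESC
LIMIT 3;

-- 4. Number of unique male users who gave at least one rating of 5.
--    COUNT(DISTINCT ...) ensures a user with several 5-star ratings is counted once.
SELECT COUNT(DISTINCT d.user_id) AS male_users_with_a_5
FROM u_data d
JOIN u_user u ON d.user_id = u.user_id
WHERE u.gender = 'M'
  AND d.rating = 5;
```

Queries 3 and 4 need the join because occupation and gender live only in u.user, while the ratings live only in u.data.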