Question
Implement, run, and time the following query using Hadoop Streaming with Python:

SELECT lo_quantity, MIN(lo_revenue)
FROM (SELECT lo_revenue, MAX(lo_quantity) AS lo_quantity, MAX(lo_discount) AS lo_discount
      FROM lineorder
      WHERE lo_orderpriority LIKE '%URGENT'
      GROUP BY lo_revenue)
WHERE lo_discount BETWEEN 6 AND 8
GROUP BY lo_quantity;

This requires running two separate MapReduce jobs. First, write a job that executes the subquery and writes its output to HDFS. Then write a second job that uses the output of the first job as its input.
Step by Step Solution
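The two jobs described in the question can be sketched in a single Python script, with one function per map or reduce stage. This is a minimal sketch under stated assumptions, not a definitive solution: it assumes lineorder is a pipe-delimited text file in the Star Schema Benchmark column order (lo_orderpriority, lo_quantity, lo_discount and lo_revenue at 0-based positions 6, 8, 11 and 12) and that those numeric columns are integers. The function names, the stage names and run_job are hypothetical helpers introduced here for illustration.

```python
#!/usr/bin/env python3
import sys
from itertools import groupby

# Assumed 0-based column positions in a pipe-delimited lineorder file
# (Star Schema Benchmark layout); adjust them to the actual data.
ORDERPRIORITY, QUANTITY, DISCOUNT, REVENUE = 6, 8, 11, 12


def split_rows(lines, sep="\t"):
    """Strip the newline from each line and split it into fields."""
    return (line.rstrip("\n").split(sep) for line in lines)


def map1(lines):
    """Job 1 map: keep '%URGENT' rows, emit revenue, quantity, discount."""
    for f in split_rows(lines, "|"):
        if len(f) > REVENUE and f[ORDERPRIORITY].endswith("URGENT"):
            yield f"{f[REVENUE]}\t{f[QUANTITY]}\t{f[DISCOUNT]}"


def reduce1(lines):
    """Job 1 reduce: MAX(lo_quantity) and MAX(lo_discount) per lo_revenue."""
    # groupby relies on equal keys being adjacent, which the shuffle sort provides.
    for revenue, group in groupby(split_rows(lines), key=lambda r: r[0]):
        pairs = [(int(q), int(d)) for _, q, d in group]
        yield f"{revenue}\t{max(q for q, _ in pairs)}\t{max(d for _, d in pairs)}"


def map2(lines):
    """Job 2 map: keep 6 <= lo_discount <= 8, emit quantity (key) and revenue."""
    for revenue, quantity, discount in split_rows(lines):
        if 6 <= int(discount) <= 8:
            yield f"{quantity}\t{revenue}"


def reduce2(lines):
    """Job 2 reduce: MIN(lo_revenue) per lo_quantity."""
    for quantity, group in groupby(split_rows(lines), key=lambda r: r[0]):
        yield f"{quantity}\t{min(int(revenue) for _, revenue in group)}"


def run_job(mapper, reducer, lines):
    """Local stand-in for one streaming job: map, sort by key, reduce."""
    shuffled = sorted(mapper(lines), key=lambda l: l.split("\t")[0])
    return list(reducer(shuffled))


STAGES = {"map1": map1, "reduce1": reduce1, "map2": map2, "reduce2": reduce2}

# When a stage name is given on the command line, act as a streaming
# mapper or reducer: read stdin, write tab-separated records to stdout.
if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in STAGES:
    for out in STAGES[sys.argv[1]](sys.stdin):
        print(out)
```

On a cluster, each stage would be passed to Hadoop Streaming through its -mapper and -reducer options, with the -output directory of the first job given as the -input of the second. Prefixing each command with the shell's time gives the wall-clock timing the question asks for. The run_job helper only imitates the shuffle locally so the logic can be checked on a small sample before submitting the jobs.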