Question

Consider the following Spark code, which constructs a DataFrame.
a = [('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
     ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
     ('Alice', 'Heineken', 8)]
rdd = sc.parallelize(a)
df = sqlContext.createDataFrame(rdd, ['drinker', 'beer', 'score'])
sqlContext.registerDataFrameAsTable(df, "drinkers")
How can we get the total score of each beer brand?
We want the following answer from the example above (your printout may be formatted differently; the values are what matter):
beer='Becks', total score=20
beer='Budweiser', total score=25
beer='Heineken', total score=12
(Multiple choice; wrong answers receive negative scores.)
A. df.drop('drinker').groupByKey('beer').reduceByKey(add).collect()
B. df.drop('drinker').groupBy('beer').agg({'score': 'sum'}).collect()
C. df.drop('drinker').groupByKey('beer').map(lambda a, b: a+b).top()
D. df.filter('drinker').groupBy('beer').agg({'score': 'sum'}).collect()
E. df.filter('drinker').groupByKey('beer').reduceByKey(add).collect()
F. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY drinker").collect()
G. df.filter('drinker').groupByKey('beer').agg({'score': 'sum'}).collect()
H. df.drop('drinker').groupByKey('beer').map(lambda a, b: a+b).collect()
I. sqlContext.sql("SELECT beer, sum(score) from drinkers GROUP BY beer").collect()
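
The expected totals can be checked by hand without a Spark session. The following is a minimal plain-Python sketch, not Spark code: it groups the same rows by beer and sums the scores, which is the computation the question asks for. The helper name total_score_per_beer is hypothetical and used only for illustration.

```python
from collections import defaultdict

# Same rows as the list `a` in the question: (drinker, beer, score).
rows = [
    ('Chris', 'Budweiser', 15), ('Chris', 'Becks', 5), ('Chris', 'Heineken', 2),
    ('Bob', 'Becks', 15), ('Bob', 'Budweiser', 10), ('Bob', 'Heineken', 2),
    ('Alice', 'Heineken', 8),
]


def total_score_per_beer(rows):
    """Hypothetical helper: sum the score column for each distinct beer."""
    totals = defaultdict(int)
    for _drinker, beer, score in rows:
        # The drinker column is ignored; only beer and score matter here.
        totals[beer] += score
    return dict(totals)


totals = total_score_per_beer(rows)
for beer in sorted(totals):
    print(f"beer={beer!r}, total score={totals[beer]}")
```

Each total is the sum over all drinkers for that beer, for example Heineken is 2 + 2 + 8 = 12. Grouping by drinker instead of beer would produce one row per person, which is not what the expected answer shows.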