Question
I want to solve this assignment in Scala, in spark-shell, using RDDs.
Writing Naive Spark Jobs!

Assignment Objective and Description:
- Write Spark jobs in spark-shell.
- Before writing your code, you should familiarize yourself with the examples in lab 6.
- When googling for operations, make sure that you specify that the operation is applied on an RDD. For example, if you want to search for the max operation, you can search with "RDD max operation in spark".

First Question [5 marks]: Assume that you have the following dataset of employees, which contains each employee's information as follows: {("Eman", "Abdulaziz", 1991, "F", 9000), ("Hamed", "Ryan", 2000, "M", 1200), ("Fatima", "Saeed", 1978, "F", 13000), ("Rahaf", "Abdullah", 1967, "F", 14000), ("Ahmad", "Mohamed", 1980, "M", 15000)}. Answer the following queries:
- Select employees whose age is greater than 33.
- Report the age of the oldest female.
- Report the number of female and male employees.
- Find the average salary.
- Report the employees with salary less than 10,000 as follows: .

Second Question [4 marks]: Assume that you have the following dataset: {45, 3, 4, 44, 39, 11, 7, 8, 13, 21, 20, 44, 44, 12, 27, 27, 29, 18, 19, 19, 1, 1, 31, 31, 32, 1, 22, 33, 31, 37, 50, 41, 42}. Notice that the lowest value in this dataset is 1 and the largest value is 50. Assume that you want to find the average and the count of the values in ranges of 10s, where the ranges are . Write a Spark job to solve this problem, then trigger the execution with the .collect() action and print the returned array. (Hint: refer to example 3, lab 6.)

Third Question [4 marks]: Assume that you have the following dataset: "Apple", "Orange", "Oracle", "Umbrella", "Unit", "Illness", "Elephant", "Eve", "Early", "Artistic", "Iconic", "Idol", "Book", "Novel". Assume that the user wants you to find, for each vowel, the average length of the words that start with it, where the English vowels are {a, e, i, o, u}. Write a Spark job to solve this problem, then trigger the execution with the .collect() action and print the returned array.

What to Submit, When and How:
This assignment should be done in teams of 4 students (with a single group of 5 students). Each team (through a single student) should submit a single zip file (using Blackboard) containing:
- A document that contains snapshots of the scripts and the output of each job, the same way I did in the slides!
You have to submit it no later than Saturday, Feb 18, 11:59 pm, with the late submission policy as follows: (submitting late work will be penalized 15% per day for

Step by Step Solution
There are 3 Steps involved in it
Step: 1
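One possible sketch for the first question, meant to be pasted into spark-shell, where `sc` (the SparkContext) is already defined. The dataset only stores birth years, so an age needs a reference year; `referenceYear = 2023` below is an assumption, and the result of the first query changes with it (for example, Eman is 32 under this assumption). The variable names are illustrative, and the output format of the last query is not specified in the question, so the sketch simply collects the matching tuples.

```scala
// Assumption: ages are computed against this fixed reference year.
val referenceYear = 2023

// Each tuple is (firstName, lastName, birthYear, gender, salary).
val employees = sc.parallelize(Seq(
  ("Eman", "Abdulaziz", 1991, "F", 9000),
  ("Hamed", "Ryan", 2000, "M", 1200),
  ("Fatima", "Saeed", 1978, "F", 13000),
  ("Rahaf", "Abdullah", 1967, "F", 14000),
  ("Ahmad", "Mohamed", 1980, "M", 15000)
))

// Query 1: employees whose age is greater than 33.
val olderThan33 = employees
  .filter { case (_, _, birthYear, _, _) => referenceYear - birthYear > 33 }
  .collect()
olderThan33.foreach(println)

// Query 2: age of the oldest female employee.
val oldestFemaleAge = employees
  .filter { case (_, _, _, gender, _) => gender == "F" }
  .map { case (_, _, birthYear, _, _) => referenceYear - birthYear }
  .max()
println(oldestFemaleAge)

// Query 3: number of employees per gender.
val countByGender = employees
  .map { case (_, _, _, gender, _) => (gender, 1) }
  .reduceByKey(_ + _)
  .collect()
countByGender.foreach(println)

// Query 4: average salary (mean() is available on RDD[Double]).
val averageSalary = employees
  .map { case (_, _, _, _, salary) => salary.toDouble }
  .mean()
println(averageSalary)

// Query 5: employees with a salary below 10,000.
val lowSalary = employees
  .filter { case (_, _, _, _, salary) => salary < 10000 }
  .collect()
lowSalary.foreach(println)
```

Pattern matching on the tuple keeps each field named at the point of use; positional accessors such as `_._5` would work equally well.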
Step: 2
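One possible sketch for the second question, again for spark-shell with `sc` predefined. The list of ranges is missing from the question text, so the sketch assumes five ranges of width 10 (1-10, 11-20, 21-30, 31-40, 41-50), which is consistent with the stated minimum of 1 and maximum of 50. Under that assumption, the bucket index of a value `v` is `(v - 1) / 10` using integer division. The variable names are illustrative.

```scala
val values = sc.parallelize(Seq(
  45, 3, 4, 44, 39, 11, 7, 8, 13, 21, 20,
  44, 44, 12, 27, 27, 29, 18, 19, 19, 1, 1,
  31, 31, 32, 1, 22, 33, 31, 37, 50, 41, 42
))

val rangeStats = values
  // Assumed bucketing: 1-10 -> 0, 11-20 -> 1, ..., 41-50 -> 4.
  // Pair each value with (value, 1) so sum and count accumulate together.
  .map(v => ((v - 1) / 10, (v, 1)))
  .reduceByKey { case ((sum1, count1), (sum2, count2)) =>
    (sum1 + sum2, count1 + count2)
  }
  // Sort by bucket index so the collected array is in range order.
  .sortByKey()
  // Emit (range label, average, count).
  .map { case (bucket, (sum, count)) =>
    val low = bucket * 10 + 1
    (s"$low-${low + 9}", sum.toDouble / count, count)
  }
  .collect()

rangeStats.foreach(println)
```

Carrying a `(sum, count)` pair through `reduceByKey` computes both statistics in a single pass, rather than running a separate job for the average and for the count.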
Step: 3
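One possible sketch for the third question, for spark-shell with `sc` predefined. It keys each word by its lower-cased first letter, drops words that do not start with a vowel ("Book" and "Novel" here), and averages the lengths per vowel. The variable names are illustrative, and the sketch assumes that no word is empty, since `head` would fail on an empty string.

```scala
val words = sc.parallelize(Seq(
  "Apple", "Orange", "Oracle", "Umbrella", "Unit", "Illness", "Elephant",
  "Eve", "Early", "Artistic", "Iconic", "Idol", "Book", "Novel"
))

val vowels = Set('a', 'e', 'i', 'o', 'u')

val vowelAverages = words
  // Key by the lower-cased first letter; carry (length, 1) for averaging.
  .map(w => (w.head.toLower, (w.length, 1)))
  // Keep only the words that start with a vowel.
  .filter { case (firstLetter, _) => vowels.contains(firstLetter) }
  .reduceByKey { case ((len1, count1), (len2, count2)) =>
    (len1 + len2, count1 + count2)
  }
  // Average length = total length / number of words.
  .mapValues { case (totalLength, count) => totalLength.toDouble / count }
  .sortByKey()
  .collect()

vowelAverages.foreach(println)
```

Lower-casing the first letter means that "Apple" and a hypothetical "apple" would land in the same group, which matches the vowel set given in the question.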