Question
1) a) Solve 9.3.1-a (normalize the ratings based on a threshold) and 9.3.1-e.

       a  b  c  d  e  f  g  h
   A   4  5     5  1     3  2
   B      3  4  3  1  2  1
   C   2     1  3     4  5  3

   Figure 9.8: A utility matrix for exercises

   Exercise 9.3.1: Figure 9.8 is a utility matrix, representing the ratings, on a 1-5 star scale, of eight items, a through h, by three users A, B, and C. Compute the following from the data of this matrix.
   (a) Treating the utility matrix as boolean, compute the Jaccard distance between each pair of users.
   (e) Normalize the matrix by subtracting from each nonblank entry the average value for its user.

   b) Describe one strategy that is used to make a utility matrix less sparse.

2) a) Describe at least one mechanism that ensures that data is not lost in Spark when a hardware/software failure occurs.

   b) From a resource-management perspective (i.e., disk, memory, CPU), which Hadoop nodes should be chosen to run Spark tasks? Specifically, which types of tasks can co-exist with Spark, and which ones should not use the same node as Spark?

   c) Implement (in Python only, without actual Storm) a solution that computes a streaming average over a specified window. For example, to compute a 4-value windowed average that moves 2 tuples at a time, you can use the following line (make sure that your code supports other sizes as well). You are not allowed to first read the entire input before producing output, because the input stream is infinite. Instead, you should compute and print the output as the data arrives.

       cat mydata | python storm.py 4 2

   This can be tested in your own environment, without Linux, using the following code:

       import sys
       fd = open('mydata', 'r')
       sys.stdin = fd
       for line in sys.stdin:
           # Your code goes here.

   Assume that the mydata file contains the following (one value per line, no error checking is necessary):

       5
       3
       6
       11
       8
       4
       6
       3
       7

   Your command above should output the following three averages (representing the averages of (5,3,6,11), (6,11,8,4), and (8,4,6,3)):

       6.25
       7.25
       5.25

   The last window only contains 6, 3, 7 and cannot output an average until more data arrives.
Step by Step Solution
The solution involves 3 steps.
Step: 1
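One possible way to approach Exercise 9.3.1(a) is sketched below. It assumes the entries of Figure 9.8 read as encoded in RATINGS (None marks a blank), and the names ITEMS, RATINGS, rated_items, jaccard_distance and pairwise_jaccard are illustrative, not taken from the question. Treating the matrix as boolean means each user is reduced to the set of items they rated, and the Jaccard distance between two users is 1 minus the size of the intersection divided by the size of the union.

```python
from itertools import combinations

ITEMS = "abcdefgh"

# Assumed reading of Figure 9.8; None marks a blank (unrated) entry.
RATINGS = {
    "A": [4, 5, None, 5, 1, None, 3, 2],
    "B": [None, 3, 4, 3, 1, 2, 1, None],
    "C": [2, None, 1, 3, None, 4, 5, 3],
}


def rated_items(row):
    """Boolean view of a row: the set of items that have any rating."""
    return {item for item, rating in zip(ITEMS, row) if rating is not None}


def jaccard_distance(s, t):
    """1 - |s & t| / |s | t|; two empty sets are treated as identical."""
    union = s | t
    if not union:
        return 0.0
    return 1.0 - len(s & t) / len(union)


def pairwise_jaccard(ratings):
    """Jaccard distance for every unordered pair of users."""
    sets = {user: rated_items(row) for user, row in ratings.items()}
    return {
        (u, v): jaccard_distance(sets[u], sets[v])
        for u, v in combinations(sorted(sets), 2)
    }


for (u, v), dist in pairwise_jaccard(RATINGS).items():
    print(f"{u}-{v}: {dist}")
```

Under this reading of the figure, every pair of users shares 4 rated items out of 8 rated by either, so each of the three distances comes out to 0.5.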
Step: 2
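For Exercise 9.3.1(e), a minimal sketch of the normalization follows, again assuming the Figure 9.8 entries as encoded in RATINGS (None marks a blank). The function name normalize_by_user_mean is hypothetical. Each user's average is taken over nonblank entries only, and blanks stay blank after normalization.

```python
# Assumed reading of Figure 9.8; columns are items a..h, None marks a blank.
RATINGS = {
    "A": [4, 5, None, 5, 1, None, 3, 2],
    "B": [None, 3, 4, 3, 1, 2, 1, None],
    "C": [2, None, 1, 3, None, 4, 5, 3],
}


def normalize_by_user_mean(ratings):
    """Subtract each user's mean rating from that user's nonblank entries."""
    normalized = {}
    for user, row in ratings.items():
        present = [r for r in row if r is not None]
        if not present:
            # A user with no ratings has no mean; leave the row unchanged.
            normalized[user] = list(row)
            continue
        mean = sum(present) / len(present)
        normalized[user] = [None if r is None else r - mean for r in row]
    return normalized


for user, row in normalize_by_user_mean(RATINGS).items():
    cells = ["  -  " if r is None else f"{r:+.2f}" for r in row]
    print(user, " ".join(cells))
```

With these entries the user averages are 20/6 (about 3.33) for A, 14/6 (about 2.33) for B, and 18/6 = 3 for C, so for example C's row becomes -1, blank, -2, 0, blank, 1, 2, 0.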
Step: 3
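For question 2(c), one way to write storm.py is sketched below. It is an illustration under stated assumptions, not the reference answer: the generator name windowed_averages is hypothetical, the two command-line arguments are taken to be the window size and the slide (both positive integers), and input is assumed to be one number per line. The stream is consumed lazily, so each average is printed as soon as its window fills.

```python
import sys
from collections import deque


def windowed_averages(values, size, slide):
    """Yield the mean of each full window of `size` values, advancing the
    window start by `slide` values each time. `values` may be infinite."""
    window = deque()
    to_skip = 0  # values to discard when slide > size
    for value in values:
        if to_skip > 0:
            to_skip -= 1
            continue
        window.append(value)
        if len(window) == size:
            yield sum(window) / size
            # Drop the oldest values so the next window starts `slide` later.
            for _ in range(min(slide, size)):
                window.popleft()
            to_skip = max(0, slide - size)


def main(argv):
    size, slide = int(argv[1]), int(argv[2])
    # Generator expression: lines are parsed one at a time as they arrive.
    numbers = (float(line) for line in sys.stdin if line.strip())
    for average in windowed_averages(numbers, size, slide):
        print(average, flush=True)


# Runs only when both arguments are given, e.g.: python storm.py 4 2
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Run as `cat mydata | python storm.py 4 2` on the sample data, this prints 6.25, 7.25 and 5.25, and leaves 6, 3, 7 buffered until more input arrives. Keeping only the current window in a deque means memory use is bounded by the window size regardless of stream length.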