Question
This assignment demonstrates your understanding of the parallel algorithms discussed in class. Each of these algorithms consists of a number of computation phases, and a new phase cannot start until every processing node completes the current phase. In class we went through a PowerPoint slide on estimating the computation cost of the Hash Phase of the GRACE algorithm. The computation cost of a phase is determined by the slowest processing node, the last one to finish its workload, not by the sum of the local computation costs of all the processing nodes. The other phases of this and other multiphase algorithms are estimated the same way. You need to understand each phase of these algorithms in order to make such estimates. Do the following exercises to demonstrate your knowledge of these techniques.
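The "slowest node, not the sum" rule can be sketched in a few lines of Python. This is only an illustration: the function name `phase_cost` and the per-node I/O counts are hypothetical and are not the values from this assignment.

```python
def phase_cost(node_ios):
    """Cost of one phase: the read + write I/Os of the busiest node.

    node_ios is a list of (reads, writes) page I/O counts, one per node.
    """
    return max(reads + writes for reads, writes in node_ios)


# Hypothetical (read, write) page I/Os for four nodes in a skewed phase.
skewed_phase = [(100, 40), (100, 10), (100, 30), (100, 20)]

slowest = phase_cost(skewed_phase)                   # what the phase costs
summed = sum(r + w for r, w in skewed_phase)         # NOT the phase cost
print(slowest, summed)
```

Because the nodes work in parallel, the phase finishes when the most heavily loaded node finishes, so skew raises the cost even when the total amount of work stays the same.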
pts We apply the GRACE algorithm to perform on a small shared-nothing system with four processing nodes (PNs). has pages and has pages. Each relation is evenly divided among the four PNs. Thus, each PN has pages of tuples. The Hashing Phase results in data skew as follows:
of the data in the first bucket pairs:
of the data in the second bucket pairs:
of the data in the third bucket pairs:
of the data in the fourth bucket pairs:
For each of the three parallel phases, estimate the read cost, the write cost, and the total computation cost. Show and explain the derivation of your mathematical analysis.
pts We apply the ABJ algorithm to the same setting described in the preceding question. Estimate the read cost, the write cost, and the total computation cost for each of the four phases of ABJ. Show and explain the derivation of your mathematical analysis.
pts If we apply the TIJ algorithm to compute as described in Question Estimate the total computation cost for this strategy.
Answer:
HashTI Phase: Since each PN reads pages, the read cost is IOs. The HashingTI computation does not incur disk access. It evenly sends pages to each of the PNs, so the write cost is IOs. The total computation cost for this phase is
Partition Tuning: Each PN has pages to read and sends the tuples in pages to the target bucket pairs according to the Bin Packing Algorithm (BPA). Thus, the read cost is IOs. BPA gives each PN an even share of the pages, and each PN writes its share of pages to its local disks. The write cost, therefore, is pages. The total computation cost for this phase is IOs.
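To make the balancing idea concrete, here is a minimal sketch of a greedy bin-packing assignment. The answer does not say which heuristic BPA uses, so the largest-first greedy rule below is an assumption, and the bucket names, page counts, and the function `assign_buckets` are all hypothetical.

```python
import heapq


def assign_buckets(bucket_pages, num_nodes):
    """Greedy largest-first packing of buckets onto nodes (assumed heuristic).

    Returns (assignment, loads): bucket names per node and pages per node.
    """
    # Min-heap of (current load in pages, node id); ties go to the lower id.
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(num_nodes)}

    # Place the biggest buckets first, each on the least-loaded node so far.
    by_size = sorted(bucket_pages.items(), key=lambda kv: kv[1], reverse=True)
    for bucket, pages in by_size:
        load, node = heapq.heappop(heap)
        assignment[node].append(bucket)
        heapq.heappush(heap, (load + pages, node))

    loads = {node: sum(bucket_pages[b] for b in buckets)
             for node, buckets in assignment.items()}
    return assignment, loads


# Hypothetical skewed bucket sizes in pages (not the assignment's values).
buckets = {"B0": 60, "B1": 50, "B2": 40, "B3": 40,
           "B4": 30, "B5": 30, "B6": 20, "B7": 10}
assignment, loads = assign_buckets(buckets, num_nodes=4)
print(loads)
```

With these made-up sizes every node ends up with the same number of pages, which is the situation the answer assumes when it charges each PN an even share of the writes. A greedy heuristic does not guarantee a perfectly even split in general.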
Bucket Tuning Phase: To join a bucket pair efficiently, one of the two buckets should fit in memory, so that the other bucket needs to be scanned only once to complete the join. For buckets that are significantly smaller than the memory capacity, this phase combines them into a bigger bucket that better fits the memory capacity. This is only a decision-making process: the buckets are not actually loaded into memory and written back to disk as a single combined bucket. This phase, therefore, incurs only negligible computation and no IO.
Join Phase: Each PN has pages of data for its local join operations. Each local join loads each of its buckets only once, so the read time for all the local joins at a given PN is the cost of reading all the local bucket pairs, or pages. The read cost for this phase, therefore, is IOs. We neglect the write cost for this phase because we do not know the size of the join result. When we compare the join techniques across the three questions, the comparison remains valid for deciding which of the three is the better option, as long as we ignore this write cost in all three cases.
TOTAL COMPUTATION TIME: Since a phase cannot start until every PN finishes the previous phase, the total computation cost of TIJ is the sum of the computation costs of the four phases: IOs.
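The summation over phases can be sketched as follows. The phase names follow the answer above, but every I/O count is a hypothetical placeholder rather than a value from the assignment, and `total_cost` is an illustrative helper name.

```python
def total_cost(phases):
    """Sum, over all phases, of the slowest node's read + write I/Os."""
    return sum(max(r + w for r, w in nodes) for nodes in phases.values())


# Hypothetical (read, write) page I/Os per node for each of the four phases.
tij_phases = {
    "hash": [(50, 50)] * 4,
    "partition tuning": [(50, 50)] * 4,
    "bucket tuning": [(0, 0)] * 4,   # decision making only, no IO
    "join": [(50, 0)] * 4,           # write cost of the result is ignored
}

print(total_cost(tij_phases))
```

The phases are added, not overlapped, because of the barrier between them: no node may begin a phase until all nodes have finished the one before it.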