Question

Posted on Aug 04, 2024

This assignment is to demonstrate your understanding of the parallel algorithms discussed in class. Each of these parallel algorithms consists of a number of computation phases. A new phase cannot start until every processing node completes the current phase. In class we discussed a PowerPoint slide on how to estimate the computation cost for the Hash Phase of the GRACE algorithm. We clarified that the computation cost for a given phase is determined by the slowest processing node (the last to finish its workload), not by the sum of the local computation costs of all the processing nodes. The other phases of this and other multi-phase algorithms are estimated the same way. You need to understand each phase of these algorithms in order to do such estimation. Do the following exercises to demonstrate your knowledge of these techniques.
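The slowest-node rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `phase_cost` and the sample per-node workloads are hypothetical and are not taken from the course slides.

```python
def phase_cost(per_node_ios):
    """Cost of one phase: the I/O count of the slowest node, not the sum."""
    return max(per_node_ios)


# Hypothetical workloads (in page I/Os) for four processing nodes.
balanced = [3000, 3000, 3000, 3000]
skewed = [6000, 2400, 1800, 1800]

print(phase_cost(balanced))  # 3000
print(phase_cost(skewed))    # 6000
```

Both workloads total 12,000 I/Os across the nodes, but the skewed one takes twice as long because the next phase has to wait for the most heavily loaded node.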

1. (30 pts.) We apply the GRACE algorithm to perform $R \bowtie S$ on a small shared-nothing system with four processing nodes (PNs). $R$ has 4,000 pages and $S$ has 8,000 pages. Each relation is evenly divided among the four PNs. Thus, each PN has 3,000 pages of tuples. The Hashing Phase results in data skew as follows:

50% of the data in the first 8 bucket pairs: $R_{0}/S_{0}$ to $R_{7}/S_{7}$

20% of the data in the second 8 bucket pairs: $R_{8}/S_{8}$ to $R_{15}/S_{15}$

15% of the data in the third 8 bucket pairs: $R_{16}/S_{16}$ to $R_{23}/S_{23}$

15% of the data in the fourth 8 bucket pairs: $R_{24}/S_{24}$ to $R_{31}/S_{31}$
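As a quick sanity check, the skew percentages can be converted into page counts. The sketch below rests on two assumptions that the question does not state: that the percentages apply to the combined 12,000 pages of $R$ and $S$, and that pages are spread evenly over the 8 bucket pairs within each group. All names are hypothetical.

```python
TOTAL_PAGES = 4000 + 8000  # pages of R plus pages of S
PAIRS_PER_GROUP = 8

# (label, percent of the data) for each group of 8 bucket pairs
GROUPS = [
    ("pairs 0-7", 50),
    ("pairs 8-15", 20),
    ("pairs 16-23", 15),
    ("pairs 24-31", 15),
]


def pages_per_group(total_pages, groups):
    """Map each group label to its page count (integer arithmetic only)."""
    return {label: total_pages * pct // 100 for label, pct in groups}


group_pages = pages_per_group(TOTAL_PAGES, GROUPS)
for label, pages in group_pages.items():
    # Assumption: pages are spread evenly within a group.
    print(label, pages, pages // PAIRS_PER_GROUP)
```

Under these assumptions the four groups hold 6,000, 2,400, 1,800, and 1,800 pages, or 750, 300, 225, and 225 pages per bucket pair.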

For each of the three parallel phases, estimate the read cost, the write cost, and the total computation cost. Show and explain the derivation of your mathematical analysis.

2. (70 pts.) We apply the ABJ algorithm to the same setting described in Question 1. Estimate the read cost, the write cost, and the total computation cost for each of the four phases of ABJ. Show and explain the derivation of your mathematical analysis.

3. (0 pts.) If we apply the TIJ algorithm to compute $R \bowtie S$ as described in Question 1, estimate the total computation cost for this strategy.

Answer:

Hash/TI Phase: Each PN reads its 3000 local pages, so the read cost is 3000 IOs. The hashing/TI computation itself does not incur disk access. The phase distributes the data evenly, sending 3000 pages to each PN, so the write cost is 3000 IOs. The total computation cost for this phase is $3000 + 3000 = 6000$ IOs.

Partition Tuning Phase: Each PN has 3000 pages to read and sends the tuples, in pages, to the target bucket pairs according to the Bin Packing Algorithm (BPA). Thus, the read cost is 3000 IOs. BPA distributes the pages evenly, so each PN writes its share of 3000 pages to its local disks. The write cost, therefore, is 3000 IOs. The total computation cost for this phase is $3000 + 3000 = 6000$ IOs.

Bucket Tuning Phase: To join a bucket pair efficiently, one of the two buckets should fit in memory, so that the other bucket needs to be scanned only once to complete the join. For buckets that are significantly smaller than the memory capacity, this phase combines them into a bigger bucket that better fits the memory capacity. This is only a decision-making process: the buckets are not actually loaded into memory and written back to disk as a single combined bucket. This phase, therefore, incurs only negligible computation and no IO.

Join Phase: Each PN has 3000 pages of data on which to perform its local join operations. Each of these local joins requires loading each of its buckets only once. The read cost of all the local joins at a given PN is the cost of reading all the local bucket pairs, or 3000 pages. The read cost for this phase, therefore, is 3000 IOs. We neglect the write cost for this phase in this analysis since we do not know the size of the join result. When we compare the different join techniques in Questions (1), (2), and (3), the comparison is still valid for determining which of the three is the better option, as long as we ignore this write cost in all three cases.

TOTAL COMPUTATION TIME: Since one phase cannot start until every PN finishes the previous phase, the total computation cost of TIJ is the sum of the costs of the four phases: $6000 + 6000 + 0 + 3000 = 15{,}000$ IOs.
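The phase-by-phase tally can be checked with a short Python sketch. The read and write figures are the estimates from the answer above; the dictionary layout and the name `total_cost` are illustrative choices and not part of the original solution.

```python
# (read IOs, write IOs) for each TIJ phase, as estimated in the answer.
TIJ_PHASES = {
    "hash/TI": (3000, 3000),
    "partition tuning": (3000, 3000),
    "bucket tuning": (0, 0),   # decision making only, no IO
    "join": (3000, 0),         # write cost of the join result is ignored
}


def total_cost(phases):
    """Phases run one after another, so their costs add up."""
    return sum(read + write for read, write in phases.values())


for name, (read, write) in TIJ_PHASES.items():
    print(f"{name}: read={read}, write={write}, total={read + write}")

print(total_cost(TIJ_PHASES))  # 15000
```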
