Question

As presented, the tree summation algorithm was always illustrated with n = 2^m, causing the tree to be perfectly balanced. Revise the algorithm for the case when n is not a power of 2.

Pair-Wise Summation. Another, more parallel order of summation is to add even/odd pairs of data values, yielding the intermediate sums (x_0 + x_1), (x_2 + x_3), (x_4 + x_5), (x_6 + x_7), ..., which are added in pairs, ((x_0 + x_1) + (x_2 + x_3)), ((x_4 + x_5) + (x_6 + x_7)), ..., yielding more intermediate sums, which are themselves added in pairs, and so on. This solution can be visualized as inducing a tree on the computation, where the original data values are leaves, the intermediate nodes are the sum of the nodes below them, and the root is the overall sum (see Figure 1.3).

Comparing Figures 1.2 and 1.3, we see that because the two solutions require the same number of operations and the same number of intermediate sums, there is no time advantage to either solution when using one processor. However, with a parallel computer that has at least P = n/2 processors, all of the additions at the same level of the tree can be computed simultaneously, yielding a solution with time complexity that is proportional to log n. The strategy is a significant improvement over the linear time sequential algorithm. Like the sequential solution, the pair-wise approach is a very intuitive way to think about the computation.

Expressing Parallel Sum. The iterative summation was illustrated using C code, but the pair-wise summation was not. If we are not concerned about writing code for an arbitrary length array, we might write it as follows to highlight the binary tree structure of the computation:

    1  t[0] = x[0] + x[1];
    2  t[1] = x[2] + x[3];
    3  t[2] = x[4] + x[5];
    4  t[3] = x[6] + x[7];
    5  t[4] = t[0] + t[1];
    6  t[5] = t[2] + t[3];
    7  sum = t[4] + t[5];

The first four assignments can be performed in parallel; after they are complete, the next two (5, 6) can also be performed in parallel.

Parallel Prefix Sum. Closely related to the sum is the prefix sum, also known as scan in many parallel programming languages.
It begins with the same sequence of n values, x_0, x_1, x_2, ..., x_{n-1}, but the desired computation is the sequence y_0, y_1, y_2, ..., y_{n-1} such that each y_i is the sum of the first i elements of the input, that is,

    y_i = \sum_{j \le i} x_j

Solving the prefix sum in parallel is less obvious than summation, because all of the intermediate values of the sequential solution are needed. It seems as though there is no advantage to, nor much possibility of, finding better solutions. But in fact the prefix sum can be performed in parallel. The observation is that the summation by pairs approach can be modified to compute the prefix values.

The idea is that each leaf processor storing x_i could compute the value y_i if it only knew the sum of all elements to its left, that is, its prefix; in the course of summing by pairs, we know the sum of all subtrees (see Figure 1.3), and if we save that information, we can determine the prefixes without directly summing them. To do so, we start at the root, whose prefix (that is, the sum of all elements before the elements of the sequence) is 0. This is also the prefix of its left subtree, and the total for its left subtree is the prefix for the right subtree. Applying this idea inductively, we get the following set of rules:

- Compute the grand total at the root by pair-wise sum, as before.
- On completion, imagine the root receiving a 0 from its (nonexistent) parent.
- All non-leaf nodes receive a value from their parent, relay that value to their left child, and send their right child the sum of the parent's value and their left child's value that was computed on the way up; these are the prefixes of their child nodes.
- Leaves add the prefix value from above and the saved input.

The values moving down the tree are the prefixes for the child nodes (see Figure 1.4, where downward moving prefix values are shown in the white square). The computation is known as the parallel prefix computation.
It requires an up sweep and a down sweep in the tree, but all operations at each level in a sweep can be performed concurrently. At most two add operations are required at each node, one going up and one coming down, plus the routing logic. Thus, the parallel prefix also has logarithmic time complexity. Many seemingly sequential operations yield to the parallel prefix approach. An essential difference between the sequential and parallel algorithms is that we organized the parallel algorithms to change the order of the computation.
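
The up sweep and down sweep described in the excerpt can be sketched sequentially in C as follows. This is only an illustration under stated assumptions, not the textbook's code: the names `prefix_sum`, `up_sweep`, `down_sweep`, and the scratch array `sums` are hypothetical, and the tree is stored heap-style (node k has children 2k+1 and 2k+2). Each range is split unevenly when its length is odd, so n does not have to be a power of 2. The two recursive calls at every node are independent of each other, which is where a parallel version would run them concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Up sweep: store the sum of x[lo..hi-1] in sums[node] and return it.
 * (Hypothetical helper; heap-style node numbering is an assumption.) */
static double up_sweep(const double *x, double *sums,
                       size_t node, size_t lo, size_t hi)
{
    if (hi - lo == 1)
        return sums[node] = x[lo];              /* leaf: save the input */
    size_t mid = lo + (hi - lo + 1) / 2;        /* left side takes the extra leaf */
    double left  = up_sweep(x, sums, 2 * node + 1, lo, mid);
    double right = up_sweep(x, sums, 2 * node + 2, mid, hi);
    return sums[node] = left + right;
}

/* Down sweep: `prefix` is the sum of all elements before x[lo]. */
static void down_sweep(const double *x, const double *sums, double *y,
                       size_t node, size_t lo, size_t hi, double prefix)
{
    if (hi - lo == 1) {
        y[lo] = prefix + x[lo];                 /* leaf: prefix plus saved input */
        return;
    }
    size_t mid = lo + (hi - lo + 1) / 2;
    /* Relay the parent's value to the left child ... */
    down_sweep(x, sums, y, 2 * node + 1, lo, mid, prefix);
    /* ... and send the right child that value plus the left child's sum. */
    down_sweep(x, sums, y, 2 * node + 2, mid, hi,
               prefix + sums[2 * node + 1]);
}

/* Writes y[i] = x[0] + ... + x[i] for 0 <= i < n.
 * Returns 0 on success, -1 if the scratch array cannot be allocated. */
int prefix_sum(const double *x, double *y, size_t n)
{
    if (n == 0)
        return 0;
    assert(x != NULL && y != NULL);
    /* The tree has depth ceil(log2 n), so 4n slots are enough. */
    double *sums = malloc(4 * n * sizeof *sums);
    if (sums == NULL)
        return -1;
    up_sweep(x, sums, 0, 0, n);
    down_sweep(x, sums, y, 0, 0, n, 0.0);       /* the root receives 0 */
    free(sums);
    return 0;
}
```

For the input 3, 1, 7, 0, 4 this produces 3, 4, 11, 11, 15. The recursion is a sequential stand-in for the tree of processors; it shows the data flow of the two sweeps rather than an actual parallel schedule.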

[Image: screenshot of Chapter 1.pdf showing the pair-wise summation passage and its figure; the text in the image is not legible.]
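
One possible revision for the case the question asks about, where n is not a power of 2, is sketched below in C. It is an assumption-laden illustration rather than the textbook's answer: the function name `tree_sum` and the in-place, stride-based layout are choices made here. The idea is that at each level a value is added to its partner only if that partner exists; a value with no partner is simply carried up to the next level unchanged, so the tree becomes slightly unbalanced on its right edge but still has ceil(log2 n) levels.

```c
#include <assert.h>
#include <stddef.h>

/* Pair-wise (tree) summation for any n, not only powers of 2.
 * Works in place: after the pass with a given stride, t[i] holds the sum
 * of a block of up to 2*stride original values starting at index i.
 * Note that the contents of t are overwritten. */
double tree_sum(double *t, size_t n)
{
    if (n == 0)
        return 0.0;
    assert(t != NULL);
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* One tree level. Each iteration touches a distinct pair of slots,
         * so on a parallel machine these additions could run concurrently. */
        for (size_t i = 0; i + stride < n; i += 2 * stride)
            t[i] += t[i + stride];
        /* A slot whose partner would lie at or beyond n is skipped,
         * which carries its value up to the next level unchanged. */
    }
    return t[0];
}
```

With n = 5, the first level forms (x_0 + x_1) and (x_2 + x_3) while x_4 waits, the second level adds those two sums, and the third level adds x_4, giving three levels in total. The only change from the balanced case is the bounds check `i + stride < n` in the inner loop.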
