Question

For this assignment, you will practice implementing a solution for a cluster. However, for initial testing purposes, you will implement your solution on your VM. You will be implementing a distributed barrier. A barrier is a synchronization primitive implemented as an API call. A call to the barrier() function synchronizes the processes involved in the computation by blocking each process until all of the processes arrive at the barrier call. The following is example pseudocode for Matrix Multiply.

    Multiply(A, B, C)
    BEGIN
        if MASTER
            divide work among the N processes into chunks
        fi
        barrier()
        multiply my chunk
        barrier()
        if MASTER
            print results
        fi
    END

If Multiply is called simultaneously by the N processes, then the worker processes must wait for the MASTER process to assign each worker its chunk. Therefore, the first barrier() call makes all the workers wait until the MASTER finishes the work assignment and reaches the barrier. Likewise, the second barrier() call is necessary because the MASTER must wait until all the worker processes have finished computing their chunks before it can print the results.
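The "divide work among the N processes into chunks" step can be sketched as follows. This is a minimal illustration that assumes the work is a number of matrix rows split into contiguous ranges; the names chunk_t and chunk_for_rank are hypothetical and do not come from the assignment.

```c
#include <stddef.h>

/* Half-open range [begin, end) of rows owned by one process. */
typedef struct {
    size_t begin;
    size_t end;
} chunk_t;

/* Split `rows` rows among `nprocs` processes as evenly as possible.
 * The first (rows % nprocs) ranks each receive one extra row. */
chunk_t chunk_for_rank(size_t rows, int nprocs, int rank)
{
    size_t base  = rows / (size_t)nprocs;
    size_t extra = rows % (size_t)nprocs;
    size_t r     = (size_t)rank;
    chunk_t c;

    c.begin = r * base + (r < extra ? r : extra);
    c.end   = c.begin + base + (r < extra ? 1 : 0);
    return c;
}
```

With 10 rows and 4 processes, rank 0 would own rows 0 to 2 and rank 3 would own rows 8 to 9. In the pseudocode the MASTER performs this division before the first barrier() call, which is why the workers must wait there.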

You will be writing your own distributed barrier for the MPI library. MPI, which stands for Message Passing Interface, is the most popular library for message passing on compute clusters. MPI actually has its own barrier implementation, but you are going to program your barrier implementation from scratch using only MPI send and receive calls. Many algorithms for barrier implementations exist, and some are more scalable than others. You will implement a simple, but correct, barrier that uses a central count. Unfortunately, a naive barrier can let a process mistakenly pass through the current barrier because of state still in use by a previous barrier. So, you will implement a barrier in which each process keeps track of its current phase (it only needs two phases to be correct) so that it will not accidentally be signaled to continue by a previous phase.

The following is a barrier implementation taken from Wikipedia (https://en.wikipedia.org/wiki/Barrier_(computer_science)). See if you can follow its logic.

    #include <mutex>

    struct barrier_type {
        int counter; // initialize to 0
        int flag;    // initialize to 0
        std::mutex lock;
    };

    int local_sense = 0; // private per processor

    // barrier for p processors
    void barrier(barrier_type* b, int p) {
        local_sense = (b->flag == 0) ? 1 : 0;
        b->lock.lock();
        b->counter++; // This needs to be atomic
        int arrived = b->counter; // This needs to be atomic
        if (arrived == p) { // last arriver sets flag
            b->lock.unlock();
            b->counter = 0;
            // memory fence to ensure that the change to counter
            // is seen before the change to flag
            b->flag = local_sense;
        } else {
            b->lock.unlock();
            while (b->flag != local_sense); // wait for flag
        }
    }

Note that the local_sense variable tracks the phase of the process. The last process to enter the barrier signals the rest of the processes to continue, but those processes only continue if they have the same phase (checked in the while loop that spins on b->flag). So a process that is out of phase does not get signaled. Unfortunately, the above code is an implementation for an SMP computer with shared memory. You will be using message passing and will not have access to a shared mutex such as the one used in the above code. Fortunately, the professor has made your life easier by supplying a pseudocode implementation, below, using message passing. Convince yourself that the algorithm works. In particular, see if you can determine why no mutex is needed.
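Before turning to message passing, the shared-memory listing above can be restated in C, since the assignment itself is to be written in C. The sketch below is not the Wikipedia code: it swaps the std::mutex for C11 atomics from <stdatomic.h>, and the names sense_barrier_t, sense_barrier_init, and sense_barrier_wait are hypothetical. It is meant only to make the two-phase (sense-reversing) idea concrete.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int counter; /* arrivals in the current phase */
    atomic_int flag;    /* phase most recently released: 0 or 1 */
} sense_barrier_t;

void sense_barrier_init(sense_barrier_t *b)
{
    atomic_init(&b->counter, 0);
    atomic_init(&b->flag, 0);
}

/* local_sense is private to each thread; p is the number of threads. */
void sense_barrier_wait(sense_barrier_t *b, int *local_sense, int p)
{
    /* The phase to wait for is the opposite of the last released phase. */
    *local_sense = 1 - atomic_load(&b->flag);

    /* atomic_fetch_add returns the old value, so adding 1 gives this
     * thread's arrival number without needing a lock. */
    int arrived = atomic_fetch_add(&b->counter, 1) + 1;

    if (arrived == p) {
        /* Last arriver: reset the count first, then release everyone. */
        atomic_store(&b->counter, 0);
        atomic_store(&b->flag, *local_sense);
    } else {
        while (atomic_load(&b->flag) != *local_sense) {
            /* spin until the last arriver flips the flag */
        }
    }
}
```

Each thread would keep its own int for local_sense and pass the same sense_barrier_t to every call. The default sequentially consistent ordering of these atomic operations plays the role of the memory fence mentioned in the original comments.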

    structure barrier has
        count of type integer     /* barrier count */
        flag of type integer      /* barrier phase */
        everyone of type integer  /* initialized to the total number of processes */
        me of type integer        /* initialized to this process's rank */
    end struct

    function barrier(barrier of type structure barrier)
        set local_sense to 1 - barrier.flag
        send MY_INCREMENT to MASTER
        while barrier.flag is not equal to local_sense
            receive msg from anyone
            if msg is MY_INCREMENT_REPLY /* should only come from MASTER */
                set arrived to msg.barrier_count
                if (arrived == barrier.everyone)
                    send MY_RESET with local_sense to all but me
                    set barrier.count to 0
                    set barrier.flag to local_sense
                fi
            else if msg is MY_INCREMENT /* only MASTER should ever get this */
                set barrier.count to barrier.count + 1
                send MY_INCREMENT_REPLY back to original sender with barrier.count
            else if msg is MY_RESET
                set barrier.count to 0
                set barrier.flag to msg.sense
            fi
        wend
    fend
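To trace the message flow of the pseudocode above without a cluster, the following sketch simulates it inside a single process. It is an illustration under loud assumptions, not an MPI implementation: in-memory FIFO mailboxes stand in for MPI send and receive, the process count is fixed at 4, and proc_t, send_msg, barrier_enter, handle_one, and sim_run are hypothetical names. In a real MPI program each process would hold only its own state and send_msg would be an actual send call.

```c
#include <stdbool.h>

#define NPROCS 4   /* assumed process count for this simulation */
#define MASTER 0
#define QCAP   64  /* mailbox capacity; ample for NPROCS = 4 */

enum { MY_INCREMENT, MY_INCREMENT_REPLY, MY_RESET };

typedef struct { int type, src, value; } msg_t;

typedef struct {
    int count, flag, everyone, me; /* fields from the pseudocode */
    int local_sense;               /* phase this process is waiting for */
    bool waiting;                  /* true while inside barrier() */
    msg_t box[QCAP];               /* FIFO mailbox standing in for receive */
    int head, tail;
} proc_t;

proc_t procs[NPROCS];

static void send_msg(int dst, int type, int src, int value)
{
    proc_t *p = &procs[dst];
    p->box[p->tail++ % QCAP] = (msg_t){ type, src, value };
}

void sim_init(void)
{
    for (int i = 0; i < NPROCS; i++)
        procs[i] = (proc_t){ .everyone = NPROCS, .me = i };
}

/* First two steps of the pseudocode: choose the new phase, notify MASTER. */
void barrier_enter(int me)
{
    proc_t *p = &procs[me];
    p->local_sense = 1 - p->flag;
    p->waiting = true;
    send_msg(MASTER, MY_INCREMENT, me, 0);
}

/* One iteration of the pseudocode's while loop for process `me`. */
static void handle_one(int me)
{
    proc_t *p = &procs[me];
    msg_t m = p->box[p->head++ % QCAP];

    switch (m.type) {
    case MY_INCREMENT_REPLY:
        if (m.value == p->everyone) { /* this process arrived last */
            for (int i = 0; i < NPROCS; i++)
                if (i != me)
                    send_msg(i, MY_RESET, me, p->local_sense);
            p->count = 0;
            p->flag = p->local_sense;
        }
        break;
    case MY_INCREMENT:                /* only MASTER receives this */
        p->count++;
        send_msg(m.src, MY_INCREMENT_REPLY, me, p->count);
        break;
    case MY_RESET:
        p->count = 0;
        p->flag = m.value;
        break;
    }
    if (p->flag == p->local_sense)
        p->waiting = false;           /* loop condition fails: leave barrier */
}

/* Deliver messages round-robin until no waiting process has mail. */
void sim_run(void)
{
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < NPROCS; i++) {
            if (procs[i].waiting && procs[i].head != procs[i].tail) {
                handle_one(i);
                progress = true;
            }
        }
    }
}
```

A typical trace calls sim_init(), then barrier_enter() for some of the ranks, then sim_run(). If only three of the four ranks have entered, all three stay in the waiting state; once the fourth enters and sim_run() is called again, every flag flips to the new phase. Note that the only count that is ever incremented belongs to MASTER, and MASTER updates it while handling one message at a time.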

You will implement a main driver, a my_barrier() function, and a my_barrier_init() function according to the following table.

Function name | Description
--- | ---
void my_barrier_init(barrier_t *barrier); | Initializes the barrier structure: sets count to 0, flag to 0, everyone to the total number of processes, and me to the process's MPI rank.
void my_barrier(barrier_t *barrier); | Each process stops at this function call until all processes arrive. Then all processes fall through.
int main() | A driver to test your code.

The main driver should test your code by having the MASTER process (with MPI rank 0) sleep for 1 second and then write to a file called tstfile.txt. A barrier should keep the other processes from continuing until the MASTER is finished. Then, after the barrier, all processes should read the file and print its contents. Put this code in a loop so that it executes ten times.
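The file-handling half of the driver described above can be sketched with two small helpers. This is only an illustration: the helper names write_test_file and read_test_file and the "iteration N" file contents are assumptions, and the MPI and barrier calls are deliberately left out.

```c
#include <stdio.h>

/* Overwrite `path` with one line naming the current iteration.
 * Returns 0 on success, -1 on failure. */
int write_test_file(const char *path, int iteration)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int ok = fprintf(fp, "iteration %d\n", iteration) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the first line of `path` into buf (at most size - 1 characters).
 * Returns 0 on success, -1 on failure. */
int read_test_file(const char *path, char *buf, size_t size)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char *line = fgets(buf, (int)size, fp);
    fclose(fp);
    return line != NULL ? 0 : -1;
}
```

In each of the ten iterations, rank 0 would sleep for one second and call write_test_file("tstfile.txt", i) before my_barrier(), and every rank would call read_test_file() and print the buffer after it. A second barrier at the end of each iteration may also be worth considering, so that the MASTER cannot start rewriting the file while slower processes are still reading it.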

The solution must be written in C.
