Question

Posted on Sep 25, 2024

This assignment is on parallel programming in C. The task is to implement matrix multiplication in C using OpenMP. Compute C=B*A, where A, B, and C are matrices and * is matrix multiplication. You may assume the matrices are all square, with N rows and N columns.

You are to write two versions of matrix multiplication and compare the performance achieved by each with a sequential execution, i.e., execution using a single thread. These two codes differ in the approach used to distribute the computations among the threads. The first uses a static mapping approach while the second uses a dynamic worker approach.

2. Static Mapping

Assume there are K threads performing the computation. In this approach, each thread is statically assigned a set of rows of the result matrix C that it is responsible for computing. Specifically, thread 0 computes rows 0, K, 2K, ... of C, and thread 1 computes rows 1, K+1, 2K+1, ... of C. In other words, the rows of the matrix are assigned to the threads in round-robin fashion.
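The round-robin assignment described above can be sketched as follows. This is one possible approach, not necessarily the one the assignment expects: it relies on OpenMP's schedule(static, 1) clause, which hands out loop iterations cyclically by thread number, instead of an explicit loop that strides by K from omp_get_thread_num(). The names matmul_static, compute_row, and nthreads are illustrative, and the matrices are assumed to be stored as flat row-major arrays.

```c
#include <stddef.h>

/* Illustrative helper: computes row i of C = B * A for n x n
 * matrices stored as flat row-major arrays. */
static void compute_row(const double *A, const double *B, double *C,
                        int n, int i)
{
    for (int j = 0; j < n; j++) {
        double sum = 0.0;
        for (int k = 0; k < n; k++)
            sum += B[(size_t)i * n + k] * A[(size_t)k * n + j];
        C[(size_t)i * n + j] = sum;
    }
}

/* Static mapping: with chunk size 1, the static schedule gives
 * thread t the rows t, t + K, t + 2K, ... where K is the team size. */
void matmul_static(const double *A, const double *B, double *C,
                   int n, int nthreads)
{
    #pragma omp parallel for schedule(static, 1) num_threads(nthreads)
    for (int i = 0; i < n; i++)
        compute_row(A, B, C, n, i);
}
```

With GCC, the pragma takes effect only when compiling with -fopenmp. Without that flag the pragma is ignored and the loop runs on a single thread.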

3. Dynamic Mapping

In the dynamic approach a pool of worker threads is used. This means you create K threads to perform the computation. Each worker thread repeatedly accesses a global data structure to allocate a single row of the result matrix to compute. It then performs this computation, and then goes back to the global data structure to allocate another piece of work to do. This process continues until the entire matrix computation is complete. In other words, each worker thread executes the following loop:

While (rows of the result matrix have not been computed) {
    Allocate a row of the result matrix to compute
    Compute results for this row of the result matrix
}
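A minimal sketch of this worker loop, assuming the "global data structure" is simply a shared integer counter holding the next unassigned row. Each worker claims a row with an atomic capture, so no two threads receive the same row. The names matmul_dynamic, next_row, and nthreads are illustrative, and the matrices are assumed to be flat row-major arrays.

```c
#include <stddef.h>

/* Dynamic mapping: a pool of worker threads repeatedly claims the
 * next unassigned row of C = B * A from a shared counter. */
void matmul_dynamic(const double *A, const double *B, double *C,
                    int n, int nthreads)
{
    int next_row = 0;  /* shared work counter (assumed data structure) */

    #pragma omp parallel num_threads(nthreads) shared(next_row)
    {
        for (;;) {
            int i;

            /* Atomically read the counter and advance it. */
            #pragma omp atomic capture
            i = next_row++;

            if (i >= n)
                break;  /* no rows left; exits the worker loop only */

            /* Compute row i of the result. */
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++)
                    sum += B[(size_t)i * n + k] * A[(size_t)k * n + j];
                C[(size_t)i * n + j] = sum;
            }
        }
    }
}
```

A critical section around the counter update would also work. The atomic form is used here because the protected operation is a single increment.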

Construct a set of experiments to measure the performance (execution time) of the two parallel implementations and a sequential implementation. Note that, in general, an execution using one thread is not the same as a sequential execution, because it still incurs parallel processing overheads. SpeedUp(K) is defined as the execution time of the sequential code divided by the execution time of the parallel code using K threads, and indicates how many times faster the parallel code executes.

Create two different sizes of the matrix multiplication computation. Define a small matrix as one with 50 rows and 50 columns. Define a large matrix by setting the number of rows and columns (N) to a value for which the sequential code takes approximately 10 to 20 seconds to execute. Your code should fill the matrices with random numbers in the interval [0.0, 1.0]. All values should be double-precision floating-point numbers.

Measure the runtime of the codes (sequential, static, dynamic) for the small and large matrices, varying the number of threads. Show plots of speedup as the number of threads is varied for the small and large matrices.
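The measurement setup can be sketched with a few small helpers. The names fill_random, wall_seconds, and speedup are illustrative. The timer below uses the C11 function timespec_get to read wall-clock time. In a program built with OpenMP, omp_get_wtime() is the more common choice. The standard clock() function is best avoided for this purpose, because it reports processor time, which on many systems is summed over all threads and so hides any speedup.

```c
#include <stdlib.h>
#include <time.h>

/* Fill an n x n matrix (flat array) with values in [0.0, 1.0]. */
void fill_random(double *M, int n, unsigned seed)
{
    srand(seed);
    for (size_t idx = 0; idx < (size_t)n * (size_t)n; idx++)
        M[idx] = (double)rand() / (double)RAND_MAX;
}

/* Current wall-clock time in seconds (C11 timespec_get). */
double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* SpeedUp(K) = sequential time / parallel time with K threads. */
double speedup(double t_seq, double t_par)
{
    return t_seq / t_par;
}
```

To time one configuration, call wall_seconds() immediately before and after the multiplication and subtract. For the 50 x 50 matrix a single multiplication is very short, so repeating it many times inside the timed region and dividing by the repetition count is likely to give more stable numbers.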