Answered step by step
Verified Expert Solution
Link Copied!
Question
1 Approved Answer

Complete the function compute using pthreads v1() to parallelize the SAXPY loop using the chunking method. That is, given vectors x and y of some

Complete the function compute using pthreads v1() to parallelize the SAXPY loop using the "chunking" method. That is, given vectors x and y of some arbitrary length n, and when k threads are used, individual threads calculate SAXPY in parallel on smaller chunks of these vectors

image

This assignment, worth twenty points, is due February 1, 2024, by 11:59 pm. You may work on it in a group of up to two people. Please submit original code. You are asked to compare two different ways of parallelizing the SAXPY loop using the pthread library. SAXPY is a function within the standard Basic Linear Algebra Subroutines (BLAS) library and stands for "Single-Precision AX Plus Y." The implementation is very simple, involving a combination of scalar multiplication and vector addition. The routine takes as inputs two vectors of 32-bit floating-point values x and y with n elements each, and a scalar value a. It multiplies each element x[i] by a and adds the result to y[i]. The serial implementation looks like this: void saxpy (float *x, float *y, float a, int n) { } int i; for (i = 0; i < n; i++) y[i] = a *x[i] +y[i]; Using the provided sequential program saxpy.c as a starting point, develop two versions that parallelize SAXPY. Complete the function compute_using-pthreads_vl() to parallelize the SAXPY loop using the "chunking" method. That is, given vectors x and y of some arbitrary length n, and when k threads are used, individual threads calculate SAXPY in parallel on smaller chunks of these vectors. Complete the function compute_using_pthreads_v2() to parallelize the SAXPY loop using the "striding" method. That is, each thread strides over elements of the vectors with some stride length, calculating SAXPY along the way. For example, given k = 4 threads, thread 0 calculates SAXPY for elements y[0], y[4], y[8], . . ., thread 1 calculates SAXPY for elements y[1], y[5], y[9], . . ., and so on. The pseudo-code for this method looks like this for each thread: /* tid is the thread ID and k is the number of threads created */ int stride = k; while (tid < n) { y[tid] a * x[tid] + y[tid];

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image
Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Linear Algebra with Applications

Authors: Steven J. Leon

7th edition

131857851, 978-0131857858

More Books

Students explore these related Algorithms questions