Question
1. A matrix addition takes two input matrices B and C and produces one output matrix A. Each element of the output matrix A is the sum of the corresponding elements of the input matrices B and C, that is, A[i][j] = B[i][j] + C[i][j]. For simplicity, we will only handle square matrices whose elements are single-precision floating-point numbers. Write a matrix addition kernel and the host stub function that can be called with four parameters: a pointer to the output matrix, a pointer to the first input matrix, a pointer to the second input matrix, and the number of elements in each dimension. Follow these instructions:
a) Write the host stub function by allocating memory for the input and output matrices, transferring the input data to the device, launching the kernel, transferring the output data to the host, and freeing the device memory for the input and output data. Leave the execution configuration parameters open for this step.
b) Write a kernel that has each thread producing one output matrix element. Fill in the execution configuration parameters for the design.
c) Write a kernel that has each thread producing one output matrix row. Fill in the execution configuration parameters for the design.
d) Write a kernel that has each thread producing one output matrix column. Fill in the execution configuration parameters for the design.
e) Analyze the pros and cons of each preceding kernel design.
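Parts (a) through (d) of the question can be sketched in CUDA roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: the matrices are assumed to be stored in row-major order as flat arrays of n*n floats, and the names matAddElement, matAddRow, matAddCol, AddMode, and matrixAdd, as well as the block sizes (16x16 and 256), are illustrative choices that do not come from the question. The optional fifth argument is a hypothetical convenience for switching designs; the stub can still be called with the four parameters the question lists.

```cuda
#include <cuda_runtime.h>

// (b) One thread per output element, using a 2D grid of 2D blocks.
__global__ void matAddElement(float* A, const float* B, const float* C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        int idx = row * n + col;  // row-major layout (assumption)
        A[idx] = B[idx] + C[idx];
    }
}

// (c) One thread per output row, using a 1D grid of 1D blocks.
__global__ void matAddRow(float* A, const float* B, const float* C, int n) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n) {
        for (int col = 0; col < n; ++col) {
            int idx = row * n + col;
            A[idx] = B[idx] + C[idx];
        }
    }
}

// (d) One thread per output column, using a 1D grid of 1D blocks.
__global__ void matAddCol(float* A, const float* B, const float* C, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col < n) {
        for (int row = 0; row < n; ++row) {
            int idx = row * n + col;
            A[idx] = B[idx] + C[idx];
        }
    }
}

// Hypothetical selector for the three kernel designs.
enum class AddMode { Element, Row, Column };

// (a) Host stub: allocate, copy in, launch, copy out, free.
// A, B, and C are host pointers to n*n floats each.
cudaError_t matrixAdd(float* A, const float* B, const float* C, int n,
                      AddMode mode = AddMode::Element) {
    if (n <= 0) return cudaErrorInvalidValue;
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    cudaError_t err = cudaMalloc(&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&dC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dC, C, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        if (mode == AddMode::Element) {
            // n*n threads in total: ceil(n/16) x ceil(n/16) blocks of 16x16.
            dim3 block(16, 16);
            dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
            matAddElement<<<grid, block>>>(dA, dB, dC, n);
        } else {
            // n threads in total: ceil(n/256) blocks of 256 threads.
            int threads = 256;
            int blocks = (n + threads - 1) / threads;
            if (mode == AddMode::Row) {
                matAddRow<<<blocks, threads>>>(dA, dB, dC, n);
            } else {
                matAddCol<<<blocks, threads>>>(dA, dB, dC, n);
            }
        }
        err = cudaGetLastError();  // reports launch configuration errors
    }

    // The blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(A, dA, bytes, cudaMemcpyDeviceToHost);

    // cudaFree accepts a null pointer, so cleanup is safe on every path.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}
```

For part (e), under the row-major assumption: the per-element design launches n*n threads, which exposes the most parallelism, and consecutive threads in a warp touch consecutive addresses, so global memory accesses coalesce. The per-row design launches only n threads, and at each loop iteration neighboring threads access addresses n elements apart, so accesses do not coalesce. The per-column design also launches only n threads, but neighboring threads access adjacent addresses at each iteration, so accesses coalesce; its main drawback is limited parallelism when n is small relative to the number of threads the GPU can keep in flight.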