Question

This is a High Performance Computing question:

1. Puff is a student working on a GPU research project in which he is implementing a finite element modeling algorithm in CUDA. His program is slow, so he uses a profiler to analyze its performance. Help Puff figure out how to improve his code by answering the questions below. Assume that the matrices contain millions of elements, that their sizes are not multiples of a power of 2, and that the device has compute capability 3.7.

The first thing Puff notices is low occupancy. Without knowing the details of his kernel implementation, indicate whether each of the following factors could affect this metric (answer yes or no) and briefly explain why.

a. Having a few blocks with many threads (over 16K threads) each

b. Having many blocks with exactly 32 threads per block

c. Having many blocks with a few threads each (more than 32 but fewer than 1024)

d. Using only the default stream for execution
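
As a rough aid for reasoning about options a to c, the sketch below estimates theoretical occupancy from the block size alone. It assumes the per-multiprocessor limits commonly documented for compute capability 3.7 (64 resident warps, 16 resident blocks, at most 1024 threads per block, 32 threads per warp) and deliberately ignores register and shared-memory usage, which cannot be known without seeing Puff's kernel. The function name `theoretical_occupancy` is hypothetical and is not part of any CUDA API.

```python
# Assumed per-SM limits for compute capability 3.7 (check the CUDA
# Programming Guide for your device before relying on these numbers).
WARP_SIZE = 32
MAX_THREADS_PER_BLOCK = 1024
MAX_WARPS_PER_SM = 64
MAX_BLOCKS_PER_SM = 16


def theoretical_occupancy(threads_per_block):
    """Estimate occupancy from block size only.

    Register and shared-memory limits are ignored, so the result is an
    upper bound on what a real kernel could achieve.
    """
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        # A launch with such a block size would fail outright.
        raise ValueError("block size outside the range the device accepts")

    # A partially filled warp still occupies a full warp slot.
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division

    # Resident blocks are capped by both the block limit and the warp limit.
    resident_blocks = min(MAX_BLOCKS_PER_SM, MAX_WARPS_PER_SM // warps_per_block)

    return resident_blocks * warps_per_block / MAX_WARPS_PER_SM


if __name__ == "__main__":
    for size in (32, 96, 256, 1024):
        print(size, theoretical_occupancy(size))
```

Under these assumptions, a block of 32 threads yields 16 resident warps out of 64 (occupancy 0.25) because the 16-block limit is reached first, a block of 96 threads yields 0.75, and blocks of 256 or 1024 threads reach 1.0. A block size above 1024, such as the 16K threads in option a, is rejected by the sketch just as the launch would be rejected by the device. Stream usage (option d) does not enter this calculation at all.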

