
Question



computer science
If your input image is named "santa-grayscale.jpg", the execution command would be "./convolution santa-grayscale.jpg".
No command-line argument for the filter radius is necessary in this assignment, as we use a constant average filter of size 5×5, for which the filter radius is 2.
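The relation between the filter radius and the filter size can be sketched as follows. This is a minimal illustration, not part of the assignment hand-out: the names `FILTER_RADIUS` and `makeAverageFilter` are assumptions. A radius of r gives a side length of 2r + 1, so radius 2 gives a 5×5 filter whose 25 weights are each 1/25.

```cpp
#include <cstddef>
#include <vector>

// Assumed name: the assignment fixes the radius at 2 but does not name the constant.
constexpr int FILTER_RADIUS = 2;

// Build a (2r+1) x (2r+1) average filter in row-major order.
// Every weight is 1 / (side * side), so the weights sum to 1.
std::vector<float> makeAverageFilter(int radius) {
    const int side = 2 * radius + 1;  // radius 2 -> side 5
    const float weight = 1.0f / static_cast<float>(side * side);
    return std::vector<float>(static_cast<std::size_t>(side) * side, weight);
}
```

On the GPU side, the same 25 values are what would be copied into global or constant memory before launching a kernel.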
Within "convolution.cu", implement one host function and two CUDA kernels, each of which performs the convolution. Specifically,
a. A host function that performs the convolution operation using the CPU only.
b. A CUDA kernel that performs the convolution operation on the GPU without tiling. You may refer to the example provided on Slide 40 in Module 3, but please modify it to incorporate the average filter matrix. It is acceptable to load the average filter from either GPU global memory or constant memory.
c. An optimized CUDA kernel that implements a "tiled" version of the convolution operation using GPU shared memory. In this optimized kernel, please load the average filter from GPU constant memory.
Ensure that your code can handle images with varying image sizes. Please also consider boundary conditions to ensure proper handling in such cases.
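As a reference point for the boundary handling, here is a minimal CPU-only sketch that works on a raw row-major buffer rather than on `cv::Mat`. The function name `blurImageHost` and its signature are hypothetical. It also assumes that neighbours falling outside the image count as zero, which the assignment does not specify; dividing by the number of valid neighbours instead is an equally plausible policy.

```cpp
#include <vector>

// Hypothetical helper: blur a row-major 8-bit grayscale image with a
// (2*radius+1)^2 average filter. Out-of-range neighbours are treated as
// zero (an assumption), so pixels near the border come out darker.
std::vector<unsigned char> blurImageHost(const std::vector<unsigned char>& in,
                                         int width, int height, int radius) {
    const int side = 2 * radius + 1;
    const float weight = 1.0f / static_cast<float>(side * side);
    std::vector<unsigned char> out(in.size());
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            float sum = 0.0f;
            for (int dr = -radius; dr <= radius; ++dr) {
                for (int dc = -radius; dc <= radius; ++dc) {
                    const int r = row + dr;
                    const int c = col + dc;
                    // Boundary check: skip neighbours outside the image.
                    if (r >= 0 && r < height && c >= 0 && c < width) {
                        sum += weight * in[r * width + c];
                    }
                }
            }
            // Round to the nearest integer before storing.
            out[row * width + col] = static_cast<unsigned char>(sum + 0.5f);
        }
    }
    return out;
}
```

Because width and height are passed in separately, the same loop covers both the square and the non-square test image. A CPU result like this one is also a convenient baseline for checking the output of the two kernels.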
For testing purposes, two input images of different sizes are provided in the zipped folder accompanying this assignment.
"santa-grayscale.jpg": a grayscale image of size 1,000 × 1,000
"tree-grayscale.jpg": a grayscale image of size 345 × 346
You may also choose to test your program with additional grayscale images if desired.
Use timing techniques, such as CPU timers or CUDA events, to measure the performance of the host function and the two CUDA kernels specified above.
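One possible CPU timer is sketched below. The name follows the suggested `myCPUTimer()` signature; implementing it with `std::chrono::steady_clock` is an assumption, and other clocks would work as well.

```cpp
#include <chrono>

// Returns the current time in seconds from a monotonic clock.
// Subtract two readings to get an elapsed time.
double myCPUTimer() {
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
```

A typical use is to take one reading before the work, one after, and report the difference. When a CPU timer is wrapped around a kernel launch, keep in mind that launches are asynchronous, so the device should be synchronized (for example with `cudaDeviceSynchronize()`) before the second reading is taken.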
We also suggest structuring "convolution.cu" by implementing the following macros, host functions, and CUDA kernels. At the end of this assignment, we will provide screenshots of an example of the program's structure.
#define CHECK(call)
A macro for error checking.
double myCPUTimer()
A timer for measuring execution time.
void blurImage_h(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int nRows, unsigned int nCols)
A host function for CPU-only convolution.
__global__ void blurImage_Kernel(unsigned char * Pout, unsigned char * Pin, unsigned int width, unsigned int height)
A CUDA kernel that performs a simple convolution without using tiling.
void blurImage_d(cv::Mat Pout_Mat_h, cv::Mat Pin_Mat_h, unsigned int nRows, unsigned int nCols)
A host function for handling device memory allocation and free, data copy, and calling the specific CUDA kernel, blurImage_Kernel().
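To support images whose dimensions are not a multiple of the thread-block size, the host function needs to round the grid size up. The helper below is a minimal sketch; the name `blocksFor` is hypothetical, and the 16×16 block size mentioned afterwards is only an assumed choice.

```cpp
// Number of blocks needed to cover `n` pixels with `blockDim` threads per
// block along one dimension (ceiling division), so that the last, partially
// filled block is still launched.
constexpr unsigned int blocksFor(unsigned int n, unsigned int blockDim) {
    return (n + blockDim - 1) / blockDim;
}
```

Inside `blurImage_d()`, this could feed a launch configuration such as `dim3 gridDim(blocksFor(nCols, 16), blocksFor(nRows, 16))` with `dim3 blockDim(16, 16)`. Since the grid then covers slightly more than the image, the kernel should return early for threads whose column is not below `width` or whose row is not below `height`.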
When running your program,
Please make sure to replace "inputImg.jpg" with the name of your input image file, which should be located in the same directory as your program. For example, if
