Question

QUESTION 1

An 8-core microprocessor has an ideal speedup of 8 for embarrassingly parallel code. The measured execution time of a benchmark code using only a single thread is 100E-6 seconds (100 microseconds).

The OpenMP thread management overhead for the 8-core processor is 10E-6 seconds (10 microseconds).

Taking the threading overhead into account, what would the actual measured speedup be for the benchmark code run on the system using 8 threads?
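One common way to reason about Question 1 is sketched below. It assumes the benchmark divides perfectly across the 8 threads and that the 10 microsecond overhead is paid once on top of the parallel run time; the question does not state this model explicitly, so treat it as an assumption. The function name `speedup_with_overhead` is hypothetical.

```python
def speedup_with_overhead(t_serial, overhead, n_threads):
    """Speedup for perfectly parallel work plus a fixed threading overhead.

    Assumed model (not stated in the question):
        t_parallel = t_serial / n_threads + overhead
    """
    t_parallel = t_serial / n_threads + overhead
    return t_serial / t_parallel


t_serial = 100e-6   # single-thread execution time in seconds
overhead = 10e-6    # OpenMP thread management overhead in seconds

# 100 / (12.5 + 10) = 100 / 22.5, roughly 4.44 rather than the ideal 8
print(speedup_with_overhead(t_serial, overhead, 8))
```

Under this model the overhead nearly doubles the parallel run time (12.5 microseconds of work plus 10 microseconds of overhead), which is why the measured speedup falls well short of 8.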

QUESTION 2

A double precision DAXPY benchmark is being run on a 3GHz Intel CPU system which has a single-issue, scalar pipeline with 14 stages. The typical, or nominally observed, execution time speedup of the double precision DAXPY benchmark compiled using the GCC optimization setting -O3, compared to the execution time of the benchmark using the GCC -O0 setting, is: 1, 7, 14 or 28?

QUESTION 3

The ideal cycles per arithmetic operation (CpOps) achievable on a single-issue, scalar processor with a 10-stage pipeline, using the highest level of compiler optimization, is: 1, 5, 10 or 2.5?

QUESTION 4

Out-of-order instruction processing is a fundamental characteristic of a SuperScalar processor: True or False?

QUESTION 5

The execution time required to compute the 1500 x 1500 square matrix multiplication [A] * [B] = [C] was measured and found to be 16 seconds. The average execution time per arithmetic operation for the computation is:

Note: Give your answer in seconds.
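A minimal sketch of the arithmetic behind Question 5 follows. It assumes the usual textbook operation count for an N x N matrix multiplication of about 2 * N^3 (N^3 multiplies plus roughly N^3 adds); the question does not say which count to use, so this is an assumption. The function name `time_per_operation` is hypothetical.

```python
def time_per_operation(total_seconds, n):
    """Average time per arithmetic operation for an n x n matrix multiply.

    Assumes the approximate operation count 2 * n**3
    (n**3 multiplies and about n**3 adds).
    """
    operations = 2 * n ** 3
    return total_seconds / operations


# 16 s / (2 * 1500**3) = 16 / 6.75e9, about 2.37e-9 seconds per operation
print(time_per_operation(16.0, 1500))
```

Using the exact add count, N^2 * (N - 1), changes the result only in the fourth significant digit for N = 1500.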

QUESTION 6

A processor pipeline with 1 ALU (4-way SIMD enabled), 1 load and 1 store execution units and an L1 cache to CPU bandwidth of 48 Bytes per cycle will likely run the optimized double precision matrix multiplication benchmark code faster than one configured with 1 ALU (4-way SIMD enabled), 1 load and 1 store execution units and an L1 cache to CPU bandwidth of 96 Bytes per cycle: True or False?

QUESTION 7

A multiple level cache hierarchy is a necessary, and required, feature of a SuperScalar, SIMD enabled processor: True or False?

QUESTION 8

A benchmark code has ________....

QUESTION 9

Which of the following is a key element or characteristic of a SuperScalar processor? Multiple level memory hierarchy, CPU clock frequencies greater than 2GHz, Mechanisms for managing and controlling out-of-order instruction execution, or Instruction Set Architecture (ISA)?

QUESTION 10

The number of arithmetic operations (adds and multiplies) required to evaluate a 1500 x 1500 square matrix multiplication [A] * [B] = [C] is:
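The count for Question 10 can be sketched as follows, assuming the standard triple-loop algorithm in which each of the N^2 output elements is a dot product of length N (N multiplies and N - 1 adds). The function name `matmul_op_count` is hypothetical, and courses often round the total to 2 * N^3.

```python
def matmul_op_count(n):
    """Return (multiplies, adds) for a standard n x n matrix multiplication.

    Each of the n**2 output elements needs n multiplies and n - 1 adds.
    """
    multiplies = n ** 3
    adds = n ** 2 * (n - 1)
    return multiplies, adds


multiplies, adds = matmul_op_count(1500)
print(multiplies)         # 3375000000
print(adds)               # 3372750000
print(multiplies + adds)  # 6747750000, close to 2 * 1500**3 = 6750000000
```

If the accumulation into each element of [C] starts from zero and counts N adds per element, the total becomes exactly 2 * N^3 = 6.75E9 operations.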

QUESTION 11

Which of the following is a key feature of OpenMP? Loops, Nested Loops, Threads or Number of pipeline stages?

QUESTION 12

A double precision DAXPY benchmark is being run on a 3GHz Intel CPU system which has a single-issue, scalar pipeline with 14 stages. The ideal execution time speedup of the double precision DAXPY benchmark compiled using the GCC optimization setting -O3, compared to the execution time of the benchmark using the GCC -O0 setting, is: 1, 7, 14 or 28?
