Question:
Newer processors such as Intel's i7 Kaby Lake include support for AVX2 vector/multimedia instructions. Write a dense matrix multiply function using single-precision values and compile it with different compilers and optimization flags. Linear algebra codes using Basic Linear Algebra Subprograms (BLAS) routines such as SGEMM include optimized versions of dense matrix multiply. Compare the code size and performance of your code to that of BLAS SGEMM. Explore what happens when using double-precision values and DGEMM.
Step by Step Answer:
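As a starting point for the first part of the exercise, here is a minimal sketch of a single-precision dense matrix multiply in C. It is not the textbook's solution. The function name `sgemm_naive`, the restriction to square n x n matrices, and the row-major layout are all assumptions made for illustration. The loops run in i-k-j order so that the innermost loop touches `b` and `c` with unit stride, which tends to make compiler auto-vectorization easier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the textbook): computes C = A * B for
   square n x n matrices stored in row-major order, single precision. */
void sgemm_naive(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* Clear the output so the accumulation below starts from zero. */
    for (size_t i = 0; i < n * n; ++i)
        c[i] = 0.0f;

    /* i-k-j order: the innermost loop walks b and c contiguously. */
    for (size_t i = 0; i < n; ++i) {
        for (size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

One way to run the comparison is to build this with, for example, `gcc -O2` and then `gcc -O3 -mavx2 -mfma` (or `-march=native`), inspect the object size with a tool such as `size`, and time it against `cblas_sgemm` from whichever BLAS library is installed. For the double-precision part, the same sketch with `float` replaced by `double` can be compared against `cblas_dgemm`.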
Related Book For
Computer Architecture A Quantitative Approach
ISBN: 9780128119051
6th Edition
Authors: John L. Hennessy, David A. Patterson