
Question


As usual, you have been provided with a file lab10.h with headers for the functions described, lab10c.c with analogous C implementations, and tests.c with reasonably thorough tests of the functions as described.
Dot Product, again
The dot product (pairwise product, and sum) that we did in Lab 7 makes more sense on floating point values.
In lab10.S, write a function dot_double that calculates the dot product of two arrays, as before but on double-precision floating point arrays.
An equivalent dot_double_c has been provided in lab10c.c.
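For orientation, here is a minimal C sketch of the operation being asked for. The name dot_double_sketch and its parameter list are assumptions for illustration only; the real prototype is the one in lab10.h, and the provided dot_double_c may be written differently.

```c
#include <stdint.h>

/* Hypothetical name and signature: check lab10.h for the real prototype. */
double dot_double_sketch(const double *a, const double *b, uint64_t length) {
    double sum = 0.0;                       /* running total */
    for (uint64_t i = 0; i < length; i++) {
        sum += a[i] * b[i];                 /* pairwise product, then accumulate */
    }
    return sum;
}
```

In assembly, the loop body would typically become one multiply and one add on double-precision values, with the array index scaled by 8 bytes per element.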
Polynomial evaluation, again
We previously wrote a function that evaluated a cubic polynomial on a single x value, using the equivalent of this expression: x*(x*(a*x + b)+ c)+ d.
This time, we want to apply this operation to an array of double x values, writing the result to another (equally-sized) array. That is, for each array element, do the above calculation and put the result in the corresponding position in the "output" array.
In lab10.S, write a function map_poly_double with this signature (the coefficients will be constant):
void map_poly_double(double* input, double* output, uint64_t length, double a, double b, double c, double d);
Equivalent map_poly_double_c1 and map_poly_double_c2 have been provided, with the two expressions we used in the previous exercise, for comparison.
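As a rough sketch of the per-element calculation, the following C function uses the nested expression quoted above with the signature given for map_poly_double. The name map_poly_double_sketch is hypothetical, and the provided map_poly_double_c1 and map_poly_double_c2 may differ in detail.

```c
#include <stdint.h>

/* Hypothetical name; parameters mirror the map_poly_double signature. */
void map_poly_double_sketch(double *input, double *output, uint64_t length,
                            double a, double b, double c, double d) {
    for (uint64_t i = 0; i < length; i++) {
        double x = input[i];
        /* a*x^3 + b*x^2 + c*x + d, written in nested form */
        output[i] = x * (x * (a * x + b) + c) + d;
    }
}
```

The nested form needs three multiplications and three additions per element, and the coefficients stay unchanged across iterations.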
That, but Single Precision
Maybe single-precision floating point operations are faster?
In lab10.S, write functions dot_single and map_poly_single that are equivalent to the above but work on single-precision floating point values (float). This should be as simple as swapping double-precision instructions for their single-precision equivalents, and changing the element size from 8 to 4.
Some single-precision instructions that I found useful: movd, addss, mulss.
As before, there are dot_single_c and map_poly_single_c in lab10c.c for comparison.
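The single-precision variants might look like this in C. Both names and both signatures are assumptions made by analogy with the double-precision functions; lab10.h has the actual prototypes.

```c
#include <stdint.h>

/* Hypothetical float counterpart of the dot product. */
float dot_single_sketch(const float *a, const float *b, uint64_t length) {
    float sum = 0.0f;                       /* accumulate in single precision */
    for (uint64_t i = 0; i < length; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Hypothetical float counterpart of the polynomial map. */
void map_poly_single_sketch(float *input, float *output, uint64_t length,
                            float a, float b, float c, float d) {
    for (uint64_t i = 0; i < length; i++) {
        float x = input[i];
        output[i] = x * (x * (a * x + b) + c) + d;
    }
}
```

The only differences from the double versions are the element type and, in assembly, the 4-byte element size used when indexing.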
Time It
The provided timing.c contains some timing tests on reasonably-sized arrays. Have a look. How does your code compare to what the compiler wrote? (Use -O3 to give the compiler its best chance.)
Why not x87?
As mentioned in lecture, we aren't using the x87 instructions in this course. Why not? It seems like there are a lot of programmer-friendly instructions there (loading constants, trigonometry, logarithms, they work on a stack and everybody loves stacks).
This function (with C signature void sin_x87(double* input, double* output, uint64_t length)) calculates the sine of each element of an array of double and fills another (equally-sized) array with the results. You don't need to worry about the details, except that it uses x87 floating point instructions to do the calculation. You can copy it into your lab10.S.
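sin_x87:
    mov $0, %rcx                /* i = 0 */
s87_loop:
    cmp %rdx, %rcx              /* compare i with length */
    jae s87_ret                 /* done when i >= length */
    fldl (%rdi, %rcx, 8)        /* push input[i] onto the x87 stack */
    fsin                        /* replace the top of the stack with its sine */
    fstpl (%rsi, %rcx, 8)       /* pop the result into output[i] */
    inc %rcx                    /* i++ */
    jmp s87_loop
s87_ret:
    ret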
Create a program sin.c and write a C implementation sin_stdlib that does the same operation, but using the C standard library's sin function. You will have to include math.h and add -lm to your compiling/linking command.
The sin function you're calling isn't using any trig instructions, but is doing something different. Is it faster?
You have not been provided with testing or timing code for this part of the exercise. That is deliberate. Write a main function in your sin.c that produces output relevant to testing/timing the implementations (i.e. what you need to convince yourself your code is correct and answer the question below). The command to compile it should be like this:
gcc ... lab10c.c lab10.S sin.c -lm
Questions
Answer these questions in a text file answers.txt.
What was the running time of the dot product implementations? Assembly vs the compiler, and single- vs double-precision?
Same question, but for the polynomial evaluation? (Prediction: the differences should be much more obvious here.)
What is the relative running time of the x87-based sine calculation vs the C implementation (that uses its own implementation of the function)?
