1 Background

In this assignment you will practice predicting and verifying the impact of the "nature" of the data on the run-time of sorting algorithms. As we have seen with algorithms like insertion sort, the runtime (even the worst case, expressed as Big-Oh) can be affected by the nature of the input. To do this, we will create three different types of input data that may give different results when sorted. Two sorting algorithms will then be benchmarked on these three types of data. Each algorithm will be run twice, on different dataset sizes, to get times that we can use to apply the doubling formula (see slide 23, "Modeling Small Datasets", in the Analysis of Algorithms slide deck for details on the doubling formula). The doubling formula is $\lg \frac{T(2N)}{T(N)} = b$, where $T(N)$ is the measured run-time on an input of size $N$. If we compute the formula, we will be able to figure out the algorithm's Big-Oh for a particular type of input data, since it will be $O(n^b)$; b is simply the power.

2 Requirements [30 points]

For this assignment you will be writing code to generate test data and benchmark sorting algorithms on it (edited from Sedgewick and Wayne: 2.1.36). First, write a series of methods that generate test data that is non-uniform:

- Half the data is 0s, half 1s. For example, an input of length 8 might look like [0, 1, 1, 0, 0, 1, 0, 1]. [4 points]
- Half the data is 0s, half the remainder is 1s, half the remainder is 2s, half the remainder is 3s, and so forth. For example, an input of length 8 might look like [0, 0, 1, 3, 0, 1, 2, 0]. [5 points]
- Half the data is 0s, half random int values (you can use nextInt() from Java's Random class). For example, an input of length 8 might look like [0, 138617093, 0, 54119567, 0, 0, 4968, -650736346]. [4 points]

Each of these three techniques should be implemented as a static method that takes an integer representing the size of a dataset and returns an integer array containing that number of elements, generated with the corresponding rule. Randomize (shuffle) the contents of the array after you populate it.
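The three generation rules above could be sketched roughly as follows. This is only one possible reading of the requirements: the class name `DataGenerators`, the method names, and the choice to round the "half the remainder" count up (so the last slot is always filled) are assumptions made for illustration, not part of the assignment.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical class and method names; the assignment does not prescribe them.
class DataGenerators {
    private static final Random RNG = new Random();

    // Fisher-Yates shuffle, in place.
    static void shuffle(int[] a) {
        for (int i = a.length - 1; i > 0; i--) {
            int j = RNG.nextInt(i + 1);
            int tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

    // Rule 1: half 0s, half 1s (for odd n the extra element is a 1).
    static int[] halfZerosHalfOnes(int n) {
        int[] a = new int[n];               // Java zero-initializes int arrays
        for (int i = n / 2; i < n; i++) a[i] = 1;
        shuffle(a);
        return a;
    }

    // Rule 2: half 0s, half the remainder 1s, half the remainder 2s, ...
    // Assumption: each count is rounded up so the loop always terminates.
    static int[] halvingValues(int n) {
        int[] a = new int[n];
        int index = 0, value = 0, remaining = n;
        while (remaining > 0) {
            int count = (remaining + 1) / 2;
            for (int k = 0; k < count; k++) a[index++] = value;
            remaining -= count;
            value++;
        }
        shuffle(a);
        return a;
    }

    // Rule 3: half 0s, half arbitrary int values from Random.nextInt().
    static int[] halfZerosHalfRandom(int n) {
        int[] a = new int[n];
        for (int i = n / 2; i < n; i++) a[i] = RNG.nextInt();
        shuffle(a);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(halfZerosHalfOnes(8)));
        System.out.println(Arrays.toString(halvingValues(8)));
        System.out.println(Arrays.toString(halfZerosHalfRandom(8)));
    }
}
```

With n = 8, the second rule yields four 0s, two 1s, one 2, and one 3, which matches the counts in the example array given for that rule. The order differs from run to run because of the shuffle.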
Using the three methods you implemented, develop and test hypotheses about the effect of input on the performance of two of the algorithms (your choice) we have covered. See the course git repository for implementations.

The program should contain your hypotheses (3 per algorithm) as comments: describe what you think the running time will look like ($O(n)$? $O(n^2)$? $O(n^3)$?) on each data set, and briefly explain why you think that. As long as your ideas make sense, and you do the analysis prior to benchmarking, you will receive full credit on the hypotheses. [5 points]

For each of the two sorting algorithms, your program should run them on the three types of test data. Test them with dataset sizes of 2048 and 4096. Time each of these twelve tests with the Stopwatch class given in class. (If your system is so fast that you don't get good results, you may increase the dataset size.) [6 points]

The program needs to compute the result of the doubling formula on the run times from the 2048 and 4096 result pairs to get the power (b) for that algorithm on that type of input, and then display it. Six different values should be shown if you have properly implemented all of the tests. [6 points]
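As a rough illustration of the doubling computation, the sketch below times one sort at sizes 2048 and 4096 and prints b = lg(T(2N) / T(N)). Several pieces are stand-ins rather than the course's code: `System.nanoTime()` replaces the Stopwatch class given in class, a plain insertion sort replaces the repository implementations, and the names `DoublingRatio`, `doublingExponent`, and `timeSortSeconds` are hypothetical.

```java
import java.util.Random;

// Hypothetical helper class; all names here are illustrative only.
class DoublingRatio {
    // b = lg(T(2N) / T(N)), computed with a change of base to log base 2.
    static double doublingExponent(double timeN, double time2N) {
        return Math.log(time2N / timeN) / Math.log(2.0);
    }

    // Stand-in sort; the assignment uses the course repository's implementations.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Stand-in for the Stopwatch class: elapsed wall-clock time in seconds.
    static double timeSortSeconds(int[] data) {
        long start = System.nanoTime();
        insertionSort(data);
        return (System.nanoTime() - start) / 1e9;
    }

    // Purely random input, used here only to exercise the formula.
    static int[] randomData(int n, Random rng) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = rng.nextInt();
        return a;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double t1 = timeSortSeconds(randomData(2048, rng));
        double t2 = timeSortSeconds(randomData(4096, rng));
        System.out.printf("b = %.2f%n", doublingExponent(t1, t2));
    }
}
```

If doubling the input quadruples the time, the exponent comes out near 2; if it only doubles the time, near 1. Measured values are noisy at these sizes (JIT warm-up, timer resolution), so the printed b will rarely be an exact integer.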

3 Testing

The main functionality to test is the set of methods that generate test data. You will want to run them multiple times, on different sizes, and display their output. Check that the output matches the patterns required above. There isn't much else to test for this homework, since the algorithms you are benchmarking have already been tested for correctness. Optionally, you may want to try giving an input of purely random numbers to each of the algorithms and checking whether your doubling formula code gives the algorithm's expected Big-Oh.
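One way to check generated output against the required patterns, instead of reading arrays by eye, is to tally how often each value occurs. The sketch below is an assumption about how such a check might look; `PatternCheck` and `histogram` are hypothetical names, and the sample arrays are the examples quoted in the requirements.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical checker; not part of the assignment's required code.
class PatternCheck {
    // Counts occurrences of each value, sorted by value.
    static Map<Integer, Integer> histogram(int[] a) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int x : a) counts.merge(x, 1, Integer::sum);
        return counts;
    }

    // True when 0 makes up at least half of the array (integer division for odd lengths).
    static boolean atLeastHalfZeros(int[] a) {
        return histogram(a).getOrDefault(0, 0) >= a.length / 2;
    }

    public static void main(String[] args) {
        int[] halving = {0, 0, 1, 3, 0, 1, 2, 0};
        System.out.println(histogram(halving));        // prints {0=4, 1=2, 2=1, 3=1}
        System.out.println(atLeastHalfZeros(halving)); // prints true
    }
}
```

The same histogram applied to output from each generator at several sizes shows at a glance whether the counts halve as expected.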
