Question
This lab revisits sorting algorithms. You will: Implement and analyze two sort algorithms: InsertionSort MergeSort Using programming conventions (or patterns) that allow the performance of
This lab revisits sorting algorithms. You will:
Implement and analyze two sort algorithms:
InsertionSort
MergeSort
Using programming conventions (or patterns) that allow the performance of your implementations of the algorithms to be measured.
Use the make development tool to keep your project up to date.
5. Theory
5.1. The Insertion Sort Algorithm
You are required to analyze and implement another sorting strategy: InsertionSort. The
InsertionSort algorithm applied to a deck of cards can be described as:
Step 1:
If there are no cards to sort, then Stop.
Step 2:
Otherwise, remove the top card from the unsorted pile, figure out where it should go in the sorted pile and insert it there.
Step 3:
Go back to step 1.
The important characteristics of the algorithm are:
As it runs, it divides the deck into a sorted portion (initially empty) and an unsorted portion (initially the whole deck).
Each time Step 2 is performed, the sorted portion has one more card and the unsorted portion has one less card.
Step 2 is performed n times (where n is the number of cards in the deck).
Step 2 describes how this is done: in this case, get the first unsorted card and insert it into the proper position in the sorted part.
It is useful to compare this algorithm with the similar SelectionSort algorithm.
Step 1:
If there are no cards to sort, then Stop.
Step 2:
Otherwise, find the smallest card, remove it and place it on top of the sorted card pile.
Step 3:
Go back to step 1.
It should be evident that both algorithms share some basic characteristics. Both increase the sorted portion by one each time Step 2 is performed. They differ only in the way this is done:
SelectionSort has to examine all the cards in the unsorted portion, remove the smallest one and then place it at the end of the sorted portion.
InsertionSort takes the first card from the unsorted portion (i.e. it does not have to scan through the unsorted cards) and inserts it into the sorted portion (and it has to scan the sorted portion to figure out where it should go).
5.2. Implementing sort algorithms with metrics
The algorithms described previously use the deck-of-cards metaphor. This may be useful for human understanding of the algorithm, but is incomprehensible to computers.
In this lab, we require that all sort algorithms be implemented in C using a function mySort that sorts an array (or a subarray) of int's. The signature for mySort() is:
void mySort(int data[], unsigned int first, unsigned int last); |
Note |
The signature for mySort is in the file mySort.h. An implementation of the mySort function must modify the data array such that all elements in the range data[first], data[first+1]...data[last] are in sorted order. |
5.2.1. Example: SelectionSort implementation
The SelectionSort algorithm is described in Chapter 2 of Introduction to Algorithms, exercises 2.2-2 on page 27. It is strongly recommended that you review the idea of SelectionSort.
5.2.1.1. The initial implementation
For reference, the initial implementation of SelectionSort is shown below:
void mySort(int array[], unsigned int first, unsigned int last) { int i; /* Step 1: Is there nothing to sort? */ while (first < last) /* Step 2: Swap... */ for(i = first+1; i <= last; i++) { /* Find smallest one in rest of array */ if(array[first] > array[i])) { /Step 2..continued...swap them */ int tmp; tmp = array[first] array[first] = array[i]; array[i] = tmp; } first++; } return; |
5.2.2. Using metrics
All sort algorithms require that elements in the array to be sorted can be compared with something else of the same data type. Indeed, the number of times an algorithm needs to compare a data item with something else (of the same type) is fundamental metric (i.e. measure-of-performance or benchmark) that is often used to evaluate different sort algorithms.
You are provided with a framework that counts the total number of times that data elements are compared. Although you do not need to write (or even understand) this metrics framework, you must respect its contract.
In order to respect the contact so that the total number of comparisons is tracked, it is forbidden to compare a data element with something else using C comparison operators (such as ==, <, >, etc.) Instead, you must use the myCompare() function.
Note: Replacing (data[j] |
Consider the following code that directly compares a data element with something else: if (data[j] < foo) { The myCompare(x, y) function returns a negative number if and only if x < y. Hence, the previous code can transformed to: if (myCompare(data[i], foo) < 0) { |
In addition to comparing elements in the data array, all sort algorithms also have to change the positions of individual data elements. In order to be able to keep count of these operations, we require that elements be moved using the myCopy() or mySwap() functions.
Note: Replacing "data[i] = tmp" with myCopy() |
For example, instead of writing something like: data[i] = tmp; you would use the myCopy(): myCopy(&tmp, &data[i]); |
Note: Using mySwap() to interchange elements |
Step 2 of SelectionSort was initially implemented in C (as shown previously) with: int tmp; tmp = array[first] array[first] = array[i]; array[i] = tmp; The effect of this code was to interchange (or swap) the elements array[first] and array[i]. The same result can be achieved with the mySwap() function: mySwap(&array[first], &array[i]); |
5.3. Using make
Unlike previously labs where you had to type in the compile and link commands, you can perform these steps efficiently with the single command make.
6. Suggested approach
There are several requirements in the lab. A suggested approach is outlined below.
6.1. Use make
The files furnished to you include stub implementations of mySort() and a real main() function.
By default, the command make ensures that the commands insertionSort and mergeSort exist and correspond to the most recent modifications to source code files. The command make works "right out of the box" (which you should have verified in your prelab).
Although the "out-of-the-box" commands are compiled without error, they do not really work! (Actually, they do work if they are asked to sort nothing. Try it! (i.e. the command insertionSort < /dev/null will "sort nothing", write "nothing" to stdout and write statistics to stderr indicating that no comparisons, move, or swaps were needed to "sort nothing".)
6.2. Implement InsertionSort without metrics
As described previously, the InsertionSort sort algorithm is quite similar to SelectionSort.
6.3. Modify InsertionSort to include metrics
Once you are reasonably confident that the command insertionSort works, modify the implementation so that the metric functions are used.
The most important measurement is the number of comparisons; start by using myCompare() instead of direct comparisons. When this works, the output should still be sorted and the statistics (displayed to stderr) will indicate the number of comparisons required.
Finally, replace data element movements with myCopy() and/or mySwap(). The statistics should now reflect the number of copies/swaps performed by the algorithm.
6.4. Complete InsertionSort theoretical analysis
You should now attempt to complete a theoretical analysis of the number of compares/copies/swaps that your implementation performs for worst-, best- and average-case inputs of n. Ideally, you should develop an equation (a function of n) that gives these metrics.
6.5. Implement and analyze MergeSort
You should aim to complete the implementation and analysis of InsertionSort in the first week of the lab.
MergeSort is a bit trickier to implement and analyze. Note, in particular, that the algorithm itself is actually very easy to implement in C (although, of course, it does use recursion). The tricky part is the merge sub-algorithm.
Although the merge is not itself recursive, it does require at least one temporary array and it does require a fair amount of copying of elements to an from the temporary. Note, however, that it is not necessary to use dynamic memory allocation functions (such as malloc()); you can simply declare a local variable for your temporary array as (for example):
int temp[MAX_SIZE_N_TO_SORT}; |
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started