Question
6 A Implement the following projects and collect any results in a Word (or other word-processing) doc to be printed out, stapled and handed in
6A
Implement the following projects and collect any results in a Word (or other word-processing) doc to be printed out, stapled and handed in by the due-date. Copy any questions, in bold, into your document so that I have context for your answers. Title and label axes of all graphs, if graphs are required.
Goals for this lab: to determine the difference between our basic search algorithms: UnsortedSequentialSearch, BinarySearch and Hashing. These techniques are all O(n), O(lg n), and ~O(1). So in completing this lab, I would expect you will generate some graphs that are linear, logarithmic-ish, and constant-ish -- respectively.
Lab Instructions:
Algorithm search-timings
Graphically compare timing-results of (Unsorted) Sequential Search, Binary Search, and Hashing for both present and nonpresent items. (Since these timings can be very short, you should search for a fixed number of values for each list-size, then divide that total search-time by the fixed number (1000 in the examples seen below) )
- Using each algorithm above: for each list-size, search for a constant number of items within the list that:
a) are not in the list, and
b) are in the list
- Be sure to divide your timing result by the number of items searched-for in order to get the properly-shaped graph result for: average time to find a single item
- 6B has instructions on setting yourself up for Hashtable time-trials!
- Generate a graph for each pair of timing-series (i.e. present, nonpresent)
- Describe your 3-4 resulting graphs and explain why they are shaped the way they are
Suggested parameters to get you started:
- Run your time-trials with 10 evenly-distributed list-sizes of size 100,000 - 1,000,000 for the sequential and Hashing searches.
- For Binary Search, you might try non-evenly-distributed list-sizes (as shown in class), for example from 1,000 to 1 million (approximately doubling in size each time) to get the full shape of the logarithmic curve.
- e.g. LEN = [1000, 2000, 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000]
- To ensure all-unique numbers, generate a list of random numbers using: random.sample()
- When averaging X iterations of a search over a single list-size, to properly compare apples to apples search for the same two lists of numbers (present, non-present) for each list-size.
- Timing your Hashtable object performance is extremely similar to timing Sequential and Binary searches, but made just a bit more complicated since you have to build an object. Instructions and suggestions in lab-format can be found here.
You should notice that set-up details (such as list-size) for the Hashing time-trials match up exactly with those of Sequential/Binary search lists, so your resulting graphs can be directly comparable.
Suggested Plan of Attack:
A. Make a random list of 2,000,000 integers using random.sample() name: TOTLIST
- divide into two master-lists of size 1,000,000: INS and NOTINS
- Generate a constant-size list of not-ins using random.sample(...)
Example
TOTLIST = random.sample(range(100000000), 2000000)
INS = TOTLIST[:1000000]
NOTINS = TOTLIST[1000000:]
For LEN in range(100000, 1000001, 100000):
Create 2 lists with first LEN elements of INS -- and NOTINS respectively (e.g. linsrchINS = INS[:LEN]
linsrchNotINS = NOTINS[:LEN] )
Create 2 more lists to search for a constant number of values
(e.g. srchNotINS = random.sample(NOTINS, 1000)
srchINS = random.sample(linsrchINS , 1000) )
Get aggregate (i.e. average) search-times of your srch items
Compare search-times of the same lists of values of each size using:
- sequential search
- binary search
- Hashing
algorithms.
Notes:
- Never overwrite INS, NOTINS after creation, but you may choose to create new (slice-extracted) lists as needed (such as a sorted version of INS to do BinarySearch on, for example e.g. binarysrchINS = linsrchINS[:].sort() )
- Be sure to explain the necessary preprocessing steps for the sorted search-types in your report
- Beware of aliasing when manipulating and/or sorting your lists. If aliasing occurs, explain how it happened, how you discovered it, and how you fixed it.
6B
Implement the following projects and collect any results in a Word (or other word-processing) doc to be printed out, stapled and handed in by the due-date. Copy any questions, in bold, into your document so that I have context for your answers. Title and label axes of all graphs, if graphs are required.
Goals for this lab: to explore the Hashing search algorithm, and compare it to SequentialSearch and BinarySearch. These techniques are O(~1), O(n) and O(lg n) respectively. So in completing this lab, I would expect to generate graphs that are constant-ish, linear and logarithmic respectively.
Lab Instructions:
- Implement Hashing storage, and retrieval.
- Make a quickLoad method for your HashTable class
- interface: {ht}.quickLoad(self, listOfValues, loadFactor)
- quickLoad method: resets self.slots to an appropriately-sized list (depending on size of listOfValues and loadFactor). Then fills it with elements from listOfValues.
- choose your favorite collision-resolution method, except: do not use the list-of-lists technique shown in your textbook (unless it is an extra trial in addition to a more traditional technique).
- show/explain the code for your rehash method in your lab report
- Compare average search-time of a given item using Hashing (with two load-levels: 50% and 99%)
- Generate search-timing series for both present items, and non-present items
- This can probably be done in two graphs: one for present, another for non-present items. Each graph contains two series: 50% and 99% timings.
- Make an additional graph containing all four HashTable timing-series together
- Graphically compare HT results to (Unsorted) Sequential Search, and Binary Search (again, using average per-item timing for both present and nonpresent items).
- Come up with two relevant, testable questions (i.e. hypotheses) and test-procedures; Come up with your hypothesis; then run your tests, determine your average results, graph them, revisit your hypothesis and explain your results.
Example question 1: Does search using Hashing, a presumably O(1) (constant-time) algorithm, really execute faster than Binary Search, a O(lg n) technique?
- The Neville Hypothesis: Don't be ridiculous! How could it possibly?!?
Example question 2: Does search using Hashing, a presumably O(1) (constant-time) algorithm, really execute faster than Sequential Search, a O(n) technique?
- The Everybody Hypothesis: Sounds legit.
Suggested parameters to get you started:
- Test with 10 list-sizes of size 100,000 - 1,000,000.
- To ensure all-unique numbers, generate a list of random numbers using: random.sample()
- To properly compare apples to apples, search for the same two lists of numbers (present, non-present) in each case.
Suggested Plan of Attack:A. Make a HashTable class
- use a simple Python-List implementation, not a list-of-lists.
- default list-size: 100 name: self.slots
- quickLoad method: resets self.slots to an appropriately-sized list (depending on size of listvals and loadFactor). Then fills it with elements from listVals.
- What is an "appropriate size" for your Hashtable list? That depends on your loadFactor parameter! To have a (loadFactor)%-full Hashtable after storing N values, Make a Hashtable of length N//LoadFactor.
B. Make a random list of 2,000,000 integers using random.sample() name: TOTLIST
- divide into two master-lists of size 1,000,000: INS and NOTINS
For LEN in range(100000, 1000001, 100000):
Create 2 HTs with first LEN elements of INS -- 0.50 and 0.99 loadFactor respectively (e.g. ht.QuickLoad(INS(:LEN), 0.50)
myINS = INS(:LEN)
myNotINS = NOTINS(:LEN) )
Find average search-time of LEN NOTINS items vs. INS items
Compare to search-times of these sizes of lists using:
- sequential search
- binary search
Hints:
- Never overwrite INS, NOTINS after creation, but you may choose to create new (derived) lists as needed
- Be sure to explain the necessary preprocessing steps for the sorted search-types in your report
- Beware of aliasing when manipulating and/or sorting your lists. If aliasing occurs, explain how it happened, how you discovered it, and how you fixed it.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started