Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Lab 8: Hashing Strings In this lab you will write a hashcode method that converts a String to an integer. Given a hashtable of size

Lab 8: Hashing Strings

In this lab you will write a hashcode method that converts a String to an integer. Given a hashtable of size 1000000 and a file of over 300k words, you will calculate the hashcode of each string and find out many many of them result in collisions (under a separate chaining collision handler so if two words happen to hash to the same index, only that one index is used up from the hash table.)

You will write a hash function: public static int hash(String s, int k)

which takes a String as input and computes the polynomial in k as learned in class. The following excerpt from official Java documentation indicates that Java uses a value k = 31 to computer this polynomial.

Your task is to simulate the hashing of all the English words in the given file using a polynomial for all values of k from 1 to 45. In each of these cases, you are to find the number of collisions that occurred, and also report the size of the largest hash bucket (the largest size of a chain under separate chaining). Your hash function, on k=31, before you take it modulo 1million should result in the same number as Javas .hashCode() function, which is something you should check to see if you are computing the polynomials correctly.

Note that you do not have to implement a completely-functioning hashtable!

Firstly, you do not have to consider any deletions. Secondly, you do not have to store the words at each hash bucket, just the number of words that would be there.

Note also that this computed polynomial will overflow as an integer, causing it to wrap into negative values. You can decide on how to deal with it, but document it in your solution.

Your output should provide the requested information:

k value = 31 resulted in 51789 collisions

k value = 31 maximum bucket size was 6

k value = 32 resulted in 220334 collisions

k value = 32 maximum bucket size was 1543

(If you want to check your values against these values my hash function computed the polynomial value in an int variable, allowing it to overflow, then taking it modulo 1million, and if it resulted in a negative value, adding 1million to it.)

At the end of your code, report the 3 best k-values by measure of total number of collisions and the 3 best k-value by having the best (i.e. smallest) maximum chain size.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database And Expert Systems Applications 33rd International Conference Dexa 2022 Vienna Austria August 22 24 2022 Proceedings Part 1 Lncs 13426

Authors: Christine Strauss ,Alfredo Cuzzocrea ,Gabriele Kotsis ,A Min Tjoa ,Ismail Khalil

1st Edition

3031124227, 978-3031124228

More Books

Students also viewed these Databases questions

Question

8. Do the organizations fringe benefits reflect diversity?

Answered: 1 week ago