
Question


Ignore the test file; just provide some pseudocode.


2. A good hash function $h(x)$ behaves in practice very close to the uniform hashing assumption analyzed in class, but is a deterministic function. That is, $h(x) = k$ each time $x$ is used as an argument to $h(\cdot)$. Designing good hash functions is hard, and a bad hash function can cause a hash table to quickly exit the sparse loading regime by overloading some buckets and under-loading others. Good hash functions often rely on beautiful and complicated insights from number theory, and have deep connections to pseudorandom number generators and cryptographic functions. In practice, most hash functions are moderate to poor approximations of uniform hashing.

Consider the following hash function. Let $U$ be the universe of strings composed of the characters from the alphabet $\Sigma = \{A, \ldots, Z\}$, and let the function $f(x_i)$ return the index of a letter $x_i \in \Sigma$, e.g., $f(A) = 1$ and $f(Z) = 26$. Finally, for an $m$-character string $x \in \Sigma^m$, define $h(x) = \left(\sum_{i=1}^{m} f(x_i)\right) \bmod \ell$, where $\ell$ is the number of buckets in the hash table. That is, our hash function sums up the index values of the characters of a string $x$ and maps that value onto one of the $\ell$ buckets.

(a) (10 pts) The following list contains US Census derived last names: http://www2.census.gov/topics/genealogy/2000surnamesames.zip (We have also provided a copy of the CSV file from this ZIP file in the assignment on Moodle.) Using these names as input strings, first choose a uniformly random 50% of these name strings and then hash them using $h(x)$. Produce a histogram showing the corresponding distribution of hash locations when $\ell = 175$. Label the axes of your figure. Briefly describe what the figure shows about $h(x)$, and justify your results in terms of the behavior of $h(x)$. Do not forget to submit your code with your PS, using the same filename convention as for problem sets: Lastname-Firstname-MMDD-PS5-code.* (where the * can be any plaintext file type you like).
Hint: the raw file includes information other than name strings, which will need to be removed. Also, think about how you can count hash locations without building or using a real hash table.
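
The steps described in the problem and the hint can be sketched in Python roughly as follows. This is a minimal sketch, not the course's reference solution. The function names (`letter_index`, `h`, `load_names`, `bucket_counts`, `plot_histogram`), the fixed random seed, and the use of matplotlib for the figure are all choices made here for illustration. It also assumes that the CSV has one header row and keeps the surname in its first column, which should be checked against the actual file. Following the hint, it counts hash locations with a plain list of counters rather than a real hash table.

```python
import csv
import random


def letter_index(ch):
    """f(x_i): index of a letter, with f('A') = 1 and f('Z') = 26."""
    return ord(ch) - ord("A") + 1


def h(name, num_buckets):
    """Sum the letter indices of the string and reduce modulo the bucket count."""
    return sum(letter_index(ch) for ch in name) % num_buckets


def load_names(path):
    """Read name strings from the CSV, dropping everything else.

    Assumption: one header row, surname in the first column.
    Rows whose first field is not purely A-Z are skipped.
    """
    names = []
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if not row:
                continue
            name = row[0].strip().upper()
            if name and all("A" <= ch <= "Z" for ch in name):
                names.append(name)
    return names


def bucket_counts(names, num_buckets, fraction=0.5, seed=0):
    """Hash a uniformly random fraction of the names and count hits per bucket."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    sample = rng.sample(names, int(len(names) * fraction))
    counts = [0] * num_buckets  # one counter per bucket; no hash table needed
    for name in sample:
        counts[h(name, num_buckets)] += 1
    return counts


def plot_histogram(counts):
    """Bar chart of names per bucket (requires matplotlib, an assumed dependency)."""
    import matplotlib.pyplot as plt

    plt.bar(range(len(counts)), counts)
    plt.xlabel("Hash location (bucket index)")
    plt.ylabel("Number of names hashed to bucket")
    plt.title("Distribution of hash locations, %d buckets" % len(counts))
    plt.show()
```

A typical run would be `plot_histogram(bucket_counts(load_names("names.csv"), 175))`, where `names.csv` is a placeholder for whatever the extracted file is called. When interpreting the figure, note that the sum for an m-character name is at most 26m, so the raw sums cannot spread far beyond what name lengths allow.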
