Question
Ignore the test file; just provide some pseudocode.
2. A good hash function h(x) behaves in practice very close to the uniform hashing assumption analyzed in class, but is a deterministic function (that is, h(x) = k each time x is used as an argument to h). Designing good hash functions is hard, and a bad hash function can cause a hash table to quickly exit the sparse loading regime by overloading some buckets and under-loading others. Good hash functions often rely on beautiful and complicated insights from number theory, and have deep connections to pseudorandom number generators and cryptographic functions. In practice, most hash functions are moderate to poor approximations of uniform hashing.

Consider the following hash function. Let $U$ be the universe of strings composed of the characters from the alphabet $\Sigma = \{A, \ldots, Z\}$, and let the function $f(x_i)$ return the index of a letter $x_i \in \Sigma$, e.g., $f(A) = 1$ and $f(Z) = 26$. Finally, for an $m$-character string $x \in \Sigma^m$, define

$$h(x) = \left( \sum_{i=1}^{m} f(x_i) \right) \bmod \ell,$$

where $\ell$ is the number of buckets in the hash table. That is, our hash function sums up the index values of the characters of a string $x$ and maps that value onto one of the $\ell$ buckets.

(a) (10 pts) The following list contains US Census derived last names: http://www2.census.gov/topics/genealogy/2000surnamesames.zip (We have also provided a copy of the CSV file from this ZIP file in the assignment on Moodle.) Using these names as input strings, first choose a uniformly random 50% of these name strings and then hash them using h(x). Produce a histogram showing the corresponding distribution of hash locations when $\ell = 175$. Label the axes of your figure. Briefly describe what the figure shows about h(x), and justify your results in terms of the behavior of h(x). Do not forget to submit your code with your PS, using the same filename convention as for problem sets: Lastname-Firstname-MMDD-PS5-code.* (where the * can be any plaintext file type you like).
Hint: the raw file includes information other than name strings, which will need to be removed; and, think about how you can count hash locations without building or using a real hash table.
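Since the request is for pseudocode, the steps in part (a) can be sketched in Python along the following lines. This is only a minimal sketch under stated assumptions, not a reference solution: the function names (`letter_index`, `h`, `clean`, `load_names`, `bucket_counts`) are made up for illustration, the CSV column name `name` is an assumption about the file layout, and dropping characters outside A to Z is one possible way to handle entries that are not pure letter strings. Following the hint, hash locations are counted in a plain list of length ℓ rather than in a real hash table.

```python
import csv
import random


def letter_index(ch):
    """f(x_i): 1-based position in the alphabet, so f('A') = 1 and f('Z') = 26."""
    return ord(ch) - ord("A") + 1


def h(name, num_buckets):
    """h(x) = (sum of f(x_i) over the characters of x) mod num_buckets."""
    return sum(letter_index(ch) for ch in name) % num_buckets


def clean(raw):
    """Uppercase the string and keep only A-Z.

    Assumption: characters outside the alphabet are simply dropped.
    """
    return "".join(ch for ch in raw.upper() if "A" <= ch <= "Z")


def load_names(path, column="name"):
    """Read one column of the CSV and discard everything else.

    Assumption: the file has a header row and the surname column is
    called `name`; adjust `column` to match the actual file.
    """
    with open(path, newline="") as fh:
        names = [clean(row[column]) for row in csv.DictReader(fh) if row.get(column)]
    return [n for n in names if n]  # skip entries that cleaned down to nothing


def bucket_counts(names, num_buckets=175, fraction=0.5, seed=None):
    """Hash a uniformly random fraction of the names and count hits per bucket.

    No hash table is built: counts[k] is just the number of sampled
    names x with h(x) == k.
    """
    rng = random.Random(seed)
    sample = rng.sample(names, int(len(names) * fraction))  # without replacement
    counts = [0] * num_buckets
    for name in sample:
        counts[h(name, num_buckets)] += 1
    return counts
```

The returned list can be passed to any plotting tool as a bar chart, with bucket index (0 to ℓ − 1) on the horizontal axis and number of names on the vertical axis. One point that may help with the justification: a string of length m has a letter sum between m and 26m, so with ℓ = 175 any name of six or fewer letters has a sum of at most 156 and never wraps around the modulus. Short names can therefore only land in the lower-numbered buckets.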