Answered step by step
Verified Expert Solution
Question
1 Approved Answer
( 5 pts ) Implement in Python a class to model hash functions from a certain family H of hash functions. ( Call this class
pts Implement in Python a class to model hash functions from a certain family H of hash functions. Call this class HashFamily. The constructor for H should take n and p as arguments. The constructor should generate a random set of p distinct indices from n and remember this set, call it S This class should have a method hashx which computes the hash value of any nbit number x as follows. The method first extracts xi : i in S It then formulates the pbit number obtained from this set in order of increasing index and returns this pbit number in decimal. Added : See Q&A On this question at the end of this document. pts Implement in Python a Bloom Filter BF as a class. The constructor should take m k and n as arguments, where m and k are as done in class and n is the number of bits in an input. Its okay to limit m to be a power of The constructor should then create k random hash functions as HashFamily objects with arguments nn and plogm It maintains these hash functions in its state. Support insert and lookup as done in class. Support a bash insert method a wrapper around insert as a convenience method see next question pts In this question, the aim is to empirically study how various parameter settings impact the false positive rate on simulated data. Generate bit numbers at random and store this list somewhere. Call it S for m in : for k in m: Create a BF with params mm kk n for q in mmmmmmm Generate q nbit numbers and batch insert them all into the BFNote that these are inserted into an existing BF Generate random nbit numbers from scratch and lookup each of them in the BF Calculate the proportion of these K numbers on which the BF answers YES. The higher this proportion the higher the false positive rate. Report your results in a table whose rows correspond to m k q one row per triplet. Analyze your results to draw meaningful conclusions about the impact of the various parameter settings on the false positive rate. Q: Having difficulty understanding the requirements for problem of HW Ask: explain a bit more with an example or what exactly the class functionality should be A: h HashFamilynp # After the constructor has executed, hS is a random element subset of eg hS Next, consider hhashx where x is an nbit binary number, represented however you wish eg as bit array Say x For hS the bits in x that hS touches are in bold next. Projecting out these bits forms the binary number which in decimal is So hash.x returns
pts Implement in Python a class to model hash functions from a certain family H of hash functions. Call this class HashFamily. The constructor for H should take n and p as arguments. The constructor should generate a random set of p distinct indices from n and remember this set, call it S This class should have a method hashx which computes the hash value of any nbit number x as follows. The method first extracts xi : i in S It then formulates the pbit number obtained from this set in order of increasing index and returns this pbit number in decimal. Added : See Q&A On this question at the end of this document.
pts Implement in Python a Bloom Filter BF as a class.
The constructor should take m k and n as arguments, where m and k are as done in class and n is the number of bits in an input. Its okay to limit m to be a power of The constructor should then create k random hash functions as HashFamily objects with arguments nn and plogm It maintains these hash functions in its state.
Support insert and lookup as done in class.
Support a bash insert method a wrapper around insert as a convenience method see next question
pts In this question, the aim is to empirically study how various parameter settings impact the false positive rate on simulated data.
Generate bit numbers at random and store this list somewhere. Call it S
for m in :
for k in m:
Create a BF with params mm kk n
for q in mmmmmmm
Generate q nbit numbers and batch insert them all into the BFNote that these are inserted into an existing BF
Generate random nbit numbers from scratch and lookup each of them in the BF Calculate the proportion of these K numbers on which the BF answers YES. The higher this proportion the higher the false positive rate.
Report your results in a table whose rows correspond to m k q one row per triplet.
Analyze your results to draw meaningful conclusions about the impact of the various parameter settings on the false positive rate.
Q: Having difficulty understanding the requirements for problem of HW Ask: explain a bit more with an example or what exactly the class functionality should be
A: h HashFamilynp #
After the constructor has executed, hS is a random element subset of eg hS
Next, consider hhashx where x is an nbit binary number, represented however you wish eg as bit array Say x For hS the bits in x that hS touches are in bold next. Projecting out these bits forms the binary number which in decimal is So hash.x returns
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access with AI-Powered Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started