Question
Hi, this is a C++ program that I have not been able to figure out so far. Any help would be much appreciated!
"A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not; in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a "counting" filter); the more elements that are added to the set, the larger the probability of false positives."
The Bloom filter itself is stored as an array of m Boolean values, which all start out as false. To add an object (in our case, a string) to the filter, we compute k hash functions for the string and set the bits at the resulting hash indices to true.
To test if a string is in the filter, we compute the k hash functions for that string and check to see if the values stored at those locations in the filter are true or false. If any of them are false, then the string is definitely not in the set, but if they are all true then the string is probably in the set.
Fairly rigorous analysis has been done on the error rates of Bloom filters (some of which is described in the Wikipedia article), but a quick rule of thumb is that using 10 bits per item stored in the filter and 7 hashes (k=7) will result in a false positive rate of about 1%.
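The rule of thumb above can be checked against the standard approximation for the false positive rate, (1 - e^(-k/b))^k, where b is the number of bits per stored item. The following is only a small sketch; the function name falsePositiveRate is made up for illustration and is not part of the assignment.

```cpp
#include <cmath>

// Approximate Bloom filter false positive rate: (1 - e^(-k/b))^k,
// where b = m/n is the number of bits per stored item and k is the
// number of hash functions. This is an approximation, not an exact value.
// falsePositiveRate is a hypothetical helper name.
double falsePositiveRate(double bitsPerItem, int k) {
    double fractionOfBitsSet = 1.0 - std::exp(-static_cast<double>(k) / bitsPerItem);
    return std::pow(fractionOfBitsSet, k);
}
```

With 10 bits per item and k = 7, this evaluates to roughly 0.008, which agrees with the "about 1%" figure.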
The final functionality needed to implement the Bloom filter is the hash functions themselves. For a string s with letters s_0 ... s_{n-1}, a positive integer p, and a Bloom filter of size m, we can define a hash h_p(s) as:
$h_p(s) = (p^0 s_0 + p^1 s_1 + \cdots + p^{n-1} s_{n-1}) \bmod m$
where s_i is the ASCII value of the letter (in other words, you just treat the char as an integer in the calculation). Recall that the % operator computes the MOD function in C++. Also note that to prevent overflow, you should apply the % operator to your result after every operation. In addition, you can't use the pow function to get the powers of p, since the powers of p can themselves overflow; build each power from the previous one instead, reducing with % as you go.
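As a sketch of why reducing after every operation is safe, the two standard modular identities below let every intermediate value stay small. The names q_i and H_i are introduced here only for illustration and are not part of the assignment.

$(a + b) \bmod m = ((a \bmod m) + (b \bmod m)) \bmod m$
$(a \cdot b) \bmod m = ((a \bmod m) \cdot (b \bmod m)) \bmod m$

Applying them gives a running form of the hash, with q_i = p^i mod m and H_i the hash of the first i letters:

$q_0 = 1 \bmod m, \quad q_{i+1} = (q_i \cdot p) \bmod m$
$H_0 = 0, \quad H_{i+1} = (H_i + q_i s_i) \bmod m, \quad h_p(s) = H_n$

For "hi" with p = 31 and m = 100: H_1 = (0 + 1 * 104) % 100 = 4, q_1 = 31, and H_2 = (4 + 31 * 105) % 100 = 3259 % 100 = 59.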
To make 7 different hash functions, we simply use 7 different p values. Typically, prime numbers work well, so we will use the values 31, 37, 41, 43, 47, 53, and 59.
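Putting the pieces described so far together, a Bloom filter for strings might look roughly like the sketch below. The names bloomHash, kPrimes, BloomFilter, add, and possiblyContains are assumptions chosen for illustration; the assignment itself only asks for the hash function. The sketch also assumes p and m are positive.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The seven p values listed in the text.
const int kPrimes[] = {31, 37, 41, 43, 47, 53, 59};

// h_p(s) as defined above, reduced with % after every operation.
// Assumes p > 0 and m > 0.
int bloomHash(const std::string& s, int p, int m) {
    long long hash = 0;
    long long power = 1 % m;  // p^0 mod m
    for (unsigned char c : s) {
        hash = (hash + power * c) % m;  // add p^i * s_i
        power = (power * p) % m;        // advance to p^(i+1) mod m
    }
    return static_cast<int>(hash);
}

// Hypothetical minimal Bloom filter: m Boolean values, all initially false.
class BloomFilter {
public:
    explicit BloomFilter(int m) : bits_(static_cast<std::size_t>(m), false) {}

    // Set the bit at each of the k = 7 hash indices.
    void add(const std::string& s) {
        for (int p : kPrimes) {
            bits_[bloomHash(s, p, size())] = true;
        }
    }

    // false means "definitely not in set"; true means "possibly in set".
    bool possiblyContains(const std::string& s) const {
        for (int p : kPrimes) {
            if (!bits_[bloomHash(s, p, size())]) {
                return false;
            }
        }
        return true;
    }

private:
    int size() const { return static_cast<int>(bits_.size()); }
    std::vector<bool> bits_;
};
```

After constructing a filter with, say, BloomFilter filter(1000) and calling filter.add("hi"), a call to filter.possiblyContains("hi") returns true, while a query on a filter that has had nothing added always returns false.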
---------------------------------------------------------------------------------------------------------------------------------------
Write a function that will compute the Bloom filter hash function defined above (h_p) for an input string, an input value of p, and an input value of $m$. Your function must take three parameters: a string s, an integer value for p, and an integer filter size m. It must return an integer that is the value of the hash function h_p(s).
The main function of your program should get the three inputs from the user, then call the hash function and print its value. The inputs are (1) a line of text that is the input string to be hashed, (2) the p value for the hash function, and (3) the size of the filter, m. Your program must have a second function to do the hash calculation. (The hash calculation may not be in main.)
Sample Program Run
Please enter a string: hi
Please enter a value for p: 31
Please enter a filter size m: 100
h_31("hi") = 59
The calculation of h_p("hi") is given as follows:
s = "hi" (ASCII values: h=104, i=105)
p = 31
m = 100
h_31("hi") = (31^0 * 104 + 31^1 * 105) % 100 = 3359 % 100 = 59