Question

Assume a dataset $A$ of $n$ points in a metric space with distance metric $d(\cdot,\cdot)$. Let $c$ be a constant greater than 1. Then the $(c, \lambda)$-Approximate Near Neighbor (ANN) problem is defined as follows: given a query point $z$, and assuming that there is a point $x$ in the dataset with $d(x, z) \le \lambda$, return a point $x'$ from the dataset with $d(x', z) \le c\lambda$ (this point is called a $(c, \lambda)$-ANN). The parameter $c$ therefore represents the maximum approximation factor allowed and is a user-defined parameter.

Let us consider an LSH family $\mathcal{H}$ of hash functions that is $(\lambda, c\lambda, p_1, p_2)$-sensitive for the distance measure $d(\cdot,\cdot)$. Let $G = \mathcal{H}^k = \{ g = (h_1, \dots, h_k) \mid h_i \in \mathcal{H},\ 1 \le i \le k \}$, where $k = \log_{1/p_2}(n)$. Let us consider the following procedure:

1. Select $L = n^{\rho}$ random members $g_1, \dots, g_L$ of $G$, where $\rho = \frac{\log(1/p_1)}{\log(1/p_2)}$.
2. Hash all the data points as well as the query point using all $g_i$ ($1 \le i \le L$).
3. Retrieve at most $3L$ data points (chosen uniformly at random) from the set of $L$ buckets to which the query point hashes.
4. Among the points selected in Step 3, report the one that is closest to the query point as a $(c, \lambda)$-ANN.

The goal of the first part of this problem is to show that this procedure leads to a correct answer with constant probability.
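As a rough outline of why these parameter choices are natural (a sketch of the standard argument, not necessarily the intended solution, and assuming $k$ and $L$ are treated as integers with rounding ignored), write $\rho = \log(1/p_1)/\log(1/p_2)$. Then:

$$
p_2^{k} = p_2^{\log_{1/p_2} n} = \frac{1}{n}, \qquad
p_1^{k} = n^{-\rho}, \qquad
\left(1 - p_1^{k}\right)^{L} = \left(1 - n^{-\rho}\right)^{n^{\rho}} \le \frac{1}{e}.
$$

Under a single $g_i$, the expected number of data points that lie farther than $c$ times the near-neighbor radius from the query yet share its bucket is at most $n \cdot p_2^{k} = 1$, so it is at most $L$ over all $L$ buckets. By Markov's inequality, at least $3L$ such collisions occur with probability at most $1/3$. A true near neighbor lands in a given bucket of the query with probability at least $p_1^{k}$, so it misses all $L$ buckets with probability at most $1/e$. By the union bound, both bad events are avoided with probability at least $1 - 1/3 - 1/e \approx 0.30$, which is a constant.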

Python code for the above problem
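The question does not fix a metric or a hash family, so the following is only a minimal sketch under stated assumptions: points are bit tuples in $\{0,1\}^D$, the distance is Hamming distance, and the LSH family is bit sampling (each $h$ returns one randomly chosen coordinate), for which $p_1 = 1 - \text{lam}/D$ and $p_2 = 1 - c\cdot\text{lam}/D$. The class name `BitSamplingANN`, the parameter name `lam` (the near-neighbor distance threshold), and the rounding of $k$ and $L$ up to integers are illustrative choices, not part of the original problem.

```python
import math
import random
from collections import defaultdict


def hamming(x, y):
    """Hamming distance between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(x, y))


class BitSamplingANN:
    """Sketch of the four-step procedure for Hamming space (an assumption)."""

    def __init__(self, data, lam, c, seed=0):
        self.data = list(data)
        if not self.data:
            raise ValueError("data must be non-empty")
        n, dim = len(self.data), len(self.data[0])
        if not (lam > 0 and c > 1 and c * lam < dim):
            raise ValueError("need lam > 0, c > 1 and c * lam < dimension")
        self.rng = random.Random(seed)

        # Sensitivity parameters of bit sampling under Hamming distance.
        p1 = 1.0 - lam / dim
        p2 = 1.0 - c * lam / dim

        # k = log_{1/p2}(n) and L = n^rho, rounded up to integers (assumption).
        self.k = max(1, math.ceil(math.log(n) / math.log(1.0 / p2)))
        rho = math.log(1.0 / p1) / math.log(1.0 / p2)
        self.L = max(1, math.ceil(n ** rho))

        # Step 1: each g_i is a list of k randomly sampled coordinates.
        self.gs = [
            [self.rng.randrange(dim) for _ in range(self.k)]
            for _ in range(self.L)
        ]

        # Step 2 (data side): hash every data point with every g_i.
        self.tables = [defaultdict(list) for _ in range(self.L)]
        for idx, x in enumerate(self.data):
            for g, table in zip(self.gs, self.tables):
                table[self._key(g, x)].append(idx)

    @staticmethod
    def _key(g, x):
        """Bucket key g(x) = (h_1(x), ..., h_k(x))."""
        return tuple(x[i] for i in g)

    def query(self, z):
        # Step 2 (query side): collect the L buckets the query hashes to.
        # A point is kept once per bucket it shares with the query.
        candidates = []
        for g, table in zip(self.gs, self.tables):
            candidates.extend(table.get(self._key(g, z), []))

        # Step 3: keep at most 3L candidates, chosen uniformly at random.
        limit = 3 * self.L
        if len(candidates) > limit:
            candidates = self.rng.sample(candidates, limit)
        if not candidates:
            return None

        # Step 4: report the retrieved point closest to the query.
        best = min(candidates, key=lambda i: hamming(self.data[i], z))
        return self.data[best]


if __name__ == "__main__":
    rng = random.Random(1)
    points = [tuple(rng.randrange(2) for _ in range(64)) for _ in range(200)]
    index = BitSamplingANN(points, lam=4, c=2)
    answer = index.query(points[0])
    print(index.k, index.L, hamming(answer, points[0]))
```

The result is a valid answer only with constant probability, so in practice one might check `hamming(answer, z) <= c * lam` and rebuild the index with a different `seed` if the check fails.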
