
Question

Here the "first algorithm" refers to essentially the Bottom-k algorithm with k=1.

In the first algorithm for estimating the number of distinct elements, our estimator was $E(z) = 1/z - 1$. Is $E(z)$ an unbiased estimator (i.e., does $\mathbb{E}[E(z)]$ equal the real number of distinct elements $d$)?

Now, suppose we were to take $k$ independent copies $z_1, \ldots, z_k$ of the basic estimator. (In other words, each $z_j$, for $j \in [k]$, is the minimum of the hash values $h_j(i)$ over the elements $i$ encountered in the stream, where each $h_j$ is an independent random hash function.) Consider the estimator $E'(z) = \frac{1}{k} \sum_{i=1}^{k} E(z_i)$. Prove that $E'$ is not a good estimator even for arbitrarily large $k$: e.g., it is not true that it is a 2-approximation with probability at least 90%. (It is ok to fix $d$ to a concrete value if that is convenient.) Justify your answers.
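Before attempting a proof, it may help to experiment with the two estimators numerically. The following is a minimal sketch, not part of the original question. It assumes the basic estimator is 1/z - 1 with z the minimum hash value, and it models each ideal hash function by drawing d independent uniform values in (0, 1]. The function names and the `seed` parameter are hypothetical, chosen only for illustration.

```python
import random


def basic_estimate(d, rng):
    """One copy of the basic estimator, under the assumptions above.

    z is the minimum of d uniform values standing in for the hash values
    of the d distinct elements; the estimate is 1/z - 1.
    """
    # 1 - rng.random() lies in (0, 1], so z is never exactly 0.
    z = min(1.0 - rng.random() for _ in range(d))
    return 1.0 / z - 1.0


def averaged_estimate(d, k, rng):
    """E'(z): the mean of k independent basic estimates."""
    return sum(basic_estimate(d, rng) for _ in range(k)) / k


def fraction_within_factor_2(d, k, trials, seed=0):
    """Fraction of trials in which E' lands in [d/2, 2d]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        est = averaged_estimate(d, k, rng)
        if d / 2 <= est <= 2 * d:
            hits += 1
    return hits / trials
```

Calling `fraction_within_factor_2` with a fixed d and increasing k gives an empirical feel for how often the averaged estimator is a 2-approximation. A simulation like this only suggests the behavior; the question still asks for a proof.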
