Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

3. (3 points) In the lecture, we noted that F, represents the size of a self-join of the same table. In this problem, you will

image text in transcribed

3. (3 points) In the lecture, we noted that F, represents the size of a self-join of the same table. In this problem, you will generalize the algorithm to compute the estimate of the size of the join between two different tables. For two relations (i.e. tables in a database) r(A, B) and s(A,C), with a common attribute (i.e.,column) A, we define the join r s to be a relation consisting of all tuples (a, b,c) such that (a,b) er and (a,c) es. Therefore, if frj and fs, denote the frequencies of j in the first columns (i.e., "A"-columns) of r and s, respectively, and j can take values in {1,...,N}, then the size of join is D-1 frj-fej For example, in Figure 2, the third table is the join of the first two tables, joined by the key "MENTOR. The size (# rows) of the table is 2 x 2+2 x 1+3 x 2 = 12. Suppose that the first table and the second table are given as streams 01 and 12 respectively. (You may ignore irrelevant information outside of the first column and assume that each item in the stream is a single token ;). (a) Consider the following procedure: Choose a hash function h from 4-universal hash family that map 1, N] to (-1, +1}. Consider the following sketching algo- rithm: procedure f(a,h) a 2 MENTOR Alice Alice Bob Bob Charlie Charlie Charlie David Figure 2: MENTOR EXPERTISE Alice C++ Alice Java Bob Scala Charlie Python Charlie Fortran STUDENTS David Edward George Jack Luigi Mike Norman STUDENTS EXPERTISE David C++ Java Edward C++ Edward Java George Scala Jack Scala Luigi Python Luigi Fortran Mike Python Mike Fortran Norman Python Norman Fortran Set C+0 On processing token j: C+c+h() return c Let 0 = x-frjfaj. Use the procedure above to compute the an estimate of 6. Show that the expectation of your output is 0. (b) Assume that the variance of the above algorithm is upper bounded by 2.02. Describe how to produce an estimate o such that (1 )

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Students also viewed these Databases questions