Question

Problem 2 (10 points) (Exercise 6.3.4, MMDS book). Suppose we perform the PCY Algorithm to find frequent pairs, with market-basket data meeting the following specifications:

1. The support threshold is 10,000.
2. There are one million items, represented by the integers 0, 1, ..., 999999.
3. There are 250,000 frequent items, that is, items that occur 10,000 times or more.
4. There are one million pairs that occur 10,000 times or more.
5. There are P pairs that occur exactly once and consist of two frequent items.
6. No other pairs occur at all.
7. Integers are always represented by 4 bytes.
8. When we hash pairs, they distribute among buckets randomly, but as evenly as possible; i.e., you may assume that each bucket gets exactly its fair share of the P pairs that occur once.

Suppose there are S bytes of main memory. In order to run the PCY Algorithm successfully, the number of buckets must be sufficiently large that most buckets are not frequent. In addition, on the second pass, there must be enough room to count all the candidate pairs. As a function of S, what is the largest value of P for which we can successfully run the PCY Algorithm on this data?
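
The page's own worked solution is not visible, so the following is only a rough sketch of one way to reason about the question, not the expert answer. It rests on assumptions that the problem does not state: pass 1 keeps a 4-byte count for every item and uses the rest of memory for 4-byte bucket counts; at most one million buckets are frequent (those hit by the truly frequent pairs); pass 2 keeps a one-bit-per-bucket bitmap, a 4-byte entry per frequent item, and a 12-byte triple (i, j, count) per candidate pair. The function name `max_singleton_pairs` and all constant names are made up for illustration.

```python
SUPPORT = 10_000
NUM_ITEMS = 1_000_000
NUM_FREQUENT_ITEMS = 250_000
NUM_FREQUENT_PAIRS = 1_000_000
INT_BYTES = 4
TRIPLE_BYTES = 3 * INT_BYTES  # assumed (i, j, count) storage per candidate pair


def max_singleton_pairs(S: int) -> int:
    """Estimate the largest P for S bytes of memory under the stated assumptions."""
    # Pass 1: item counts first, everything left over becomes bucket counts.
    buckets = (S - INT_BYTES * NUM_ITEMS) // INT_BYTES
    if buckets <= NUM_FREQUENT_PAIRS:
        return 0  # not enough buckets for most of them to be infrequent

    # A bucket holding only once-occurring pairs gets P / buckets of them
    # and must stay below the support threshold.
    limit_pass1 = SUPPORT * buckets - 1

    # Pass 2: memory left after the bitmap, the frequent-item table,
    # and the triples for the truly frequent pairs.
    bitmap_bytes = (buckets + 7) // 8
    free = (S - bitmap_bytes
            - INT_BYTES * NUM_FREQUENT_ITEMS
            - TRIPLE_BYTES * NUM_FREQUENT_PAIRS)
    if free <= 0:
        return 0

    # Once-occurring pairs become candidates only if they land in one of the
    # (at most) NUM_FREQUENT_PAIRS frequent buckets:
    #   candidates = P * NUM_FREQUENT_PAIRS / buckets, each costing TRIPLE_BYTES.
    limit_pass2 = free * buckets // (TRIPLE_BYTES * NUM_FREQUENT_PAIRS)

    return min(limit_pass1, limit_pass2)


if __name__ == "__main__":
    for S in (10**8, 10**9, 10**10):
        print(S, max_singleton_pairs(S))
```

Under these assumptions the pass-2 constraint is the binding one, and for large S the lower-order terms fade so that P grows roughly like S^2 / (48 * 10^6). A different choice of data structures on either pass would change the constants.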
