Question

1. The Apriori algorithm uses a generate-and-count strategy for finding frequent itemsets.

Candidate generation: candidate itemsets of size k+1 are created by merging a pair of frequent itemsets of size k if their first k-1 items are the same. An itemset created in this way is discarded if any of its subsets is found to be infrequent during the candidate pruning step (see Page 52 of the slides for an example of candidate generation).
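
The merge-and-prune step described above can be sketched in Python. This is a minimal illustration, not the course's reference implementation: the function name `generate_candidates` and the representation of itemsets as sorted tuples are assumptions made for the example.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Build (k+1)-item candidates from frequent k-itemsets.

    Assumes every itemset is a sorted tuple of the same length k.
    """
    frequent = sorted(set(frequent_k))
    frequent_set = set(frequent)
    if not frequent:
        return []
    k = len(frequent[0])
    candidates = []
    for i, first in enumerate(frequent):
        for second in frequent[i + 1:]:
            # Merge only when the first k-1 items agree.
            if first[:-1] != second[:-1]:
                break  # sorted input: no later itemset shares this prefix
            candidate = first + (second[-1],)
            # Pruning: discard the candidate if any k-subset is infrequent.
            if all(sub in frequent_set for sub in combinations(candidate, k)):
                candidates.append(candidate)
    return candidates


# Hypothetical input: (b, c, d) is pruned because (c, d) is not listed.
print(generate_candidates([("a", "b"), ("a", "d"), ("b", "c"), ("b", "d")]))
```

Keeping the itemsets sorted means itemsets that share a prefix are adjacent, so the inner loop can stop at the first mismatch.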

Support counting: after the candidate itemsets of size k+1 are determined, the Apriori algorithm counts the support of each candidate to determine the frequent itemsets of size k+1.

Suppose the Apriori algorithm is applied to the data set shown in the following table with minsup = 30%; that is, any itemset occurring in fewer than 3 transactions is considered infrequent.

Transaction ID    Items Bought
1                 {a, b, c, d}
2                 {b, c, d, e}
3                 {a, b, d}
4                 {a, d, e}
5                 {b, c, d}
6                 {b, e}
7                 {c, d}
8                 {a, b, d}
9                 {a, b}
10                {d, e}
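
The support-counting step can be sketched against this table as follows. The helper names `support_count` and `keep_frequent` are hypothetical, and the threshold of 3 transactions is taken directly from the problem statement (30% of 10 transactions).

```python
# The ten transactions from the table, in order of transaction ID.
TRANSACTIONS = [
    {"a", "b", "c", "d"},
    {"b", "c", "d", "e"},
    {"a", "b", "d"},
    {"a", "d", "e"},
    {"b", "c", "d"},
    {"b", "e"},
    {"c", "d"},
    {"a", "b", "d"},
    {"a", "b"},
    {"d", "e"},
]
MIN_COUNT = 3  # minsup = 30% of 10 transactions


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def keep_frequent(candidates, transactions, min_count):
    """Filter candidates down to those meeting the support threshold."""
    return [c for c in candidates
            if support_count(c, transactions) >= min_count]


for itemset in [("a",), ("b", "d"), ("a", "c")]:
    print(itemset, support_count(itemset, TRANSACTIONS))
```

A full Apriori run would alternate this filter with candidate generation, starting from the single items, until no candidates remain.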

The itemset lattice representing the data set is given below.

[Figure: itemset lattice for the data set; image not reproduced in this transcription]

  1. Mark the itemsets that are pruned by the Apriori algorithm. (10%)
  2. List the frequent itemsets found by the Apriori algorithm. (15%)
  3. Find all association rules involving 3 items, satisfying support >=30% and confidence >=70%. (25%)

2. Consider the following set of candidate 3-itemsets: {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {2, 3, 5}, {2, 4, 5}, {3, 4, 6}, {4, 5, 6}.

(a) Construct a hash tree for the above candidate 3-itemsets. (25%)

Assume the tree uses a hash function where all odd-numbered items are hashed to the left child of a node, while all even-numbered items are hashed to the right child. A candidate 3-itemset is inserted into the tree by hashing on each successive item in the candidate and then following the appropriate branch of the tree according to the hash value. Once a leaf node is reached, the candidate is inserted based on one of the following conditions:

Condition 1: If the depth of the leaf node is equal to 3 (the root is assumed to be at depth 0), then the candidate is inserted regardless of the number of itemsets already stored at the node.

Condition 2: If the depth of the leaf node is less than 3, then the candidate can be inserted as long as the number of itemsets stored at the node is less than maxsize. Assume maxsize = 2 for this question.

Condition 3: If the depth of the leaf node is less than 3 and the number of itemsets stored at the node is equal to maxsize, then the leaf node is converted into an internal node. New leaf nodes are created as children of the old leaf node. Candidate itemsets previously stored in the old leaf node are distributed to the children based on their hash values. The new candidate is also hashed to its appropriate leaf node.
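
The three insertion conditions can be sketched in Python as follows. This is an illustrative reading of the rules above, not an official solution: the `Node` class, the `"L"`/`"R"` branch labels, and the helper names are assumptions, and both children are created eagerly when a leaf splits.

```python
MAX_SIZE = 2   # maxsize from the question
MAX_DEPTH = 3  # leaves at this depth never split


class Node:
    def __init__(self, depth):
        self.depth = depth
        self.itemsets = []    # candidates stored while this node is a leaf
        self.children = None  # {"L": Node, "R": Node} once internal


def branch(item):
    """Odd items hash to the left child, even items to the right."""
    return "L" if item % 2 == 1 else "R"


def insert(node, itemset):
    # Follow branches, hashing on the item at position node.depth.
    while node.children is not None:
        node = node.children[branch(itemset[node.depth])]
    # Conditions 1 and 2: leaf at depth 3, or leaf with spare capacity.
    if node.depth == MAX_DEPTH or len(node.itemsets) < MAX_SIZE:
        node.itemsets.append(itemset)
        return
    # Condition 3: convert the full leaf into an internal node and
    # redistribute the stored candidates together with the new one.
    pending = node.itemsets + [itemset]
    node.itemsets = []
    node.children = {"L": Node(node.depth + 1), "R": Node(node.depth + 1)}
    for stored in pending:
        insert(node, stored)


def build_tree(candidates):
    root = Node(0)
    for itemset in candidates:
        insert(root, itemset)
    return root


def leaves(node, path=""):
    """Map each leaf's branch path (e.g. "LR") to its stored candidates."""
    if node.children is None:
        return {path or "root": node.itemsets}
    result = {}
    for label in ("L", "R"):
        result.update(leaves(node.children[label], path + label))
    return result


CANDIDATES = [(1, 2, 5), (1, 3, 4), (1, 3, 5), (2, 3, 5),
              (2, 4, 5), (3, 4, 6), (4, 5, 6)]
for path, stored in leaves(build_tree(CANDIDATES)).items():
    print(path, stored)
```

The result depends on insertion order; the sketch inserts the candidates in the order they are listed in the question.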

(b) Consider a transaction that contains the following items: {1, 3, 4, 5, 6}. Using the hash tree constructed in Part (a), which leaf nodes will be checked against the transaction? What are the candidate 3-itemsets contained in the transaction? (Bonus Point) (Hint: See Page 59 of the slides.)

3. Given the following transactions:

TID    Items bought
1      AC adapter, wireless router, printer, camera, USB hub
2      Plastic bags, camera, HDMI cable, USB hub, wireless router
3      Camera, HDMI cable, printer, wireless router
4      HDMI cable, AC adapter, printer, camera
5      USB hub, lens wipe, HDMI cable, camera, wireless router

  1. Build the FP Tree if minimum support is 3 out of 5 (20%).
  2. What is the confidence of the rule {HDMI cable, camera} → {printer}? (5%)
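
Reading the second part as asking about the rule with antecedent {HDMI cable, camera} and consequent {printer}, the confidence is the support count of all three items together divided by the support count of the antecedent. A minimal sketch follows, assuming item names are compared in lower case; the helper names are hypothetical.

```python
# The five transactions from the table, with item names lower-cased.
TRANSACTIONS = [
    {"ac adapter", "wireless router", "printer", "camera", "usb hub"},
    {"plastic bags", "camera", "hdmi cable", "usb hub", "wireless router"},
    {"camera", "hdmi cable", "printer", "wireless router"},
    {"hdmi cable", "ac adapter", "printer", "camera"},
    {"usb hub", "lens wipe", "hdmi cable", "camera", "wireless router"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def confidence(antecedent, consequent, transactions):
    """conf(X -> Y) = support(X union Y) / support(X)."""
    base = support_count(antecedent, transactions)
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / base if base else 0.0


print(confidence({"hdmi cable", "camera"}, {"printer"}, TRANSACTIONS))
```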