The RIPPER algorithm (by Cohen [1]) is an extension of an earlier algorithm called IREP (by F¨urnkranz
Question:
R1: A C
R2: A § B C
R2 is obtained by adding a new conjunct, B, to the left-hand side of R1. For this question, you will be asked to determine whether R2 is preferred over R1 from the perspectives of rule-growing and rule-pruning. To determine whether a rule should be pruned, IREP computes the following measure:
(a) Suppose R1 is covered by 350 positive examples and 150 negative examples, while R2 is covered by 300 positive examples and 50 negative examples. Compute the FOIL's information gain for the rule R2 with respect to R1.
(b) Consider a validation set that contains 500 positive examples and 500 negative examples. For R1, suppose the number of positive examples covered by the rule is 200, and the number of negative examples covered by the rule is 50. For R2, suppose the number of positive examples covered by the rule is 100 and the number of negative examples is 5. Compute vIREP for both rules. Which rule does IREP prefer?
(c) Compute vRIPPER for the previous problem. Which rule does RIPPER prefer?
Step by Step Answer:
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang Ning Tan, Michael Steinbach, Vipin Kumar