Question:
Consider the data set shown in Table 5.1.
Table 5.1. Data set for Exercise 7.
(a) Estimate the conditional probabilities for P(A|+), P(B|+), P(C|+), P(A|−), P(B|−), and P(C|−).
Answer:
P(A = 1|−) = 2/5 = 0.4, P(B = 1|−) = 2/5 = 0.4,
P(C = 1|−) = 1, P(A = 0|−) = 3/5 = 0.6,
P(B = 0|−) = 3/5 = 0.6, P(C = 0|−) = 0;
P(A = 1|+) = 3/5 = 0.6, P(B = 1|+) = 1/5 = 0.2,
P(C = 1|+) = 2/5 = 0.4, P(A = 0|+) = 2/5 = 0.4,
P(B = 0|+) = 4/5 = 0.8, P(C = 0|+) = 3/5 = 0.6.
(b) Use the estimates of the conditional probabilities from part (a) to predict the class label for a test sample (A = 0, B = 1, C = 0) using the naïve Bayes approach.
Answer:
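Let P(A = 0, B = 1, C = 0) = K.
P(+|A = 0, B = 1, C = 0)
= P(A = 0|+) × P(B = 1|+) × P(C = 0|+) × P(+)/K
= 0.4 × 0.2 × 0.6 × 0.5/K
= 0.024/K.
P(−|A = 0, B = 1, C = 0)
= P(A = 0|−) × P(B = 1|−) × P(C = 0|−) × P(−)/K
= 0.6 × 0.4 × 0 × 0.5/K
= 0.
The class label should be +.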
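The comparison in part (b) can be checked numerically. The following is a minimal sketch, assuming equal class priors of 0.5 (the factor that appears in the product above) and the part (a) estimates for the test sample's attribute values. The names `cond`, `prior`, and `score` are illustrative and do not come from the textbook.

```python
# Conditional probabilities from part (a) for the attribute values in the
# test sample (A = 0, B = 1, C = 0).
cond = {
    "+": {"A=0": 0.4, "B=1": 0.2, "C=0": 0.6},
    "-": {"A=0": 0.6, "B=1": 0.4, "C=0": 0.0},
}
# Assumption: both classes have prior 0.5, matching the factor used above.
prior = {"+": 0.5, "-": 0.5}


def score(label):
    """Unnormalised posterior: prior times the product of the conditionals.

    The common denominator K is dropped because it is the same for both classes.
    """
    result = prior[label]
    for p in cond[label].values():
        result *= p
    return result


scores = {label: score(label) for label in cond}
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

Because P(C = 0|−) is estimated as 0, the score for the − class vanishes regardless of the other attributes, which is the weakness the m-estimate in part (c) addresses.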
(c) Estimate the conditional probabilities using the m-estimate approach, with p = ½ and m = 4.
Answer:
P(A = 0|+) = (2 + 2)/(5 + 4) = 4/9,
P(A = 0|−) = (3 + 2)/(5 + 4) = 5/9,
P(B = 1|+) = (1 + 2)/(5 + 4) = 3/9,
P(B = 1|−) = (2 + 2)/(5 + 4) = 4/9,
P(C = 0|+) = (3 + 2)/(5 + 4) = 5/9,
P(C = 0|−) = (0 + 2)/(5 + 4) = 2/9.
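Each value above has the form (n_c + m × p)/(n + m), where n is the number of training records in the class and n_c is the number of those records with the given attribute value. A minimal sketch with exact fractions follows; the function name `m_estimate` is illustrative, and the counts are read off the numerators and denominators in part (a) rather than from the table itself.

```python
from fractions import Fraction


def m_estimate(n_c, n, m=4, p=Fraction(1, 2)):
    """Return the m-estimate (n_c + m*p) / (n + m) as an exact fraction."""
    return (n_c + m * p) / Fraction(n + m)


# Counts n_c out of n = 5 records per class, taken from part (a).
counts = {
    ("A=0", "+"): 2, ("A=0", "-"): 3,
    ("B=1", "+"): 1, ("B=1", "-"): 2,
    ("C=0", "+"): 3, ("C=0", "-"): 0,
}
estimates = {key: m_estimate(n_c, 5) for key, n_c in counts.items()}

for (feature, label), value in estimates.items():
    print(f"P({feature}|{label}) = {value}")
```

`Fraction` reduces its results, so 3/9 is printed as 1/3. Note that the zero count for C = 0 in the − class now yields a nonzero estimate.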
(d) Repeat part (b) using the conditional probabilities given in part (c).
Answer:
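Let P(A = 0, B = 1, C = 0) = K.
P(+|A = 0, B = 1, C = 0)
= (4/9) × (3/9) × (5/9) × 0.5/K
= 0.0412/K.
P(−|A = 0, B = 1, C = 0)
= (5/9) × (4/9) × (2/9) × 0.5/K
= 0.0274/K.
The class label should be +.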
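A minimal sketch of the part (d) comparison, assuming class priors of 0.5 as in part (b) and the m-estimates from part (c). The variable names are illustrative, and the common denominator K is omitted since it does not affect which class scores higher.

```python
from fractions import Fraction
from math import prod

# m-estimates from part (c) for the test sample (A = 0, B = 1, C = 0).
cond = {
    "+": [Fraction(4, 9), Fraction(3, 9), Fraction(5, 9)],
    "-": [Fraction(5, 9), Fraction(4, 9), Fraction(2, 9)],
}
# Assumption: equal class priors of 0.5, as in part (b).
prior = Fraction(1, 2)

# Unnormalised posteriors; the shared factor 1/K is dropped.
scores = {label: prior * prod(values) for label, values in cond.items()}
predicted = max(scores, key=scores.get)

for label, value in scores.items():
    print(f"P({label}|A=0, B=1, C=0) = {float(value):.4f}/K")
print("Predicted class:", predicted)
```

With the smoothed estimates neither class score is zero, so the decision rests on all three attributes rather than on a single zero count.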
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang-Ning Tan, Michael Steinbach, Vipin Kumar