Consider the (relative distance) K-means scheme for outlier detection described in Section 10.5 and the accompanying figure, Figure 10.10.
Question:
(a) The points at the bottom of the compact cluster shown in Figure 10.10 have a somewhat higher outlier score than those points at the top of the compact cluster. Why?
(b) Suppose that we choose the number of clusters to be much larger, e.g., 10. Would the proposed technique still be effective in finding the most extreme outlier at the top of the figure? Why or why not?
(c) The use of relative distance adjusts for differences in density. Give an example of where such an approach might lead to the wrong conclusion.
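For reference, the relative-distance score that the question refers to can be sketched as below. This is a minimal illustration, not the book's code: it assumes the score is a point's distance to its closest centroid divided by the median distance of that cluster's points to the same centroid. The function name `relative_distance_scores`, the toy points, and the hard-coded centroids (used here in place of an actual K-means run) are all hypothetical.

```python
import math
from statistics import median

def relative_distance_scores(points, centroids):
    """Score each point by (distance to closest centroid) /
    (median distance to that centroid within the point's cluster).
    Assumed definition; hypothetical helper for illustration only."""
    # Assign every point to its closest centroid and remember the distance.
    assigned = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        k = min(range(len(centroids)), key=dists.__getitem__)
        assigned.append((k, dists[k]))

    # Median distance to the centroid, computed per non-empty cluster.
    med = {}
    for k in range(len(centroids)):
        ds = [d for kk, d in assigned if kk == k]
        if ds:
            med[k] = median(ds)

    # Relative distance; guard against a zero median (all points on the centroid).
    scores = []
    for k, d in assigned:
        if med[k] > 0:
            scores.append(d / med[k])
        else:
            scores.append(0.0 if d == 0 else math.inf)
    return scores

# Toy data (made up): a compact cluster near (0, 0) and a diffuse one near (10, 10).
points = [
    (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1), (0.3, 0.0),       # compact
    (12.0, 10.0), (8.0, 10.0), (10.0, 12.0), (10.0, 8.0), (13.0, 10.0),  # diffuse
]
centroids = [(0.0, 0.0), (10.0, 10.0)]  # assumed fixed, not fitted by K-means

scores = relative_distance_scores(points, centroids)
for p, s in zip(points, scores):
    print(p, round(s, 3))
```

In this toy setup the point (0.3, 0.0) is only 0.3 away from its centroid yet scores about 3.0, while (13.0, 10.0) is 3.0 away from its centroid and scores about 1.5, because each distance is divided by its own cluster's median. That is the density adjustment part (c) asks about.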
Related Book For
Introduction to Data Mining
ISBN: 978-0321321367
1st edition
Authors: Pang-Ning Tan, Michael Steinbach, Vipin Kumar