Question
Either prove that k-means always produces k non-empty clusters, or give a counterexample: a set D of data points (with no repeated data point), a value for k (k <= n, where n is the number of data objects), and a set of k data points as initial seeds, such that some cluster becomes empty. A motivation for this problem: many or most of you will become professional programmers. The programs you write are supposed to work all the time, not just 999 times out of 1000. The pseudocode for k-means in the textbook will fail if one of the clusters does become empty.
Question: is it safe to write the code for k-means exactly as the textbook does, or will code written like that get you into trouble with your manager (not initially, but after a while)? If you decide to look for an adversarial data set, you can try the following. Build a data file of 6 to 12 data points in the plane so that when k-means is run with k = 3 or k = 4 clusters, one of the clusters becomes empty in iteration 2 (or iteration 3, or later). The initial seeds must be data points, and your example should not rely on ties to accomplish its goal. (Comment: this effort might be easier if you let most of the points be collinear.) Show the step-by-step process of the clustering. Finally, propose a solution for the case of an empty cluster.
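One way to explore the question is to run the textbook iteration on a small collinear data set and watch the cluster memberships. The sketch below is only an illustration under stated assumptions: the seven points, the three seeds, and the names kmeans, assign, centroid and repair_empty are all made up for this example and do not come from the textbook. With these points, iteration 1 gives the clusters {(-4,0)}, {(0,0), (20,0)} and {(23,0), (24,0), (25,0), (44,0)}, with centroids at x = -4, 10 and 29. In iteration 2, (0,0) is closer to -4 (distance 4) than to 10 (distance 10), and (20,0) is closer to 29 (distance 9) than to 10 (distance 10), so the middle cluster loses both of its points without any tie. The repair shown is one common heuristic, not the only option: move the empty cluster's centroid to the data point whose nearest current centroid is farthest away.

```python
import math

# Hypothetical example data: 7 collinear points in the plane, k = 3.
POINTS = [(-4, 0), (0, 0), (20, 0), (23, 0), (24, 0), (25, 0), (44, 0)]
SEEDS = [(-4, 0), (0, 0), (44, 0)]  # every seed is one of the data points


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def centroid(members):
    """Mean of a non-empty list of 2-D points."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)


def kmeans(points, seeds, repair_empty=False, max_iter=20):
    """Lloyd-style k-means; returns the list of clusterings, one per iteration.

    Without repair_empty the run stops as soon as a cluster is empty,
    which is the point where a naive mean computation would divide by zero.
    """
    centroids = list(seeds)
    labels = None
    history = []
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        clusters = [
            [p for p, lab in zip(points, new_labels) if lab == j]
            for j in range(len(centroids))
        ]
        history.append(clusters)
        if new_labels == labels:
            break  # assignments did not change: converged
        if not repair_empty and any(not members for members in clusters):
            return history  # empty cluster detected, stop here
        labels = new_labels
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = centroid(members)
            else:
                # Assumed repair heuristic: reseed the empty cluster at the
                # point whose nearest current centroid is farthest away.
                centroids[j] = max(
                    points,
                    key=lambda p: min(math.dist(p, c) for c in centroids),
                )
    return history
```

Calling kmeans(POINTS, SEEDS) returns two iterations, the second with an empty middle cluster. Calling kmeans(POINTS, SEEDS, repair_empty=True) continues past that point and ends with three non-empty clusters. Other reasonable repairs include reseeding at a random data point or splitting the cluster with the largest within-cluster error.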