a. Create and report a scatter plot of the data. Describe the appearance of the points in the scatterplot? How many clusters (clearly separated
a. Create and report a scatter plot of the data. Describe the appearance of the points in the scatterplot? How many clusters (clearly separated sets of points) are there? b. Cluster the data using K-Means for the specified number of clusters as indicated in your response to the previous question. Use a random initialization of means. Create and report a scatter plot of the data with the clusters indicated by their color. How well does this method identify the clusters in the plot? (Hint: It may be helpful to use python's sklearn package for K-Means tasks) c. Consider the task of clustering with K-Means using K = 2 means and initialized means located at (-8,0) and (1, -1). Create and report a scatter plot of the data with the clusters identified under this clustering method indicated by their color. How well does this method identify the clusters in the plot? Does choosing this initialization help in identifying the true clusters? d. Consider the task of clustering with K-Means using K = 5 means and initialized means located at (7.5,0), (0,7.5), (7.5,0), (0, -7.5), and (0, 0). Create and report a scatter plot of the data with the clusters identified under this clustering method indicated by their color. How well does this method identify the clusters in the plot? Does choosing this initialization help in identifying the true clusters? e. Now consider the task of grouping the points using spectral clustering. Using the value of K as indicated by your response to part a, create and report a scatter plot of the data with the clusters identified under this clustering method indicated by their color. Use a Gaussian Kernel with a bandwidth parameter value () of 1.5. Use the numpy.linalg.eig() function to identify eigenvectors and use the top K eigenvectors in terms of decreasing magnitude of eigenvalues. How well does this method identify the clusters in the plot? How do the types of clusters found in this method differ from those found using K-Means? = f. Using K 2 means and spectral clustering, create clusters using different values of the bandwidth parameter $\sigma$. Consider the clusters created by = {.1, 1, 2}. For which of these values of do the most appropriate clusters get created? Create and report a scatterplot of the clusters under this method indicated by their color. How well does this method identify the clusters in the plot?
Step by Step Solution
There are 3 Steps involved in it
Step: 1
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started