Question
There are four datasets from different sources, each representing a different point of view:
1. geometry (centroid_edges.mat)
2. density (points.mat)
3. image (binaryalphadigs.mat)
4. text (20news_w100.mat)
Even though the datasets have ground truth labels, the task is to perform clustering on them. All the datasets share some unique property, and it is up to the user to identify what that common property is.
Evaluation Criteria:
1. Data visualization
2. Using principal component analysis (PCA) for the real datasets, namely binaryalphadigs.mat and 20news_w100.mat
3. Implementing different clustering techniques such as k-means, KNN, k-medoids, min-max cut clustering, DBSCAN, or any other hierarchical or partitional clustering.
4. Identification of the number of clusters in each dataset (simple, but it must be identified automatically through a clustering metric).
5. Using a clustering metric to perform computation.
6. Evaluation using ground truth labels.
Note: These are the key pointers against which we will evaluate your solution. Your solution may cover the above-mentioned points, but do not restrict yourself to just these.
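As a starting point, criteria 2, 4 and 6 can be sketched in a few lines of NumPy: reduce the data with PCA, run k-means for several values of k, keep the k with the highest mean silhouette score, and then compare the result with the ground truth labels using purity. This is only a minimal sketch under stated assumptions, not a full solution. The question does not give the variable names inside the .mat files (they could be inspected after loading with scipy.io.loadmat), so the example runs on synthetic blobs instead. The helper names (pca, farthest_first, kmeans, silhouette, purity, pick_k) and the farthest-point seeding are choices made here for illustration and reproducibility.

```python
import numpy as np

def pca(X, n_components):
    """Center X and project it onto its top principal directions (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_first(X, k):
    """Deterministic seeding (an assumption of this sketch, not part of the task):
    start from row 0, then repeatedly add the point farthest from all chosen seeds."""
    chosen = [0]
    d = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        chosen.append(int(d.argmax()))
        d = np.minimum(d, np.linalg.norm(X - X[chosen[-1]], axis=1))
    return X[chosen].copy()

def kmeans(X, k, n_iter=100):
    """Lloyd's k-means; returns one integer label per row of X."""
    centers = farthest_first(X, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

def silhouette(X, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters score 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

def purity(y_true, y_pred):
    """Fraction of points that belong to the majority true class of their cluster."""
    hits = sum(np.bincount(y_true[y_pred == c]).max() for c in np.unique(y_pred))
    return hits / len(y_true)

def pick_k(X, k_range=range(2, 7)):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(X, kmeans(X, k)) for k in k_range}
    return max(scores, key=scores.get), scores

# Synthetic stand-in for a real dataset: three well-separated blobs in 5-D.
rng = np.random.default_rng(0)
blob_centers = np.array([[0, 0, 0, 0, 0], [10, 0, 0, 0, 0], [0, 10, 0, 0, 0]], dtype=float)
y_true = np.repeat(np.arange(3), 40)
X = blob_centers[y_true] + rng.normal(scale=0.5, size=(120, 5))

X2 = pca(X, n_components=2)        # criterion 2: dimensionality reduction
best_k, scores = pick_k(X2)        # criterion 4: automatic choice of k
labels = kmeans(X2, best_k)
print("chosen k:", best_k)
print("purity vs. ground truth:", purity(y_true, labels))  # criterion 6
```

Purity grows trivially as the number of clusters increases, so it is best read together with the silhouette score rather than used to choose k. The same pick_k loop could wrap other algorithms from the list (k-medoids, DBSCAN, hierarchical clustering) as long as they return one label per point.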