Question
Solve the following set of problems using Python and submit the code file with the extension .ipynb.
Load the Obesity dataset. Remove unwanted features if required.
Select the optimum k value using the Silhouette Coefficient and plot the optimum k values.
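One way to approach the silhouette step is sketched below. It is only an illustration under assumptions: the Obesity data is not available here, so synthetic blobs from `make_blobs` stand in for the cleaned, scaled features, and the helper names `silhouette_by_k` and `plot_scores` are hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): four tight blobs replace the real Obesity features.
X_raw, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)
X = StandardScaler().fit_transform(X_raw)


def silhouette_by_k(X, k_values, seed=0):
    """Return {k: mean silhouette coefficient} for each candidate k."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


def plot_scores(scores):
    """Plot the silhouette coefficient against k (matplotlib imported on demand)."""
    import matplotlib.pyplot as plt

    ks = sorted(scores)
    plt.plot(ks, [scores[k] for k in ks], marker="o")
    plt.xlabel("Number of clusters k")
    plt.ylabel("Mean silhouette coefficient")
    plt.show()


scores = silhouette_by_k(X, range(2, 9))
best_k = max(scores, key=scores.get)  # k with the highest silhouette coefficient
print("best k:", best_k)
```

Calling `plot_scores(scores)` in a notebook cell draws the curve; the peak of that curve is the k to carry forward.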
Create clusters using the KMeans and KMeans algorithms with the optimal k value found in the previous problem. Report performance using appropriate evaluation metrics. Compare the results.
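The name of the second algorithm appears to have been lost from the question text. The sketch below assumes the intended comparison is KMeans with purely random initialisation against KMeans with k-means++ initialisation; that reading is an assumption, as are the synthetic stand-in data, the value `k = 4`, and the hypothetical helper `evaluate`.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Stand-in data (assumption): replaces the prepared Obesity feature matrix.
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)
k = 4  # assumed result of the silhouette search


def evaluate(init, seed=0):
    """Fit KMeans once with the given initialisation and collect internal metrics."""
    model = KMeans(n_clusters=k, init=init, n_init=1, random_state=seed).fit(X)
    return {
        "inertia": model.inertia_,  # within-cluster sum of squares (lower is better)
        "silhouette": silhouette_score(X, model.labels_),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, model.labels_),  # lower is better
    }


results = {init: evaluate(init) for init in ("random", "k-means++")}
for init, metrics in results.items():
    print(init, metrics)
```

A single run per initialisation is sensitive to the random seed, so differences seen here should be read with caution.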
Now repeat clustering using KMeans for times and report the average performance. Again compare these with the results that you obtained in Q using KMeans, and explain the difference.
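Averaging over repeated runs could look like the sketch below. The number of repetitions is missing from the question, so `n_runs=10` is a placeholder assumption; the stand-in data, `k = 4`, and the helper `average_silhouette` are likewise hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in data (assumption): replaces the prepared Obesity feature matrix.
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)


def average_silhouette(X, k, init, n_runs=10):
    """Run KMeans n_runs times with different seeds; return (mean, std) of silhouette."""
    scores = []
    for seed in range(n_runs):
        labels = KMeans(
            n_clusters=k, init=init, n_init=1, random_state=seed
        ).fit_predict(X)
        scores.append(silhouette_score(X, labels))
    return float(np.mean(scores)), float(np.std(scores))


summary = {init: average_silhouette(X, k=4, init=init) for init in ("random", "k-means++")}
for init, (mean, std) in summary.items():
    print(f"{init}: mean silhouette = {mean:.3f}, std = {std:.3f}")
```

Reporting the standard deviation alongside the mean helps explain the difference: an initialisation that sometimes lands in a poor local optimum tends to show a larger spread across seeds.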
Apply DBSCAN on this same Obesity dataset and find the optimum "eps" and "min_samples" values. Is the number of clusters the same as the number of clusters found in Q? Explain the similarities or differences that you have found between the two solutions.
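A simple way to look for workable `eps` and `min_samples` values is a small grid search scored by the silhouette coefficient of the non-noise points. This is a sketch under assumptions, not the only selection method: the data is a synthetic stand-in, the grid values are arbitrary, and `search_dbscan` is a hypothetical helper.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): replaces the prepared Obesity feature matrix.
X_raw, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.5,
    random_state=0,
)
X = StandardScaler().fit_transform(X_raw)


def search_dbscan(X, eps_values, min_samples_values):
    """Return the best-scoring setting, or None if no setting yields >= 2 clusters."""
    best = None
    for eps in eps_values:
        for min_samples in min_samples_values:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            mask = labels != -1  # DBSCAN marks noise points with the label -1
            n_clusters = len(set(labels[mask]))
            if n_clusters < 2:
                continue  # silhouette needs at least two clusters
            score = silhouette_score(X[mask], labels[mask])
            if best is None or score > best["silhouette"]:
                best = {
                    "eps": eps,
                    "min_samples": min_samples,
                    "n_clusters": n_clusters,
                    "n_noise": int((~mask).sum()),
                    "silhouette": score,
                }
    return best


best = search_dbscan(X, eps_values=[0.2, 0.3, 0.4, 0.5], min_samples_values=[5, 10, 15])
print(best)
```

The `n_clusters` value found this way can then be compared with the k chosen for KMeans. Unlike KMeans, DBSCAN infers the number of clusters from density and may leave some points unassigned as noise, which is one likely source of any difference.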
Load the gene expression dataset. Apply PCA on the genes for generating
principal components. Plot the first three components of the PCA.
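The PCA step might be sketched as follows. The gene expression file is not available here, so a synthetic 50-feature matrix stands in for it (an assumption), and `plot_first_three` is a hypothetical helper name.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 150 samples x 50 "genes" with 5 hidden groups.
X_raw, y = make_blobs(n_samples=150, n_features=50, centers=5, random_state=0)
X = StandardScaler().fit_transform(X_raw)

pca = PCA(n_components=3)
components = pca.fit_transform(X)  # one row per sample, one column per component


def plot_first_three(components, labels=None):
    """3D scatter of the first three principal components (matplotlib on demand)."""
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(components[:, 0], components[:, 1], components[:, 2], c=labels)
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    ax.set_zlabel("PC3")
    plt.show()


print(components.shape)
```

In a notebook, `plot_first_three(components, labels=y)` colours the points by the given labels.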
Continuing from question, what is the variance covered by the first three components? Explain how this percentage of variance has been computed.
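The variance covered by each component is its eigenvalue of the covariance matrix divided by the sum of all eigenvalues (the total variance). The sketch below, on synthetic stand-in data (an assumption), reads the ratios from scikit-learn's `explained_variance_ratio_` and checks them against a manual eigenvalue computation with NumPy.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): replaces the gene expression matrix.
X_raw, _ = make_blobs(n_samples=150, n_features=50, centers=5, random_state=0)
X = StandardScaler().fit_transform(X_raw)

pca = PCA(n_components=3).fit(X)
ratio = pca.explained_variance_ratio_  # fraction of total variance per component
covered = ratio.sum()  # fraction covered by the first three together

# Manual check: eigenvalues of the sample covariance matrix, largest first.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
manual_ratio = eigvals[:3] / eigvals.sum()

print("per component:", ratio)
print("first three combined: {:.1%}".format(covered))
print("matches manual computation:", np.allclose(ratio, manual_ratio))
```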
Continuing from question, apply KMeans on the original features of the gene dataset and on the first three components returned by PCA. Compare the results using the given labels.
Kindly share the Jupyter notebook containing the code, outputs, and explanations for the answers.
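For the final comparison against the given labels, an external metric such as the adjusted Rand index is one reasonable choice. The sketch below is illustrative only: the data and labels are synthetic stand-ins, k is assumed to equal the number of distinct labels, and `cluster_and_score` is a hypothetical helper.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): y plays the role of the given class labels.
X_raw, y = make_blobs(n_samples=150, n_features=50, centers=5, random_state=0)
X = StandardScaler().fit_transform(X_raw)
X_pca = PCA(n_components=3).fit_transform(X)

k = len(set(y))  # assumption: one cluster per known label


def cluster_and_score(features):
    """Cluster with KMeans and compare to the given labels via adjusted Rand index."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return adjusted_rand_score(y, labels)


ari_original = cluster_and_score(X)
ari_pca = cluster_and_score(X_pca)
print(f"ARI on original features: {ari_original:.3f}")
print(f"ARI on first 3 PCs:       {ari_pca:.3f}")
```

An adjusted Rand index of 1 means the clustering reproduces the labels exactly, while values near 0 indicate agreement no better than chance. How the two scores compare on the real gene data depends on how much label-relevant variance the first three components retain.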