Customer Rating of Breakfast Cereals

Question:

Customer Rating of Breakfast Cereals. The dataset Cereals.csv includes nutritional information, store display, and consumer ratings for 77 breakfast cereals. Data Preprocessing. Remove all cereals with missing values.
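
The preprocessing step can be sketched outside RapidMiner as well. The following is a minimal Python sketch using pandas, not the book's workflow. The column names (name, calories, sugars) and the toy values are hypothetical stand-ins, and z-score scaling is one assumed choice of normalization. With the real file, the toy frame would be replaced by pd.read_csv("Cereals.csv").

```python
import numpy as np
import pandas as pd


def preprocess(df):
    """Drop rows with any missing value, then z-score the numeric columns."""
    complete = df.dropna().reset_index(drop=True)
    numeric = complete.select_dtypes(include="number")
    # Population standard deviation (ddof=0) is an assumption; any consistent
    # scaling serves the same purpose for distance computations.
    normalized = (numeric - numeric.mean()) / numeric.std(ddof=0)
    return complete, normalized


# Hypothetical toy data standing in for Cereals.csv
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "calories": [70.0, 120.0, np.nan, 110.0],
    "sugars": [6.0, 8.0, 5.0, 14.0],
})

complete, normalized = preprocess(toy)
print(len(toy), "rows before,", len(complete), "rows after removing missing values")
```

Removing incomplete rows before normalizing keeps the means and standard deviations from being computed on records that are later discarded.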

a. Apply hierarchical clustering to the data using Euclidean distance on the normalized measurements. Compare the dendrograms from single linkage and complete linkage, and look at cluster centroids. Comment on the structure of the clusters and on their stability. (Hint: To obtain cluster centroids for hierarchical clustering, first use the Flatten Clustering operator to obtain the clustered data as an Example Set. Then, use the De-Normalize and Apply Model operators on the flattened clustered data to compute cluster centroids using the Aggregate operator.)

b. Which method leads to the most insightful or meaningful clusters?

c. Choose one of the methods. How many clusters would you use? What distance is used for this cutoff? (Look at the dendrogram.)

d. The elementary public schools would like to choose a set of cereals to include in their daily cafeterias. Every day a different cereal is offered, but all cereals should support a healthy diet. For this goal, you are requested to find a cluster of “healthy cereals.” Should the data be normalized? If not, how should they be used in the cluster analysis?
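
For readers working outside RapidMiner, parts a and c can be approximated with SciPy. This is a sketch under stated assumptions, not the book's operator chain: the data are a made-up toy set with two hypothetical columns, the cutoff rule (midway between the last two merge heights) is just one illustrative way to read a cut off the dendrogram, and centroids are computed on the original scale to mirror the De-Normalize step in the hint.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical toy measurements; the real exercise uses Cereals.csv
raw = pd.DataFrame({
    "calories": [100, 102, 98, 101, 150, 148, 152, 151],
    "sugars": [5, 6, 4, 5, 12, 13, 11, 12],
}, dtype=float)

# Normalize so that no single variable dominates the Euclidean distance
z = (raw - raw.mean()) / raw.std(ddof=0)

labels = {}
cutoffs = {}
for method in ("single", "complete"):
    # Each row of Z records one merge; column 2 holds the merge distance
    Z = linkage(z.to_numpy(), method=method, metric="euclidean")
    # Cut halfway between the last two merge heights, which yields two clusters
    cutoffs[method] = (Z[-2, 2] + Z[-1, 2]) / 2
    labels[method] = fcluster(Z, t=cutoffs[method], criterion="distance")
    # Centroids on the original (de-normalized) scale
    centroids = raw.groupby(labels[method]).mean()
    print(method, "cutoff distance:", round(cutoffs[method], 3))
    print(centroids)

# Cross-tabulate memberships to see how far the two linkages agree
print(pd.crosstab(labels["single"], labels["complete"]))
```

The cross-tabulation gives a rough view of stability: if the two linkage methods assign most records to matching clusters, the structure is less sensitive to the choice of linkage. The dendrogram function in scipy.cluster.hierarchy can draw the tree from the same Z matrix when a plot is wanted.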


Related Book:

Machine Learning For Business Analytics

ISBN: 9781119828792

1st Edition

Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel
