Question
(a) Let X be an (n × p) data matrix in which each row corresponds to a p-variate measurement on one of n individuals. Assuming that the p variates are continuous variables, describe three possible measures of dissimilarity of pairs of individuals. Comment on their relative advantages and disadvantages. [3]

(b) What four properties must be satisfied for a dissimilarity function to be a metric dissimilarity coefficient? [2]

The values of four binary variables are measured for each of four individuals as follows:

                  Variable
    Individual    1    2    3    4
        1         1    1    1    0
        2         0    0    1    1
        3         1    1    1    1
        4         0    1    0    1

Construct a dissimilarity matrix for the four individuals using (i) the simple matching coefficient and (ii) Jaccard's coefficient. [4]

If S_rt denotes the simple matching coefficient, show that d_rt = 1 − S_rt is a metric dissimilarity coefficient. [4]

(c) Five subjects were each given three psychological tests. The scores for each subject on each test were recorded and the Euclidean distances between each pair of subjects were calculated as follows:

    Subject    A      B      C      D      E
       A       0      -      -      -      -
       B       4.2    0      -      -      -
       C       5.9    7.6    0      -      -
       D       1.2    7.0    10.3   0      -
       E       6.1    2.6    5.4    7.8    0

Using single-link clustering, cluster the five subjects. Sketch the dendrogram and interpret the results. [4]

How would your dendrogram change if you used a complete-link clustering algorithm? [3]
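The calculations asked for in parts (b) and (c) can be checked with a short script. The following is a minimal Python sketch, not a model answer. It assumes that the rows of the binary table are individuals and the columns are variables, that dissimilarity is taken as 1 minus the similarity coefficient, and that Jaccard similarity for a pair with no 1s at all is set to 1 by convention. The function names (`simple_matching`, `jaccard`, `dissimilarity_matrix`, `agglomerate`) are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations

# Binary data from the question; ASSUMPTION: rows are individuals, columns are variables.
X = [
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
]

def simple_matching(x, y):
    """Proportion of variables on which x and y agree (1-1 or 0-0)."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def jaccard(x, y):
    """Shared 1s divided by variables where at least one is 1 (0-0 pairs ignored)."""
    both = sum(a == 1 and b == 1 for a, b in zip(x, y))
    either = sum(a == 1 or b == 1 for a, b in zip(x, y))
    # ASSUMPTION: two all-zero records are treated as identical.
    return both / either if either else 1.0

def dissimilarity_matrix(data, similarity):
    """Matrix of d_rt = 1 - S_rt for every pair of rows."""
    n = len(data)
    return [[1 - similarity(data[r], data[t]) for t in range(n)] for r in range(n)]

# Lower triangle of the distance table in part (c).
PAIRS = {
    ("A", "B"): 4.2, ("A", "C"): 5.9, ("B", "C"): 7.6,
    ("A", "D"): 1.2, ("B", "D"): 7.0, ("C", "D"): 10.3,
    ("A", "E"): 6.1, ("B", "E"): 2.6, ("C", "E"): 5.4, ("D", "E"): 7.8,
}
DIST = {frozenset(k): v for k, v in PAIRS.items()}

def agglomerate(labels, dist, linkage=min):
    """Agglomerative clustering; linkage=min is single link, linkage=max is complete link.

    Returns the merges in order as (cluster_1, cluster_2, height).
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []

    def between(c1, c2):
        # Cluster-to-cluster distance under the chosen linkage rule.
        return linkage(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((sorted(c1), sorted(c2), between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges

if __name__ == "__main__":
    for name, coeff in (("simple matching", simple_matching), ("Jaccard", jaccard)):
        print(name)
        for row in dissimilarity_matrix(X, coeff):
            print("  ", [round(d, 3) for d in row])
    for name, rule in (("single link", min), ("complete link", max)):
        print(name)
        for merge in agglomerate("ABCDE", DIST, rule):
            print("  ", merge)
```

The merge heights returned by `agglomerate` are the levels at which branches join in the dendrogram, so comparing the `min` and `max` runs shows directly where the single-link and complete-link trees differ. For larger problems a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would be the more usual choice; the hand-rolled loop is used here only to keep each step visible.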