
Question


Given m data points $x_i \in \mathbb{R}^n$, $i = 1, \dots, m$, the K-means clustering algorithm groups them into k clusters by minimizing the distortion function over $\{r_{ij}, \mu_j\}$:

$$J = \sum_{i=1}^{m} \sum_{j=1}^{k} r_{ij} \, \| x_i - \mu_j \|^2,$$

where $r_{ij} = 1$ if $x_i$ belongs to the j-th cluster and $r_{ij} = 0$ otherwise.
1. (10 points) Derive mathematically that, using the squared Euclidean distance $\| x_i - \mu_j \|^2$ as the dissimilarity function, the centroids that minimize the distortion function J for given assignments $r_{ij}$ are given by

$$\mu_j = \frac{\sum_i r_{ij} x_i}{\sum_i r_{ij}}.$$

That is, $\mu_j$ is the center of the j-th cluster.
Hint: You may start by taking the partial derivative of J with respect to $\mu_j$, with $r_{ij}$ fixed.
2. (10 points) Derive mathematically what the assignment variables $r_{ij}$ should be to minimize the distortion function J when the centroids $\mu_j$ are fixed.
3. (5 points) For the question above, now suppose we change the dissimilarity score to a quadratic distance (also known as Mahalanobis distance) for a given, fixed positive definite matrix $\Sigma \in \mathbb{R}^{n \times n}$, so that the distortion function becomes

$$J = \sum_{i=1}^{m} \sum_{j=1}^{k} r_{ij} \, (x_i - \mu_j)^T \Sigma \, (x_i - \mu_j).$$

Derive what $\mu_j$ and $r_{ij}$ become in this case.
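
One way the three derivations might go is sketched below. It is an outline, not the verified solution for this page. It writes the centroid of cluster j as $\mu_j$ and assumes that every point belongs to exactly one cluster ($\sum_j r_{ij} = 1$), that every cluster is non-empty, that ties in the arg min are broken arbitrarily, and that $\Sigma$ is symmetric.

```latex
% Part 1: fix r_{ij}, differentiate J with respect to \mu_j, set to zero
\frac{\partial J}{\partial \mu_j}
  = -2 \sum_{i=1}^{m} r_{ij} (x_i - \mu_j) = 0
\quad\Longrightarrow\quad
\mu_j = \frac{\sum_i r_{ij} x_i}{\sum_i r_{ij}}

% Part 2: fix \mu_j; J is a sum of independent terms, one per point i,
% so each point is assigned to the centroid with the smallest distance
r_{ij} =
\begin{cases}
1 & \text{if } j = \arg\min_{l} \| x_i - \mu_l \|^2, \\
0 & \text{otherwise}
\end{cases}

% Part 3: same steps with the quadratic form; \Sigma is invertible
\frac{\partial J}{\partial \mu_j}
  = -2 \, \Sigma \sum_{i=1}^{m} r_{ij} (x_i - \mu_j) = 0
\quad\Longrightarrow\quad
\mu_j = \frac{\sum_i r_{ij} x_i}{\sum_i r_{ij}},
\qquad
r_{ij} = 1 \iff j = \arg\min_{l} \, (x_i - \mu_l)^T \Sigma \, (x_i - \mu_l)
```

In Part 1 the stationary point is a minimum because J is a convex quadratic in $\mu_j$. In Part 3, multiplying the stationarity condition by $\Sigma^{-1}$ shows that the centroid is still the cluster mean. Only the assignment rule changes, through the distance used in the arg min.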
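
As a numerical sanity check on parts 1 and 2, here is a minimal NumPy sketch of the two update rules for the squared Euclidean case: assign each point to its nearest centroid, then set each centroid to the mean of its points. The function names (`assign`, `centroids`, `distortion`) and the toy data are illustrative and not part of the original problem. The sketch assumes every cluster ends up non-empty.

```python
import numpy as np

def sq_dists(X, mu):
    """Squared Euclidean distances between points and centroids, shape (m, k)."""
    return ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)

def assign(X, mu):
    """Return one-hot r (m x k): each point goes to its nearest centroid."""
    d2 = sq_dists(X, mu)
    r = np.zeros_like(d2)
    r[np.arange(X.shape[0]), d2.argmin(axis=1)] = 1.0
    return r

def centroids(X, r):
    """Return mu (k x n): the mean of the points assigned to each cluster."""
    counts = r.sum(axis=0)  # points per cluster; assumed non-zero here
    return (r.T @ X) / counts[:, None]

def distortion(X, r, mu):
    """J = sum_i sum_j r_ij * ||x_i - mu_j||^2."""
    return float((r * sq_dists(X, mu)).sum())

# Toy data (made up for illustration): two well-separated groups in R^2
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
mu0 = np.array([[0.0, 0.0], [4.0, 4.0]])  # arbitrary starting centroids

r = assign(X, mu0)      # assignment step with centroids fixed
mu1 = centroids(X, r)   # centroid step with assignments fixed

print(distortion(X, r, mu0), distortion(X, r, mu1))
```

With the assignments held fixed, replacing `mu0` by the cluster means should never increase J, which is the claim in part 1.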
