
Question:

In this exercise, we will learn about the mathematical details underlying many spectral clustering methods (Section 9.4.3). We are given an $n \times n$ similarity matrix $W$ whose elements are the similarities between the corresponding data tuples (i.e., $W(i,j)$ measures the similarity between tuples $i$ and $j$). We wish to partition the data tuples into two clusters. Let $q$ be a cluster membership vector of length $n$: $q(i) = 1$ if data tuple $i$ belongs to Cluster A, and $q(i) = -1$ if it belongs to Cluster B. One way to find these two clusters is to minimize the so-called cut-size, which measures the total similarity across different clusters:


$$q^* = \arg\min_{q \in \{-1,1\}^n} J = \frac{1}{4} \sum_{i,j=1}^{n} \bigl(q(i) - q(j)\bigr)^2 \, W(i,j) \qquad (9.48)$$

a. Prove that the cut-size $J = \frac{1}{2} q^T (D - W) q$, where $D$ is the degree matrix of $W$: $D(i,i) = \sum_{j=1}^{n} W(i,j)$ and $D(i,j) = 0$ for $j \neq i$; and $^T$ denotes the vector transpose.
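The identity in part (a) can be checked numerically before proving it. Below is a minimal NumPy sketch (an assumed toy setup, not the book's solution) that evaluates the double sum in Eq. (9.48) and the quadratic form $\frac{1}{2} q^T (D - W) q$ on a random symmetric similarity matrix and confirms they agree:

```python
import numpy as np

# Assumed toy setup: a random symmetric similarity matrix W with zero diagonal.
rng = np.random.default_rng(0)
n = 6
A = rng.random((n, n))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)

# An arbitrary cluster membership vector q in {-1, +1}^n.
q = rng.choice([-1.0, 1.0], size=n)

# Cut-size from the definition: (1/4) * sum_{i,j} (q(i) - q(j))^2 * W(i,j).
J_sum = 0.25 * sum((q[i] - q[j]) ** 2 * W[i, j]
                   for i in range(n) for j in range(n))

# Quadratic form (1/2) * q^T (D - W) q, with D the degree matrix of W.
D = np.diag(W.sum(axis=1))
J_quad = 0.5 * q @ (D - W) @ q

# The two expressions coincide, as part (a) asks us to prove.
assert np.isclose(J_sum, J_quad)
```

The proof itself follows the same expansion the code performs: expanding $(q(i)-q(j))^2 = q(i)^2 - 2\,q(i)q(j) + q(j)^2$ and using $q(i)^2 = 1$ turns the double sum into degree terms (giving $q^T D q$) minus the cross terms (giving $q^T W q$).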

b. It is very difficult to directly optimize Eq. (9.48), since the cluster membership vector $q$ is a binary vector. In practice, we relax $q$ and allow it to take real values, and aim to solve the following optimization problem instead. Prove that the optimal solution of Eq. (9.49) is given by the eigenvector of $D - W$ that corresponds to the second smallest eigenvalue.

$$q^* = \arg\min_{q \in \mathbb{R}^n} q^T (D - W) q \quad \text{s.t. } q^T q = n, \; q^T \mathbf{1} = 0 \qquad (9.49)$$



Related Book: Data Mining: Concepts and Techniques, 4th Edition, by Jiawei Han, Jian Pei, and Hanghang Tong. ISBN: 9780128117613.
