In this exercise, we will learn about the mathematical details underlying many spectral clustering methods (Section 9.4.3).
Question:
In this exercise, we will learn about the mathematical details underlying many spectral clustering methods (Section 9.4.3). Given an n×n similarity matrix W whose elements are the similarities between the corresponding data tuples (i.e., W(i,j) measures the similarity between tuples i and j). We wish to partition the data tuples into two clusters. Let q be a cluster membership vector of length n:q(i)=1 if data tuple i belongs to Cluster A; and q(i)=−1 if it belongs to Cluster B. One way to find these two clusters is to minimize the so-called cut-size, which measures the total similarities across different clusters:
q∗=argminq∈{−1,1}nJ=14n∑i,j=1(q(i)−q(j))2W(i,j)
a. Prove that the cut-size J=12qT(D−W)q, where D is the degree matrix of W:D(i,i)=∑nj=1W(i,j) and D(i,j)=0 for j≠i; and T is the vector transpose.
b. It is very difficult to directly optimize Eq. (9.48) since the cluster membership vector q is a binary vector. In practice, we relax q and allow it to take real numbers, and aim to solve the following optimization problem instead. Prove that the optimal solution of Eq. (9.49) is given by the eigenvector of D−W that corresponds to the second smallest eigenvalue.
Transcribed Image Text:
9.4.3 Spectral clustering You may hear that spectral clustering methods have been used in many applications. What is the major idea behind spectral clustering? As discussed in Section 9.2.1, due to the curse of dimensionality, measuring pairwise distance in the original data space of a high-dimensional data set may not be reliable for clustering analysis. Fig. 9.13 illustrates the intuition. The figure shows two clusters of points, one in white and the other in black. Points a and b belong to two different clusters. However, in the original data space, the distance between a and b is shorter than that between a and some other points within the white cluster, such as c. Therefore if we rely on the pairwise distances in the original data space to construct clusters directly, the results may not be meaningful. FIGURE 9.13 0 00 b
Fantastic news! We've Found the answer you've been seeking!
Step by Step Answer:
Answer rating: 100% (1 review)
Answered By
Ashington Waweru
I am a lecturer, research writer and also a qualified financial analyst and accountant. I am qualified and articulate in many disciplines including English, Accounting, Finance, Quantitative spreadsheet analysis, Economics, and Statistics. I am an expert with sixteen years of experience in online industry-related work. I have a master's in business administration and a bachelor’s degree in education, accounting, and economics options.
I am a writer and proofreading expert with sixteen years of experience in online writing, proofreading, and text editing. I have vast knowledge and experience in writing techniques and styles such as APA, ASA, MLA, Chicago, Turabian, IEEE, and many others.
I am also an online blogger and research writer with sixteen years of writing and proofreading articles and reports. I have written many scripts and articles for blogs, and I also specialize in search engine
I have sixteen years of experience in Excel data entry, Excel data analysis, R-studio quantitative analysis, SPSS quantitative analysis, research writing, and proofreading articles and reports. I will deliver the highest quality online and offline Excel, R, SPSS, and other spreadsheet solutions within your operational deadlines. I have also compiled many original Excel quantitative and text spreadsheets which solve client’s problems in my research writing career.
I have extensive enterprise resource planning accounting, financial modeling, financial reporting, and company analysis: customer relationship management, enterprise resource planning, financial accounting projects, and corporate finance.
I am articulate in psychology, engineering, nursing, counseling, project management, accounting, finance, quantitative spreadsheet analysis, statistical and economic analysis, among many other industry fields and academic disciplines. I work to solve problems and provide accurate and credible solutions and research reports in all industries in the global economy.
I have taught and conducted masters and Ph.D. thesis research for specialists in Quantitative finance, Financial Accounting, Actuarial science, Macroeconomics, Microeconomics, Risk Management, Managerial Economics, Engineering Economics, Financial economics, Taxation and many other disciplines including water engineering, psychology, e-commerce, mechanical engineering, leadership and many others.
I have developed many courses on online websites like Teachable and Thinkific. I also developed an accounting reporting automation software project for Utafiti sacco located at ILRI Uthiru Kenya when I was working there in year 2001.
I am a mature, self-motivated worker who delivers high-quality, on-time reports which solve client’s problems accurately.
I have written many academic and professional industry research papers and tutored many clients from college to university undergraduate, master's and Ph.D. students, and corporate professionals. I anticipate your hiring me.
I know I will deliver the highest quality work you will find anywhere to award me your project work. Please note that I am looking for a long-term work relationship with you. I look forward to you delivering the best service to you.