Question

Q: Apply k-means and hierarchical clustering to the ORL face dataset. Set k = 2 in k-means and select 2 clusters in hierarchical clustering. Do the clustering results match the two genders? The following code needs to be corrected:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
from PIL import Image
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
#hierarchical clustering
from sklearn.cluster import AgglomerativeClustering
# load data: 40 subjects x 10 images, each flattened to 2576 pixel values
df = pd.DataFrame(np.zeros((400, 2576)))
count = 0
for i in range(40):
    for j in range(10):
        image = Image.open("/Users/ORL_Faces/" + str(i + 1) + "_" + str(j + 1) + ".png")
        # flatten the image to a 1-D vector so it fits one DataFrame row
        df.iloc[count] = np.array(image).reshape(-1)
        count += 1
# Gender label: 1.0 by default, 0.0 for subjects 1, 8, 10 and 32
df["Gender"] = np.ones(400)
df.iloc[0:10, 2576] = 0.0
df.iloc[70:80, 2576] = 0.0
df.iloc[90:100, 2576] = 0.0
df.iloc[310:320, 2576] = 0.0
print(df)
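# k-means
X = df.iloc[:, 0:2576].to_numpy()
scaler = StandardScaler()
X = scaler.fit_transform(X)  # NumPy array of standardized pixel values
warnings.filterwarnings("ignore")
kmeans_model = KMeans(n_clusters=2, n_init=10)
kmeans_model.fit(X)
# keep the labels separate so X stays pure features for later fits
kmeans_labels = kmeans_model.labels_
# cluster ids are arbitrary (0 and 1 may be swapped), so score the better assignment
matches = int(np.sum(kmeans_labels == df["Gender"].to_numpy()))
true_label = max(matches, 400 - matches)
false_label = 400 - true_label
print("True Labels: {}\nFalse Labels: {}".format(true_label, false_label))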
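true_lab = []
false_lab = []
# pixel columns only, so a "clusters" column is never fed back into the fit
features = np.asarray(X)[:, :2576]
gender = df["Gender"].to_numpy()
for i in range(100):
    kmeans_model = KMeans(n_clusters=2, n_init=10)
    kmeans_model.fit(features)
    matches = int(np.sum(kmeans_model.labels_ == gender))
    # cluster ids are arbitrary, so count the better of the two assignments
    true_label = max(matches, 400 - matches)
    false_label = 400 - true_label
    true_lab.append(true_label)
    false_lab.append(false_label)
plt.scatter(true_lab, false_lab)
plt.xlabel("True Labels")
plt.ylabel("False Labels")
plt.show()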
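#hierarchical clustering
X = df.iloc[:, 0:2576].to_numpy()
scaler = StandardScaler()
X = scaler.fit_transform(X)
# Ward linkage always uses Euclidean distance, so the affinity argument
# (deprecated in scikit-learn 1.2 and later removed) is omitted
hierarchical_cluster = AgglomerativeClustering(n_clusters=2, linkage='ward')
hierarchical_cluster.fit(X)
hierarchical_labels = hierarchical_cluster.labels_
# cluster ids are arbitrary (0 and 1 may be swapped), so score the better assignment
matches = int(np.sum(hierarchical_labels == df["Gender"].to_numpy()))
true_label = max(matches, 400 - matches)
false_label = 400 - true_label
print("# of True Labeling: {}\n# of False Labeling: {}".format(true_label, false_label))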
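true_lab = []
false_lab = []
# This loop repeats k-means on the re-standardized features; Ward
# agglomerative clustering is deterministic, so repeating it would not vary.
features = np.asarray(X)[:, :2576]  # pixel columns only, no "clusters" column
gender = df["Gender"].to_numpy()
for i in range(100):
    kmeans_model = KMeans(n_clusters=2, n_init=10)
    kmeans_model.fit(features)
    matches = int(np.sum(kmeans_model.labels_ == gender))
    # cluster ids are arbitrary, so count the better of the two assignments
    true_label = max(matches, 400 - matches)
    false_label = 400 - true_label
    true_lab.append(true_label)
    false_lab.append(false_label)
plt.scatter(true_lab, false_lab)
plt.xlabel("True Labels")
plt.ylabel("False Labels")
plt.show()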
Could you please provide the complete, runnable Python code that solves the problem (not just the steps, but code that produces the result directly)? Thank you! :)
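One point worth checking when comparing cluster ids with df["Gender"]: the ids 0 and 1 returned by a clustering model are arbitrary, so a direct equality test can undercount agreement when the ids happen to be swapped. The sketch below is one possible way to score the match, not the intended solution to the assignment. The helper name best_match_accuracy and the toy arrays are made up for illustration; in the real script they would be replaced by df["Gender"] and the fitted model's labels_. It also reports scikit-learn's adjusted_rand_score, which does not depend on how clusters are numbered.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score


def best_match_accuracy(cluster_ids, truth):
    """Fraction of samples agreeing with `truth` under the better of the two
    possible cluster-to-class assignments (valid for exactly 2 clusters)."""
    cluster_ids = np.asarray(cluster_ids).astype(int)
    truth = np.asarray(truth).astype(int)
    direct = np.mean(cluster_ids == truth)          # ids used as given
    flipped = np.mean((1 - cluster_ids) == truth)   # ids 0 and 1 swapped
    return max(direct, flipped)


# Toy labels (hypothetical) standing in for df["Gender"] and model.labels_
truth = np.array([0, 0, 0, 1, 1, 1, 1, 1])
cluster_ids = np.array([1, 1, 0, 0, 0, 0, 0, 1])

accuracy = best_match_accuracy(cluster_ids, truth)
ari = adjusted_rand_score(truth, cluster_ids)
print(f"best-match accuracy: {accuracy:.3f}")
print(f"adjusted Rand index: {ari:.3f}")
```

Because only 40 of the 400 rows are labeled 0.0 in the question's code, a clustering that puts every image in one cluster would already reach 360/400 = 0.9 on best-match accuracy. The adjusted Rand index is therefore likely the more informative number for deciding whether the clusters really follow gender.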
