Question: Here is the link to the data set I am using: https://files.catbox.moe/8rj2hu.csv Just copy and paste the link into your browser and it will download
Here is the link to the data set I am using: https://files.catbox.moe/8rj2hu.csv
Just copy and paste the link into your browser and it will download the csv dataset file.
The second image is just to test the function works.
Thank you so much


1+; 2) Write an Python-function entropy(a, b) that computes the entropy and the percentage of outliers of a clustering result based on an apriori given set of class labels, where or gives the assignment of objects in O to clusters, and b contains the class labels of the examples in O. The entropy function H is dened as follows: Assume we have m classes in our clustering problem; for each cluster Ci we have proportions p, = (pil, ...,p,,,.) of examples belonging to the m different classes (for cluster numbers i =1, ..., k); the entropy of a cluster C, is computed as follows: H(p,) = 21-409,} * lo g2 (17%)) (H is called the entropy function) Moreover, if pij = 0,19,} * logz (pit-j) is dened to be 0 The entropy of a clustering X is the size-weighted sum of the entropies on the individual clusters: H(X) = 2r=1(lCr|/| Epl /|Cp|) * H(pr) In the above formulas 'I. . .I' represents the set cardinality function Moreover, we assume that X = {C1, ..., Ck} is a clustering with k clusters C1, ..., Ck You can assume that cluster 0 contains all the outliers, and clusters 1, 2, ..., k represents \"true\" clusters; therefore, you should ignore cluster 0 and its instances when computing H(X). The entropy function returns a vector: (
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
