Question

Create a Python class named ClusteringOperations.
The code will be saved in Q2.py.
This class will have the following methods:
a) (16 points possible) Create a class method named calculate_silhouette_scores that takes
as its argument a list of partitions. This is a list of lists of lists: each first-level sublist
represents a partition, and each partition contains multiple lists, each representing one
datapoint. As a distance function, use the Euclidean distance.
The method will implement a modified version of the silhouette coefficient score for
each data point, using a modified version of the formulas presented in class.
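The modification is to the b(i) function, which will be equal to
$$b(i) = \max_{C' \in \text{Clusters},\; C' \neq C} \frac{\sum_{j \in C'} d(i,j)}{2\,|C'|}$$
where C is the cluster that contains datapoint i.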
The method will return a list with scores, each score representing the silhouette score
for one datapoint. The order of the data points will be kept the same as in the input
argument.
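The description above can be sketched roughly as follows. This is a minimal sketch under several assumptions, not the expected solution: a(i) and s(i) are taken to follow the standard silhouette definitions (the assignment only refers to "the formulas presented in class"), b(i) is read as the maximum over the other clusters C' of the sum of d(i, j) over j in C', divided by 2|C'|, and datapoints for which a(i) or b(i) is undefined receive a score of 0. The helper name `_euclidean` is hypothetical.

```python
import math


class ClusteringOperations:
    @staticmethod
    def _euclidean(p, q):
        # Hypothetical helper: Euclidean distance for points of any (equal) dimension.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    @classmethod
    def calculate_silhouette_scores(cls, partitions):
        scores = []
        for ci, cluster in enumerate(partitions):
            # Non-empty clusters other than the current one.
            others = [c for cj, c in enumerate(partitions) if cj != ci and c]
            for pi, point in enumerate(cluster):
                if len(cluster) < 2 or not others:
                    # Assumption: score is 0 when a(i) or b(i) is undefined.
                    scores.append(0.0)
                    continue
                # a(i): mean distance to the other points of the same cluster
                # (assumed standard definition).
                a = sum(
                    cls._euclidean(point, q)
                    for qi, q in enumerate(cluster) if qi != pi
                ) / (len(cluster) - 1)
                # b(i): ASSUMED reading of the modified formula, i.e. the maximum
                # over other clusters C' of sum_j d(i, j) / (2 * |C'|).
                b = max(
                    sum(cls._euclidean(point, q) for q in c) / (2 * len(c))
                    for c in others
                )
                denom = max(a, b)
                scores.append((b - a) / denom if denom > 0 else 0.0)
        return scores


partitions = [[[2, 1], [3, 1]], [[1, 1], [1, 2], [2, 2]], [[2, 3], [3, 3]]]
scores = ClusteringOperations.calculate_silhouette_scores(partitions)
print(len(scores))  # 7, one score per datapoint, in input order
```

No library silhouette function is called; only `math.sqrt` is used, and the nested loops visit datapoints in the same order as the input argument.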
For example, suppose our datapoints are 2D vectors and the input argument is given
by
[[[2,1],[3,1]], [[1,1],[1,2],[2,2]], [[2,3],[3,3]]]
In this example we have three partitions: first [[2,1],[3,1]], second
[[1,1],[1,2],[2,2]], and third [[2,3],[3,3]].
The first partition has two datapoints: [2,1] and [3,1].
The second partition has three datapoints: [1,1],
[1,2], and [2,2].
The third partition has two datapoints: [2,3] and [3,3].
For such an input, the result must contain the silhouette score for each of
these seven datapoints, in this order. The datapoints keep the order in which they appear in
the input argument:
[[2,1],[3,1],[1,1],[1,2],[2,2],[2,3],[3,3]]
This means you will return a list of silhouette scores in which the first element is
the silhouette score for datapoint [2,1], the second element is the silhouette
score for datapoint [3,1], and so on.
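The ordering requirement can be illustrated with a small sketch that flattens the nested input while remembering which partition each datapoint came from. The helper name `flatten_partitions` is hypothetical and not part of the assignment; it only shows the order in which scores would be reported.

```python
def flatten_partitions(partitions):
    """Return (partition_index, datapoint) pairs in input order."""
    return [
        (ci, point)
        for ci, cluster in enumerate(partitions)
        for point in cluster
    ]


partitions = [[[2, 1], [3, 1]], [[1, 1], [1, 2], [2, 2]], [[2, 3], [3, 3]]]
ordered = flatten_partitions(partitions)
points = [point for _, point in ordered]
print(points)
# [[2, 1], [3, 1], [1, 1], [1, 2], [2, 2], [2, 3], [3, 3]]
```

The score at index k of the returned list then belongs to `points[k]`.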
Your method must work with any number of partitions, any number of datapoints, and
datapoints of any dimension.
Note: you may not call any existing method that calculates the silhouette score. This method needs
to be fully implemented by you according to the instructions given.
b) (4 points possible) Create a class method called invoke_silhoutte that takes as its
argument a list of partitions (same format as for method calculate_silhouette_scores).
It calls calculate_silhouette_scores and gets the silhouette scores for the datapoints.
It writes those scores to disk. After that, it calculates the mean silhouette score and returns
that value.
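Part b) might look roughly like the sketch below. The assignment does not name the output file or its format, so the `out_path` parameter, its default `silhouette_scores.txt`, and the one-score-per-line layout are all assumptions. To keep the sketch focused on part b), `calculate_silhouette_scores` is left as a placeholder here.

```python
class ClusteringOperations:
    @classmethod
    def calculate_silhouette_scores(cls, partitions):
        # Placeholder: the scoring logic is part a) of the assignment.
        raise NotImplementedError

    @classmethod
    def invoke_silhoutte(cls, partitions, out_path="silhouette_scores.txt"):
        # out_path and its default value are assumptions, not given in the task.
        scores = cls.calculate_silhouette_scores(partitions)
        # Write one score per line (assumed format).
        with open(out_path, "w", encoding="utf-8") as fh:
            for score in scores:
                fh.write(f"{score}\n")
        # Mean silhouette score; 0.0 for an empty input is an assumption.
        return sum(scores) / len(scores) if scores else 0.0
```

Because `out_path` has a default, the method can still be called with the list of partitions as its only argument, as the task describes.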