
Question



Using Python, solve for centroids_expanded and distances.


(3 points) Next, we need to compute the distances between data points and centroids. More concretely, for each data point X[:,i], we need to compute its distance from the k centroids, i.e., centroids[:,j] (j = 1, 2, ..., k). We will store the computed distances in a k x m array, in which the element at position (j,i) is the distance between X[:,i] and centroids[:,j]. The distance used here is the Euclidean distance.

There are multiple ways of implementing this computation. The most efficient way is as follows:

- First, expand centroids by adding one dimension to it, so that its shape changes from (n,k) to (n,1,k). This can be done by calling np.expand_dims().
- Second, transpose X and centroids_expanded. The former then has shape (m,n) and the latter has shape (k,1,n). The subtraction S = X.T - centroids_expanded.T will have shape (k,m,n). To see why, read the NumPy documentation on broadcasting.

Next, following the definition of Euclidean distance, we need to:

- Compute S^2 (the elementwise square of S), which has shape (k,m,n).
- Sum S^2 along axis=2, which eliminates the last dimension.
- Apply numpy.sqrt() to the sum, resulting in an array of shape (k,m), which gives the Euclidean distances.

If you find the above method hard to follow, you can also use an explicit for loop to do the computation:

- Create an empty array distances of shape (k,m).
- Use a for loop, for j in range(k):, and in each step compute the difference between X and centroids[:,j], followed by its elementwise square, numpy.sum(), and numpy.sqrt(), to get the Euclidean distances, which are stored in a (1,m) array d. Then copy d into the j-th row of distances.

Compute distances

    def compute_distances(X, centroids):
        """
        Args:
        X -- data, shape (n, m)
        centroids -- shape (n, k)

        Return:
        distances -- shape (k, m)
        """
        ### START YOUR CODE ###
        centroids_expanded = None
        distances = None
        ### END YOUR CODE ###

        return distances

Evaluate Task 2

    np.random.seed(1)
    X_tmp = np.random.randn(4, 5)
    c = init_centroids(X_tmp, k=2)
    dists = compute_distances(X_tmp, c)
    print('Distances:', dists)

Expected output

    Distances: [[3.19996571 3.13120276 0.         1.52120576 2.54127667]
     [5.88553536 0.         3.13120276 2.25851302 4.11463616]]
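One possible way to fill in the two blanks, following the steps described in the question, is sketched below. It is not the graded reference solution. The helper init_centroids is not shown in the question, so the check at the end uses hand-picked columns of a random matrix as stand-in centroids (the names X_demo and c_demo are made up for illustration) and does not try to reproduce the expected output above. A loop-based variant, compute_distances_loop, is included for comparison; that name is also an assumption.

```python
import numpy as np


def compute_distances(X, centroids):
    """Vectorized version following the broadcasting recipe.

    X         -- data, shape (n, m)
    centroids -- shape (n, k)
    returns   -- distances, shape (k, m)
    """
    # (n, k) -> (n, 1, k); .T reverses the axes, giving (k, 1, n).
    centroids_expanded = np.expand_dims(centroids, axis=1)
    # X.T has shape (m, n); broadcasting against (k, 1, n) yields (k, m, n).
    S = X.T - centroids_expanded.T
    # Square, sum over the feature axis, then take the square root.
    distances = np.sqrt(np.sum(S ** 2, axis=2))
    return distances


def compute_distances_loop(X, centroids):
    """Explicit-loop version: one centroid per iteration."""
    k = centroids.shape[1]
    m = X.shape[1]
    distances = np.empty((k, m))
    for j in range(k):
        # Keep the centroid as an (n, 1) column so it broadcasts over X's m columns.
        S = X - centroids[:, j:j + 1]
        d = np.sqrt(np.sum(S ** 2, axis=0, keepdims=True))  # shape (1, m)
        distances[j, :] = d[0]
    return distances


# Stand-in data (assumption): columns 2 and 1 of X_demo play the role of centroids.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((4, 5))
c_demo = X_demo[:, [2, 1]]
dists_demo = compute_distances(X_demo, c_demo)
print("Distances shape:", dists_demo.shape)
```

Because each stand-in centroid is itself a data point, the corresponding entry of the result is zero, which is a quick sanity check. The same pattern appears in the expected output above, where each row contains one 0 entry.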

