
Question


Dataset Generation: First, we generate the data used in our experiments. We assume that the data consists of samples drawn from three different Gaussian distributions. Please follow these steps:

i) Take the three mean values (3, 70), (7, 150) and (13, 250) as the means of three different Gaussian distributions, and generate 100 random data samples for each mean. Use a standard deviation of 3 in each dimension for every distribution (hint: numpy.random.normal). Stacking all the samples together should result in a 2x300 matrix: each feature vector has dimension 2, and there are 300 feature vectors in total. (Remember: when you stack the feature vectors together in a matrix, you already know the order in which you stacked them, so you always know which feature vector came from which distribution.)

ii) Now generate 300 samples from a Gaussian distribution with mean (0, 0) and a standard deviation of 1 in each dimension. This should also give a 2x300 matrix, this time of Gaussian noise. Add it to the feature vector matrix generated in step (i). The result of this addition is our data, which we are going to use for clustering.
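The generation steps described in the question can be sketched as follows. This is a minimal illustration, not a verified solution: it assumes NumPy's `Generator.normal` (the newer counterpart of the `numpy.random.normal` named in the hint), and the seed and the names `features`, `labels`, `noise` and `data` are choices made here for illustration only.

```python
import numpy as np

# The seed is an arbitrary assumption, used only to make the run reproducible.
rng = np.random.default_rng(0)

means = [(3, 70), (7, 150), (13, 250)]
n_per_dist = 100

# Step (i): 100 samples per mean, standard deviation 3 in each dimension.
# Each block is drawn as (100, 2) and transposed to (2, 100).
blocks = [rng.normal(loc=m, scale=3.0, size=(n_per_dist, 2)).T for m in means]

# Stack the blocks side by side: 2 features x 300 samples.
features = np.hstack(blocks)

# Because the stacking order is known, the source distribution of every
# column is known: columns 0-99 -> 0, 100-199 -> 1, 200-299 -> 2.
labels = np.repeat(np.arange(len(means)), n_per_dist)

# Zero-mean Gaussian noise with standard deviation 1 in each dimension.
noise = rng.normal(loc=0.0, scale=1.0, size=features.shape)

# The sum is the data to be used for clustering.
data = features + noise

print(data.shape)    # (2, 300)
print(labels.shape)  # (300,)
```

Keeping `labels` alongside `data` preserves the known stacking order, so a later clustering result can be compared against the distribution each column was actually drawn from.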

