Question

1) What is the type of each of the following attributes: (a) age (in years), (b) salary, (c) ZIP code, (e) height, and (f) intensity of rain? Classify them as continuous or discrete, and as qualitative (nominal or ordinal) or quantitative (interval or ratio).

2) An analyst sets up a sensor network in order to measure the temperature of different locations over a time period. What is the type of the attribute collected (temperature)? What is the type of the dataset?

3) It is desired to partition customers into similar groups on the basis of their demographic profile.

a. What features could we use? Provide 3 examples. Would you describe such data as heterogeneous?

b. Which data mining problem is best suited to this task?

4) Suppose that you had a set of arbitrary objects, each representing different characteristics of gadgets. A domain expert gave you the similarity value between every pair of objects. How would you convert these objects into a multidimensional data set for clustering the gadgets?

5) Suppose that you had a data set, such that each data point corresponds to sea-surface temperatures over a square mile of resolution 10 × 10. In other words, each data record contains a 10 × 10 grid of temperature values with spatial locations. You also have some text associated with each 10 × 10 grid. How would you convert this data into a multidimensional data set? How many features will each data point have?

6) Compute the cosine similarity, Jaccard coefficient (where possible, i.e., for binary vectors), Euclidean distance, and correlation coefficient for the following vectors x and y:

a. x = (0, -1, 1, 2,-2), y = (0, -2, 2, 4, -4)

b. x = (0, 1, 0, 0, 0), y = (0, 1, 0, 0, 1)

c. x = (-1, -1, -1, -1, -1), y = (1, 1, 1, 1, 1)
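The four measures in question 6 can be checked numerically. The following is a minimal sketch in plain Python, not an official solution; the function names are hypothetical, and returning `None` for an undefined measure (a zero-length vector, a constant vector, or non-binary input for Jaccard) is an assumed convention.

```python
import math

def cosine(x, y):
    """Cosine similarity; None if either vector has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0 or ny == 0:
        return None
    return dot / (nx * ny)

def euclidean(x, y):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def correlation(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered
    vectors; None if either vector is constant (zero variance)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])

def jaccard(x, y):
    """Jaccard coefficient for binary vectors: (# of 1-1 matches) divided by
    (# of positions where at least one vector is 1); None if not binary."""
    if not all(v in (0, 1) for v in (*x, *y)):
        return None
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return both / either if either else None

pairs = {
    "a": ((0, -1, 1, 2, -2), (0, -2, 2, 4, -4)),
    "b": ((0, 1, 0, 0, 0), (0, 1, 0, 0, 1)),
    "c": ((-1, -1, -1, -1, -1), (1, 1, 1, 1, 1)),
}

for label, (x, y) in pairs.items():
    print(label, cosine(x, y), jaccard(x, y), euclidean(x, y), correlation(x, y))
```

Under these conventions, Jaccard applies only to pair (b), and the correlation for pair (c) is undefined because both vectors are constant.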

7) Compute the cosine similarity and the Jaccard coefficient between the two sets {A, B, C} and {A, C, D, E}. Hint: how will you represent each set?
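One way to follow the hint in question 7 is to treat each set as a binary indicator vector over the universe {A, B, C, D, E}, so that the dot product equals the intersection size and each squared norm equals the set size. The sketch below assumes that representation; the function names are hypothetical.

```python
import math

def set_jaccard(s, t):
    """Jaccard coefficient: |intersection| / |union|."""
    return len(s & t) / len(s | t)

def set_cosine(s, t):
    """Cosine similarity of the binary indicator vectors of two non-empty
    sets: |intersection| / sqrt(|s| * |t|)."""
    return len(s & t) / math.sqrt(len(s) * len(t))

s = {"A", "B", "C"}
t = {"A", "C", "D", "E"}

print(set_jaccard(s, t))  # 2 / 5 = 0.4
print(set_cosine(s, t))   # 2 / sqrt(12), approximately 0.577
```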

8) Create three documents, A, B, and C such that the Euclidean distance between A and B is smaller than the Euclidean distance between A and C, even though documents A and B have no common words whereas documents A and C have some common words.
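One possible construction for question 8, offered as an illustrative sketch rather than the expected answer: represent each document by raw term counts (an assumption; other weightings such as TF-IDF would change the numbers) and make document C much longer than A. The example documents and function names are hypothetical.

```python
import math
from collections import Counter

def term_counts(text):
    """Bag-of-words vector: raw count of each whitespace-separated token."""
    return Counter(text.lower().split())

def euclidean(a, b):
    """Euclidean distance between two term-count vectors; a Counter
    returns 0 for words it does not contain."""
    vocab = set(a) | set(b)
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in vocab))

doc_a = term_counts("data")
doc_b = term_counts("mining")                       # shares no word with A
doc_c = term_counts("data " * 5 + "warehouse " * 5)  # shares "data" with A

print(euclidean(doc_a, doc_b))  # sqrt(2), approximately 1.41
print(euclidean(doc_a, doc_c))  # sqrt(41), approximately 6.40
```

The gap comes from document length: Euclidean distance penalizes the large counts in C even though C overlaps with A, which is also relevant when judging Euclidean distance as a measure for document-term data.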

9) Are the following similarity measures good or bad for finding similarity in document-term data? Provide a one-line justification for each answer.

a. correlation

b. cosine

c. Euclidean
