Question

2a. Identify the level of measurement of the following: odometer reading, NBA league standings, crime rate, quality of air measured as good vs bad, course rating. Group them into quantitative and categorical attributes.

b. Briefly describe the k-nearest neighbor algorithm

c. For each of the following tasks assigned to you, which data mining task would you use, and why? [Explain your answer choice for full points.]

  (i) Build a model to detect which tenants will default on rent payment.

  (ii) Determine the crime rate in a county.

d. Identify any two (2) algorithms for classification.

3a. Briefly explain the following

  (i) Root node

  (ii) Leaf node

  (iii) Pure leaf

b. Outline any three ways of dealing with missing values

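| Case ID | Drive Time (in minutes) | College Major | Gender | Age | Eye Color |
|---------|-------------------------|---------------|--------|-----|-----------|
| 1       | 8                       | Programming   | M      | 20  | Brown     |
| 2       | 12                      | SoftEng       | M      | 22  | Blue      |
| 3       | 5                       | Programming   | F      | 38  | Brown     |
| 4       | 2                       | Programming   | F      | 18  | Brown     |
| 5       | 40                      | ArtIntel      | M      | 20  | Hazel     |
| 6       | 7                       | Art Intel     | F      | 15  | Blue      |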

4a. (i) Calculate the mean, mode, and median of age and drive time. [Show your work]

(ii) What is the shape of the distribution of age and drive time?
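
The summary statistics asked for above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` and the convention of reporting "no mode" when every value is distinct are assumptions made for illustration, not part of the question.

```python
from statistics import mean, median, multimode

# Values copied from the table above, in case-ID order.
age = [20, 22, 38, 18, 20, 15]
drive_time = [8, 12, 5, 2, 40, 7]

def summarize(values):
    """Return the mean, median, and mode(s) of a list of numbers."""
    modes = multimode(values)
    # When every value occurs exactly once, multimode returns all of them;
    # this sketch reports that case as "no mode" (an empty list).
    if len(modes) == len(values):
        modes = []
    return {"mean": mean(values), "median": median(values), "mode": modes}

for name, values in [("age", age), ("drive time", drive_time)]:
    print(name, sorted(values), summarize(values))
```

For both columns the mean is larger than the median (about 22.2 versus 20 for age, about 12.3 versus 7.5 for drive time), which is consistent with a right-skewed shape. Drive time has no repeated value, so under the convention above it has no mode.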

b. By observation, is there an outlier in drive time? Confirm or reject your answer using the interquartile range approach.
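
One way to carry out the interquartile range check is sketched below. It assumes the "median of halves" convention for quartiles (Q1 is the median of the lower half of the sorted data, Q3 the median of the upper half) and the usual 1.5 × IQR fences; the function name `iqr_outliers` is hypothetical. Quartile conventions differ between textbooks and software, and the choice can move the fences, so the result should be read against the convention used in the course.

```python
from statistics import median

# Drive times from the table above, in case-ID order.
drive_time = [8, 12, 5, 2, 40, 7]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles are taken as the medians of the lower and upper halves of
    the sorted data (an assumed convention; the middle value is excluded
    from both halves when the count is odd).
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return q1, q3, low_fence, high_fence, outliers

q1, q3, low_fence, high_fence, outliers = iqr_outliers(drive_time)
print(f"Q1={q1}, Q3={q3}, fences=({low_fence}, {high_fence}), outliers={outliers}")
```

Under these assumptions Q1 = 5 and Q3 = 12, so the IQR is 7 and the fences are -5.5 and 22.5. The 40-minute drive time lies above the upper fence and is flagged.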

c. (i) Use the Euclidean distance measure to obtain the distance between case 4 and case 6, considering their respective age and drive time.

(ii) For the cases in (i), normalize their age and drive time using min-max normalization. Again using the Euclidean measure, what is the distance between the two cases?
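
The two distance calculations above can be sketched as follows. The sketch assumes that the minimum and maximum used for min-max normalization are taken over all six cases in the table (the question does not say which range to use), and the helper name `min_max` is hypothetical.

```python
import math

# Columns from the table above, in case-ID order.
age = [20, 22, 38, 18, 20, 15]
drive_time = [8, 12, 5, 2, 40, 7]

def min_max(value, column):
    """Scale value to [0, 1] using the column's minimum and maximum."""
    lo, hi = min(column), max(column)
    return (value - lo) / (hi - lo)

# Case 4 and case 6 as (age, drive time) points; list index = case ID - 1.
case4 = (age[3], drive_time[3])   # (18, 2)
case6 = (age[5], drive_time[5])   # (15, 7)

raw_distance = math.dist(case4, case6)

case4_scaled = (min_max(case4[0], age), min_max(case4[1], drive_time))
case6_scaled = (min_max(case6[0], age), min_max(case6[1], drive_time))
scaled_distance = math.dist(case4_scaled, case6_scaled)

print(f"raw: {raw_distance:.3f}, normalized: {scaled_distance:.3f}")
```

On the raw values the distance is sqrt(3² + 5²) = sqrt(34), about 5.83. After normalization (age range 15 to 38, drive time range 2 to 40) it is about 0.185, and the two attributes contribute almost equally instead of drive time dominating.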

d. Using college major, drive time, and eye color, which pair among cases 1, 4, and 6 is closest? Show your work. [Hint: use the Jaccard distance measure.]
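
A possible reading of the Jaccard hint is sketched below. It assumes each case is represented as a set of (attribute, value) pairs and that drive time is treated as a categorical value that either matches exactly or does not; both are interpretive assumptions, and the names `cases` and `jaccard_distance` are hypothetical.

```python
from itertools import combinations

# Each case as a set of (attribute, value) pairs, copied from the table above.
cases = {
    1: {("major", "Programming"), ("drive_time", 8), ("eye_color", "Brown")},
    4: {("major", "Programming"), ("drive_time", 2), ("eye_color", "Brown")},
    6: {("major", "Art Intel"), ("drive_time", 7), ("eye_color", "Blue")},
}

def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b| for two sets."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)

# Distance for every pair of case IDs.
distances = {
    (i, j): jaccard_distance(cases[i], cases[j])
    for i, j in combinations(sorted(cases), 2)
}
closest = min(distances, key=distances.get)

for pair, dist in distances.items():
    print(pair, dist)
print("closest pair:", closest)
```

Under this encoding, cases 1 and 4 share two of four distinct attribute values (distance 0.5), while case 6 shares nothing with either (distance 1.0), so cases 1 and 4 form the closest pair.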

5a. What is overfitting in data mining?

b. Distinguish between pre-pruning and post-pruning.

c. Why would you prefer pre-pruning to post-pruning?
