Question
2a. Identify the level of measurement of the following: odometer reading, NBA league standings, crime rate, quality of air measured as good vs bad, course rating. Group them into quantitative and categorical attributes.
b. Briefly describe the k-nearest neighbor algorithm.
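For reference, the idea behind k-nearest neighbor classification can be sketched in a few lines: store the labeled training points, find the k points closest to a query, and return the majority label among them. The function name `knn_predict`, the toy points, and the choice of Euclidean distance with k = 3 are illustrative assumptions, not part of the question.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean
    (an assumption; other distance measures are equally valid).
    """
    # Sort the training points by their distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups labeled "A" and "B".
toy_train = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((8, 8), "B"), ((9, 8), "B"), ((8, 9), "B"),
]

print(knn_predict(toy_train, (2, 2)))
print(knn_predict(toy_train, (8.5, 8.5)))
```

There is no training step beyond storing the data; all the work happens at query time, which is why k-nearest neighbor is often called a lazy learner.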
3a. Briefly explain the following
b. Outline any three ways of dealing with missing values.
Case ID | Drive Time (in minutes) | College Major | Gender | Age | Eye Color |
1 | 8 | Programming | M | 20 | Brown |
2 | 12 | SoftEng | M | 22 | Blue |
3 | 5 | Programming | F | 38 | Brown |
4 | 2 | Programming | F | 18 | Brown |
5 | 40 | ArtIntel | M | 20 | Hazel |
6 | 7 | ArtIntel | F | 15 | Blue |
4a. (i) Calculate the mean, mode and median of age and drive time. [Show your work]
(ii) What is the shape of the distribution of age and drive time?
b. By observation, is there an outlier in drive time? Confirm or refute your answer using the interquartile range approach.
c. (i) Use the Euclidean distance measure to obtain the distance between case 4 and case 6, considering their respective age and drive time.
(ii) For the cases in (i), normalize their age and drive time using min-max normalization. Again using the Euclidean measure, what is the distance between the two cases?
d. Using college major, drive time and eye color: between cases 1, 4 and 6, which pair is closest? Show your work. [Hint: use the Jaccard distance measure]
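The calculations in question 4 can be checked with a short script. The sketch below is one way to set them up, under stated assumptions: quartiles are taken as the medians of the lower and upper halves of the sorted data (other quartile conventions give different fences), outliers are values beyond 1.5 × IQR from the quartiles, and for the Jaccard measure each case is treated as a set of attribute-value pairs. The helper names are illustrative.

```python
import math
from itertools import combinations
from statistics import mean, median, multimode

# Values copied from the case table (cases 1 to 6, in order).
drive_time = [8, 12, 5, 2, 40, 7]
age = [20, 22, 38, 18, 20, 15]


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[len(s) - half:])


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def min_max(value, values):
    """Rescale `value` to [0, 1] using the minimum and maximum of `values`."""
    lo, hi = min(values), max(values)
    return (value - lo) / (hi - lo)


def jaccard_distance(a, b):
    """1 minus the ratio of shared items to all distinct items."""
    return 1 - len(a & b) / len(a | b)


# Central tendency; multimode returns every value tied for most frequent.
for name, data in (("age", age), ("drive time", drive_time)):
    print(name, mean(data), median(data), multimode(data))

# Outlier check on drive time.
print("drive time outliers:", iqr_outliers(drive_time))

# Case 4 vs case 6 on (age, drive time), raw and min-max normalized.
case4 = (age[3], drive_time[3])
case6 = (age[5], drive_time[5])
raw_distance = euclidean(case4, case6)
case4_scaled = (min_max(age[3], age), min_max(drive_time[3], drive_time))
case6_scaled = (min_max(age[5], age), min_max(drive_time[5], drive_time))
scaled_distance = euclidean(case4_scaled, case6_scaled)
print("raw:", raw_distance, "normalized:", scaled_distance)

# Cases 1, 4 and 6 as sets of (attribute, value) pairs.
cases = {
    1: {("major", "Programming"), ("drive", 8), ("eye", "Brown")},
    4: {("major", "Programming"), ("drive", 2), ("eye", "Brown")},
    6: {("major", "ArtIntel"), ("drive", 7), ("eye", "Blue")},
}
pair_distances = {
    (i, j): jaccard_distance(cases[i], cases[j])
    for i, j in combinations(cases, 2)
}
closest_pair = min(pair_distances, key=pair_distances.get)
print(pair_distances, "closest:", closest_pair)
```

With the median-of-halves convention, drive time has Q1 = 5 and Q3 = 12, so the upper fence is 22.5 and the value 40 is flagged. Because every drive time is distinct, `multimode` returns all six values, which indicates there is no single mode for that attribute.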
5a. What is overfitting in data mining?
b. Distinguish between pre-pruning and post-pruning.
c. Why would you prefer pre-pruning to post-pruning?