Question
1. (18 points) Consider the following two datasets D1 and D2, which contain observations of employee salaries (in multiples of $1k) at the companies AllElectronics and AllQuantums, respectively:

D1: {13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 60, 65, 67, 73}
D2: {5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 110, 142, 204, 215, 815}

(a) (5 points) Discretize each dataset by equal-depth binning with a bin size of 5, using bin means to approximate each bin. Round the bin means to the nearest integer. Illustrate your steps.

(b) (4 points) Compare the effect of the equal-depth binning in (a) on the two datasets in terms of the quality of approximation, measured by the average variance of the bins. The comparison is based on which dataset has the higher or lower average variance, so you will have to compute the average variance of the bins for each dataset.

(c) (5 points) Discretize each dataset by equal-width binning with 5 intervals, using bin means to approximate each bin. Round the bin means to the nearest integer. Illustrate your steps.

(d) (4 points) Compare the effect of the equal-width binning in (c) on the two datasets in terms of the quality of approximation, measured by the average variance of the bins. The comparison is based on which dataset has the higher or lower average variance, so you will have to compute the average variance of the bins for each dataset.
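The binning procedures named in the question can be sketched in Python as a way to check hand calculations. This is a minimal sketch under stated assumptions, not the graded solution: the helper names (`equal_depth_bins`, `equal_width_bins`, `bin_means`, `average_variance`, `round_half_up`) are hypothetical, the variance is the population variance (the question does not say whether population or sample variance is intended), equal-width intervals are half-open on the right except for the last one, and empty bins are skipped when averaging variances.

```python
import math
from statistics import mean, pvariance

D1 = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
      33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 60, 65, 67, 73]
D2 = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 110, 142, 204, 215, 815]


def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves going up."""
    return int(math.floor(x + 0.5))


def equal_depth_bins(values, depth):
    """Sort the values and cut them into consecutive bins of `depth` items."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def equal_width_bins(values, k):
    """Split [min, max] into k intervals of equal width.

    Assumption: intervals are [lo, lo + w), ..., with the maximum value
    placed in the last bin. Bins may be empty.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in sorted(values):
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def bin_means(bins):
    """Rounded mean of each bin; None marks an empty bin."""
    return [round_half_up(mean(b)) if b else None for b in bins]


def average_variance(bins):
    """Mean of the per-bin population variances (assumption), skipping empty bins."""
    non_empty = [b for b in bins if b]
    return sum(pvariance(b) for b in non_empty) / len(non_empty)


if __name__ == "__main__":
    for name, data in (("D1", D1), ("D2", D2)):
        depth_bins = equal_depth_bins(data, 5)
        width_bins = equal_width_bins(data, 5)
        print(name, "equal-depth means:", bin_means(depth_bins))
        print(name, "equal-depth average variance:", average_variance(depth_bins))
        print(name, "equal-width means:", bin_means(width_bins))
        print(name, "equal-width average variance:", average_variance(width_bins))
```

Under these assumptions the equal-depth bin means come out as [16, 21, 26, 34, 40, 63] for D1 and [11, 61, 297] for D2, and the equal-width bin means as [18, 31, 44, 56, 68] for D1 and [51, 210, None, None, 815] for D2. In both schemes D2 has the higher average variance, driven largely by the outlier 815. A different boundary convention or the sample variance would change some of the numbers, so the assumptions should be matched to the course's definitions before relying on the output.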