
Question



Suppose you have a Diabetes dataset collected from 200,000 patients. The data contains 8 features with numeric values, and the last column is the class label. Suppose the value '0' denotes missing data. To clean the data, we need to meet four requirements:

Requirement (1): handle the missing data for each feature.

Requirement (2): reduce the number of features.

Requirement (3): normalize the 'plas' attribute using min-max normalization so that its values lie between 0 and 1.

Requirement (4): use stratified sampling to reduce the number of records from 200,000 to 10,000. Note that the positive and negative class labels make up 70% and 30% of the records, respectively.

Describe the possible methods to be used for the first two requirements, provide a sample of the resulting data for the 3rd requirement, and give the number of records to be sampled from each class for the 4th requirement.

Q2) [5 marks] Image transcription: the question text is the same as above. Fragments recovered from the data sample in the image: column names 'skin', 'insu', 'mass', 'class', 'age', 'pedi'; values 226, 50, positive, 0.627.
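
Requirements (1), (3) and (4) come down to short computations, so a minimal Python sketch may help make them concrete. It is not the expert solution, which is not visible on this page. The helper names (`impute_zeros`, `min_max`, `stratum_sizes`) and the 'plas' readings are hypothetical, made up for illustration. Median imputation is only one of several possible ways to handle missing values, and Requirement (2) is not covered here.

```python
from statistics import median


def impute_zeros(values):
    """Replace 0 (the missing-data marker) with the median of the known values.

    Median imputation is an assumed choice; mean imputation or dropping
    incomplete rows are alternatives.
    """
    known = [v for v in values if v != 0]
    fill = median(known)
    return [fill if v == 0 else v for v in values]


def min_max(values):
    """Min-max normalization: v' = (v - min) / (max - min), giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def stratum_sizes(total, shares):
    """Proportional allocation: number of records to draw from each class."""
    return {label: round(total * share) for label, share in shares.items()}


# Hypothetical 'plas' readings; 0 marks a missing value.
plas_raw = [100, 0, 150, 200]

# Impute first, so the 0 markers do not become the column minimum.
plas_clean = impute_zeros(plas_raw)   # [100, 150, 150, 200]
plas_scaled = min_max(plas_clean)     # [0.0, 0.5, 0.5, 1.0]

# Class shares as stated in the question: 70% positive, 30% negative.
sizes = stratum_sizes(10_000, {"positive": 0.70, "negative": 0.30})
print(plas_scaled)
print(sizes)  # {'positive': 7000, 'negative': 3000}
```

With proportional allocation the sample keeps the original 70/30 class ratio: 7,000 positive and 3,000 negative records out of 10,000.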

