Question

The dataset includes missing values, invalid values, outliers, and extreme values. The dataset is also imbalanced.
Import the data into IBM SPSS Modeler.
1- Exploring the data: Using the Modeler Data Audit node, visualize the features in the data. Include figures of the visualized features. Also, report:
a) How many input features are in the dataset? How many records are in the dataset?
b) Which feature is suitable as the target? (Please propose at least 1 candidate target)
c) How many valid records are in each feature?
d) How many outliers and extreme values are in the data?
e) For each input feature, answer whichever of the following two questions applies, depending on whether the feature is numerical or categorical:
a. Numerical: What type of variable is this feature? What are its mean, median, min, max, standard deviation, and distribution? Does the distribution (or any other statistical measure) look acceptable to you? Why or why not?
b. Categorical: What type of variable is this feature? What are its frequencies (counts) and distribution? Does the distribution (or any other statistical measure) look acceptable to you? Why or why not?
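To cross-check the numbers the Data Audit node reports for parts c) to e), the same summaries can be sketched outside Modeler. The snippet below is only an illustration, not Modeler's implementation: the function names are hypothetical, `None` stands for a missing value, and the cut-offs of 3 and 5 standard deviations for outliers and extremes are assumptions that should be checked against the node's Quality settings.

```python
import statistics
from collections import Counter


def audit_numeric(values, outlier_sd=3.0, extreme_sd=5.0):
    """Summarize a numeric column; None marks a missing value.

    The outlier/extreme thresholds are assumed for illustration.
    """
    valid = [v for v in values if v is not None]
    mean = statistics.mean(valid)
    sd = statistics.stdev(valid)  # sample standard deviation
    # Distance of each valid value from the mean, in standard deviations.
    dist = [abs(v - mean) / sd if sd > 0 else 0.0 for v in valid]
    return {
        "valid": len(valid),
        "mean": mean,
        "median": statistics.median(valid),
        "min": min(valid),
        "max": max(valid),
        "std": sd,
        # Outliers lie beyond outlier_sd but not beyond extreme_sd.
        "outliers": sum(outlier_sd < d <= extreme_sd for d in dist),
        "extremes": sum(d > extreme_sd for d in dist),
    }


def audit_categorical(values):
    """Count valid records and category frequencies; None is missing."""
    valid = [v for v in values if v is not None]
    return {"valid": len(valid), "counts": Counter(valid)}


# Hypothetical toy column: thirty typical values, one extreme, one missing.
print(audit_numeric([10] * 30 + [100, None]))
print(audit_categorical(["yes", "no", "yes", None]))
```

A single extreme value also inflates the standard deviation it is measured against, so in small samples a rule like this can miss values that look obviously wrong in a histogram. That is one reason to inspect the figures as well as the counts.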
2- Missing values: The data includes multiple missing values. Do not remove a record that has only one missing value. Instead, use IBM SPSS Modeler to fill in the missing value with an algorithm of your choice. If a record has more than one missing value, you may either remove the record or use Modeler to fill in the missing values.
Explain how you treated the missing values in the data.
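The rule in this step (keep and impute records with a single missing value, optionally drop records with more than one) can be sketched as follows. This is a minimal illustration under stated assumptions, not what Modeler does internally: mean and mode imputation stand in for whichever algorithm is chosen, the function and field names are hypothetical, and `None` marks a missing value.

```python
import statistics


def treat_missing(records, numeric_fields, categorical_fields):
    """Drop records with more than one missing value; impute the rest.

    Assumed imputation: mean for numeric fields, mode for categorical ones.
    """
    fields = list(numeric_fields) + list(categorical_fields)
    # Copy the surviving records so the input list is left untouched.
    kept = [dict(r) for r in records
            if sum(r[f] is None for f in fields) <= 1]
    # Compute every fill value from observed data before changing anything.
    fill = {}
    for f in numeric_fields:
        fill[f] = statistics.mean([r[f] for r in kept if r[f] is not None])
    for f in categorical_fields:
        fill[f] = statistics.mode([r[f] for r in kept if r[f] is not None])
    for r in kept:
        for f in fields:
            if r[f] is None:
                r[f] = fill[f]
    return kept


# Hypothetical records: the fourth has two missing values and is dropped.
records = [
    {"age": 20, "city": "A"},
    {"age": None, "city": "A"},
    {"age": 40, "city": None},
    {"age": None, "city": None},
    {"age": 30, "city": "B"},
]
print(treat_missing(records, ["age"], ["city"]))
```

Whichever method is used, the write-up should state how many records were imputed, how many were removed, and which fill rule was applied to each feature.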
3- Invalid values: The data includes multiple invalid values. Do not remove a record that has only one invalid value. Instead, use IBM SPSS Modeler to replace the invalid value with an algorithm of your choice. If a record has more than one invalid value, you may either remove the record or use Modeler to replace the invalid values.
Explain how you treated the invalid values in the data.
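One way to reason about this step is to treat an invalid value as a missing value once it has been detected by a validity rule. The sketch below illustrates that idea only. The rules in `RULES`, the field names, and the median/mode replacement are all assumptions made for the example; the real rules depend on what each feature in the dataset is allowed to contain.

```python
import statistics

# Hypothetical validity rules; replace with rules that fit the real features.
RULES = {
    "age": lambda v: isinstance(v, (int, float)) and 0 <= v <= 120,
    "gender": lambda v: v in {"F", "M"},
}


def treat_invalid(records, rules, numeric_fields):
    """Drop records with more than one invalid value; repair the rest.

    Assumed repair: median for numeric fields, mode for categorical ones.
    Values that are already None (missing) are left as they are.
    """
    flagged = []
    for r in records:
        bad = [f for f, ok in rules.items()
               if r[f] is not None and not ok(r[f])]
        if len(bad) > 1:
            continue  # more than one invalid value: remove the record
        row = dict(r)
        for f in bad:
            row[f] = None  # blank the invalid value before imputing
        flagged.append((row, bad))
    # Replacement values come only from values that passed the rules.
    fill = {}
    for f in rules:
        observed = [row[f] for row, _ in flagged if row[f] is not None]
        fill[f] = (statistics.median(observed) if f in numeric_fields
                   else statistics.mode(observed))
    for row, bad in flagged:
        for f in bad:
            row[f] = fill[f]
    return [row for row, _ in flagged]


# Hypothetical records: the fourth has two invalid values and is dropped.
records = [
    {"age": 25, "gender": "F"},
    {"age": -5, "gender": "M"},
    {"age": 35, "gender": "X"},
    {"age": 999, "gender": "?"},
    {"age": 45, "gender": "F"},
]
print(treat_invalid(records, RULES, ["age"]))
```

The median is used here instead of the mean because invalid values often sit next to genuine outliers, which the median resists; that is a design choice for the example, not a requirement of the assignment.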
