Question

1.1.A. Preliminary data exploration, understanding and minor corrections in the Excel file:
The starting point for this subtask is the dataset itself, including the data dictionary and the
preparation tasks/notes in the second worksheet (Raw_data_dict).
The next suggested step is to filter each column of the raw data to check for blanks (missing
values) and obvious data entry errors; the errors should be corrected in the Excel spreadsheet
before the data are loaded. The anomaly in the column Dependents should also be corrected in the
Excel spreadsheet first, making reasonable assumptions. Finally, generate and add the variables
Income_total and Loan_Income_ratio as explained in the data dictionary tab.
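
For readers who prefer to cross-check the Excel work in a script, the blank check and the two derived variables might look roughly like the sketch below. It is only an illustration: the data dictionary is not reproduced here, so the input column names (Applicant_income, Coapplicant_income, Loan_amount), the sample rows, and the two formulas (total income as the sum of both incomes, ratio as loan amount divided by total income) are assumptions that must be checked against the Raw_data_dict worksheet.

```python
# Hypothetical rows standing in for the raw worksheet. Only Dependents,
# Income_total and Loan_Income_ratio are names taken from the task text;
# every other column name and all values are assumptions.
rows = [
    {"Applicant_income": 5000, "Coapplicant_income": 1500, "Loan_amount": 130, "Dependents": "0"},
    {"Applicant_income": 3000, "Coapplicant_income": None, "Loan_amount": 66, "Dependents": "1"},
    {"Applicant_income": 4000, "Coapplicant_income": 0, "Loan_amount": None, "Dependents": ""},
]


def count_blanks(rows):
    """Count blank cells (None or empty string) per column."""
    counts = {}
    for row in rows:
        for col, value in row.items():
            counts.setdefault(col, 0)
            if value is None or value == "":
                counts[col] += 1
    return counts


def add_derived(row):
    """Return a copy of the row with the two derived variables added.

    Assumed formulas (verify against the data dictionary):
    Income_total = applicant income + co-applicant income,
    Loan_Income_ratio = loan amount / Income_total.
    A missing input, or a total income of zero, leaves the result as None.
    """
    out = dict(row)
    incomes = [row["Applicant_income"], row["Coapplicant_income"]]
    out["Income_total"] = None if any(v is None for v in incomes) else sum(incomes)
    total, loan = out["Income_total"], row["Loan_amount"]
    out["Loan_Income_ratio"] = loan / total if loan is not None and total else None
    return out


blank_counts = count_blanks(rows)
prepared = [add_derived(row) for row in rows]
```

Leaving the derived values empty when an input is blank keeps the missing-value decision for subtask 1.1.B instead of hiding it at this stage.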
1.1.B. Data preparation for multiple analysis tasks:
After the initial error removal in the source file (1.1.A.), you now have to perform the following
data preparation steps in this order:
Missing value handling/transformation;
Data transformation for statistical analysis (e.g. string/cat to numeric, one to many);
Outlier identification and treatment.
For this subtask, you have essentially three options:
1) Use KNIME workflows, either one per analysis task or a single workflow for all tasks;
2) Use only Excel or any other tool and load the data for each analysis task;
3) Use a combination of the two options above (e.g. create a KNIME workflow to prepare the data,
write or copy them to Excel files after outlier correction, and then use these files for the
analysis tasks in Excel and KNIME; or treat missing values and convert categorical variables in
Excel, and do only the outlier analysis in KNIME).
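
If "any other tool" is taken to include a short script, the three preparation steps listed above could be sketched in their required order as follows. This is a minimal illustration, not the prescribed method: median imputation, one-hot encoding and capping at the 1.5 x IQR fences are swapped-in common choices, and the sample columns (income, area) and their values are hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Step 1: replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(categories):
    """Step 2: turn one categorical column into several 0/1 columns (one to many)."""
    levels = sorted(set(categories))
    return [{level: int(c == level) for level in levels} for c in categories]


def cap_outliers(values, k=1.5):
    """Step 3: flag values outside the IQR fences and cap them at the nearest fence."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    flags = [v < low or v > high for v in values]
    capped = [min(max(v, low), high) for v in values]
    return flags, capped


# Hypothetical sample columns; names and values are assumptions.
income = [4000, 4200, None, 3900, 4100, 4300, 25000]
area = ["Urban", "Rural", "Urban", "Semiurban", "Rural", "Urban", "Urban"]

income_filled = impute_median(income)                       # missing values first
area_encoded = one_hot(area)                                # then category to numeric
outlier_flags, income_capped = cap_outliers(income_filled)  # outliers last
```

Running the outlier step last matters here: the quartiles are computed on the imputed column, so the fences do not depend on how blanks happen to be represented.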
