Question

1.1.A. Preliminary data exploration, understanding and minor corrections in the Excel file

The starting point for this subtask is the dataset itself, including the data dictionary and the preparation tasks/notes in the second worksheet (Raw_data_dict). Next, filter each column in the raw data to check for blanks (missing values) or obvious data entry errors; correct the latter in the Excel spreadsheet before loading the data. Also correct the anomaly in the column Dependents in the Excel spreadsheet first, making reasonable assumptions. Finally, generate and add the variables Income_total and Loan_Income_ratio as explained in the data dictionary tab.

1.1.B. Data preparation for multiple analysis tasks

After the initial error removal in the source file (1.1.A.), perform the following data preparation steps in this order: missing value handling/transformation; data transformation for statistical analysis (e.g. string/categorical to numeric, one to many); outlier identification and treatment.

For this subtask, you have essentially three options:

1) Use KNIME workflows, either one per analysis task or a single workflow for all tasks.
2) Use only Excel or any other tool, and load the data for each analysis task.
3) Use a combination of the two options above, e.g. create a KNIME workflow to prepare the data, write or copy the data to Excel files after outlier correction, and then use these files for the analysis tasks in Excel and KNIME; or treat missing values and convert categorical variables in Excel, and do only the outlier analysis in KNIME.
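The task itself asks for Excel and/or KNIME, so the following is only an illustrative sketch of the same sequence in plain Python, to make the order of operations concrete. Everything specific in it is an assumption and not taken from the data dictionary: the sample rows, the column names ApplicantIncome, CoapplicantIncome, LoanAmount, Loan_Term and Gender, the formulas Income_total = ApplicantIncome + CoapplicantIncome and Loan_Income_ratio = LoanAmount / Income_total, the median/mode imputation, and the 1.5 x IQR capping rule. The helper names impute, one_hot and cap_outliers are hypothetical.

```python
from statistics import median, mode, quantiles

# Hypothetical sample rows; column names are assumptions, not from the data dictionary.
rows = [
    {"Gender": "Male", "ApplicantIncome": 5000, "CoapplicantIncome": 0, "LoanAmount": 120, "Loan_Term": 360},
    {"Gender": "Female", "ApplicantIncome": 3000, "CoapplicantIncome": 1500, "LoanAmount": 100, "Loan_Term": None},
    {"Gender": None, "ApplicantIncome": 4000, "CoapplicantIncome": 2000, "LoanAmount": 150, "Loan_Term": 360},
    {"Gender": "Male", "ApplicantIncome": 81000, "CoapplicantIncome": 0, "LoanAmount": 200, "Loan_Term": 180},
    {"Gender": "Male", "ApplicantIncome": 4500, "CoapplicantIncome": 500, "LoanAmount": 110, "Loan_Term": 360},
]

def impute(rows, column, numeric):
    """Fill missing values with the median (numeric) or the mode (categorical)."""
    present = [r[column] for r in rows if r[column] is not None]
    fill = median(present) if numeric else mode(present)
    for r in rows:
        if r[column] is None:
            r[column] = fill

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category (one to many)."""
    categories = sorted({r[column] for r in rows})
    for r in rows:
        value = r.pop(column)
        for c in categories:
            r[f"{column}_{c}"] = 1 if value == c else 0

def cap_outliers(rows, column, k=1.5):
    """Cap values outside [Q1 - k*IQR, Q3 + k*IQR] and return the two fences."""
    q1, _, q3 = quantiles([r[column] for r in rows], n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    for r in rows:
        r[column] = min(max(r[column], low), high)
    return low, high

# 1.1.A: derived variables (assumed formulas, see lead-in).
for r in rows:
    r["Income_total"] = r["ApplicantIncome"] + r["CoapplicantIncome"]
    r["Loan_Income_ratio"] = r["LoanAmount"] / r["Income_total"]

# 1.1.B, step 1: missing value handling.
impute(rows, "Loan_Term", numeric=True)
impute(rows, "Gender", numeric=False)

# 1.1.B, step 2: categorical to numeric.
one_hot(rows, "Gender")

# 1.1.B, step 3: outlier identification and treatment.
fences = cap_outliers(rows, "Income_total")
print(fences)
```

Capping is only one possible treatment; removing the affected rows or flagging them for review would follow the same pattern. Whichever rule is chosen, running it after imputation and encoding keeps the order required in 1.1.B.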
