Question
You are expected to identify any data repository and extract one secondary
dataset. You are to provide a step-by-step procedure on how to pre-process
the extracted dataset and use the procedure to preprocess the extracted
data.
a) What is data?
[1 mark]
b) What is the difference between primary data and secondary data?
[2 marks]
c) What is the name of the data repository you identified? Provide the
repository's URL.
[2 marks]
d) Write the step-by-step procedure you will consider for preprocessing
the extracted dataset.
[5 marks]
e) Implement (d) on the extracted dataset. Upload both the original
dataset and the preprocessed dataset.
[5 marks]
f) Write the three phases for preparing the data in a text file to be called
in WEKA.
[1.5 marks]
g) What is the file extension for data files to be called in MATLAB, R
Software, WEKA, SPSS and RapidMiner?
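As a rough illustration for part (f), and assuming the "three phases" refer to the three sections of a WEKA ARFF text file (the `@relation` line, the `@attribute` declarations, and the `@data` rows), the sketch below builds such a file as a string. The helper name `to_arff`, the relation name `weather`, and the sample rows are hypothetical and exist only for this example.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text in three parts: relation name, attribute list, data rows."""
    # Part 1: name the relation (dataset).
    lines = [f"@relation {relation}", ""]
    # Part 2: declare each attribute with its type (numeric, or a nominal value set).
    for name, kind in attributes:
        lines.append(f"@attribute {name} {kind}")
    # Part 3: list the instances as comma-separated rows; "?" marks a missing value.
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical sample data, made up for illustration only.
arff_text = to_arff(
    "weather",
    [("temperature", "numeric"), ("play", "{yes,no}")],
    [(30.5, "no"), (None, "yes")],
)
print(arff_text)
```

Saving `arff_text` to a file with the `.arff` extension should give something WEKA can open, though a real dataset may need extra care, for example quoting values that contain spaces or commas.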