Question

Classification of Dry Beans
For the purposes of this assignment you will develop two different classification predictive models to classify
different types of dry beans. You have to write a report wherein you provide responses in clear narrative on the
aspects enumerated below, under appropriate section headings. Note that code will not be evaluated. Tables
and figures will also not be considered if these tables and figures are not accompanied by your own explanation
of what these tables and figures portray.
Complete the assignment in the following steps:

1. Download the DryBeanDataSet874.xlsx dataset. The dataset contains 13611 instances, 20 descriptive
   features, and the class feature Class in column U.

2. Without changing anything in the provided dataset, provide an analytics base table wherein you
   characterize all of the features of the dataset.

3. You now have to very carefully explore the dataset to identify data quality issues. For this part of
   your report, only identify the data quality issues and provide justifications for these issues. One of the
   data quality issues is that some of the class labels are missing. Exclude this data quality issue from the
   discussion.

4. Based on your analysis above, decide on two different machine learning approaches that you will employ
   to construct a predictive model for this problem. Give justifications for why you have selected these two
   approaches for this problem.

5. For this part of the assignment, only focus on the data quality issues with respect to the descriptive
   features. For each of the machine learning approaches, discuss the data pre-processing steps that you have
   implemented to optimally transform the dataset for that specific machine learning approach and to correct
   data quality issues. Note: do not do unnecessary data transformations. Carefully think about the data
   transformations needed for your selected machine learning algorithms. Provide justifications for each of
   these pre-processing steps. Should you decide not to address a data quality issue, justify this decision.
   When you pre-process the dataset, make sure that you do not change the order of the instances in the
   dataset.

6. For this part of the assignment, only use the instances that have a known class label. Develop the two
   predictive models and evaluate the performance of the two models. Make sure to construct optimal
   configurations of your chosen models both with respect to architecture and values for control parameters.
   Describe the process that you have followed to produce an optimal configuration for each model. For this
   purpose, carefully decide on the performance metrics that you will use. Conclude on which one of the two
   approaches is best for this problem, and support your conclusion with justifications. For the purposes of
   this assignment, make sure to report the performance based on a k-fold cross-validation. Decide on the
   number of folds with a justification.

7. For the last part of the assignment, focus returns to those instances that have a missing class label. Make
   use of k-nearest neighbours to impute a class label for each of these instances. Describe how you have
   used k-nearest neighbours for this purpose. You have to decide on the value of k with justification. In a
   table list the instance number and the imputed class label. Then, for your best model identified above,
   retrain the model on the new dataset with the imputed class labels. Report on the performance of the
   model, compared to the results obtained from step 6 above and conclude on the efficacy of the k-nearest
   neighbour self-labeling process.
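
The k-nearest neighbour label imputation described in the last step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the expected solution: the data is synthetic (two made-up bean classes "A" and "B" with two hypothetical features, an area-like and a roundness-like value), the candidate values of k are arbitrary, and everything is written in plain Python rather than with a machine learning library. It picks k by k-fold cross-validation on the labelled instances, standardizes features using the labelled instances only, and fills in the missing labels without changing the order of the instances.

```python
import math
import random
from collections import Counter

def standardize(rows, reference):
    """Scale columns to zero mean / unit variance using statistics from `reference`."""
    cols = list(zip(*reference))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=5, seed=0):
    """k-fold cross-validated accuracy of kNN; scaling is fitted inside each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        scaled_train = standardize(train_X, train_X)
        scaled_test = standardize([X[i] for i in test], train_X)
        correct += sum(knn_predict(scaled_train, train_y, q, k) == y[i]
                       for q, i in zip(scaled_test, test))
    return correct / len(X)

# Synthetic stand-in for the bean data (assumption: not the real dataset).
rng = random.Random(42)
X, truth = [], []
for label, area_mu, round_mu in [("A", 30000, 0.9), ("B", 60000, 0.7)]:
    for _ in range(40):
        X.append([rng.gauss(area_mu, 2000), rng.gauss(round_mu, 0.02)])
        truth.append(label)

# Hide a few labels to mimic instances with a missing class label.
labels = list(truth)
missing = sorted(rng.sample(range(len(X)), 8))
for i in missing:
    labels[i] = None

# Use only instances with a known label to choose k and to vote.
known = [i for i, lab in enumerate(labels) if lab is not None]
known_X = [X[i] for i in known]
known_y = [labels[i] for i in known]
candidate_ks = [1, 3, 5, 7, 9]  # odd values avoid ties between two classes
best_k = max(candidate_ks, key=lambda k: cv_accuracy(known_X, known_y, k))

# Impute each missing label; instance order is left untouched.
scaled_known = standardize(known_X, known_X)
scaled_missing = standardize([X[i] for i in missing], known_X)
imputed = {i: knn_predict(scaled_known, known_y, q, best_k)
           for i, q in zip(missing, scaled_missing)}
completed = [imputed.get(i, lab) for i, lab in enumerate(labels)]

print("chosen k:", best_k)
for i in missing:
    print(i, imputed[i])  # instance number and imputed class label
```

Standardizing before computing distances matters here because the two features live on very different scales; without it, the area-like feature would dominate the Euclidean distance. On the real dataset, the completed label list would then be used to retrain the best model from step 6 and compare its cross-validated performance with the earlier result.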