Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Use the python - based popular industry data mining software packages and development / analysis tools - Panda, Scikit - Learn, Seaborn, etc. Study the

Use the python-based popular industry data mining software
packages and development/analysis tools - Panda,
Scikit-Learn, Seaborn, etc. Study the related documentation
from the software user guide.
5. On the given dataset, execute any predictive data mining
technique such as decision trees, Bayesian classifiers, neural
networks, k-nearest neighbors, ensemble learning, linear
regression etc.
6. You can implement multiple techniques, one followed by another
on the same dataset if you prefer to use this for your analysis.
7. If needed, you should convert the format of your dataset as
needed by the respective technique(s). If you have already
performed conversion in the assignment on data preprocessing,
you can use the converted data here.
8. If you would like to use the results of your descriptive data
mining techniques and conduct further analysis with predictive
data mining, that is fine as well.
9. You should aim to achieve robustness and generalization in the
mining, e.g. by altering seeds in the algorithm.
10. You must modify the concerned parameters, e.g. learning rate
and error threshold for neural networks, the value of k for
k-nearest neighbors etc. to get good results. Execute at least 3
different combinations of parameters and present the
experimental results accordingly for at least one technique.
11. Observe the experimental results and draw useful conclusions
from the data.
12. Based on the hypothesis obtained by the learning in the
predictive data mining technique(s), write a simple program that
uses the learned hypothesis to classify new data, and thus
serves as a prototype mini classification tool.
13. If you have used multiple techniques, you can select the one
that gives greatest accuracy or the one that is most suitable to
your data and domain etc.
14. The program should communicate with the user to give
outputs based on new unseen data. For example, consider that
the learned hypothesis is based on a dataset that uses weather
data to predict if it is okay to play tennis. This can include various
attributes such as temperature,chance-of-rain and so forth to
estimate the target playing tennis as being yes,no or
maybe. It can be used to manually derive rules such as if
temperature = medium and chance-of-rain = low then play-tennis
= yes which can be coded into the program. Thereafter, when a
user inputs new values for parameters such as temperature,
the program can use these rules to output the classification
target and thereby suggest to the user Yes, you can surely play
tennis today or No, you should not play tennis today or
Maybe, it seems okay to play tennis today. This example can
be modified based on your dataset and domain.
15. Show at least 3 different runs of such inputs and outputs
based on user interaction. The final goal is to have simple
communication with the user based on the learning done via
predictive data mining for classifying the target attribute. Hence,
please work accordingly based on your respective technique(s)
and application. You can tune this as per the needs of your MS
project, research topics or any other area of interest.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Advanced Database Systems

Authors: Carlo Zaniolo, Stefano Ceri, Christos Faloutsos, Richard T. Snodgrass, V.S. Subrahmanian, Roberto Zicari

1st Edition

155860443X, 978-1558604438

Students also viewed these Databases questions