
Question


Nearest Neighbors
We use a subset of the "Iris Plants Database" dataset (provided by WEKA, contained in the "iris.arff" file).
Each plant record (i.e., example) is represented by the following 5 attributes:
SepalLength: the sepal length in cm.
SepalWidth: the sepal width in cm.
PetalLength: the petal length in cm.
PetalWidth: the petal width in cm.
Class: the classification attribute, with the possible values {Iris-setosa, Iris-versicolor, Iris-virginica}.
We want to predict the class for each of the following plants:
Plant #16: (SepalLength=4.6; SepalWidth=3.6; PetalLength=1.0; PetalWidth=0.2).
Plant #17: (SepalLength=6.1; SepalWidth=2.8; PetalLength=4.0; PetalWidth=1.3).
Plant #18: (SepalLength=7.7; SepalWidth=3.0; PetalLength=6.1; PetalWidth=2.3).
Part 1- Manual Computation
Apply the Nearest Neighbor learning algorithm to classify the three to-be-predicted plants (i.e., Plants #16-18), to determine what kind of plant each one is.
Try three different values for the neighborhood size, i.e., k=1, 3, and 5. Use one of the geometric distance functions (e.g., the Manhattan or Euclidean distance function).
For k=1, convert the data of the set of Plants #16-18 (together with their predicted class) into the ARFF format, and save it in the "plants_test1.arff" file.
For k=3, convert the data of the set of Plants #16-18 (together with their predicted class) into the ARFF format, and save it in the "plants_test2.arff" file.
For k=5, convert the data of the set of Plants #16-18 (together with their predicted class) into the ARFF format, and save it in the "plants_test3.arff" file.
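The manual computation in Part 1 can be cross-checked with a short script. The following is a minimal sketch, not a reference solution: the rows in `TRAIN` are hypothetical placeholders (the question as reproduced here does not list Plants #1-15), and the names `TRAIN`, `QUERIES`, and `knn_predict` are invented for illustration. Only the three query plants come from the question.

```python
import math
from collections import Counter

# HYPOTHETICAL training rows for illustration only. These are NOT Plants #1-15;
# replace them with the 15 real examples before drawing any conclusion.
TRAIN = [
    ((5.0, 3.4, 1.5, 0.2), "Iris-setosa"),
    ((4.8, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((5.1, 3.5, 1.4, 0.3), "Iris-setosa"),
    ((6.0, 2.9, 4.2, 1.3), "Iris-versicolor"),
    ((5.8, 2.7, 4.1, 1.2), "Iris-versicolor"),
    ((6.3, 2.8, 4.5, 1.4), "Iris-versicolor"),
    ((7.4, 3.0, 6.0, 2.2), "Iris-virginica"),
    ((7.6, 3.1, 6.3, 2.1), "Iris-virginica"),
    ((7.2, 3.0, 5.8, 1.9), "Iris-virginica"),
]

# The three plants to classify, as given in the question
# (SepalLength, SepalWidth, PetalLength, PetalWidth).
QUERIES = {
    16: (4.6, 3.6, 1.0, 0.2),
    17: (6.1, 2.8, 4.0, 1.3),
    18: (7.7, 3.0, 6.1, 2.3),
}


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute per-attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, distance=euclidean):
    """Return the majority class among the k training rows nearest to query."""
    # sorted() is stable, so rows at equal distance keep their training order.
    nearest = sorted(train, key=lambda row: distance(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, Counter returns the class that was encountered first,
    # i.e. the class of the nearer neighbor.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for k in (1, 3, 5):
        for plant_id, features in QUERIES.items():
            print(k, plant_id, knn_predict(TRAIN, features, k))
```

Passing `distance=manhattan` switches the metric. The predictions printed for the placeholder rows say nothing about the real exercise; they only show that the distance and voting steps behave as described.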
Part 2- Analysis with WEKA
Convert the dataset containing 15 examples (i.e., Plants #1-15) into the ARFF format (supported by WEKA), and save it in the "plants_train.arff" file.
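An ARFF file is plain text: a `@relation` line, one `@attribute` line per column, then `@data` followed by comma-separated rows. A minimal sketch of a writer is shown here. The helper name `to_arff` and the example row are hypothetical, and the attribute names follow the question's spelling rather than WEKA's bundled iris.arff.

```python
ATTRIBUTES = ("SepalLength", "SepalWidth", "PetalLength", "PetalWidth")
CLASSES = ("Iris-setosa", "Iris-versicolor", "Iris-virginica")


def to_arff(relation, rows):
    """Render (features, class_label) rows as ARFF text."""
    lines = [f"@relation {relation}", ""]
    # The four measurements are numeric attributes.
    for name in ATTRIBUTES:
        lines.append(f"@attribute {name} numeric")
    # The class is a nominal attribute with its allowed values listed in braces.
    lines.append("@attribute Class {" + ",".join(CLASSES) + "}")
    lines += ["", "@data"]
    for features, label in rows:
        lines.append(",".join(str(v) for v in features) + "," + label)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # HYPOTHETICAL row for illustration; it is not one of Plants #1-15.
    example = [((5.0, 3.4, 1.5, 0.2), "Iris-setosa")]
    print(to_arff("plants_train", example))
```

Writing the returned string to "plants_train.arff" (for instance with `pathlib.Path.write_text`) produces a file WEKA can open. The test files should use the same attribute declarations in the same order, since a supplied test set has to match the training header.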
Launch the WEKA tool, and then activate the "Explorer" environment.
Open the "plants_train" dataset (i.e., saved in the "plants_train.arff" file).
- For each attribute and each of its possible values, how many instances of each class have that value (i.e., the class distribution of the feature values)?
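The class distribution of the feature values can also be tallied outside WEKA as a sanity check. This is a minimal sketch under assumptions: the function name `class_distribution` is made up, and the rows in `ROWS` are hypothetical, not the real Plants #1-15.

```python
from collections import Counter, defaultdict

ATTRIBUTES = ("SepalLength", "SepalWidth", "PetalLength", "PetalWidth")

# HYPOTHETICAL rows for illustration only; substitute the real training data.
ROWS = [
    ((5.0, 3.4, 1.5, 0.2), "Iris-setosa"),
    ((4.8, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((6.0, 3.0, 4.2, 1.3), "Iris-versicolor"),
]


def class_distribution(rows):
    """Map attribute name -> attribute value -> Counter of class labels."""
    dist = {name: defaultdict(Counter) for name in ATTRIBUTES}
    for features, label in rows:
        for name, value in zip(ATTRIBUTES, features):
            dist[name][value][label] += 1
    return dist


if __name__ == "__main__":
    dist = class_distribution(ROWS)
    for name in ATTRIBUTES:
        for value in sorted(dist[name]):
            print(name, value, dict(dist[name][value]))
```

Each printed line gives one attribute value and the number of instances per class that carry it.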
Go to the "Classify" tab. Select the IBk classifier. In the "Test options" panel select the "Supplied test set" option. Activate the nearby "Set..." button and locate the "plants_test1.arff" file. Run the classifier and observe the results shown in the "Classifier output" window.
- How many instances are used for training? How many for testing?
- How many instances are incorrectly classified?
- What is the MAE (mean absolute error) made by the learned classifier?
- What can you infer from the information shown in the Confusion Matrix?
- Visualize the errors made by the learned classifier. In the plot, how can you differentiate between the correctly and incorrectly classified instances? In the plot, how can you see the detailed information of an incorrectly classified instance?
- How can you save the learned classifier to a file?
Now, click on the "IBk -K 1 -W 0" label (i.e., close to the "Choose" button). Set KNN equal to 3 (i.e., to use the neighborhood size of 3), and then click the "OK" button to save the new setting. Activate the nearby "Set..." button and locate the "plants_test2.arff" file. Run the classifier and observe the results shown in the "Classifier output" window.
- How many instances are incorrectly classified?
- What is the MAE (mean absolute error) made by the learned classifier?
- What can you infer from the information shown in the Confusion Matrix?
- Visualize the errors made by the learned classifier. In the plot, how can you differentiate between the correctly and incorrectly classified instances? In the plot, how can you see the detailed information of an incorrectly classified instance?
Now, click on the "IBk -K 3 -W 0" label (i.e., close to the "Choose" button). Set KNN equal to 5 (i.e., to use the neighborhood size of 5), and then click the "OK" button to save the new setting. Activate the nearby "Set..." button and locate the "plants_test3.arff" file. Run the classifier and observe the results shown in the "Classifier output" window.
- How many instances are incorrectly classified?
- What is the MAE (mean absolute error) made by the learned classifier?
- What can you infer from the information shown in the Confusion Matrix?
- Visualize the errors made by the learned classifier. In the plot, how can you differentiate between the correctly and incorrectly classified instances? In the plot, how can you see the detailed information of an incorrectly classified instance?
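The error count, confusion matrix, and MAE that these questions ask about can be reproduced by hand from the labels in the test file and the classifier's output. The sketch below rests on two assumptions that should be checked against WEKA's own output: rows of the matrix are the labels in the test file and columns are the classifier's predictions, and MAE for a nominal class is the absolute difference between each predicted class probability and the 0/1 indicator of the test-file label, averaged over classes and instances. All function names and example values are hypothetical.

```python
CLASSES = ("Iris-setosa", "Iris-versicolor", "Iris-virginica")


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows index the test-file label, columns index the predicted label."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def count_errors(matrix):
    """Instances off the diagonal are the incorrectly classified ones."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return total - correct


def mean_absolute_error(actual, distributions, classes=CLASSES):
    """ASSUMED definition: mean |p(class) - indicator| over classes and instances."""
    total = 0.0
    for a, probs in zip(actual, distributions):
        per_class = sum(
            abs(p - (1.0 if c == a else 0.0)) for c, p in zip(classes, probs)
        )
        total += per_class / len(classes)
    return total / len(actual)


if __name__ == "__main__":
    # HYPOTHETICAL labels and probabilities, not results from the exercise.
    actual = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
    predicted = ["Iris-setosa", "Iris-versicolor", "Iris-versicolor"]
    probs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    m = confusion_matrix(actual, predicted)
    print(m, count_errors(m), mean_absolute_error(actual, probs))
```

Because the "actual" class in each test file is the manually predicted one, a nonzero error count here means the manual k-NN result and the classifier's result disagree, not that either is wrong about the true species.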
