Question

1. (20 points) Download the BostonHousing2.xls file (which has been used in Assignment 2). The target attribute in this dataset is CATMEDV (which is a binary attribute converted from MEDV in the BostonHousing.xls file).

a. Within Excel, save the FullData sheet as a .CSV file, as you did for Assignment 2. Run Weka's support vector machines algorithm (SMO) on this data file, with 10-fold cross-validation. First, use the default parameter C = 1. Then, change the C value to 10 and 100 in sequence. Show the output screens that display the 10-fold cross-validation error rates in these three cases. How does the error rate change as the C value increases?

b. Based on the results with C = 100, what two attributes are the most important predictors? Explain the impact of these two predictors on classification in terms of how the classification result will change when the value of a predictor increases or decreases.

c. Run the SVM algorithm in Rattle on the same data, using a 70/30 partition, Linear (vanilladot) kernel, and C = 100. Show two output screens, one from the Model section and the other from the Evaluate section with the testing error rate and error matrix.
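For part (b), a linear SVM scores each record with a weighted sum of its attributes, so one common way to read importance is the magnitude of each weight, while the sign shows which class a larger attribute value pushes the record toward. The sketch below assumes the weights have been copied by hand from the SMO output and that the attributes are on a common scale (Weka's SMO normalizes attributes by default); without a common scale the magnitudes are not comparable. The attribute names and all numbers are made-up placeholders, not results from this dataset, and which class counts as "positive" depends on the class order shown in the Weka output.

```python
def top_predictors(weights, k=2):
    """Return the k (attribute, weight) pairs with the largest absolute weight."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


def direction(weight):
    """Describe which way an increase in the attribute moves the SVM score."""
    if weight > 0:
        return "toward the positive class"
    if weight < 0:
        return "toward the negative class"
    return "in neither direction"


# Placeholder weights (assumption): replace with the values from your own SMO output.
example_weights = {"CRIM": -0.8, "NOX": -0.3, "RM": 2.1, "LSTAT": -3.4}

for name, w in top_predictors(example_weights):
    print(f"{name}: weight {w:+.2f}; increasing it pushes the score {direction(w)}")
```

A weight near zero means the attribute barely moves the decision score, so changing its value rarely flips the predicted class.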

2. (20 points) Apply (i) decision trees (J48), (ii) Naive Bayes, (iii) k-NN (k = 1), and (iv) SVM (SMO) in Weka to classify the BostonHousing2 data used in Problem 1. Evaluate the performance of these four classification models based on (1) the overall classification accuracy, and (2) the ROC curve and AUC value, considering high-value homes as positive. The specific steps and questions for this problem are:

a. Run the four classification models in Weka on the data using the default settings (10-fold cross-validation, etc.). For each model, show two output screens: the first displays the 10-fold cross-validation error rates and the confusion matrix; the second displays the ROC curve (for your reference, see the output screens shown in the "Plotting ROC Curve in Weka" section of the lecture notes titled "Model and Performance Evaluation"). In sum, there are eight output screens, two for each classification model.

b. Based on the overall classification accuracy, rank the four models from the best to the worst.

c. Suppose you are only interested in accurately predicting/identifying high-value homes (so that the 'high' class is the positive class). In this case, how do you rank the four models from the best to the worst? Justify your answers with the relevant results from the Weka output.
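Parts (b) and (c) can produce different rankings, because overall accuracy and performance on the positive class are computed from different cells of the confusion matrix. The following is a minimal sketch of those calculations, assuming the four cell counts are read off the Weka output with 'high' as the positive class. The function names and all counts are hypothetical placeholders, not results from this dataset. The `auc` helper uses the pairwise-ranking definition of AUC (the chance that a random positive is scored above a random negative), which is one standard way to compute it.

```python
def summarize(tp, fn, fp, tn):
    """Compute common metrics from a 2x2 confusion matrix ('high' = positive)."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all records classified correctly
        "tp_rate": tp / (tp + fn),       # recall: share of 'high' homes that were found
        "fp_rate": fp / (fp + tn),       # share of non-'high' homes flagged as 'high'
        "precision": tp / (tp + fp),     # share of predicted 'high' that are truly 'high'
    }


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Placeholder counts (assumption): (tp, fn, fp, tn) for two hypothetical models.
model_a = summarize(60, 24, 10, 412)
model_b = summarize(70, 14, 25, 397)

for name, m in (("A", model_a), ("B", model_b)):
    print(f"model {name}: accuracy={m['accuracy']:.3f}, tp_rate={m['tp_rate']:.3f}")
```

With these placeholder counts, model A has the higher accuracy while model B finds more of the positive class, which is the kind of disagreement part (c) asks you to look for. The sketch does not guard against empty classes, so it would raise a division error if a model predicted no positives at all.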

3. (20 points) Download the BostonHousing.xls file (which has been used in Assignment 1). The target attribute in this dataset is MEDV (numeric). Delete the CAT.MEDV attribute (which is a binary attribute converted from MEDV) and save the data to a CSV file, as you did for Assignment 1.

a. Run Weka's LinearRegression algorithm with the default parameters and 10-fold cross-validation. Show the output screen with Linear Regression Model and the Cross-Validation Summary section with error results.

b. Run Weka's SVR algorithm (SMOreg) on this data. Set the error margin parameter (epsilonParameter) = 0.1. Keep the other default parameters unchanged. Show the output screen with SVR model and the Cross-Validation Summary section with error results. Compare and comment on the performance of SVR and that of linear regression in part (a), based on the 'Mean absolute error' and 'Root mean squared error' results.

c. What two attributes are the most important predictors based on the SVR model? Are they consistent with those identified in Problem 1b based on the SVM model?
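The two error measures named in part (b) weigh mistakes differently: mean absolute error averages the size of each miss, while root mean squared error squares the misses first and so penalizes large ones more heavily. The sketch below shows both formulas in plain Python. The function names and all values are hypothetical placeholders chosen for illustration, not MEDV predictions from either model.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all records."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference over all records."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Placeholder data (assumption): four target values and two sets of predictions.
actual = [20.0, 22.0, 30.0, 35.0]
pred_many_small_misses = [22.0, 24.0, 28.0, 33.0]   # every prediction off by 2
pred_one_large_miss = [20.0, 22.0, 30.0, 28.0]      # one prediction off by 7

for name, pred in (("small misses", pred_many_small_misses),
                   ("one large miss", pred_one_large_miss)):
    mae = mean_absolute_error(actual, pred)
    rmse = root_mean_squared_error(actual, pred)
    print(f"{name}: MAE={mae:.2f}, RMSE={rmse:.2f}")
```

In this toy case the second set of predictions has the lower MAE but the higher RMSE, so a model can look better on one measure and worse on the other. That is worth keeping in mind when comparing the two Cross-Validation Summary sections.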
