Question

Part 1: Initial tree

This part of the project asks you to predict a module's class, fault-prone (fp) or not fault-prone (nfp), using J48 (C4.5), a decision-tree-based classification algorithm.

Build a classification model with J48 (C4.5) using the fit data set and 10-fold cross-validation. From the confusion matrix, determine the misclassification error rate (%) for each of the two types of misclassification:

Type I: an nfp module is classified as fp

Type II: an fp module is classified as nfp
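
The two rates defined above can be read off a 2x2 confusion matrix. The following is a minimal sketch in Python; the function name and the counts are made up for illustration and are not taken from the fit or test data set. Each rate is computed relative to the number of modules that actually belong to the class in question.

```python
def misclassification_rates(nfp_as_nfp, nfp_as_fp, fp_as_nfp, fp_as_fp):
    """Return (type_i_pct, type_ii_pct, overall_pct) from confusion-matrix counts.

    Argument names read as "actual class"_as_"predicted class".
    """
    actual_nfp = nfp_as_nfp + nfp_as_fp
    actual_fp = fp_as_nfp + fp_as_fp
    # Type I: nfp modules classified as fp, as a share of all actual nfp modules.
    type_i = 100.0 * nfp_as_fp / actual_nfp
    # Type II: fp modules classified as nfp, as a share of all actual fp modules.
    type_ii = 100.0 * fp_as_nfp / actual_fp
    overall = 100.0 * (nfp_as_fp + fp_as_nfp) / (actual_nfp + actual_fp)
    return type_i, type_ii, overall


# Hypothetical counts, for illustration only.
type_i, type_ii, overall = misclassification_rates(
    nfp_as_nfp=90, nfp_as_fp=10, fp_as_nfp=5, fp_as_fp=15
)
print(f"Type I: {type_i:.1f}%  Type II: {type_ii:.1f}%  Overall: {overall:.1f}%")
```

With these made-up counts, 10 of 100 nfp modules are misclassified (Type I) and 5 of 20 fp modules are misclassified (Type II), so the overall error rate can look acceptable while the Type II rate is high.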

Record the number of leaves and nodes in the selected tree, and represent the tree in the same way as in the textbook.
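
As a rough illustration of what "leaves" and "nodes" mean here, the sketch below counts both for a small hand-written tree. The nested-tuple representation, the attribute names, and the thresholds are hypothetical; the tool's own output reports these figures directly, so this only shows how the two numbers relate.

```python
def count_nodes_and_leaves(tree):
    """Return (total_nodes, leaves) for a tree.

    A tree is either a leaf label (a string such as "fp" or "nfp")
    or a tuple (test_description, [subtrees]) for an internal node.
    """
    if isinstance(tree, str):
        return 1, 1  # a leaf is one node and one leaf
    _, children = tree
    nodes, leaves = 1, 0  # count the internal node itself
    for child in children:
        child_nodes, child_leaves = count_nodes_and_leaves(child)
        nodes += child_nodes
        leaves += child_leaves
    return nodes, leaves


# Hypothetical tree with made-up attribute names and thresholds.
toy_tree = ("metric_a <= 10", ["nfp", ("metric_b <= 3", ["nfp", "fp"])])
nodes, leaves = count_nodes_and_leaves(toy_tree)
print(f"nodes={nodes}, leaves={leaves}")
```

For this toy tree there are two internal nodes and three leaves, so the total size is five nodes.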

Repeat the previous tasks using the test data set to evaluate the model.

Part 2: Unpruned tree

Now, in the J48 options, set the unpruned option to true. Rebuild the model in the same way as above and repeat all the steps.

Now that you have represented the unpruned tree, compare with the tree generated above, and determine the part that was pruned.

Part 3: Confidence Factor

Now, in the J48 options, set the confidence factor (C) to 0.01. Rebuild the model in the same way as for the initial tree (Part 1) and repeat all the steps of Part 1.

How does the size of the new tree compare to the one built in Part 1? Explain why. What part was pruned?
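
To build intuition for the role of the confidence factor, the sketch below computes the pessimistic (upper-bound) error estimate commonly given in textbook descriptions of C4.5 pruning, based on a normal approximation to the binomial. This is an assumption-laden illustration: the function name is hypothetical, the leaf counts are made up, and an actual J48 implementation may compute the bound differently in detail.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors, n, confidence):
    """Upper confidence bound on a leaf's true error rate.

    errors: training instances misclassified at the leaf
    n: training instances reaching the leaf
    confidence: the confidence factor C (for example 0.25 or 0.01)
    """
    f = errors / n  # observed error rate at the leaf
    z = NormalDist().inv_cdf(1.0 - confidence)  # one-sided normal quantile
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)


# Hypothetical leaf: 2 errors out of 6 training instances.
observed = 2 / 6
estimate_c25 = pessimistic_error(2, 6, 0.25)
estimate_c01 = pessimistic_error(2, 6, 0.01)
print(f"observed={observed:.3f}  C=0.25: {estimate_c25:.3f}  C=0.01: {estimate_c01:.3f}")
```

Under this approximation, a smaller C yields a larger quantile and therefore a more pessimistic error estimate for each leaf. Subtrees then look less worthwhile compared with a single replacing leaf, which is consistent with heavier pruning at C = 0.01.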

Part 4: Cost sensitivity

Until now, we have made no distinction between Type I and Type II errors. However, in software quality classification, a Type II error is more serious than a Type I error. Here, the objective is to obtain balanced misclassification rates, with the Type II rate as low as possible.

Use the cost-sensitive classifier combined with J48 and determine the optimal cost ratio (set the cost of a Type I error to 1 and vary the cost of a Type II error), using 10-fold cross-validation on the fit data set. Observe the trends in the misclassification rates. What happens when the cost of a Type II error decreases or increases?
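
One way to see why the cost ratio shifts the two error rates is the minimum-expected-cost decision rule, sketched below. This is only one of the ways cost sensitivity can be applied (reweighting the training data is another), so it should not be read as the exact mechanism of the classifier used in the assignment. The function names and the probability value are hypothetical.

```python
def fp_threshold(cost_type_i, cost_type_ii):
    """Probability of fp above which predicting fp has the lower expected cost.

    Predicting nfp costs p_fp * cost_type_ii in expectation;
    predicting fp costs (1 - p_fp) * cost_type_i.
    """
    return cost_type_i / (cost_type_i + cost_type_ii)


def classify(p_fp, cost_type_i=1.0, cost_type_ii=1.0):
    """Pick the class with the lower expected misclassification cost."""
    return "fp" if p_fp > fp_threshold(cost_type_i, cost_type_ii) else "nfp"


# Hypothetical module whose predicted probability of being fp is 0.3.
p_fp = 0.3
for ratio in (1, 2, 5, 10):
    label = classify(p_fp, cost_type_i=1.0, cost_type_ii=float(ratio))
    print(f"Type II cost={ratio:>2}  threshold={fp_threshold(1.0, ratio):.3f}  -> {label}")
```

As the Type II cost grows, the threshold for predicting fp drops, so more modules are labeled fp: the Type II rate tends to fall while the Type I rate tends to rise. Lowering the Type II cost has the opposite effect.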

Evaluate all the models on the test data set.
