Question

Consider a labeled data set containing 100 data instances, which are randomly partitioned into two sets A and B, each containing 50 instances. We use A as the training set to learn two decision trees: T10 with 10 leaf nodes and T100 with 100 leaf nodes. The accuracies of the two decision trees on data sets A and B are shown below:

Data Set    T10     T100
A           0.86    0.97
B           0.84    0.77

(a) Based on the accuracies shown in the table above, which classification model would you expect to have better performance on unseen instances?

(b) Now you've tested T10 and T100 on the entire data set (A + B) and found that the classification accuracy of T10 on (A + B) is 0.85, whereas the classification accuracy of T100 on (A + B) is 0.87. Based on this new information and your observations from the table above, which classification model would you finally choose for classification?
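
Because A and B each contain 50 instances, the accuracy on (A + B) is the size-weighted mean of the per-set accuracies, so the figures in part (b) add no information beyond the table. The short sketch below checks that arithmetic. It assumes the table reads T10: 0.86 on A and 0.84 on B, and T100: 0.97 on A and 0.77 on B; the function name `combined_accuracy` is hypothetical and used only for illustration.

```python
def combined_accuracy(acc_a, n_a, acc_b, n_b):
    """Accuracy on the union of two disjoint sets, weighted by set size."""
    return (acc_a * n_a + acc_b * n_b) / (n_a + n_b)


# Assumed reading of the accuracy table (A = training set, B = held-out set).
t10_combined = combined_accuracy(0.86, 50, 0.84, 50)
t100_combined = combined_accuracy(0.97, 50, 0.77, 50)

# Rounded to two decimals for comparison with the values quoted in part (b).
print(round(t10_combined, 2), round(t100_combined, 2))
```

Under that reading, the rounded results agree with the 0.85 and 0.87 quoted in part (b), which suggests the higher combined score of T100 comes from its accuracy on the training half A.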

Step by Step Solution

Step: 1

(a) You should choose T10 for unseen instances because it ha...
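
As a rough numerical sketch of the comparison in this step, the snippet below computes the gap between training accuracy (set A) and held-out accuracy (set B) for each tree and picks the tree that scores higher on B. The accuracy values are an assumed reading of the table (T10: 0.86 on A, 0.84 on B; T100: 0.97 on A, 0.77 on B), and the names `accuracies` and `generalization_gap` are hypothetical.

```python
# Assumed reading of the accuracy table: A is the training set, B is held out.
accuracies = {
    "T10": {"A": 0.86, "B": 0.84},
    "T100": {"A": 0.97, "B": 0.77},
}


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus held-out accuracy; a large gap hints at overfitting."""
    return train_acc - test_acc


gaps = {
    name: generalization_gap(acc["A"], acc["B"])
    for name, acc in accuracies.items()
}

# Only set B was unseen during training, so it is the basis for the choice.
best_on_held_out = max(accuracies, key=lambda name: accuracies[name]["B"])

for name, gap in gaps.items():
    print(f"{name}: train-test gap = {gap:.2f}")
print("Higher held-out accuracy:", best_on_held_out)
```

Under these assumptions the larger tree shows the wider train-test gap, which is the usual sign of overfitting, while the smaller tree scores higher on the held-out set.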

