
Question


Problem 3. (20 points) In this problem, you will investigate building a decision tree for a binary classification problem. The training data is given in Table 1, with 16 instances that will be used to learn a decision tree for predicting whether a mushroom is edible or not based on its attributes (Color, Size, and Shape). Please note that the label set is the binary set {Yes, No}.

Instance  Color   Size   Shape      Edible?
D1        Yellow  Small  Round      Yes
D2        Yellow  Small  Round      No
D3        Green   Small  Irregular  Yes
D4        Green   Large  Irregular  No
D5        Yellow  Large  Round      Yes
D6        Yellow  Small  Round      Yes
D7        Yellow  Small  Round      Yes
D8        Yellow  Small  Round      Yes
D9        Green   Small  Round      No
D10       Yellow  Large  Round      No
D11       Yellow  Large  Round      Yes
D12       Yellow  Large  Round      No
D13       Yellow  Large  Round      No
D14       Yellow  Large  Round      No
D15       Yellow  Small  Irregular  Yes
D16       Yellow  Large  Irregular  Yes

Table 1: Mushroom data with 16 instances, three categorical features, and binary labels.
a) Which attribute would the algorithm choose for the root of the tree? Show the details of your calculations. Recall from lectures that if we let $S$ denote the data set at the current node, $A$ denote the feature with values $v \in V$, $H$ denote the entropy function, and $S_v$ denote the subset of $S$ for which the feature $A$ has the value $v$, the gain of a split along the feature $A$, denoted $\mathrm{InfoGain}(S, A)$, is computed as:
$$\mathrm{InfoGain}(S, A) = H(S) - \sum_{v \in V} \frac{|S_v|}{|S|} H(S_v)$$
That is, we take the entropy before the split and subtract the entropy of each new node after splitting, each weighted by the fraction of training examples that reach that node.
b) Draw the full decision tree that would be learned for this data (assume no pruning and you stop splitting a leaf node when all samples in the node belong to the same class, i.e., there is no information gain in splitting the node).
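
A minimal Python sketch of the calculations asked for in parts a) and b), not an official solution. It assumes Table 1 reads column by column from the flattened source, with the fourth row being D4. The names ROWS, ATTRS, entropy, split, info_gain, and build_tree are illustrative only. Because D1 and D2 share the same attribute values but carry different labels, some leaves cannot become pure; as an added assumption beyond the stated stopping rule, the sketch stops when no attributes remain and stores the majority label together with the class counts.

```python
from collections import Counter
from math import log2

ATTRS = ("Color", "Size", "Shape")  # column order of each row below

# Table 1 as (Color, Size, Shape, Edible?), D1 through D16.
# Assumption: the flattened table is column-major and row 4 is D4.
ROWS = [
    ("Yellow", "Small", "Round", "Yes"),      # D1
    ("Yellow", "Small", "Round", "No"),       # D2
    ("Green", "Small", "Irregular", "Yes"),   # D3
    ("Green", "Large", "Irregular", "No"),    # D4
    ("Yellow", "Large", "Round", "Yes"),      # D5
    ("Yellow", "Small", "Round", "Yes"),      # D6
    ("Yellow", "Small", "Round", "Yes"),      # D7
    ("Yellow", "Small", "Round", "Yes"),      # D8
    ("Green", "Small", "Round", "No"),        # D9
    ("Yellow", "Large", "Round", "No"),       # D10
    ("Yellow", "Large", "Round", "Yes"),      # D11
    ("Yellow", "Large", "Round", "No"),       # D12
    ("Yellow", "Large", "Round", "No"),       # D13
    ("Yellow", "Large", "Round", "No"),       # D14
    ("Yellow", "Small", "Irregular", "Yes"),  # D15
    ("Yellow", "Large", "Irregular", "Yes"),  # D16
]


def entropy(rows):
    """H(S): Shannon entropy, in bits, of the labels in rows."""
    n = len(rows)
    counts = Counter(r[-1] for r in rows)
    return -sum(c / n * log2(c / n) for c in counts.values())


def split(rows, attr):
    """Group rows by their value of attr, giving {v: S_v}."""
    i = ATTRS.index(attr)
    groups = {}
    for r in rows:
        groups.setdefault(r[i], []).append(r)
    return groups


def info_gain(rows, attr):
    """InfoGain(S, A) = H(S) - sum over v of |S_v|/|S| * H(S_v)."""
    n = len(rows)
    parts = split(rows, attr).values()
    return entropy(rows) - sum(len(g) / n * entropy(g) for g in parts)


def build_tree(rows, attrs):
    """Greedy tree: split on the highest-gain attribute until a stop rule fires."""
    counts = Counter(r[-1] for r in rows)
    leaf = {"label": counts.most_common(1)[0][0], "counts": dict(counts)}
    if len(counts) == 1 or not attrs:  # pure node, or nothing left to split on
        return leaf
    best = max(attrs, key=lambda a: info_gain(rows, a))
    if info_gain(rows, best) <= 1e-12:  # no information gain in splitting
        return leaf
    rest = [a for a in attrs if a != best]
    children = {v: build_tree(g, rest) for v, g in split(rows, best).items()}
    return {"attr": best, "children": children}


gains = {a: info_gain(ROWS, a) for a in ATTRS}
tree = build_tree(ROWS, list(ATTRS))

for a, g in gains.items():
    print(f"InfoGain(S, {a}) = {g:.4f}")
print("root attribute:", tree["attr"])
```

Under these assumptions, Size has the largest gain at the root (about 0.106 bits, against about 0.036 for each of Color and Shape). The nested dictionary in tree can be read off level by level to draw the tree for part b).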
