Question:

Exercise 7.10 In choosing which feature to split on in decision-tree search, an alternative heuristic to the max information split of Section 7.3.1 is to use the Gini index.

The Gini index of a set of examples (with respect to target feature Y) is a measure of the impurity of the examples:

$$\mathit{gini}_Y(\mathit{Examples}) = 1 - \sum_{\mathit{Val}} \left( \frac{|\{e \in \mathit{Examples} : \mathit{val}(e, Y) = \mathit{Val}\}|}{|\mathit{Examples}|} \right)^2$$

where |{e ∈ Examples : val(e, Y) = Val}| is the number of examples with value Val of feature Y, and |Examples| is the total number of examples. The Gini index is always non-negative and has value zero only if all of the examples have the same value on the feature. The Gini index reaches its maximum value when the examples are evenly distributed among the values of the features.
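As a quick check of the definition, the Gini index can be computed directly from the target values of a set of examples. The following is a minimal sketch, not part of the exercise text; the function name `gini` and the choice to pass the target values as a plain list are assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini index of a non-empty list of target-feature values.

    Assumption: `labels` holds val(e, Y) for each example e.
    """
    n = len(labels)
    counts = Counter(labels)  # number of examples per value of Y
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# All examples agree on Y, so the set is pure.
print(gini(["yes", "yes", "yes"]))
# Two values, evenly distributed: the maximum for a binary target.
print(gini(["yes", "no"]))
```

A pure set gives 0, and an even two-way distribution gives 0.5, which matches the properties stated above.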

One heuristic for choosing which property to split on is to choose the split that minimizes the total impurity of the training examples on the target feature, summed over all of the leaves.
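One way to read this heuristic is sketched below. It assumes each leaf's Gini index is weighted by the fraction of examples that reach the leaf, which is a common convention but only one interpretation of "total impurity"; an unweighted sum would be a different reading. The names `split_impurity` and `best_split`, the dictionary representation of examples, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini index of a non-empty list of target values."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(examples, feature, target):
    """Size-weighted Gini index of the leaves produced by splitting on `feature`.

    Assumption: each example is a dict mapping feature names to values.
    """
    leaves = defaultdict(list)
    for e in examples:
        leaves[e[feature]].append(e[target])
    total = len(examples)
    return sum(len(leaf) / total * gini(leaf) for leaf in leaves.values())


def best_split(examples, features, target):
    """Feature whose split minimizes the weighted impurity."""
    return min(features, key=lambda f: split_impurity(examples, f, target))


# Toy data, invented for illustration only.
examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
print(best_split(examples, ["outlook", "windy"], "play"))
```

On this toy data, splitting on `outlook` yields two pure leaves, while splitting on `windy` leaves both leaves evenly mixed, so `outlook` is selected.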

(a) Implement a decision-tree search algorithm that uses the Gini index.

(b) Try both the Gini index algorithm and the maximum information split algorithm on some databases and see which results in better performance.

(c) Find an example database where the Gini index finds a different tree than the maximum information gain heuristic. Which heuristic seems to be better for this example? Consider which heuristic seems more sensible for the data at hand.

(d) Try to find an example database where the maximum information split seems more sensible than the Gini index, and try to find another example for which the Gini index seems better. [Hint: Try extreme distributions.]
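When exploring parts (b) through (d), it may help to see how the two impurity measures respond to skewed distributions, as the hint suggests. The sketch below compares the Gini index with entropy in bits (the quantity underlying information gain) on a few hand-picked binary distributions. The function names and the chosen distributions are assumptions for illustration, and the sketch is a starting point for experiments rather than an answer to the exercise.

```python
from math import log2


def gini_from_probs(probs):
    """Gini index of a distribution given as probabilities."""
    return 1.0 - sum(p * p for p in probs)


def entropy_from_probs(probs):
    """Entropy in bits; terms with zero probability contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


# From evenly distributed to increasingly extreme.
for probs in [(0.5, 0.5), (0.9, 0.1), (0.99, 0.01), (1.0, 0.0)]:
    g = gini_from_probs(probs)
    h = entropy_from_probs(probs)
    print(probs, "gini:", round(g, 4), "entropy:", round(h, 4))
```

Both measures are zero for a pure distribution and largest for the even one, but they shrink at different rates as the distribution becomes more extreme, which is where the two heuristics can rank candidate splits differently.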

Step by Step Answer:

Answered By

Tobias sifuna
