Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Exercise 5.7 (Decision trees, regularization; Section 5.6) Pruning a decision tree: Let S be a labeled sample drawn iid from some distribution D over 0,1),
Exercise 5.7 (Decision trees, regularization; Section 5.6) Pruning a decision tree: Let S be a labeled sample drawn iid from some distribution D over 0,1), and suppose we have used S to create some decision tree T. However, the tree T is large, and we are concerned we might be overfitting. Give a polynomial-time algorithm for pruning T that finds the pruning h of T that optimizes the right-hand-side of Corollary 5.8, i.e., that for a given ? > 0 minimizes errs(h) +?l size(h) In(4) + In(2/6) 21S To discuss this, we need to define what we mean by a "pruning" of T and what we mean by the "size" of h. A pruning h of T is a tree in which some internal nodes of T have been turned into leaves, labeled "+" or "depending on whether the majority of eramples in S that reach that node are positive or negative. Let size(h) = L(h) log(n) where L(h) is the number of leaves in h Hint # 1: it is sufficient, for each integer L = 1,2, ,L(T), to find the pruning of T with L leaves of lowest empirical error on S, that is, hL = argminh:L(h)-Lerrs(h). Then you can just plug them all into the displayed formula above and pick the best one Hint #2: use dynamic programming Exercise 5.7 (Decision trees, regularization; Section 5.6) Pruning a decision tree: Let S be a labeled sample drawn iid from some distribution D over 0,1), and suppose we have used S to create some decision tree T. However, the tree T is large, and we are concerned we might be overfitting. Give a polynomial-time algorithm for pruning T that finds the pruning h of T that optimizes the right-hand-side of Corollary 5.8, i.e., that for a given ? > 0 minimizes errs(h) +?l size(h) In(4) + In(2/6) 21S To discuss this, we need to define what we mean by a "pruning" of T and what we mean by the "size" of h. A pruning h of T is a tree in which some internal nodes of T have been turned into leaves, labeled "+" or "depending on whether the majority of eramples in S that reach that node are positive or negative. Let size(h) = L(h) log(n) where L(h) is the number of leaves in h Hint # 1: it is sufficient, for each integer L = 1,2, ,L(T), to find the pruning of T with L leaves of lowest empirical error on S, that is, hL = argminh:L(h)-Lerrs(h). Then you can just plug them all into the displayed formula above and pick the best one Hint #2: use dynamic programming
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started