
Question

I have a problem that I need to solve, but I am stuck on the solution. It concerns the algorithm for building a CART model in the case of regression.

The problem statement is below:

Problem 1: (30 points)

Consider the algorithm for building a CART model in the case of regression. Following and expanding on the notation from class, suppose that our current tree, denoted by $T_{\text{old}}$, has $|T_{\text{old}}| = M$ terminal nodes/buckets. For each bucket $m = 1, \ldots, M$, let:

1. $N_m$ denote the number of observations in bucket $m$,
2. $Q_m(T_{\text{old}})$ denote the value of the impurity function at bucket $m$, and
3. $R_m$ denote the region in the feature space corresponding to bucket $m$.

Also let $N$ be the overall total number of observations. Recall that, in the case of regression, we have:

$$Q_m(T_{\text{old}}) = \frac{1}{N_m} \sum_{i:\, x_i \in R_m} (y_i - \bar{y}_m)^2,$$

where $\bar{y}_m = \frac{1}{N_m} \sum_{i:\, x_i \in R_m} y_i$ is the mean response in bucket $m$. Then the total impurity cost of the tree $T_{\text{old}}$ is defined as:

$$C_{\text{imp}}(T_{\text{old}}) = \sum_{m=1}^{M} N_m Q_m(T_{\text{old}}).$$

Consider a potential split at the final bucket $M$ (we're using $M$ just for ease of notation), which results in a new tree $T_{\text{new}}$. This new tree has $|T_{\text{new}}| = M + 1$ terminal nodes/buckets, and for this new tree we let:

1. $N_m$ denote the number of observations in bucket $m$,
2. $Q_m(T_{\text{new}})$ denote the value of the impurity function at bucket $m$, and
3. $R_m$ denote the region in the feature space corresponding to bucket $m$.

The total impurity cost of the tree $T_{\text{new}}$ is defined analogously as:

$$C_{\text{imp}}(T_{\text{new}}) = \sum_{m=1}^{M+1} N_m Q_m(T_{\text{new}}).$$

Please answer the following:

a) (10 points) Let $\Delta = C_{\text{imp}}(T_{\text{old}}) - C_{\text{imp}}(T_{\text{new}})$ be the absolute decrease in total impurity resulting from the split. Derive a formula for $\Delta$ that can be computed locally at the bucket $M$; in other words, it should only depend on the data points that fall in region $R_M$ in the original tree $T_{\text{old}}$. (Hint: we've discussed this concept in class; this question is asking for a more formal argument. You may assume that the two new buckets in $T_{\text{new}}$ resulting from the split are labeled as buckets $M$ and $M + 1$ in $T_{\text{new}}$.)

b) (10 points) Show that $\Delta \geq 0$, hence splitting always reduces the total impurity cost. (Hint: you can use the fact that, given a sequence of real numbers $z_1, z_2, \ldots, z_n$, the mean $\bar{z} = \frac{1}{n} \sum_{i=1}^{n} z_i$ is the minimizer of the function $\mathrm{RSS}(z) = \sum_{i=1}^{n} (z_i - z)^2$.)

c) (10 points) Let $R^2_{\text{old}}$ be the training set $R^2$ value for the model defined by $T_{\text{old}}$, and likewise let $R^2_{\text{new}}$ be the training set $R^2$ value for the model defined by $T_{\text{new}}$. Let $\mathrm{SST} = \sum_{i=1}^{N} (y_i - \bar{y})^2$ be the total sum of squared errors, where $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$ is the overall mean. For a given value of the complexity parameter (cp) $\alpha \geq 0$, recall the modified cost function that is relevant in the pruning step:

$$C_\alpha(T) = C_{\text{imp}}(T) + \alpha \cdot \mathrm{SST} \cdot |T|.$$

Show that $C_\alpha(T_{\text{new}}) \leq C_\alpha(T_{\text{old}})$ if and only if $\Delta R^2 = R^2_{\text{new}} - R^2_{\text{old}} \geq \alpha$. (Hence the choice of retaining a split if the increase in $R^2$ is at least $\alpha$ is equivalent to retaining a split if the modified cost function is smaller after the split.)
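Before attempting the formal argument, it may help to check the three claims numerically. The sketch below is not a proof and is not taken from the course material. It assumes NumPy, uses made-up data, and all function names (`node_cost`, `total_cost`, `local_decrease`, `modified_cost`) are hypothetical. It also assumes the usual identity that the training-set $R^2$ of a regression tree equals $1 - C_{\text{imp}}(T)/\mathrm{SST}$, since the tree predicts the bucket mean.

```python
import numpy as np


def node_cost(y):
    """N_m * Q_m for one bucket: sum of squared deviations from the bucket mean."""
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - y.mean()) ** 2))


def total_cost(buckets):
    """C_imp(T): sum of N_m * Q_m over all terminal buckets."""
    return sum(node_cost(y) for y in buckets)


def local_decrease(y_left, y_right):
    """Candidate formula for Delta that only touches the bucket being split."""
    y_parent = np.concatenate([y_left, y_right])
    return node_cost(y_parent) - node_cost(y_left) - node_cost(y_right)


# Hypothetical data: M = 3 buckets; the last one is split into two children.
rng = np.random.default_rng(0)
old_buckets = [rng.normal(loc=mu, size=n) for mu, n in [(0.0, 20), (2.0, 15), (5.0, 30)]]
y_left, y_right = old_buckets[-1][:12], old_buckets[-1][12:]
new_buckets = old_buckets[:-1] + [y_left, y_right]

# SST is the cost of a single bucket holding every observation.
sst = node_cost(np.concatenate(old_buckets))

# Part (a): the global difference should match the local formula.
delta_global = total_cost(old_buckets) - total_cost(new_buckets)
delta_local = local_decrease(y_left, y_right)

# Part (c): training-set R^2 values (assumed identity R^2 = 1 - C_imp / SST).
r2_old = 1.0 - total_cost(old_buckets) / sst
r2_new = 1.0 - total_cost(new_buckets) / sst
r2_gain = r2_new - r2_old


def modified_cost(buckets, alpha):
    """C_alpha(T) = C_imp(T) + alpha * SST * |T|."""
    return total_cost(buckets) + alpha * sst * len(buckets)


print("Delta (global):", delta_global)
print("Delta (local): ", delta_local)
print("R^2 gain:      ", r2_gain, " Delta / SST:", delta_local / sst)
```

If the two values of Delta agree and the $R^2$ gain equals Delta divided by SST, that suggests the shape of the answer: the unsplit buckets cancel in part (a), and part (c) reduces to comparing Delta / SST with $\alpha$.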
