Question:

Given the weather conditions, we want to predict if a person is going to go for a run or not. The data that we have collected are the following:

        Features
Sample  Forecast  Temperature  Outcome: Go for Run?
1       Sunny     Cool         Yes
2       Sunny     Hot          No
3       Overcast  Cool         Yes
4       Rain      Cool         Yes
5       Rain      Hot          No
6       Overcast  Hot          Yes

a) (15 points) Draw a depth = 2 tree. This means splitting the tree once on one variable and once on the other variable, using the Gini Index, which is defined as follows:

$$\text{Gini} = \sum_{k=1}^{K} p_{mk} (1 - p_{mk}),$$

where $K$ is the number of classes, and $p_{mk}$ represents the proportion of the training observations in the $m$th region that are from the $k$th class. Show the calculations at each step. You can round/approximate.

b) (5 points) Pre-pruning: please draw the tree with maximum depth 1.

c) (5 points) Now, using the full tree and the tree of depth 1, evaluate the following test set and provide the accuracy of each tree on it. Please explain your findings. If a leaf node of the tree is not 100% pure, describe how you select the prediction in this scenario. (Hint: Even if your trees from above are incorrect, please provide what you would expect to find with regards to this test set and the two decision trees.)

        Features
Sample  Forecast  Temperature  Outcome: Go for Run?
7       Sunny     Cool         Yes
8       Sunny     Hot          Yes
9       Overcast  Cool         No
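As a quick check on the Gini Index definition in part a), here is a minimal Python sketch that evaluates the formula on a single region. The function name `gini` and the list `root` are illustrative names, not part of the question; `root` holds the outcomes of the six training samples in order.

```python
from collections import Counter

def gini(labels):
    """Gini index of one region: sum over classes k of p_mk * (1 - p_mk)."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty region is treated as pure
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

# Outcomes of training samples 1..6 (4 Yes, 2 No).
root = ["Yes", "No", "Yes", "Yes", "No", "Yes"]

print(round(gini(root), 3))         # 0.444, impurity before any split
print(gini(["Yes", "Yes", "Yes"]))  # 0.0, a pure region
print(gini(["Yes", "No"]))          # 0.5, an evenly mixed two-class region
```

A pure region scores 0 and an even two-class mix scores 0.5, so a lower value means a cleaner region.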
Step by Step Solution
a) Let's start by calculating the Gini index for each feature and split. 1. Temperature: split by Cool: Gi...
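The per-feature calculation that this step begins with can be sketched in Python as follows. This is an illustration under stated assumptions, not the original answer: it assumes each split creates one branch per feature value (so Forecast is split three ways) and scores a split by the size-weighted average of its regions' Gini indices. The names `TRAIN`, `FEATURES`, `gini`, and `split_gini` are hypothetical.

```python
from collections import Counter, defaultdict

# Training set from the question: (forecast, temperature, outcome).
TRAIN = [
    ("Sunny", "Cool", "Yes"),
    ("Sunny", "Hot", "No"),
    ("Overcast", "Cool", "Yes"),
    ("Rain", "Cool", "Yes"),
    ("Rain", "Hot", "No"),
    ("Overcast", "Hot", "Yes"),
]
FEATURES = {"Forecast": 0, "Temperature": 1}  # column index of each feature

def gini(labels):
    """Gini index of one region: sum over classes k of p_mk * (1 - p_mk)."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def split_gini(rows, feature):
    """Size-weighted Gini over regions, one region per feature value (assumed)."""
    regions = defaultdict(list)
    for row in rows:
        regions[row[FEATURES[feature]]].append(row[-1])
    return sum(len(r) / len(rows) * gini(r) for r in regions.values())

# Compare the two candidate first splits; the lower score is preferred.
scores = {name: split_gini(TRAIN, name) for name in FEATURES}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: weighted Gini = {score:.3f}")
print("first split:", best)

# Second level: only the Hot branch is still mixed, so split it on Forecast.
hot_rows = [row for row in TRAIN if row[1] == "Hot"]
print("Hot branch split on Forecast:", split_gini(hot_rows, "Forecast"))
```

Under these assumptions the Temperature split scores about 0.222 against about 0.333 for Forecast, so Temperature would be chosen first; the Cool branch is already pure, and splitting the Hot branch on Forecast leaves every leaf pure.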
