
Question

The following figures illustrate a step in the application of the Classification Tree method to the Riding Mowers case study (24 total observations).
The first three splits are shown on the scatter plot, in which Region 1 has 1 observation, Region 2 has 7 observations, Region 3 has 9 observations, and Region 4 has 7 observations.
The tree diagram shows the classification algorithm that corresponds to the displayed three splits.
[Figure 1: scatter plot showing the first three splits. Figure 2: the corresponding tree diagram.]
1. Compute the Gini impurity measure for Region 2 in the scatter plot.
2. Compute the Gini impurity measure for Region 3 in the scatter plot.
3. Compute the Gini impurity measure for Region 4 in the scatter plot.
4. Compute the Combined Gini Impurity for the entire dataset at this step of the algorithm.
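As a hedged illustration of the calculation that questions 1 to 4 ask for, here is a minimal Python sketch. It uses the standard Gini impurity, 1 minus the sum of squared class proportions within a region, and takes the combined impurity as the average of the region impurities weighted by region size. The function names and the per-class counts are hypothetical: the region sizes (1, 7, 9, 7) come from the question, but the Owner/Non split inside each region must be read off Figure 1, which is not available here.

```python
from typing import Sequence


def gini(counts: Sequence[int]) -> float:
    """Gini impurity of one region: 1 - sum over classes of p_k squared."""
    n = sum(counts)
    if n == 0:
        raise ValueError("region has no observations")
    return 1.0 - sum((c / n) ** 2 for c in counts)


def combined_gini(regions: Sequence[Sequence[int]]) -> float:
    """Size-weighted average of the Gini impurity of each region."""
    total = sum(sum(r) for r in regions)
    return sum(sum(r) / total * gini(r) for r in regions)


# HYPOTHETICAL (owners, non-owners) counts per region, for illustration only.
# Only the region sizes 1, 7, 9, 7 are taken from the question; the class
# split inside each region is made up and must be replaced with the counts
# read from the scatter plot.
hypothetical_regions = [(1, 0), (6, 1), (4, 5), (1, 6)]

for i, region in enumerate(hypothetical_regions, start=1):
    print(f"Region {i}: Gini = {gini(region):.4f}")
print(f"Combined Gini = {combined_gini(hypothetical_regions):.4f}")
```

A region containing a single class has impurity 0, and a two-class region split evenly has impurity 0.5, which is the maximum for two classes.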
Consider the following Test dataset consisting of 4 consumers and their actual ownership status:
Obs #   Income   Lot_Size   Ownership
1       110.1    20.8       Owner
2       108.0    17.2       Owner
3       82.8     20.4       Non
4       69.0     17.6       Owner
The algorithm shown in the tree diagram in Figure 2 will correctly classify observation #1 as an Owner (class 1) and incorrectly classify observation #2 as a Non (class 0).
With this as a hint to help you verify your understanding of the method, use the tree diagram to classify the remaining observations. Report your answers via the next two questions.
5. How will the algorithm shown in the tree diagram classify observation #3 from this Test dataset? (Owner vs. Non)
6. How will the algorithm shown in the tree diagram classify observation #4 from this Test dataset? (Owner vs. Non)
With 1 and 0 representing "Owner" and "Non" respectively, below is the Confusion Matrix for the classification algorithm encoded in the tree diagram in Figure 2 when applied to the entire 4-observation Test dataset presented above:
7. What is the numerical value of B in the Confusion Matrix?
8. What is the numerical value of C in the Confusion Matrix?
9. Is the tree in Figure 2 fully-grown?
10. Would pruning the tree in Figure 2 increase the risk of overfitting?
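For the confusion matrix in questions 7 and 8, the tally can be sketched in Python as below. This is a minimal sketch under stated assumptions: the `confusion_matrix` helper is hypothetical, and the layout (rows = actual class, columns = predicted class, both ordered 1 then 0) is assumed and may not match the positions of the lettered cells in the original matrix. Only observations #1 and #2 are filled in, because their predictions are stated in the hint; the predictions for #3 and #4 depend on Figure 2 and are left out.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def confusion_matrix(
    actual: Sequence[int],
    predicted: Sequence[int],
    labels: Tuple[int, ...] = (1, 0),
) -> List[List[int]]:
    """Count (actual, predicted) pairs.

    Assumed layout: rows are actual classes, columns are predicted classes,
    both in the order given by `labels` (1 = Owner, 0 = Non).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Observations #1 and #2 only, taken from the hint in the question:
# both are actual Owners (1); the tree predicts Owner (1) and Non (0).
actual = [1, 1]
predicted = [1, 0]

matrix = confusion_matrix(actual, predicted)
for label, row in zip((1, 0), matrix):
    print(f"actual {label}: {row}")
```

Appending the tree's predictions for observations #3 and #4, together with their actual classes from the Test dataset, gives the full 4-observation matrix.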
The following questions consider part 1 of this module's exercises (Classification Tree applied to the Undecideds dataset, with pruning). Answer all questions for the Best Pruned tree resulting from a 50:30:20 partition using seed=10101.
11. Which of the predictor variables were determined by the tree to not be sufficiently relevant to the classification decision?
12. Suppose Voter A has these attributes: Age=39, HomeOwner=0, Female=0, Married=0, HouseholdSize=2, Income=99, Education=12, Church=1. After the tree makes its prediction for Voter A, you discover that the values for Church and Married were misrecorded. What can you say about how the tree's classification of Voter A will be affected by this discovery?
