Question:
Refer to the scenario in Problem 42 regarding the identification of undecided voters. Apply a classification tree to classify observations as undecided or not by using Vote as the target (or response) variable. Set aside 20% of the data as a test set and use 80% of the data for training and validation.
a. Train a full classification tree and report its AUC from a validation experiment.
b. Train a pruned classification tree and report its AUC from a validation experiment.
c. Compare the AUC from the full classification tree in part (a) to the AUC from the pruned classification tree in part (b). Explain the difference.
d. Consider a 50-year-old man who attends church, has 15 years of education, owns a home, is married, lives in a household of four people, and has an annual income of $150,000. For a cutoff value of 0.5, does the pruned tree classify this observation as Undecided?
e. For the pruned tree, compute and interpret the precision on the test set.
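Parts (a), (b), and (e) all depend on two metrics, AUC and precision. The following is a minimal sketch of how they could be computed, assuming the tree outputs a predicted probability of Undecided for each observation and that the labels are coded 1 for Undecided and 0 otherwise. The coding, the function names, the "at or above the cutoff" rule, and the toy numbers are illustrative assumptions, not taken from the textbook data.

```python
def auc(y_true, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen positive (1)
    gets a higher score than a randomly chosen negative (0), with
    ties counted as one half.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision(y_true, scores, cutoff=0.5):
    """Share of observations predicted positive that are truly positive.

    Assumption: an observation is predicted positive when its score is
    at or above the cutoff.
    """
    predicted_pos = [y for y, s in zip(y_true, scores) if s >= cutoff]
    if not predicted_pos:
        return float("nan")  # no positive predictions, precision undefined
    return sum(predicted_pos) / len(predicted_pos)


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 0, 1, 0, 0, 1]
probs = [0.9, 0.6, 0.7, 0.2, 0.4, 0.3]

print(round(auc(labels, probs), 3))        # 7 of 9 positive/negative pairs ranked correctly
print(round(precision(labels, probs), 3))  # 2 of 3 predicted positives are true positives
```

Read this way, precision on the test set is the fraction of voters the pruned tree flags as Undecided who actually are undecided.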
Problem 42
Campaign organizers for both the Republican and Democratic parties are interested in identifying individual undecided voters who would consider voting for their party in an upcoming election. A non-partisan group has collected data on a sample of voters with tracked variables. The variables in this data are listed in the following table.
Apply logistic regression with lasso regularization to classify observations as being undecided or not by using Vote as the target (or response) variable. Set aside 20% of the data as a test set and use 80% of the data for training and validation.
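The 80/20 partition described above could be sketched as follows. This is only an illustration under stated assumptions: the row count of 1,000, the seed, and the helper name `train_test_split_indices` are hypothetical, and the textbook's software may partition the data differently (for example, with stratification on Vote).

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and hold out a fraction of them as a test set.

    Returns (train_indices, test_indices). The remaining rows are used
    for training and validation.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


# Hypothetical data set of 1,000 voters.
train_idx, test_idx = train_test_split_indices(1000)
print(len(train_idx), len(test_idx))
```

Holding the test rows out before any model fitting keeps the test-set metrics from being influenced by the training and validation steps.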
Step by Step Answer:
Business Analytics
ISBN: 9780357902219
5th Edition
Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann