Refer to the scenario in Problem 42 regarding the identification of undecided voters. Apply k-nearest neighbors to
Question:
Refer to the scenario in Problem 42 regarding the identification of undecided voters. Apply k-nearest neighbors to classify observations as undecided or not by using Vote as the target (or response) variable. Set aside 20% of the data as a test set and use 80% of the data for training and validation.
a. Based on all the input variables, determine the value of k that maximizes the AUC in a validation procedure.
b. For the best-performing k-nearest neighbors model in the validation procedure (with respect to AUC), compute and interpret the precision on the test set.
Problem 42
Campaign organizers for both the Republican and Democratic parties are interested in identifying individual undecided voters who would consider voting for their party in an upcoming election. A non-partisan group has collected data on a sample of voters with tracked variables. The variables in this data are listed in the following table.
Apply logistic regression with lasso regularization to classify observations as being undecided or not by using Vote as the target (or response) variable. Set aside 20% of the data as a test set and use 80% of the data for training and validation.
Step by Step Answer:
Business Analytics
ISBN: 9780357902219
5th Edition
Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann