Question
1. This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.
Fit a logistic regression model using a training period from 1990 to 2008, with Lag1, Lag2, and Lag3 as predictors and Direction as the response. Report the overall fraction of correct predictions for the held-out data (that is, the data from 2009 and 2010).
2. Following problem 1, do the same by employing the KNN method learned in the course, using a) K = 3 and then b) K = 5.
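The evaluation step shared by problems 1 and 2 (train on the earlier years, then score predictions on the held-out years) can be sketched as follows. This is only an illustration in plain Python, not the ISLR workflow: the helper names `split_by_year` and `fraction_correct` are hypothetical, and the rows and predictions are made-up placeholders, not values from the Weekly data set.

```python
def split_by_year(rows, last_train_year):
    """Split rows into (train, test) using the 'Year' field of each row."""
    train = [r for r in rows if r["Year"] <= last_train_year]
    test = [r for r in rows if r["Year"] > last_train_year]
    return train, test


def fraction_correct(predicted, actual):
    """Return the share of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up placeholder rows (NOT real Weekly observations).
rows = [
    {"Year": 2007, "Direction": "Up"},
    {"Year": 2008, "Direction": "Down"},
    {"Year": 2009, "Direction": "Up"},
    {"Year": 2009, "Direction": "Down"},
    {"Year": 2010, "Direction": "Up"},
    {"Year": 2010, "Direction": "Up"},
]

train, test = split_by_year(rows, last_train_year=2008)
actual = [r["Direction"] for r in test]

# Placeholder predictions standing in for the output of a fitted model.
predicted = ["Up", "Up", "Up", "Down"]

accuracy = fraction_correct(predicted, actual)
print(f"held-out accuracy: {accuracy:.2f}")
```

The same `fraction_correct` call applies whether the predictions come from logistic regression or from KNN, which makes the results of problems 1 and 2 directly comparable.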
3. This problem tests your understanding of KNN. You may use Excel or other calculation tools on your computer to help with the computation, but there is no need to submit any resulting files.
The table below provides a training data set containing six observations, three predictors (X1, X2, and X3), and one binary response variable (Y). Do not scale the data in your analysis.
a. Compute the distance between each observation and a test point X1 = -1, X2 = 0, X3 = 1. Please show the formula of your computation for Observation 1, and simply report results for other observations.
b. Suppose that you would like to use this data set to make a prediction for the test point X1 = -1, X2 = 0, X3 = 1 using K-Nearest Neighbors. What is the prediction with K = 1? What is the prediction with K = 3? Show the calculation that leads to your result.
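The calculation in parts a and b can be checked with a short script. The sketch below assumes Euclidean distance on the unscaled predictors and a simple majority vote among the K nearest observations. The function names `euclidean` and `knn_predict`, the four training points, and the labels "A" and "B" are all hypothetical stand-ins; they are not the observations from the table in this problem.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Pair each distance with its label, then sort so the nearest come first.
    ranked = sorted(
        (euclidean(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data (NOT the table from the problem).
train_points = [(-1, 1, 1), (0, 0, 0), (-2, 0, 2), (3, 0, 0)]
train_labels = ["A", "B", "B", "A"]
query = (-1, 0, 1)  # the test point X1 = -1, X2 = 0, X3 = 1

for point, label in zip(train_points, train_labels):
    print(point, label, round(euclidean(point, query), 3))

print("K = 1:", knn_predict(train_points, train_labels, query, k=1))
print("K = 3:", knn_predict(train_points, train_labels, query, k=3))
```

With these made-up points the single nearest neighbor is labeled "A", while two of the three nearest are labeled "B", so the prediction changes between K = 1 and K = 3. Replacing the placeholder points with the six observations from the table gives a way to verify a hand calculation.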