Show the code in RStudio, and please specify which code is used for which question.
The following questions (3) to (8) should be answered using the Weekly data set, which is part of the
ISLR package. This dataset contains 1089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.
(3) Use require(ISLR) or library(ISLR) to load the ISLR package.
a) Use the summary() function to produce some numerical summaries of the Weekly data.
b) Use the pairs() function to produce a scatterplot matrix of the variables in the data.
c) Is there any relationship between Year and Volume? What is the pairwise correlation value
between Year and Volume?
d) Is the relationship positive or negative?
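Code for question (3): a minimal sketch, assuming the ISLR package is already installed. The scatterplot of Volume against Year at the end is an extra illustration, not something the question asks for.

```r
# Question (3): load the data, summarise it, and examine Year vs. Volume
library(ISLR)   # provides the Weekly data frame

# (a) numerical summaries of every column
summary(Weekly)

# (b) scatterplot matrix of all variables
pairs(Weekly)

# (c) pairwise correlation between Year and Volume
year_volume_cor <- cor(Weekly$Year, Weekly$Volume)
year_volume_cor

# Full correlation matrix; Direction is a factor, so it is dropped first
cor(Weekly[, names(Weekly) != "Direction"])

# (d) the sign of year_volume_cor shows whether the relationship is
#     positive or negative; a direct plot makes the trend visible
plot(Weekly$Year, Weekly$Volume, xlab = "Year", ylab = "Volume")
```

The sign and size of year_volume_cor answer parts c) and d) directly.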
(4) Use the full dataset to perform a logistic regression with Direction as the dependent variable and
Lag1, Lag2, Lag3, Lag4 and Volume as independent variables (i.e. predictors). Use the summary()
function to print the results. Do any of the predictors appear to be statistically significant? If so, which
ones?
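Code for question (4): a sketch of the full-data logistic regression. The object name glm.fit is only an illustrative choice.

```r
# Question (4): logistic regression on the full Weekly data set
library(ISLR)

glm.fit <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
               data = Weekly, family = binomial)

# Coefficient table with standard errors, z values, and p-values
summary(glm.fit)

# The p-value column on its own; predictors with small p-values
# (for example below 0.05) are the statistically significant ones
p_values <- coef(summary(glm.fit))[, "Pr(>|z|)"]
p_values
```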
(5) Based on the results of (4), compute the confusion matrix and the overall fraction of correct predictions (Hint: use 0.5 as the predicted probability cut-off for the classifier). What is the precision rate? What is the recall rate?
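Code for question (5): a sketch that refits the model from (4) so the block runs on its own. It assumes "Up" is treated as the positive class for precision and recall; the question does not say which class is positive, so swap the labels if your course uses "Down".

```r
# Question (5): confusion matrix, accuracy, precision, recall (full data)
library(ISLR)

glm.fit <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
               data = Weekly, family = binomial)

# Fitted probabilities of Direction == "Up" (R codes Down = 0, Up = 1)
glm.probs <- predict(glm.fit, type = "response")

# Classify with the 0.5 cut-off; fix the levels so the table is always 2 x 2
glm.pred <- factor(ifelse(glm.probs > 0.5, "Up", "Down"),
                   levels = levels(Weekly$Direction))

# Rows are predictions, columns are the true directions
conf <- table(Predicted = glm.pred, Actual = Weekly$Direction)
conf

# Overall fraction of correct predictions
accuracy <- mean(glm.pred == Weekly$Direction)

# Assumption: "Up" is the positive class
precision <- conf["Up", "Up"] / sum(conf["Up", ])   # TP / (TP + FP)
recall    <- conf["Up", "Up"] / sum(conf[, "Up"])   # TP / (TP + FN)

c(accuracy = accuracy, precision = precision, recall = recall)
```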
(6) Now fit the logistic regression model using a training data period from 1990 to 2009 with Lag2 as
the only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data (i.e. test data, the data from 2010). In addition, please calculate the precision rate and recall rate. (Hint: use 0.5 as the predicted
probability cut-off for the classifier). Take a screenshot of your output and then answer the questions.
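Code for question (6): a sketch of the train/test split and the Lag2-only model. The names train, test.data, and glm.fit2 are illustrative, and "Up" is again assumed to be the positive class.

```r
# Question (6): train on 1990-2009, test on 2010, Lag2 as the only predictor
library(ISLR)

train     <- Weekly$Year <= 2009      # logical index for the training period
test.data <- Weekly[!train, ]         # held-out data from 2010

glm.fit2 <- glm(Direction ~ Lag2, data = Weekly,
                family = binomial, subset = train)

# Predicted probabilities of "Up" for the held-out weeks
test.probs <- predict(glm.fit2, newdata = test.data, type = "response")
test.pred  <- factor(ifelse(test.probs > 0.5, "Up", "Down"),
                     levels = levels(Weekly$Direction))

conf <- table(Predicted = test.pred, Actual = test.data$Direction)
conf

accuracy  <- mean(test.pred == test.data$Direction)
precision <- conf["Up", "Up"] / sum(conf["Up", ])   # assumes "Up" is positive
recall    <- conf["Up", "Up"] / sum(conf[, "Up"])

c(accuracy = accuracy, precision = precision, recall = recall)
```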
(7) Repeat (6) using KNN with K=1. Compute the confusion matrix and the overall fraction of correct
predictions for the held-out data. In addition, please calculate the precision rate and recall rate. (Hint: If you encounter an error such as "dims of 'test' and 'train' differ", try knn(data.frame(train.X), ...).) (Use set.seed(1).)
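Code for question (7): a sketch using knn() from the class package, with the same split as (6). The predictors are wrapped in data frames, as the hint suggests, so that train and test have matching dimensions. "Up" is assumed to be the positive class.

```r
# Question (7): KNN with K = 1, Lag2 as the only predictor
library(ISLR)
library(class)   # provides knn()

train <- Weekly$Year <= 2009

# One-column data frames avoid the "dims of 'test' and 'train' differ" error
train.X <- data.frame(Lag2 = Weekly$Lag2[train])
test.X  <- data.frame(Lag2 = Weekly$Lag2[!train])

train.Direction <- Weekly$Direction[train]
test.Direction  <- Weekly$Direction[!train]

set.seed(1)      # knn() breaks ties at random, so fix the seed
knn.pred <- knn(train.X, test.X, train.Direction, k = 1)

conf <- table(Predicted = knn.pred, Actual = test.Direction)
conf

accuracy  <- mean(knn.pred == test.Direction)
precision <- conf["Up", "Up"] / sum(conf["Up", ])   # assumes "Up" is positive
recall    <- conf["Up", "Up"] / sum(conf[, "Up"])

c(accuracy = accuracy, precision = precision, recall = recall)
```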
(8) Repeat (6) using KNN with K=10. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data. In addition, please calculate the precision rate and recall rate.
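Code for question (8): a sketch that wraps the KNN evaluation in a small helper so only K changes. The helper name knn_metrics is hypothetical, reusing set.seed(1) is an assumption carried over from (7), and "Up" is assumed to be the positive class.

```r
# Question (8): KNN with K = 10, Lag2 as the only predictor
library(ISLR)
library(class)

# Hypothetical helper: fits KNN for a given k on 1990-2009 and
# evaluates it on the 2010 data
knn_metrics <- function(k) {
  train   <- Weekly$Year <= 2009
  train.X <- data.frame(Lag2 = Weekly$Lag2[train])
  test.X  <- data.frame(Lag2 = Weekly$Lag2[!train])
  actual  <- Weekly$Direction[!train]

  set.seed(1)    # assumption: same seed as in question (7)
  pred <- knn(train.X, test.X, Weekly$Direction[train], k = k)

  conf <- table(Predicted = pred, Actual = actual)
  list(confusion = conf,
       accuracy  = mean(pred == actual),
       precision = conf["Up", "Up"] / sum(conf["Up", ]),
       recall    = conf["Up", "Up"] / sum(conf[, "Up"]))
}

result10 <- knn_metrics(10)
result10
```

Calling knn_metrics(1) runs the K=1 case through the same helper, which makes the two settings easy to compare side by side.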