Question
1. Use the OJ (Orange Juice) data set for this question, which is part of the ISLR package. Make sure to use R or RStudio for this question. You can find the data in the ISLR package online. The data is also available at the link below. Besides the answers and steps, please also provide the R script file (a screenshot of the R script is fine). Thanks!
- (a) Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
- (b) Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree, and describe the results obtained. What is the training error rate? How many terminal nodes does the tree have?
- (c) Create a plot of the tree, and interpret the results.
- (d) Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the test error rate?
- (e) Create a pruned tree with four terminal nodes.
- (f) Compare the test error rates between the pruned and unpruned trees. Which is higher?
https://www.dropbox.com/scl/fi/bgil1agj5j05tkm0ywrkq/Orange-Juice-Data.docx?dl=0&rlkey=aplgmpuqztgaw8qurkvma3vke
Step by Step Solution
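One possible way to work through parts (a) to (f) is sketched below. It is only a sketch under a few assumptions that the question does not state: it uses the `tree` package for fitting and pruning (the question names only ISLR), it fixes an arbitrary seed of 1 so the split is reproducible, and object names such as `oj_train`, `oj_tree`, and `err_full` are made up for illustration. The error rates depend on the random split, so no numbers are quoted here; run the script to get them.

```r
# Assumes both packages are installed: install.packages(c("ISLR", "tree"))
library(ISLR)   # provides the OJ data set
library(tree)   # assumed choice of tree-fitting package

set.seed(1)     # arbitrary seed, used only to make the split reproducible

# (a) 800 randomly sampled rows for training; the remaining rows for testing
train_idx <- sample(nrow(OJ), 800)
oj_train  <- OJ[train_idx, ]
oj_test   <- OJ[-train_idx, ]

# (b) Classification tree with Purchase as the response, all other variables as predictors
oj_tree <- tree(Purchase ~ ., data = oj_train)
summary(oj_tree)  # lists variables used, number of terminal nodes, and
                  # the misclassification (training) error rate

# (c) Plot the tree and label its splits
plot(oj_tree)
text(oj_tree, pretty = 0)

# (d) Predict on the test set, build the confusion matrix, compute test error
pred_full <- predict(oj_tree, newdata = oj_test, type = "class")
conf_full <- table(predicted = pred_full, actual = oj_test$Purchase)
print(conf_full)
err_full <- mean(pred_full != oj_test$Purchase)

# (e) Prune to four terminal nodes using misclassification-based pruning
oj_pruned <- prune.misclass(oj_tree, best = 4)

# (f) Test error of the pruned tree, side by side with the unpruned one
pred_pruned <- predict(oj_pruned, newdata = oj_test, type = "class")
err_pruned  <- mean(pred_pruned != oj_test$Purchase)
print(c(unpruned = err_full, pruned = err_pruned))
```

If the pruning sequence has no subtree with exactly four terminal nodes, `prune.misclass()` warns and returns the next larger one, so it is worth checking `summary(oj_pruned)` before comparing the two error rates in part (f).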