Question
1. In the car library (you will need to install it), there is a data set named Salaries that contains information on professors and their salaries.

(5%) Create a regression tree for a professor's salary given the remainder of the variables in the data set. Provide the tree, including labels. Using the command text(treename, pretty=0) will provide more understandable split labelling for the questions that follow.

Based on this tree, as an early career professor (Assistant or Associate), would you rather be in an 'applied' or 'theoretical' department? Why?

Using set.seed(6421), perform 20-fold cross-validation using cv.tree. Plot the resulting object. How many terminal nodes does cross-validation suggest?

Give the predicted salary for me, assuming I was at this university. That is, what is the predicted salary for an Assistant Professor, in an applied department, who got their PhD in 2012 (5 years ago), has 4 years of service (usually counted as total amount of time as a university professor), and is male? You are welcome to either use the tree and your brain, or enter the data into R and use the predict() function.

Use the following commands to set up a training and testing set:

    set.seed(763)
    trainindex <- sample(1:nrow(Salaries), 200)
    proftrain <- Salaries[trainindex, ]
    proftest <- Salaries[-trainindex, ]

Now fit a model to the training set and give the predicted salary for me again. Also provide the estimated MSE of the model; that is, calculate the MSE of the test set.

Use set.seed(474) and then the randomForest() function to perform bagging on the full Salaries data set (still using salary as the response). Give the MSE of this model. How does it compare to our estimate in the previous question? According to this model, what is the most important variable for predicting salary?

Use set.seed(474) and then the randomForest() function to fit a random forest model to the data. Give the MSE of this model. How does it compare to our estimate in the previous two questions? According to this model, what is the most important variable for predicting salary?
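
The commands the question refers to could be strung together roughly as follows. This is a minimal sketch and not a worked answer. It assumes the tree, randomForest, and carData packages are installed (carData is the package that supplies Salaries when car is loaded). The object names (sal_tree, sal_cv, me, train_tree, sal_bag, sal_rf) are made up for illustration. Per the data set's documentation, discipline "B" is the 'applied' level. Using the out-of-bag error as "the MSE" of the bagged and random forest models is also an assumption about what the question intends.

```r
# Sketch only: object names are hypothetical, not taken from the question.
library(carData)       # supplies Salaries (attached automatically by library(car))
library(tree)
library(randomForest)

# Regression tree for salary on all remaining variables
sal_tree <- tree(salary ~ ., data = Salaries)
plot(sal_tree)
text(sal_tree, pretty = 0)   # pretty = 0 prints full factor level names

# 20-fold cross-validation; the plot shows deviance against tree size
set.seed(6421)
sal_cv <- cv.tree(sal_tree, K = 20)
plot(sal_cv)

# The professor described in the question, with factor levels matching Salaries
me <- data.frame(
  rank          = factor("AsstProf", levels = levels(Salaries$rank)),
  discipline    = factor("B",        levels = levels(Salaries$discipline)),
  yrs.since.phd = 5,
  yrs.service   = 4,
  sex           = factor("Male",     levels = levels(Salaries$sex))
)
predict(sal_tree, newdata = me)

# Training/testing split as given in the question
set.seed(763)
trainindex <- sample(1:nrow(Salaries), 200)
proftrain  <- Salaries[trainindex, ]
proftest   <- Salaries[-trainindex, ]

train_tree <- tree(salary ~ ., data = proftrain)
predict(train_tree, newdata = me)

# Test-set MSE: mean squared difference between observed and predicted salary
test_mse <- mean((proftest$salary - predict(train_tree, newdata = proftest))^2)

# Bagging is a random forest that considers every predictor at each split
p <- ncol(Salaries) - 1      # number of predictors (all columns except salary)
set.seed(474)
sal_bag <- randomForest(salary ~ ., data = Salaries, mtry = p, importance = TRUE)
tail(sal_bag$mse, 1)         # out-of-bag MSE after the final tree
importance(sal_bag)

# Random forest with the default mtry for regression
set.seed(474)
sal_rf <- randomForest(salary ~ ., data = Salaries, importance = TRUE)
tail(sal_rf$mse, 1)
importance(sal_rf)
```

The out-of-bag MSE reported by randomForest is computed on observations left out of each bootstrap sample, so it can be compared with the test-set MSE from the single tree without a separate hold-out set.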