
Question

1. In the car library (you will need to install it), there is a data set named Salaries that contains information on professors and their salaries.

(a) Create a regression tree for a professor's salary given the remainder of the variables in the data set. Provide the tree, including labels; using the command text(treename, pretty=0) will provide more understandable split labelling for the questions that follow. Based on this tree, as an early career professor (Assistant or Associate), would you rather be in an 'applied' or 'theoretical' department? Why?

(b) Using set.seed(6421), perform 20-fold cross-validation using cv.tree. Plot the resulting object. How many terminal nodes does cross-validation suggest?

(c) Give the predicted salary for me, assuming I was at this university. That is, what is the predicted salary for an Assistant Professor, in an applied department, who got their PhD in 2012 (5 years ago), has 4 years of service (usually counted as total amount of time as a university professor), and is male? You are welcome to either use the tree and your brain, or enter the data into R and use the predict() function.

(d) Use the following commands to set up a training and testing set:

    set.seed(763)
    trainindex <- sample(1:nrow(Salaries), 200)
    proftrain <- Salaries[trainindex, ]
    proftest <- Salaries[-trainindex, ]

Now fit a model to the training set and give the predicted salary for me again. Also provide the estimated MSE of the model, that is, calculate the MSE on the test set.

(e) Use set.seed(474) and then the randomForest() function to perform bagging on the full Salaries data set (still using salary as the response). Give the MSE of this model. How does it compare to our estimate in the previous question? According to this model, what is the most important variable for predicting salary?

(f) (5%) Use set.seed(474) and then the randomForest() function to fit a random forest model to the data. Give the MSE of this model. How does it compare to our estimate in the previous two questions? According to this model, what is the most important variable for predicting salary?
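
The workflow the question describes can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the original solution: it assumes the tree, randomForest and carData packages are installed (in recent versions of car, the Salaries data set is supplied by the companion carData package), that discipline level "B" denotes an applied department, that bagging means setting mtry to the number of predictors, and that "the MSE of this model" for the bagged and random forest fits refers to the out-of-bag MSE. The object names (sal_tree, me, bag_fit and so on) are illustrative only.

```r
# Sketch only: assumes the tree, randomForest and carData packages are installed.
library(tree)
library(randomForest)
data("Salaries", package = "carData")

# (a) Regression tree for salary using all remaining variables.
sal_tree <- tree(salary ~ ., data = Salaries)
plot(sal_tree)
text(sal_tree, pretty = 0)  # full factor-level names on the split labels

# (b) 20-fold cross-validation; the size with the lowest deviance is the suggestion.
set.seed(6421)
sal_cv <- cv.tree(sal_tree, K = 20)
plot(sal_cv)

# (c) The professor described in the question, built from an existing row
# so that every factor keeps the same levels as the data set.
me <- Salaries[1, setdiff(names(Salaries), "salary")]
me$rank[1] <- "AsstProf"
me$discipline[1] <- "B"    # assumption: B is the applied discipline
me$yrs.since.phd[1] <- 5L
me$yrs.service[1] <- 4L
me$sex[1] <- "Male"
pred_full <- predict(sal_tree, newdata = me)

# (d) Training/testing split, refit, predict again, and test-set MSE.
set.seed(763)
trainindex <- sample(1:nrow(Salaries), 200)
proftrain <- Salaries[trainindex, ]
proftest <- Salaries[-trainindex, ]
train_tree <- tree(salary ~ ., data = proftrain)
pred_train <- predict(train_tree, newdata = me)
test_mse <- mean((proftest$salary - predict(train_tree, newdata = proftest))^2)

# (e) Bagging: a random forest that tries every predictor at each split.
p <- ncol(Salaries) - 1
set.seed(474)
bag_fit <- randomForest(salary ~ ., data = Salaries, mtry = p, importance = TRUE)
bag_mse <- tail(bag_fit$mse, 1)  # out-of-bag MSE using all trees
importance(bag_fit)

# (f) Random forest with the default mtry for regression.
set.seed(474)
rf_fit <- randomForest(salary ~ ., data = Salaries, importance = TRUE)
rf_mse <- tail(rf_fit$mse, 1)
importance(rf_fit)
```

Comparing test_mse, bag_mse and rf_mse side by side, and reading the importance() tables, addresses the comparison and variable-importance parts of (e) and (f). Calling varImpPlot(bag_fit) or varImpPlot(rf_fit) gives a graphical view of the same importance measures.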
