Question

The Hitters data set contains information on Major League Baseball from the 1986 and 1987 seasons. Among other information, it contains the 1987 annual salary of baseball players (in thousands of dollars) on opening day of the season. Our goal is to predict the salaries of the players. This data set is available from the ISLR package.

1. Remove the observations with unknown salary information. How many observations were removed in this process?
2. Log-transform the salaries. Can you justify this transformation?
3. Create a scatterplot with Hits on the y-axis and Years on the x-axis using all the observations. Color code the observations using the log Salary variable. What patterns do you notice on this chart, if any?
4. Run a linear regression model of log Salary on all the predictors using the entire data set. Use the regsubsets() function to perform best subset selection for the regression model. Identify the best model using BIC. Which predictor variables are included in this (best) model?
5. Now create a training data set consisting of So percent of the observations, and a test data set consisting of the remaining observations.
6. Generate a regression tree of log Salary using only the Years and Hits variables from the training data set. Which players are likely to receive the highest salaries according to this model? Write down the rule and elaborate on it.
7. Now create a regression tree using all the variables in the training data set. Perform boosting on the training set with 1,000 trees for a range of values of the shrinkage parameter λ. Produce a plot with different shrinkage values on the x-axis and the corresponding training set MSE on the y-axis.
8. Produce a plot with different shrinkage values on the x-axis and the corresponding test set MSE on the y-axis.
9. Which variables appear to be the most important predictors in the boosted model?
10. Now apply bagging to the training set. What is the test set MSE for this approach?
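The question itself is meant to be solved in R with the ISLR package. As a rough, language-neutral illustration of the logic behind items 1 and 2 only, the sketch below uses plain Python on a handful of made-up salary values (these are not the real Hitters data, and the names `players` and `skewness` are hypothetical). It drops the records with an unknown salary, counts how many were dropped, and compares the skewness of the raw and log-transformed salaries, which is one common way to argue for the transformation.

```python
import math
import statistics

# Hypothetical records, NOT the real Hitters data.
# Salary is in thousands of dollars; None marks an unknown salary.
players = [
    {"name": "A", "salary": 70.0},
    {"name": "B", "salary": None},
    {"name": "C", "salary": 100.0},
    {"name": "D", "salary": 150.0},
    {"name": "E", "salary": 250.0},
    {"name": "F", "salary": None},
    {"name": "G", "salary": 500.0},
    {"name": "H", "salary": 2000.0},
]


def skewness(values):
    """Population skewness: the mean of the cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Item 1: keep only players with a known salary and count the removals.
known = [p for p in players if p["salary"] is not None]
removed = len(players) - len(known)

# Item 2: log-transform the salaries (natural log).
salaries = [p["salary"] for p in known]
log_salaries = [math.log(s) for s in salaries]

print("observations removed:", removed)
print("skewness of raw salaries:", round(skewness(salaries), 3))
print("skewness of log salaries:", round(skewness(log_salaries), 3))
```

In this toy sample the raw salaries have a long right tail driven by one very large value, and the log scale pulls that tail in. The same kind of check on the real data would be one way to support the transformation before fitting a linear model or a tree.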
