
Question


Data Exploration and Multiple Linear Regression (MLR) using SAS

The "College" data set contains statistics for a large number of US colleges from the 1995 issue of US News and World Report. It has 777 observations on 18 variables. The colleges want to predict student enrollment for the next semester based on the available past data. For a description of the data, see "College.txt" in Canvas, which contains the college data and attribute information. The main task is to check whether the number of enrollments depends on the characteristics of the university. "Private" is the dummy variable. Do the dummy coding accordingly (see "Regression with Dummy Variables in SAS.docx" in Canvas).

1. Generate box plots of the attributes accept (number of applications accepted) and top10perc (% of new students from the top 10% of their high school class) and of the dependent variable enroll (number of new students enrolled). Identify the cutoff values for outliers and remove the outliers.

2. Try to fit an MLR to this data set, with ENROLL as the dependent variable. P_UNDERGRAD has a fairly long tail, so take a log transform (use LP_UNDERGRAD = log(P_UNDERGRAD)) and then use LP_UNDERGRAD as one of the predictors. Keep the first 544 records as a training set (call it ENROLLTRAIN), which you will use to fit the model; the remaining 233 will be used as a test set (ENROLLTEST).

3. Use only the following variables in your model:

   ENROLL = ACCEPT + TOP10PERC + F_UNDERGRAD + LP_UNDERGRAD + ROOM_BOARD + GRADE_RATE + PRIVATEDUMMY

   (a) Report the coefficients obtained by your model. Would you drop any of the variables used in your model (based on the t-scores or p-values)?

   (b) Report the MSE obtained on ENROLLTRAIN. How much does this increase when you score your model on ENROLLTEST?

   (c) (Bonus 2 points) Do you think your MLR model is reasonable for this problem? You may look at the distribution of residuals to provide an informed
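
The arithmetic behind tasks 1, 2 and 3(b) can be sketched independently of SAS. The following is a minimal illustration in Python (standard library only), not the SAS solution the assignment asks for. It assumes the usual 1.5 × IQR box-plot rule for outlier cutoffs, which the question does not state explicitly. The function names and the toy numbers are hypothetical and are not taken from College.txt.

```python
import math
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower, upper) box-plot fences: Q1 - k*IQR and Q3 + k*IQR.

    Assumption: the common 1.5*IQR whisker rule. Quartile definitions
    differ slightly between tools, so cutoffs may not match SAS exactly.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that lie inside the fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if lower <= v <= upper]


def log_transform(values):
    """Natural log of each value, as in LP_UNDERGRAD = log(P_UNDERGRAD).

    Assumes every value is strictly positive.
    """
    return [math.log(v) for v in values]


def split_by_position(rows, n_train=544):
    """First n_train rows form the training set, the rest the test set."""
    return rows[:n_train], rows[n_train:]


def mse(actual, predicted):
    """Mean squared error as a plain average of squared residuals.

    Note: the 'Mean Square Error' in a regression ANOVA table divides by
    the error degrees of freedom instead of n, so it is slightly larger
    on the training data.
    """
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    toy = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # made-up numbers, not College data
    print("fences:", iqr_fences(toy))
    print("kept:", remove_outliers(toy))
    train, test = split_by_position(list(range(777)))
    print("train/test sizes:", len(train), len(test))
```

Comparing `mse` on the training rows with `mse` on the test rows, using predictions from the same fitted model, gives the increase asked for in 3(b).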

