Question
According to this data, This data set consists of a previous semester's students' scores in STAT 2118. The variables from the data set are the
According to this data, This data set consists of a previous semester's students' scores in STAT 2118. The variables from the data set are the scores in eight homeworks, two midterms, and the final. The objective is to study the final exam score and which variables influence it.
The goal is to fit the best multiple regression model to the response (final exam score).
1.Fit the full model with all ten predictor variables.
a)Does there appear to be any multicollinearity?
b)Are there any high influence points (use a threshold of Cook's D greater than 0.5)?
If there are any outlier(s), remove the observation(s) for the rest of the exercises.
2. Use the stepwise regression methods (use = 0.1) to see which model is the best. Repeat using "best subset" regression. Do they agree?
3. Come up with one model that you think best describes the data and can be used for future predictions. Show the residual plot for this one. Does the model seem appropriate?
4. Use this model to predict what your final exam score will be (it is out of 50). Construct a 95% prediction interval for this estimate.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started