Question
All data sets can be found in the "faraway" package, installed with `install.packages("faraway")`.

1. Using the sat data set (from package faraway), fit a model with the total SAT score as the response, and expend, salary, ratio, and takers as predictors. Answer the following parts. In parts (b) through (g), you should draw a specific conclusion and clearly refer to the diagnostic tool(s) (plots or statistics) you used to draw your conclusion.
   (a) [2 pts] Produce the four default diagnostic plots given by R.
   (b) [2 pts] Using an appropriate diagnostic plot, check whether the assumed mean function is appropriate.
   (c) [2 pts] Check the assumption of constant variance.
   (d) [2 pts] What is the largest (most positive) least-squares residual value? What is the smallest (most negative) least-squares residual value? (Give accurate values; don't just use the plots!)
   (e) [1 pt] Which observation has the greatest leverage value?
   (f) [2 pts] Check for outliers. (You may use diagnostic plots; a formal test is not necessary.)
   (g) [2 pts] Check for influential points.

2. [13 pts] Using the eco data set (from package faraway), fit a model with home as the response and all of the other variables as predictors.
   (a) [2 pts] Produce the four default diagnostic plots given by R.
   (b) [2 pts] Using an appropriate diagnostic plot, check whether the assumed mean function is appropriate.
   (c) [2 pts] Check the assumption of constant variance.
   (d) [2 pts] What is the largest (most positive) least-squares residual value? What is the smallest (most negative) least-squares residual value? (Give accurate values; don't just use the plots!)
   (e) [1 pt] Which observation has the greatest leverage value?
   (f) [2 pts] Check for outliers. (You may use diagnostic plots; a formal test is not necessary.)
   (g) [2 pts] Check for influential points.

3. Using the swiss data set, fit a model with Fertility as the response and all of the other variables as predictors.
   Answer the following:
   (a) [2 pts] Produce a plot of the standardized residuals $r_i$ versus the ordinary (least squares) residuals $\hat{\varepsilon}_i$. (Show R code.)
   (b) [2 pts] The points in this plot do not exactly fall on a straight line. Briefly explain why. [Hint: What is the formula for the standardized residuals?]
   (c) [2 pts] List the studentized residuals $t_i$ (which are used as test statistics in the Mean Shift Test).
   (d) [2 pts] Perform the Mean Shift Test without Bonferroni adjustment, using $\alpha = 0.05$. Which provinces are identified as outliers?
   (e) [2 pts] Perform the Mean Shift Test with Bonferroni adjustment, using $\alpha = 0.05$. Which provinces are identified as outliers?

4. Using R, produce a grid of 9 normal probability plots (qqnorm) for samples of size n = 50 simulated independently from a geometric distribution. Specifically, simulate using the R function rgeom with parameter prob equal to 0.4. (Refer to the last section of the Diagnostics in Linear Regression slides.)
   (a) [2 pts] Display your plots and the R code you used to produce them.
   (b) [2 pts] Describe two distinct ways in which these plots tend to differ in appearance from what you would expect for normally-distributed data.
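
For reference, the residual types named in problem 3 are commonly defined as below. The notation is an assumption and may differ from the course slides: $\hat{\varepsilon}_i$ is the ordinary residual, $h_{ii}$ the leverage of observation $i$, $\hat{\sigma}$ the residual standard error, $\hat{\sigma}_{(i)}$ the same estimate with observation $i$ left out, $n$ the number of observations, and $p$ the number of regression coefficients including the intercept.

```latex
% Standardized (internally studentized) residual
r_i = \frac{\hat{\varepsilon}_i}{\hat{\sigma}\,\sqrt{1 - h_{ii}}}

% Studentized (externally studentized) residual used in the Mean Shift Test
t_i = \frac{\hat{\varepsilon}_i}{\hat{\sigma}_{(i)}\,\sqrt{1 - h_{ii}}}
    = r_i \sqrt{\frac{n - p - 1}{n - p - r_i^2}}

% Under the null hypothesis of no mean shift for observation i
t_i \sim t_{n - p - 1}

% Two-sided rejection rules at level alpha
\text{unadjusted: } |t_i| > t_{n-p-1}^{(\alpha/2)}
\qquad
\text{Bonferroni: } |t_i| > t_{n-p-1}^{(\alpha/(2n))}
```

Because $h_{ii}$ varies from one observation to the next, the ratio $r_i / \hat{\varepsilon}_i$ is not the same for every point.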
Step by Step Solution
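
The diagnostics requested above can be sketched in base R. This is a minimal illustration, not a worked answer: it uses the built-in `swiss` data from problem 3 so that it runs without extra packages, and the object names (`fit`, `e`, `h`, `r`, `t_i`, `crit_plain`, `crit_bonf`, `d`) are chosen here for illustration only. It prints results without interpreting them.

```r
# Fit the problem 3 model: Fertility on all other variables in swiss.
fit <- lm(Fertility ~ ., data = swiss)

# Four default diagnostic plots, arranged in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Exact largest and smallest ordinary residuals, with their row labels.
e <- residuals(fit)
print(range(e))
print(names(which.max(e)))
print(names(which.min(e)))

# Leverage values (diagonal of the hat matrix) and the largest one.
h <- hatvalues(fit)
print(names(which.max(h)))

# Standardized residuals plotted against ordinary residuals.
r <- rstandard(fit)
plot(e, r, xlab = "Ordinary residuals", ylab = "Standardized residuals")

# Studentized residuals and Mean Shift Test cutoffs at alpha = 0.05.
t_i <- rstudent(fit)
n <- nobs(fit)
p <- length(coef(fit))
alpha <- 0.05
crit_plain <- qt(1 - alpha / 2, df = n - p - 1)       # no adjustment
crit_bonf  <- qt(1 - alpha / (2 * n), df = n - p - 1) # Bonferroni
print(names(t_i)[abs(t_i) > crit_plain])
print(names(t_i)[abs(t_i) > crit_bonf])

# Cook's distance as one common measure of influence.
d <- cooks.distance(fit)
print(head(sort(d, decreasing = TRUE), 3))

# Problem 4: nine normal Q-Q plots of geometric samples (n = 50, prob = 0.4).
set.seed(1)  # arbitrary seed, only for reproducibility
op <- par(mfrow = c(3, 3))
for (i in 1:9) {
  x <- rgeom(50, prob = 0.4)
  qqnorm(x, main = paste("Sample", i))
  qqline(x)
}
par(op)
```

The same calls should carry over to problems 1 and 2 once the data are loaded with `library(faraway)`; for problem 1 the fit would presumably be `lm(total ~ expend + salary + ratio + takers, data = sat)`, assuming those are the column names in `sat`.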