The Galapagos Islands data set is posted on our course website. The data set comes from "Data" by Andrews and Herzberg (1985), with a few missing values for Elevation filled in thanks to Julian Faraway. The variables for the 30 islands are: Record Number (Record), Island, Number of observed species (NoSpp), Number of native species (NativeSpp), Area in km² (Area), Elevation in m (Elevation), Distance from nearest island in km (NearestIsland), Distance from Santa Cruz in km (SantaCruz), and Area of adjacent island in km² (AreaAdjacent).

In this assignment we'll look at the relationship between the number of species and the area of the island. This is a fundamental relationship in the Theory of Island Biogeography, which models how species numbers increase with island area. If S is the number of species and A is the area of the island, then $S = C A^{y}$, where C is a constant and y is a "biologically meaningful" parameter. We can convert this to a pseudolinear model by taking logs of both sides: $\log S = \log C + y \log A$. For theoretical reasons, then, we can expect that log-transforming both the area and the number of species will do a good job of linearizing the relationship. (It doesn't really matter which base of log we use; we'll use base 10 logs here for consistency.) However, we'll start off "naive" about the Theory of Island Biogeography and try a couple of other plots first.

(a) First, graph a linear regression of NoSpp on Area. The fit is truly awful. Why?

(b) The results in (a) suggest that, at the very least, the variable Area needs to be transformed. Graph a regression of NoSpp on $\log_{10}$(Area). Obtain the coefficient of determination ($r^2$) for this fit. You should note that this fit is substantially better than in (a) but still leaves much to be desired. What shortcomings do you see in the fit of the linear model here?

(c) Now try a fit of $\log_{10}$(NoSpp) on $\log_{10}$(Area). You should find this fit quite aesthetically pleasing! Perform a regression analysis for this fitted model.

(d) What is the estimated linear regression equation for the model in (c)?

(e) Obtain 95% confidence intervals for the intercept and slope parameters.

(f) Use the results in (e) to obtain 95% confidence intervals for y and C.

(g) Plot $\log_{10}$(NoSpp) versus $\log_{10}$(AreaAdjacent). Is there a statistically significant (at the 5% level) linear relationship here?
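
As a rough illustration of the mechanics behind the log-log fit and its confidence intervals, here is a minimal Python sketch. It assumes NumPy and SciPy are available, and the function name loglog_fit is hypothetical, not part of any course software. The data are synthetic and made up for the example; they are not the Galapagos values. The idea is that the slope of the fitted line estimates y directly, while the intercept estimates log10(C), so the endpoints of the intercept interval are back-transformed with powers of 10 to get an interval for C.

```python
import numpy as np
from scipy import stats


def loglog_fit(area, species, level=0.95):
    """Least-squares fit of log10(species) on log10(area) with t-based CIs."""
    x = np.log10(np.asarray(area, dtype=float))
    y = np.log10(np.asarray(species, dtype=float))
    n = x.size

    # Ordinary least-squares slope and intercept.
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()

    # Residual variance on n - 2 degrees of freedom, then standard errors.
    sse = np.sum((y - (intercept + slope * x)) ** 2)
    s2 = sse / (n - 2)
    se_slope = np.sqrt(s2 / sxx)
    se_intercept = np.sqrt(s2 * (1.0 / n + x.mean() ** 2 / sxx))

    # Two-sided t critical value for the requested confidence level.
    tcrit = stats.t.ppf(0.5 + level / 2.0, df=n - 2)
    ci_slope = (slope - tcrit * se_slope, slope + tcrit * se_slope)
    ci_intercept = (intercept - tcrit * se_intercept,
                    intercept + tcrit * se_intercept)

    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - sse / np.sum((y - y.mean()) ** 2),
        "ci_slope": ci_slope,            # interval for the exponent y
        "ci_intercept": ci_intercept,    # interval for log10(C)
        # Back-transform the intercept endpoints to get an interval for C.
        "ci_C": (10.0 ** ci_intercept[0], 10.0 ** ci_intercept[1]),
        # Two-sided p-value for the test that the slope is zero.
        "p_slope": 2.0 * stats.t.sf(abs(slope / se_slope), df=n - 2),
    }


# Synthetic, made-up data (NOT the Galapagos values):
# S = 20 * A**0.3 with multiplicative noise, for 30 hypothetical islands.
rng = np.random.default_rng(0)
area = 10.0 ** rng.uniform(-2, 3, size=30)
species = 20.0 * area ** 0.3 * 10.0 ** rng.normal(0.0, 0.1, size=30)

fit = loglog_fit(area, species)
for name, value in fit.items():
    print(name, value)
```

The same function could be reused for the last plot by passing the AreaAdjacent column in place of Area and comparing p_slope with 0.05. Note that the interval for C is not symmetric about the point estimate, because it comes from exponentiating a symmetric interval on the log scale.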