Question:
(a) Construct a correlation matrix with price, acres, bedrooms, bathrooms, square feet, age, and rooms. Is there any reason to be concerned with multicollinearity based on the correlation matrix?
(b) Find the least-squares regression equation ŷ = b0 + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6, where x1 is acres, x2 is bedrooms, and so on.
(c) Test H0: β1 = β2 = · · · = β6 = 0 versus H1: at least one of the βi ≠ 0 at the α = 0.05 level of significance.
(d) Test the hypotheses H0: βi = 0 versus H1: βi ≠ 0 for i = 1, 2, . . . , 6, at the α = 0.05 level of significance.
(e) Examine your regression results and remove any explanatory variable whose coefficient is not significantly different from 0 to obtain the model of best fit.
(f) Once you have obtained your model of best fit, draw residual plots and a boxplot of the residuals to assess the adequacy of the model.
(g) Determine and interpret R² and adjusted R². How well does your model appear to fit the data?
(h) Use your model to predict the selling price for another house from the agent's territory that has the following characteristics: 0.18 acre, 3 bedrooms, 1 bath, 1176 square feet, 47 years old, and 6 total rooms. Compare your prediction to the actual selling price: $99,900.
Location, location, location! The location of the house can have a large effect on its selling price. The first 12 houses listed are from the same zip code, the next 10 are from a second zip code, and the last 6 are from a third zip code.
(i) Construct side-by-side boxplots of selling price for the three zip codes. Is there any reason to believe that selling prices vary from one zip code to the next within the agent's territory?
(j) Introduce dummy explanatory variables to represent zip code and repeat parts (b)-(g) to find the model of best fit for price.
(k) Repeat part (h) assuming the house comes from the first zip code. Which model did a better job predicting the selling price?
(l) Explain the limitations of this model. Which, if any, can be dealt with, and how would you do so?
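For part (j), one common way to represent three zip codes is with two indicator (dummy) variables, using the first zip code as the baseline. The sketch below assumes the 28 houses are listed in the order the problem describes (12, then 10, then 6). The names `zip2`, `zip3`, and `zip_dummies` are illustrative and do not come from the textbook.

```python
# Sketch: build two dummy variables for three zip codes.
# Assumption: rows are ordered as stated in the problem
# (first 12 houses in zip code 1, next 10 in zip code 2, last 6 in zip code 3).

def zip_dummies(group_sizes=(12, 10, 6)):
    """Return one (zip2, zip3) indicator pair per house.

    The first zip code is the baseline, so its houses get (0, 0).
    """
    rows = []
    for group_index, size in enumerate(group_sizes):
        zip2 = 1 if group_index == 1 else 0  # 1 only for the second zip code
        zip3 = 1 if group_index == 2 else 0  # 1 only for the third zip code
        rows.extend([(zip2, zip3)] * size)
    return rows


dummies = zip_dummies()
print(len(dummies))                          # total number of houses
print(dummies[0], dummies[12], dummies[22])  # first house in each zip code
```

Only two dummies are needed for three categories; adding a third would make the columns perfectly collinear with the intercept. Under this coding, the house in part (k) would have both `zip2` and `zip3` equal to 0.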
During the early 2000s, the United States experienced a boom in the housing industry, in large part due to efforts by the government to boost consumer spending. For many, the lure of low interest rates put them in the market for a house.
When house shopping, a natural question is "How much is the house worth?" This is difficult to answer because it depends on what the market will bear: the house is worth what someone else is willing to pay for it.
A real estate agent wishes to examine several recent house sales in his territory and develop a model that could be used to give a rough idea of a house's fair market value. Articles on how to determine the value of a house often suggest comparing square footage, number of bedrooms, number of bathrooms, and size of the lot. The agent decided to examine these four variables, along with the age of the house and number of rooms, in an effort to predict the house's selling price.
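As a rough illustration of the correlation matrix asked for in part (a), the sketch below computes pairwise Pearson correlations in plain Python. The column values are placeholders invented for this example and are NOT the agent's data; the helper names `pearson` and `correlation_matrix` are also assumptions. With the real data set, each of the seven variables would be one column.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Placeholder values for illustration only; NOT the textbook's data set.
columns = {
    "bedrooms": [2, 3, 3, 4, 5],
    "rooms": [5, 6, 7, 8, 10],
    "age": [60, 45, 50, 20, 30],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in columns])
```

A common rule of thumb is to look more closely at any pair of explanatory variables whose correlation is large in absolute value (around 0.7 or higher is often cited), since such pairs may signal multicollinearity.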
Statistics Informed Decisions Using Data
ISBN: 9780134133539
5th Edition
Authors: Michael Sullivan III