Question
Suppose that you were hired to examine whether there is a link between property values and proximity to the airport. You use Redfin to collect housing prices and distance to the airport from 317 home sales in zip code 92833 (which contains the airport) sold within the last year. The Redfin data also provide information about other factors that affect local housing prices. The core regression that you run is as follows:
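$$\ln(\text{Price}_i) = \beta_1 + \beta_2\,\text{SQFT}_i + \beta_3 \ln(\text{LotSize}_i) + \beta_4\,\text{YearBuilt}_i + \beta_5\,\text{YearBuilt}_i^2 + \beta_6 \ln(X_{6i}) + \beta_7 X_{7i} + \beta_8 X_{8i} + \varepsilon_i$$
Price: housing price; SQFT: square footage; LotSize: lot size; YearBuilt: year the home was built; $X_6$, $X_7$, $X_8$: remaining regressors (names not legible in the source)
ln( ): natural logarithm function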
You also run a series of diagnostic tests to identify potential problems with heteroscedasticity, imperfect multicollinearity, and functional form issues. The results of these regressions are in the separate packet of regression results. Please consult them as you answer questions 1-10.
1. Suppose that you want to test the functional form of your core regression. Please perform a regression equation specification error test (RESET). In your answer, please clearly provide the null and alternative hypotheses of your test, the test statistic that you will use, the critical value of your test statistic or the p-value of your statistic, and the result of the test. (12 points)
2. Based on your results of the RESET from question 1, would you recommend changing the functional form of the regression? Why or why not? (6 points)
3. Now you want to run a general test for heteroscedasticity on the core regression. Please perform the White test. In your answer, please clearly provide the null and alternative hypotheses of your test, the test statistic that you will use, the critical value of your test statistic or the p-value of your statistic, and the result of the test. (12 points)
4. Based on your results of the White test from question 3, would you estimate the regression any differently to account for any heteroscedasticity? Why or why not? (6 points)
5. According to the core model results (regression 1), are there any problems related to imperfect multicollinearity in the regression? (8 points)
6. Please consider the effects of omitted variable bias, functional form problems, imperfect multicollinearity, and heteroscedasticity on regression results in general (not just this specific regression). Which of these problems is a violation of the classical linear model assumptions (there may be multiple that are problems; no explanation needed)? (8 points)
7. Please perform the F test of the model as a whole on the core regression (regression 1). In your answer, please clearly provide the null and alternative hypotheses of your test, the test statistic that you will use, the critical value of your test statistic or the p-value of your statistic, and the result of the test. (12 points)
8. Please interpret the parameter estimate for the lot size variable in the core regression (regression 1, $\hat{\beta}_3 = 0.146$). What effect does an increase in the lot size have on its price? (Hint: please be careful with the units) (6 points)
9. Please interpret the parameter estimate for the square footage variable in the core regression (regression 1, $\hat{\beta}_2 = 0.00017$). What effect does an increase in the square footage of a home have on its price? (Hint: please be careful with the units) (6 points)
10. Please perform an F test of the claim that the year the home was built has no effect on housing prices ($\beta_4 = 0$, $\beta_5 = 0$). In your answer, please clearly provide the null and alternative hypotheses of your test, the test statistic that you will use, the critical value of your test statistic, and the result of the test. (12 points)
11. Suppose that you are trying to model the determinants of high infant mortality. You have data at the county level about the infant mortality rate (Infant), per capita income (Income), and the number of doctors per 100,000 people (Docs) from 102 urban counties in the United States. Suppose that you want to choose between the linear and log-log functional forms for the regression:
$$\text{Infant}_i = \beta_1 + \beta_2\,\text{Income}_i + \beta_3\,\text{Docs}_i + \varepsilon_i$$
$$\ln(\text{Infant}_i) = \beta_1 + \beta_2 \ln(\text{Income}_i) + \beta_3 \ln(\text{Docs}_i) + \varepsilon_i$$
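You estimate each regression and perform the MacKinnon-White-Davidson test. The test regressions are:
$$\text{Infant}_i = \beta_1 + \beta_2\,\text{Income}_i + \beta_3\,\text{Docs}_i + \beta_4 Z_{1,i} + \varepsilon_i$$
$$\ln(\text{Infant}_i) = \beta_1 + \beta_2 \ln(\text{Income}_i) + \beta_3 \ln(\text{Docs}_i) + \beta_5 Z_{2,i} + \varepsilon_i$$
$$Z_{1,i} = \ln(\widehat{\text{Infant}}_i) - \widehat{\ln(\text{Infant}_i)}, \qquad Z_{2,i} = \exp\!\big(\widehat{\ln(\text{Infant}_i)}\big) - \widehat{\text{Infant}}_i$$
$\widehat{\text{Infant}}$: prediction from the linear model; $\widehat{\ln(\text{Infant})}$: prediction from the log-log model
You obtain the following parameter estimates from the test regressions: $\hat{\beta}_4 = 17.626$; $se(\hat{\beta}_4) = 12.472$; $\hat{\beta}_5 = 0.239$; $se(\hat{\beta}_5) = 0.039$. Does the test indicate that you should prefer the linear or the log-log model? If so, which one should you prefer? Please show your work. (12 points)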
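The arithmetic behind question 11 can be sketched as follows. This is a minimal sketch, not an official answer key: it assumes a two-sided test at the 5% level and an approximate critical value of 1.98 for 102 - 4 = 98 degrees of freedom (a table lookup, not a number given in the question). The helper name `t_ratio` is illustrative.

```python
# Reported estimates from the two MWD test regressions (question 11).
B4_HAT, SE_B4 = 17.626, 12.472   # coefficient on Z1 in the linear test regression
B5_HAT, SE_B5 = 0.239, 0.039     # coefficient on Z2 in the log-log test regression

# Assumption: approximate two-sided 5% critical value of t with 98 degrees of freedom.
T_CRIT = 1.98


def t_ratio(estimate, std_error):
    """t statistic for H0: coefficient = 0."""
    return estimate / std_error


t_z1 = t_ratio(B4_HAT, SE_B4)
t_z2 = t_ratio(B5_HAT, SE_B5)

# A significant Z1 is evidence against the linear form;
# a significant Z2 is evidence against the log-log form.
reject_linear = abs(t_z1) > T_CRIT
reject_loglog = abs(t_z2) > T_CRIT

print(f"t(Z1) = {t_z1:.3f}, reject linear model: {reject_linear}")
print(f"t(Z2) = {t_z2:.3f}, reject log-log model: {reject_loglog}")
```

Under these assumptions the Z1 coefficient is not significant while the Z2 coefficient is, which would favor the linear model.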
Regression 1 (Core regression):
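$$\ln(\text{Price}_i) = \beta_1 + \beta_2\,\text{SQFT}_i + \beta_3 \ln(\text{LotSize}_i) + \beta_4\,\text{YearBuilt}_i + \beta_5\,\text{YearBuilt}_i^2 + \beta_6 \ln(X_{6i}) + \beta_7 X_{7i} + \beta_8 X_{8i} + \varepsilon_i$$
ln( ): natural logarithm function

| Variable | Parameter Estimate ($\hat{\beta}$) | Standard Error of Parameter Estimate ($se(\hat{\beta})$) | t | p-value of t |
|---|---|---|---|---|
| Constant | 244.9069 | 60.73357 | 4.03248 | 6.96E-05 |
| SQFT | 0.00017 | 1.63E-05 | 10.41908 | 5.54E-22 |
| ln(LotSize) | 0.145776 | 0.022413 | 6.504084 | 3.14E-10 |
| YearBuilt | -0.24011 | 0.06187 | -3.88083 | 0.000127 |
| YearBuilt^2 | 6.18E-05 | 1.57E-05 | 3.9282 | 0.000106 |
| ln(X_6) | 0.016019 | 0.035124 | 0.456066 | 0.648663 |
| X_7 | 0.142644 | 0.018751 | 7.607393 | 3.4E-13 |
| X_8 | 0.00649 | 0.007964 | 0.814997 | 0.415702 |

R Squared: 0.8574
Explained Sum of Squares (ESS): 20.712
Residual Sum of Squares (RSS): 3.445
Total Sum of Squares (TSS): 24.157
F Statistic (Model as a Whole): 265.4 (p-value: <0.00001)
Sample size: 317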
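The reported F statistic and the unit-sensitive interpretations asked for in questions 7 to 9 can be checked from the regression 1 output. The sketch below is illustrative only: the function names are hypothetical, and the 100 square foot and 10% lot size changes are example magnitudes chosen here, not values from the exam.

```python
import math

# Values reported for regression 1.
ESS, RSS, N, K = 20.712, 3.445, 317, 8   # K counts all parameters, including the constant
B_SQFT, B_LN_LOT = 0.00017, 0.145776     # estimates for square footage and ln(lot size)


def overall_f(ess, rss, n, k):
    """F statistic for H0: all slope coefficients are zero."""
    return (ess / (k - 1)) / (rss / (n - k))


def pct_change_log_level(beta, delta_x):
    """Exact % change in price when a regressor entered in levels rises by delta_x."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)


def pct_change_log_log(beta, pct_x):
    """Exact % change in price when a regressor entered in logs rises by pct_x percent."""
    return 100.0 * ((1.0 + pct_x / 100.0) ** beta - 1.0)


f_stat = overall_f(ESS, RSS, N, K)
sqft_effect = pct_change_log_level(B_SQFT, 100)   # 100 extra square feet (example)
lot_effect = pct_change_log_log(B_LN_LOT, 10)     # 10% larger lot (example)

print(f"Overall F(7, 309) = {f_stat:.1f}")
print(f"100 more sq ft -> price changes by about {sqft_effect:.2f}%")
print(f"10% larger lot -> price changes by about {lot_effect:.2f}%")
```

The square footage coefficient is a semi-elasticity (dependent variable in logs, regressor in levels), while the lot size coefficient is an elasticity (both in logs), which is why the two helpers differ.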
Regression 2:
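$$\ln(\text{Price}_i) = \beta_1 + \beta_2\,\text{SQFT}_i + \beta_3 \ln(\text{LotSize}_i) + \beta_4 \ln(X_{6i}) + \beta_5 X_{7i} + \beta_6 X_{8i} + \varepsilon_i$$

| Variable | Parameter Estimate ($\hat{\beta}$) | Standard Error of Parameter Estimate ($se(\hat{\beta})$) | t | p-value of t |
|---|---|---|---|---|
| Constant | 12.57631 | 0.137729 | 91.31188 | 1.2E-226 |
| SQFT | 0.000227 | 1.43E-05 | 15.87132 | 5.76E-42 |
| ln(LotSize) | 0.039068 | 0.015571 | 2.508997 | 0.012616 |
| ln(X_6) | 0.000419 | 0.036865 | 0.011364 | 0.99094 |
| X_7 | 0.192446 | 0.01782 | 10.79955 | 2.67E-23 |
| X_8 | 0.002275 | 0.008363 | 0.272053 | 0.785762 |

R Squared: 0.8388
Explained Sum of Squares (ESS): 20.264
Residual Sum of Squares (RSS): 3.893
Total Sum of Squares (TSS): 24.157
F Statistic (Model as a Whole): 323.8 (p-value: <0.00001)
Sample size: 317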
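Regression 2 drops both year-built terms, so it can serve as the restricted model for the joint test in question 10. The following is a minimal sketch of that F statistic, assuming a 5% significance level and an approximate table value of 3.03 for F(2, 309); the function name `restriction_f` is hypothetical.

```python
# RSS values reported in the regression packet.
RSS_UNRESTRICTED = 3.445   # regression 1 (includes year built and its square)
RSS_RESTRICTED = 3.893     # regression 2 (both year-built terms dropped)
N = 317                    # sample size
K_UNRESTRICTED = 8         # parameters in regression 1, including the constant
Q = 2                      # number of restrictions (beta_4 = 0 and beta_5 = 0)

# Assumption: approximate 5% critical value of F(2, 309) taken from a table.
F_CRIT = 3.03


def restriction_f(rss_r, rss_u, q, n, k_u):
    """F statistic for q linear restrictions, from restricted and unrestricted RSS."""
    return ((rss_r - rss_u) / q) / (rss_u / (n - k_u))


f_stat = restriction_f(RSS_RESTRICTED, RSS_UNRESTRICTED, Q, N, K_UNRESTRICTED)
reject_h0 = f_stat > F_CRIT

print(f"F(2, 309) = {f_stat:.2f}, reject H0 (year built has no effect): {reject_h0}")
```

Under these assumptions the statistic is far above the critical value, so the claim that the year built has no effect would be rejected.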
Regression 3:
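$$\ln(\text{Price}_i) = \beta_1 + \beta_2\,\text{SQFT}_i + \beta_3 \ln(\text{LotSize}_i) + \beta_4\,\text{YearBuilt}_i + \beta_5\,\text{YearBuilt}_i^2 + \beta_6 \ln(X_{6i}) + \beta_7 X_{7i} + \beta_8 X_{8i} + \beta_9 \big[\widehat{\ln(\text{Price}_i)}\big]^2 + \beta_{10} \big[\widehat{\ln(\text{Price}_i)}\big]^3 + \varepsilon_i$$
$\widehat{\ln(\text{Price})}$: core regression's prediction of natural logarithm of housing price

| Variable | Parameter Estimate ($\hat{\beta}$) | Standard Error of Parameter Estimate ($se(\hat{\beta})$) | t | p-value of t |
|---|---|---|---|---|
| Constant | 11313.65 | 5706.471 | 1.982601 | 0.054488 |
| SQFT | -1409.03 | 710.8298 | -1.98223 | 0.054531 |
| ln(LotSize) | 34.66433 | 17.48713 | 1.982277 | 0.054526 |
| YearBuilt | 16.49184 | 8.319639 | 1.982279 | 0.054525 |
| YearBuilt^2 | -14.0929 | 7.109541 | -1.98225 | 0.054529 |
| ln(X_6) | 0.653283 | 0.329572 | 1.982218 | 0.054533 |
| X_7 | -1.17069 | 0.590588 | -1.98225 | 0.054529 |
| X_8 | -16.3087 | 8.227456 | -1.98222 | 0.054532 |
| (predicted ln(Price))^2 | -0.80363 | 0.4054 | -1.9823 | 0.054523 |
| (predicted ln(Price))^3 | 0.000211 | 0.000106 | 1.982313 | 0.054522 |

R Squared: 0.8601
Explained Sum of Squares (ESS): 20.778
Residual Sum of Squares (RSS): 3.38
Total Sum of Squares (TSS): 24.157
F Statistic (Model as a Whole): 209.7 (p-value: <0.00001)
Sample size: 317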
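For the RESET in question 1, regression 3 is the unrestricted model and regression 1 the restricted one (the two added powers of the fitted value have coefficients set to zero). A worked version of the F statistic, using only the RSS values reported above and assuming the usual restricted-versus-unrestricted form, is:

$$F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(n - k_U)} = \frac{(3.445 - 3.380)/2}{3.380/(317 - 10)} = \frac{0.0325}{0.01101} \approx 2.95$$

Here $q = 2$ is the number of added terms and $k_U = 10$ the number of parameters in regression 3. Assuming a 5% significance level and a table value of roughly 3.03 for $F(2, 307)$, the statistic falls just short of the critical value, so the null of correct functional form would not be rejected at that level, although the result is borderline.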
Regression 4:
Dependent variable: $\hat{\varepsilon}_i^2$ ($\hat{\varepsilon}$: residuals from regression 1)
Independent variables: all independent variables, their squares, and cross products
Regression parameter estimates omitted for the sake of sanity (33 parameters/βs in model)
Model statistics:
R Squared: 0.1108
Explained Sum of Squares (ESS): 0.0535
Residual Sum of Squares (RSS): 0.4294
Total Sum of Squares (TSS): 0.4829
F Statistic (Model as a Whole): 1.106 (p-value: 0.3243)
Sample size: 317
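The White test in question 3 can be based on the auxiliary regression above. One common form of the statistic is n times the auxiliary R squared, compared with a chi-square distribution. The sketch below assumes that form, takes the degrees of freedom as the 33 reported parameters minus the constant, and uses an approximate 5% table value of 46.19 for chi-square with 32 degrees of freedom; none of these choices is stated in the packet itself.

```python
# Values reported for regression 4 (the auxiliary regression).
N = 317
R_SQUARED_AUX = 0.1108
N_PARAMETERS = 33

# Assumption: degrees of freedom = slope parameters only (constant excluded).
DF = N_PARAMETERS - 1

# Assumption: approximate 5% critical value of chi-square with 32 degrees of freedom.
CHI2_CRIT = 46.19

# LM form of the White statistic: n * R^2 from the auxiliary regression.
lm_stat = N * R_SQUARED_AUX
reject_homoskedasticity = lm_stat > CHI2_CRIT

print(f"White LM = {lm_stat:.2f} with {DF} df, "
      f"reject homoskedasticity: {reject_homoskedasticity}")
```

Under these assumptions the statistic is below the critical value, which agrees with the reported F statistic of the auxiliary regression (p-value 0.3243): there is no strong evidence of heteroscedasticity.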