Question
Data set for the tasks: https://drive.google.com/file/d/14hxIfvGvAaNqOBoq8cdwnKc864Q-X9Ey/view?usp=sharing
Loxton is a small town with two suburbs. The data file "Major Project - Data Set" contains data on 545 houses sold in Loxton between 2015 and 2020. The data include the price at which each house was sold, which of two agents sold it (by law, all houses are sold through an agent), the year of sale, and various characteristics of each house (age, size, number of stories, etc.). These characteristics serve as possible explanatory variables for the sale price.
Data definitions follow:
OBS = observation
AGE = age of house in years
SHOPS = 1 if house is close to shopping precinct, 0 otherwise
CRIME = crime rate of the suburb within which the house is located
TOWN = distance in kilometres to the town centre
STORIES = number of dwelling stories
OCEAN = 1 if house has an ocean view, 0 otherwise
POOL = 1 if house has a pool, 0 otherwise
PRICE = price at which the house was sold (in dollars)
SELLER = selling agent - "W&M" (0) or "A&B" (1)
SIZE = size of the house in square metres
SUBURB = Mayfair (0) or Claygate (1)
TENNIS = 1 if house has a tennis court, 0 otherwise
SOLD = year of last sale (2015 to 2020)
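The definitions above imply simple validity rules, which give a first pass at spotting miscoded records. The following is a minimal sketch in Python, assuming each record has already been loaded as a dictionary keyed by the variable names and that SELLER and SUBURB are stored as 0/1 codes. The function name find_violations, the sign rules for AGE, CRIME, TOWN, SIZE, STORIES and PRICE, and the two sample records are illustrative assumptions, not part of the assignment.

```python
# Hypothetical validation rules derived from the data definitions.
BINARY_COLUMNS = ("SHOPS", "OCEAN", "POOL", "SELLER", "SUBURB", "TENNIS")
NON_NEGATIVE_COLUMNS = ("AGE", "CRIME", "TOWN")      # assumed: cannot be negative
POSITIVE_COLUMNS = ("STORIES", "PRICE", "SIZE")      # assumed: must exceed zero
SOLD_RANGE = (2015, 2020)                            # stated in the definitions


def find_violations(row):
    """Return a list of (column, value, reason) tuples for one record."""
    problems = []
    for col in BINARY_COLUMNS:
        if row[col] not in (0, 1):
            problems.append((col, row[col], "expected 0 or 1"))
    for col in NON_NEGATIVE_COLUMNS:
        if row[col] < 0:
            problems.append((col, row[col], "expected a non-negative value"))
    for col in POSITIVE_COLUMNS:
        if row[col] <= 0:
            problems.append((col, row[col], "expected a positive value"))
    if not SOLD_RANGE[0] <= row["SOLD"] <= SOLD_RANGE[1]:
        problems.append(("SOLD", row["SOLD"], "expected 2015 to 2020"))
    return problems


# Two made-up records for illustration only (not from the project data set).
clean_row = {"AGE": 12, "SHOPS": 0, "CRIME": 3.1, "TOWN": 8.5, "STORIES": 2,
             "OCEAN": 0, "POOL": 1, "PRICE": 410000, "SELLER": 1, "SIZE": 180,
             "SUBURB": 1, "TENNIS": 0, "SOLD": 2018}
bad_row = dict(clean_row, AGE=-4, POOL=2, SOLD=2051)

for label, row in (("clean", clean_row), ("bad", bad_row)):
    print(label, find_violations(row))
```

Records flagged this way still need a judgment call (correct, drop, or keep), and that reasoning belongs in the report.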
Task 1
You are required to provide a comprehensive summary of the data set contained in the "Major Project - Data Set" file. How you choose to do this is entirely at your discretion. However, it is recommended that you consider using both summary statistics and graphical methods while also noting any peculiarities within the data set.
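One way to start on the summary-statistics side of Task 1 is a per-variable summary plus a simple outlier screen. The sketch below uses only the Python standard library and assumes the values of one variable are available as a list. The function names, the 1.5 x IQR rule and the sample prices are illustrative assumptions; the prices are made up and are not from the project data set.

```python
import statistics


def summarise(values):
    """Return basic summary statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    stats = summarise(values)
    spread = stats["q3"] - stats["q1"]
    low = stats["q1"] - k * spread
    high = stats["q3"] + k * spread
    return [v for v in values if v < low or v > high]


# Made-up sale prices; one has an extra digit, as a data-entry slip might.
sample_prices = [310000, 325000, 298000, 340000, 3150000, 305000, 330000]

print(summarise(sample_prices))
print("possible outliers:", iqr_outliers(sample_prices))
```

A flagged value is only a candidate for closer inspection; whether it is an error or a genuinely unusual house is a separate question.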
Task 2
You have been hired by Jane, the wealthy owner of a house on Elm Street in Loxton (not included in the data set), to predict the price at which her house will sell. Her house has two stories, is in Claygate, has a size of 192 square metres, is not near a shopping precinct and is 10 km from the town centre. She estimates that the house is about 10 years old and, in her experience, is in a low-crime area. Jane inherited the house from her uncle and is therefore unsure when it was last sold. Some other features of the property can be seen below:
You are expected to build a regression model of house prices. In doing so, make sure that you use an appropriate number of predictors to develop your estimates. Once you have constructed an appropriate model, use it to obtain and provide for Jane's house:
1. A point prediction of the sale price which it can be expected to fetch
2. A 95% interval prediction for this sale price
3. An estimate of the marginal effect of house size on this sale price
4. Financial advice on whether Jane should use "W&M" or "A&B" to sell her house. "W&M" charges a commission of 5% whereas "A&B" charges a commission of 10% of the final sale price.
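The four items above can be illustrated with a deliberately simplified sketch: a regression of PRICE on SIZE alone, fitted by ordinary least squares in plain Python. This is not the model the task calls for (that would use several predictors); it only shows how a point prediction, a prediction interval, a marginal effect and a commission comparison relate to each other. All function names are hypothetical, the data pairs are made up, and the t critical value is passed in by hand (2.447 for 6 degrees of freedom; with roughly 545 observations it would be close to 1.96).

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x (one predictor only)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_sd = math.sqrt(sse / (n - 2))  # residual standard error, n - 2 df
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "resid_sd": resid_sd}


def predict_with_interval(fit, x0, t_crit):
    """Point prediction and prediction interval for a NEW observation at x0."""
    point = fit["intercept"] + fit["slope"] * x0
    # The leading 1 accounts for the new observation's own error term.
    se_pred = fit["resid_sd"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    return point, point - t_crit * se_pred, point + t_crit * se_pred


def net_proceeds(sale_price, commission_rate):
    """Amount the seller keeps after the agent's commission."""
    return sale_price * (1.0 - commission_rate)


# Made-up (SIZE, PRICE) pairs, NOT taken from the project data set.
sizes = [120, 150, 165, 180, 200, 210, 240, 260]
prices = [310000, 365000, 372000, 410000, 452000, 455000, 520000, 548000]

fit = fit_simple_ols(sizes, prices)
# 2.447 is the two-sided 95% t critical value for 8 - 2 = 6 degrees of freedom.
point, lower, upper = predict_with_interval(fit, x0=192, t_crit=2.447)
print(f"point prediction: {point:,.0f}")
print(f"95% prediction interval: {lower:,.0f} to {upper:,.0f}")
print(f"marginal effect of SIZE (slope): {fit['slope']:,.0f} per square metre")

# Item 4: here both agents are given the same price; in a model with a
# SELLER dummy the two predicted prices would generally differ.
print(f"net via W&M: {net_proceeds(point, 0.05):,.0f}")
print(f"net via A&B: {net_proceeds(point, 0.10):,.0f}")
```

At equal sale prices W&M always leaves Jane with more, so A&B is worthwhile only if its predicted price exceeds W&M's by a factor of more than 0.95 / 0.90, about 1.056. If SIZE enters the chosen model nonlinearly (for example through a squared or logged term), the marginal effect is no longer a single coefficient and has to be evaluated at 192 square metres.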
Jane, who claims to have some knowledge of regression analysis, has stressed that she thinks you should use a regression model with an R² of at least 85%.
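Jane's threshold refers to R², which cannot fall when another predictor is added to an ordinary least squares model with an intercept. For that reason it is often reported alongside adjusted R², which penalises extra predictors. A minimal sketch of both formulas follows; the function names and the numbers are hypothetical and serve only to show the arithmetic.

```python
def r_squared(actual, fitted):
    """R² = 1 - SSE / SST for observed values and model-fitted values."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - sse / sst


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R² for a model with an intercept and n_predictors slopes."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up example: an R² of exactly 0.85 with 545 observations.
for k in (5, 10, 20):
    print(k, "predictors ->", round(adjusted_r_squared(0.85, 545, k), 4))
```

With several hundred observations the penalty is small, so the two measures differ noticeably only when many predictors are added.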
Note: Task 1 directed you to take note of any peculiarities in the data set. There are additional errors in the data set that you may not have picked up on in Task 1. These will only become clear once you start working on Task 2. Several problems can result if you fail to handle these issues correctly, so be mindful to address them, both in your regression application and in your final report. If resolving any of the errors in the data set requires you to make assumptions, make sure to clearly state your reasoning and approach in your report.
Task 3
Please provide a reflective discussion on how you executed Task 2 of the project above. Specifically consider the following:
1. Verify that your regression model does not suffer from any misspecification errors and provide the relevant regression diagnostics which support your findings.
2. If you found that your model is in fact partially misspecified in part (1) of Task 3 above, explain what you did to ensure that the misspecification only has a minimal impact on your results in Task 2 above.
3. Were there any other oddities in the data set or your model? Explain.
4. Is there anything else worth mentioning which is relevant to your work or to your results for Jane?
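Part 1 asks for regression diagnostics. As one small example, the normality of the residuals can be screened with the Jarque-Bera statistic, which combines their skewness and kurtosis. Under normality it is approximately chi-square distributed with 2 degrees of freedom in large samples, so values above about 5.991 reject normality at the 5% level. The sketch below assumes the residuals of the fitted model are available as a list; the function name and the sample residuals are made up. It covers normality only: functional form, heteroskedasticity and multicollinearity need their own checks (for example a RESET test, a Breusch-Pagan test and variance inflation factors).

```python
CHI2_2DF_5PCT = 5.991  # 5% critical value of the chi-square(2) distribution


def jarque_bera(residuals):
    """Return (JB statistic, skewness, kurtosis) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    # Central moments with divisor n, as used in the JB statistic.
    m2 = sum(c ** 2 for c in centred) / n
    m3 = sum(c ** 3 for c in centred) / n
    m4 = sum(c ** 4 for c in centred) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return statistic, skewness, kurtosis


# Made-up residuals for illustration; a real check would use all 545.
sample_residuals = [-12.0, 4.5, 3.0, -1.5, 8.0, -6.5, 2.0, 0.5, -3.0, 5.0]
jb, skew, kurt = jarque_bera(sample_residuals)
print(f"JB = {jb:.3f}, skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
print("reject normality at 5%:", jb > CHI2_2DF_5PCT)
```

Because the chi-square approximation is a large-sample result, a ten-value example like this one only demonstrates the arithmetic.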