Question
In this assignment you will examine data used by a Real Estate investment advisor. She wants you to answer some specific questions put by clients about house prices in the neighbourhood encompassed by 4 suburbs around the city of Melbourne. The data is contained in the file Real_Estate.xls and contains the following columns (variables):
Variable Name    Description
ID               House Identity number
Price            Selling Price of the house (in 000s)
Bedrooms         Number of bedrooms
Size             House Size (m2)
Pool             0=House without a Pool, 1=House with a Pool
Distance         Distance from city centre (km)
Suburb           Suburb number
Garage           0=House without a Garage, 1=House with a Garage
Random Sample:
Before you begin your analysis, you are required to take a random sample of size 150 from the 170 cases in the file. Use the file Random_Sample_Generator-2.xls to do this. Answers to the questions below are to be based on your sample of 110 cases. Make sure to keep a safe copy of the sample you use since you cannot use Random_Sample_Generator-2.xls to reproduce it. Provide a printout of the data in your sample, with ID numbers in ascending order.
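The assignment requires the supplied spreadsheet generator for the actual draw. Purely as an illustration of what that step does, here is a minimal Python sketch of simple random sampling without replacement. The function name draw_sample, the ID range 1 to 170, and the seed are assumptions made for the example; they are not taken from Real_Estate.xls or from the generator file.

```python
import random


def draw_sample(ids, sample_size, seed=None):
    """Return sample_size distinct IDs chosen at random, sorted ascending."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(list(ids), sample_size))


# Hypothetical house IDs 1..170; the real IDs come from the data file.
all_ids = range(1, 171)

# sample_size is a parameter: set it to the size your assignment specifies.
sample_ids = draw_sample(all_ids, 150, seed=42)
print(sample_ids[:10])
```

Unlike the spreadsheet generator, a seeded draw can be reproduced, but keeping a copy of the sampled rows is still the safest record.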
Part 1: Initial Data Analysis
1. Variable List
a. Using the variables listed in the table above, state for each variable whether it is
qualitative or quantitative.
b. If it is qualitative, state whether it is nominal or ordinal, and if it is quantitative, state
whether it is discrete or continuous.
2. Histogram
a. Create a histogram showing the distribution of selling price of the house.
b. Comment upon the shape of the distribution: is it symmetric? If it is not, is it positively or
negatively skewed?
c. Are there any outliers present? If so, are they of particular interest?
d. State which central measure would be best to use to describe the centre of this distribution,
and the reason(s) why.
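The questions on shape and on the choice of central measure can be checked numerically as well as by eye. The sketch below computes a moment-based skewness coefficient and compares the mean with the median. The price list is made up for illustration and is not from the assignment data; the skewness helper is a hypothetical name.

```python
import statistics


def skewness(values):
    """Population (moment-based) skewness: > 0 suggests a longer right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


# Hypothetical selling prices (in 000s), with one unusually expensive house.
prices = [420, 450, 465, 480, 495, 510, 530, 560, 610, 980]

print("skewness:", round(skewness(prices), 2))
print("mean:    ", statistics.fmean(prices))
print("median:  ", statistics.median(prices))
```

When the skewness is clearly positive and the mean sits above the median, as it does for this made-up list, the median is usually the safer description of the centre because it is not pulled towards the extreme values.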
3. Descriptive statistics
a. Prepare a table that shows the 5-number summary of price for houses in the 4 suburbs.
b. Construct side-by-side boxplots for the price of the houses in the 4 suburbs. Briefly
comment upon any differences you observe in house price for each suburb.
c. Are there any outliers present? If so, are they of particular interest?
d. State which central measure would be best to use to describe the centre of this distribution,
and the reason(s) why.
e. Prepare a summary table that shows the mean and standard deviation of Price for houses in the 4 Suburbs, broken down by the variable Bedrooms. Think carefully about the layout of the rows and columns of your table. As well as means and standard deviations, you should also include the number of houses in each group, so each cell in your final table should contain the mean, the standard deviation and n, the number of houses in that group.
f. Refer to part (e). Comment, in bullet point form, on the Price for any combinations of the Suburb and Bedrooms variables (i.e. cells in the table).
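The descriptive statistics asked for above can be sketched with the Python standard library. This is one possible approach under stated assumptions: quartiles use the "inclusive" interpolation method (Excel or other packages may give slightly different values), outliers are flagged with the common 1.5 x IQR rule, and the rows below are invented for illustration. The helper names are hypothetical.

```python
import statistics
from collections import defaultdict


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


def outliers_iqr(values):
    """Values lying more than 1.5 * IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def cell_stats(rows):
    """Map (suburb, bedrooms) -> (mean, sd, n) from (suburb, bedrooms, price) rows."""
    groups = defaultdict(list)
    for suburb, bedrooms, price in rows:
        groups[(suburb, bedrooms)].append(price)
    table = {}
    for key, prices in sorted(groups.items()):
        # The sample sd is undefined for a single house, so report NaN there.
        sd = statistics.stdev(prices) if len(prices) > 1 else float("nan")
        table[key] = (statistics.fmean(prices), sd, len(prices))
    return table


# Hypothetical (Suburb, Bedrooms, Price in 000s) rows, not real data.
rows = [(1, 3, 500), (1, 3, 520), (1, 4, 640), (2, 3, 450), (2, 4, 700), (2, 4, 730)]

for (suburb, bedrooms), (mean, sd, n) in cell_stats(rows).items():
    print(f"Suburb {suburb}, {bedrooms} bedrooms: mean={mean:.1f}, sd={sd:.1f}, n={n}")
```

Reporting n in every cell matters because a mean based on one or two houses says little about that Suburb and Bedrooms combination.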
4. Statistical inferences
One of the clients wants information on size of houses as it relates to price.
a. Produce a scatter plot of Price vs Size (Size should be on the horizontal axis). Make sure you
label your axes properly and that your graph has an appropriate title.
b. Refer to part (a). Briefly, describe the nature of the relationship between these 2 variables.
c. Now, create a new variable (column) labelled Size Group which divides Size up into two size
groups as follows:
Under 200 square meters       Small
200 square meters and over    Large
I. Produce suitable graphs or charts to help in providing the information requested on the Size of the house as it relates to Price.
II. Construct a 95% confidence interval for the mean Price of small houses and of large houses.
III. Refer to (II). Is there any interaction (overlap) between the 2 Confidence Intervals? What does this tell you about the Prices for the two Sizes?
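The Size Group split and the confidence interval comparison can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the normal critical value (about 1.96), which is reasonable only when each group is fairly large; with small groups a t critical value with n - 1 degrees of freedom should be used instead. The helper names and the data are hypothetical.

```python
import math
import statistics


def size_group(size_m2):
    """Under 200 square meters is Small; 200 and over is Large."""
    return "Small" if size_m2 < 200 else "Large"


def mean_ci(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal critical value)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical (Size in m2, Price in 000s) pairs, not real data.
houses = [(150, 430), (170, 455), (185, 470), (195, 490),
          (210, 560), (240, 600), (260, 640), (300, 720)]

prices_by_group = {"Small": [], "Large": []}
for size, price in houses:
    prices_by_group[size_group(size)].append(price)

ci_small = mean_ci(prices_by_group["Small"])
ci_large = mean_ci(prices_by_group["Large"])
print("Small:", ci_small)
print("Large:", ci_large)
print("Overlap:", intervals_overlap(ci_small, ci_large))
```

If the two intervals do not overlap, that suggests the mean prices of small and large houses differ; overlapping intervals are inconclusive on their own, and a formal two-sample test would be needed to decide.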
Part 2: Research Questions
Based on your random sample, identify and investigate TWO research questions of your own using inferential statistics (estimation and hypothesis testing).