Question
State | Location of the home: CA, NJ, NY, PA |
Price | Asking price (in $1,000's) |
Size | Area of all rooms (in 1,000's sq. ft.) |
Beds | Number of bedrooms |
Baths | Number of bathrooms |
Prepare a scatterplot of Price and Size and justify an appropriate model using a log transformation for these two variables (add the graphs to your PDF submission). Which transformation works better for a model with these two variables: log-lin, lin-log, or log-log? (Use exactly these options, and be careful with typos.)
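One way to compare the three candidate transformations is to fit a simple regression for each and look at the fit alongside the scatterplots. The sketch below does this on synthetic stand-in data, not the assignment's data set; the helper `r_squared` and the generated `size` and `price` arrays are assumptions made for illustration.

```python
import numpy as np

def r_squared(x, y):
    """R-squared of a simple least-squares line y = a + b*x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1.0 - resid.var() / y.var()

# Synthetic stand-in data (NOT the assignment's data set).
rng = np.random.default_rng(0)
size = rng.uniform(0.8, 5.0, 200)                            # 1,000's sq. ft.
price = 150 * size ** 1.2 * np.exp(rng.normal(0, 0.3, 200))  # $1,000's

fits = {
    "log-lin": r_squared(size, np.log(price)),          # log(Price) ~ Size
    "lin-log": r_squared(np.log(size), price),          # Price ~ log(Size)
    "log-log": r_squared(np.log(size), np.log(price)),  # log(Price) ~ log(Size)
}
for name, r2 in fits.items():
    print(f"{name}: R^2 = {r2:.4f}")
```

R-squared values are only directly comparable between models with the same response, so log-lin and log-log can be compared this way, while lin-log (Price in levels) is better judged from the scatterplot and residual plots.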
Run a regression model predicting Price as a function of Size, Beds, and Baths (Model1). Build an appropriate model considering only main effects, using the transformation you selected for Price and Size. Report the coefficients of the variables you used (or their transformations) to 4 decimal places.
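A minimal sketch of such a main-effects fit, assuming the log-log form was selected (log(Price) on log(Size), Beds, and Baths). The data are synthetic and the planted coefficients are arbitrary, so the printed values are not the assignment's answers.

```python
import numpy as np

# Synthetic stand-in data (NOT the assignment's data set).
rng = np.random.default_rng(1)
n = 200
size = rng.uniform(0.8, 5.0, n)   # 1,000's sq. ft.
beds = rng.integers(1, 6, n)
baths = rng.integers(1, 4, n)
log_price = (5.0 + 1.1 * np.log(size) + 0.02 * beds + 0.08 * baths
             + rng.normal(0, 0.25, n))

# Design matrix: intercept, log(Size), Beds, Baths (main effects only).
X = np.column_stack([np.ones(n), np.log(size), beds, baths])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

for name, b in zip(["Intercept", "log(Size)", "Beds", "Baths"], coef):
    print(f"{name}: {b:.4f}")   # 4 digits after the decimal point
```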
Produce residual diagnostic plots and attach them to your file. Run a Shapiro-Wilk test of normality on the residuals of this model and report the p-value to 4 decimal places.
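The normality check can be sketched as follows, using `scipy.stats.shapiro` on the residuals of a least-squares fit. The data are synthetic and the model is reduced to a single predictor for brevity; both are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the assignment's data set).
rng = np.random.default_rng(2)
n = 200
log_size = np.log(rng.uniform(0.8, 5.0, n))
log_price = 5.0 + 1.1 * log_size + rng.normal(0, 0.25, n)

# Fit by least squares and take the residuals.
X = np.column_stack([np.ones(n), log_size])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
residuals = log_price - X @ coef

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
w_stat, p_value = stats.shapiro(residuals)
print(f"W = {w_stat:.4f}, p-value = {p_value:.4f}")
```

A small p-value (for example, below 0.05) is evidence against normally distributed residuals.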
Find any unusual observations and evaluate the top 3 most extreme observations for their impact on your conclusions. List the observation numbers you considered extreme (or outliers), in ascending order. Base your answer on the Residuals vs Fitted plot or the Q-Q plot.
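A numeric complement to reading the plots is to rank observations by the absolute value of their standardized residuals. The sketch below plants three outliers in synthetic data (an assumption made purely for illustration) and recovers them. Indices here are 0-based, whereas plot labels in other software may be 1-based.

```python
import numpy as np

# Synthetic stand-in data with three planted outliers (NOT the assignment's data).
rng = np.random.default_rng(3)
n = 200
log_size = np.log(rng.uniform(0.8, 5.0, n))
log_price = 5.0 + 1.1 * log_size + rng.normal(0, 0.2, n)
planted = [10, 50, 120]
log_price[planted] += np.array([2.0, -2.0, 2.5])

X = np.column_stack([np.ones(n), log_size])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
resid = log_price - X @ coef

# Leverage: diagonal of the hat matrix H = X (X'X)^{-1} X'.
hat_diag = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
sigma = np.sqrt(resid @ resid / (n - X.shape[1]))
std_resid = resid / (sigma * np.sqrt(1.0 - hat_diag))

# Three largest |standardized residuals|, listed in ascending index order.
top3 = np.sort(np.argsort(np.abs(std_resid))[-3:])
print(top3.tolist())
```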
Run the regression without these 3 outliers/extreme values (Model2) and compare it with Model1. Do you recommend removing them? Show the work in your file and use this question to explain (be brief).
Calculate the estimated effect of a 10% increase in home size on the price. Use Model1 and report your answer as a number without the percent sign; for example, report 12.45% as 12.45.
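If Model1 is the log-log form, the coefficient b on log(Size) is an elasticity: multiplying Size by 1.10 multiplies the predicted Price by 1.10^b, so the percent change is (1.10^b - 1) * 100. The sketch below applies that formula; the function name and the value of `b_example` are hypothetical, not estimates from the assignment's data.

```python
def pct_change_in_price(b_log_size, pct_increase_in_size=10.0):
    """Percent change in predicted Price in a log-log model.

    b_log_size is the coefficient on log(Size); the result is in percent.
    """
    factor = 1.0 + pct_increase_in_size / 100.0
    return (factor ** b_log_size - 1.0) * 100.0

# Hypothetical coefficient, for illustration only.
b_example = 1.0
print(f"{pct_change_in_price(b_example):.2f}")
```

For a small change, the result is close to b times the percent increase, which gives a quick sanity check on the exact value. Under a log-lin or lin-log model the formula would differ.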
Calculate the difference between California and Pennsylvania in terms of average house price. To answer this question, you need to run another model (Model3) with the explanatory variables Size, Bedrooms, and Baths and variable(s) to account for the differences between California and Pennsylvania.
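One common way to account for state differences is to add indicator (dummy) variables with Pennsylvania as the baseline, so the CA coefficient measures the CA versus PA gap with Size, Beds, and Baths held fixed. The sketch below assumes a log(Price) response and synthetic data with arbitrary planted state shifts; none of the numbers come from the assignment's data set.

```python
import numpy as np

# Synthetic stand-in data (NOT the assignment's data set).
rng = np.random.default_rng(4)
n = 400
state = rng.choice(["CA", "NJ", "NY", "PA"], n)
size = rng.uniform(0.8, 5.0, n)
beds = rng.integers(1, 6, n)
baths = rng.integers(1, 4, n)
true_shift = {"CA": 0.5, "NJ": 0.2, "NY": 0.3, "PA": 0.0}  # arbitrary values
shift = np.array([true_shift[s] for s in state])
log_price = (4.8 + 1.1 * np.log(size) + 0.02 * beds + 0.08 * baths
             + shift + rng.normal(0, 0.25, n))

# Dummies for CA, NJ, NY; PA is the omitted baseline category.
dummies = [(state == s).astype(float) for s in ("CA", "NJ", "NY")]
X = np.column_stack([np.ones(n), np.log(size), beds, baths, *dummies])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

b_ca = coef[4]                      # CA vs PA difference in log(Price)
pct_diff = (np.exp(b_ca) - 1) * 100 # the same gap expressed in percent
print(f"CA coefficient: {b_ca:.4f}, approx. {pct_diff:.2f}% higher than PA")
```

With a log response the dummy coefficient is a difference in log price, so exp(b) - 1 converts it to a percent difference; with Price in levels the coefficient would be read directly in $1,000's.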
Calculate the difference in Multiple R-squared between the original Model1 and Model3. Report 4 digits after the decimal point.
Step by Step Solution
Step: 1
We first import the data of house sales using the following co...