Question
Use RStudio to answer the following questions. Present the R code and a screenshot of every output with the code, and label each answer with its question number. The data set is here: https://docs.google.com/spreadsheets/d/1fthpWnzxx4vljWjuZIzSU4_ShpfzdHMl/edit?usp=sharing&ouid=101981614875951507390&rtpof=true&sd=true
Use the attached dataset HomesforSale.xls, which contains a random sample of home prices in 4 different states, with 120 observations on the following 5 variables.
State | Location of the home: CA, NJ, NY, PA |
Price | Asking price (in $1,000's) |
Size | Area of all rooms (in 1,000's sq. ft.) |
Beds | Number of bedrooms |
Baths | Number of bathrooms |
1. Make a scatterplot of Price and Size and justify an appropriate model using a log transformation for these 2 variables (add the graphs to your PDF submission). Which transformation works better for a model with these 2 variables? (Answer with one of log-lin, lin-log, or log-log; be careful with typos.)
2. Run a regression model predicting Price as a function of Size, Beds, and Baths (Model1). Build an appropriate model considering only main effects, using the transformation you selected for the variables Price and Size. Report the coefficients of the variables you used (or their transformations) with 4 digits after the decimal point.
b0 (intercept) =
b1(Size) =
b2(Beds) =
b3(Baths) =
3. Produce residual diagnostic plots and attach them to your file. Run a Shapiro-Wilk test of normality on the residuals of this model and report the p-value with 4 digits after the decimal point.
4. Find any observations that are unusual and evaluate the top 3 extreme observations for their impact on your conclusion. List the observation numbers you considered extreme (or outliers) in ascending order (1, 2, 3...). Base your answer on the Residuals vs Fitted plot or the Q-Q plot: ___, ___, ___.
5. Run the regression (Model2) without these 3 outliers/extreme values and compare it with Model1. Do you recommend removing them? Show the work in your file and use this question to explain (be brief).
6. Calculate the estimated effect of a 20% increase in home size on the price. Use Model1 and report your answer as a percentage without the percent sign (for example, report 12.45% as 12.45). Explain in ONE sentence the effect that you just calculated.
7. Calculate the difference between California (CA) and New Jersey (NJ) in terms of average house price. To answer this question, you need to run another model (Model3) with the explanatory variables Size, Beds, and Baths plus variable(s) that account for the differences between California and New Jersey. For consistency, make sure to keep all the records in your regression (n=120). Hint: you need to create dummy variables by State. Explain in ONE sentence the effect that you just calculated.
8. Calculate the difference in Multiple R-squared between the original Model1 and Model3. Report 4 digits after the decimal point.
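The workflow asked for in questions 1 to 8 could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data frame `homes` is a synthetic stand-in generated in the code (not the real HomesforSale.xls data), the log-log form is assumed purely for the sake of the example, and object names such as `extreme`, `effect20`, `ca_vs_nj`, and `r2_diff` are hypothetical. None of the numbers it produces apply to the actual dataset.

```r
# Synthetic stand-in for HomesforSale.xls (ASSUMPTION: the real data are not used here).
set.seed(1)
n <- 120
homes <- data.frame(
  State = factor(rep(c("CA", "NJ", "NY", "PA"), each = 30)),
  Size  = round(runif(n, 0.8, 4.0), 3),   # area in 1,000s of sq. ft.
  Beds  = sample(2:5, n, replace = TRUE),
  Baths = sample(1:4, n, replace = TRUE)
)
# Hypothetical data-generating process, for illustration only
homes$Price <- exp(4.5 + 0.8 * log(homes$Size) + 0.05 * homes$Baths +
                   0.3 * (homes$State == "CA") + rnorm(n, sd = 0.4))

# Q1: compare the candidate transformations visually
par(mfrow = c(2, 2))
plot(Price ~ Size, data = homes, main = "lin-lin")
plot(log(Price) ~ Size, data = homes, main = "log-lin")
plot(Price ~ log(Size), data = homes, main = "lin-log")
plot(log(Price) ~ log(Size), data = homes, main = "log-log")

# Q2: Model1 with main effects only (ASSUMPTION: log-log was selected in Q1)
model1 <- lm(log(Price) ~ log(Size) + Beds + Baths, data = homes)
print(round(coef(model1), 4))

# Q3: the four standard lm diagnostic plots and a Shapiro-Wilk test on the residuals
par(mfrow = c(2, 2))
plot(model1)
sw <- shapiro.test(residuals(model1))
print(round(sw$p.value, 4))

# Q4: one possible rule: the 3 largest absolute standardized residuals, ascending
extreme <- sort(order(abs(rstandard(model1)), decreasing = TRUE)[1:3])
print(extreme)

# Q5: Model2 refits the same formula without those 3 rows
model2 <- lm(log(Price) ~ log(Size) + Beds + Baths, data = homes[-extreme, ])
print(round(cbind(Model1 = coef(model1), Model2 = coef(model2)), 4))

# Q6: in a log-log model, a 20% size increase multiplies price by 1.2^b1
b1 <- coef(model1)["log(Size)"]
effect20 <- (1.2^b1 - 1) * 100          # percent change in price
print(round(unname(effect20), 2))

# Q7: Model3 adds State dummies; with NJ as the reference level, the StateCA
# coefficient is the CA-minus-NJ difference in log price, all else equal
homes$State <- relevel(homes$State, ref = "NJ")
model3 <- lm(log(Price) ~ log(Size) + Beds + Baths + State, data = homes)
ca_vs_nj <- (exp(coef(model3)["StateCA"]) - 1) * 100   # percent difference
print(round(unname(ca_vs_nj), 2))

# Q8: difference in Multiple R-squared between Model3 and Model1
r2_diff <- summary(model3)$r.squared - summary(model1)$r.squared
print(round(r2_diff, 4))
```

To run this against the real data, the simulated `homes` block would be replaced by reading the spreadsheet (for example with the `readxl` package, assuming it is installed), keeping the column names State, Price, Size, Beds, and Baths. The choice of transformation in Q1 and of which points count as extreme in Q4 still has to be justified from the actual plots.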