Question
Overview

The purpose of this project is to have you complete all of the steps of a real-world linear regression research project, starting with developing a research question, then completing a comprehensive statistical analysis, and ending with summarizing your research conclusions.

Scenario

You have been hired by the D. M. Pan National Real Estate Company to develop a model to predict housing prices for homes sold in 2019. The CEO of D. M. Pan wants to use this information to help their real estate agents better determine the use of square footage as a benchmark for listing prices on homes. Your task is to provide a report predicting the housing prices based on square footage. To complete this task, use the provided real estate data set for all U.S. home sales as well as the national descriptive statistics and graphs provided.

Directions

Using the Project One Template located in the What to Submit section, generate a report including your tables and graphs to determine if the square footage of a house is a good indicator for what the listing price should be. Reference the National Statistics and Graphs document for national comparisons and the Real Estate Data spreadsheet (both found in the Supporting Materials section) for your statistical analysis.

Note: Present your data in a clearly labeled table and using clearly labeled graphs.

Specifically, include the following in your report:

Introduction
A. Describe the report: Give a brief description of the purpose of your report.
   a. Define the question your report is trying to answer.
   b. Explain when using linear regression is most appropriate.
      i. When using linear regression, what would you expect the scatterplot to look like?
   c. Explain the difference between response and predictor variables in a linear regression to justify the selection of variables.

Data Collection
A. Sampling the data: Select a random sample of 50 houses.
   a. Identify your response and predictor variables.
B. Scatterplot: Create a scatterplot of your response and predictor variables to ensure they are appropriate for developing a linear model.

Data Analysis
A. Histogram: For your two variables, create histograms.
B. Summary statistics: For your two variables, create a table to show the mean, median, and standard deviation.
C. Interpret the graphs and statistics:
   a. Based on your graphs and sample statistics, interpret the center, spread, shape, and any unusual characteristic (outliers, gaps, etc.) for the two variables.
   b. Compare and contrast the shape, center, spread, and any unusual characteristic for your sample of house sales with the national population. Is your sample representative of national housing market sales?

Develop Your Regression Model
A. Scatterplot: Provide a graph of the scatterplot of the data with a line of best fit.
   a. Explain if a regression model is appropriate to develop based on your scatterplot.
B. Discuss associations: Based on the scatterplot, discuss the association (direction, strength, form) in the context of your model.
   b. Identify any possible outliers or influential points and discuss their effect on the correlation.
   c. Discuss keeping or removing outlier data points and what impact your decision would have on your model.
C. Find r: Find the correlation coefficient (r).
   a. Explain how the r value you calculated supports what you noticed in your scatterplot.

Determine the Line of Best Fit
Clearly define your variables. Find and interpret the regression equation. Assess the strength of the model.
A. Regression equation: Write the regression equation (i.e., line of best fit) and clearly define your variables.
B. Interpret regression equation: Interpret the slope and intercept in context.
C. Strength of the equation: Provide and interpret R-squared.
   a. Determine the strength of the linear regression equation you developed.
D. Use regression equation to make predictions: Use your regression equation to predict how much you should list your home for based on the square footage of your home.

Conclusions
A. Summarize findings: In one paragraph, summarize your findings in clear and concise plain language for the CEO to understand. Summarize your results.
   a. Did you see the results you expected, or was anything different from your expectations or experiences?
      i. What changes could support different results, or help to solve a different problem?
      ii. Provide at least one question that would be interesting for follow-up research.

You can use the following tutorial that is specifically about this assignment. Make sure to check the assignment prompt for specific numbers used.

[Spreadsheet screenshots: "Project 1 data" worksheet. Blank cells were hidden behind chart overlays in the screenshots.]

Region | State | County | Median listing price | Median $'s per square foot | Median square feet | Random
West North Central | ks | wyandotte | 259500 | 159.1048437 | 1631 | 0.868151
East North Central | oh | lake | 225900 | 134.7852029 | 1676 | 0.180808
New England | nh | grafton | 358200 | 143.7976716 | 2491 | 0.01287
East North Central | wi | douglas | 461400 | 128.8466909 | 3581 | 0.715112
New England | ri | bristol | 355100 | 205.7358053 | 1726 | 0.778902
West North Central | sd | lincoln | 461000 | 195.4217889 | 2359 | 0.618012
South Atlantic | fl | collier | 413800 | 178.208441 | 2322 | 0.064025
Mid Atlantic |  | queen anne's | 273400 | 190.3899721 | 1436 | 0.300883
Pacific | or | washington | 296800 | 223.3258089 | 1329 | 0.647482
Mountain | nm | eddy | 383900 | 185.3693868 | 2071 | 0.403756
South Atlantic | fl | duval | 320400 | 169.9734748 | 1885 | 0.906496
Mid Atlantic | va | stafford | 306800 | 152.8649726 | 2007 | 0.33833
New England | ct | new haven | 387100 | 180.7189542 | 2142 | 0.359607
South Atlantic | sc | beaufort | 314400 | 149.2168961 | 2107 | 0.907795
East North Central | oh | trumbull | 243000 | 133.0049261 | 1827 | 0.636315
Mountain | ut | salt lake | 304300 | 184.9848024 | 1645 | 0.581819
New England | ri | providence | 379100 | 147.7396726 | 2566 | 0.996462
Pacific | ca | yuba |  |  |  | 0.992047
New England | ma | suffolk | 423600 | 170.0521879 | 2491 | 0.022267
Northeast | pa | luzerne | 386500 | 186.0857005 | 2077 | 0.989906
West South Central | tx | midland | 515000 | 141.1732456 | 3648 | 0.690863
Pacific | wa | lewis | 402300 | 218.5225421 | 1841 | 0.509874
East South Central | al | tuscaloosa | 633100 | 107.4143196 | 5894 | 0.511591
East North Central | mi | bay | 145100 | 117.110573 | 1239 | 0.870019
New England | vt | rutland | 344200 | 150.9649123 | 2280 | 0.327394
New England | ma | worcester | 346000 | 213.3168927 | 1622 | 0.363282
Mountain | id | twin falls | 319000 | 168.6046512 | 1892 | 0.559221
Mid Atlantic | nj | atlantic | 214600 | 173.2041969 | 1239 | 0.567028
Pacific | or | josephine | 491900 | 266.3237683 | 1847 | 0.873965
West North Central | mo | st. louis | 440100 | 197.4427995 | 2229 | 0.775723
East South Central | tn | greene | 226000 | 115.8974359 | 1950 | 0.175822
West North Central | ia | polk | 720300 | 161.5384615 | 4459 | 0.855059
Mountain | nm | otero | 468500 | 211.036036 | 2220 | 0.049009
New England | ma | norfolk | 803400 | 141.9434629 | 5660 | 0.55633
New England | ma | franklin | 385200 | 171.4285714 | 2247 | 0.803801
East North Central | mi | marquette | 172500 | 120.3768318 | 1433 | 0.980918
Mountain | ut | washington | 370500 | 209.3220339 | 1770 | 0.618897
Northeast | pa | columbia | 277800 | 174.169279 | 1595 | 0.505426
Northeast | pa | lancaster | 248500 | 169.6245734 | 1465 | 0.828105
East South Central | ms | lafayette | 605300 | 115.3391768 | 5248 | 0.264513
Pacific | or | washington |  |  |  | 0.371909
Mid Atlantic | va | pittsylvania | 347600 | 209.9033816 | 1656 | 0.736246
New England | me | cumberland | 347900 | 144.7773616 | 2403 | 0.339326
Mid Atlantic | md | carroll | 361100 | 179.920279 | 2007 | 0.769532
Pacific | wa | grays harbor | 395300 | 283.776023 | 1393 | 0.616615
Mid Atlantic | nj | hunterdon | 352500 | 249.1166078 | 1415 | 0.44904
New England | vt | windsor | 293800 | 182.0322181 | 1614 | 0.898791
South Atlantic | nc | hoke | 305700 | 152.0139234 | 2011 | 0.74306
East North Central | in | wayne | 203800 | 141.4295628 | 1441 | 0.581086
East South Central | tn | davidson | 326400 | 126.2669246 | 2585 | 0.447377

Summary statistics (from the worksheet):

Statistic | Listing Price | Square Feet Area
Mean | $367,033 | 2243.166667
Median | $350,200 | 1978.5
Std Dev | $129,633.52 | 1072.755933

Regression output (from the worksheet):

Trendline equation on chart: y = 103.48x + 134912
r: 0.856320292
R-squared: 0.733284443
Slope: 103.48
Intercept: 134912
Predictions: 2700 sqft: 144484; 1000 sqft: -31432

[Chart: Scatterplot of y vs x, with trendline; x-axis 1000 to 7000, y-axis up to 900000]
[Chart: Histogram for Median Square Feet; bins [1239, 2739], (2739, 4239], (4239, 5739], (5739, 7239]]
[Chart: Histogram for Median Listing Price; bins starting at 145100, 215100, 285100, 355100, 425100, 495100, 565100, 635100, 705100, 775100]

MAT 240 Project One Template

Median Housing Price Model for D. M. Pan National Real Estate Company

Introduction
[Describe the report: Include in this section a brief overview, including the purpose of the report and your approach.]

Data Collection
[Sampling the data: Outline how you obtained your sample data, including the response and predictor variables.]
[Scatterplot: Insert a correctly labeled scatterplot of your chosen variables.]

Data Analysis
[Histogram: Insert the histogram of the two variables. Be sure to include appropriate labels.]
[Summary statistics: Insert a table to show the summary statistics.]
[Interpret the graphs and statistics: Describe the shape, center, spread, and any unusual characteristic (outliers, gaps, etc.) and what they mean based on your sample data and the graphs you created.]
[Explain how these characteristics of the sample data compare to the same characteristics of the national population. Also, determine whether your sample is representative of the national housing market sales.]

The Regression Model
[Scatterplot: Include the scatterplot graph of the sample with a line of best fit and the regression equation.]
[Based on your graph, explain whether a regression model can be developed for the data and how.]
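The sampling and summary-statistics steps named in the template can be sketched in Python as a cross-check on the spreadsheet work. This is a minimal illustration, not the assignment's required method: the function names `draw_sample` and `summarize` are hypothetical, only eight (listing price, square feet) pairs from the worksheet are used to keep it short, the sample size is 5 rather than the 50 the assignment asks for, and the sample standard deviation (the equivalent of Excel's STDEV.S) is an assumption about which formula the worksheet used.

```python
import random
import statistics

# (median listing price, median square feet) pairs taken from the first
# eight data rows of the worksheet; a real run would load every row.
rows = [
    (259500, 1631),
    (225900, 1676),
    (358200, 2491),
    (461400, 3581),
    (355100, 1726),
    (461000, 2359),
    (413800, 2322),
    (273400, 1436),
]


def draw_sample(data, n, seed=0):
    """Return a simple random sample of n rows, drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(data, n)


def summarize(values):
    """Return the mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # n - 1 denominator (assumed)
    }


# The assignment calls for n=50; n=5 here only because eight rows are loaded.
sample = draw_sample(rows, n=5)
prices = [price for price, _ in sample]
sqft = [area for _, area in sample]

print("Listing price:", summarize(prices))
print("Square feet:  ", summarize(sqft))
```

Because only a handful of rows are loaded, the printed values will not match the worksheet's summary table; the point is the shape of the calculation.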
[Discuss associations: Explain the associations in the scatterplot, including the direction, strength, form in the context of your model.]
[Find r: Calculate the correlation coefficient and explain how it aligns with your interpretation of the data from the scatterplot.]

The Line of Best Fit
[Regression equation: Insert the regression equation.]
[Interpret regression equation: Interpret the slope and intercept in context.]
[Strength of the equation: Interpret the strength of the regression equation, R-squared.]
[Use regression equation to make predictions: Use the regression equation to make a sample prediction.]

Conclusions
[Summarize findings: Summarize your findings in clear and concise plain language. Outline any questions arising from the study that might be interesting for follow-up research.]
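The line-of-best-fit, r, R-squared, and prediction steps can be checked with a short Python sketch. The helper names `fit_line` and `predict` are hypothetical; the slope, intercept, and r constants are the values shown in the worksheet. One thing the arithmetic suggests is worth rechecking: the chart's trendline reads y = 103.48x + 134912, but the worksheet's prediction cells (144484 for 2700 sqft and -31432 for 1000 sqft) equal 103.48x - 134912, so the intercept appears to have been subtracted there rather than added.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope, intercept, and correlation r for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def predict(slope, intercept, sqft):
    """Predicted listing price for a home of the given square footage."""
    return slope * sqft + intercept


# Values as reported in the worksheet.
SLOPE = 103.48
INTERCEPT = 134912
R_WORKSHEET = 0.856320292

# For simple linear regression, R-squared is the square of r.
print("R-squared from r:", round(R_WORKSHEET ** 2, 6))  # 0.733284

# Predictions using the trendline equation as displayed on the chart.
for sqft in (1000, 2700):
    print(sqft, "sqft ->", round(predict(SLOPE, INTERCEPT, sqft)))
# 1000 sqft -> 238392
# 2700 sqft -> 414308
```

Running `fit_line` on the full square-feet and listing-price columns of the sample should reproduce the worksheet's slope, intercept, and r if the same rows are used.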