Question
Imagine that you have been hired by a philanthropist group to analyze the relationship between house values and neighborhood characteristics. For example, they would like to know whether houses in neighborhoods with desirable characteristics command a higher price. Moreover, they are specifically interested in environmental features, such as proximity to water (i.e. lake, river, or ocean) and air quality. The group has obtained information from tens of thousands of neighborhoods throughout the United States. You have been given a subset of this data, contained in house values.csv, along with the variable descriptions in house values description.txt.
Your task is to perform a statistical analysis on this data to answer the philanthropist group's questions. Build a statistical model that allows you to test hypotheses of interest to the group. Additionally, include a discussion of statistical issues that may be caused by omitted variables.
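As a rough illustration of what such a model might look like, here is a minimal sketch using pandas and statsmodels. It does not read the assignment's data: the table is simulated, and every column name (`near_water`, `pollution`, `income`, `log_value`) and every coefficient is an assumption made up for the example. Replace them with the variables documented in the description file.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the real file; every column name here is hypothetical.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "near_water": rng.integers(0, 2, n),    # 1 if the neighborhood is close to water
    "pollution": rng.normal(5.0, 1.5, n),   # air-pollution index (higher = worse air)
    "income": rng.normal(60.0, 15.0, n),    # median income, thousands of dollars
})
# Assumed data-generating process, used only so the example has something to fit.
df["log_value"] = (
    11.0
    + 0.15 * df["near_water"]
    - 0.05 * df["pollution"]
    + 0.01 * df["income"]
    + rng.normal(0.0, 0.3, n)
)

# Two specifications: environmental features alone, then with an income control.
# cov_type="HC3" requests heteroskedasticity-robust standard errors.
spec_env = smf.ols("log_value ~ near_water + pollution", data=df).fit(cov_type="HC3")
spec_full = smf.ols(
    "log_value ~ near_water + pollution + income", data=df
).fit(cov_type="HC3")

# Individual test: is the water coefficient zero?
water_test = spec_full.t_test("near_water = 0")
# Joint test: are both environmental coefficients zero?
joint_test = spec_full.f_test("near_water = 0, pollution = 0")

print(spec_env.params.round(3))
print(spec_full.params.round(3))
print("water p-value:", np.ravel(water_test.pvalue)[0])
print("joint p-value:", np.ravel(joint_test.pvalue)[0])
```

Fitting the same outcome under more than one specification makes it easy to see how the environmental coefficients move when a control is added, which is also the starting point for a discussion of omitted variables.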
Successful submissions will include the following elements:
1. A comprehensive analysis of data quality and integrity
2. A discussion of any observations you delete from the dataset, including implications for the final model results.
3. A discussion of any data imputation technique you use, including implications for the final model results.
4. A thorough exploratory analysis of each variable (and combinations of variables)
5. An explanation of how the exploratory data analysis is linked to modeling choices
6. An assessment and formal test of all key model assumptions
7. A table of regression results that shows multiple model specifications
8. A detailed discussion of the model results (in terms of answering the business questions posed by the philanthropist group)
9. A discussion of biases caused by omitted variables
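To make the omitted-variable item concrete, here is a small simulation sketch using only NumPy. The setup is hypothetical: it assumes an unobserved `income` variable that raises house values and is correlated with a continuous `water_proximity` measure. None of these names or numbers come from the assignment data. It shows that the coefficient from the regression that omits `income` differs from the full-model coefficient by the omitted variable's effect multiplied by the slope from regressing the omitted variable on the included one.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slope_1, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 5000

# Hypothetical setup: income is unobserved by the analyst, raises house values,
# and is positively correlated with proximity to water.
income = rng.normal(0.0, 1.0, n)
water_proximity = 0.5 * income + rng.normal(0.0, 1.0, n)
log_value = 1.0 + 0.2 * water_proximity + 0.4 * income + rng.normal(0.0, 0.5, n)

long_coef = ols(log_value, water_proximity, income)  # income included
short_coef = ols(log_value, water_proximity)         # income omitted
aux_coef = ols(income, water_proximity)              # omitted on included

# Bias = (coefficient on the omitted variable) * (auxiliary slope).
predicted_bias = long_coef[2] * aux_coef[1]
actual_bias = short_coef[1] - long_coef[1]

print("long-regression water coefficient: ", round(long_coef[1], 3))
print("short-regression water coefficient:", round(short_coef[1], 3))
print("predicted bias:", round(predicted_bias, 3))
print("actual bias:   ", round(actual_bias, 3))
```

Because both the effect of the omitted variable and its correlation with the included variable are positive in this setup, the short regression overstates the water coefficient. With the real data the omitted variable is unobserved, so only the likely sign of each term can be argued, not its size.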
DATA: https://drive.google.com/file/d/1DMXld_bMovgXGakpERVlIdwftuyfhpsY/view?usp=sharing
DESCRIPTION: https://drive.google.com/file/d/1qfSqeBec-H7tbx-8g0KbDJbB5-tDsNc7/view?usp=sharing
You should turn in 1) a pdf report detailing your analysis with all relevant statistical output (please do not suppress the code) and 2) Python commands that you used to generate your analysis.
Make sure to let me know which IDE you used.