Question
Question 1:
# load the data into a dataframe called housing_df
#MISSING 1 line of code
housing_df = pd.read_csv('BostonHousing.csv')

# display column/variable names
#Create a list called columns with all of the housing_df column names in it
#MISSING 1 line of code
print("Variables in the data are: ")
print(columns)

# review first 5 records in the data
print(" First 5 records in the data are:")
#MISSING 1 line of code
output:
Variables in the data are:
['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT', 'MEDV', 'CAT. MEDV']
First 5 records in the data are:
| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | LSTAT | MEDV | CAT. MEDV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0 | 0.538 | 6.575 | 65.2 | 4.0900 | 1 | 296 | 15.3 | 4.98 | 24.0 | 0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2 | 242 | 17.8 | 9.14 | 21.6 | 0 |
| 2 | 0.02729 | 0.0 | 7.07 | 0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2 | 242 | 17.8 | 4.03 | 34.7 | 1 |
| 3 | 0.03237 | 0.0 | 2.18 | 0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3 | 222 | 18.7 | 2.94 | 33.4 | 1 |
| 4 | 0.06905 | 0.0 | 2.18 | 0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3 | 222 | 18.7 | 5.33 | 36.2 | 1 |
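One possible way to fill in the two missing lines of Question 1 is sketched below. Because BostonHousing.csv is not available here, the sketch reads the five records shown in the output from an in-memory string instead; `sample_csv` is a stand-in introduced only for this illustration, and with the real file the `pd.read_csv('BostonHousing.csv')` line from the question would be used as written.

```python
import io

import pandas as pd

# Assumption: stand-in for BostonHousing.csv, built from the five records
# shown in the expected output of Question 1.
sample_csv = io.StringIO(
    "CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,LSTAT,MEDV,CAT. MEDV\n"
    "0.00632,18.0,2.31,0,0.538,6.575,65.2,4.0900,1,296,15.3,4.98,24.0,0\n"
    "0.02731,0.0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,9.14,21.6,0\n"
    "0.02729,0.0,7.07,0,0.469,7.185,61.1,4.9671,2,242,17.8,4.03,34.7,1\n"
    "0.03237,0.0,2.18,0,0.458,6.998,45.8,6.0622,3,222,18.7,2.94,33.4,1\n"
    "0.06905,0.0,2.18,0,0.458,7.147,54.2,6.0622,3,222,18.7,5.33,36.2,1\n"
)

# load the data into a dataframe called housing_df
housing_df = pd.read_csv(sample_csv)

# candidate for the first missing line: a plain list of column names
columns = list(housing_df.columns)
print("Variables in the data are: ")
print(columns)

# candidate for the second missing line: head() returns the first 5 rows
print(" First 5 records in the data are:")
print(housing_df.head())
```

In a notebook, ending the cell with `housing_df.head()` instead of printing it would render the table shown in the expected output.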
Question 2:
# select columns for regression analysis
outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']

#Create a dataframe called x containing the predictor columns
#MISSING 1 line of code

#Create a dataframe (technically a series) containing the outcome variable. Call it y
#MISSING 1 line of code
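A minimal sketch of the two missing lines in Question 2 follows. Indexing a dataframe with a list of names returns a dataframe, while indexing with a single name returns a series, which is the distinction the comment in the question hints at. The small `housing_df` here is an assumption for illustration only, built from the relevant columns of the five records shown in Question 1.

```python
import pandas as pd

# Assumption: a small stand-in for housing_df holding only the columns
# needed here, taken from the five records shown in Question 1.
housing_df = pd.DataFrame({
    'CRIM': [0.00632, 0.02731, 0.02729, 0.03237, 0.06905],
    'CHAS': [0, 0, 0, 0, 0],
    'RM': [6.575, 6.421, 7.185, 6.998, 7.147],
    'MEDV': [24.0, 21.6, 34.7, 33.4, 36.2],
})

# select columns for regression analysis
outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']

# a list of column names selects a DataFrame
x = housing_df[predictors]

# a single column name selects a Series
y = housing_df[outcome]

print(type(x).__name__, x.shape)
print(type(y).__name__, y.shape)
```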
Question 3:
#Create a model called housing_lm and set it to be a LinearRegression() model
#MISSING 1 line of code

# fit the regression model y on x
#MISSING 1 line of code

# print the intercept
#MISSING 1 line of code

#print the list of predictor columns and the coefficients
#MISSING 1 line of code
output:
intercept -28.81068250635914
  Predictor  coefficient
0      CRIM    -0.260724
1      CHAS     3.763037
2        RM     8.278180
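The four missing lines of Question 3 could look roughly like the sketch below, assuming `LinearRegression` is scikit-learn's `sklearn.linear_model.LinearRegression` (the question does not show its import). The model is fitted on a five-row stand-in built from the records shown in Question 1, so the printed intercept and coefficients will not match the output above, which comes from the full dataset.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumption: five-row stand-in for the full dataset, taken from the
# records shown in Question 1. Results differ from the full-data output.
housing_df = pd.DataFrame({
    'CRIM': [0.00632, 0.02731, 0.02729, 0.03237, 0.06905],
    'CHAS': [0, 0, 0, 0, 0],
    'RM': [6.575, 6.421, 7.185, 6.998, 7.147],
    'MEDV': [24.0, 21.6, 34.7, 33.4, 36.2],
})
outcome = 'MEDV'
predictors = ['CRIM', 'CHAS', 'RM']
x = housing_df[predictors]
y = housing_df[outcome]

# create the model
housing_lm = LinearRegression()

# fit the regression model y on x
housing_lm.fit(x, y)

# print the intercept
print('intercept ', housing_lm.intercept_)

# pair each predictor name with its fitted coefficient
coef_table = pd.DataFrame({'Predictor': x.columns,
                           'coefficient': housing_lm.coef_})
print(coef_table)
```

Putting the names and coefficients into a dataframe is one way to get the two-column layout shown in the expected output; a loop over `zip(x.columns, housing_lm.coef_)` would work as well.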
Question 4:
new_df = pd.DataFrame(
    [[0.1, 0, 6]],
    columns=['CRIM', 'CHAS', 'RM']
)
new_df
output:
CRIM CHAS RM
0 0.1 0 6
#Run the prediction model that you created on the dataframe created above, which contains
# the new predictor values. Store the results in housing_lm_pred
#MISSING 1 line of code
print('Predicted value for median house price based on the model built using dataset is:', housing_lm_pred)
output:
Predicted value for median house price based on the model built using dataset is: [20.83232392]
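With a fitted scikit-learn model, the missing line in Question 4 is most likely `housing_lm_pred = housing_lm.predict(new_df)`. As a sanity check that does not need the dataset, the sketch below recomputes the prediction by hand from the intercept and coefficients printed in Question 3; the names `intercept`, `coefficients`, and `manual_pred` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Values copied from the Question 3 output (coefficients are rounded there).
intercept = -28.81068250635914
predictors = ['CRIM', 'CHAS', 'RM']
coefficients = np.array([-0.260724, 3.763037, 8.278180])

new_df = pd.DataFrame(
    [[0.1, 0, 6]],
    columns=['CRIM', 'CHAS', 'RM']
)

# A linear regression prediction is intercept + sum(coefficient * value).
manual_pred = intercept + new_df[predictors].to_numpy() @ coefficients
print(manual_pred)

# Agrees with the reported 20.83232392 up to rounding of the coefficients.
print(np.isclose(manual_pred[0], 20.83232392, atol=1e-4))
```

Spelled out, the calculation is -28.8107 + (-0.260724)(0.1) + (3.763037)(0) + (8.278180)(6), which comes to about 20.83.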
Question 5:
# variables in the data
housing_df.columns
output:
Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT', 'MEDV', 'CAT. MEDV'], dtype='object')
# Create a new dataframe called predictors_df with only numerical predictors
#MISSING 1-5 lines of code (many different ways we have done this before)
predictors_df.columns
output:
Index(['CRIM', 'ZN', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'LSTAT'], dtype='object')
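Judging by the expected output, the columns left out of predictors_df are CHAS, MEDV, and CAT. MEDV. One of several ways to get there is to drop those three by name, as sketched below. The one-row `housing_df` is a stand-in built from the first record shown in Question 1, and `excluded` is a name introduced for this example.

```python
import pandas as pd

# Assumption: one-row stand-in for housing_df with all 14 columns,
# using the first record shown in Question 1.
housing_df = pd.DataFrame([{
    'CRIM': 0.00632, 'ZN': 18.0, 'INDUS': 2.31, 'CHAS': 0, 'NOX': 0.538,
    'RM': 6.575, 'AGE': 65.2, 'DIS': 4.0900, 'RAD': 1, 'TAX': 296,
    'PTRATIO': 15.3, 'LSTAT': 4.98, 'MEDV': 24.0, 'CAT. MEDV': 0,
}])

# Columns absent from the expected output: the 0/1 CHAS column and the
# two outcome columns.
excluded = ['CHAS', 'MEDV', 'CAT. MEDV']

# drop() returns a new dataframe and leaves housing_df unchanged
predictors_df = housing_df.drop(columns=excluded)

print(predictors_df.columns)
```

An equivalent alternative is to list the eleven wanted columns explicitly and index with that list, which makes the selection visible at the cost of more typing.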