Dataframe/Python URL="https://bit.ly/2WKPUXI"
*You may need to pass encoding='latin1' as an additional parameter to read_csv().
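A minimal sketch of the loading step, assuming pandas is used and that the file behind the URL is a CSV. The helper name `load_csv` and the try-then-fall-back behaviour are illustrative choices, not part of the assignment; passing encoding='latin1' directly works just as well.

```python
import pandas as pd


def load_csv(source, fallback_encoding="latin1"):
    """Read a CSV, retrying with a fallback encoding if UTF-8 decoding fails.

    `source` may be a local path or a URL; `load_csv` is a hypothetical helper.
    """
    try:
        # pandas decodes as UTF-8 by default.
        return pd.read_csv(source)
    except UnicodeDecodeError:
        # The file contains bytes that are not valid UTF-8, so retry.
        return pd.read_csv(source, encoding=fallback_encoding)
```

With the URL given above, this would be called as df = load_csv("https://bit.ly/2WKPUXI").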
Q1:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
model = LinearRegression()
model.fit(X_train, y_train)
predict = model.predict(X_test)
r_sq = model.score(X_test, y_test)
print('coefficient of determination:', r_sq)
print("errors in predictions: ", mean_absolute_error(y_test, predict)) # mean of |predict - real|
print("coefficient: ", model.coef_)
Justify the reasons behind the error you got.
Q2: Explore the data.
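One possible way to get a first overview, sketched under the assumption that the data is already in a pandas DataFrame. The helper name `summarize` is hypothetical; it only collects the dtype, the share of missing values, and the number of distinct values per column.

```python
import pandas as pd


def summarize(df):
    """Return one row per column: dtype, share of missing values, distinct count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_share": df.isna().mean(),   # fraction of NaN per column
        "n_unique": df.nunique(),            # NaN is not counted as a value
    })
```

Calling df.describe() and df.head() alongside this summary gives the value ranges and a look at the raw rows.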
Q3: Handle the missing values as follows:
- Drop the columns in which 95% or more of the values are missing.
- Estimate missing values if only an acceptable percentage of values is missing. Hint: rely on the relations between columns (e.g., ("car" and "engV"), ("model" and "drive")).
- Drop the rows that still contain missing values.
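The three steps above can be sketched as follows. This is one interpretation, not the required solution: it assumes "engV" is numeric and is filled with the median "engV" of the same "car", and that "drive" is categorical and is filled with the most frequent "drive" of the same "model". The function name `handle_missing` and the `col_threshold` parameter are hypothetical.

```python
import numpy as np
import pandas as pd


def handle_missing(df, col_threshold=0.95):
    """Drop sparse columns, fill from related columns, then drop leftover rows."""
    # Step 1: drop columns whose share of missing values reaches the threshold.
    missing_share = df.isna().mean()
    df = df.loc[:, missing_share < col_threshold].copy()

    # Step 2a: fill numeric "engV" with the median "engV" of the same "car".
    if {"car", "engV"} <= set(df.columns):
        group_median = df.groupby("car")["engV"].transform("median")
        df["engV"] = df["engV"].fillna(group_median)

    # Step 2b: fill "drive" with the most frequent "drive" of the same "model".
    if {"model", "drive"} <= set(df.columns):
        def most_frequent(values):
            modes = values.mode()  # empty if the group has no known value
            return modes.iloc[0] if not modes.empty else np.nan

        mode_by_model = df.groupby("model")["drive"].agg(most_frequent)
        df["drive"] = df["drive"].fillna(df["model"].map(mode_by_model))

    # Step 3: drop the rows that still contain missing values.
    return df.dropna().reset_index(drop=True)
```

Groups with no known value at all stay missing after step 2, so their rows are removed in step 3.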
Q4: Check for outliers and suggest a solution if necessary.
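A common way to check a numeric column for outliers is the interquartile-range (IQR) rule, sketched below. The choice of this rule, the multiplier k=1.5, and the helper names `iqr_bounds`, `outlier_mask`, and `clip_outliers` are assumptions for illustration; the assignment does not prescribe a method.

```python
import pandas as pd


def iqr_bounds(series, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outlier_mask(series, k=1.5):
    """Boolean mask that is True where a value lies outside the IQR fences."""
    lower, upper = iqr_bounds(series, k)
    return (series < lower) | (series > upper)


def clip_outliers(series, k=1.5):
    """One possible remedy: cap extreme values at the fences instead of dropping them."""
    lower, upper = iqr_bounds(series, k)
    return series.clip(lower, upper)
```

Whether to clip, drop, or keep the flagged rows depends on whether the extreme values are data-entry errors or genuine observations; a box plot of each numeric column helps make that call.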