Question
Create a file called features.py and implement the following functions.
def train_test_split(X, y, test_size, shuffle, random_state=None):
X, y - features and the target variable.
test_size - between 0 and 1 - how much to allocate to the test set; the rest goes to the train set.
shuffle - if True, shuffle the dataset, otherwise not.
random_state - integer; if None, then results are random, otherwise fixed to a given seed.
Example: X_train, X_test, y_train, y_test = train_test_split(feat_df, y, 0.3, True, 12)
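One possible reading of this specification is sketched below. It assumes X and y are pandas objects of equal length, that the test set size is rounded to the nearest whole row, and that without shuffling the test rows are taken from the end of the data; none of these details are fixed by the assignment text.

```python
import numpy as np
import pandas as pd


def train_test_split(X, y, test_size, shuffle, random_state=None):
    """Split X and y into train and test parts (a sketch, not a reference solution)."""
    n = len(X)
    indices = np.arange(n)
    if shuffle:
        # A seeded generator makes the permutation reproducible;
        # random_state=None draws fresh entropy on every call.
        rng = np.random.default_rng(random_state)
        indices = rng.permutation(n)
    # Assumption: round to the nearest whole number of test rows.
    n_test = int(round(n * test_size))
    train_idx = indices[: n - n_test]
    test_idx = indices[n - n_test:]
    # Positional indexing keeps X and y rows aligned.
    return X.iloc[train_idx], X.iloc[test_idx], y.iloc[train_idx], y.iloc[test_idx]
```

Using one shared index array for both X and y guarantees that each feature row stays paired with its target value.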
create_categories(df, list_columns)
Converts the values in the columns named in list_columns to numerical values. Follow the same approach: "string" -> category -> code. Replace the values in df in-place rather than returning a copy.
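A minimal sketch of the "string" -> category -> code approach, assuming df is a pandas DataFrame. Note that pandas assigns codes in sorted category order and uses -1 for missing values; the sample column names in the usage lines are made up for illustration.

```python
import pandas as pd


def create_categories(df, list_columns):
    """Replace each listed column with its integer category codes, in place."""
    for col in list_columns:
        # string -> category -> code; missing values become -1
        df[col] = df[col].astype("category").cat.codes


# Usage with a hypothetical frame: categories sort as h, t, u.
sample = pd.DataFrame({"Type": ["h", "u", "h", "t"], "Rooms": [2, 3, 2, 1]})
create_categories(sample, ["Type"])
```

The function returns nothing because the assignment asks for the values to be replaced in df itself.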
X, y = preprocess_ver_1(csv_df)
Apply the feature transformation steps to the dataframe and return new X and y for the entire dataset. Do not modify the original csv_df.
Remove all rows with NA values.
Convert datetime to a number.
Convert all strings to numbers.
Split the dataframe into X and y and return these.
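The steps above could be sketched as follows. The assignment does not name the target or date columns, so the keyword arguments target="Price" and date_column="Date" are assumptions, as are the day-first date format and the choice of the proleptic Gregorian ordinal as the numeric form of a date.

```python
import pandas as pd


def preprocess_ver_1(csv_df, target="Price", date_column="Date"):
    """Return (X, y) built from a cleaned copy of csv_df; csv_df is left untouched."""
    # 1. Drop rows with any NA value; copy so later edits never touch csv_df.
    df = csv_df.dropna().copy()

    # 2. Datetime -> number. Assumption: day-first strings such as "3/09/2016".
    dates = pd.to_datetime(df[date_column], dayfirst=True)
    df[date_column] = dates.map(lambda ts: ts.toordinal())

    # 3. Strings -> numbers via category codes.
    for col in df.columns:
        is_text = pd.api.types.is_object_dtype(df[col]) or pd.api.types.is_string_dtype(df[col])
        if is_text:
            df[col] = df[col].astype("category").cat.codes

    # 4. Split into features and target (assumed target column name).
    X = df.drop(columns=[target])
    y = df[target]
    return X, y
```

Because the extra arguments have defaults, the call X, y = preprocess_ver_1(csv_df) from the assignment still works unchanged.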
Dataset: download Melbourne_housing_FULL.csv from https://www.kaggle.com/anthonypino/melbourne-housing-market