Question

Create a file called features.py and implement the following functions.

def train_test_split(X, y, test_size, shuffle, random_state=None):

X, y - features and the target variable.
test_size - a number between 0 and 1: the fraction of rows to allocate to the test set; the rest goes to the train set.
shuffle - if True, shuffle the dataset before splitting; otherwise keep the original order.
random_state - an integer or None; if None, the results differ from run to run, otherwise they are fixed by the given seed.

Example: X_train, X_test, y_train, y_test = train_test_split(feat_df, y, 0.3, True, 12)
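One possible way to write this function is sketched below. It is not scikit-learn's implementation and not a verified solution. The question does not say how to round the test size or which rows form the test set, so the sketch assumes the test size is rounded to the nearest integer and the test rows are taken from the end of the (optionally shuffled) order. The helper name take is hypothetical.

```python
import numpy as np
import pandas as pd


def train_test_split(X, y, test_size, shuffle, random_state=None):
    """Split X and y into train and test parts (an illustrative sketch)."""
    if not 0 < test_size < 1:
        raise ValueError("test_size must be strictly between 0 and 1")
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    n_samples = len(X)
    # Assumption: the number of test rows is rounded to the nearest integer.
    n_test = int(round(n_samples * test_size))
    n_train = n_samples - n_test

    indices = np.arange(n_samples)
    if shuffle:
        # random_state=None gives a different permutation on each run;
        # an integer seed makes the permutation repeatable.
        rng = np.random.default_rng(random_state)
        indices = rng.permutation(n_samples)

    # Assumption: the last n_test positions become the test set.
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    def take(data, idx):
        # Positional selection for pandas objects, plain indexing otherwise.
        if isinstance(data, (pd.DataFrame, pd.Series)):
            return data.iloc[idx]
        return np.asarray(data)[idx]

    return take(X, train_idx), take(X, test_idx), take(y, train_idx), take(y, test_idx)
```

Because X and y are indexed with the same positions, each feature row stays aligned with its target value in both parts of the split.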

create_categories(df, list_columns)

Converts the values in the columns named in list_columns to numerical values, in place. Follow the same approach: "string" -> category -> code. Replace the values in df itself; do not return a modified copy.
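The "string" -> category -> code approach can be sketched with the pandas category dtype, as below. This is a minimal sketch, not a verified solution. Treating missing values as code -1 is pandas' default behaviour for category codes, not something the question asks for.

```python
import pandas as pd


def create_categories(df, list_columns):
    """Replace the listed columns of df with integer category codes, in place."""
    for column in list_columns:
        # "string" -> category -> code. Categories are sorted, so codes follow
        # the sorted order of the distinct values; missing values become -1.
        df[column] = df[column].astype("category").cat.codes
```

The function returns nothing because it assigns back into the dataframe that was passed in. For example, a column holding "h", "u", "h", "t" becomes 0, 2, 0, 1.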

X, y = preprocess_ver_1(csv_df)

Apply the feature transformation steps to the dataframe and return new X and y for the entire dataset. Do not modify the original csv_df.

Remove all rows with NA values.
Convert datetime to a number.
Convert all strings to numbers.
Split the dataframe into X and y and return these.
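The four steps above can be sketched as follows. Several details are assumptions, since the question does not state them: the target column is named "Price", the date column is named "Date" and holds day-first strings such as "3/09/2016", and a date is encoded as whole days since 1970-01-01. The keyword parameters target_column and date_column are hypothetical additions that keep the call preprocess_ver_1(csv_df) working.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def preprocess_ver_1(csv_df, target_column="Price", date_column="Date"):
    """Return numeric X and y built from csv_df (an illustrative sketch)."""
    # Step 1: drop rows with NA values. dropna() returns a new dataframe and
    # the extra copy() guarantees the caller's csv_df is never modified.
    df = csv_df.dropna().copy()

    # Step 2: datetime -> number.
    # Assumption: day-first strings, encoded as days since 1970-01-01.
    dates = pd.to_datetime(df[date_column], dayfirst=True)
    df[date_column] = (dates - pd.Timestamp("1970-01-01")) // pd.Timedelta(days=1)

    # Step 3: strings -> category -> integer code for every column that is
    # still non-numeric after the date conversion.
    for column in df.columns:
        if not is_numeric_dtype(df[column]):
            df[column] = df[column].astype("category").cat.codes

    # Step 4: split into features and target.
    y = df[target_column]
    X = df.drop(columns=[target_column])
    return X, y
```

A call would look like X, y = preprocess_ver_1(pd.read_csv("Melbourne_housing_FULL.csv")), assuming the file sits in the working directory.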

Dataset: download Melbourne_housing_FULL.csv from https://www.kaggle.com/anthonypino/melbourne-housing-market
