Question

Create a function, preprocess_data, which performs data preprocessing for a classification task. The function will preprocess the data by performing the following steps:

1. For categorical variables: replace NaN values with the most frequent value; create dummy variables based on levels (all values present) and drop the first one in alphabetical order. Name the new binary columns using this schema: name of categorical variable +-+ level name.
2. For numerical variables: replace NaN values with the median; standardize values by subtracting the mean and dividing by the standard deviation.
3. For the target variable: convert text values into integers, so that the first text value alphabetically is converted to 0 and so on.

The preprocess_data function accepts one argument: dataframe, a pandas DataFrame where target is a classification label and the other variables are explanatory variables.

The function returns a tuple (X, y), where:
X is a pandas DataFrame obtained after preprocessing the numerical and categorical variables and dropping the target column;
y is a list of values of the target variable after preprocessing.

Example

For this sort of data:

| target | married | degree | salary | occupation |
male 01111.5 nurse
female 1112 NaN nurse
female 111312.3 policeman
male 111212.0 fireman
male 101311.5 NaN

the preprocess_data function should return the following tuple:
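
The three steps described above can be sketched with pandas as follows. This is a minimal sketch, not the verified solution from the page, and it rests on several assumptions that the question does not settle: columns with a numeric dtype are treated as numerical and every other column as categorical; the separator in the column-naming schema is taken to be an underscore (the constant LEVEL_SEP is a hypothetical name and can be changed); the mean and standard deviation are computed after median imputation; and the standard deviation is the pandas default (sample standard deviation, ddof=1).

```python
import pandas as pd

# Assumption: the separator in the naming schema is ambiguous in the
# question text, so it is kept in one place and can be changed.
LEVEL_SEP = "_"


def preprocess_data(dataframe):
    """Return (X, y) following the three preprocessing steps in the question."""
    df = dataframe.copy()

    # Step 3: map target labels to integers in alphabetical order (first -> 0).
    target = df.pop("target")
    levels = sorted(target.dropna().unique())
    mapping = {level: code for code, level in enumerate(levels)}
    y = target.map(mapping).tolist()

    # Assumption: numeric dtype means numerical, anything else is categorical.
    numeric_cols = list(df.select_dtypes(include="number").columns)
    categorical_cols = [c for c in df.columns if c not in numeric_cols]

    # Step 2: impute with the median, then standardize.
    for col in numeric_cols:
        filled = df[col].fillna(df[col].median())
        # Series.std() uses ddof=1 (sample standard deviation) by default.
        df[col] = (filled - filled.mean()) / filled.std()

    # Step 1: impute with the most frequent value, then one-hot encode.
    for col in categorical_cols:
        # mode() returns its values sorted, so ties resolve to the first one.
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # get_dummies sorts the levels of a text column, so drop_first removes
    # the alphabetically first level of each categorical variable.
    X = pd.get_dummies(
        df,
        columns=categorical_cols,
        prefix_sep=LEVEL_SEP,
        drop_first=True,
        dtype=int,
    )
    return X, y
```

With a frame whose target column holds "male" and "female", this sketch maps "female" to 0 and "male" to 1, and a categorical column occupation with levels fireman and nurse yields the single column occupation_nurse. Integer-coded columns such as married or degree would be standardized under the dtype assumption; if they are meant to be categorical, they would need to be cast to strings before calling the function.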

