Question
This is a multi-class classification task in which you are expected to predict the response variable.
Given the attributes or features of a client, the task of the predictive system is to measure the level of risk in providing insurance to the client. Risk is categorized into 8 levels.
In this project, the challenge is to predict the response variable for a client given the client's history (training.csv). Note that some of the client information is not available (missing).
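Since some client information is missing, the features need some treatment before most models will accept them. The following is a minimal sketch of one common option, median imputation with pandas. The two tiny frames are made-up stand-ins for training.csv and testing.csv, and the choice of the median is an assumption, not something the assignment prescribes.

```python
import numpy as np
import pandas as pd

# Tiny made-up stand-ins for training.csv and testing.csv (values are illustrative only).
train = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Employment_Info_1": [0.10, np.nan, 0.30, 0.20],
    "Family_Hist_2": [np.nan, 0.40, 0.60, np.nan],
    "Response": [8, 1, 6, 2],
})
test = pd.DataFrame({
    "Id": [5, 6],
    "Employment_Info_1": [np.nan, 0.50],
    "Family_Hist_2": [0.70, np.nan],
})

feature_cols = ["Employment_Info_1", "Family_Hist_2"]

# Share of missing values per column, useful for deciding how to treat each one.
missing_share = train[feature_cols].isna().mean()

# Learn the fill values from the training set only, then reuse them on the test set.
medians = train[feature_cols].median()
train_filled = train.copy()
test_filled = test.copy()
train_filled[feature_cols] = train[feature_cols].fillna(medians)
test_filled[feature_cols] = test[feature_cols].fillna(medians)
```

Computing the medians on the training rows and reusing them for the test rows keeps information from the unseen clients out of the preprocessing step.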
You will be provided with a file (testing.csv) for predicting a response variable for a client.
training.csv: It is a comma-separated training dataset file that contains attributes of a client and the ground-truth response variable (label).
testing.csv: It is a comma-separated testing dataset file that contains attributes of unseen clients for which the response variable has to be predicted.
sample_solution.csv: It is a comma-separated sample solution file that contains the Ids of the clients in the test dataset and the response variable predicted for each of them by your algorithm.
For every test instance in testing.csv, the submission file should contain two columns: Id and Response. Details are available on the Kaggle website.
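The two-column submission layout described above can be sketched as follows. The Ids and predictions are hypothetical placeholders, and the check that Response lies in 1 to 8 assumes the eight risk levels are coded as the integers 1 through 8, which the text here does not state explicitly.

```python
import io
import pandas as pd

# Hypothetical test Ids and predictions; in practice these come from
# testing.csv and from a fitted model.
test_ids = [1, 3, 4]
predicted = [8, 2, 6]

submission = pd.DataFrame({"Id": test_ids, "Response": predicted})

# Sanity checks: one row per test Id, and risk levels restricted to 1..8
# (assumed coding of the eight levels).
assert submission["Id"].is_unique
assert submission["Response"].between(1, 8).all()

# Written to an in-memory buffer here; pass a path such as "submission.csv"
# to to_csv to write an actual file.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing index=False keeps the pandas row index out of the file, so the output has exactly the Id and Response columns.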
You are allowed to use any libraries and packages. You are required to implement at least one algorithm (PCA preferred).
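As an illustration of the PCA requirement, here is a minimal from-scratch sketch using NumPy's singular value decomposition. The function names pca_fit and pca_transform are hypothetical, and the data is synthetic rather than taken from training.csv. It assumes the features are already numeric with no missing values.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column means, principal axes, and explained-variance ratios of X."""
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2 / (X.shape[0] - 1)
    ratio = variance / variance.sum()
    return mean, vt[:n_components], ratio[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal axes learned by pca_fit."""
    return (X - mean) @ components.T

# Synthetic stand-in for a numeric feature matrix: 200 rows, 5 features,
# generated from 2 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X_train = latent @ mixing + 0.05 * rng.normal(size=(200, 5))
X_test = rng.normal(size=(10, 2)) @ mixing

mean, components, ratio = pca_fit(X_train, n_components=2)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)  # reuse training mean and axes
```

The projected matrices Z_train and Z_test can then be passed to any classifier. The test rows are projected with the mean and axes learned from the training rows, so both sets live in the same reduced space.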
DATASET:
In this dataset, you are provided with over a hundred variables describing attributes of life insurance applicants. The task is to predict the "Response" variable for each Id in the test set. "Response" is an ordinal measure of risk that has 8 levels.
File descriptions
training.csv - the training set, contains the Response values
testing.csv - the test set, you must predict the Response variable for all rows in this file
sample_submission.csv - a sample submission file in the correct format
Data fields
Variable - Description
Id - A unique identifier associated with an application.
Product_Info_1-7 - A set of normalized variables relating to the product applied for
Ins_Age - Normalized age of applicant
Ht - Normalized height of applicant
Wt - Normalized weight of applicant
BMI - Normalized BMI of applicant
Employment_Info_1-6 - A set of normalized variables relating to the employment history of the applicant.
InsuredInfo_1-6 - A set of normalized variables providing information about the applicant.
Insurance_History_1-9 - A set of normalized variables relating to the insurance history of the applicant.
Family_Hist_1-5 - A set of normalized variables relating to the family history of the applicant.
Medical_History_1-41 - A set of normalized variables relating to the medical history of the applicant.
Medical_Keyword_1-48 - A set of dummy variables relating to the presence of/absence of a medical keyword being associated with the application.
Response - This is the target variable, an ordinal variable relating to the final decision associated with an application
The following variables are all categorical (nominal):
Product_Info_1, Product_Info_2, Product_Info_3, Product_Info_5, Product_Info_6, Product_Info_7, Employment_Info_2, Employment_Info_3, Employment_Info_5, InsuredInfo_1, InsuredInfo_2, InsuredInfo_3, InsuredInfo_4, InsuredInfo_5, InsuredInfo_6, InsuredInfo_7, Insurance_History_1, Insurance_History_2, Insurance_History_3, Insurance_History_4, Insurance_History_7, Insurance_History_8, Insurance_History_9, Family_Hist_1, Medical_History_2, Medical_History_3, Medical_History_4, Medical_History_5, Medical_History_6, Medical_History_7, Medical_History_8, Medical_History_9, Medical_History_11, Medical_History_12, Medical_History_13, Medical_History_14, Medical_History_16, Medical_History_17, Medical_History_18, Medical_History_19, Medical_History_20, Medical_History_21, Medical_History_22, Medical_History_23, Medical_History_25, Medical_History_26, Medical_History_27, Medical_History_28, Medical_History_29, Medical_History_30, Medical_History_31, Medical_History_33, Medical_History_34, Medical_History_35, Medical_History_36, Medical_History_37, Medical_History_38, Medical_History_39, Medical_History_40, Medical_History_41
The following variables are continuous:
Product_Info_4, Ins_Age, Ht, Wt, BMI, Employment_Info_1, Employment_Info_4, Employment_Info_6, Insurance_History_5, Family_Hist_2, Family_Hist_3, Family_Hist_4, Family_Hist_5
The following variables are discrete:
Medical_History_1, Medical_History_10, Medical_History_15, Medical_History_24, Medical_History_32
Medical_Keyword_1-48 are dummy variables.
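The variable types listed above suggest treating the columns differently during preprocessing. A minimal sketch, assuming one-hot encoding for the nominal columns and leaving continuous and dummy columns as they are; the rows below are made up and use only a few of the listed column names.

```python
import pandas as pd

# Made-up rows using a few column names from the lists above (values are illustrative only).
train = pd.DataFrame({
    "Product_Info_1": [1, 2, 1],            # categorical (nominal)
    "Medical_History_2": [112, 412, 112],   # categorical (nominal)
    "Ins_Age": [0.64, 0.06, 0.03],          # continuous
    "Medical_Keyword_1": [0, 1, 0],         # dummy
})
test = pd.DataFrame({
    "Product_Info_1": [2, 1],
    "Medical_History_2": [112, 3],          # 3 never appears in the training rows
    "Ins_Age": [0.15, 0.42],
    "Medical_Keyword_1": [0, 0],
})

nominal_cols = ["Product_Info_1", "Medical_History_2"]

# One-hot encode only the nominal columns; continuous and dummy columns
# pass through unchanged.
train_encoded = pd.get_dummies(train, columns=nominal_cols, dtype=int)

# Align the test columns to the training layout: categories unseen in training
# are dropped, and categories absent from the test rows are filled with 0.
test_encoded = pd.get_dummies(test, columns=nominal_cols, dtype=int).reindex(
    columns=train_encoded.columns, fill_value=0
)
```

One-hot encoding avoids imposing an artificial order on nominal codes, at the cost of more columns, which is one place where a dimensionality reduction step such as PCA can be useful.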
I am not able to attach the CSV files. Please let me know how to add them.