Question

Task 1:
Import the raw data (CC_Default.csv) into your Jupyter notebook.
1.1 Check if the data is loaded correctly by printing a few observations. Check the total number of observations and variables.
1.2 Provide the descriptive statistics and manipulate data.
a. Check for missing values, if any.
b. Plot the univariate distributions (at least 5 plots).
c. Convert the relevant variables, such as the payment status variables (PAY_0 and PAY_2 through PAY_6) and the customer-related variables, to categorical variables as appropriate.
1.3 Using a few plots, find the variables that are correlated with each other and the variables that might help identify next month's defaulters. The plots should provide insights on the following:
a. The independent variables that should help identify those who will default on next month's credit card payment
b. The relation between dependent and independent variables
c. The correlations among the variables, etc.
1.4 Provide your insights into the variables and their relationship based on your analysis in Task 1.3 in a markdown cell in your Jupyter notebook.
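Tasks 1.1 and 1.2 (loading the file, checking its shape, missing values, descriptive statistics, and the categorical conversion) could be approached with pandas roughly as follows. This is a minimal sketch rather than the expected solution: the helper names `load_data` and `to_categorical` are hypothetical, and `SAMPLE_CSV` holds a few made-up rows with a subset of the columns so the code runs without the real file. In the notebook you would pass the path to CC_Default.csv instead.

```python
import io

import pandas as pd

# Made-up stand-in for CC_Default.csv (NOT real data): four invented rows
# with a subset of the documented columns, one AGE value left blank.
SAMPLE_CSV = """ID,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,BILL_AMT1,PAY_AMT1,default.payment.next.month
1,50000,1,2,1,30,0,0,1000,500,0
2,100000,2,1,2,,2,1,20000,0,1
3,30000,2,3,1,45,-1,-1,0,0,0
4,80000,1,2,2,38,1,2,15000,100,1
"""

# Columns treated as categorical here; this choice is an assumption.
CATEGORICAL_COLUMNS = [
    "SEX", "EDUCATION", "MARRIAGE",
    "PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6",
]


def load_data(source):
    """Read the CSV from a file path or file-like object."""
    return pd.read_csv(source)


def to_categorical(df, columns):
    """Return a copy with the listed columns (if present) cast to 'category'."""
    present = [c for c in columns if c in df.columns]
    return df.astype({c: "category" for c in present})


# In the notebook: df = load_data("CC_Default.csv")
df = load_data(io.StringIO(SAMPLE_CSV))

# 1.1 Print a few observations and the number of observations and variables.
print(df.head())
n_obs, n_vars = df.shape
print(f"{n_obs} observations, {n_vars} variables")

# 1.2 Descriptive statistics.
print(df.describe())

# 1.2a Count missing values per column and show only columns that have any.
missing = df.isna().sum()
print(missing[missing > 0])

# 1.2c Convert the payment status and customer-related codes to categoricals.
df_cat = to_categorical(df, CATEGORICAL_COLUMNS)
print(df_cat.dtypes)
```

For the univariate plots in 1.2b, a call such as `df["AGE"].plot(kind="hist")` in the notebook is one option; it relies on matplotlib being installed.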
Variable descriptions:
ID: A numerical value assigned to each credit card customer
LIMIT_BAL: The remaining credit a customer can use
SEX: 1 = male; 2 = female
EDUCATION: 1 = graduate school; 2 = university; 3 = high school; 4 = others; 5 = unknown; 6 = unknown
MARRIAGE: 0 = unknown; 1 = married; 2 = single; 3 = others
AGE: A customer's age in years
PAY_0: Repayment status in September 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_2: Repayment status in August 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_3: Repayment status in July 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_4: Repayment status in June 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_5: Repayment status in May 2005: 0 or less = paid duly; 1 or greater = payment was delayed
PAY_6: Repayment status in April 2005: 0 or less = paid duly; 1 or greater = payment was delayed
BILL_AMT1: The amount in the bill statement for September 2005, in NT dollars
BILL_AMT2: The amount in the bill statement for August 2005, in NT dollars
BILL_AMT3: The amount in the bill statement for July 2005, in NT dollars
BILL_AMT4: The amount in the bill statement for June 2005, in NT dollars
BILL_AMT5: The amount in the bill statement for May 2005, in NT dollars
BILL_AMT6: The amount in the bill statement for April 2005, in NT dollars
PAY_AMT1: The amount paid in September 2005, in NT dollars
PAY_AMT2: The amount paid in August 2005, in NT dollars
PAY_AMT3: The amount paid in July 2005, in NT dollars
PAY_AMT4: The amount paid in June 2005, in NT dollars
PAY_AMT5: The amount paid in May 2005, in NT dollars
PAY_AMT6: The amount paid in April 2005, in NT dollars
default.payment.next.month: Whether the customer defaulted on the following month's payment: 1 = yes; 0 = no
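With the column names above, the exploration asked for in Task 1.3 could start from two small summaries: the correlation of each numeric variable with default.payment.next.month, and the default rate among customers whose September payment (PAY_0) was delayed versus paid duly. The sketch below is one possible approach under stated assumptions, not the reference answer: the function names are hypothetical, and the `demo` frame contains invented values used only so the code runs on its own.

```python
import pandas as pd

TARGET = "default.payment.next.month"


def correlation_with_target(df, target=TARGET):
    """Pearson correlation of each numeric column with the target,
    ordered by absolute strength (strongest first)."""
    corr = df.select_dtypes("number").corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


def default_rate_by(df, groups, target=TARGET):
    """Share of defaulters (mean of the 0/1 target) within each group."""
    return df.groupby(groups)[target].mean()


# Invented rows for illustration only (NOT taken from CC_Default.csv).
demo = pd.DataFrame({
    "LIMIT_BAL": [50000, 80000, 120000, 20000, 60000, 30000],
    "AGE": [30, 41, 35, 27, 52, 24],
    "PAY_0": [0, 0, -1, 2, 1, 2],
    TARGET: [0, 0, 0, 1, 0, 1],
})

# 1.3c Which variables move together with the target?
corr = correlation_with_target(demo)
print(corr)

# 1.3a/b Per the variable descriptions, PAY_0 >= 1 means the payment was delayed.
delayed = demo["PAY_0"].ge(1).map({True: "delayed", False: "paid duly"})
rates = default_rate_by(demo, delayed.rename("september_status"))
print(rates)

# In the notebook, rates.plot(kind="bar") would chart these rates,
# assuming matplotlib is installed.
```

On the real data, the full matrix from `df.select_dtypes("number").corr()` is also worth plotting as a heatmap, since the BILL_AMT columns describe consecutive months and may be correlated with one another.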
