Question

3. (16 points) In this question we will explore the correlation between the features in the dataset credit_risk_dataset.csv. Load this dataset from shared/data/credit_risk_dataset.csv. More information about the data can be found here: https://www.kaggle.com/datasets/laotse/credit-risk-dataset/data

(a) (2 points) Check whether there are any missing values, i.e. NAs, in the data. For this, explore the dataframe.isna() function.
i. Report the names of the columns that have NAs.
ii. Drop all rows that have NAs.

(b) (2 points) Now we will analyze only a subset of the dataframe. Create a subset of the dataframe containing only the columns person_age, person_income, loan_amnt, loan_percent_income, cb_person_cred_hist_length.

(c) (4 points) Find the correlation between the columns in the data using dataframe.corr(). Pick a pair of covariates and interpret their correlation. Which two predictors are the most highly correlated? The least? Do these correlations make sense in context?

(d) (1 point) Using matplotlib.pyplot, draw a scatter plot with person_income on the X-axis and loan_amnt on the Y-axis.

(e) (3 points) Study the plot from Q.3(d).
i. Do you identify any outliers?
ii. If yes, suggest a transformation of the data that would reduce the influence of those outliers.
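
One possible way to approach parts (a) through (e) is sketched below. It is not the graded solution. It assumes the column names use underscores (person_age, person_income, and so on), as in the Kaggle dataset the question links to. The helper names (columns_with_na, prepare, extreme_pairs, plot_income_vs_loan, log_transform) are made up for illustration, and the log transform in the last helper is only one common choice for damping outliers, not necessarily the answer the question expects.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Assumed column names (underscores restored from the question text).
COLS = [
    "person_age",
    "person_income",
    "loan_amnt",
    "loan_percent_income",
    "cb_person_cred_hist_length",
]


def columns_with_na(df):
    """Part (a)i: names of the columns that contain at least one NA."""
    return df.columns[df.isna().any()].tolist()


def prepare(df):
    """Parts (a)ii and (b): drop rows with any NA, then keep the five columns."""
    return df.dropna()[COLS]


def extreme_pairs(corr):
    """Part (c): the most and least correlated distinct pairs, ranked by |r|."""
    strength = {
        (a, b): abs(corr.loc[a, b]) for a, b in combinations(corr.columns, 2)
    }
    return max(strength, key=strength.get), min(strength, key=strength.get)


def plot_income_vs_loan(df):
    """Part (d): scatter plot of person_income (X) against loan_amnt (Y)."""
    # Imported here so the helpers above can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(df["person_income"], df["loan_amnt"], s=5, alpha=0.4)
    ax.set_xlabel("person_income")
    ax.set_ylabel("loan_amnt")
    return ax


def log_transform(df, columns):
    """Part (e)ii, one option: log(1 + x) compresses very large values."""
    out = df.copy()  # leave the caller's dataframe untouched
    for col in columns:
        out[col] = np.log1p(out[col])
    return out


# Usage, assuming the file exists at the path given in the question:
# df = pd.read_csv("shared/data/credit_risk_dataset.csv")
# print(columns_with_na(df))
# subset = prepare(df)
# corr = subset.corr()
# print(corr)
# print(extreme_pairs(corr))
# plot_income_vs_loan(subset)
# plot_income_vs_loan(log_transform(subset, ["person_income", "loan_amnt"]))
```

Ranking pairs by absolute value treats a strong negative correlation as "highly correlated" too; if the question intends the signed value, replace abs(...) with the raw coefficient. Interpreting whether the correlations make sense in context still has to be done by reading the matrix, since the code only locates the extremes.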
