Question

Using Spark

1. Read the dataset using sqlContext

from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
spark_df = sqlContext.sql("Select * from Washington_State_HDMA_2016_csv")

2. Compute how many floating and string variables this dataset has

num_float = # (help here)
num_string = # (help here)
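One possible approach to step 2 relies on `spark_df.dtypes`, which returns a list of `(column_name, type_string)` pairs. The sketch below counts types from a list of that shape in plain Python, so it can be checked without a Spark session. The sample list is made up for illustration, and treating both `float` and `double` as "floating" is an assumption about what the exercise intends.

```python
def count_float_and_string(dtypes):
    """Count floating-point and string columns in a list of (name, type) pairs."""
    float_types = {"float", "double"}  # assumption: 'double' also counts as floating
    num_float = sum(1 for _, dtype in dtypes if dtype in float_types)
    num_string = sum(1 for _, dtype in dtypes if dtype == "string")
    return num_float, num_string


# Hypothetical sample in the same shape as spark_df.dtypes (types are made up)
sample_dtypes = [
    ("loan_amount_000s", "double"),
    ("applicant_income_000s", "double"),
    ("applicant_sex_name", "string"),
    ("denial_reason_name_1", "string"),
    ("as_of_year", "int"),
]

num_float, num_string = count_float_and_string(sample_dtypes)
print(num_float, num_string)
```

With a live session, the same function could presumably be called as `count_float_and_string(spark_df.dtypes)`.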

3. Create a new column named denied (help)

Assume that if denial_reason_name_1 column is not null, then the loan application is rejected/denied

Create a new column in the dataset - Name the column as denied

Encode the denied column as 0 if denial_reason_name_1 is null, otherwise encode the denied column as 1
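The encoding rule above can be sketched in plain Python on toy rows (the rows are made up, not taken from the dataset). A PySpark counterpart is shown only as a comment; it assumes missing reasons were loaded as nulls rather than empty strings, which would need checking on the real data.

```python
def encode_denied(denial_reason):
    """Return 0 when no denial reason is recorded, otherwise 1."""
    return 0 if denial_reason is None else 1


# Toy rows standing in for the DataFrame (values are made up)
rows = [
    {"denial_reason_name_1": None},
    {"denial_reason_name_1": "Credit history"},
    {"denial_reason_name_1": None},
]
for row in rows:
    row["denied"] = encode_denied(row["denial_reason_name_1"])

# A PySpark counterpart might look like this (not run here):
# from pyspark.sql import functions as F
# spark_df = spark_df.withColumn(
#     "denied",
#     F.when(F.col("denial_reason_name_1").isNull(), 0).otherwise(1),
# )
```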

4. Find the percentage of denied loans (help)

Use the new variable named denied in this analysis

What percentage of loans are denied?

Google the average loan application denial rate in the country. Is this number similar to the US average?
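Because denied is encoded as 0/1, the denial percentage is just the mean of that column times 100. A minimal sketch on a made-up list of flags:

```python
def denied_percentage(denied_flags):
    """Percentage of entries equal to 1 in a list of 0/1 flags."""
    if not denied_flags:
        return 0.0  # avoid dividing by zero on an empty input
    return 100.0 * sum(denied_flags) / len(denied_flags)


# Made-up flags: one denial out of four applications
pct = denied_percentage([0, 1, 0, 0])
print(pct)

# A PySpark counterpart might look like this (not run here):
# from pyspark.sql import functions as F
# spark_df.agg((F.avg("denied") * 100).alias("denied_pct")).show()
```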

5. Compare the income of approved applicants vs rejected applicants (help)

Use applicant_income_000s variable

Calculate the average income for denied = 1 and denied = 0 applicants (you can use groupBy())

What do you think (e.g., do approved applicants make more money)?

If not, this is against our intuition. Why do you think denied applicants make more money?
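The group-wise average can be sketched as follows, in plain Python on made-up rows. Rows with a missing income are skipped, which mirrors how Spark's `avg` ignores nulls; whether dropping them is appropriate for the real data is a judgment call.

```python
from collections import defaultdict


def average_income_by_denied(rows):
    """Average applicant_income_000s per denied value, skipping missing incomes."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        income = row["applicant_income_000s"]
        if income is None:
            continue  # missing income: leave it out of the average
        totals[row["denied"]] += income
        counts[row["denied"]] += 1
    return {key: totals[key] / counts[key] for key in counts}


# Made-up rows for illustration only
rows = [
    {"denied": 0, "applicant_income_000s": 100.0},
    {"denied": 0, "applicant_income_000s": 80.0},
    {"denied": 1, "applicant_income_000s": 60.0},
    {"denied": 1, "applicant_income_000s": None},
]
averages = average_income_by_denied(rows)

# A PySpark counterpart might look like this (not run here):
# from pyspark.sql import functions as F
# spark_df.groupBy("denied").agg(F.avg("applicant_income_000s")).show()
```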

6. Relationship between sex and application status (help)

Investigate if female applicants have higher rejection rate as compared to male applicants

Find the rejection rate for males and females.

For simplicity, define the rejection rate as the number of denied applicants (denied = 1) / the number of approved applicants (denied = 0)

Use applicant_sex_name to determine the sex of the applicant

Any comments?
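Under the definition used in this exercise (denied count divided by approved count, not by the total), a per-group rate could be computed as in this sketch. The rows are made up, and `rejection_rate_by_group` is a hypothetical helper name; because it takes the grouping column as a parameter, it would also work for other categorical columns.

```python
from collections import Counter


def rejection_rate_by_group(rows, group_col):
    """Return denied / approved counts per value of group_col."""
    denied = Counter()
    approved = Counter()
    for row in rows:
        if row["denied"] == 1:
            denied[row[group_col]] += 1
        else:
            approved[row[group_col]] += 1
    rates = {}
    for group in set(denied) | set(approved):
        if approved[group] == 0:
            rates[group] = None  # ratio is undefined with no approved applicants
        else:
            rates[group] = denied[group] / approved[group]
    return rates


# Made-up rows for illustration only
rows = [
    {"applicant_sex_name": "Female", "denied": 1},
    {"applicant_sex_name": "Female", "denied": 0},
    {"applicant_sex_name": "Female", "denied": 0},
    {"applicant_sex_name": "Male", "denied": 1},
    {"applicant_sex_name": "Male", "denied": 0},
    {"applicant_sex_name": "Male", "denied": 0},
    {"applicant_sex_name": "Male", "denied": 0},
    {"applicant_sex_name": "Male", "denied": 0},
]
rates = rejection_rate_by_group(rows, "applicant_sex_name")
```

Note that this ratio can exceed 1 when a group has more denials than approvals, so it is not directly comparable to a percentage of all applications.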

7. Relationship between race and application status (help)

Investigate the relationship between the applicant's race and the loan status.

You can use the denied column you have created and applicant_race_name_1 column

For each race, find the ratio of denied loans

Consider the ratio of denied loans as the number of denied applicants (denied = 1) / the number of approved applicants (denied = 0)

What are your comments? Which race has the highest denied ratio?

8. Check loan_to_income_ratio (help)

Let's dig deeper into the analysis

Let's create a new variable by dividing the loan_amount_000s with applicant_income_000s

Name this variable loan_to_income_ratio

Let's check if the denied loans are the ones with high loan_to_income_ratio.

What are your thoughts?

Hint: logically, we expect denied loans to have a higher loan_to_income_ratio. Is this the case? Include the race variable in the analysis. What do you think about the relationship among the applicant_race_name_1, loan_to_income_ratio, and denied variables?
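A rough sketch of this step: compute loan_to_income_ratio per row, then average it per (race, denied) pair. The rows and the group labels "Group A" and "Group B" are placeholders, not values from the dataset, and skipping rows with a missing or zero income is an assumption about how to handle them.

```python
from collections import defaultdict
from statistics import mean


def loan_to_income_ratio(loan_amount, income):
    """Return loan_amount / income, or None when the ratio is undefined."""
    if loan_amount is None or income is None or income == 0:
        return None
    return loan_amount / income


def mean_ratio_by_race_and_denied(rows):
    """Average loan_to_income_ratio for each (race, denied) pair."""
    grouped = defaultdict(list)
    for row in rows:
        ratio = loan_to_income_ratio(
            row["loan_amount_000s"], row["applicant_income_000s"]
        )
        if ratio is not None:
            grouped[(row["applicant_race_name_1"], row["denied"])].append(ratio)
    return {key: mean(values) for key, values in grouped.items()}


# Placeholder rows; labels and amounts are made up for illustration
rows = [
    {"applicant_race_name_1": "Group A", "denied": 0,
     "loan_amount_000s": 200.0, "applicant_income_000s": 100.0},
    {"applicant_race_name_1": "Group A", "denied": 0,
     "loan_amount_000s": 400.0, "applicant_income_000s": 100.0},
    {"applicant_race_name_1": "Group A", "denied": 1,
     "loan_amount_000s": 300.0, "applicant_income_000s": 60.0},
    {"applicant_race_name_1": "Group B", "denied": 0,
     "loan_amount_000s": 150.0, "applicant_income_000s": 50.0},
    {"applicant_race_name_1": "Group B", "denied": 1,
     "loan_amount_000s": 400.0, "applicant_income_000s": None},
]
mean_ratios = mean_ratio_by_race_and_denied(rows)
```

Comparing the denied = 1 and denied = 0 averages within each race, rather than across the whole dataset, is what lets the race variable enter the analysis.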

9. What is the most common denial reason (help)

Use the denial_reason_name_1 variable

Google the most common mortgage denial reasons. Did you get similar results?
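Finding the most frequent denial reason amounts to counting non-null values and taking the top one. A minimal sketch with a made-up list of reasons:

```python
from collections import Counter


def most_common_denial_reason(reasons):
    """Return (reason, count) for the most frequent non-null reason, or None."""
    counts = Counter(reason for reason in reasons if reason is not None)
    return counts.most_common(1)[0] if counts else None


# Made-up sample; None stands for applications with no denial reason
reasons = [
    "Credit history",
    None,
    "Debt-to-income ratio",
    "Credit history",
    None,
]
top_reason = most_common_denial_reason(reasons)

# A PySpark counterpart might look like this (not run here):
# from pyspark.sql import functions as F
# (spark_df.filter(F.col("denial_reason_name_1").isNotNull())
#          .groupBy("denial_reason_name_1").count()
#          .orderBy(F.desc("count"))
#          .show(5))
```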

10. Give at least 3 more insights (help)

Give us more insights. Use your intuition and do some more analysis to give us more insights about the dataset.

Feel free to experiment

You can use Python visualization tools
