
Question


Using Python, answer these questions with the data attached below.

NBC problem

The file Bank.csv contains data on 5000 customers of Universal Bank. The data include customer demographic information (age, income, etc.), the customer's relationship with the bank (mortgage, securities account, etc.), and the customer response to the last personal loan campaign (Personal Loan). Among these 5000 customers, only 480 (= 9.6%) accepted the personal loan that was offered to them in the earlier campaign. In this exercise, we focus on two predictors: Online (whether or not the customer is an active user of online banking services) and Credit Card (abbreviated CC below) (whether or not the customer holds a credit card issued by the bank), and the outcome Personal Loan (abbreviated Loan below).

Partition the data into training (60%) and validation (40%) sets.
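One possible way to do the 60/40 partition is sketched below. The ten-row frame is a hypothetical stand-in for Bank.csv, and the seed value is arbitrary; neither comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for Bank.csv: 10 rows with the three columns of interest.
# With the real file this would be: bank = pd.read_csv("Bank.csv")
bank = pd.DataFrame({
    "Online":        [1, 0, 1, 1, 0, 1, 0, 1, 0, 1],
    "CreditCard":    [0, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    "Personal Loan": [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
})

# Draw 60% of the rows at random for training; the remaining 40% form validation.
train = bank.sample(frac=0.6, random_state=1)  # random_state is an arbitrary seed
valid = bank.drop(train.index)

print(len(train), len(valid))
```

Fixing the seed makes the split reproducible, so the counts in the later parts stay the same between runs.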

a. Create a pivot table for the training data with Online as the column variable, CC as the row variable, and Loan as a secondary row variable. The values inside the table should convey the count. Use the pandas DataFrame methods melt() and pivot().
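A minimal sketch of one way to combine melt() and pivot() for part (a) follows. The training rows are hypothetical, and the intermediate groupby count is an assumption about how to get from the long form to cell counts, since pivot() itself does not aggregate.

```python
import pandas as pd

# Hypothetical training rows as (CreditCard, Personal Loan, Online) tuples.
rows = ([(1, 1, 1)] * 2 + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 1 + [(0, 1, 1)] * 1
        + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4)
train = pd.DataFrame(rows, columns=["CreditCard", "Personal Loan", "Online"])

# melt(): keep CC and Loan as identifiers, move Online into a long "value" column.
long_form = train.melt(id_vars=["CreditCard", "Personal Loan"], value_vars=["Online"])

# Count the rows for every (CC, Loan, Online) combination that occurs.
counts = (long_form.groupby(["CreditCard", "Personal Loan", "value"])
          .size().reset_index(name="count"))

# pivot(): CC and Loan become the row index, the Online values become the columns.
table = counts.pivot(index=["CreditCard", "Personal Loan"], columns="value", values="count")
table = table.fillna(0).astype(int)  # combinations that never occur get a count of 0
table.columns.name = "Online"

print(table)
```

Passing a list as the index of pivot() needs pandas 1.1 or newer.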

b. Consider the task of classifying a customer who owns a bank credit card and is actively using online banking services. Looking at the pivot table, what is the probability that this customer will accept the loan offer? [This is the probability of loan acceptance (Loan = 1) conditional on having a bank credit card (CC = 1) and being an active user of online banking services (Online = 1).]
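The probability in (b) is the count of acceptors in the CC = 1, Online = 1 cell divided by the total count in that cell. A small sketch on hypothetical training rows (not the real Bank.csv counts):

```python
import pandas as pd

# Hypothetical training rows as (CreditCard, Personal Loan, Online) tuples.
rows = ([(1, 1, 1)] * 2 + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 1 + [(0, 1, 1)] * 1
        + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4)
train = pd.DataFrame(rows, columns=["CreditCard", "Personal Loan", "Online"])

# Keep only customers with CC = 1 and Online = 1.
both = train[(train["CreditCard"] == 1) & (train["Online"] == 1)]

# Loan is coded 0/1, so its mean within this group is the share of acceptors.
p_exact = both["Personal Loan"].mean()

print(len(both), p_exact)
```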

c. Create two separate pivot tables for the training data. One will have Loan (rows) as a function of Online (columns) and the other will have Loan (rows) as a function of CC.
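The two tables in (c) could be built with pivot_table(), as sketched here on hypothetical training rows. The column passed as `values` is only used to count rows, which assumes it has no missing entries.

```python
import pandas as pd

# Hypothetical training rows as (CreditCard, Personal Loan, Online) tuples.
rows = ([(1, 1, 1)] * 2 + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 1 + [(0, 1, 1)] * 1
        + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4)
train = pd.DataFrame(rows, columns=["CreditCard", "Personal Loan", "Online"])

# Loan (rows) by Online (columns); counting CreditCard entries counts the rows.
loan_by_online = train.pivot_table(index="Personal Loan", columns="Online",
                                   values="CreditCard", aggfunc="count", fill_value=0)

# Loan (rows) by CC (columns); counting Online entries counts the rows.
loan_by_cc = train.pivot_table(index="Personal Loan", columns="CreditCard",
                               values="Online", aggfunc="count", fill_value=0)

print(loan_by_online)
print(loan_by_cc)
```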

d. Compute the following quantities [P(A | B) means "the probability of A given B"]:

  1. P(CC = 1 | Loan = 1) (the proportion of credit card holders among the loan acceptors)
  2. P(Online = 1 | Loan = 1)
  3. P(Loan = 1) (the proportion of loan acceptors)
  4. P(CC = 1 | Loan = 0)
  5. P(Online = 1 | Loan = 0)
  6. P(Loan = 0)
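Because all three variables are coded 0/1, each of the six quantities is a column mean within the relevant group. A sketch on hypothetical training rows (the variable names are illustrative only):

```python
import pandas as pd

# Hypothetical training rows as (CreditCard, Personal Loan, Online) tuples.
rows = ([(1, 1, 1)] * 2 + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 1 + [(0, 1, 1)] * 1
        + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4)
train = pd.DataFrame(rows, columns=["CreditCard", "Personal Loan", "Online"])

acceptors = train[train["Personal Loan"] == 1]
rejectors = train[train["Personal Loan"] == 0]

p_cc_given_loan1 = acceptors["CreditCard"].mean()   # 1. P(CC = 1 | Loan = 1)
p_on_given_loan1 = acceptors["Online"].mean()       # 2. P(Online = 1 | Loan = 1)
p_loan1 = train["Personal Loan"].mean()             # 3. P(Loan = 1)
p_cc_given_loan0 = rejectors["CreditCard"].mean()   # 4. P(CC = 1 | Loan = 0)
p_on_given_loan0 = rejectors["Online"].mean()       # 5. P(Online = 1 | Loan = 0)
p_loan0 = 1 - p_loan1                               # 6. P(Loan = 0)

print(p_cc_given_loan1, p_on_given_loan1, p_loan1)
print(p_cc_given_loan0, p_on_given_loan0, p_loan0)
```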

e. Use the quantities computed above to compute the naive Bayes probability

P(Loan = 1 | CC = 1, Online = 1).
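Under the naive Bayes assumption that CC and Online are independent given Loan, the six quantities from (d) combine as in the sketch below. The function name and the input numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def naive_bayes_posterior(p_cc_l1, p_on_l1, p_l1, p_cc_l0, p_on_l0, p_l0):
    """Return P(Loan = 1 | CC = 1, Online = 1) under the naive Bayes assumption."""
    # Numerator: P(CC = 1 | Loan = 1) * P(Online = 1 | Loan = 1) * P(Loan = 1)
    numerator = p_cc_l1 * p_on_l1 * p_l1
    # Denominator: the numerator plus the matching product for Loan = 0
    denominator = numerator + p_cc_l0 * p_on_l0 * p_l0
    return numerator / denominator


# Hypothetical inputs, not values computed from Bank.csv.
p_nb = naive_bayes_posterior(0.5, 0.75, 0.25, 1 / 3, 7 / 12, 0.75)
print(round(p_nb, 4))
```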

f. Compare this value with the one obtained from the pivot table in (b). Which is a more accurate estimate?

g. Which of the entries in this table are needed for computing P(Loan = 1 | CC = 1, Online = 1)? In Python, run naive Bayes on the data. Examine the model output on training data, and find the entry that corresponds to P(Loan = 1 | CC = 1, Online = 1). Compare this to the number you obtained in (e).
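For the model run in (g), one option is scikit-learn's BernoulliNB, which suits 0/1 predictors. This is an assumed choice (the question does not name a class), the training rows are hypothetical, and the very small alpha is there only to make the smoothing negligible so the result is comparable with a hand calculation.

```python
import pandas as pd
from sklearn.naive_bayes import BernoulliNB

# Hypothetical training rows as (CreditCard, Personal Loan, Online) tuples.
rows = ([(1, 1, 1)] * 2 + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 1 + [(0, 1, 1)] * 1
        + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4)
train = pd.DataFrame(rows, columns=["CreditCard", "Personal Loan", "Online"])

X = train[["CreditCard", "Online"]]
y = train["Personal Loan"]

# A tiny alpha keeps Laplace smoothing from noticeably changing the estimates.
model = BernoulliNB(alpha=1e-9)
model.fit(X, y)

# Predicted class probabilities for a customer with CC = 1 and Online = 1.
query = pd.DataFrame({"CreditCard": [1], "Online": [1]})
proba = model.predict_proba(query)

# Pick the column that corresponds to the class Loan = 1.
p_loan1 = proba[0, list(model.classes_).index(1)]
print(round(p_loan1, 4))
```

With the default alpha of 1.0 the model's probability would differ slightly from a hand calculation based on raw proportions, which is worth keeping in mind when comparing with (e).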

Here is the data:

ID,Age,Experience,Income,ZIP Code,Family,CCAvg,Education,Mortgage,Personal Loan,Securities Account,CD Account,Online,CreditCard
1,25,1,49,91107,4,1.6,1,0,0,1,0,0,0
2,45,19,34,90089,3,1.5,1,0,0,1,0,0,0
3,39,15,11,94720,1,1,1,0,0,0,0,0,0
4,35,9,100,94112,1,2.7,2,0,0,0,0,0,0
5,35,8,45,91330,4,1,2,0,0,0,0,0,1
6,37,13,29,92121,4,0.4,2,155,0,0,0,1,0
7,53,27,72,91711,2,1.5,2,0,0,0,0,1,0
8,50,24,22,93943,1,0.3,3,0,0,0,0,0,1
9,35,10,81,90089,3,0.6,2,104,0,0,0,1,0
10,34,9,180,93023,1,8.9,3,0,1,0,0,0,0
11,65,39,105,94710,4,2.4,3,0,0,0,0,0,0
12,29,5,45,90277,3,0.1,2,0,0,0,0,1,0
13,48,23,114,93106,2,3.8,3,0,0,1,0,0,0
14,59,32,40,94920,4,2.5,2,0,0,0,0,1,0
15,67,41,112,91741,1,2,1,0,0,1,0,0,0
16,60,30,22,95054,1,1.5,3,0,0,0,0,1,1
17,38,14,130,95010,4,4.7,3,134,1,0,0,0,0
18,42,18,81,94305,4,2.4,1,0,0,0,0,0,0
19,46,21,193,91604,2,8.1,3,0,1,0,0,0,0
20,55,28,21,94720,1,0.5,2,0,0,1,0,0,1
21,56,31,25,94015,4,0.9,2,111,0,0,0,1,0
22,57,27,63,90095,3,2,3,0,0,0,0,1,0
23,29,5,62,90277,1,1.2,1,260,0,0,0,1,0
24,44,18,43,91320,2,0.7,1,163,0,1,0,0,0
25,36,11,152,95521,2,3.9,1,159,0,0,0,0,1
26,43,19,29,94305,3,0.5,1,97,0,0,0,1,0
27,40,16,83,95064,4,0.2,3,0,0,0,0,0,0
28,46,20,158,90064,1,2.4,1,0,0,0,0,1,1
29,56,30,48,94539,1,2.2,3,0,0,0,0,1,1
30,38,13,119,94104,1,3.3,2,0,1,0,1,1,1
31,59,35,35,93106,1,1.2,3,122,0,0,0,1,0
32,40,16,29,94117,1,2,2,0,0,0,0,1,0
33,53,28,41,94801,2,0.6,3,193,0,0,0,0,0
34,30,6,18,91330,3,0.9,3,0,0,0,0,0,0
35,31,5,50,94035,4,1.8,3,0,0,0,0,1,0
36,48,24,81,92647,3,0.7,1,0,0,0,0,0,0
37,59,35,121,94720,1,2.9,1,0,0,0,0,0,1
38,51,25,71,95814,1,1.4,3,198,0,0,0,0,0
39,42,18,141,94114,3,5,3,0,1,1,1,1,0
40,38,13,80,94115,4,0.7,3,285,0,0,0,1,0
41,57,32,84,92672,3,1.6,3,0,0,1,0,0,0
42,34,9,60,94122,3,2.3,1,0,0,0,0,0,0
43,32,7,132,90019,4,1.1,2,412,1,0,0,1,0
