Question

Your boss, the leader of the Data Science Team at the bank where you work, has decided to extend their vacation for another 4 weeks and has left you in charge. Prior to leaving on vacation, your boss was just starting to work on a report requested by the bank's Vice President of Fraud. The Vice President is concerned about 'New Account Fraud' at the bank and has asked the bank's Data Science Team to answer some questions about new account application events. To answer these questions, your boss has compiled a data set containing relevant historical events. You must now complete the work for your boss by analyzing the data and writing a report to the Vice President.

Using the variant1.csv data file available at https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022?resource=download&select=Variant+I.csv, answer the following.

Select any two different machine learning methods, apply them to the variant1.csv data set, and analyze the results.

QUESTIONS TO ANSWER:

Which two methods did you select?

Why did you select those methods to analyze the data?

Explain your analysis.

What do the methods and data tell you about the bank's new account applicants? Explain.

Which features in the data seem to matter most in your analysis and methods?

There are fraudulent applicant events flagged in the data. Is there anything unique about those events?

Are the fraud applicant events outliers or do they seem to blend in with or look the same as the rest of the bank's legitimate applicants?

Does the bank have multiple clusters of applicants? If yes, explain how each cluster is unique. If no, explain why not.

Is there a chance that the bank might potentially miss applicant events that might actually be fraudulent? Explain your answer.

Within the data, are there certain applicant events that will never be considered fraudulent? If yes, what are the common data features of those applicants? If no, why not?

Do you think the bank could predict future fraudulent events with this data set? Yes or No? Provide a short explanation.

What metrics did you use to evaluate the accuracy of the methods you chose? Please list.

At what specific step (or steps) within the fraud kill cycle could your method be utilized by the bank to help stop potential fraud amongst applicants?

How could your analysis be used by the bank to help detect fraud proactively or reactively?

What are the potential weaknesses in your chosen methods?
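
For the question about evaluation metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1 can be computed from predicted and actual fraud labels. It is not tied to any particular method. The function names and the toy labels are hypothetical, made up purely for illustration, and are not taken from variant1.csv. The sketch assumes fraud is coded as 1 and legitimate as 0, and that fraud events are rare.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with fraud coded as 1."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:  # actual == 1 and predicted == 0: a missed fraud event
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 as a dict (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 10 fraud events among 1000 applications.
y_true = [1] * 10 + [0] * 990

# A "model" that never flags fraud looks accurate but catches nothing.
never_flag = [0] * 1000
baseline = classification_metrics(y_true, never_flag)

# A "model" that catches 8 of 10 frauds at the cost of 20 false alarms.
some_flags = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970
flagged = classification_metrics(y_true, some_flags)

print(baseline)
print(flagged)
```

With labels this imbalanced, the never-flag baseline scores high on accuracy while its recall is zero, which is why precision, recall, and F1 are usually reported alongside accuracy for fraud problems.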
