Question
The goal of this project is to detect fraudulent online payments using machine learning methods. You will apply various supervised classification algorithms to identify fraudulent transactions in a provided dataset. This project will enhance your understanding of data preprocessing, feature engineering, model building, and hyperparameter tuning in the context of a real-world financial application.
You will work with a dataset that contains online payment transaction records. The data set is of rows and columns. Each record has the following columns:
step: integer.
type: Type of transaction (string), e.g. CASH_IN, CASH_OUT, DEBIT, PAYMENT.
amount: The amount of the transaction (float).
nameOrig: The account identifier of the originator of the transaction (string: a letter followed by an integer, e.g. beginning with C).
oldbalanceOrg: The initial balance of the originator before the transaction (float).
newbalanceOrig: The balance of the originator after the transaction (float).
nameDest: The account identifier of the recipient of the transaction (string: a letter followed by an integer, e.g. beginning with M).
oldbalanceDest: The initial balance of the recipient before the transaction (float).
newbalanceDest: The balance of the recipient after the transaction (float).
isFraud: Binary indicator of whether the transaction is fraudulent.
isFlaggedFraud: Binary indicator of whether the system flagged the transaction as fraudulent.
Begin by importing the dataset into your Python environment and handling any missing values appropriately. Proceed to feature engineering, creating new features that may be useful for fraud detection. Normalize the numerical features to ensure all values are within a similar range.
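A minimal sketch of the preprocessing step, under stated assumptions. The toy frame built by `make_toy_transactions` is a hypothetical stand-in for the Kaggle CSV (in practice it would come from `pd.read_csv` on the downloaded file). The engineered features `errorOrig` and `errorDest`, the median fill, and the choice of `StandardScaler` are illustrative choices, not requirements of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

NUM_COLS = ["amount", "oldbalanceOrg", "newbalanceOrig",
            "oldbalanceDest", "newbalanceDest"]

def make_toy_transactions(n=200, seed=0):
    """Hypothetical stand-in for the real CSV, using the column names listed above."""
    rng = np.random.default_rng(seed)
    amount = rng.uniform(1, 10_000, n).round(2)
    old_org = rng.uniform(0, 20_000, n).round(2)
    old_dest = rng.uniform(0, 20_000, n).round(2)
    df = pd.DataFrame({
        "step": rng.integers(1, 50, n),
        "type": rng.choice(["CASH_IN", "CASH_OUT", "DEBIT", "PAYMENT"], n),
        "amount": amount,
        "oldbalanceOrg": old_org,
        "newbalanceOrig": np.clip(old_org - amount, 0, None),
        "oldbalanceDest": old_dest,
        "newbalanceDest": old_dest + amount,
        "isFraud": (rng.random(n) < 0.1).astype(int),
    })
    df.loc[::25, "amount"] = np.nan  # inject gaps to exercise the missing-value step
    return df

def preprocess(df):
    df = df.copy()
    # Missing values: fill numeric columns with the column median (one option of several).
    df[NUM_COLS] = df[NUM_COLS].fillna(df[NUM_COLS].median())
    # Engineered features (assumed, not from the assignment): how far the
    # recorded balances are from what the transaction amount would imply.
    df["errorOrig"] = df["oldbalanceOrg"] - df["amount"] - df["newbalanceOrig"]
    df["errorDest"] = df["oldbalanceDest"] + df["amount"] - df["newbalanceDest"]
    # One-hot encode the transaction type.
    df = pd.get_dummies(df, columns=["type"], prefix="type", dtype=int)
    # Standardize numeric features to zero mean and unit variance.
    scale_cols = NUM_COLS + ["errorOrig", "errorDest"]
    df[scale_cols] = StandardScaler().fit_transform(df[scale_cols])
    return df

clean = preprocess(make_toy_transactions())
print(clean.head())
```

For brevity the scaler here is fitted on the whole frame. In a full workflow it would normally be fitted on the training split only, so that test-set statistics do not leak into training.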
Next, conduct an exploratory data analysis (EDA). Generate summary statistics for the dataset and create visualizations to understand the distribution of the data, identify patterns, and detect any anomalies.
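The EDA step could start along these lines. This is a sketch on synthetic data: the `toy` frame and the helper names `eda_summary` and `plot_amount_distribution` are hypothetical, and the log scale for `amount` is an assumption that transaction amounts are heavily right-skewed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real transactions (assumption, not real data).
rng = np.random.default_rng(1)
n = 500
toy = pd.DataFrame({
    "type": rng.choice(["CASH_IN", "CASH_OUT", "DEBIT", "PAYMENT"], n),
    "amount": rng.lognormal(mean=7, sigma=1.5, size=n),
    "isFraud": (rng.random(n) < 0.05).astype(int),
})

def eda_summary(df):
    """Return summary statistics, class balance, and fraud rate per type."""
    summary = df.describe()                                   # numeric columns only
    class_balance = df["isFraud"].value_counts(normalize=True)
    fraud_by_type = df.groupby("type")["isFraud"].agg(["count", "mean"])
    return summary, class_balance, fraud_by_type

def plot_amount_distribution(df):
    """Overlay histograms of log(1 + amount) for fraud and non-fraud rows."""
    fig, ax = plt.subplots(figsize=(6, 4))
    for label, grp in df.groupby("isFraud"):
        ax.hist(np.log1p(grp["amount"]), bins=30, alpha=0.6,
                label=f"isFraud={label}")
    ax.set_xlabel("log(1 + amount)")
    ax.set_ylabel("transactions")
    ax.legend()
    return fig

summary, class_balance, fraud_by_type = eda_summary(toy)
print(class_balance)
print(fraud_by_type)
fig = plot_amount_distribution(toy)
```

The class balance is worth checking first: if fraud is rare, accuracy alone says little, which affects how the later evaluation should be read.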
For model building, split the dataset into training and testing sets. Implement the following machine learning algorithms: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and a hybrid algorithm of your choice. Initially, run each model with default parameters to establish a baseline performance.
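A baseline run might look like the sketch below. It uses `make_classification` to generate an imbalanced synthetic problem in place of the real data (an assumption), and wraps LR and SVM in a scaling pipeline; every estimator keeps its scikit-learn defaults apart from `random_state`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the preprocessed fraud data.
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# Stratify so both splits keep roughly the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

baselines = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression()),
    "RF": RandomForestClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "GBM": GradientBoostingClassifier(random_state=0),
}

baseline_scores = {}
for name, model in baselines.items():
    model.fit(X_train, y_train)
    baseline_scores[name] = model.score(X_test, y_test)  # test-set accuracy

for name, score in baseline_scores.items():
    print(f"{name}: accuracy={score:.3f}")
```

Kernel SVMs scale poorly with the number of samples, so on a large transaction table it may be necessary to train the SVM on a subsample or switch to a linear variant such as `LinearSVC`.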
After establishing baseline models, proceed with hyperparameter tuning, using Grid Search, Random Search, or any other technique you prefer, to find the best parameters for each algorithm. Evaluate the performance of each model using a confusion matrix and calculate the Accuracy, Precision, Recall, and F-score. Additionally, plot the ROC curves and calculate the AUC for each model to evaluate the trade-off between the true positive rate and the false positive rate.
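As an illustration of tuning and evaluation for one model, the sketch below runs a small grid search over a Random Forest and computes the requested metrics. The synthetic data, the parameter grid, and the use of F1 as both the search criterion and the reported F-score are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Deliberately tiny grid; a real search would cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="f1", cv=3)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)               # hard labels from the refit best model
y_score = search.predict_proba(X_test)[:, 1]  # fraud-class probability for AUC

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_test, y_pred)

# precision = TP / (TP + FP), recall = TP / (TP + FN),
# F1 = 2 * precision * recall / (precision + recall)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision,
    "recall": recall,
    "f1": f1,
    "auc": roc_auc_score(y_test, y_score),
}
print("best params:", search.best_params_)
print(cm)
print(metrics)
```

The same pattern applies to the other algorithms by swapping the estimator and its grid; `RandomizedSearchCV` takes the place of `GridSearchCV` when the grid is too large to enumerate.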
Note: Enhance your model by identifying and visualizing the most important features for fraud detection. Explore and implement ensemble methods to combine multiple models for improved performance. Propose and implement any additional enhancements or optimizations, such as incorporating domain-specific knowledge or using advanced feature selection methods. Additionally, attempt to create a hybrid algorithm to further improve detection accuracy.
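One possible reading of the feature-importance and hybrid requirements is sketched below: impurity-based importances from a Random Forest, and a soft-voting ensemble of LR, RF, and GBM standing in for the "hybrid algorithm". The assignment does not define what hybrid means, so this choice is an assumption, as are the synthetic data and the placeholder feature names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # placeholder names
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Impurity-based importances from a fitted forest, sorted high to low.
rf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
ranking = sorted(zip(feature_names, rf.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")

# Soft voting averages the predicted class probabilities of its members.
hybrid = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression())),
        ("rf", RandomForestClassifier(random_state=0)),
        ("gbm", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
hybrid.fit(X_train, y_train)
hybrid_auc = roc_auc_score(y_test, hybrid.predict_proba(X_test)[:, 1])
print(f"hybrid AUC: {hybrid_auc:.3f}")
```

The sorted `ranking` list can be passed straight to a horizontal bar chart for the visualization. Stacking (`StackingClassifier`) is another ensemble option if a learned combination is preferred over simple averaging.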
Document your entire process, including data preprocessing steps, EDA findings, model building, tuning, and evaluation results. Analyze which model performed the best and explain why, discussing any challenges encountered and how you addressed them. Prepare a presentation summarizing your findings, model performance, and any proposed enhancements.
Submit the Python code (Jupyter notebooks or .py files) used for data preprocessing, EDA, model building, and evaluation.
Submit the graph you obtained from each algorithm, and also a combined graph showing the comparison.
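A combined comparison graph could be produced by drawing every model's ROC curve on one set of axes, as in this sketch. The helper `plot_roc_comparison` is a hypothetical name, the data is synthetic, and only three models are shown to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def plot_roc_comparison(models, X_test, y_test):
    """Plot one ROC curve per fitted model and return (figure, AUC per model)."""
    fig, ax = plt.subplots(figsize=(6, 5))
    aucs = {}
    for name, model in models.items():
        # Use probabilities when available, otherwise the decision function
        # (a default SVC exposes only the latter).
        if hasattr(model, "predict_proba"):
            score = model.predict_proba(X_test)[:, 1]
        else:
            score = model.decision_function(X_test)
        fpr, tpr, _ = roc_curve(y_test, score)
        aucs[name] = auc(fpr, tpr)
        ax.plot(fpr, tpr, label=f"{name} (AUC={aucs[name]:.3f})")
    ax.plot([0, 1], [0, 1], linestyle="--", color="grey", label="chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_title("ROC comparison")
    ax.legend(loc="lower right")
    return fig, aucs

X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

fitted = {
    "LR": LogisticRegression(max_iter=1000).fit(X_train, y_train),
    "RF": RandomForestClassifier(random_state=0).fit(X_train, y_train),
    "SVM": SVC().fit(X_train, y_train),
}
fig, aucs = plot_roc_comparison(fitted, X_test, y_test)
fig.savefig("roc_comparison.png")
```

Calling the same helper with a single-entry dictionary gives the per-algorithm graph.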
Tip: the dataset is available on Kaggle under the name "Online Payments Fraud Detection Dataset" (described there as "Online payment fraud big dataset for testing and practice purpose").
Note: please do not submit ChatGPT-generated code or already existing code. Write your own code, as an enhanced version.