Project 2
Data Analysis and Visualisation of Malicious Credit Card Transaction
Worth: 15% of the unit
Submission: (1) your code and (2) your data analysis and visualisation report on the quiz server.
Deadline: th May pm
Late submissions: late submissions attract a raw penalty per day, up to days (i.e. st May pm).
After that, the mark will be zero. Also, any plagiarised work will be marked zero.
Outline
In this project, we will continue from our Project 1, where we implemented a malicious credit card transaction
detection system. But instead of implementing the features which we completed in Project 1, we will now
focus on data analysis and visualisation skills to better present what our datasets contain. For this project, you
will be given a dataset, CreditCardProject.csv, that contains credit card transactions that are
already labelled normal or malicious. Your task is to perform the following steps (more details in the Tasks
section):
Data analysis
Data visualisation
Write data analysis and visualisation report
(Bonus) Use machine learning to implement detection
Note: This is an individual project, so please refrain from sharing your code or files with others. However,
you can have high-level discussions about the syntax of the formula or the use of modules with other examples.
Please note that if it is discovered that you have submitted work that is not your own, you may face penalties. It
is also important to keep in mind that ChatGPT and other similar tools are limited in their ability to generate
outputs, and it is easy to detect if you use their outputs without understanding the underlying principles. The
main goal of this project is to demonstrate your understanding of programming principles and how they can be
applied in practical contexts.
Note: You do not necessarily have to complete Project 1 to do this project, as it is more about data analysis and
visualisation of the datasets you are given.
Tasks
To begin, you need to define a main(filename, filtervalue, typeofcard) function that will
read the dataset, store the transaction records in data, and call the functions below to display appropriate
results.
Sample Input:
main('CreditCardProject.csv', 'Port Lincoln', 'ANZ')
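The loading step described above could be sketched as follows. This is only an illustration under stated assumptions: the column names `Actual`, `Type_of_card` and `Amount` are hypothetical (the real header row is not shown here), `load_records` is a helper invented for this example, and `main` takes an open file object rather than a filename so the sketch can run on an in-memory sample.

```python
import csv
import io

import numpy as np


def load_records(source):
    """Read CSV text from an open file object into a dict of NumPy columns."""
    reader = csv.DictReader(source)
    rows = list(reader)
    return {name: np.array([row[name] for row in rows]) for name in reader.fieldnames}


def main(source, filtervalue, typeofcard):
    """Load the records and report how many rows match each filter."""
    data = load_records(source)
    in_area = data["Actual"] == filtervalue          # hypothetical column name
    with_card = data["Type_of_card"] == typeofcard   # hypothetical column name
    print(int(in_area.sum()), int(with_card.sum()))
    return data


# Tiny in-memory sample standing in for the real dataset (made-up rows).
sample = io.StringIO(
    "Actual,Type_of_card,Amount\n"
    "Port Lincoln,ANZ,120.5\n"
    "Adelaide,ANZ,40.0\n"
)
data = main(sample, "Port Lincoln", "ANZ")
```

With a real file, the same function could be called inside `with open(filename, newline="") as f:`. All columns are kept as strings here, so numeric columns need `.astype(float)` before any statistics.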
Task : Data Analysis using NumPy Mark:
Answer the following NumPy related tasks for data analysis. These will require use of NumPy functions and
methods, matrix manipulations, vectorized computations, NumPy statistics, NumPy where function, etc. To
complete this task, write a function called task(data, filtervalue, typeofcard), where
CITS
Computer Analysis
and Visualisation
data contains all records from the dataset, filtervalue is an area name, and typeofcard is the
name of the card provider. The function should return a list containing values from the following questions.
Return all results rounded to two decimal places.
Input:
cosdist, var, median, corr, pca = task(data, 'Port Lincoln', 'ANZ')
output:
i. cosdist: Calculate the cosine distance between normal and malicious transactions based on
IPvalidityscore.
Formula:
Output:
print(cosdist)
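A minimal sketch of the cosine distance step, assuming the usual definition 1 - (a . b) / (||a|| ||b||) and assuming the two score vectors have the same length (the brief does not say how unequal lengths should be handled). The function name `cosine_distance` and the sample scores are made up for illustration.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors, rounded to 2 places."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Cosine similarity is the dot product divided by the product of the norms.
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return round(float(1.0 - similarity), 2)


# Made-up scores standing in for the normal and malicious groups.
normal_scores = [0.9, 0.8, 0.7]
malicious_scores = [0.1, 0.2, 0.3]
cosdist = cosine_distance(normal_scores, malicious_scores)
print(cosdist)
```

Orthogonal vectors give a distance of 1, and vectors pointing in the same direction give 0, whatever their magnitudes.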
ii. var: Filter transactions based on a certain geographical area (e.g. Port Lincoln) and calculate and display
the variance of the transaction amount for that area. Note: use the Actual area column from the
dataset. Use the sample variance formula for the calculation.
Output:
print(var)
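The filter-then-variance step can be sketched with a boolean mask and `np.var` with `ddof=1`, which divides by n - 1 and so gives the sample variance. The function name `area_variance` and the sample rows are assumptions for illustration only.

```python
import numpy as np


def area_variance(areas, amounts, filtervalue):
    """Sample variance of the amounts whose area equals filtervalue."""
    areas = np.asarray(areas)
    amounts = np.asarray(amounts, dtype=float)
    selected = amounts[areas == filtervalue]  # boolean-mask filter
    # ddof=1 switches np.var from population to sample variance.
    return round(float(np.var(selected, ddof=1)), 2)


# Made-up rows: three transactions in the target area, one elsewhere.
areas = ["Port Lincoln", "Adelaide", "Port Lincoln", "Port Lincoln"]
amounts = [1.0, 100.0, 2.0, 3.0]
print(area_variance(areas, amounts, "Port Lincoln"))
```

Note that `np.var` defaults to `ddof=0`, so leaving the argument out would silently compute the population variance instead.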
iii. median: Filter data based on Typeofcard and then calculate the median of the
Authenticationscore value for transactions that are in the lower th (inclusive) and upper
th (inclusive) percentile.
Output:
print(median)
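One possible reading of this step is: filter by card type, compute two percentile cut-offs, keep the scores between them (both ends inclusive), and take the median. The sketch below follows that reading. The percentile values are passed in as the hypothetical parameters `lower_pct` and `upper_pct` because the actual numbers are not given above, and the function name is made up.

```python
import numpy as np


def median_within_percentiles(cards, scores, typeofcard, lower_pct, upper_pct):
    """Median of the scores lying between two percentile bounds (inclusive)."""
    cards = np.asarray(cards)
    scores = np.asarray(scores, dtype=float)[cards == typeofcard]
    low, high = np.percentile(scores, [lower_pct, upper_pct])
    kept = scores[(scores >= low) & (scores <= high)]  # inclusive on both ends
    return round(float(np.median(kept)), 2)


# Made-up data: ten ANZ scores plus one row from another provider.
cards = ["ANZ"] * 10 + ["Other"]
scores = list(range(1, 11)) + [100]
print(median_within_percentiles(cards, scores, "ANZ", 25, 75))
```

The 25 and 75 in the usage line are placeholders chosen for the demonstration, not the values required by the task.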
iv. corr: Filter malicious transactions where the Actual and Origin places are different. Calculate the
element-wise product between Authenticationscore and IPvalidationscore and then
compute the correlation between the resultant vector and the Amount column.
Output:
print(corr)
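A rough sketch of the correlation step, assuming Pearson correlation via `np.corrcoef` and assuming the malicious label is already available as a boolean array (how the dataset encodes the label is not shown here). The function and argument names are hypothetical.

```python
import numpy as np


def product_amount_correlation(auth, ip, amount, actual, origin, is_malicious):
    """Correlate (auth * ip) with amount over malicious rows where places differ."""
    auth = np.asarray(auth, dtype=float)
    ip = np.asarray(ip, dtype=float)
    amount = np.asarray(amount, dtype=float)
    mask = np.asarray(is_malicious, dtype=bool) & (np.asarray(actual) != np.asarray(origin))
    product = auth[mask] * ip[mask]  # element-wise product of the two scores
    # np.corrcoef returns a 2 x 2 matrix; the off-diagonal entry is the coefficient.
    return round(float(np.corrcoef(product, amount[mask])[0, 1]), 2)


# Made-up rows: the last one is dropped because Actual equals Origin.
auth = [1, 2, 3, 4, 5]
ip = [1, 1, 1, 1, 1]
amount = [2, 4, 6, 8, 0]
actual = ["A", "B", "C", "D", "E"]
origin = ["X", "X", "X", "X", "E"]
is_malicious = [True, True, True, True, True]
print(product_amount_correlation(auth, ip, amount, actual, origin, is_malicious))
```

In this toy data the product is exactly proportional to the amount on the kept rows, so the coefficient is 1.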
v. pca: Create an N x 5 matrix, where N is the number of rows in the dataset and 5 is the number of
columns; we will call these features (Transactiontype, Entrymode, Amount,
Authenticationscore, and IPvalidityscore). Before that, you need to convert all
string values to numerical values. You can assume there will always be three Transactiontype values
(ATM: , EFTPOS: , and Internet: ) and four Entrymode values (Magnetic Stripe: , Manual: ,
Chip Card Read: , and NFC: ).
Calculate principal component analysis (PCA) to reduce the dimensionality of the data to N x
The algorithm for PCA is:
a Standardize the data along all the features subtract mean and divid
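Only the standardization step of the algorithm is visible above. The sketch below shows one common way to carry PCA through, and every step after standardization (covariance matrix, eigendecomposition, projection onto the leading eigenvectors) is an assumption rather than the brief's own procedure. It also assumes the feature matrix is already numeric, uses the population standard deviation, and takes the target dimension as the hypothetical parameter `n_components`.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project standardized features onto the top principal components."""
    X = np.asarray(features, dtype=float)
    # Standardize each column: subtract its mean, divide by its standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Covariance between the columns of the standardized data.
    cov = np.cov(Z, rowvar=False)
    # eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:n_components]  # largest first
    return np.round(Z @ eigenvectors[:, order], 2)


# Random numbers standing in for the five encoded feature columns.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 5))
reduced = pca_reduce(features, 2)
print(reduced.shape)
```

The sign of each component is arbitrary, so two correct implementations can return columns that differ by a factor of -1. A column with zero variance would cause a division by zero in the standardization line and would need separate handling.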