
Question

To begin, define a main(filename, filter_value, type_of_card) function that reads
the dataset, stores the transaction records in data, and calls the functions below to display the
appropriate results.
Sample Input:
main('CreditCard_2024_Project2.csv', 'Port Lincoln', 'ANZ')
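The reading step of main could be sketched as below. This is only one possible approach, not the expected solution: the helper names load_records and column are hypothetical, and the sketch assumes the file has a header row and that every row has the same number of fields.

```python
import csv

import numpy as np


def load_records(filename):
    """Read a CSV file into a header list and a 2-D NumPy array of strings.

    Hypothetical helper: assumes the first row is a header and that every
    data row has the same number of fields.
    """
    with open(filename, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    return header, np.array(body, dtype=str)


def column(header, data, name, dtype=float):
    """Return one named column of the string array, converted to dtype."""
    return data[:, header.index(name)].astype(dtype)
```

Keeping the records as an array of strings and converting columns on demand lets text columns (area, card type) and numeric columns (amount, scores) live in the same array.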
CITS 2401 Computer Analysis and Visualisation
Task 1: Data Analysis using NumPy (Mark: 15)
Answer the following 5 NumPy-related tasks for data analysis. These will require use of NumPy functions and
methods, matrix manipulations, vectorized computations, NumPy statistics, the NumPy where function, etc. To
complete this task, write a function called task1(data, filter_value, type_of_card), where
data contains all records from the dataset, filter_value is an area name, and type_of_card is the
name of the card provider. The function should return a list containing the values from the following questions.
Return all results rounded to two decimal places.
Input:
cos_dist, var, median, corr, pca = task1(data, 'Port Lincoln', 'ANZ')
Output:
[0.06, 1337142.45, [5.75, 7.21], -0.06, [0.73, 0.81, 0.7, 0.93, 0.72, ...]]
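i. cos_dist: Calculate the cosine distance between normal and malicious transactions based on
IP_validity_score.
Formula (with x and y the IP_validity_score vectors of the normal and malicious transactions):
\[ \text{cos\_dist}(x, y) = 1 - \cos(x, y) \]
\[ \cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \]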
Output:
print(cos_dist)=0.06
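A minimal sketch of the cosine distance calculation, assuming the two score vectors have already been extracted and have the same length. The task text does not say how to align normal and malicious transactions if their counts differ, so that step is left out. The function name cosine_distance is hypothetical.

```python
import numpy as np


def cosine_distance(x, y):
    """Return 1 - cos(x, y), rounded to two decimal places.

    Assumes x and y are equal-length, non-zero numeric vectors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    similarity = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return round(float(1.0 - similarity), 2)
```

Parallel vectors give a distance of 0 and orthogonal vectors give 1, which is a quick sanity check on the sign convention.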
ii. var: Filter transactions by geographical area (e.g. Port Lincoln), then calculate and display
the variance of the transaction amount for that area. Note: use the Actual area column from the
dataset. Use the sample variance formula for the calculation.
Output:
print(var)=1337142.45
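One way to sketch this filter-then-variance step uses a boolean mask and np.var with ddof=1, which divides by n - 1 (the sample variance). The function name area_variance and its array arguments are hypothetical; they stand for the area and amount columns already extracted from data.

```python
import numpy as np


def area_variance(areas, amounts, filter_value):
    """Sample variance of the amounts whose area equals filter_value."""
    areas = np.asarray(areas, dtype=str)
    amounts = np.asarray(amounts, dtype=float)
    selected = amounts[areas == filter_value]  # boolean-mask filter
    # ddof=1 divides by (n - 1), i.e. the sample variance
    return round(float(np.var(selected, ddof=1)), 2)
```

Note that np.var defaults to ddof=0 (population variance), so the ddof argument matters here.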
iii. median: Filter the data by Type_of_card, then calculate the median of the
Authentication_score values for transactions at or below the 25th percentile (inclusive) and for
transactions at or above the 75th percentile (inclusive), returning the two medians as a list.
Output:
print(median)=[5.75,7.21]
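The sketch below shows one reading of this item: compute the 25th and 75th percentiles of the filtered scores, then take the median of each tail with the boundary values included. The function name quartile_medians is hypothetical, and np.percentile is used with its default linear interpolation, which the task text does not specify.

```python
import numpy as np


def quartile_medians(cards, scores, type_of_card):
    """Medians of the scores in the bottom and top quartile tails."""
    cards = np.asarray(cards, dtype=str)
    scores = np.asarray(scores, dtype=float)
    selected = scores[cards == type_of_card]
    q25, q75 = np.percentile(selected, [25, 75])
    lower = selected[selected <= q25]  # inclusive lower tail
    upper = selected[selected >= q75]  # inclusive upper tail
    return [round(float(np.median(lower)), 2),
            round(float(np.median(upper)), 2)]
```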
iv. corr: Filter malicious transactions where the Actual and Origin places are different. Calculate the dot
product between Authentication_score and IP_validity_score, and then compute the
correlation between the resultant vector and the Amount column.
Output:
print(corr)=-0.06
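A hedged sketch of this item follows. Because the task speaks of a "resultant vector", the sketch assumes the product is taken element-wise (a true dot product of two vectors would be a single number); that interpretation is an assumption, not something the task confirms. The function name and the is_malicious argument are hypothetical, since the name of the column that flags malicious transactions is not given.

```python
import numpy as np


def product_amount_correlation(is_malicious, actual, origin, auth, ip, amount):
    """Pearson correlation between auth * ip and amount on the filtered rows.

    is_malicious must already be a boolean (or 0/1) array; converting it
    from the dataset's own label values is left to the caller.
    """
    actual = np.asarray(actual, dtype=str)
    origin = np.asarray(origin, dtype=str)
    mask = np.asarray(is_malicious, dtype=bool) & (actual != origin)
    # Assumption: element-wise product, so the result is a vector
    product = np.asarray(auth, dtype=float)[mask] * np.asarray(ip, dtype=float)[mask]
    amounts = np.asarray(amount, dtype=float)[mask]
    corr = np.corrcoef(product, amounts)[0, 1]
    return round(float(corr), 2)
```

np.corrcoef returns a 2 x 2 correlation matrix, so the off-diagonal entry [0, 1] is the coefficient of interest.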
v. pca: Create an N x 5 matrix, where N is the number of rows in the dataset and 5 is the number of
columns, which we will call features (Transaction_type, Entry_mode, Amount,
Authentication_score, and IP_validity_score). Before that, you need to convert all
string values to numerical values. You can assume there will always be three Transaction_type values,
encoded as ATM: 1, EFTPOS: 2, and Internet: 3, and four Entry_mode values, encoded as Magnetic
Stripe: 1, Manual: 2, Chip Card Read: 3, and NFC: 4. Calculate principal component analysis (PCA)
to reduce the dimensionality of the data to N x 1.
The algorithm for PCA is:
a) Standardize the data along all the features (subtract mean and divide by standard deviation
over the feature dimension).
b) Calculate the covariance matrix for the features
c) Perform eigen decomposition on the covariance matrix to get eigenvectors (principal
components) and eigenvalues
d) Sort the eigenvectors based on their eigenvalues from highest to lowest
e) Select top k eigenvectors (k=1)
f) Transform the data using the selected eigenvectors (dot product of eigenvectors and
Standardized data in step a)
Output:
print(pca.shape)=(10000,1)
print(pca[0:5])=[0.73,0.81,0.7,0.93,0.72]
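Steps a) to f) can be sketched as below, together with the string encoding from the task. This is an illustration under stated assumptions, not the reference solution: the names encode and pca_first_component are hypothetical, the standardization uses NumPy's default population standard deviation (the task does not say which), and np.linalg.eigh is chosen because a covariance matrix is symmetric.

```python
import numpy as np

# Encodings given in the task text
TRANSACTION_TYPE = {"ATM": 1, "EFTPOS": 2, "Internet": 3}
ENTRY_MODE = {"Magnetic Stripe": 1, "Manual": 2, "Chip Card Read": 3, "NFC": 4}


def encode(values, mapping):
    """Map each string label to its numeric code."""
    return np.array([mapping[v] for v in values], dtype=float)


def pca_first_component(features):
    """Project an N x d feature matrix onto its first principal component."""
    X = np.asarray(features, dtype=float)
    # a) standardize each feature (assumes no feature has zero variance)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # b) covariance matrix of the features (columns are variables)
    cov = np.cov(Z, rowvar=False)
    # c) eigen decomposition; eigh suits symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(cov)
    # d) sort eigenvectors by eigenvalue, highest first
    order = np.argsort(eigvals)[::-1]
    # e) keep the top k = 1 eigenvector as a d x 1 column
    top = eigvecs[:, order[:1]]
    # f) transform the standardized data, giving an N x 1 result
    return np.round(Z @ top, 2)
```

An eigenvector is only defined up to sign, so the projected values may come out negated relative to another implementation; the sample output in the task may depend on a particular sign convention.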
