Question
To begin, you need to define a main(filename, filter_value, type_of_card) function that will read the
dataset, store the transaction records in data, and call the functions below to display the appropriate
results.
Sample Input:
main('CreditCardProject.csv', 'Port Lincoln', 'ANZ')
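One possible shape for this entry point is sketched below. It is only an illustration: the helper name load_columns, the choice to store each column as a NumPy array of strings, and the stub task function are assumptions, not part of the assignment text.

```python
import csv
import numpy as np


def load_columns(filename):
    """Read a CSV file and return a dict mapping column name -> NumPy array of strings."""
    with open(filename, newline="") as handle:
        rows = list(csv.DictReader(handle))
    columns = rows[0].keys() if rows else []
    return {name: np.array([row[name] for row in rows]) for name in columns}


def task(data, filter_value, type_of_card):
    """Placeholder (assumption): the real function would return the five results."""
    return []


def main(filename, filter_value, type_of_card):
    # Read the dataset and keep the transaction records in `data`.
    data = load_columns(filename)
    # Call the analysis function and display whatever it returns.
    for value in task(data, filter_value, type_of_card):
        print(value)
    return data
```

Numeric columns loaded this way are still strings, so they would need a conversion such as data["Amount"].astype(float) before any statistics are computed.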
CITS Computer Analysis and Visualisation

Task: Data Analysis using NumPy
Answer the following NumPy-related tasks for data analysis. These will require the use of NumPy functions and
methods, matrix manipulations, vectorized computations, NumPy statistics, the NumPy where function, etc. To
complete this task, write a function called task(data, filter_value, type_of_card), where
data contains all records from the dataset, filter_value is an area name, and type_of_card is the
name of the card provider. The function should return a list containing the values from the following questions.
Return all results rounded to two decimal places.
Input:
cos_dist, var, median, corr, pca = task(data, 'Port Lincoln', 'ANZ')
Output:
i. cos_dist: Calculate the cosine distance between normal and malicious transactions based on
IP_validity_score.
Formula:
Output:
print(cos_dist)
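The formula itself did not survive in this copy of the question, so the sketch below assumes the standard definition, cosine distance = 1 - (a · b) / (||a|| ||b||). It also assumes the two score vectors have already been brought to the same length, since the question does not say how the normal and malicious groups should be aligned. The function name cosine_distance is hypothetical.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance between two equal-length 1-D vectors (standard definition assumed)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Cosine similarity is the dot product divided by the product of the norms.
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return round(float(1.0 - similarity), 2)
```

Orthogonal vectors give a distance of 1, and vectors pointing in the same direction give 0.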
ii. var: Filter transactions based on a certain geographical area (e.g., Port Lincoln), then calculate and display
the variance of the transaction amount for that specific area. Note: use the Actual area column from the
dataset. Use the sample variance formula for the calculation.
Output:
print(var)
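A minimal sketch of this step, assuming the area names and amounts are available as two parallel arrays (the function and parameter names are hypothetical). Passing ddof=1 to np.var selects the sample variance, which divides by n - 1 rather than n.

```python
import numpy as np


def area_variance(areas, amounts, filter_value):
    """Sample variance of the amounts whose area equals filter_value."""
    areas = np.asarray(areas)
    amounts = np.asarray(amounts, dtype=float)
    # Boolean mask keeps only the transactions from the requested area.
    selected = amounts[areas == filter_value]
    # ddof=1 gives the sample variance (denominator n - 1).
    return round(float(np.var(selected, ddof=1)), 2)
```

For amounts 10, 20, and 30 in the selected area, the mean is 20 and the sample variance is (100 + 0 + 100) / 2 = 100.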
iii. median: Filter data based on Type_of_card and then calculate the median of the
Authentication_score values for transactions that lie between the lower __th (inclusive) and upper
__th (inclusive) percentiles.
Output:
print(median)
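The percentile bounds are missing from this copy of the question, so the sketch below takes them as the hypothetical parameters lower_pct and upper_pct rather than guessing values. It assumes the card names and scores are parallel arrays and uses NumPy's default (linear-interpolation) percentile method.

```python
import numpy as np


def median_in_percentile_range(cards, scores, type_of_card, lower_pct, upper_pct):
    """Median of the scores for one card type, restricted to a percentile band."""
    cards = np.asarray(cards)
    scores = np.asarray(scores, dtype=float)
    selected = scores[cards == type_of_card]
    # Percentile cut-offs are computed on the filtered scores only.
    low, high = np.percentile(selected, [lower_pct, upper_pct])
    # Both bounds are inclusive, as the task requires.
    in_range = selected[(selected >= low) & (selected <= high)]
    return round(float(np.median(in_range)), 2)
```

Whether the cut-offs should be computed before or after filtering by card type is not stated in the question; this sketch computes them after filtering.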
iv. corr: Filter malicious transactions where the Actual and Origin places are different. Calculate the dot
product between Authentication_score and IP_validity_score and then compute the
correlation between the resultant vector and the Amount column.
Output:
print(corr)
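A hedged sketch of this step follows. Two things are assumptions. First, because the question refers to a "resultant vector", the product is taken element-wise; a true dot product would collapse to a single number that cannot be correlated with a column. Second, the question does not name the column that marks a transaction as malicious, so it is passed in as the hypothetical boolean array is_malicious.

```python
import numpy as np


def mismatch_correlation(actual, origin, is_malicious, auth_score, ip_score, amount):
    """Correlation between (auth_score * ip_score) and amount for malicious
    transactions whose actual and origin places differ."""
    mask = np.asarray(is_malicious, dtype=bool) & (np.asarray(actual) != np.asarray(origin))
    auth = np.asarray(auth_score, dtype=float)[mask]
    ip = np.asarray(ip_score, dtype=float)[mask]
    amt = np.asarray(amount, dtype=float)[mask]
    # Element-wise product (assumed reading of "dot product" in the task).
    combined = auth * ip
    # np.corrcoef returns a 2 x 2 matrix; the off-diagonal entry is Pearson's r.
    return round(float(np.corrcoef(combined, amt)[0, 1]), 2)
```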
v. pca: Create an N x 5 matrix, where N is the number of rows in the dataset and 5 is the number of
columns. We will call these features Transaction_type, Entry_mode, Amount,
Authentication_score, and IP_validity_score. Before that, you need to convert all
string values to numerical values. You can assume there will always be three Transaction_type values and
use the following values: ATM: __, EFTPOS: __, and Internet: __; and four Entry_mode values: Magnetic
Stripe: __, Manual: __, Chip Card Read: __, and NFC: __. Calculate principal component analysis (PCA)
to reduce the dimensionality of the data to N x __.
The algorithm for PCA is:
a. Standardize the data along all the features (subtract the mean and divide by the standard deviation
over the feature dimension).
b. Calculate the covariance matrix for the features.
c. Perform eigendecomposition on the covariance matrix to get the eigenvectors (principal
components) and eigenvalues.
d. Sort the eigenvectors based on their eigenvalues, from highest to lowest.
e. Select the top k eigenvectors (k = __).
f. Transform the data using the selected eigenvectors (dot product of the eigenvectors and the
standardized data from step a).
Output:
print(pca.shape)
print(pca[:])
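Steps a to f of the PCA algorithm can be sketched as below. This is one reading of the steps, not a reference solution: the function names are hypothetical, k is left as a parameter because its value is missing from this copy, the population standard deviation (NumPy's default) is assumed for standardization, and np.linalg.eigh is used because a covariance matrix is symmetric. The numeric codes for the string categories are also missing, so encode simply applies whatever mapping the caller supplies.

```python
import numpy as np


def encode(values, mapping):
    """Convert string categories to numbers using a caller-supplied mapping."""
    return np.array([mapping[v] for v in values], dtype=float)


def pca_reduce(features, k):
    """Project an N x d feature matrix onto its top-k principal components."""
    x = np.asarray(features, dtype=float)
    # a. Standardize each feature (column): zero mean, unit standard deviation.
    standardized = (x - x.mean(axis=0)) / x.std(axis=0)
    # b. Covariance matrix of the features (columns are the variables).
    cov = np.cov(standardized, rowvar=False)
    # c. Eigendecomposition; eigh suits symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # d. Sort eigenvectors by eigenvalue, highest first.
    order = np.argsort(eigenvalues)[::-1]
    # e. Keep the top k eigenvectors (as columns).
    components = eigenvectors[:, order[:k]]
    # f. Project the standardized data onto the selected components.
    return np.round(standardized @ components, 2)
```

Two caveats apply. A feature with zero variance would cause a division by zero in step a, and the sign of each eigenvector is arbitrary, so a correct result may differ from an expected output by a sign flip in one or more columns.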