Question
Python Assignment.

Function to analyze a NumPy array:

Assume we have an array that contains the term frequencies of a set of documents: each row is a document, each column is a word, and each value is the frequency of that word in that document. Define a function named "analyze_tf" which has two input parameters:

1. a rank 2 input array

2. a parameter "binary" with a default value set to False

It does the following steps in sequence:

a) If "binary" is True, binarizes the input array, i.e. if a value is greater than 1, change it to 1.

b) Normalizes the frequency of each word as: word frequency divided by the length of the document (i.e. sum of each row). Save the result as an array named tf (i.e. term frequency). The sum of each row of tf should be 1.

c) calculates the document frequency (df) of each word, i.e. how many documents contain a specific word

d) calculates the inverse document frequency (idf) of each word as N/df (N divided by df), where N is the number of documents

e) calculates the tf_idf array as tf * log(idf), i.e. tf multiplied by the natural log (base e) of idf. The reason is that a word appearing in most documents has little discriminative power and is often called a "stop" word; taking the inverse of df downweights such words.

returns the tf_idf array.

Note, for all the steps, do not use any loop. Just use array functions and broadcasting for high performance computation.
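
The steps above can be sketched without loops roughly as follows. This is one possible reading of the assignment, not an official solution. It assumes every document contains at least one word and every word occurs in at least one document, because otherwise steps b) and d) divide by zero. The sample array "counts" is made up for illustration.

```python
import numpy as np


def analyze_tf(arr, binary=False):
    """Return a tf-idf array for a rank 2 term-frequency array.

    Assumption (not stated in the assignment): no row and no column of
    `arr` is all zeros, so the divisions below never hit zero.
    """
    arr = np.asarray(arr, dtype=float)

    # a) Binarize: any value greater than 1 becomes 1.
    if binary:
        arr = np.where(arr > 1, 1.0, arr)

    # b) Divide each row by its sum. keepdims=True keeps the shape
    #    (n_docs, 1) so the division broadcasts across the columns.
    tf = arr / arr.sum(axis=1, keepdims=True)

    # c) Document frequency: number of rows with a nonzero entry per column.
    df = np.count_nonzero(arr, axis=0)

    # d) Inverse document frequency: N / df, where N is the number of rows.
    n_docs = arr.shape[0]
    idf = n_docs / df

    # e) tf * natural log of idf; idf (shape n_words) broadcasts over rows.
    tf_idf = tf * np.log(idf)
    return tf_idf


# Hypothetical input: 2 documents, 3 words.
counts = np.array([[2, 0, 1],
                   [1, 1, 0]])

tf_idf = analyze_tf(counts)
tf_idf_binary = analyze_tf(counts, binary=True)
```

In this sample the first word occurs in both documents, so its idf is 1, log(idf) is 0, and its whole tf_idf column is 0. That is the downweighting effect described in step e).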
