Question
In Python 3.6, you can create your own file to use and put each word on a single line.
Write a program that computes the TF-IDF of a file of tweets. TF-IDF is a measure that identifies highly discriminating or important words in a set of documents. (The formulas for TF-IDF are given below.)
Functions:
1) Write a function that accepts a single word as a parameter, sets the word to lower case, and removes the following punctuation: .,->"<'
2) Write a function that is passed two parameters: a word (a str) and a sentence (also a str). The function should return True if the word is in the sentence, and False if it is not.
3) Write a function that is passed text (either a file pointer, an array of sentences, or one large string) and returns a dict in which each key is a word and each value is the count of that word. Make sure there are no empty strings among the keys. Every key should have been passed through the clean function from (1).
4) Write a function that takes a list of tuples generated by the TF-IDF algorithm. Each tuple holds a word and the TF-IDF ranking of that word. The function should return a sorted list, ordered from largest TF-IDF ranking to smallest. Create the list using the steps below.
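One possible sketch of the four helper functions is shown here. The names clean_word, word_in_sentence, count_words, and sort_tfidf are placeholders that do not come from the assignment. The sketch also assumes that "the word is in the sentence" means a whole-word match after cleaning, not a substring match; the assignment does not say which is intended.

```python
PUNCTUATION = '.,->"<\''  # the characters listed in function 1


def clean_word(word):
    """Lowercase a word and remove the listed punctuation characters."""
    word = word.lower()
    for ch in PUNCTUATION:
        word = word.replace(ch, "")
    return word


def word_in_sentence(word, sentence):
    """Return True if the cleaned word matches a cleaned token of the sentence.

    Assumption: whole-word matching on whitespace-separated tokens.
    """
    target = clean_word(word)
    return target in [clean_word(token) for token in sentence.split()]


def count_words(text):
    """Return a dict mapping each cleaned word to its count.

    Accepts one large string, a list of sentences, or an open file
    (iterating over a file pointer yields its lines).
    """
    lines = [text] if isinstance(text, str) else text
    counts = {}
    for line in lines:
        for token in line.split():
            word = clean_word(token)
            if word:  # skip tokens that were only punctuation
                counts[word] = counts.get(word, 0) + 1
    return counts


def sort_tfidf(pairs):
    """Sort (word, ranking) tuples from largest ranking to smallest."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

For example, count_words("a b a -") returns {"a": 2, "b": 1}, because the lone "-" becomes an empty string after cleaning and is dropped.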
Each tweet is on a single line. Each tweet should be processed as a separate document, in terms of usage for TF-IDF. Calculate the TF-IDF ranking of all the words in the entire tweets.txt file. Use the functions listed above to complete this task. To do this:
Read the file.
Find the frequency of every word.
For every word seen in the file, calculate the associated TF-IDF ranking. Save these into a list of tuples.
Sort the list in reverse order by TF-IDF ranking.
Print a report.
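The steps above could be put together roughly as follows. This is a sketch under several assumptions: the names clean and tfidf_report are hypothetical, the logarithm is taken as the natural log (the assignment does not state a base), and the formulas are applied exactly as written in the assignment. A small in-memory list of sample tweets stands in for tweets.txt so the sketch runs without any file.

```python
import math

PUNCTUATION = '.,->"<\''


def clean(word):
    """Lowercase a word and remove the listed punctuation characters."""
    word = word.lower()
    for ch in PUNCTUATION:
        word = word.replace(ch, "")
    return word


def tfidf_report(lines):
    """Return (word, tfidf) tuples sorted from largest ranking to smallest.

    Each non-empty line is treated as one document (one tweet).
    """
    documents = []
    for line in lines:
        words = [w for w in (clean(t) for t in line.split()) if w]
        if words:
            documents.append(words)

    # Frequency of every word across the whole input
    freq = {}
    for words in documents:
        for w in words:
            freq[w] = freq.get(w, 0) + 1

    total_unique = len(freq)
    n_docs = len(documents)

    ranking = []
    for word, count in freq.items():
        docs_with_word = sum(1 for words in documents if word in words)
        tf = count / total_unique                  # TF as given in the assignment
        idf = math.log(docs_with_word / n_docs)    # IDF as given in the assignment
        ranking.append((word, tf * idf))

    ranking.sort(key=lambda pair: pair[1], reverse=True)
    return ranking


# Sample data standing in for the lines of tweets.txt
sample_tweets = ["The cat sat.", "The dog sat.", "The bird flew."]
for word, score in tfidf_report(sample_tweets):
    print(f"{word:10s} {score:.4f}")
```

Because tfidf_report only iterates over its argument, an open file object can be passed in place of the sample list, for instance inside a with open("tweets.txt") as fp: block.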
Formulas:
TF = freq(w) / total unique words
IDF = log(# of documents with word w / # of documents)
TFIDF = TF x IDF
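As a quick check of the arithmetic, the three formulas can be written as one small function. The function name tfidf and its parameter names are made up for illustration, and the natural log is assumed since no base is given. Note that with IDF written this way the ratio never exceeds 1, so every score is zero or negative; the more commonly seen definition inverts the ratio to log(# of documents / # of documents with word w).

```python
import math


def tfidf(freq, total_unique_words, docs_with_word, total_docs):
    """Apply the assignment's formulas as written (natural log assumed)."""
    tf = freq / total_unique_words
    idf = math.log(docs_with_word / total_docs)
    return tf * idf


# A word seen 4 times, 50 unique words overall, present in 2 of 10 tweets:
# TF = 4 / 50 = 0.08, IDF = log(2 / 10), so TFIDF is about -0.1288
score = tfidf(freq=4, total_unique_words=50, docs_with_word=2, total_docs=10)
print(round(score, 4))
```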