Question
Write Python code to count the frequency of hashtags in a Twitter feed.
Your code assumes a twitter feed variable tweets exists, which is a list of strings containing tweets. Each element of this list is a single tweet, stored as a string. For example, tweets may look like:
tweets = [
    "Happy #IlliniFriday!",
    "It is a pretty campus, isn't it #illini?",
    "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight",
    "Are you wearing your Orange and Blue today, #Illini Nation?"
]
Your code should produce a sorted list of tuples stored in hashtagcounts, where each tuple looks like (hashtag, count): hashtag is a string and count is an integer. The list should be sorted by count in descending order; hashtags with identical counts should be sorted alphabetically, in ascending order, by hashtag.
From the above example, our unsorted hashtagcounts might look like:
[('#illini', 2), ('#jointhefight', 1), ('#illinifriday!', 1), ('#illini?', 1)]
The hashtagcounts sorted by the above specifications will look like:
[('#illini', 2), ('#illini?', 1), ('#illinifriday!', 1), ('#jointhefight', 1)]
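The sorting rule can be sketched with a single sort key. This is only an illustration, assuming the counts from the example tweets have already been computed; the name unsorted_counts is hypothetical and not part of the assignment. Negating the count makes the sort descending by count, while the hashtag itself breaks ties in ascending order.

```python
# Hypothetical starting point: counts from the example tweets, in arbitrary order.
unsorted_counts = [
    ("#illini", 2),
    ("#jointhefight", 1),
    ("#illinifriday!", 1),
    ("#illini?", 1),
]

# Key is (-count, hashtag): larger counts come first, and ties fall back
# to plain ascending string comparison of the hashtag.
hashtagcounts = sorted(unsorted_counts, key=lambda pair: (-pair[1], pair[0]))

print(hashtagcounts)
```

Note that '#illini?' sorts before '#illinifriday!' because '?' compares lower than 'f' in Python's string ordering.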
You may use str.split() to split each tweet into a list of words. A hashtag is any word that starts with a hash mark (#). That means the hash mark # should be included in the hashtag value above.
Steps/Hints:
Preprocessing: You will need to convert each hashtag to lower case before you count it. For example, for this question #UIUC and #Uiuc both add to the count of the same hashtag, #uiuc.
Do not further process the tweets or hashtags beyond using split(), such as attempting to remove punctuation. While in the 'real world' you would absolutely do this, in this problem the autograder will be unhappy with you if you do.
If using split(), do not pass any arguments. When no arguments are given, every kind of whitespace is treated as a separator.
You may find it helpful to use an intermediate data structure for this problem to count the frequency of each hashtag.
If you aren't sure how to sort or convert to lowercase, you may find the Python docs on how to sort and the Python docs for string methods useful.
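The hints above can be put together roughly as follows. This is a minimal sketch rather than the official solution: it assumes a plain dict as the intermediate data structure (the name counts is hypothetical), and it defines tweets inline only so the example runs on its own; in the assignment the setup code provides tweets.

```python
# Example feed from the problem statement (normally supplied by the setup code).
tweets = [
    "Happy #IlliniFriday!",
    "It is a pretty campus, isn't it #illini?",
    "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight",
    "Are you wearing your Orange and Blue today, #Illini Nation?",
]

# Intermediate data structure: maps lowercased hashtag -> number of occurrences.
counts = {}
for tweet in tweets:
    # split() with no arguments splits on any run of whitespace.
    for word in tweet.split():
        if word.startswith("#"):
            tag = word.lower()  # no punctuation stripping, per the hints
            counts[tag] = counts.get(tag, 0) + 1

# Count descending, then hashtag ascending for ties.
hashtagcounts = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

print(hashtagcounts)
```

A collections.Counter would work equally well as the intermediate structure; the plain dict is used here only to keep the counting step explicit.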
Optional Practice Plotting with Matplotlib
Try using plt.barh to plot a histogram illustrating the frequency of each word. Try adding x-axis and y-axis labels and a title to your plot.
We won't be grading your plot. However, we will be grading plots in future assignments, so we strongly recommend giving this a shot to make sure you are familiar with matplotlib plots and how to add labels, titles, etc.
The setup code gives the following variable:
Name            Type    Description
tweets          list    a list of strings containing tweets
Your code snippet should define the following variable:
Name            Type    Description
hashtagcounts   list    list of tuples (hashtag, count), where hashtag is a string and count is an integer