Question
Currently working on the basics of Python with Zelle's book and doing some extra exercises. The question is based on chapter 11 and covers list application and dictionary basics.
Question: Write a Python program that reads a text from standard input and subsequently keeps track of all the bigrams in the text.
All bigrams with their frequency must be written to standard output, in order of frequency.
The dictionary you need has bigrams as its key, which in Python can be represented best as a tuple of the two words.
A bigram only counts if the words are on the same line.
The text has already been tokenized:
- Punctuation marks/special characters have already been separated from the words.
- Therefore the punctuation marks themselves will also be considered as words.
- Every line contains exactly 1 sentence.
- Capital letters are irrelevant.
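One way to read these rules, as a minimal sketch: split each line on whitespace and pair every token with its successor, so that no bigram ever spans two lines. The helper names `line_bigrams` and `count_bigrams` are made up for illustration, and the sketch leaves capitalization untouched, which is only one possible reading of "capital letters are irrelevant".

```python
def line_bigrams(line):
    """Return the bigrams of one tokenized line as (word, next_word) tuples."""
    words = line.split()
    # zip pairs each token with its successor; a line with fewer than
    # two tokens yields no bigrams.
    return list(zip(words, words[1:]))


def count_bigrams(lines):
    """Count bigrams line by line, so that no bigram crosses a line break."""
    counts = {}
    for line in lines:
        for bigram in line_bigrams(line):
            # dict.get supplies 0 the first time a bigram is seen.
            counts[bigram] = counts.get(bigram, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical in-memory input; a real program would pass sys.stdin.
    sample = ["This sentence contains 5 bigrams .", "This sentence"]
    print(count_bigrams(sample)[("This", "sentence")])  # prints 2
```

Because `count_bigrams` accepts any iterable of lines, it could be called with `sys.stdin` directly. Punctuation needs no special handling, since the tokenized input already separates it with spaces.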
example.txt:
This sentence contains 5 bigrams .
This sentence
Output:
The command 'cat example.txt | python3 bigrams.py' should show:
This sentence 2
sentence contains 1
contains 5 1
5 bigrams 1
bigrams . 1
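The frequency ordering could be produced along these lines. This is a sketch rather than a reference solution: `format_bigrams` is a hypothetical helper, and it assumes that bigrams with equal counts should stay in the order in which they first appeared. That matches the sample output but is not stated in the exercise. It relies on dicts preserving insertion order (Python 3.7+) and on `sorted` being stable.

```python
def format_bigrams(counts):
    """Return output lines 'word1 word2 count', most frequent bigram first."""
    # sorted() is stable, also with reverse=True, so bigrams with equal
    # counts keep the order in which they were inserted into the dict.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [f"{first} {second} {count}" for (first, second), count in ordered]


if __name__ == "__main__":
    # Hypothetical counts, written out by hand for the example text.
    counts = {
        ("This", "sentence"): 2,
        ("sentence", "contains"): 1,
        ("contains", "5"): 1,
        ("5", "bigrams"): 1,
        ("bigrams", "."): 1,
    }
    for output_line in format_bigrams(counts):
        print(output_line)
```

Sorting `counts.items()` yields `(bigram, count)` pairs, so the tuple key can be unpacked directly when building each output line.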
What I have so far (likely to be completely wrong):
import sys

def main():
    bigrams_list = []  # collects the bigrams from all lines
    for line in sys.stdin:
        words = line.split()
        # pair each word with its successor on the same line
        for i in range(len(words) - 1):
            bigrams_list.append((words[i], words[i + 1]))
    mydict = {}  # maps each bigram tuple to its count
    for bigram in bigrams_list:
        if bigram in mydict:
            mydict[bigram] = mydict[bigram] + 1
        else:
            mydict[bigram] = 1
    print(mydict)

main()
Can anyone help with this?