Question
# Your program should count the number of word occurrences contained in a file and
# output a table showing the top 20 frequently used words in decreasing order of use.

def extract_words(string):
    """
    Returns a list containing each word in the string, ignoring punctuation, numbers, etc.
    DO NOT CHANGE THIS CODE
    """
    l = []
    word = ''
    for c in string+' ':
        if c.isalpha():
            word += c
        else:
            if word != '':
                l.append(word.lower())
            word = ''
    return l

def count_words(filename):
    """Returns a dictionary containing the number of occurrences of each word in the file."""
    # create a dictionary
    # open the file and read the text
    # extract each word in the file
    # count the number of times each word occurs.
    # return the dictionary with the word count.
    return

def report_distribution(count):
    """Creates a string report of the top 20 word occurrences in the dictionary."""
    # create a list containing tuples of count and word,
    # while summing the total number of word occurrences
    # add lines with the title and total word count to the output string
    # sort the list from largest number to smallest,
    # add a line to the output for each word in the top 20 containing count and word
    # return the string containing the report
    return

def main():
    """
    Prints a report with the word count for a file.
    DO NOT CHANGE THIS CODE
    """
    filename = input('filename? ')
    print(report_distribution(count_words(filename)))

if __name__ == '__main__':
    main()
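The comment outline inside count_words can be filled in roughly as follows. This is a minimal sketch rather than a reference solution: it assumes the input is plain text small enough to read in one call, and the utf-8 encoding argument is a guess that the assignment does not specify. extract_words is repeated from the skeleton only so that the block runs on its own.

```python
def extract_words(string):
    """Copied unchanged in behavior from the skeleton so this block is self-contained."""
    l = []
    word = ''
    for c in string + ' ':
        if c.isalpha():
            word += c
        else:
            if word != '':
                l.append(word.lower())
            word = ''
    return l


def count_words(filename):
    """Returns a dictionary mapping each word in the file to its number of occurrences."""
    # create a dictionary
    count = {}
    # open the file and read the whole text at once
    # (assumption: the file fits in memory; the encoding is a guess)
    with open(filename, encoding='utf-8') as f:
        text = f.read()
    # extract each word and count how many times it occurs
    for word in extract_words(text):
        count[word] = count.get(word, 0) + 1
    # return the dictionary with the word count
    return count
```

Using count.get(word, 0) avoids a separate check for words seen for the first time; collections.Counter would do the same job, but a plain dictionary stays closer to the wording of the skeleton.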
Assignment

Your program should count the number of word occurrences contained in a file and output a table showing the top 20 most frequently used words in decreasing order of use. Words with the same number of occurrences should be sorted in reverse alphabetical order.

Example Input

The input file contains one or more lines of input containing words, spaces, and punctuation. For example:

How much wood would a woodchuck chuck if a woodchuck could chuck wood

Note that the extract_words function takes a single string. It is capable of processing the entire file contents as one string, so you do not need to process the input a line at a time.

Example Output

The output shows the top 20 most frequently used words in decreasing order of use. The output starts with a title line as shown in the example below. The second line should show the count of the total number of words found (5 digits, right aligned). Each following line contains two columns separated by a space: the number of occurrences (5 digits, right aligned) and the word (left aligned).

count word
   13
    2 woodchuck
    2 wood
    2 chuck
    2 a
    1 would
    1 much
    1 if
    1 how
    1 could

Note the order of both the counts and the words. Remember that .sort and sorted will compare multiple elements of the items in the list when elements are equal, as we discussed in class. One way to take advantage of this behavior is to order the elements in each item of the list so the sort produces the correct order by comparing first items, then comparing second items if the first are equal, and so on.

Test Data

The following test files are available for your use during development:

woodchuck (This is the full woodchuck tongue twister. It's a bit longer than the sample above.)
mlkdream ("I Have a Dream" speech by Martin Luther King, Jr., 1963)
susanbanthony ("Woman's Rights to the Suffrage" by Susan B. Anthony, 1873)
hamlet (Hamlet by William Shakespeare, written between 1599 and 1602)

Step by Step Solution
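The sorting hint in the assignment can be sketched as follows: if each list item is a (count, word) tuple, a single descending sort orders by count first and breaks ties by word in reverse alphabetical order. This is one possible way to fill in report_distribution, not a confirmed reference solution; joining the lines with newlines and leaving no trailing newline are assumptions, and the sample dictionary is typed in by hand from the example sentence.

```python
def report_distribution(count):
    """Creates a string report of the top 20 word occurrences in the dictionary."""
    # create a list of (count, word) tuples while summing the total
    pairs = []
    total = 0
    for word, n in count.items():
        pairs.append((n, word))
        total += n
    # Tuples compare element by element: counts first, then words.
    # reverse=True puts the largest count first and, for equal counts,
    # the alphabetically later word first.
    pairs.sort(reverse=True)
    # title line, then the total right aligned in 5 columns
    lines = ['count word', f'{total:5d}']
    # one line per word for at most the top 20 entries
    for n, word in pairs[:20]:
        lines.append(f'{n:5d} {word}')
    return '\n'.join(lines)


# Hand-built counts for the example sentence in the assignment.
sample = {'how': 1, 'much': 1, 'wood': 2, 'would': 1, 'a': 2,
          'woodchuck': 2, 'chuck': 2, 'if': 1, 'could': 1}
print(report_distribution(sample))
```

Putting the count first in each tuple is what makes one sort call sufficient. With (word, count) tuples the sort would order by word and the counts would only break ties.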