
Question


Your program should count the number of word occurrences contained in a file and output a table showing the top 20 most frequently used words in decreasing order of use.


# Your program should count the number of word occurrences contained in a file and
# output a table showing the top 20 frequently used words in decreasing order of use.

def extract_words(string):
    """
    Returns a list containing each word in the string, ignoring punctuation, numbers, etc.
    DO NOT CHANGE THIS CODE
    """
    l = []
    word = ''
    for c in string+' ':
        if c.isalpha():
            word += c
        else:
            if word != '':
                l.append(word.lower())
            word = ''
    return l

def count_words(filename):
    """Returns a dictionary containing the number of occurrences of each word in the file."""
    # create a dictionary
    # open the file and read the text
    # extract each word in the file
    # count the number of times each word occurs.
    # return the dictionary with the word count.
    return

def report_distribution(count):
    """Creates a string report of the top 20 word occurrences in the dictionary."""
    # create a list containing tuples of count and word,
    # while summing the total number of word occurrences
    # add lines with the title and total word count to the output string
    # sort the list from largest number to smallest,
    # add a line to the output for each word in the top 20 containing count and word
    # return the string containing the report
    return

def main():
    """
    Prints a report with the word count for a file.
    DO NOT CHANGE THIS CODE
    """
    filename = input('filename? ')
    print(report_distribution(count_words(filename)))

if __name__ == '__main__':
    main()
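One possible way to fill in count_words, following the comments in the skeleton, is sketched below. It is not the only approach: it assumes the file is plain text that Python can read with its default encoding, and it repeats extract_words from the assignment unchanged so the block runs on its own.

```python
def extract_words(string):
    """Copied from the assignment skeleton (unchanged)."""
    l = []
    word = ''
    for c in string+' ':
        if c.isalpha():
            word += c
        else:
            if word != '':
                l.append(word.lower())
            word = ''
    return l


def count_words(filename):
    """Returns a dictionary containing the number of occurrences of each word in the file."""
    # create a dictionary
    counts = {}
    # open the file and read the whole text as one string
    # (assumption: the default text encoding is suitable for the test files)
    with open(filename) as f:
        text = f.read()
    # extract each word and count the number of times it occurs
    for word in extract_words(text):
        counts[word] = counts.get(word, 0) + 1
    # return the dictionary with the word count
    return counts
```

Using counts.get(word, 0) avoids a separate check for words seen for the first time. Because extract_words lowercases every word, "How" and "how" are counted together.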

Assignment

Your program should count the number of word occurrences contained in a file and output a table showing the top 20 most frequently used words in decreasing order of use. Words with the same number of occurrences should be sorted in reverse alphabetical order.

Example Input

The input file contains one or more lines of input containing words, spaces, and punctuation. For example:

    How much wood would a woodchuck chuck if a woodchuck could chuck wood

Note that the extract_words function takes a single string. It is capable of processing the entire file contents as a string, so you do not need to process the input a line at a time.

Example Output

The output shows the top 20 most frequently used words in decreasing order of use. The output starts with a title line, as shown in the example below. The second line should show the count of the total number of words found (5 digits, right aligned). Each remaining line contains two columns separated by a space:

- the number of occurrences (5 digits, right aligned)
- the word (left aligned)

    count word
       13
        2 woodchuck
        2 wood
        2 chuck
        2 a
        1 would
        1 much
        1 if
        1 how
        1 could

Note the order of both the counts and the words. Remember that .sort and sorted will compare multiple elements of the items in the list when elements are equal, as we discussed in class. One way to take advantage of this behavior is to order the elements in each item of the list so the sort produces the correct order by comparing first items, then comparing second items if the first are equal, and so on.

Test Data

The following test files are available for your use during development:

- woodchuck (This is the full woodchuck tongue twister. It's a bit longer than the sample above.)
- mlkdream ("I Have a Dream" speech by Martin Luther King, Jr., 1963)
- susanbanthony ("Woman's Rights to the Suffrage" by Susan B. Anthony, 1873)
- hamlet (Hamlet by William Shakespeare, written between 1599 and 1602)
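The sorting hint can be sketched as follows. This is one possible report_distribution, not the official solution: it stores each entry as a (count, word) tuple so that a single descending sort orders by count first and breaks ties by word in reverse alphabetical order. The sample dictionary is written out by hand from the example sentence, and joining the lines with newlines (rather than ending with one) is a formatting assumption.

```python
def report_distribution(count):
    """Creates a string report of the top 20 word occurrences in the dictionary."""
    # create a list of (count, word) tuples while summing the total
    pairs = []
    total = 0
    for word, n in count.items():
        pairs.append((n, word))
        total += n
    # title line, then the total word count (5 digits, right aligned)
    lines = ['count word', f'{total:5d}']
    # tuples compare element by element: count first, then word,
    # so reverse=True gives largest count first and reverse alphabetical ties
    pairs.sort(reverse=True)
    # one line per word in the top 20: count (5 wide) then the word
    for n, word in pairs[:20]:
        lines.append(f'{n:5d} {word}')
    return '\n'.join(lines)


# Word counts for the example sentence, entered by hand for illustration.
sample = {'how': 1, 'much': 1, 'wood': 2, 'would': 1, 'a': 2,
          'woodchuck': 2, 'chuck': 2, 'if': 1, 'could': 1}
print(report_distribution(sample))
```

For the sample sentence the word "a" also occurs twice, so it is listed after "chuck" among the words with a count of 2. Putting the count first in each tuple is what lets one sort call handle both ordering rules.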
