Question

This needs to be coded in Python 3. Your program should read a file and count both the word frequencies and the letter frequencies. The results should be stored in two global variables. You will need at least 4 functions, one of which is already given:

cleanupLine(line)

- accepts one line and removes characters that are not needed. Unwanted characters are all characters except a-z, A-Z, 0-9 and '. They should be replaced with a space.

Samples:

long-term -> long term

It's amazing, isn't it? -> It's amazing isn't it
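One possible way to implement the cleanup described above is a single regular-expression substitution. This is a minimal sketch, not the official solution; the choice of re.sub is an assumption (the starter code's docstring allows either regex or a loop).

```python
import re

def cleanupLine(line):
    """Replace every character that is not a-z, A-Z, 0-9 or ' with a space."""
    return re.sub(r"[^a-zA-Z0-9']", " ", line)

print(cleanupLine("long-term"))
# Each unwanted character becomes its own space, so runs of spaces can
# appear; split() collapses them when the words are counted later.
print(cleanupLine("It's amazing, isn't it?").split())
```

Because the comma and the question mark are each replaced by a space, the raw result may contain doubled or trailing spaces. That is harmless as long as the word counter splits on whitespace.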

countWords(line)

- For a stripped line, this function counts the words and updates the global variable wordsDict{}. For instance, the line "Hello hello World" should update wordsDict to {'hello' : 2, 'world' : 1}.

Note, we convert upper case letters to lower case
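The word count described above could be sketched as follows. This is one plausible implementation under the assumption that words are separated by whitespace after cleanup; it is not taken from the original answer.

```python
wordsDict = {}

def countWords(line):
    """Count each lower-cased word of a cleaned line in the global wordsDict."""
    global wordsDict
    for word in line.lower().split():
        # dict.get supplies 0 for words that have not been seen yet
        wordsDict[word] = wordsDict.get(word, 0) + 1
    return wordsDict

countWords("Hello hello World")
print(wordsDict)  # {'hello': 2, 'world': 1}
```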

countLetters(line)

- For a stripped line, this function counts the letters and updates the global variable lettersDict{}. For instance, "hell" should update lettersDict to {'h' : 1, 'e' : 1, 'l' : 2}

Note, we convert upper case letters to lower case.

Note2, numbers and ' should be ignored.
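A letter counter matching the description above might look like this. It is a sketch that assumes only the ASCII letters a-z are counted, which follows from the cleanup rule but is not stated explicitly in the task.

```python
lettersDict = {}

def countLetters(line):
    """Count each lower-cased letter of a cleaned line in the global lettersDict."""
    global lettersDict
    for ch in line.lower():
        # Skip digits, apostrophes and spaces: only a-z are counted
        if "a" <= ch <= "z":
            lettersDict[ch] = lettersDict.get(ch, 0) + 1
    return lettersDict

countLetters("hell")
print(lettersDict)  # {'h': 1, 'e': 1, 'l': 2}
```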

Please process the text files on your machine, then write a final function named results() that returns a list with the frequencies in the following order: the letter counts of e for file1, t for file2 and w for file3, followed by the word counts of "to" for file1, "the" for file2 and "computer" for file3.

Pseudocode e.g.:

return [e for file1, t for file2, w for file3, "to" for file1, "the" for file2, "computer" for file3]

might look like:

return [1234, 123456, 213, 123556, 122, 53421]
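The list described above could be assembled roughly as follows. This is a hedged sketch: processFile is a hypothetical helper that is not part of the starter code, the file names text2.txt and text3.txt are assumptions (only text1.txt appears in the task), and resetting the globals between files is one possible design choice so that counts from different files do not mix.

```python
import re

wordsDict = {}
lettersDict = {}

def processFile(filename):
    """Hypothetical helper: reset the globals, count one file, return copies."""
    global wordsDict, lettersDict
    wordsDict, lettersDict = {}, {}
    with open(filename, "r") as handle:
        for line in handle:
            # Same cleanup rule as cleanupLine, then lower-case everything
            cleaned = re.sub(r"[^a-zA-Z0-9']", " ", line).lower()
            for word in cleaned.split():
                wordsDict[word] = wordsDict.get(word, 0) + 1
            for ch in cleaned:
                if "a" <= ch <= "z":
                    lettersDict[ch] = lettersDict.get(ch, 0) + 1
    return dict(wordsDict), dict(lettersDict)

def results(filenames=("text1.txt", "text2.txt", "text3.txt")):
    """Return [e in file1, t in file2, w in file3, "to", "the", "computer"]."""
    letters = ("e", "t", "w")
    words = ("to", "the", "computer")
    counts = [processFile(name) for name in filenames]
    letterFreqs = [counts[i][1].get(letters[i], 0) for i in range(3)]
    wordFreqs = [counts[i][0].get(words[i], 0) for i in range(3)]
    return letterFreqs + wordFreqs
```

Calling results() with no arguments reads the three assumed default files from the current directory; a letter or word that never occurs is reported as 0 instead of raising a KeyError.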

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

wordsDict = {}
lettersDict = {}


def cleanupLine(line):
    """Remove characters that are not needed from the line string.

    Unwanted characters are all characters except a-z, A-Z, 0-9 and '
    and should be replaced with a space.
        long-term -> long term
        It's amazing, isn't it? -> It's amazing isn't it
    Note: if you are familiar with regex, you can use that; otherwise a loop is fine.
    """
    stripped_line = ""
    return stripped_line


def countWords(line):
    """For a stripped line, count the words and update the global variable wordsDict.

    Note: we convert upper case words to lower case words.
    """
    global wordsDict
    return wordsDict


def countLetters(line):
    """For a stripped line, count the letters and update the global variable lettersDict.

    Note: we convert upper case letters to lower case.
    Note2: numbers and ' should be ignored.
    """
    global lettersDict
    return lettersDict


def readFile(filename):
    # Read the file line by line; the with-block closes it afterwards
    with open(filename, 'r') as handle:
        for line in handle:
            stripped_line = cleanupLine(line)
            countWords(stripped_line)
            countLetters(stripped_line)


def results():
    return [0, 0, 0, 0, 0, 0]


readFile("text1.txt")
print(lettersDict['e'])
