Question
This needs to be coded in Python 3. Your program should read a file and count both the word frequencies and the letter frequencies. Results should be stored in two global variables. You will need at least 4 functions, one of which is already given:
cleanupLine(line)
- accepts one line and removes characters that are not needed. Unwanted characters are all characters except a-z, A-Z, 0-9 and '. They should be replaced with a space.
Samples:
long-term -> long term
It's amazing, isn't it? -> It's amazing isn't it
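One possible way to write cleanupLine(line) is a single regex substitution. This is only a sketch, assuming the re module is allowed (the docstring in the given code says a plain loop is fine too). Each unwanted character becomes its own space, so ", " turns into two spaces; that is harmless if the words are later split on whitespace.

```python
import re

def cleanupLine(line):
    """Replace every character except a-z, A-Z, 0-9 and ' with a space."""
    # The character class is negated: anything NOT listed is replaced.
    return re.sub(r"[^a-zA-Z0-9']", " ", line)

print(cleanupLine("long-term"))
print(cleanupLine("It's amazing, isn't it?").split())
```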
countWords(line)
- For a stripped line, this function counts the words and updates the global variable wordsDict{}. For instance, the line "Hello hello World" should update wordsDict to {'hello' : 2, 'world' : 1}.
Note, we convert upper case letters to lower case
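A minimal sketch of countWords(line), assuming words are simply the whitespace-separated pieces of the already cleaned line. dict.get with a default of 0 avoids a separate check for words seen for the first time.

```python
wordsDict = {}

def countWords(line):
    """Count each lower-cased word of a cleaned line in the global wordsDict."""
    global wordsDict
    # split() with no argument ignores repeated spaces and the trailing newline.
    for word in line.lower().split():
        wordsDict[word] = wordsDict.get(word, 0) + 1
    return wordsDict

countWords("Hello hello World")
print(wordsDict)
```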
countLetters(line)
- For a stripped line, this function counts the letters and updates the global variable lettersDict{}. For instance, "hell" should update lettersDict to {'h' : 1, 'e' : 1, 'l' : 2}.
Note, we convert upper case letters to lower case.
Note2, numbers and ' should be ignored.
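countLetters(line) could be sketched along the same lines. The range test "a" <= ch <= "z" is one way to skip digits, spaces and the apostrophe after lower-casing; it is an assumption about how "letters" should be detected, not something the task prescribes.

```python
lettersDict = {}

def countLetters(line):
    """Count each letter a-z of a cleaned line in the global lettersDict."""
    global lettersDict
    for ch in line.lower():
        # Digits, spaces and ' fall outside a-z and are ignored.
        if "a" <= ch <= "z":
            lettersDict[ch] = lettersDict.get(ch, 0) + 1
    return lettersDict

countLetters("hell")
print(lettersDict)
```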
Please process the text files on your machine, then write a last function named results() that returns a list with the frequencies as follows:
the letters e for file1, t for file2 and w for file3, followed by the frequencies of the words
"to" for file1, "the" for file2 and "computer" for file3.
Pseudocode e.g.:
return [e for file1, t for file2, w for file3, "to" for file1, "the" for file2, "computer" for file3]
might look like:
return [1234, 123456, 213, 123556, 122, 53421]
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
wordsDict = {}
lettersDict = {}

def cleanupLine(line):
    """This function removes characters that are not needed from the line string.
    Unwanted characters are all characters except a-z, A-Z, 0-9 and ';
    they should be replaced with a space.
    long-term -> long term
    It's amazing, isn't it? -> It's amazing isn't it
    Note, if you are familiar with regex, you can use that, otherwise a loop is fine"""
    stripped_line = ""
    return stripped_line

def countWords(line):
    """For a stripped line, this function counts the words and updates
    the global variable wordsDict{}.
    Note, we convert upper case words to lower case words"""
    global wordsDict
    return wordsDict

def countLetters(line):
    """For a stripped line, this function counts the letters and updates
    the global variable lettersDict{}.
    Note, we convert upper case letters to lower case
    Note2, numbers and ' should be ignored"""
    global lettersDict
    return lettersDict

def readFile(filename):
    # The with-statement closes the file once all lines are processed.
    with open(filename, 'r') as handle:
        for line in handle:
            stripped_line = cleanupLine(line)
            countWords(stripped_line)
            countLetters(stripped_line)

def results():
    return [0,0,0,0,0,0]

readFile("text1.txt")
print(lettersDict['e'])
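For results(), note that the two global dictionaries keep accumulating across calls to readFile, so counting three files one after another would mix their frequencies. A possible sketch is shown here with the functions filled in. The helper countsForFile and the file names text2.txt and text3.txt are assumptions for illustration (only text1.txt appears in the given code), and the demo at the end uses three tiny temporary files as stand-ins for the real texts.

```python
import os
import re
import tempfile

wordsDict = {}
lettersDict = {}

def cleanupLine(line):
    return re.sub(r"[^a-zA-Z0-9']", " ", line)

def countWords(line):
    for word in line.lower().split():
        wordsDict[word] = wordsDict.get(word, 0) + 1

def countLetters(line):
    for ch in line.lower():
        if "a" <= ch <= "z":
            lettersDict[ch] = lettersDict.get(ch, 0) + 1

def readFile(filename):
    with open(filename, "r") as handle:
        for line in handle:
            stripped_line = cleanupLine(line)
            countWords(stripped_line)
            countLetters(stripped_line)

def countsForFile(filename):
    """Hypothetical helper: reset the globals, read one file, return copies."""
    wordsDict.clear()
    lettersDict.clear()
    readFile(filename)
    return dict(wordsDict), dict(lettersDict)

def results(filenames=("text1.txt", "text2.txt", "text3.txt")):
    # File names other than text1.txt are assumed, not given in the task.
    letters = ("e", "t", "w")
    words = ("to", "the", "computer")
    counts = [countsForFile(name) for name in filenames]
    letterFreqs = [counts[i][1].get(letters[i], 0) for i in range(3)]
    wordFreqs = [counts[i][0].get(words[i], 0) for i in range(3)]
    return letterFreqs + wordFreqs

# Demo on three tiny temporary files (stand-ins for the real text files).
samples = ["To be or not to be\n", "The cat sat on the mat\n", "We own a computer\n"]
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for i, text in enumerate(samples, start=1):
        path = os.path.join(folder, f"text{i}.txt")
        with open(path, "w") as out:
            out.write(text)
        paths.append(path)
    demo = results(paths)
print(demo)
```

Using .get(key, 0) instead of indexing means a letter or word that never occurs yields 0 rather than a KeyError. On the real files, calling results() with no arguments would read the three default file names from the current directory.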