[Solved] data1.txt:https://www.cse.msu.edu/~cse231

Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 24, 2024

data1.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/data1.txt data2.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/data2.txt document1.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/document1.txt document2.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/document2.txt lab09a.py:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/lab09a.py Assignment overview This lab exercise provides practice with dictionaries in Python. A. Modify a program that uses Dictionaries. Consider the

image text in transcribed

data1.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/data1.txt

data2.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/data2.txt

document1.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/document1.txt

document2.txt:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/document2.txt

lab09a.py:https://www.cse.msu.edu/~cse231/Online/Labs/Lab09/lab09a.py

Assignment overview This lab exercise provides practice with dictionaries in Python. A. Modify a program that uses Dictionaries. Consider the file named "lab08a.py". That file contains the skeleton of a Python program to do a simple analysis of a text file: it will display the number of unique words which appear in the file, along with the number of times each word appears. Case does not matter: the words "pumpkin", "Pumpkin" and "PUMPKIN" should be treated as the same word. (The word "map" is used in identifiers because sometimes a dictionary is called a "map.) Execute the program (which currently uses "document1.txt" as the data file) and inspect the output. a. Replace each of the lines labeled "YOUR COMMENT' with meaningful comments to describe the work being done in the next block of statements. Use more than one comment line, if necessary b. Add doc strings to each function to describe the work being done in the function c. The program currently processes the empty string as a word. Revise the program to exclude empty strings from the collection of words. d. The program currently processes words such as "The" and "the" as different words. Revise the program to ignore case when processing words. e. The program currently always uses "document 1.txt" as the input file. Revise the program to prompt the user for the name of the input file f. Revise the program to display the collection of words sorted by greatest frequency of occurrence to least frequency, and sorted alphabetically for words with the same frequency count. Since the sorted function and the sort method are stable sorts, you can first sort the words alphabetically, and then sort them by frequenc y (with reverse True (You do the two sorts in that order because you do the primary key last, frequency is the primary key in this case.) By default sorting is done on the first item in a list or tuple. To sort on other items use itemgetter from the operator module. See documentation here, focus on the students tuple example https //docs thon.org/3/howto/sorting.html g. Test the revised program. There are two sample documents available: "document l.txt" (The Declaration of Independence) and "document2.txt" The Gettysburg Address) Assignment overview This lab exercise provides practice with dictionaries in Python. A. Modify a program that uses Dictionaries. Consider the file named "lab08a.py". That file contains the skeleton of a Python program to do a simple analysis of a text file: it will display the number of unique words which appear in the file, along with the number of times each word appears. Case does not matter: the words "pumpkin", "Pumpkin" and "PUMPKIN" should be treated as the same word. (The word "map" is used in identifiers because sometimes a dictionary is called a "map.) Execute the program (which currently uses "document1.txt" as the data file) and inspect the output. a. Replace each of the lines labeled "YOUR COMMENT' with meaningful comments to describe the work being done in the next block of statements. Use more than one comment line, if necessary b. Add doc strings to each function to describe the work being done in the function c. The program currently processes the empty string as a word. Revise the program to exclude empty strings from the collection of words. d. The program currently processes words such as "The" and "the" as different words. Revise the program to ignore case when processing words. e. The program currently always uses "document 1.txt" as the input file. Revise the program to prompt the user for the name of the input file f. Revise the program to display the collection of words sorted by greatest frequency of occurrence to least frequency, and sorted alphabetically for words with the same frequency count. Since the sorted function and the sort method are stable sorts, you can first sort the words alphabetically, and then sort them by frequenc y (with reverse True (You do the two sorts in that order because you do the primary key last, frequency is the primary key in this case.) By default sorting is done on the first item in a list or tuple. To sort on other items use itemgetter from the operator module. See documentation here, focus on the students tuple example https //docs thon.org/3/howto/sorting.html g. Test the revised program. There are two sample documents available: "document l.txt" (The Declaration of Independence) and "document2.txt" The Gettysburg Address)