C++

IDE: Code::Blocks

Programming Assignments Project: Text Analyzer: Word Count

Write a program that parses a text file into words and counts the number of occurrences of each word. As you are reading the words, keep track of the number of distinct words, maximum length, and highest frequency. Read a new word then search for it: if found, add one to the counter; if not found, add the new word at the end of the list and initialize its counter to 1 (see a detailed example on the last page). Sort the array in alphabetical order (from a to z) using the insertion sort algorithm. Save the sorted list of words to a file as shown below:

6 words

Maximum length: 7

Highest frequency: 125

9 a

21 boat

125 dream

44 merrily

125 row

5 stream
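A minimal sketch of the sort-then-save step described above, assuming the Word and WordStats layouts given in the starter code comments further down. The helper name writeReport is hypothetical, and it writes to any output stream so the same code works for a file or for the screen. The field width of 3 for the counter is an assumption inferred from the "not greater than 999" formatting note.

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

const int ARY_SIZE = 100;

struct Word {
    std::string word;  // the word itself, in lower case
    int cnt;           // number of occurrences
};

struct WordStats {
    Word list[ARY_SIZE];
    int noWords;    // number of distinct words
    int maxLength;  // length of the longest word
    int highFreq;   // highest frequency
};

// Insertion sort in alphabetical order (a to z), comparing the word field.
void insertionSort(Word ary[], int size)
{
    for (int i = 1; i < size; i++) {
        Word key = ary[i];
        int j = i - 1;
        // Shift larger words one slot to the right.
        while (j >= 0 && ary[j].word > key.word) {
            ary[j + 1] = ary[j];
            j--;
        }
        ary[j + 1] = key;
    }
}

// Hypothetical helper: writes the three summary lines, then one line per word.
void writeReport(std::ostream &out, const WordStats *stats)
{
    out << stats->noWords << " words\n";
    out << "Maximum length: " << stats->maxLength << "\n";
    out << "Highest frequency: " << stats->highFreq << "\n";
    for (int i = 0; i < stats->noWords; i++)
        out << std::setw(3) << stats->list[i].cnt << " "
            << stats->list[i].word << "\n";
}
```

To save to a file, open a std::ofstream (from the fstream header) and pass it to writeReport in place of std::cout.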

Display a list of high-frequency words as shown below:

125 dream

125 row

There are 2 high-frequency words.
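One possible way to produce the high-frequency list above: print every word whose counter equals the stored highest frequency, and count how many were printed. This is only a sketch. The name printHighFreq and the stream parameter are assumptions made for illustration (the starter code uses a different prototype), and the summary sentence is copied from the sample without adjusting for a single word.

```cpp
#include <iostream>
#include <sstream>
#include <string>

const int ARY_SIZE = 100;

struct Word {
    std::string word;
    int cnt;
};

struct WordStats {
    Word list[ARY_SIZE];
    int noWords;
    int maxLength;
    int highFreq;
};

// Hypothetical helper: lists the words whose counter equals highFreq
// and returns how many such words there are.
int printHighFreq(std::ostream &out, const WordStats *stats)
{
    int count = 0;
    for (int i = 0; i < stats->noWords; i++) {
        if (stats->list[i].cnt == stats->highFreq) {
            out << stats->list[i].cnt << " " << stats->list[i].word << "\n";
            count++;
        }
    }
    out << "There are " << count << " high-frequency words.\n";
    return count;
}
```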

You may assume that the text does not contain more than 100 words (i.e. declare an array of MAX_SIZE = 100) and that the frequency is not greater than 999 (for output formatting). Assume that the text files contain only letters, spaces and punctuation characters and that only the first letter in some words could be in upper case.

Run the program once using the following input files: song_row.txt, song_ten.txt, and test.txt (see below).

Create the input file song_row.txt, with the following data:

Row, row, row your boat,

Gently down the stream.

Merrily, merrily, merrily, merrily,

Life is but a dream.

Create the input file song_ten.txt, with the following data:

Ten green bottles hanging on the wall,

Ten green bottles hanging on the wall,

And if one green bottle should accidentally fall,

There will be nine green bottles hanging on the wall.

Create the input file test.txt, with the following data:

The one two three waltz: one two three one two three step two three step two

three one two three one two three waltz

waltz waltz waltz waltz waltz. The end.

The program will generate names for the output files by appending OUT at the end of the input file name, as shown below: song_row_OUT.txt, song_ten_OUT.txt, and test_OUT.txt
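The three example names show _OUT being placed just before the file extension. A small sketch of that rule follows; the helper name makeOutName is hypothetical, and appending _OUT to a name that has no extension is an assumption the assignment does not cover.

```cpp
#include <string>

// Hypothetical helper: "song_row.txt" becomes "song_row_OUT.txt".
std::string makeOutName(const std::string &inName)
{
    std::string::size_type dot = inName.rfind('.');
    if (dot == std::string::npos)
        return inName + "_OUT";  // assumption: no extension, just append
    // Keep everything before the last dot, add _OUT, then the extension.
    return inName.substr(0, dot) + "_OUT" + inName.substr(dot);
}
```

In main, this could be called once per input file, before writeFile, so that outFilename is no longer empty.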

Example: How to read data from a file into an array of structures and keep track of the longest word's length and the highest frequency.

Row, row, row your boat,

Gently down the stream.

Merrily, merrily, merrily, merrily,

Life is but a dream.

1. Read "Row,", convert it to "row", then insert it into the array: { {row, 1} } // array has 1 item, highest frequency is 1, longest word has 3 characters.

2. Read "row,", convert it to "row"; since it is found in the array, add 1 to its counter: { {row, 2} } // array has 1 item, highest frequency is 2, longest word has 3 characters.

3. Read "row"; since it is found in the array, add 1 to its counter: { {row, 3} } // array has 1 item, highest frequency is 3, longest word has 3 characters.

4. Read "your"; since it is not found, add it at the end of the array: { {row, 3}, {your, 1} } // array has 2 items, highest frequency is 3, longest word has 4 characters.

5. Read "boat,", convert it to "boat", then add it at the end of the array: { {row, 3}, {your, 1}, {boat, 1} } // array has 3 items, highest frequency is 3, longest word has 4 characters.

6. Read "Gently", convert it to "gently", then add it at the end of the array: { {row, 3}, {your, 1}, {boat, 1}, {gently, 1} } // array has 4 items, highest frequency is 3, longest word has 6 characters.

7. Read "down", then add it at the end of the array: { {row, 3}, {your, 1}, {boat, 1}, {gently, 1}, {down, 1} } // array has 5 items, highest frequency is 3, longest word has 6 characters.

8. Read "the", then add it at the end of the array: { {row, 3}, {your, 1}, {boat, 1}, {gently, 1}, {down, 1}, {the, 1} } // array has 6 items, highest frequency is 3, longest word has 6 characters.

. . . and so on.
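The "convert it to" step in the walkthrough above (for example "Row," to "row") can be sketched as a small helper that keeps only letters and lowers their case. The name cleanWord is hypothetical. Dropping every non-letter, not just trailing punctuation, is a simplifying assumption that relies on the input containing only letters, spaces and punctuation.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: keeps letters only and converts them to lower case.
std::string cleanWord(const std::string &raw)
{
    std::string result;
    for (char ch : raw) {
        // Cast first: the <cctype> functions expect unsigned char values.
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c))
            result += static_cast<char>(std::tolower(c));
    }
    return result;
}
```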

Pseudo-code for the readWords function:

open input file (with validation)

set highest frequency to 0

set number of words to 0

set length of the longest word to 0

loop (not end of the file)

read one word

process word as needed

if (word is not found in the list)

add the new word at the end of the list

set the counter of the new word to 1

add 1 to the number of words

update the longest length if needed

else // found

add 1 to its counter

update the highest frequency if needed

end if

end loop

close file
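The pseudo-code above could be turned into C++ roughly as follows. This is a sketch under stated assumptions, not the expected solution: readWordsFrom, findWord and cleanWord are hypothetical helper names, the Word and WordStats layouts follow the starter code comments, and the loop uses the idiomatic while (in >> raw) form in place of an explicit end-of-file test. The highest frequency is updated after both branches, so that it becomes 1 as soon as the first word is added, as in step 1 of the walkthrough.

```cpp
#include <cctype>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

const int ARY_SIZE = 100;

struct Word { std::string word; int cnt; };

struct WordStats {
    Word list[ARY_SIZE];
    int noWords;    // number of distinct words
    int maxLength;  // length of the longest word
    int highFreq;   // highest frequency
};

// Hypothetical helper: keeps letters only, in lower case.
std::string cleanWord(const std::string &raw)
{
    std::string result;
    for (char ch : raw) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c))
            result += static_cast<char>(std::tolower(c));
    }
    return result;
}

// Hypothetical helper: linear search, returns the index or -1 if not found.
int findWord(const Word list[], int size, const std::string &target)
{
    for (int i = 0; i < size; i++)
        if (list[i].word == target)
            return i;
    return -1;
}

// Core loop of the pseudo-code, reading from any input stream.
void readWordsFrom(std::istream &in, WordStats *stats)
{
    stats->noWords = 0;
    stats->maxLength = 0;
    stats->highFreq = 0;
    std::string raw;
    while (in >> raw) {                       // read one word
        std::string w = cleanWord(raw);       // process word as needed
        if (w.empty())
            continue;                         // token was punctuation only
        int pos = findWord(stats->list, stats->noWords, w);
        if (pos == -1) {                      // not found
            if (stats->noWords == ARY_SIZE)
                continue;                     // assumption: ignore overflow
            pos = stats->noWords++;
            stats->list[pos].word = w;
            stats->list[pos].cnt = 1;
            if (static_cast<int>(w.length()) > stats->maxLength)
                stats->maxLength = static_cast<int>(w.length());
        } else {                              // found
            stats->list[pos].cnt++;
        }
        if (stats->list[pos].cnt > stats->highFreq)
            stats->highFreq = stats->list[pos].cnt;
    }
}

// Opens the file with validation, then delegates to readWordsFrom.
void readWords(std::string inFilename, WordStats *stats)
{
    std::ifstream in(inFilename.c_str());
    if (!in) {
        std::cerr << "Error opening " << inFilename << std::endl;
        stats->noWords = stats->maxLength = stats->highFreq = 0;
        return;
    }
    readWordsFrom(in, stats);
}   // the ifstream closes the file when it goes out of scope
```

Splitting the loop out of the file-opening code is a design choice made here so that the counting logic can be exercised with a std::istringstream, without creating a file.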

code:

/**

CIS 22B: Homework 4C

Structures and Strings

Project: Text Analyzer: Word Count

NAME:

*/

#include <iostream>

#include <string>

using namespace std;

const bool DE_BUG = true; // when done debugging change the DE_BUG flag to false

const int ARY_SIZE = 100;

/* Define a struct named Word with the following fields:

1. word, a string

2. cnt, an integer

*/

/* Define a struct named WordStats with the following fields:

1. list, an array of ARY_SIZE Word structures defined above

2. noWords, an integer, for the number of distinct words

3. maxLength, an integer, for the longest word's length

4. highFreq, an integer, for the highest frequency

*/

// Function prototypes

void printWelcome(void);

void readWords(string inFilename, WordStats *stats);

void insertionSort(Word ary[], int size);

void printHigestFreq( const WordStats *stats);

void writeFile( string outFilename, const WordStats *stats);

void printEnd(void);

int main()

{

string inFilename[] = {"song_row.txt", "song_ten.txt", "test.txt", ""};

string outFilename = "";

// use the following data for testing

// insertionSort, writeFile, and printHigestFreq functions

WordStats stats =

{

{ {"a", 9}, {"boat", 21}, {"dream", 125}, {"merrily", 44}, {"row", 125}, {"stream", 5}} ,

6,

7,

125

};

printWelcome();

for (int i = 0; inFilename[i] != ""; i++)

{

cout << "Read data from: " << inFilename[i] << endl;

// readWords(inFilename[i], &stats);

insertionSort(stats.list, stats.noWords);

writeFile(outFilename, &stats);

printHigestFreq(&stats);

// generate the output file name

cout << "Write report to: " << outFilename << endl << endl;

}

printEnd();

return 0;

}

/**************************************************

*/

void readWords(string inFilename, WordStats *stats)

{

if (DE_BUG)

cout << "\tDEBUG: This is the readWords function" << endl;

}

/**************************************************

*/

void insertionSort(Word ary[], int size)

{

if (DE_BUG)

cout << "\tDEBUG: This is the insertionSort function" << endl;

}

/**************************************************

*/

void printHigestFreq( const WordStats *stats)

{

if (DE_BUG)

cout << "\tDEBUG: This is the printHigestFreq function" << endl;

}

/**************************************************

*/

void writeFile( string outFilename, const WordStats *stats)

{

if (DE_BUG)

cout << "\tDEBUG: This is the writeFile function" << endl;

}

/**************************************************

This function displays a welcome message and briefly

explains the purpose of this program.

*/

void printWelcome(void)

{

if (DE_BUG)

cout << "\tDEBUG: This is the welcome function " << endl;

cout << "\t\tHomework 4C \tText Analyzer: Word Count ";

cout << "Developer: " << "write your name here" << endl;

}

/**************************************************

This function displays an end of the program message

*/

void printEnd(void)

{

if (DE_BUG)

cout << "\tDEBUG: This is the end of the program function " << endl;

}

/***************************************************************

Save the OUTPUT below

DEBUG: This is the welcome function

Homework 4C

Text Analyzer: Word Count

Read data from: song_row.txt

DEBUG: This is the insertionSort function

DEBUG: This is the writeFile function

DEBUG: This is the printHigestFreq function

Write report to:

Read data from: song_ten.txt

DEBUG: This is the insertionSort function

DEBUG: This is the writeFile function

DEBUG: This is the printHigestFreq function

Write report to:

Read data from: test.txt

DEBUG: This is the insertionSort function

DEBUG: This is the writeFile function

DEBUG: This is the printHigestFreq function

Write report to:

DEBUG: This is the end of the program function

Program ended with exit code: 0

*/
