Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

No vector functions please. Thank you in advance! It won't let me add files other than jpg, or png. Can you write a template of

image text in transcribed

image text in transcribedimage text in transcribed

image text in transcribedimage text in transcribed

No vector functions please. Thank you in advance! It won't let me add files other than jpg, or png. Can you write a "template" of the code? You can google any txt file and use as reference. Appreciate the help.

Array doubling with dynamic memory OBJECTIVES 1. Read a file with unknown size and store its contents in a dynamic array 3. Store, search and iterate through data in an array of structs 4. Use array doubling via dynamic memory to increase the size of the array Overview There are several fields in computer science that aim to understand how people use language. One interesting example is analyzing the most frequently used words by certain authors, then using those frequencies to determine who authored a lost manuscript or anonymous note In this assignment, we will write a program to determine the most frequent words in a document. Because the number of words in the document may not be known a priori, we will implement a dynamically doubling array to store the necessary information Please read all the directions before writing code, as this write-up contains specific requirements for how the code should be written Your Task There are two files on Moodle. One contains text to be read and analyzed, and is named HarryPotter.txt. As the name implies, this file contains the full text from Harry Potter and the Sorcerer's Stone. For your convenience, all the punctuation has been removed, all the words have been converted to lowercase, and the entire document is written on a single line. The other file contains the 50 most common words in the English language, which your program will ignore. It is called ignoreWords.txt Your program must take three command line arguments in the following order - a number N, the name of the text to be read, and the name of the text file with the words that should be ignored. It will read in the text (ignoring the words in the second file) and store all unique words in a dynamically doubling array. It should then calculate and print the following information The number of array doublings needed to store all the unique words The number of unique "non-ignore" words in the file The total word count of the file (excluding the ignore words) The N most frequent words along with their probability of occurrence (up to 4 decimal places) 2. Use the array-doubling algorithm to increase the size of your array Your array will need to grow to fit the number of words in the file. Start with an array size of 100, and double the size whenever the array runs out of free space. You wil need to allocate your array dynamically and copy values from the old array to the new array Note: Don't use the built-in std::vector class. This will result in a loss of points. You're actually writing the code that the built-in vector uses behind-the-scenes! 3. Ignore the top 50 most common words that are read in from the second file To get useful information about word frequency, we will be ignoring the 50 most common words in the English language. These words will be read in from a file, whose name is the third command line argument. 4. Take three command line arguments Your program must take three command line arguments - a number N which tells your program how many of the most frequent words to print, the name of the text file to be read and analyzed, and the name of the text file with the words that should be ignored. 5. Output the top N most frequent words Your program should print out the top N most frequent words in the text - not including the common words - where N is passed in as a command line argument. 6. Format your output this way: Array doubled: Unique non-common words: Total non-common words: Probabilities of top most frequent words - - - For example, using the command ./Assignment2 10 HarryPotter.txt ignorewords.txt you should get the output Array doubled: 6 Unique non-common words: 5985 Total non-common words: 50331 Probabilities of top 10 most frequent words 0.0241 - harry 0.0236- Was 0.0158-said 0.0139 - had 0.0100- him 0.0081 - ron 0.0068were 0.0067 - hagrid 0.0065 - thenm 0.0052- back 7. You must include the following functions (they will be tested by the autograder): a. In your main function i. If the correct number of command line arguments is not passed, print the below statement and exit the program std: :cout "<>

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Secrets Of Analytical Leaders Insights From Information Insiders

Authors: Wayne Eckerson

1st Edition

1935504347, 9781935504344

More Books

Students also viewed these Databases questions