Question
I'm trying to create a C++ program that can generate the statistics of any large text, one that can calculate which words occur most frequently in the text. This is the introduction; it doesn't have to open the file Corpus_Cleaner because I will do that part, but the program should be able to open any file, so please disregard the Corpus_Cleaner name.
Statistics analyzer (plaintext):
Please create a program called sol_sap.ext, where ext denotes the file extension corresponding to your choice of programming language (.py, .c, .cpp, .c++, or .java).
This program should open and read in a file named corpus_clean.txt, which is the output from sol_cleaner.ext. As output it should produce a file called corpus_freq.txt, which contains the following. Each row is a pair letter, rel_freq,
where letter is a character occurring in the text and rel_freq is the relative frequency of the letter in the corpus. The relative frequency of a letter c is defined by:
relative frequency of c = (# occurrences of c in corpus) / (# letters in corpus)
Note that this will be a floating point number.
The letter/frequency pairs should be given in order of descending frequency. For example, the file should roughly look like this.
e, 0.082198
a, 0.050031 (etc)
z, 0.003000
I have made up the values in the above table for the sake of example.
It is possible to implement this algorithm in a way that makes only one pass over the corpus. However, if you prefer to read through the file 27 times, that is also acceptable.