Question

Distributed word counting program in Java

Please do not copy & paste off the internet, as the other people have done.

I'm trying to implement a Java program that uses multiple threads to calculate a frequency table for the words that appear in a set of individual text documents. For example, given a single file, write a method that computes the frequency of each word that appears in that file. Use the space character as the delimiter when tokenising the text, and treat every token separated by a space character as a word. Counting the word frequency distribution across all the files shall be carried out by separate workers, with each worker implemented as a distinct thread (multithreading). Each thread must first get the list of names of the files to process (the files containing random text).
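The single-file counting step described above could be sketched roughly as follows. This is only an illustration: the class name `WordFrequency` and the method name `countWords` are made up for the example, and reading the file line by line is an assumption that treats line breaks as word boundaries in addition to spaces.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative, not part of the assignment.
class WordFrequency {

    // Counts space-separated tokens across the given lines of text.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.split(" ")) {
                // Consecutive spaces produce empty tokens; skip them.
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Convenience overload: reads one file (assumed to be UTF-8) and counts its words.
    static Map<String, Integer> countWords(Path file) throws IOException {
        return countWords(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Because only the space character splits tokens, punctuation and tabs stay attached to words, so "word" and "word," are counted separately. That follows the rule stated in the question; a stricter tokeniser would be a different design.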

Each thread shall then proceed in phases: in each phase it chooses a file at random from those not yet processed by any worker and then processes that file. The program should implement some method to guarantee that no file is processed twice (each file should only ever be processed by a single thread, without it being possible to tell in advance which thread that will be). Once a thread has computed the word frequency distribution of a single file, it must write the result to a central data structure shared between all threads, such as an associative array or a hash table. However, no two threads may write to this array/hash table at the same time. Implement this thread synchronisation and mutual exclusion using a locking mechanism.
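One possible shape for the worker phases and the locking is sketched below, under stated assumptions. The class `ParallelWordCount` and all of its method names are hypothetical. Files are claimed through a `synchronized` method that removes a random entry from the list of unclaimed files, and the shared table is guarded by a `ReentrantLock`. Other mechanisms, such as a `ConcurrentHashMap`, would also work but are not what the question asks for.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; names are illustrative only.
class ParallelWordCount {
    private final List<Path> remaining;                           // files nobody has claimed yet
    private final Random random = new Random();
    private final Map<String, Integer> totals = new HashMap<>();  // shared frequency table
    private final ReentrantLock totalsLock = new ReentrantLock();

    ParallelWordCount(List<Path> files) {
        this.remaining = new ArrayList<>(files);
    }

    // Atomically removes and returns a random unclaimed file, or null when none are left.
    // Because removal happens under the monitor, no file can be handed out twice.
    private synchronized Path claimRandomFile() {
        if (remaining.isEmpty()) {
            return null;
        }
        return remaining.remove(random.nextInt(remaining.size()));
    }

    // Counts space-separated tokens in one file; touches no shared state.
    private static Map<String, Integer> countFile(Path file) {
        Map<String, Integer> counts = new HashMap<>();
        try {
            for (String line : Files.readAllLines(file)) {
                for (String token : line.split(" ")) {
                    if (!token.isEmpty()) {
                        counts.merge(token, 1, Integer::sum);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    // Adds one file's counts to the shared table while holding the lock.
    private void mergeIntoTotals(Map<String, Integer> partial) {
        totalsLock.lock();
        try {
            partial.forEach((word, n) -> totals.merge(word, n, Integer::sum));
        } finally {
            totalsLock.unlock();
        }
    }

    // Worker loop: claim a file, count it, publish the result, repeat until none remain.
    private void work() {
        Path file;
        while ((file = claimRandomFile()) != null) {
            mergeIntoTotals(countFile(file));
        }
    }

    // Starts the workers, waits for all of them, and returns the shared table.
    Map<String, Integer> run(int threadCount) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(this::work, "worker-" + i);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join();  // after join, the worker's writes are visible to this thread
        }
        return totals;
    }
}
```

Each worker counts its file privately and takes the lock only for the short merge step, so the time spent holding the lock stays small. In this sketch a worker that hits an I/O error simply terminates with an unchecked exception; a fuller program would need to decide how to report that.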

Also, vary the number of threads from 1 to 100 and, for each thread count, measure the time taken to complete the entire task. Plot the results in an x-y scatter plot, where the x-axis represents the number of threads and the y-axis represents the time taken in milliseconds. Please explain how to run the program.
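The timing experiment could be driven by a small harness like the one below. It is a sketch with assumed names (`ThreadScalingBenchmark`, `Workload`, `measure`, `toCsv`), and the workload in `main` is a placeholder that merely starts and joins empty threads; a real run would call the word counter there instead. The output is CSV, which a spreadsheet or plotting tool can turn into the scatter plot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing harness; all names are illustrative.
class ThreadScalingBenchmark {

    // The job to time. A real run would invoke the word counter with `threads` workers.
    interface Workload {
        void run(int threads) throws Exception;
    }

    // Returns the elapsed wall-clock time in milliseconds for each thread count in [from, to].
    static long[] measure(Workload workload, int from, int to) throws Exception {
        long[] millis = new long[to - from + 1];
        for (int n = from; n <= to; n++) {
            long start = System.nanoTime();
            workload.run(n);
            millis[n - from] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    // Formats the measurements as CSV rows: thread count (x), milliseconds (y).
    static String toCsv(long[] millis, int from) {
        StringBuilder sb = new StringBuilder("threads,millis\n");
        for (int i = 0; i < millis.length; i++) {
            sb.append(from + i).append(',').append(millis[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload: start n threads that do nothing, then wait for them.
        Workload placeholder = n -> {
            List<Thread> threads = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                Thread t = new Thread(() -> { });
                threads.add(t);
                t.start();
            }
            for (Thread t : threads) {
                t.join();
            }
        };
        System.out.print(toCsv(measure(placeholder, 1, 100), 1));
    }
}
```

Compiling with `javac ThreadScalingBenchmark.java` and running `java ThreadScalingBenchmark > timings.csv` would produce the data file, assuming the class is saved under that file name. The earliest measurements tend to include JIT warm-up, so repeating each thread count several times and plotting an average or median is likely to give a cleaner picture.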

Thank you, have a pleasant day
