
Question



USING THE FOLLOWING PYTHON VERSION AS AN EXAMPLE, write a term frequency program in JAVASCRIPT with NODE.JS that follows the STYLE constraints and requirements below:

1. The program must run on the command line and take an input text file called pride-and-prejudice.txt. It must output only the TOP 25 most frequent words with their counts, in order of most frequent at the top, and it MUST write them to a new text file called output.txt, NOT to the command line. It must FILTER out the STOP WORDS listed below, reading them from the stop_words.txt file as input (not from a hardcoded string of words). Make sure to apply the appropriate filters so that no words under 2 characters appear and nothing makes the output differ from the expected output included below!

stop_words.txt:

a,able,about,across,after,all,almost,also,am,among,an,and,any,are,as,at,be,because,been,but,by,can,cannot,could,dear,did,do,does,either,else,ever,every,for,from,get,got,had,has,have,he,her,hers,him,his,how,however,i,if,in,into,is,it,its,just,least,let,like,likely,may,me,might,most,must,my,neither,no,nor,not,of,off,often,on,only,or,other,our,own,rather,said,say,says,she,should,since,so,some,than,that,the,their,them,then,there,these,they,this,tis,to,too,twas,us,wants,was,we,were,what,when,where,which,while,who,whom,why,will,with,would,yet,you,your

*****Correct output will look like this if written correctly so ENSURE THE STOP WORDS ARE PROPERLY REMOVED TO PROVIDE THE FOLLOWING OUTPUT BEFORE POSTING SOLUTION OR IT WILL BE DOWNVOTED!*****

output.txt:

mr - 786
elizabeth - 635
very - 488
darcy - 418
such - 395
mrs - 343
much - 329
more - 327
bennet - 323
bingley - 306
jane - 295
miss - 283
one - 275
know - 239
before - 229
herself - 227
though - 226
well - 224
never - 220
sister - 218
soon - 216
think - 211
now - 209
time - 203
good - 201
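The filtering and ordering rules above can be illustrated without any actor machinery. The following is a minimal single-threaded sketch in Python, not the required actor-style JavaScript solution. The function names `top_words` and `format_lines` are hypothetical, the tokenizing rule (replace runs of non-alphanumeric characters with spaces, then lowercase) is borrowed from the Python example in the question, and dropping every token shorter than 2 characters is an assumption about how the "under 2 characters" filter should behave.

```python
import re
from collections import Counter


def top_words(text, stop_words, n=25):
    """Return the n most frequent words in text, excluding stop words.

    Assumption: words are runs of letters/digits, lowercased, and any
    token shorter than 2 characters is dropped.
    """
    tokens = re.sub(r'[\W_]+', ' ', text).lower().split()
    stop = set(stop_words)
    counts = Counter(t for t in tokens if len(t) >= 2 and t not in stop)
    # most_common sorts by count, descending; ties keep first-seen order
    return counts.most_common(n)


def format_lines(pairs):
    """Render (word, count) pairs in the 'word - count' layout."""
    return '\n'.join(f'{word} - {count}' for word, count in pairs)


# Tiny made-up sample, only to show the shape of the result.
sample = "Mr. Darcy and Mr. Bingley; Elizabeth saw Mr. Darcy. I saw a cat"
stops = "a,and,i".split(',')
print(format_lines(top_words(sample, stops, n=3)))
```

In a real run the stop words would come from splitting the contents of stop_words.txt on commas rather than from an inline string.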

STYLE CONSTRAINTS:

- Similar to the letterbox style, but where the 'things' have independent threads of execution.
- The larger problem is decomposed into 'things' that make sense for the problem domain.
- Each 'thing' has a queue meant for other 'things' to place messages in it.
- Each 'thing' is a capsule of data that exposes only its ability to receive messages via the queue.
- Each 'thing' has its own thread of execution independent of the others.

Possible style names: Free agents, Active letterbox, Actors
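As a rough illustration of these constraints in isolation, here is a minimal Python sketch of a single 'thing'. The class name `Actor`, the attribute `inbox`, and the message kinds `add` and `report` are hypothetical and chosen only for this example. The 'thing' keeps its state private, is reachable only through its queue, and processes messages on its own thread.

```python
from queue import Queue
from threading import Thread


class Actor(Thread):
    """Hypothetical minimal 'thing': private state, one inbox, own thread."""

    def __init__(self):
        super().__init__()
        self.inbox = Queue()   # the only way other things reach this one
        self._total = 0        # private data, never read from outside
        self.start()

    def run(self):
        # Handle one message at a time until told to stop.
        while True:
            kind, *args = self.inbox.get()
            if kind == 'add':
                self._total += args[0]
            elif kind == 'report':
                args[0].put(self._total)   # reply via the caller's queue
            elif kind == 'die':
                break


counter = Actor()
reply = Queue()
for n in (1, 2, 3):
    counter.inbox.put(['add', n])
counter.inbox.put(['report', reply])
counter.inbox.put(['die'])
counter.join()
result = reply.get()
print(result)
```

Because the inbox is processed in order, the report is produced only after all three `add` messages have been handled, so no locking around `_total` is needed.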

PYTHON CODE:

import sys, re, operator, string
from threading import Thread
from queue import Queue

class ActiveWFObject(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.name = str(type(self))
        self.queue = Queue()
        self._stopMe = False
        self.start()

    def run(self):
        while not self._stopMe:
            message = self.queue.get()
            self._dispatch(message)
            if message[0] == 'die':
                self._stopMe = True

def send(receiver, message):
    receiver.queue.put(message)

class DataStorageManager(ActiveWFObject):
    """ Models the contents of the file """
    _data = ''

    def _dispatch(self, message):
        if message[0] == 'init':
            self._init(message[1:])
        elif message[0] == 'send_word_freqs':
            self._process_words(message[1:])
        else:
            # forward
            send(self._stop_word_manager, message)

    def _init(self, message):
        path_to_file = message[0]
        self._stop_word_manager = message[1]
        with open(path_to_file) as f:
            self._data = f.read()
        pattern = re.compile('[\W_]+')
        self._data = pattern.sub(' ', self._data).lower()

    def _process_words(self, message):
        recipient = message[0]
        data_str = ''.join(self._data)
        words = data_str.split()
        for w in words:
            send(self._stop_word_manager, ['filter', w])
        send(self._stop_word_manager, ['top25', recipient])

class StopWordManager(ActiveWFObject):
    """ Models the stop word filter """
    _stop_words = []

    def _dispatch(self, message):
        if message[0] == 'init':
            self._init(message[1:])
        elif message[0] == 'filter':
            return self._filter(message[1:])
        else:
            # forward
            send(self._word_freqs_manager, message)

    def _init(self, message):
        with open('../stop_words.txt') as f:
            self._stop_words = f.read().split(',')
        self._stop_words.extend(list(string.ascii_lowercase))
        self._word_freqs_manager = message[0]

    def _filter(self, message):
        word = message[0]
        if word not in self._stop_words:
            send(self._word_freqs_manager, ['word', word])

class WordFrequencyManager(ActiveWFObject):
    """ Keeps the word frequency data """
    _word_freqs = {}

    def _dispatch(self, message):
        if message[0] == 'word':
            self._increment_count(message[1:])
        elif message[0] == 'top25':
            self._top25(message[1:])

    def _increment_count(self, message):
        word = message[0]
        if word in self._word_freqs:
            self._word_freqs[word] += 1
        else:
            self._word_freqs[word] = 1

    def _top25(self, message):
        recipient = message[0]
        freqs_sorted = sorted(self._word_freqs.items(), key=operator.itemgetter(1), reverse=True)
        send(recipient, ['top25', freqs_sorted])

class WordFrequencyController(ActiveWFObject):

    def _dispatch(self, message):
        if message[0] == 'run':
            self._run(message[1:])
        elif message[0] == 'top25':
            self._display(message[1:])
        else:
            raise Exception("Message not understood " + message[0])

    def _run(self, message):
        self._storage_manager = message[0]
        send(self._storage_manager, ['send_word_freqs', self])

    def _display(self, message):
        word_freqs = message[0]
        for (w, f) in word_freqs[0:25]:
            print(w, '-', f)
        send(self._storage_manager, ['die'])
        self._stopMe = True

#
# The main function
#
word_freq_manager = WordFrequencyManager()

stop_word_manager = StopWordManager()
send(stop_word_manager, ['init', word_freq_manager])

storage_manager = DataStorageManager()
send(storage_manager, ['init', sys.argv[1], stop_word_manager])

wfcontroller = WordFrequencyController()
send(wfcontroller, ['run', storage_manager])

# Wait for the active objects to finish
[t.join() for t in [word_freq_manager, stop_word_manager, storage_manager, wfcontroller]]
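Note that the Python example prints its result to the console in `_display`, whereas the requirements ask for the result to be written to output.txt. One possible adaptation is sketched below in Python. The helper name `write_top` is hypothetical (it is not part of the example above), and the demonstration writes to a temporary directory only so that it does not overwrite a real output.txt.

```python
import os
import tempfile


def write_top(word_freqs, path='output.txt', n=25):
    """Write the first n (word, count) pairs to path, one per line."""
    with open(path, 'w', encoding='utf-8') as out:
        for word, count in word_freqs[:n]:
            out.write(f'{word} - {count}\n')


# Demonstration on a temporary file with the first three expected pairs.
demo_path = os.path.join(tempfile.mkdtemp(), 'output.txt')
write_top([('mr', 786), ('elizabeth', 635), ('very', 488)], demo_path)
with open(demo_path, encoding='utf-8') as f:
    written = f.read()
print(written, end='')
```

A controller's display step could call such a helper with the sorted list it receives in the `top25` message instead of looping over `print`.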

ENSURE THE SOLUTION IS WORKING AND FOLLOWS THE CORRESPONDING STYLE BEFORE YOU SUBMIT IT OR I WILL HAVE TO DOWNVOTE! YOU CAN TEST IT LOCALLY FIRST BY LOOKING UP THE PRIDE-AND-PREJUDICE.TXT FILE AND USING IT. I CAN'T PASTE IT HERE, BUT IT IS WIDELY AVAILABLE.
