Question

In this assignment you will write a Java program named KeyWordList.java that creates a key word list for a text file. Key words are any words that appear in the text file and are not in a provided list of words to ignore. The importance of a word is determined by how many times it appears in the text file. Your program implements a method as specified below:

public Word[] keyWordsList(File inputFile1, int N, File inputFile2)

where inputFile1 is the file used to generate the key word list, inputFile2 contains the words to ignore, and N specifies the maximum number of words to include in the key word list.

To construct the key word list, your program will first go through the text file and determine which key words appear and how many times each one shows up. Each key word can be thought of as a (word, # of occurrences) pair. However, the following are ignored:

1. words that appear fewer than three (3) times (only once or twice);

2. punctuation marks;

3. numbers;

4. the words to ignore (listed in the ignore file).

After collecting all the key word information, your program will find the N key words with the most occurrences to include in the word cloud. To do this, your program will put all the key words (except the words in the ignore list, see above) into a Priority Queue (prioritized by the number of occurrences) and then display the queue structure (either complete or full tree). When you print the queue structure, each node is represented with pairs of word and its frequency (for example, [Elephant 35]). For this program, you are REQUIRED to implement a Priority Queue by using an array-based heap.
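The array-based heap described above can be sketched as follows. This is only an illustration under stated assumptions, not the required WordHeap class: the names MaxHeapSketch and Entry are hypothetical stand-ins (Entry plays the role of the provided Word class), and the root is stored at index 0 so that the children of index i sit at 2i+1 and 2i+2.

```java
/** Minimal array-based max-heap of (word, count) pairs; all names are illustrative. */
final class MaxHeapSketch {
    /** Hypothetical stand-in for the assignment's Word class. */
    static final class Entry {
        final String word;
        final int count;
        Entry(String word, int count) { this.word = word; this.count = count; }
    }

    private final Entry[] items;
    private int size = 0;

    MaxHeapSketch(int capacity) { items = new Entry[capacity]; }

    boolean isEmpty() { return size == 0; }

    /** Places the entry in the next free leaf, then sifts it up toward the root. */
    void enqueue(Entry e) {
        if (size == items.length) throw new IllegalStateException("heap is full");
        items[size] = e;
        int child = size++;
        while (child > 0) {
            int parent = (child - 1) / 2;
            if (items[child].count <= items[parent].count) break;
            swap(child, parent);
            child = parent;
        }
    }

    /** Removes the root (largest count), moves the last leaf up, then sifts it down. */
    Entry dequeue() {
        if (size == 0) throw new IllegalStateException("heap is empty");
        Entry top = items[0];
        items[0] = items[--size];
        items[size] = null;
        int parent = 0;
        while (true) {
            int left = 2 * parent + 1, right = left + 1, largest = parent;
            if (left < size && items[left].count > items[largest].count) largest = left;
            if (right < size && items[right].count > items[largest].count) largest = right;
            if (largest == parent) break;
            swap(parent, largest);
            parent = largest;
        }
        return top;
    }

    private void swap(int i, int j) { Entry t = items[i]; items[i] = items[j]; items[j] = t; }

    public static void main(String[] args) {
        MaxHeapSketch heap = new MaxHeapSketch(8);
        heap.enqueue(new Entry("node", 9));
        heap.enqueue(new Entry("heap", 7));
        heap.enqueue(new Entry("we", 8));
        heap.enqueue(new Entry("max", 4));
        // Dequeue the N = 2 most frequent entries: prints [node:9] then [we:8].
        for (int i = 0; i < 2 && !heap.isEmpty(); i++) {
            Entry e = heap.dequeue();
            System.out.println("[" + e.word + ":" + e.count + "]");
        }
    }
}
```

Dequeuing N times from such a heap yields the N most frequent words in descending order, and whatever remains in the array is the "after" heap.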

Two abstract classes are provided. Design two class files named WordHeap.java and KeyWordList.java that extend WordHeapAbstract and KeyWordListAbstract, respectively. Here is a more detailed explanation of the methods in the KeyWordList class:

public abstract Word[] wordsList(File fileForKeyWords, File fileForWordsToIgnore)

This method gets all words from fileForKeyWords, except the words in the ignore list, and returns them as a Word array. The array will be used by the following method to construct a heap-based priority queue.
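A minimal sketch of the counting and filtering step, under several assumptions: it works on lines of text already read into memory rather than on File objects, it is case-sensitive, and it strips punctuation only from the ends of each token. The class name WordCountSketch, the method countWords, and the minCount parameter are hypothetical and not part of the provided classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class WordCountSketch {
    /** Counts tokens, skipping ignored words, numbers, and bare punctuation. */
    static Map<String, Integer> countWords(List<String> lines, Set<String> ignore, int minCount) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String raw : line.split("\\s+")) {
                // Assumption: punctuation is stripped from both ends of a token only.
                String token = raw.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
                if (token.isEmpty()) continue;         // token was pure punctuation
                if (token.matches("\\d+")) continue;   // token is a number
                if (ignore.contains(token)) continue;  // token is in the ignore list
                counts.merge(token, 1, Integer::sum);
            }
        }
        // Drop words that appear fewer than minCount times.
        counts.values().removeIf(c -> c < minCount);
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("We SiftDown until the heap is a heap, and the heap holds.");
        Set<String> ignore = Set.of("of", "the", "a", "in", "on");
        System.out.println(countWords(lines, ignore, 3));
    }
}
```

Each surviving (word, count) entry could then be wrapped in a Word object to build the array that wordsList returns.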

public abstract Word[] keyWordsList(File inputFile1, int N, File inputFile2)

This method returns the key word list. Within this method, you should call the wordsList method and construct a heap, then use the heap to create and return the key word list. This method is also required to display the heap structure before and after creating the key word list. The solution must be in Java. Here is a sample output of the heap structure:

Heap structure before constructing key word list:

=================================================

[R:3]

[we:8]

[value:3]

[and:9]

[priority:6]

[node:9]

[max:4]

[is:9]

[binary:6]

[heap:7]

[will:5]

[to:8]

[job:3]

[has:4]

[SiftDown:3]

[jobs:3]

[A:3]

Heap structure after constructing 10 key words list:

===================================================

[A:3]

[job:3]

[R:3]

[has:4]

[value:3]

[jobs:3]

[SiftDown:3]

And the output file containing the key word list:

==================================================

[is:9]

[and:9]

[node:9]

[we:8]

[to:8]

[heap:7]

[priority:6]

[binary:6]

[will:5]

[max:4]
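One possible way to display an array-based heap as a tree is to print one level per line. The sketch below is an assumption about layout, not the format the assignment expects; the class HeapDisplaySketch and the method levels are hypothetical names, and the heap is passed as an array of already formatted labels with the root at index 0.

```java
final class HeapDisplaySketch {
    /** Formats the first `size` slots of a heap array, one tree level per line. */
    static String levels(String[] heap, int size) {
        StringBuilder out = new StringBuilder();
        int levelStart = 0;   // index of the first node on the current level
        int levelWidth = 1;   // maximum number of nodes on the current level
        while (levelStart < size) {
            int levelEnd = Math.min(levelStart + levelWidth, size);
            for (int i = levelStart; i < levelEnd; i++) {
                if (i > levelStart) out.append(' ');
                out.append(heap[i]);
            }
            out.append('\n');
            levelStart = levelEnd;
            levelWidth *= 2;  // each level holds twice as many nodes as the one above
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] heap = {"[is:9]", "[and:9]", "[node:9]", "[we:8]", "[to:8]"};
        System.out.print(levels(heap, heap.length));
    }
}
```

Because a heap is a complete tree, the nodes of level k occupy a contiguous run of at most 2^k array slots, which is why no pointers are needed to walk the levels.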

Finally, design a driver program to test your methods. (Or you can simply include the main method in the KeyWordList class to test the methods.)

//The given data

import java.util.*;
import java.io.*;

public abstract class KeyWordListAbstract {
    // Also exclude single letters, numbers, and punctuation (,.?!)
    public abstract Word[] wordsList(File fileForKeyWords, File fileForWordsToIgnore);

    public abstract Word[] keyWordsList(File inputFile1, int N, File inputFile2);
}

//////////

public class Word implements Comparable<Word> {
    private String word;
    private int frequency;

    public Word(String word) {
        this.word = word;
        frequency = 1;
    }

    public Word(String word, int frequency) {
        this.word = word;
        this.frequency = frequency;
    }

    public String getWord() {
        return word;
    }

    public int getFrequency() {
        return frequency;
    }

    public void setWord(String word) {
        this.word = word;
    }

    public void incrementFrequency() {
        frequency++;
    }

    // Orders words by frequency; the type parameter on Comparable is needed for this signature to compile.
    @Override
    public int compareTo(Word other) {
        return Integer.compare(frequency, other.getFrequency());
    }

    @Override
    public String toString() {
        return String.format("[%s:%d]", word, frequency);
    }
}

/////////

public abstract class WordHeapAbstract {
    Word[] list;
    int lastIndex = -1;
    int maxIndex;

    public WordHeapAbstract(int maxSize) {
        lastIndex = -1;
        maxIndex = maxSize - 1;
        list = new Word[maxSize];
    }

    public boolean isEmpty() {
        return lastIndex == -1;
    }

    public boolean isFull() {
        return lastIndex == maxIndex;
    }

    public abstract void enqueue(Word data);

    public abstract Word dequeue();

    public abstract void display(); // display heap structure: either complete or full tree
}

////////

//Data files

//File1

This is the input file to create key words list.

If we ever want to know how background job works, fastest way to find k smallest elements in an array, how merging tables in database works behind the scenes, keep reading. Because in this article, we will discuss about priority queues and disjoint set. Both data structures are beautiful to solve these problems. In the end, we will solve these problems above. Happy reading Priority Queue is a Queue where each element is assigned a priority and elements com out in order by priority. Typical use case of priority queue is scheduling jobs. Each job has a priority and we process jobs in order of decreasing priority. While the current job is processed and new jobs may arrive. binary heap has 2 types: binary min heap and binary max heap. In this article, we will discuss about binary max heap. On the other side, binary min heap has the same way of implementation.

Binary max-heap is a binary tree, each node has zero, one or two children, where the value of each node is at least the values of its children.Root node R is the max value of max-heap. To pop out the root node, we just swap root node R to any leaf node A and remove node R. It may violate the property of heap. We will do SiftDown operation. As a parent node A, we want to SiftDown to children B and C, we will choose the max value of children B and C, and swap it to node A. We SiftDown until the property of heap is satisfied.

//File2

of the a in

on , . ?
