
Question


Bigram Java, Eclipse

CODES:

public class Bigram {
    // TODO: add member fields! You may have more than one.
    // You will probably want to use some kind of Map!

    /**
     * Create a new bigram model based on the text given as a String argument.
     * See the assignment for more details (and also check out the Wikipedia
     * article on bigrams).
     *
     * @param s
     *            text
     */
    public Bigram(String s) {
        // TODO: implement me!
    }

    /**
     * Check to see whether the sentence is possible according to the bigram
     * model. A sentence is possible if each bigram in the sentence was seen in
     * the text that was passed to the constructor.
     *
     * @param s
     *            Sentence
     * @return true if possible, false if not possible (some transition does not
     *         exist in the model as constructed)
     */
    public boolean check(String s) {
        // TODO: implement me!
        return false; // Fix this!
    }

    /**
     * Generate an array of strings based on the model, start word, and count.
     * You are given the start word to begin with. Each successive word should
     * be generated as the most likely or common word after the preceding word
     * according to the bigram model derived from the text passed to the
     * constructor. If more than one word is most likely, pick the smallest one
     * according to the natural String comparison order (compareTo order). Fewer
     * than count words may be generated if a dead end is reached with no
     * possibilities. If the start word never appears in the input text, only
     * that word will be generated.
     *
     * @param start
     *            Start word
     * @param count
     *            Number of words to generate (you may assume it's at least 1)
     * @return Array of generated words which begins with the start word and
     *         will usually have the length of the count argument (less if there
     *         is a dead end)
     */
    public String[] generate(String start, int count) {
        // TODO: implement me!
        return null; // Fix this! Your method should never return null!
    }
}

TEST CODES:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class BigramTest {

    public static int test(String file, byte[] xmd5, String[] gen, String[] desired, String[] check, boolean[] truth)
            throws NoSuchAlgorithmException {
        System.out.println("Loading " + file + "...");
        String text;
        try {
            text = new String(Files.readAllBytes(Paths.get(file)));
        } catch (Exception e) {
            System.out.println("Couldn't find '" + file
                    + "'. Please place this file in the root directory of this project (next to JRE System Library, not indented).");
            return 0;
        }
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(text.replaceAll("\\s+", " ").getBytes());
        //System.out.println(Arrays.toString(digest));
        if (!Arrays.equals(digest, xmd5)) {
            System.out.println("Your copy of " + file + " appears to contain errors! Please download it again.");
            return 0;
        }
        System.out.println("Loaded " + file + ". Initializing Bigram object...");
        long start = System.currentTimeMillis();
        Bigram u = new Bigram(text);
        System.out.println("Generating.");
        int genScore = 0;
        for (int i = 0; i < gen.length; i++) {
            String[] foo = u.generate(gen[i], 10);
            if (foo == null) {
                System.out.println("For start word " + gen[i] + " with 10 words, you returned a null array!");
                continue;
            }
            // Join the generated words with single spaces (no trailing space).
            String gened = "";
            for (int j = 0; j < foo.length; j++) {
                gened = gened + foo[j] + (j < foo.length - 1 ? " " : "");
            }
            if (gened.equals(desired[i])) {
                genScore += 10;
            } else {
                System.out.println("For start word " + gen[i] + " with 10 words, expected '" + desired[i] + "' got '"
                        + gened + "'.");
            }
        }
        System.out.println("Checking.");
        int checkScore = 0;
        for (int i = 0; i < check.length; i++) {
            boolean ck = u.check(check[i]);
            if (ck == truth[i]) {
                checkScore += 10;
            } else {
                System.out
                        .println("For phrase '" + check[i] + "' expected return value " + truth[i] + " but got " + ck);
            }
        }
        long end = System.currentTimeMillis();
        // Attempt at a benchmark...
        Arrays.sort(text.toLowerCase().toUpperCase().split("\\s"));
        Arrays.sort(text.toUpperCase().toLowerCase().toCharArray());
        Arrays.sort(text.split("\\s"));
        Arrays.sort(text.getBytes());
        long sortime = System.currentTimeMillis();
        //System.out.println((double)(end-start-5)/(sortime - end));
        if ((double)(end - start - 5)/(sortime - end) > 8) {
            System.out.println("Your program is taking a while! Try speeding it up for extra credit.");
        } else if ((double)(end - start - 5)/(sortime - end) > 2) {
            System.out.println("Fast, but could be faster! Takes "+(end-start)+" ms, try to get it below ~"+(2*(sortime - end)+5));
            genScore += 1;
        } else {
            System.out.println("Super fast! Took "+(end - start)+" ms");
            genScore += 1;
            checkScore += 1;
        }
        return genScore * 100 + checkScore;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        final byte[] dmd5 = { -61, 106, 118, -21, 62, -73, 33, 75, 68, -48, 38, 39, 108, 27, 95, -44 };
        final byte[] gmd5 = { -59, 120, 53, -92, 81, 59, -34, 72, 56, 2, 112, -125, 127, 50, -42, 55 };
        int checkScore = 0, genScore = 0;
        try {
            System.out.println("Trying 'Bob' example from homework.");
            Bigram x = new Bigram("Bob likes dogs. Bill likes cats. Jane hates dogs.");
            if (x.check("Bob likes cats.")) {
                checkScore += 10;
            } else {
                System.out.println("First check failed.");
            }
            if (!x.check("Jane likes cats.")) {
                checkScore += 10;
            } else {
                System.out.println("Second check failed.");
            }
            System.out.println("Trying 'Balloon' example from homework.");
            Bigram y = new Bigram("The balloon was red. The balloon got bigger and bigger. The balloon popped.");
            String[] g1 = y.generate("The", 3);
            if (Arrays.equals(g1, new String[] { "The", "balloon", "got" })) {
                genScore += 10;
            } else {
                System.out.println("First generate failed. Got " + Arrays.toString(g1));
            }
            String[] g2 = y.generate("popped.", 2);
            if (Arrays.equals(g2, new String[] { "popped." })) {
                genScore += 10;
            } else {
                System.out.println("Second generate failed. Got " + Arrays.toString(g2));
            }
            System.out.println("Testing with the Declaration of Independence...");
            int dscores = test("decl.txt", dmd5, new String[] { "When" },
                    new String[] { "When in the most barbarous ages, and to the most" },
                    new String[] { "We have Petitioned for the rectitude of this Declaration,",
                            "instrument for pretended offences For abolishing" },
                    new boolean[] { true, true });
            genScore += dscores / 100;
            checkScore += dscores % 100;
            System.out.println("Testing with Great Expectations...");
            int gscores = test("gexp.txt", gmd5, new String[] { "Pip", "dozen" },
                    new String[] { "Pip and I had been a little while, and I",
                            "dozen yards of the same time to be a little" },
                    new String[] { "low leaden hue" }, new boolean[] { false });
            genScore += gscores / 100;
            checkScore += gscores % 100;
        } finally {
            System.out.println("Check: " + checkScore + " / 50");
            System.out.println("Generate: " + genScore + " / 50");
            System.out.println("Tentative total: " + (checkScore + genScore + " / 100"));
            System.out.println("Violations of the academic honesty policy may affect this score.");
        }
    }
}

A bigram is a pair of adjacent words in a sequence. Bigrams overlap, so that in the sequence "a b. c d" the bigrams are ("a", "b."), ("b.", "c"), ("c", "d"). You will write a simple parser which builds a bigram model based on input text and will allow checking and generating sentences. To do so, you should take advantage of Java's collection classes, including Maps.

Create a class called Bigram. The class will have a constructor which takes a String. Use a Scanner with its default tokenization on the String. As long as hasNext() returns true, each call to next() will retrieve the next word. Note that some words will be capitalized differently or contain punctuation. Treat each of those differently (for example, "Dogs", "dogs", and "dogs." are all different strings).

Checking a phrase will consist of looking at each pair of adjacent words. If all adjacent pairs were seen in your input text, your code will return true, otherwise false.

Example:

Bigram x = new Bigram("Bob likes dogs. Bill likes cats. Jane hates dogs.");
x.check("Bob likes cats.") returns true: "Bob likes" and "likes cats." both appear in the input text.
x.check("Jane likes cats.") returns false: "Jane likes" does not appear in the input text.

Your phrase generation method will be given a start word and a count indicating the total number of words to generate (including the start word). It will generate the "most likely" or "most common" phrase based on bigram counts. It will return an array of Strings with the words generated in order. It always starts by generating the start word. As you generate each word, the next word generated should be the one that appears most often in the input (constructor) text after the previous word generated. If you reach a dead end (either the previous word was never seen or there are no words ever seen after that word), end generation early and return a shorter array.
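The model-building and checking steps described above can be sketched roughly as follows. This is only one possible approach, not the official solution: the class name BigramCheckSketch and the field name followers are made up for illustration (the assignment requires the class to be called Bigram), and the sketch assumes that a sentence with fewer than two words counts as possible because it contains no bigrams.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;

/** Hypothetical sketch: for each word, remember the set of words seen right after it. */
public class BigramCheckSketch {
    // followers.get(w) holds every word that directly followed w in the input text.
    private final Map<String, Set<String>> followers = new HashMap<>();

    public BigramCheckSketch(String text) {
        Scanner in = new Scanner(text); // default tokenization splits on whitespace
        String prev = null;
        while (in.hasNext()) {
            String word = in.next();
            if (prev != null) {
                // Record the bigram (prev, word).
                followers.computeIfAbsent(prev, k -> new HashSet<>()).add(word);
            }
            prev = word;
        }
        in.close();
    }

    /** Returns true only if every adjacent pair in the sentence was seen in the text. */
    public boolean check(String sentence) {
        Scanner in = new Scanner(sentence);
        String prev = null;
        while (in.hasNext()) {
            String word = in.next();
            if (prev != null) {
                Set<String> seen = followers.get(prev);
                if (seen == null || !seen.contains(word)) {
                    return false; // this transition never occurred in the input
                }
            }
            prev = word;
        }
        return true; // assumption: zero or one word means no bigram can fail
    }
}
```

With hash-based collections, both construction and checking take time roughly proportional to the number of words, which should matter for inputs approaching a million words.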
If there is more than one "most common" choice seen in the input text, pick the one with the smallest word according to the String compareTo method. (NOTE: ordered Sets and ordered Maps such as TreeSets and TreeMaps order their set (or set of keys) according to compareTo.)

Example:

Bigram y = new Bigram("The balloon was red. The balloon got bigger and bigger. The balloon popped.");
y.generate("The", 3) returns the String array ["The", "balloon", "got"].
y.generate("popped.", 2) returns ["popped."].

A tester program will be released which will test multiple larger examples. Your code should be able to work with input text containing up to a million words.
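The generation rule, including the compareTo tie-break, might be sketched like this. It is an illustrative sketch under assumptions rather than the expected answer: the class name BigramGenerateSketch and the field name counts are hypothetical, and it keeps follower counts in a TreeMap so that iterating in key order and accepting only strictly larger counts leaves the smallest word as the winner of any tie.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

/** Hypothetical sketch: count followers per word, then walk the most common ones. */
public class BigramGenerateSketch {
    // counts.get(w).get(v) = number of times v directly followed w in the text.
    // The inner TreeMap iterates its keys in String compareTo order.
    private final Map<String, TreeMap<String, Integer>> counts = new HashMap<>();

    public BigramGenerateSketch(String text) {
        Scanner in = new Scanner(text); // default tokenization splits on whitespace
        String prev = null;
        while (in.hasNext()) {
            String word = in.next();
            if (prev != null) {
                counts.computeIfAbsent(prev, k -> new TreeMap<>())
                      .merge(word, 1, Integer::sum);
            }
            prev = word;
        }
        in.close();
    }

    public String[] generate(String start, int count) {
        List<String> out = new ArrayList<>();
        out.add(start); // the start word is always generated
        String current = start;
        while (out.size() < count) {
            TreeMap<String, Integer> next = counts.get(current);
            if (next == null) {
                break; // dead end: nothing was ever seen after this word
            }
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> e : next.entrySet()) {
                // Strict '>' keeps the earliest (smallest) word among equal counts.
                if (e.getValue() > bestCount) {
                    best = e.getKey();
                    bestCount = e.getValue();
                }
            }
            out.add(best);
            current = best;
        }
        return out.toArray(new String[0]);
    }
}
```

On the balloon example, "got", "popped." and "was" each follow "balloon" once, and "got" sorts first, so generate("The", 3) yields ["The", "balloon", "got"]. If generate is called many times, the best follower for each word could be computed once in the constructor instead of being rescanned on every step.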
