Question
For this assignment, you will use your knowledge of arrays and ArrayLists to write a Java program that will input a file of sentences and output a report showing the tokens and shingles (defined below) for each sentence.
Templates are provided below for implementing the program as two separate files: a test driver class containing the main() method, and a sentence utilities class that computes the tokens and shingles, and reports their values.
The test driver template already implements accepting the input file name as a command line argument to the program. This will allow the graders to test your program against several different input files. You will not know in advance the input file names that the graders will use.
It is your job to add the necessary code to the templates to make the program process the input to produce the required output. Detailed instructions for each class are provided below.
The input file will be in the format specified below. A sample input file that you can use for development is also described below. However, we will test the program with different input files with different names, so you should not hard code any file name in your program.
Program output must be to the console (screen) and must conform to the format of the sample output below, which corresponds to the sample input file that is provided to you.
Input File Format
The input file to the program will be a text file. However, the file name may or may not include a ".txt" file extension, so do not test whether the name includes ".txt", and do not add ".txt" if it is not present. Take the command line argument and use it as the file name exactly as given.
There will be one sentence on each line. The sentences may include upper and lower case letters, numbers, punctuation marks, and special characters.
There may be one or more empty lines at the end of the file, which your program should be able to ignore.
You should use the provided cats.txt file for development testing to make sure your program works correctly. If for some reason the file is not compatible with your system, you can create the file yourself with the following contents:
If you create the file on your own, please be sure to create it as a plain text file, not a "rich-text" (".rtf") file.
Also, please note that the file contains an empty line after the third line. This is intentional. Your program should successfully ignore empty lines like this.
SentenceUtilsTest.java
This file contains the test driver class. The template code already reads in the input file name as a command line argument and instantiates a Scanner to read the file. This is the template for this source file:
Your job for this Java class is to create the file by typing in the code above, and then add the necessary Java statements to do the following:
use the scanner to read the file and invoke the SentenceUtils constructor to create (instantiate) a SentenceUtils object for each sentence; you must be sure to ignore empty lines as you do this;
add the newly created object to the List of SentenceUtils objects called "slist" that is already declared and created as a class variable for you in the template; and
loop through the SentenceUtils objects in the list, and for each such object, output a sentence heading showing the sentence number, and invoke its "report()" instance method, which you will also write (see the next section, below).
Please note that the sentence numbering must be done in this test driver class, not in SentenceUtils, since a particular sentence has no way of knowing the order in which it appears in the "slist" list that is maintained by this SentenceUtilsTest class. The number of a sentence should be reported as the zero-based index of the sentence's entry in slist.
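The read-and-number steps above can be sketched as follows. This is only an illustration under stated assumptions, not the provided template: the class name DriverSketch, the helpers readSentences() and heading(), and the heading text are all hypothetical, and whitespace-only lines are treated as empty, which the assignment does not state. In the real driver, each kept line would be passed to the SentenceUtils constructor and the resulting object added to slist.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DriverSketch {

    // Collects every non-empty line from the scanner, preserving order.
    // Assumption: lines containing only whitespace count as empty.
    static List<String> readSentences(Scanner in) {
        List<String> sentences = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.trim().isEmpty()) {
                continue; // ignore empty lines, wherever they appear
            }
            // In the real driver: slist.add(new SentenceUtils(line));
            sentences.add(line);
        }
        return sentences;
    }

    // Placeholder heading text; the required wording comes from the sample output.
    static String heading(int index) {
        return "Sentence " + index + ":";
    }

    public static void main(String[] args) {
        // A String stands in for the input file so the sketch runs on its own.
        Scanner in = new Scanner("First sentence.\n\nSecond sentence.\n\n");
        List<String> sentences = readSentences(in);
        // The zero-based list index doubles as the sentence number.
        for (int i = 0; i < sentences.size(); i++) {
            System.out.println(heading(i));
            System.out.println(sentences.get(i));
        }
    }
}
```

Numbering from the loop index keeps the sentence number in the driver, which matches the requirement that SentenceUtils not know its own position in the list.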
SentenceUtils.java
This file contains the sentence utilities class. The template code already specifies the class members and implements the constructor. The constructor already places the input sentence into the "sentence" variable (member). The template also contains empty stubs for the methods that you will need to implement. This is the template for this source file:
Your job for this Java class is to create the file by typing in the code above, and then add the necessary Java statements to do the following:
(1) You must implement the "generateTokens()" method to chop up a sentence into its tokens and to place these tokens into the String array "tokens". A "token" is any whitespace-separated character string, so you may use a Scanner on the sentence String to give you the tokens one by one.
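A minimal sketch of Scanner-based tokenizing is shown here. The class name TokenSketch and the static method tokenize() are hypothetical stand-ins; the actual generateTokens() stub in the template may have a different signature and would store its result in the "tokens" member rather than return it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class TokenSketch {

    // Splits a sentence on whitespace using Scanner's default delimiter.
    static String[] tokenize(String sentence) {
        List<String> found = new ArrayList<>();
        try (Scanner sc = new Scanner(sentence)) {
            while (sc.hasNext()) {
                found.add(sc.next()); // punctuation stays attached to its token
            }
        }
        // Copy into an array, since the assignment stores tokens in a String array.
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("The cat,  sat.");
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(i + ":" + tokens[i]);
        }
    }
}
```

Collecting into an ArrayList first avoids having to know the token count before allocating the array.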
(2) You must also implement the "generateShingles()" method to chop up the sentence into 2-character "shingles". A "shingle" is simply a String that contains two consecutive characters. They are called shingles because they must overlap, that is, the first character of the next shingle is the second character of the previous shingle.
For example, consider the String "banana split". The shingles for this string are:
'ba' 'an' 'na' 'an' 'na' 'a ' ' s' 'sp' 'pl' 'li' 'it'
Please note that this list includes duplicates. It also includes the space character, which appears in the shingles 'a ' and ' s'.
For our purposes, you should include all whitespace, punctuation, special characters, and numbers in our shingles, exactly as they appear. Also, you should not convert upper case letters to lower case, or vice versa.
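The shingle definition above can be sketched with a single loop over character positions. The class name ShingleSketch and the static method shingles() are hypothetical; the template's generateShingles() stub may be declared differently and may store its result in a member variable instead of returning it.

```java
public class ShingleSketch {

    // Returns every overlapping 2-character substring, in order.
    // A string of length n yields n - 1 shingles (none if n < 2).
    static String[] shingles(String sentence) {
        int count = Math.max(0, sentence.length() - 1);
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            // Characters are copied exactly as they appear: no trimming, no case change.
            result[i] = sentence.substring(i, i + 2);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : shingles("banana split")) {
            System.out.print("'" + s + "' ");
        }
        System.out.println();
    }
}
```

For "banana split" (12 characters) this produces the 11 shingles listed in the example, including the duplicates and the two that contain the space.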
(3) Finally, you should also implement the "report()" method. This method should output to the console the following information, as shown in the sample output: (a) the full sentence (all on one line); (b) the individual tokens, numbered as shown, one to each line; and (c) all of the shingles, ten (10) on each line, separated by spaces, as shown in the sample output below. Each shingle should be surrounded by single quotation marks to make it clear when shingles contain spaces, as in the example above and the sample output below.
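The ten-per-line shingle layout in part (c) can be sketched as below. This is an assumption-laden illustration: the class name ReportSketch and the method formatShingles() are hypothetical, and details such as trailing spaces or labels must be taken from the sample output, which this sketch does not try to reproduce.

```java
public class ReportSketch {

    // Builds the shingle portion of a report: each shingle in single quotes,
    // separated by spaces, with a line break after every tenth shingle.
    static String formatShingles(String[] shingles) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < shingles.length; i++) {
            sb.append('\'').append(shingles[i]).append('\'');
            boolean endOfRow = (i % 10 == 9) || (i == shingles.length - 1);
            sb.append(endOfRow ? "\n" : " ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] shingles = {"ba", "an", "na", "an", "na", "a ", " s", "sp", "pl", "li", "it"};
        System.out.print(formatShingles(shingles));
    }
}
```

Testing the index with i % 10 == 9 starts a new row after the tenth, twentieth, and so on, and the second condition ends the final partial row with a line break as well.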
Output Format:
Your program should present all output to the console (System.out). The output report should be in the following format: