Question

Can I get help coding this in JAVA?

Input File

The input file can be any ASCII file that contains text. This file can be the contents of books downloaded from the Internet, documentation for software, email from your mom, anything you want. It is highly recommended that you test the application with several files of varying size, from very small to large.

Note: You must be prepared to test your application with a file that is bigger than 20MB with more than 4 million tokens. Your application must process this file properly. I will provide this file in Slack. Be sure that you test your application with this file before code review time and compare your output to the expected output (this will also be shared in Slack).

Generating the Tokens

For the purposes of this project, you must use the regex "\\W" to split the tokens out of each line in the input file:

    inputLine.split("\\W");

You will notice that the code above results in some empty tokens, or tokens with a length of zero. You will need to write a small bit of code to ignore any empty tokens, meaning, do not pass empty tokens to your analyzers for processing.

Output Files

Note: All output files must be written to the output/ directory.

The distinct tokens file. This file will contain all the distinct tokens in the input file. The file will have one token on each line. There will not be any duplicates in the file. The file must be named distinct_tokens.txt.

The summary file. This file must be named summary.txt. It will be a listing of summary information about the analysis of a document. It will contain the following information in this order:

1. The name of the application generating the report. Make up a name for your application.
2. The name of the author of the application and their section day and time.
3. The email address of the author.
4. The absolute path of the text file that was analyzed.
5. The date and time the file was analyzed.
6. The last modified date of the analyzed file.
7. The size of the file in bytes.
8. The file URI of the analyzed file.
9. The total number of individual tokens in the document.

Example:

    Application: File Magic
    Author: Josh Kays
    Author email: JoshKays@gmail.com
    File: /home/student/thomas-paine.txt
    Date of analysis: Thu Jan 11 16:21:28 CST 2018
    Last Modified: Wed Jan 10 21:18:44 CST 2018
    File Size: 2375590
    File URI: file:/home/student/thomas-paine.txt
    Total Tokens: 397952

Hint: Review the java.io.File documentation for information on how to get some of the file-related values such as file size, absolute path, URI, etc.

Exception Handling

All places in this application which could encounter problems should use exception handling. All exceptions that happen during the running of this application must be caught and the stack trace displayed to the command window. One exception that should be tested is the case where the input file is not found on the disk. This can be easily tested by entering a file name that is not on your computer.

The Driver Class

The project will have a driver class that does the following:

- The name of the class must be Driver.
- The class will instantiate an instance of the project's main processing class.
- The class will call the main processing method of the main class, passing the command-line arguments array to the method.

No other code will be accepted for this class.

The File Processing Class

- The main controller class for the project will be named FileAnalysis.
- The class will have a constant for the valid number of command-line arguments.
- The class will have an instance variable for each Analyzer class. These variables must be named summaryAnalyzer and distinctAnalyzer.
- The class will have a method named analyze with the following signature:

      public void analyze(String[] arguments)

- The analyze method will have the following:
  - The method will first test if the correct number of arguments has been entered by the user when running the application. For project 1, this number will be 1. If the correct number is not entered, then the application must output a message to the command line asking for the right input and then terminate the program.
  - The method will then call other methods to perform these tasks:
    - Create an instance of each Analyzer class and assign each instance to its respective instance variable: summaryAnalyzer and distinctAnalyzer.
    - Open the input file.
    - Loop through all the lines of the input file and generate individual tokens.
    - Pass generated tokens to all Analyzer instances via the processToken() method.
    - Call the generateOutputFile() method for each Analyzer class in a method named writeOutputFiles().
- The class must use many methods to perform its tasks. You can expect that the instructor will ask you to break up methods into more methods. Each method should perform only one task. All loops must be in their own method and no loops may be nested within other loops. Use method calls instead.

The TokenAnalyzer Interface

The project will have an interface named TokenAnalyzer that needs to be implemented by any class that performs an analysis. The interface will have two methods:

    void processToken(String token)
    void generateOutputFile(String inputFilePath, String outputFilePath)

The Analyzer Classes

For project 1 we will have two classes that analyze the input file. One, named DistinctTokensAnalyzer, will create the report of all distinct tokens. The other, named FileSummaryAnalyzer, will create the summary report. These classes must implement the TokenAnalyzer interface.

The FileSummaryAnalyzer class.

- This class must be named FileSummaryAnalyzer.
- This class will implement the TokenAnalyzer interface.
- The class will have a zero-parameter constructor.
- The class will have the following instance variable and get method. Do not create a set method for the variable.

      // Only allowed instance variable
      private int totalTokensCount;

      public int getTotalTokensCount() {
          return totalTokensCount;
      }

- No other instance variables will be allowed for this class.

The DistinctTokensAnalyzer class.

- This class must be named DistinctTokensAnalyzer.
- This class will implement the TokenAnalyzer interface.
- The class will have a zero-parameter constructor.
- The class will have the following instance variable and get method. Do not create a set method for the variable.

      // Only allowed instance variable
      private Set distinctTokens;

      public Set getDistinctTokens() {
          return distinctTokens;
      }

- No other instance variables will be allowed for this class.
- In the zero-parameter constructor, create an instance of a TreeSet and assign it to the distinctTokens variable.
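The token-generation rule described above (split each line with "\\W", then skip zero-length tokens) could be sketched roughly as follows. This is only a minimal illustration, not the expected solution: the class name TokenSplitSketch and the helper nonEmptyTokens are hypothetical names that do not appear in the assignment. The loop sits in its own method, in keeping with the requirement that loops are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TokenSplitSketch {

    // Hypothetical helper (not named in the assignment): splits one line
    // with the required regex and drops the zero-length tokens that
    // split() produces when two non-word characters are adjacent.
    static List<String> nonEmptyTokens(String inputLine) {
        List<String> tokens = new ArrayList<>();
        for (String token : inputLine.split("\\W")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = nonEmptyTokens("Hello, world! Hello again.");
        // A TreeSet removes duplicates and keeps the tokens sorted.
        Set<String> distinct = new TreeSet<>(tokens);
        System.out.println(tokens);   // prints [Hello, world, Hello, again]
        System.out.println(distinct); // prints [Hello, again, world]
    }
}
```

In the sample line, the comma and the space that follows it are adjacent delimiters, which is what creates an empty token. The TreeSet orders "Hello" before "again" because uppercase letters sort ahead of lowercase ones in the natural String ordering.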
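The hint about java.io.File can be made concrete with a short sketch of the file-related summary lines. It assumes the labels shown in the example report. The class name FileSummarySketch and the method describe are hypothetical, and the application, author, and email lines are left out because they are fixed text.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Date;

class FileSummarySketch {

    // Hypothetical helper (not named in the assignment): formats the
    // file-related lines of the summary report, using the labels from
    // the example output.
    static String describe(File file, int totalTokens) {
        String nl = System.lineSeparator();
        return "File: " + file.getAbsolutePath() + nl
                + "Date of analysis: " + new Date() + nl
                + "Last Modified: " + new Date(file.lastModified()) + nl
                + "File Size: " + file.length() + nl
                + "File URI: " + file.toURI() + nl
                + "Total Tokens: " + totalTokens + nl;
    }

    public static void main(String[] args) {
        try {
            // A throwaway file stands in for the real input file.
            File sample = File.createTempFile("sample", ".txt");
            sample.deleteOnExit();
            Files.write(sample.toPath(),
                    "two tokens".getBytes(StandardCharsets.US_ASCII));
            System.out.print(describe(sample, 2));
        } catch (IOException exception) {
            // The assignment asks for the stack trace on the command window.
            exception.printStackTrace();
        }
    }
}
```

Date.toString() produces the same layout as the dates in the example report (for instance "Thu Jan 11 16:21:28 CST 2018"), so no extra formatting is needed here. In the real FileSummaryAnalyzer the token count would come from totalTokensCount rather than a parameter.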
