
Question


Purpose

Warm-up with Java programming. Get familiar with regular expression. Understand the wide application of regular expression.

Assignment specification

Your job is to count the number of identifiers in programs written in our Tiny language.

The Tiny Language Definition

Here is the definition for the Tiny language.

The lexicon of the Tiny language is defined as follows:

  • Keywords: WRITE READ IF ELSE RETURN BEGIN END MAIN STRING INT REAL
  • Single-character separators: ; , ( )
  • Single-character operators: + - * /
  • Multi-character operators: := == !=
  • Identifier: An identifier consists of a letter followed by any number of letters or digits. The following are examples of identifiers: x, x2, xx2, x2x, End, END2. Note that End is an identifier while END is a keyword. The following are not identifiers:
    • IF, WRITE, READ, .... (keywords are not counted as identifiers)
    • 2x (an identifier cannot start with a digit)
    • Strings in comments are not identifiers.
  • Number: a sequence of digits, or a sequence of digits followed by a dot and then by more digits.
    Number -> Digits | Digits '.' Digits
    Digits -> Digit | Digit Digits
    Digit  -> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
  • Comments: string between /** and **/. Comments can be longer than one line.
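
The lexical rules above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name TinyLexicon and the method classify are hypothetical, and because the sketch relies on java.util.regex it would only suit the regex-based variant of a solution.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

class TinyLexicon {
    // Keywords copied from the lexicon; matching is case-sensitive,
    // so "End" is not a keyword while "END" is.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "WRITE", "READ", "IF", "ELSE", "RETURN", "BEGIN", "END",
        "MAIN", "STRING", "INT", "REAL"));

    // A letter followed by any number of letters or digits.
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // Number -> Digits | Digits '.' Digits
    static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    // Classifies one already-separated token (hypothetical helper).
    static String classify(String token) {
        if (KEYWORDS.contains(token)) return "keyword";
        if (IDENTIFIER.matcher(token).matches()) return "identifier";
        if (NUMBER.matcher(token).matches()) return "number";
        return "other";
    }

    public static void main(String[] args) {
        for (String t : new String[] {"End", "END", "END2", "x2x", "2x", "3.14"}) {
            System.out.println(t + " -> " + classify(t));
        }
    }
}
```

The keyword check runs before the identifier check because every keyword also has the shape of an identifier.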

    The EBNF Grammar

    High-level program structures

    Program      -> MethodDecl MethodDecl*
    Type         -> INT | REAL | STRING
    MethodDecl   -> Type [MAIN] Id '(' FormalParams ')' Block
    FormalParams -> [FormalParam ( ',' FormalParam )* ]
    FormalParam  -> Type Id

    Statements

    Block        -> BEGIN Statement+ END
    Statement    -> Block
                  | LocalVarDecl
                  | AssignStmt
                  | ReturnStmt
                  | IfStmt
                  | WriteStmt
                  | ReadStmt
    LocalVarDecl -> Type Id ';' | Type AssignStmt
    AssignStmt   -> Id := Expression ';'
                  | Id := QString ';'
    ReturnStmt   -> RETURN Expression ';'
    IfStmt       -> IF '(' BoolExpression ')' Statement
                  | IF '(' BoolExpression ')' Statement ELSE Statement
    WriteStmt    -> WRITE '(' Expression ',' QString ')' ';'
    ReadStmt     -> READ '(' Id ',' QString ')' ';'

    QString is any sequence of characters except the double quote itself, enclosed in double quotes.

    Expressions

    Expression         -> MultiplicativeExpr (( '+' | '-' ) MultiplicativeExpr)*
    MultiplicativeExpr -> PrimaryExpr (( '*' | '/' ) PrimaryExpr)*
    PrimaryExpr        -> Num    // Integer or Real numbers
                        | Id
                        | '(' Expression ')'
                        | Id '(' ActualParams ')'
    BoolExpression     -> Expression '==' Expression
                        | Expression '!=' Expression
    ActualParams       -> [Expression ( ',' Expression)*]

    Sample program

     /** this is a comment line in the sample program **/
     INT f2(INT x, INT y )
     BEGIN
         INT z;
         z := x*x - y*y;
         RETURN z;
     END
     INT MAIN f1()
     BEGIN
         INT x;
         READ(x, "A41.input");
         INT y;
         READ(y, "A42.input");
         INT z;
         z := f2(x,y) + f2(y,x);
         WRITE (z, "A4.output");
     END

You should pick out the identifiers from a text file and write the output to a text file named A1.output. The output file should contain a line like "identifiers:5". The input will have multiple lines. Here are the sample input and output files.

INPUT FILE:

INT f2(INT x, INT y )
BEGIN
    z := x*x - y*y;
    RETURN z;
END
INT MAIN F1()
BEGIN
    INT x;
    READ(x, "A41.input");
    INT y;
    READ(y, "A42.input");
    INT z;
    z := f2(x,y) + f2(y,x);
    WRITE (z, "A4.output");
END

OUTPUT FILE:

identifiers:5

The five identifiers counted are f2, x, y, z, and F1. Please note that in this sample program the following are not counted as identifiers:

  • A41, input, output: they are quoted, hence they are not treated as identifiers;
  • INT, READ, etc.: they are keywords of our Tiny language, hence they should not be picked up.
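
The two exclusions above (quoted strings and keywords) can be sketched with java.util.regex as shown here. This is a minimal sketch rather than the sample solution: the class name RegexIdCounter, the method countIdentifiers, and the choice to consume each quoted string as one match are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexIdCounter {
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "WRITE", "READ", "IF", "ELSE", "RETURN", "BEGIN", "END",
        "MAIN", "STRING", "INT", "REAL"));

    // First alternative: a whole quoted string, consumed so its contents are never
    // seen as words. Second alternative (group 1): a run of letters and digits.
    static final Pattern TOKEN = Pattern.compile("\"[^\"]*\"|([A-Za-z0-9]+)");

    // Counts distinct identifiers in the given source text.
    static int countIdentifiers(String source) {
        Set<String> ids = new HashSet<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String word = m.group(1); // null when a quoted string matched
            if (word == null) continue;
            char first = word.charAt(0);
            boolean startsWithLetter =
                (first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z');
            // Runs such as "2x" start with a digit and are skipped entirely.
            if (startsWithLetter && !KEYWORDS.contains(word)) {
                ids.add(word);
            }
        }
        return ids.size();
    }

    public static void main(String[] args) {
        String sample = "INT MAIN F1() BEGIN INT x; READ(x, \"A41.input\"); END";
        System.out.println("identifiers:" + countIdentifiers(sample));
    }
}
```

Collecting the words in a Set means that repeated occurrences of the same identifier are counted once.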

Here are the test cases for the assignment:

case 1

 INT f2(INT x, INT y )
 BEGIN
     z := x*x - y*y;
     RETURN z;
 END
 INT MAIN F1()
 BEGIN
     INT x;
     READ(x, "A41.input");
     INT y;
     READ(y, "A42.input");
     INT z;
     z := f2(x,y) + f2(y,x);
     WRITE (z, "A4.output");
 END

case 2

 BEGIN END INT MAIN AAA, X23Y, F1i A END

case 3

aaa&bbb!ccc#ddd%eee&ffff

case 4

 INT f2(INT x, INT y )
 BEGIN
     z := x*x - y*y;
     RETURN z;
 END
 INT MAIN f1()
 BEGIN
     x2222x
     INT x1;
     READ(x, "A41. input");
     INT y;
     READ(y, "A42.input");
     INT z;
     z := f2(x,y) + f2(y,x);
     WRITE (z, "A4.output");
 END

case 5

 "A41.input" AA BB CCCC2DDD aaabbb if b c d IF 

case 6

AA1 BBBB22222 CCC3333DDDDD A- B- C- xx_yy_~!@#$%^&*()_+}{}[]\|\":";'?><,./  zz BEGIN END 

The expected identifier counts for cases 1 through 6 are 5, 4, 6, 7, 8, and 9, respectively.

If you cannot pass test case 6 and you use [^a-zA-Z] as the delimiter in Scanner, you can solve the problem by specifying the encoding as UTF-8, i.e., use

 new Scanner(yourFile, "UTF-8")... 
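
A small sketch of the encoding hint follows. It is an illustration only: the class ScannerEncodingSketch and the method letterRuns are hypothetical, and the sketch reads from an InputStream so that it is self-contained (java.util.Scanner offers the same charset-name argument for a File). Note that a letters-only delimiter also splits at digits, so x2 would come out as x; the sketch demonstrates the charset argument, not a complete tokenizer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerEncodingSketch {
    // Decodes the stream as UTF-8 and returns the runs of ASCII letters in it.
    static List<String> letterRuns(InputStream in) {
        List<String> runs = new ArrayList<>();
        try (Scanner sc = new Scanner(in, "UTF-8")) {
            // One or more non-letters act as a single delimiter.
            sc.useDelimiter("[^a-zA-Z]+");
            while (sc.hasNext()) {
                runs.add(sc.next());
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // \u201C and \u201D are curly quotes, i.e. multi-byte characters in UTF-8.
        byte[] bytes = "zz \u201Cquoted\u201D BEGIN".getBytes(StandardCharsets.UTF_8);
        System.out.println(letterRuns(new ByteArrayInputStream(bytes)));
    }
}
```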

In this assignment you can suppose that there are no comments in the programs.

In the output file you should only write "identifiers:" followed by the number of identifiers. If there are multiple occurrences of an identifier in the input, you should only count it once. Don't write anything else into the output file.

You will write two different programs to do this:

  1. Program A11.java must not use regular expressions: neither the regex package nor the methods involving regular expressions in the String class or other classes. Your program can look at characters one by one, and use a loop to check whether they form quoted strings, identifiers, etc.
  2. Program A12.java will use java.util.regex. A useful starting point is this tutorial on Java regex: https://www.regular-expressions.info/java.html
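
The character-by-character approach described for the first program might look roughly like the sketch below. It is not the sample solution: the class name CharScanIdCounter, the helper names, and the handling of an unterminated quote (treated as an ordinary character) are assumptions. It uses only charAt, indexOf with a character argument, and substring, none of which involve regular expressions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CharScanIdCounter {
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "WRITE", "READ", "IF", "ELSE", "RETURN", "BEGIN", "END",
        "MAIN", "STRING", "INT", "REAL"));

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Counts distinct identifiers by walking over the text one character at a time.
    static int countIdentifiers(String src) {
        Set<String> ids = new HashSet<>();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"') {
                // Skip a quoted string when a closing quote exists.
                int close = src.indexOf('"', i + 1);
                i = (close < 0) ? i + 1 : close + 1;
            } else if (isLetter(c) || isDigit(c)) {
                // Consume the whole run of letters and digits.
                int start = i;
                while (i < n && (isLetter(src.charAt(i)) || isDigit(src.charAt(i)))) {
                    i++;
                }
                String word = src.substring(start, i);
                // Keep the run only if it starts with a letter and is not a keyword.
                if (isLetter(c) && !KEYWORDS.contains(word)) {
                    ids.add(word);
                }
            } else {
                i++;
            }
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("identifiers:" + countIdentifiers("aaa&bbb!ccc#ddd%eee&ffff"));
    }
}
```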

Your programs should be able to run by typing:

 % javac A11.java
 % java A11 A1.tiny
 % javac A12.java
 % java A12 A1.tiny

In this assignment, the output should be in a file called "A1.output". You should not use keyboard input. The input file name will be provided as the argument of the program, while the output file name is hard coded in your programs. i.e., your code regarding input and output can be like the following:

 ...
 new BufferedReader(new FileReader(args[0]));
 ...
 new BufferedWriter(new FileWriter("A1.output"));
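
The input and output handling described above could be wrapped as in the following sketch. The class TinyIo and the methods readSource, formatResult, and writeResult are hypothetical names; only the use of args[0] for input and the hard-coded "A1.output" come from the assignment text.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class TinyIo {
    // Reads the whole input file (e.g. args[0]) into one string, keeping line breaks.
    static String readSource(String path) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Builds the single line the output file must contain.
    static String formatResult(int count) {
        return "identifiers:" + count;
    }

    // Writes the result to the hard-coded output file name.
    static void writeResult(int count) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter("A1.output"))) {
            out.write(formatResult(count));
        }
    }
}
```

A main method would then call readSource(args[0]), pass the text to whichever counting routine the program uses, and hand the resulting count to writeResult.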

Your program should be tested on luna or bravo.

Please don't write unnecessarily long programs. The sample solutions for A11 and A12 consist of approximately 300 words altogether, as counted by the PHP function str_word_count(). They were not deliberately written to be short and can easily be compacted further. Hence one mark is given if your word count is smaller than 300.

Marking Scheme

If your code is shorter than or equal to 72 words according to our website (or 32 according to wc, the word count command in Unix), you will receive one bonus mark, i.e., the total mark could be 5+1=6.

 yourMark=0;
 if (A11.java, A12.java are not sent properly) return;
 for (each of A11, A12)
     if (it is compiled correctly) yourMark+=0.2;
 for (each of A11, A12){
     if (your java program reads A1.tiny && generates result file A1.output)
         for (each of the 6 test cases)
             if (it is correct) yourMark+=0.3;
     if (yourCode.length() < average(length of A11 in the class))
         yourMark+=0.5;
 }
 for (each day of your late submission) yourMark=yourMark*0.8;

One bonus mark for the shortest code among the class.
