Question
Building a lexical analyzer (or scanner).
Local definition: There are five types of tokens. The lexical analyzer (scanner) is a DFA recognizing these tokens.
Keywords: make, programming, great, again, America, is, else, number, Boolean, if, as, long, tell, say, fact, lie, not, and, or, less, more, plus, times.
Identifier: any letter followed optionally by digits and/or letters.
Constants: any sequence of digits whose corresponding value is greater than 1,000,000.
Strings: any sequence of characters in a pair of and .
Special symbols: ,, ;, :, !, ?, (, ).
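As a rough illustration of the token definitions above, the following Python sketch runs a small table-driven DFA over one lexeme at a time. The state names, the `classify` function, and the labels it returns are hypothetical and not part of the assignment. Keyword matching is assumed to be case-sensitive, and strings are left out because their delimiter characters are not shown above.

```python
import string

KEYWORDS = {
    "make", "programming", "great", "again", "America", "is", "else",
    "number", "Boolean", "if", "as", "long", "tell", "say", "fact", "lie",
    "not", "and", "or", "less", "more", "plus", "times",
}
SPECIAL = set(",;:!?()")

# DFA states (names are assumptions made for this sketch).
START, IN_ID, IN_NUM, IN_SPECIAL, DEAD = range(5)


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    if ch in SPECIAL:
        return "special"
    return "other"


# Any (state, class) pair missing from this table leads to DEAD.
TRANSITIONS = {
    (START, "letter"): IN_ID,
    (START, "digit"): IN_NUM,
    (START, "special"): IN_SPECIAL,
    (IN_ID, "letter"): IN_ID,
    (IN_ID, "digit"): IN_ID,
    (IN_NUM, "digit"): IN_NUM,
}


def classify(lexeme):
    """Run the DFA over one lexeme and label the result."""
    state = START
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)), DEAD)
    if state == IN_ID:
        return "keyword" if lexeme in KEYWORDS else "identifier"
    if state == IN_NUM:
        # A digit sequence is a legal constant only above 1,000,000.
        return "constant" if int(lexeme) > 1_000_000 else "const error"
    if state == IN_SPECIAL:
        return "special symbol"
    return "error"
```

Under these assumptions, `classify("y1")` yields `"identifier"`, while `classify("1000000")` yields `"const error"` because the value is not greater than 1,000,000.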
Write a program (in Java or any programming language of your choice) with subprocedures/classes SCANNER(), BOOKKEEPER(), and ERRORHANDLER() that handle the five types of tokens defined above.
Construct SCANNER() from a DFA accepting these tokens. A blank (many consecutive blanks are the same as a single blank), line break or special symbol separates two tokens. Symbols following # (up to a line break) are comments; # and these comment symbols must be ignored by the scanner. The symbols and used to define a string must also be ignored by the SCANNER(). Call SCANNER() from the main body of the program, once for each token to be recognized, until all symbols in the given input program are consumed. Thus the main body will contain a loop that calls SCANNER(), and SCANNER() consists of blocks of code for the states of the DFA recognizing the five types of tokens, as discussed in class. Using a method not implementing a DFA will result in zero credit for this project.
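The separation rules above (blanks, line breaks, special symbols, and # comments) can be sketched as a small state machine in Python. The `lexemes` generator and its three state names are hypothetical. It is also assumed that a # ends any token in progress, and string delimiters are not handled because those characters are not shown in the text.

```python
SPECIAL = set(",;:!?()")

# Scanner states (names are assumptions made for this sketch).
BETWEEN, IN_TOKEN, IN_COMMENT = range(3)


def lexemes(source):
    """Yield one lexeme at a time, skipping blanks and # comments."""
    state = BETWEEN
    current = []
    for ch in source:
        if state == IN_COMMENT:
            # Ignore everything up to the line break.
            if ch == "\n":
                state = BETWEEN
            continue
        if ch == "#" or ch.isspace() or ch in SPECIAL:
            # A separator ends the token in progress, if there is one.
            if state == IN_TOKEN:
                yield "".join(current)
                current = []
            # A special symbol is itself a token.
            if ch in SPECIAL:
                yield ch
            state = IN_COMMENT if ch == "#" else BETWEEN
        else:
            current.append(ch)
            state = IN_TOKEN
    # Flush a token that runs up to the end of the input.
    if state == IN_TOKEN:
        yield "".join(current)
```

A main loop in this style could request one lexeme per iteration, for example `for lexeme in lexemes(text): ...`, which approximates calling SCANNER() once per token.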
Call BOOKKEEPER() from SCANNER() when an identifier, constant, or string is recognized. It is responsible for maintaining a symbol table SYMTAB (of size 100) to store tokens passed from SCANNER() and their attributes, i.e., classification as to identifier, constant, or string. Each identifier, constant or string must appear exactly once in SYMTAB.
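One possible shape for the symbol table described above is shown in the Python sketch below. The `SymbolTable` class and its method names are hypothetical. Treating an entry as the pair of token and classification is an assumption, and so is raising an error when the 100 entries are used up, since the text does not say what should happen in that case.

```python
SYMTAB_SIZE = 100  # capacity stated in the assignment


class SymbolTable:
    """Stores each (token, kind) pair exactly once, in insertion order."""

    def __init__(self, capacity=SYMTAB_SIZE):
        self.capacity = capacity
        self.entries = []  # list of (token, kind) tuples

    def bookkeeper(self, token, kind):
        """Record a token and return its index in the table."""
        entry = (token, kind)
        if entry in self.entries:
            # Already stored: do not add a second copy.
            return self.entries.index(entry)
        if len(self.entries) >= self.capacity:
            # Overflow behavior is an assumption made for this sketch.
            raise OverflowError("SYMTAB is full")
        self.entries.append(entry)
        return len(self.entries) - 1
```

With this layout, recording `("x", "identifier")` twice leaves a single entry, so printing `entries` at the end lists every identifier, constant, and string once.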
Call ERRORHANDLER() from SCANNER() when an illegal token is recognized. It is responsible for producing appropriate error messages. There are three types of error messages; no information other than one of these three (such as the location of the error and/or possible error correction) should be printed.
[id] error: This is a country where we speak English.
[const] error: I'm really rich, part of the beauty of me is I'm very rich.
Any other error: Trump doesn't want to hear it.
For output, print out the following:
Print the input program exactly as it is stored in your input file.
For each token, if it is a legal one then print the token and its type. If illegal, then print the token and an error message.
Print the content of SYMTAB.
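The three error messages and the per-token output lines could be produced along the lines of the Python sketch below. The `error_handler` and `report` functions, the `"id"` and `"const"` keys, and the exact layout of each output line are assumptions. The message strings follow the three listed above, written with standard apostrophes.

```python
ERROR_MESSAGES = {
    "id": "This is a country where we speak English.",
    "const": "I'm really rich, part of the beauty of me is I'm very rich.",
}
OTHER_ERROR = "Trump doesn't want to hear it."


def error_handler(kind):
    """Return one of the three messages; no location or hint is added."""
    return ERROR_MESSAGES.get(kind, OTHER_ERROR)


def report(token, token_type=None, error_kind=None):
    """Format one output line for a legal or an illegal token."""
    if error_kind is None:
        # Legal token: show the token and its type.
        return f"{token}\t{token_type}"
    # Illegal token: show the token and the matching error message.
    return f"{token}\t{error_handler(error_kind)}"
```

For example, `report("1000000", error_kind="const")` pairs the token with the [const] message, while `report("x", "identifier")` pairs a legal token with its type.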
Your program must scan the whole source program, finding all legal and illegal tokens. Run your program on the following input:
Make programming great again # main body begins
Make x number make y1 z2zz 1w numbers Make a b Boolean
X is 1000000 y1 is 2000000 z is 123456789
A is fact b is lie
As long as, fact or lie;
:
Tell x y1 z2zz say continue
If, x plus (y) times 2000000 more z? ; : tell a b say stop ! else : make c Boolean ! C is not not not fact and x less z ? or lie
Tell a b c x y z
Say done # say done
!
America is great