Question
In this programming assignment, you will build a lexical analyzer for a small programming language, called Simple Pascal-Like, and a program to test it. This assignment will be followed by two other assignments to build a parser and an interpreter for the language. Although we are not concerned with the syntax definitions of the language in this assignment, we introduce them ahead of Programming Assignment 2 in order to determine the language terminals: reserved words, constants, identifiers, and operators. The syntax definitions of the Simple Pascal-Like language are given below in EBNF notation. The details of the meanings (i.e., semantics) of the language constructs will be given later.
1. Prog ::= PROGRAM IDENT ; DeclPart CompoundStmt .
2. DeclPart ::= VAR DeclStmt { ; DeclStmt }
3. DeclStmt ::= IDENT {, IDENT } : Type [:= Expr]
4. Type ::= INTEGER | REAL | BOOLEAN | STRING
5. Stmt ::= SimpleStmt | StructuredStmt
6. SimpleStmt ::= AssignStmt | WriteLnStmt | WriteStmt
7. StructuredStmt ::= IfStmt | CompoundStmt
8. CompoundStmt ::= BEGIN Stmt {; Stmt } END
9. WriteLnStmt ::= WRITELN (ExprList)
10. WriteStmt ::= WRITE (ExprList)
11. IfStmt ::= IF Expr THEN Stmt [ ELSE Stmt ]
12. AssignStmt ::= Var := Expr
13. Var ::= IDENT
14. ExprList ::= Expr { , Expr }
15. Expr ::= LogOrExpr ::= LogAndExpr { OR LogAndExpr }
16. LogAndExpr ::= RelExpr {AND RelExpr }
17. RelExpr ::= SimpleExpr [ ( = | < | > ) SimpleExpr ]
18. SimpleExpr ::= Term { ( + | - ) Term }
19. Term ::= SFactor { ( * | / | DIV | MOD ) SFactor }
20. SFactor ::= [( - | + | NOT )] Factor
21. Factor ::= IDENT | ICONST | RCONST | SCONST | BCONST | (Expr)
Lexical Rules for Tokens to be Recognized:
- Identifiers (IDENT)
  - IDENT := Letter {( Letter | Digit | _ | $ )}
  - Letter := [ a-z A-Z ]
  - Digit := [0-9]
  - Note that all identifiers are case sensitive.
- Integer constants (ICONST)
  - ICONST := [0-9]+
- Real constants (RCONST)
  - RCONST := ([0-9]+)\.([0-9]*)
  - For example, 12.0, 0.2, and 2. are accepted as real constants, but .2 and 2.45.2 are not. Note that ".2" is recognized as a dot (CAT operator) followed by the integer constant 2.
- String constants (SCONST)
  - A string literal is a sequence of characters delimited by single quotes, all of which must appear on the same line.
  - For example:
    - 'Hello to CS 280.' is a string literal.
    - "Hello to CS 280." and 'Hello to CS 280." are not.
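The ICONST and RCONST rules above can be checked against a complete lexeme with a small hand-written matcher. The following is only a sketch: the names NumKind and classifyNumber are hypothetical and not part of lex.h, and a real getNextToken would read characters from the stream one at a time rather than inspect a finished string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; lex.h uses the Token enum instead.
enum class NumKind { Integer, Real, NotANumber };

// Returns true when c is a decimal digit (the cast avoids undefined
// behavior for negative char values).
static bool isDigitChar(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Classifies a whole lexeme according to
//   ICONST := [0-9]+
//   RCONST := ([0-9]+)\.([0-9]*)
NumKind classifyNumber(const std::string& s) {
    std::size_t i = 0;
    // At least one leading digit is required by both rules.
    while (i < s.size() && isDigitChar(s[i])) ++i;
    if (i == 0) return NumKind::NotANumber;      // e.g. ".2"
    if (i == s.size()) return NumKind::Integer;  // digits only
    if (s[i] != '.') return NumKind::NotANumber;
    ++i;                                         // consume the dot
    // Zero or more digits may follow the dot, so "2." is a real constant.
    while (i < s.size() && isDigitChar(s[i])) ++i;
    // Anything left over (such as the second dot in "2.45.2") is rejected.
    return i == s.size() ? NumKind::Real : NumKind::NotANumber;
}
```

Under these assumptions, classifyNumber("2.") yields NumKind::Real, while ".2" and "2.45.2" yield NumKind::NotANumber, matching the examples in the rules.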
- The reserved words of the language and their corresponding tokens are:

Reserved Word | Token
and | AND
begin | BEGIN
boolean | BOOLEAN
div | IDIV
end | END
else | ELSE
false | FALSE
if | IF
integer | INTEGER
mod | MOD
not | NOT
or | OR
program | PROGRAM
real | REAL
string | STRING
write | WRITE
writeln | WRITELN
var | VAR
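One common way to separate reserved words from identifiers is a table lookup on the scanned word. The sketch below is not the required id_or_kw from lex.h: wordToTokenName is a hypothetical helper that returns token names as strings so the example stays self-contained. It contains only the words listed in the table above and assumes that reserved words match in lowercase exactly as written there.

```cpp
#include <map>
#include <string>

// Hypothetical helper: maps a scanned word to the name of its token.
// Words that are not in the reserved-word table are treated as IDENT.
std::string wordToTokenName(const std::string& lexeme) {
    static const std::map<std::string, std::string> reserved = {
        {"and", "AND"},         {"begin", "BEGIN"},     {"boolean", "BOOLEAN"},
        {"div", "IDIV"},        {"end", "END"},         {"else", "ELSE"},
        {"false", "FALSE"},     {"if", "IF"},           {"integer", "INTEGER"},
        {"mod", "MOD"},         {"not", "NOT"},         {"or", "OR"},
        {"program", "PROGRAM"}, {"real", "REAL"},       {"string", "STRING"},
        {"write", "WRITE"},     {"writeln", "WRITELN"}, {"var", "VAR"},
    };
    // Assumption: the lookup is an exact, case-sensitive match.
    const auto it = reserved.find(lexeme);
    return it == reserved.end() ? "IDENT" : it->second;
}
```

A real id_or_kw would return a LexItem built from the matching Token value and the line number instead of a string, but the lookup itself could follow the same shape.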
- The operators of the language and their corresponding tokens are:

Operator Symbol/Keyword | Token | Description
+ | PLUS | Arithmetic addition or concatenation
- | MINUS | Arithmetic subtraction
* | MULT | Multiplication
/ | DIV | Division
:= | ASSOP | Assignment operator
= | EQ | Equality
< | LTHAN | Less than operator
> | GTHAN | Greater than operator
and | AND | Logical ANDing (conjunction)
or | OR | Logical ORing (disjunction)
not | NOT | Logical complement (negation)
div | IDIV | Integer division (truncated)
mod | MOD | Modulo
- The delimiters of the language are:

Character | Token | Description
, | COMMA | Comma
; | SEMICOL | Semicolon
( | LPAREN | Left parenthesis
) | RPAREN | Right parenthesis
: | COLON | Colon
. | DOT | Dot
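Most symbolic operators and delimiters are a single character, but ":" needs one character of lookahead to tell COLON from ASSOP (":="). The sketch below shows that idea under stated assumptions: scanSymbol is a hypothetical helper, it returns token names as strings rather than LexItem objects, and it expects the next character in the stream to be the start of a symbol.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one operator or delimiter from the stream
// and returns the name of its token ("ERR" for an unknown character,
// "DONE" at end of input).
std::string scanSymbol(std::istream& in) {
    char ch;
    if (!in.get(ch)) return "DONE";
    switch (ch) {
        case '+': return "PLUS";
        case '-': return "MINUS";
        case '*': return "MULT";
        case '/': return "DIV";
        case '=': return "EQ";
        case '<': return "LTHAN";
        case '>': return "GTHAN";
        case ',': return "COMMA";
        case ';': return "SEMICOL";
        case '(': return "LPAREN";
        case ')': return "RPAREN";
        case '.': return "DOT";
        case ':':
            // Look ahead without consuming: ":=" is ASSOP, ":" alone is COLON.
            if (in.peek() == '=') {
                in.get(ch);  // consume the '='
                return "ASSOP";
            }
            return "COLON";
        default:
            return "ERR";
    }
}
```

Using peek() keeps the character after a lone ":" in the stream, so the next call sees it unchanged.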
- A comment consists of all the characters from the starting delimiter "{" to the closing delimiter "}". Comments may span more than one line (multi-line comments). A recognized comment is skipped and does not have a token.
- An error will be denoted by the ERR token.
- End of file will be denoted by the DONE token.
- White spaces are skipped.
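Skipping white space and comments while keeping the line count accurate can be sketched as a small helper that runs before each token is scanned. The name skipBlanksAndComments is hypothetical, and so is the choice to report an unterminated comment through the return value; the text above does not say how that case should be handled.

```cpp
#include <cctype>
#include <istream>
#include <sstream>

// Hypothetical helper: consumes white space and { ... } comments,
// incrementing linenum at every newline (including those inside comments).
// Returns false only when a comment is opened but never closed.
bool skipBlanksAndComments(std::istream& in, int& linenum) {
    char ch;
    while (in.get(ch)) {
        if (ch == '\n') {
            ++linenum;
        } else if (std::isspace(static_cast<unsigned char>(ch))) {
            // Other white space is simply dropped.
        } else if (ch == '{') {
            bool closed = false;
            while (in.get(ch)) {
                if (ch == '\n') {
                    ++linenum;
                } else if (ch == '}') {
                    closed = true;
                    break;
                }
            }
            if (!closed) return false;  // end of input inside a comment
        } else {
            // First character of a real token: leave it for the caller.
            in.putback(ch);
            return true;
        }
    }
    return true;  // end of input reached outside any comment
}
```

After a successful call, the stream is positioned either at the first character of the next token or at end of file, where the caller would return DONE.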
Lexical Analyzer Implementation:
- You will write a lexical analyzer function, called getNextToken, having the following signature:
  LexItem getNextToken (istream& in, int& linenumber);
  - The first argument is a reference to an istream object that the function should read from (the input file).
  - The second argument is a reference to an integer that contains the current line number of the line read from the input file.
- getNextToken returns a LexItem object. A LexItem is a class that contains a token, a string
Given file: lex.h:
/*
 * lex.h
 *
 * CS280
 * Fall 2023
 */

#ifndef LEX_H_
#define LEX_H_

// The header names after #include were lost in extraction; <string> and
// <iostream> are what the declarations below require.
#include <string>
#include <iostream>
using namespace std;

//Definition of all the possible token types
enum Token {
	// keywords OR RESERVED WORDS
	IF, ELSE, WRITELN, WRITE, INTEGER, REAL,
	BOOLEAN, STRING, BEGIN, END, VAR, THEN, PROGRAM,

	// identifiers
	IDENT, TRUE, FALSE,

	// an integer, real, and string constant
	ICONST, RCONST, SCONST, BCONST,

	// the arithmetic operators, logic operators, relational operators
	PLUS, MINUS, MULT, DIV, IDIV, MOD, ASSOP, EQ,
	GTHAN, LTHAN, AND, OR, NOT,
	//Delimiters
	COMMA, SEMICOL, LPAREN, RPAREN, DOT, COLON,
	// any error returns this token
	ERR,

	// when completed (EOF), return this token
	DONE,
};

//Class definition of LexItem
class LexItem {
	Token	token;
	string	lexeme;
	int	lnum;

public:
	LexItem() {
		token = ERR;
		lnum = -1;
	}
	LexItem(Token token, string lexeme, int line) {
		this->token = token;
		this->lexeme = lexeme;
		this->lnum = line;
	}

	bool operator==(const Token token) const { return this->token == token; }
	bool operator!=(const Token token) const { return this->token != token; }

	Token	GetToken() const { return token; }
	string	GetLexeme() const { return lexeme; }
	int	GetLinenum() const { return lnum; }
};

extern ostream& operator<<(ostream& out, const LexItem& tok);
extern LexItem id_or_kw(const string& lexeme, int linenum);
extern LexItem getNextToken(istream& in, int& linenum);

#endif /* LEX_H_ */