Question

1 Approved Answer

Posted on Sep 22, 2024

/** * Parses a return statement from the {@code statement} rule. This method * should only be called if the next tokens start a return

/** * Parses a return statement from the {@code statement} rule. This method * should only be called if the next tokens start a return statement, aka * {@code RETURN}. */ public Ast.Stmt.Return parseReturnStatement() throws ParseException { throw new UnsupportedOperationException(); //TODO }

(in java)

Here is some background/context for the extra practice task:

image text in transcribed

Here is how the code will be tested:

private static Stream testReturnStatement() { return Stream.of( Arguments.of("Return Statement", Arrays.asList( //RETURN expr; new Token(Token.Type.IDENTIFIER, "RETURN", 0), new Token(Token.Type.IDENTIFIER, "expr", 7), new Token(Token.Type.OPERATOR, ";", 11) ), new Ast.Stmt.Return(new Ast.Expr.Access(Optional.empty(), "expr")) ) ); }

Look out for my other questions like this one!

Parser Overview Recall that the job of the parser is to convert the tokens emitted by the lexer into an Abstract Syntax Tree (AST) which represents the structural meaning of the code. For example, the expressions 1 + 2 * 3 and 1 * 2 + 3 are similar, but their AST's are quite different due to operator precedence: 1 + 2 * 3 1 * 2 + 3 + + 1 * * 3 / 2 3 1 2 Our parser will be implemented using a method called recursive descent, which means each reference to another rule in our grammar corresponds with a call to the appropriate parse function. If the parser is unable to parse something successfully (for example, an unexpected token), then it will throw a ParseException. The index of the exception should be the index of the next token if present, otherwise the index after the previous token (which you will need to compute). You do not need to worry about the edge case of there not being a previous token (as our source rule allows empty input which should not cause an exception). Crafting Interpreters The Parsing section of Crafting Interpreters provides a good overview for the parsing process and was a starting point for the parser architecture we have provided. Their parser is slightly more complex as they submit more functionality, so make sure you only implement what is defined in our grammar. I highly recommend reading it to help with your understanding of the parsing process and this assignment. Grammar A grammar for our lexer is defined below, which is written in a specific form optimal for recursive descent parsing. You can view a graphical form of our grammar on the following website: https://www.bottlecaps.de/rr/ui e source ::= field* method field ::= 'LET' identifier ('=' expression)? ';' method ::= "DEF' identifier '' (identifier C,' identifier):)?')' 'DO' statement* 'END' statement ::= "LET' identifier ('=' expression)? ';' 'IF' expression 'DO' statement* ('ELSE' statement)? 'END | "FOR' identifier 'IN' expression 'DO' statement* 'END' | "WHILE' expression 'DO' statement* "END' | RETURN' expression ';' | expression ('=' expression)? ';' expression ::= logical_expression logical_expression ::= comparison_expression (('AND' | 'OR') comparison_expression)* comparison_expression ::= additive_expression ((''>' | '--' | '!=') additive_expression)* additive_expression ::- multiplicative_expression ('+' | '-') multiplicative_expression)* multiplicative_expression : := secondary_expression ((*** | '/') secondary_expression) secondary_expression ::= primary_expression ('.' identifier ('( (expression (','expression)*)?')'))* primary_expression ::= "NIL' | 'TRUE' | 'FALSE' | integer | decimal | character | string | '('expression) identifier ('( (expression (','expression)*)? '))? identifier ::- (A-Za-z_] [A-Za-Z0-9_1* number ::= [+- ] [0-9]+ ('.' [0-9]+)? character ::= ['] ([^\ | escape) [' string ::= ([^ \\ l escape)* *** escape ::= '[bnrt' operator ::= [!=] '='? 'any character whitespace ::= [ \b \t] o o AST Representation The Ast class contains subclasses representing the more specific elements of the AST and the code that is being parsed. There is not a one-to-one relationship between rules in our grammar and the AST. If you have not completed Homework 2, it may be helpful to do that first. The following classes are contained within Ast : Source : An entire source file (fields + methods). Field : Field declarations. Method : Method definitions. Stmt : Structural parts of the code that perform side effects like assigning variables or modifying control flow. Expression : A statement that is simply an expression (such as a function call). Don't confuse this with Ast. Expr ! Declaration : The declaration (and optional initialization) of variables. Assignment : The assignment of values. The receiver is an expression and not the name of a variable because assignment also applies to fields like x.y = z. If : An if statement with an optional else branch. The list of else statements is empty if not defined. While : A while statement. Return : A return statement. Note that a value must be provided, unlike languages like Java which have void return types. Expr Literal : Contains literal values, such as booleans, integers, or strings. You will need to convert values from the lexer into the appropriate type. Boolean values are represented with the Boolean class. . Integer values are represented with the BigInteger class e, which supports arbitrary precision. . Decimal values are represented with the BigDecimal classe, which supports arbitrary precision. . Character values are represented with the character class. You will need to remove the surrounding single quotes from the literal returned by the lexer and replace any escape characters (hint, see String#replace ). . String values are represented with the String class. You will need to remove the surrounding double quotes ") from the literal returned by the lexer and replace any escape characters (hint, see String#replace ). Group : A grouped expression (generally used for changing priority) Binary : A binary expression, including additive, multiplicative, comparison, and logical expressions from the grammar. For logical expressions, the operator will be the literal AND/OR strings used in the grammar. Access : An access expression, representing accessing a field or variable. The receiver is an Optional value e, which is present to represent field access on objects and empty for variables (effectively, the receiver is the current scope which has access to all defined variables). Function : A function call expression. The receiver is an Optional value c, which is present to represent method calls on objects and empty for function calls (effectively, the receiver is the current scope which has access to all defined functions). 0 Parser Overview Recall that the job of the parser is to convert the tokens emitted by the lexer into an Abstract Syntax Tree (AST) which represents the structural meaning of the code. For example, the expressions 1 + 2 * 3 and 1 * 2 + 3 are similar, but their AST's are quite different due to operator precedence: 1 + 2 * 3 1 * 2 + 3 + + 1 * * 3 / 2 3 1 2 Our parser will be implemented using a method called recursive descent, which means each reference to another rule in our grammar corresponds with a call to the appropriate parse function. If the parser is unable to parse something successfully (for example, an unexpected token), then it will throw a ParseException. The index of the exception should be the index of the next token if present, otherwise the index after the previous token (which you will need to compute). You do not need to worry about the edge case of there not being a previous token (as our source rule allows empty input which should not cause an exception). Crafting Interpreters The Parsing section of Crafting Interpreters provides a good overview for the parsing process and was a starting point for the parser architecture we have provided. Their parser is slightly more complex as they submit more functionality, so make sure you only implement what is defined in our grammar. I highly recommend reading it to help with your understanding of the parsing process and this assignment. Grammar A grammar for our lexer is defined below, which is written in a specific form optimal for recursive descent parsing. You can view a graphical form of our grammar on the following website: https://www.bottlecaps.de/rr/ui e source ::= field* method field ::= 'LET' identifier ('=' expression)? ';' method ::= "DEF' identifier '' (identifier C,' identifier):)?')' 'DO' statement* 'END' statement ::= "LET' identifier ('=' expression)? ';' 'IF' expression 'DO' statement* ('ELSE' statement)? 'END | "FOR' identifier 'IN' expression 'DO' statement* 'END' | "WHILE' expression 'DO' statement* "END' | RETURN' expression ';' | expression ('=' expression)? ';' expression ::= logical_expression logical_expression ::= comparison_expression (('AND' | 'OR') comparison_expression)* comparison_expression ::= additive_expression ((''>' | '--' | '!=') additive_expression)* additive_expression ::- multiplicative_expression ('+' | '-') multiplicative_expression)* multiplicative_expression : := secondary_expression ((*** | '/') secondary_expression) secondary_expression ::= primary_expression ('.' identifier ('( (expression (','expression)*)?')'))* primary_expression ::= "NIL' | 'TRUE' | 'FALSE' | integer | decimal | character | string | '('expression) identifier ('( (expression (','expression)*)? '))? identifier ::- (A-Za-z_] [A-Za-Z0-9_1* number ::= [+- ] [0-9]+ ('.' [0-9]+)? character ::= ['] ([^\ | escape) [' string ::= ([^ \\ l escape)* *** escape ::= '[bnrt' operator ::= [!=] '='? 'any character whitespace ::= [ \b \t] o o AST Representation The Ast class contains subclasses representing the more specific elements of the AST and the code that is being parsed. There is not a one-to-one relationship between rules in our grammar and the AST. If you have not completed Homework 2, it may be helpful to do that first. The following classes are contained within Ast : Source : An entire source file (fields + methods). Field : Field declarations. Method : Method definitions. Stmt : Structural parts of the code that perform side effects like assigning variables or modifying control flow. Expression : A statement that is simply an expression (such as a function call). Don't confuse this with Ast. Expr ! Declaration : The declaration (and optional initialization) of variables. Assignment : The assignment of values. The receiver is an expression and not the name of a variable because assignment also applies to fields like x.y = z. If : An if statement with an optional else branch. The list of else statements is empty if not defined. While : A while statement. Return : A return statement. Note that a value must be provided, unlike languages like Java which have void return types. Expr Literal : Contains literal values, such as booleans, integers, or strings. You will need to convert values from the lexer into the appropriate type. Boolean values are represented with the Boolean class. . Integer values are represented with the BigInteger class e, which supports arbitrary precision. . Decimal values are represented with the BigDecimal classe, which supports arbitrary precision. . Character values are represented with the character class. You will need to remove the surrounding single quotes from the literal returned by the lexer and replace any escape characters (hint, see String#replace ). . String values are represented with the String class. You will need to remove the surrounding double quotes ") from the literal returned by the lexer and replace any escape characters (hint, see String#replace ). Group : A grouped expression (generally used for changing priority) Binary : A binary expression, including additive, multiplicative, comparison, and logical expressions from the grammar. For logical expressions, the operator will be the literal AND/OR strings used in the grammar. Access : An access expression, representing accessing a field or variable. The receiver is an Optional value e, which is present to represent field access on objects and empty for variables (effectively, the receiver is the current scope which has access to all defined variables). Function : A function call expression. The receiver is an Optional value c, which is present to represent method calls on objects and empty for function calls (effectively, the receiver is the current scope which has access to all defined functions). 0