Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

The problem I have is using BISON. Specifically need to create an abstract syntax tree (AST). Now the overall problem is stated below: To get

The problem I have is using BISON. Specifically need to create an abstract syntax tree (AST). Now the overall problem is stated below:

To get you started, you are provided with a Flex scanner specification in `scanner.l` and a Bison parser specification in `parser.y` that, together with the `main()` function in `main.cpp`, solve the problem defined in assignment 2. There is also a makefile that specifies compilation for the parser.

## 1. Implement one or more data structures for building an AST

An abstract syntax tree (AST) is a tree-based representation of the source program. It closely resembles a parse tree but is more compact because it eliminates or contracts many of the nodes corresponding to nonterminals. You can see visualizations of example ASTs in the `example_output/` directory.

Your first task in this assignment is to implement a set of one or more data structures that allow you to construct an AST for a source program written in the same subset of Python we worked with for assignment 2. To do this, you'll have to figure out what specific language constructs need to be represented by an AST node, what data might be associated with each of these nodes, and how each type of node connects to other nodes in the AST.

For example, you might write a class or structure to represent an AST node corresponding to a binary expression (e.g. `expr OP expr`) in the source language. Your class/structure might contain a field to represent the specific operator associated with the expression, and it might contain two pointers to other AST nodes, one representing the left-hand side of the binary expression and another representing the right-hand side of the binary expression. These LHS and RHS nodes might have additional children, or they could represent identifiers, floats, integers, etc. that have no children. Moreover, you might also have classes/structures for nodes representing higher-level language constructs, such as assignment statements. The binary expression node would be a child of one of these higher-level nodes.

## 2. Modify the parser to use your data structures to build an AST

Your next task is to modify the parser to build an AST using the data structures you defined above. The general idea here is to modify the action associated with each of the parser's grammar rules to return an AST node instead of returning a string of C++ code.

Though the end goal is different, the mechanics of this will be similar to generating a C++ translation. In particular, to generate a string containing C++ translation for the language construct on the left-hand side of a particular grammar rule, you assumed that you had C++ strings for the language constructs on the right-hand side of the rule (specifically in `$n`), and you concatenated these together (along with proper punctuation, etc.) to form the left-hand side's C++ string (i.e. `$$`).

Similarly, when building an AST node for a language construct on the left-hand side of a grammar rule, you should assume that you have AST nodes for the relevant language constructs on the right-hand side of the rule (in `$n`), and you can use these to generate the node for the left-hand side construct (i.e. `$$`). For example, you could pass the `$n`'s to a function or class constructor that generates a node for the left-hand-side language construct and then assign that generated node to `$$`.

You'll have to do a few other things in the parser to get this all working, as well. For example, the current parser uses the type `std::string*` for all grammar symbols. You'll instead have to use the appropriate AST node class/structure type for each nonterminal symbol. You'll likely want to continue to represent all terminals with `std::string*`. The scanner may need very minor changes to make this work.

Once you have the parser building your AST, save the root node of the AST in a global variable, similar to the way the current parser saves the entire translated program string in a global variable. If you'd like, you can modify the `main()` function to print out some information about the AST using this global variable.

## 3. Use your AST to generate a GraphViz specification to visualize the AST

Finally, to get practice generating code from an AST, implement functionality to use your AST to generate its own [GraphViz](http://www.graphviz.org/) specification. You should specifically write a specification that can be passed to the [`dot`](https://graphviz.gitlab.io/_pages/pdf/dotguide.pdf) program (which is installed on the ENGR servers) to generate a visualization of the AST.

GraphViz uses a very simple notation for representing trees and graphs. Here's an example of a simple GraphViz tree specification: ``` digraph G { a [label="Node A"]; b [label="Node B"]; c [label="Node C"]; d [label="Node D"]; e [label="Node E"]; f [label="Node F"]; g [label="Node G"]; h [label="Node H"];

a -> b; a -> c; a -> d; c -> e; d -> f; e -> g; e -> h; } ``` Assuming this file was named `tree.gv`, we could use it to generate a PNG image visualizing the specified tree with the following command: ``` dot -Tpng -otree.png tree.gv ``` This is the image that would be produced (in `tree.png`):

![Tree visualization](example_output/tree.png)

The GraphViz specification is flexible, and nodes and edges can be defined in any order. More info on additional visualization options is available in the documentation linked above.

To generate a GraphViz specification for your AST, you should write functions that generate GraphViz code for each of your AST node classes/structures (one function per class/structure). Each of these functions should generate the relevant GraphViz code for the node itself (e.g. a label for the node indicating the language construct it represents and specifications of the edges to the node's children). They should then recursively call other functions to generate GraphViz code for each of the node's children. The practice of recursively traversing the AST in this way to generate GraphViz code closely resembles the way assembly code is generated from an AST, e.g. using LLVM.

Two primary design considerations when writing your traversal/code generation functions will be how to ensure that each AST node is given a unique identifier in the GraphViz specification and how to concatenate together all of the GraphViz code produced for the individual AST nodes.

You should use your code generation functions by invoking them on the root node of the AST from your `main()` function, and you should output the generated GraphViz code to stdout, so it can be inspected or saved to a file.

## Testing your code

There are some simple Python programs you may use for testing your AST builder included in the `testing_code/` directory. Example outputs (both a `.png` visualization and a `.gv` GraphView specification) for these programs are included in the `example_output/` directory. Note that the ASTs your parser generates may be slightly different than the ones included here, depending on how you choose to represent nodes in the AST.

What I really need help with is where to create a class definition so I can use it in the prologue of parser.y and to be able to use it in %union {} within parser.y

Below is what I have so far:

Parser:

image text in transcribed

image text in transcribed

image text in transcribed

Main.cpp:

image text in transcribed

AST.h

image text in transcribed

Scanner.l I haven't posted here, but I've tested it and that doesn't seem to have any problems. The main problems I'm having are with parser.y and main.cpp.

The problems with the way I have it right now are: output of a node that n0 (root node) is linked to within the vector of nodes, returns an address of its label instead of a string. The biggest problems are: BISON and where to put class definitions (and how as well), working with vector of pointers that point to classes/objects, and iterating through a vector of class pointers.

2 Include ciostrear 3 4include cset 5 ficlude vector> std::string nar std::string" label; tditring" valve 12 std::vectorcAST>nodes; 13 ST lhs; 14 ST rhs 16 pubii:: 21 vid addsod(AST node)I nodes.psh backinode); std: :vector(A&M> Retode$(){ return rodes; void setassigrnent(AST lhs, AST rhs); vold SetExpressioN AST lhs, AST rhs); vold setoperator(AST" lhs AST" rhs); vwid setTdentifieristd::string z std::string RetLabelor returm labe1; 38 32 1rclude "parser, hpp" exter int yylex) void yyerror YYLT char err 38 4B tenerated, dni symbol, is + simple syntol table. 42 Std::strin taget progran Here, tergetprogrs" 15 string that will hold the target progra, being 43 Std: :setestd:istrirg syrbols; 6 std: strin" tep2 - new std:istri BLOCK 52 uion 53 std::strine str 4 int token; 57 ,x Enable location tracking. * All prtern "instructs 11 te represented as strings, specifically as 51 62 63 their corresponding , trans1ation. 66 57 tine (1.e. DEDENT Because the lexer can gene-ete re then one tocen et tokens), we'll use d push pdrser w yu %define api.pure full %defire api.push-pill push 2 Include ciostrear 3 4include cset 5 ficlude vector> std::string nar std::string" label; tditring" valve 12 std::vectorcAST>nodes; 13 ST lhs; 14 ST rhs 16 pubii:: 21 vid addsod(AST node)I nodes.psh backinode); std: :vector(A&M> Retode$(){ return rodes; void setassigrnent(AST lhs, AST rhs); vold SetExpressioN AST lhs, AST rhs); vold setoperator(AST" lhs AST" rhs); vwid setTdentifieristd::string z std::string RetLabelor returm labe1; 38 32 1rclude "parser, hpp" exter int yylex) void yyerror YYLT char err 38 4B tenerated, dni symbol, is + simple syntol table. 42 Std::strin taget progran Here, tergetprogrs" 15 string that will hold the target progra, being 43 Std: :setestd:istrirg syrbols; 6 std: strin" tep2 - new std:istri BLOCK 52 uion 53 std::strine str 4 int token; 57 ,x Enable location tracking. * All prtern "instructs 11 te represented as strings, specifically as 51 62 63 their corresponding , trans1ation. 66 57 tine (1.e. DEDENT Because the lexer can gene-ete re then one tocen et tokens), we'll use d push pdrser w yu %define api.pure full %defire api.push-pill push

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Next Generation Databases NoSQLand Big Data

Authors: Guy Harrison

1st Edition

1484213300, 978-1484213308

More Books

Students also viewed these Databases questions

Question

People who eat more lobster tend to live longer.

Answered: 1 week ago