Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

The purpose of this exercise is to write a syntax generator for a subset of the C++ programming language that will write random C++ programs

The purpose of this exercise is to write a syntax generator for a subset of the C++ programming language that will write random C++ programs to a file. By writing these random syntactically correct programs, you will further develop your understanding of the difference between syntax and semantics.

Consider the following set of productions that define a subset of the C++ programming language:

::= int main() { return 0; }

::= |

::= | | |

|

::= { }

::= if ( )

| if ( ) | if ( ) else

| if ( ) else

| if ( ) else

| if ( ) else ::= while ( )

| while ( ) ::= = ;

::= ;

| ::= | | ::= + | - | * | /

::= int

| double

::=

::=

::= [empty]

|

|

::= [empty]

|

::= [A-Z] | [a-z] | _

::= [0-9]

You will notice that this grammar is a combined phrase and lexical grammar. This is for simplification purposes.

If you are interested in seeing an exhaustive BNF formulation of the C programming language, please view the following: http://www.cs.man.ac.uk/~pjj/bnf/c_syntax.bnf. This contains over 60 productions and may be a little much for the purposes of this exercise.

Problem 1. Write a program in C, C++, C#, Java, or Python (your choice) that starts with the root non-terminal and generates a random syntactically correct C++ program using the productions defined above. You should follow the examples we saw in class where we expand non-terminals recursively until we obtain a sentence consisting of terminal tokens only. In the case where a production contains more than one expansion (i.e., right-hand-side expressions), your program should select one randomly (preferably with non-uniform weighting based on which expansions are more likely to occur in C++). Your program should write the random C++ code to an output file.

Problem 2. Run your generator (possibly a few times) to generate a random syntactically correct C++ program. You will observe that, despite the correct syntax, there are other issues that prevent this program from being correct. Please discuss some of these errors. Suppose you wish to implement a checker for detecting these errors. What sort of approach or algorithm would you use to accomplish this? There is no need to write any code for this problem.

Hint 1: The following is a sample random program that was generated by my solution. I added some white spaces to make it more readable:

int main()

{

int F0Z = 0262;

if (22682 / 525)

double S1;

else

S = U;

while (8 - 594873)

{

while (97942 / 6871573097 * 7261055)

{

while (9307 * M / 4 / 2 + 4 - 7 / K)

{

double A;

}

}

}

return 0;

}

I would recommend that you include white space characters (as appropriate) into the production rules above to make your output random program more readable. Dont worry about indenting/tabs for the purposes of this assignment. I manually tabbed the code above.

Hint 2: I solved this problem both in C++ and Python. In Python, I was able to write the solution in less than 100 lines of code (it is a really great language). My C++ solution makes use of a class called Production whose instances represent the various productions that define the C/C++ grammar subset above. My definition of the class is shown below:

class Production

{

private:

string lhs;

vector rhs_options; // list of options for expansion

vector trans_probs; // list of probabilities associated // with each choice

public:

Production();

Production(string);

void add_rhs(string, double); // adds new rhs to the production

string expand() const; // returns one of the rhs choices using

// a random number generator

};

Hint 3: You should not choose your probabilities for selecting an RHS uniformly. For example, given the production rule:

::= | | |

|

If every time you hit a symbol, you select one of these choices uniformly at random (i.e., with probability 0.2), you will get a very long program, and in some cases, your program may not even terminate. The reason for this is because a non-terminal such as expands into something large, especially since can contain other s.

A better selection of probabilities might be to select with probability 0.10, with probability 0.10, with probability 0.10, with probability 0.35, and with probability 0.35. You can play with these probabilities to make your overall program longer or shorter.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

More Books

Students also viewed these Databases questions