, ] What do you want to generate (Enter to quit)? How many do you want me to generate? 10 a pretentious dog hit Elmo a green green big dog honored Fred the big child collapsed a subliminal dog kissed the subliminal television Sally laughed Fred wept Fred died the pretentious fat subliminal mother wept Elmo honored a faulty television Elmo honored Elmo Available symbols to generate are: [, , , , , , , , , ] What do you want to generate (Enter to quit)? RECURSIVE ALGORITHM You can generate elements of a grammar using a recursive algorithm. To generate a random occurrence of a symbol S: If S is a terminal symbol, there is nothing to do. If S is a non-terminal symbol, choose a random expansion rule R for S. For each of the symbols in the rule R, generate a random occurrence of that symbol. Below is an example of a sentence that can be generated from sentence.txt. The nonterminal can produce the sentence, Fred honored the green wonderful child, as indicated : Fred honored the green wonderful child To generate a non-terminal, pick one of its rules at random and then generate each part of that rule, which might involve more non-terminals to recursively generate. For each part, pick a rule at random and generate each of its part, etc. When you encounter a terminal, simply include it in your string. This becomes a base case of the process. YOUR INSTRUCTIONS You will be given a client program GrammarMain.java that does the file processing and user interaction. You are to write a class called Grammar Solver that manipulates a grammar. Grammar Main reads a BNF grammar input text file and passes its entire contents to you as a list of strings. For example, if your program were to examine the grammar on the previous page, your object would be passed a 10-element list of the entire contents of that grammar file. Your solver must break each string from the list into symbols and rules so that it can generate random elements of the grammar as output. NECESSARY METHODS PUBLIC GRAMMARSOLVER (LIST RULES) In this constructor you should initialize a new grammar solver over the given BNF grammar rules, where each rule corresponds to one line of text as shown in the file on the previous page. Your constructor should break apart the rules and store them into a Map so that you can later look up parts of the grammar efficiently. Do not modify the list passed. You should throw an IllegalArgumentException if the list is null, or empty (size 0). You should also throw an IllegalArgumentException if the grammar contains more than one line for the same non-terminal. For example, if two lines both specified rules for symbol "", this would be illegal and should result in the exception being thrown. PUBLIC BOOLEAN CONTAINS(STRING SYMBOL) In this method you should return true if the given symbol is a non-terminal in the grammar and false otherwise. For example, when using the grammar described previously, return true for a call of contains("") and false for a call of contains("") or contains("green") ("green" is a terminal in the language). You should throw an IllegalArgumentException if the string is null or has a length of O. PUBLIC SET GETSYMBOLS() In this method you should return all non-terminal symbols of your grammar as a sorted set of strings. This is the keyset of your map. For example, when using the previous grammar, getSymbols() would return a set containing the ten elements ["", "", "", "", "", "", "", "", ""]. PUBLIC STRING GENERATE(STRING SYMBOL) In this method you should use the grammar to generate a random occurrence of the given symbol and you should return it as a String. For example, when using the grammar described on the previous pages, a call of generate("") might potentially return the string, "the green wonderful child". If the string passed is not a non-terminal in your grammar, you should assume that it is a terminal symbol and simply return it. For example, a call of generate("green") should return "green". (Note there is not a space before/after "green".) For any nonterminal symbol, each of its rules should be applied with equal probability. Use the Random class in java.util to help you make random choices between rules. You should throw an IllegalArgumentException if the string is null or has a length of 0. DEVELOPMENTS STRATEGY AND HINTS The hardest method is generate(), so write it last. Look up the string method String.trim() and make heavy use of it. It will solve some problems with your output as you get deeper into the program. For your program the hard part is following the grammar rules to generate different parts of the grammar, so that is the place to use recursion. If your recursive method has a bug, try putting a debug println() that prints its parameter values, to see the calls being made. For this program you must store the contents of the grammar into a Map. As you know, maps keep track of key/value pairs, where each key is associated with a particular value. In our case, we want to store information about each nonterminal symbol. So the non-terminal symbols become keys and their rules become values. Notice that the getSymbols method requires that the non-terminals be listed in sorted order, which may affect what kind of map you use. Other than the Map requirement, you are allowed to use whatever constructs you want from the Java class libraries STRINGS One problem you will have to deal with early in this program is breaking strings into various parts. There are several ways to do this, but I strongly recommend that you use the String's split method. The split method breaks a large string into an array of smaller string tokens. To use split, you must specify a delimiter that indicates where one token ends and the next begins. The delimiter strings passed to split are called regular expressions, which are strings that use a particular syntax to indicate patterns of text. They can be confusing, but learning about regular expressions is helpful for computer scientists and programmers. Many Unix/Linux tools, for example, use regular expressions as input. To split a string by ::= characters you simply pass those characters to split. To split by whitespace, we want our delimiter to be a sequence of one or more spaces and/or tabs. This can be accomplished by putting a space and a tab inside [] brackets and putting a + plus sign after the brackets to indicate "1 or more". To split on a pipe character, we can't just pass the pipe character as a String as we did with the ::= because has a special meaning in regular expressions. So we must enclose it in [ ] brackets as well. The following examples show these regular expressions: String s1 = "example ::= foo bar | baz"; String[] parts1 = s1.split("::="); // ["example", "foo bar baz" ] String s2 = "the quick brown fox"; String[] parts2 = s2.split("[ \t]+"); // ["the", "quick", "brown", "fox"] String s3 = "foo bar|baz quux mumble"; String[] parts3 = s3.split("[l]"); // ["foo bar", "baz ", "quux mumble"] If the string you split begins with a space, you will get an empty string at the front of your array, so use the String.trim() method as needed. Also, the parts of a rule will be separated by whitespace, but once you've split the rule by spaces, all spaces are gone. If you want spaces between words when generating strings to return, you must include those yourself. STYLE GUIDELINES AND GRADING: Part of your grade will come from appropriately utilizing recursion to implement your algorithm as described previously. I will also grade on the elegance of your recursive algorithm; don't create special cases in your recursive code if they are not necessary. Redundancy is another major grading focus; you should avoid repeated logic as much as possible. Your class may have other methods besides those specified, but any other methods you add should be private. You should follow good general style guidelines such as: making fields private and avoiding unnecessary fields; declaring collection variables using interface types; appropriately using control structures like loops and if/else; properly using indentation, good variable names and types; and not having any lines of code longer than 100 characters. Comment your code descriptively in your own words at the top of your class, each method, and on complex sections of your code. Comments should explain each method's behavior, parameters, return, pre/post-conditions, and exceptions. For reference, a possible solution is than 100 lines long including blank lines (75 "substantive" lines) PROGRAM DESCRIPTION In this assignment you will complete a program that reads an input file with a grammar in Backus-Naur Form and allows the user to randomly generate elements of the grammar. You will use recursion to implement the core of your algorithm. Your program should exactly reproduce the format and general behavior demonstrated in this log. Because your program involves randomness (described below), the exact phrases generated may not match. SAMPLE EXECUTION Welcome to the random sentence generator! What is the name of the grammar file? sentence.txt Available symbols to generate are: [, , , , , , , , , ] What do you want to generate (Enter to quit)? How many do you want me to generate? 3 the the Available symbols to generate are: [, , , , , , , , , ] What do you want to generate (Enter to quit)? How many do you want me to generate? 5 a wonderful father the faulty man Spot the subliminal university Sally Available symbols to generate are: [, , , , , , , , , ] What do you want to generate (Enter to quit)? How many do you want me to generate? 10 a pretentious dog hit Elmo a green green big dog honored Fred the big child collapsed a subliminal dog kissed the subliminal television Sally laughed Fred wept Fred died the pretentious fat subliminal mother wept Elmo honored a faulty television Elmo honored Elmo Available symbols to generate are: [, , , , , , , , , ] What do you want to generate (Enter to quit)? RECURSIVE ALGORITHM You can generate elements of a grammar using a recursive algorithm. To generate a random occurrence of a symbol S: If S is a terminal symbol, there is nothing to do. If S is a non-terminal symbol, choose a random expansion rule R for S. For each of the symbols in the rule R, generate a random occurrence of that symbol. Below is an example of a sentence that can be generated from sentence.txt. The nonterminal can produce the sentence, Fred honored the green wonderful child, as indicated : Fred honored the green wonderful child To generate a non-terminal, pick one of its rules at random and then generate each part of that rule, which might involve more non-terminals to recursively generate. For each part, pick a rule at random and generate each of its part, etc. When you encounter a terminal, simply include it in your string. This becomes a base case of the process. YOUR INSTRUCTIONS You will be given a client program GrammarMain.java that does the file processing and user interaction. You are to write a class called Grammar Solver that manipulates a grammar. Grammar Main reads a BNF grammar input text file and passes its entire contents to you as a list of strings. For example, if your program were to examine the grammar on the previous page, your object would be passed a 10-element list of the entire contents of that grammar file. Your solver must break each string from the list into symbols and rules so that it can generate random elements of the grammar as output. NECESSARY METHODS PUBLIC GRAMMARSOLVER (LIST RULES) In this constructor you should initialize a new grammar solver over the given BNF grammar rules, where each rule corresponds to one line of text as shown in the file on the previous page. Your constructor should break apart the rules and store them into a Map so that you can later look up parts of the grammar efficiently. Do not modify the list passed. You should throw an IllegalArgumentException if the list is null, or empty (size 0). You should also throw an IllegalArgumentException if the grammar contains more than one line for the same non-terminal. For example, if two lines both specified rules for symbol "", this would be illegal and should result in the exception being thrown. PUBLIC BOOLEAN CONTAINS(STRING SYMBOL) In this method you should return true if the given symbol is a non-terminal in the grammar and false otherwise. For example, when using the grammar described previously, return true for a call of contains("") and false for a call of contains("") or contains("green") ("green" is a terminal in the language). You should throw an IllegalArgumentException if the string is null or has a length of O. PUBLIC SET GETSYMBOLS() In this method you should return all non-terminal symbols of your grammar as a sorted set of strings. This is the keyset of your map. For example, when using the previous grammar, getSymbols() would return a set containing the ten elements ["", "", "", "", "", "", "", "", ""]. PUBLIC STRING GENERATE(STRING SYMBOL) In this method you should use the grammar to generate a random occurrence of the given symbol and you should return it as a String. For example, when using the grammar described on the previous pages, a call of generate("") might potentially return the string, "the green wonderful child". If the string passed is not a non-terminal in your grammar, you should assume that it is a terminal symbol and simply return it. For example, a call of generate("green") should return "green". (Note there is not a space before/after "green".) For any nonterminal symbol, each of its rules should be applied with equal probability. Use the Random class in java.util to help you make random choices between rules. You should throw an IllegalArgumentException if the string is null or has a length of 0. DEVELOPMENTS STRATEGY AND HINTS The hardest method is generate(), so write it last. Look up the string method String.trim() and make heavy use of it. It will solve some problems with your output as you get deeper into the program. For your program the hard part is following the grammar rules to generate different parts of the grammar, so that is the place to use recursion. If your recursive method has a bug, try putting a debug println() that prints its parameter values, to see the calls being made. For this program you must store the contents of the grammar into a Map. As you know, maps keep track of key/value pairs, where each key is associated with a particular value. In our case, we want to store information about each nonterminal symbol. So the non-terminal symbols become keys and their rules become values. Notice that the getSymbols method requires that the non-terminals be listed in sorted order, which may affect what kind of map you use. Other than the Map requirement, you are allowed to use whatever constructs you want from the Java class libraries STRINGS One problem you will have to deal with early in this program is breaking strings into various parts. There are several ways to do this, but I strongly recommend that you use the String's split method. The split method breaks a large string into an array of smaller string tokens. To use split, you must specify a delimiter that indicates where one token ends and the next begins. The delimiter strings passed to split are called regular expressions, which are strings that use a particular syntax to indicate patterns of text. They can be confusing, but learning about regular expressions is helpful for computer scientists and programmers. Many Unix/Linux tools, for example, use regular expressions as input. To split a string by ::= characters you simply pass those characters to split. To split by whitespace, we want our delimiter to be a sequence of one or more spaces and/or tabs. This can be accomplished by putting a space and a tab inside [] brackets and putting a + plus sign after the brackets to indicate "1 or more". To split on a pipe character, we can't just pass the pipe character as a String as we did with the ::= because has a special meaning in regular expressions. So we must enclose it in [ ] brackets as well. The following examples show these regular expressions: String s1 = "example ::= foo bar | baz"; String[] parts1 = s1.split("::="); // ["example", "foo bar baz" ] String s2 = "the quick brown fox"; String[] parts2 = s2.split("[ \t]+"); // ["the", "quick", "brown", "fox"] String s3 = "foo bar|baz quux mumble"; String[] parts3 = s3.split("[l]"); // ["foo bar", "baz ", "quux mumble"] If the string you split begins with a space, you will get an empty string at the front of your array, so use the String.trim() method as needed. Also, the parts of a rule will be separated by whitespace, but once you've split the rule by spaces, all spaces are gone. If you want spaces between words when generating strings to return, you must include those yourself. STYLE GUIDELINES AND GRADING: Part of your grade will come from appropriately utilizing recursion to implement your algorithm as described previously. I will also grade on the elegance of your recursive algorithm; don't create special cases in your recursive code if they are not necessary. Redundancy is another major grading focus; you should avoid repeated logic as much as possible. Your class may have other methods besides those specified, but any other methods you add should be private. You should follow good general style guidelines such as: making fields private and avoiding unnecessary fields; declaring collection variables using interface types; appropriately using control structures like loops and if/else; properly using indentation, good variable names and types; and not having any lines of code longer than 100 characters. Comment your code descriptively in your own words at the top of your class, each method, and on complex sections of your code. Comments should explain each method's behavior, parameters, return, pre/post-conditions, and exceptions. For reference, a possible solution is than 100 lines long including blank lines (75 "substantive" lines)<>
<>
<>
<>
<>
<>
<>
<>
<>