Question
Using the code below:
Implement a scanner for the language over this grammar using a state transition table. Before you start coding, you must create a state transition table that will drive the scanner as discussed in the lectures. You will encode the transition table in a configuration file that will be used to initialize the scanner.
Please note that the grammar has two extensions that require modification to the distributed code:
identifiers may include digits as long as the first character is a letter, and numbers may include a dot.
The main() function opens a configuration file with the state transition table. If the scanner is invoked without a file name as a command line argument, the code logic will accept the configuration supplied through the standard input. The actual input program to scan is always expected from the standard input. To simplify the input management, the file handle for the standard input is modified to point to the configuration file temporarily (if provided), and then is changed back to refer to the regular standard input. Check the man pages for freopen(), dup(), and dup2() to help you to understand the code logic.
The actual initialization of the scanner is done in the scanner initialization function scanInit(). Your code in that function must create a structure for the transition table that includes the number of input symbol classes, the number of states, an array of input symbol classes, and the actual state transition table. You must use the following structure to hold the transition table:
typedef struct
{
    int numberOfStates;
    int numberOfClasses;
    char **inputSymbolClasses;
    char **table;
} TRANS_TABLE_TYPE;
The configuration file should start with a line that specifies the number of states and the number of classes in your state transition table. Note that you need to add two columns to your state transition table: one for the "other" input symbol class (i.e., any character that does not belong to a specific input symbol class), and another one (the last) that holds the type of the token for a given state (row).
The following line describes input symbol classes by specifying all input symbols that are members of the class. Each class is separated by a comma as shown here:
;,(,),+,-,*,/,%,0123456789,abcdefghijklmnopqrstuvwxyz,=, \t\n
Each element of the table inputSymbolClasses should be a pointer to a string that contains all symbols in a given class. You should use the strsep() function to tokenize the class line using ',' as a delimiter (read the man page for details). You need to write a related lookup function findIndexToClass() that finds the class of every input symbol. For example, the class of ";" is 0, the class of "(" is 1, the class of any digit is 8, and the class of any letter is 9. Note that the last class lists white space symbols, which include the special characters "\t" (a tab) and "\n" (new line). These special characters are not printable, so inserting them literally into the configuration file and managing them afterwards is difficult. Therefore, they are specified as two-character strings (for example, "\n"). Your code will have to replace these strings with a one-character representation of the special character (i.e., "\t" with '\t', and "\n" with '\n'). As stated earlier, remember that you need another implied class for "other input symbol".
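The class-line handling described above could be sketched as follows. This is only a minimal illustration, not the required solution: the helper names unescapeInPlace(), parseClassLine(), and classIndexOf() are hypothetical, the classes are kept in a plain array rather than in TRANS_TABLE_TYPE, and the sketch assumes that the escaped forms are exactly the two-character sequences backslash-t and backslash-n.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace the two-character sequences "\t" and "\n"
 * with the real tab and newline characters, in place. */
void unescapeInPlace(char *s)
{
    char *out = s;
    for (char *in = s; *in != '\0'; in++) {
        if (in[0] == '\\' && in[1] == 't') {
            *out++ = '\t';
            in++; /* skip the 't' */
        } else if (in[0] == '\\' && in[1] == 'n') {
            *out++ = '\n';
            in++; /* skip the 'n' */
        } else {
            *out++ = *in;
        }
    }
    *out = '\0';
}

/* Hypothetical helper: split the class line on ',' with strsep() and store
 * one heap-allocated string per class. Returns the number of classes stored
 * (at most maxClasses). The line buffer is modified by strsep(). */
int parseClassLine(char *line, char **classes, int maxClasses)
{
    int count = 0;
    char *cursor = line;
    char *field;
    while (count < maxClasses && (field = strsep(&cursor, ",")) != NULL) {
        unescapeInPlace(field);
        classes[count] = strdup(field);
        assert(classes[count] != NULL);
        count++;
    }
    return count;
}

/* Hypothetical lookup: index of the first class that contains c, or
 * numExplicit (the implied "other" class) when no class contains it. */
int classIndexOf(char **classes, int numExplicit, char c)
{
    for (int i = 0; i < numExplicit; i++)
        if (c != '\0' && strchr(classes[i], c) != NULL)
            return i;
    return numExplicit;
}
```

With the sample class line, a lookup of ';' would give 0, any digit 8, any letter 9, and a character that appears in no class (such as '.') would fall into the implied "other" class, whose index equals the number of explicit classes.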
Each subsequent line in the configuration file should correspond to one row of the state transition table. All entries in the line of the configuration file should be separated by spaces. You must create an internal representation of the state transition table as a two-dimensional packed array of chars and set a pointer to it in the field table in the state transition structure. As before, use strsep() to tokenize each of the lines.
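One possible way to build the packed table from the row lines is sketched below. It is an illustration under stated assumptions rather than the expected implementation: allocTable() and parseTableRow() are hypothetical names, a numeric entry is stored as the small integer it names (which assumes state numbers and token types stay below 97, the code of 'a'), and the "packed" layout is taken to mean one contiguous block of chars with a row pointer per state.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes strsep() on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate numStates rows of numColumns chars as one
 * contiguous block, and return an array of pointers to the rows. */
char **allocTable(int numStates, int numColumns)
{
    char **table = malloc((size_t)numStates * sizeof *table);
    char *cells = calloc((size_t)numStates * (size_t)numColumns, 1);
    assert(table != NULL && cells != NULL);
    for (int i = 0; i < numStates; i++)
        table[i] = cells + (size_t)i * (size_t)numColumns;
    return table;
}

/* Hypothetical helper: parse one space-separated row into row[].
 * 'a' and 'e' are stored as themselves; a decimal entry is stored as the
 * integer value it names. Returns the number of cells filled. */
int parseTableRow(char *line, char *row, int numColumns)
{
    int col = 0;
    char *cursor = line;
    char *field;
    while (col < numColumns && (field = strsep(&cursor, " \n")) != NULL) {
        if (*field == '\0')
            continue; /* empty field produced by repeated delimiters */
        if (field[0] == 'a' || field[0] == 'e')
            row[col++] = field[0];
        else
            row[col++] = (char)atoi(field);
    }
    return col;
}
```

A caller would read each row line with fgets() and pass it, together with table[state], to parseTableRow(); a return value smaller than the column count would indicate a malformed row.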
Use the following simple encoding scheme: character a should stand for accept, e for error, and an integer for the new state. If there is an a in any row, then the last column should indicate which token is accepted in the final state corresponding to the row.
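To make the encoding concrete, here is a small sketch of a loop driven by such a table. Everything in it is an assumption made for illustration: the three-state demoTable is not the table for the assignment grammar, the start state is 0, classOf() stands in for the real class lookup, input comes from a string instead of stdin, and an accept entry is interpreted as "stop without consuming the current character and report the token type stored in the last column of the current row".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Columns of the demo table: three input classes plus the token type. */
enum { CLASS_DIGIT, CLASS_LETTER, CLASS_OTHER, TYPE_COLUMN, NUM_COLUMNS };

/* Hypothetical demo table. Small integers are next states, 'a' is accept,
 * 'e' is error; the last column is the token type accepted in that row
 * (1 = number, 2 = identifier, 0 = none). */
static const char demoTable[3][NUM_COLUMNS] = {
    /* state 0: start         */ { 1, 2,   'e', 0 },
    /* state 1: in number     */ { 1, 'a', 'a', 1 },
    /* state 2: in identifier */ { 2, 2,   'a', 2 },
};

/* Stand-in for the real class lookup. */
int classOf(char c)
{
    if (isdigit((unsigned char)c)) return CLASS_DIGIT;
    if (isalpha((unsigned char)c)) return CLASS_LETTER;
    return CLASS_OTHER;
}

/* Scan one token starting at *input. The lexeme is copied into buf and
 * *input is advanced past it. Returns the token type from the last column,
 * or -1 when the table yields an error entry. */
int scanOne(const char **input, char *buf, size_t bufSize)
{
    int state = 0;
    size_t len = 0;
    assert(bufSize > 0);
    for (;;) {
        char c = **input;
        char action = demoTable[state][classOf(c)];
        if (action == 'a') {
            buf[len] = '\0';
            return demoTable[state][TYPE_COLUMN];
        }
        if (action == 'e') {
            buf[len] = '\0';
            return -1;
        }
        if (len + 1 < bufSize)
            buf[len++] = c; /* collect the character */
        (*input)++;         /* consume it */
        state = action;     /* move to the next state */
    }
}
```

In this demo, scanning the string "x25+7" would first yield an identifier with the lexeme x25, which also shows a digit being allowed after the first letter; the '+' then hits an error entry because the demo table has no operator states.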
I need to complete the TODO comments in the code below.
state = 1
while (TRUE) {
    currChar = readNextChar()
    state = table[state][currChar]
    if state terminal
        return token
    collectedChars += currChar
}

//
// Lab5
// scanner_transition_table.c
#include "scanner.h"
TOKEN *ungottenToken = NULL; // may be used by parser in the next lab

//
// return token to the input, so it can be analyzed again
//
void ungetToken(TOKEN **token)
{
    ungottenToken = *token;
    *token = NULL;
}

//
// clean up the token structure
//
void freeToken(TOKEN **token)
{
    if (*token == NULL)
        return;

    if ((*token)->strVal != NULL)
        free((*token)->strVal);

    free(*token);

    *token = NULL;
}

//
// check if a collected sequence of characters is a keyword
//
void updateTypeIfKeyword(TOKEN *token)
{
    // TODO Implement the function
}

TRANS_TABLE_TYPE *scanInit()
{
    TRANS_TABLE_TYPE *returnTable = NULL;

    // TODO Implement the function

    return returnTable;
}

int findIndexToClass(TRANS_TABLE_TYPE *transitionTable, char c)
{
    int class;

    // TODO Implement the function

    return class;
}

TOKEN *scanner(TRANS_TABLE_TYPE *transitionTable)
{
    TOKEN *token = NULL;

    // TODO Implement the functions

    return token;
}

// Lab5
// scanner.h
//

#ifndef __SCANNER_H
#define __SCANNER_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum
{
    INVALID_TOKEN = 0,
    NUMBER_TOKEN,  // 1
    IDENT_TOKEN,   // 2
    ASSIGNMENT,    // 3
    SEMICOLON,     // 4
    LPAREN,        // 5
    RPAREN,        // 6
    PLUS,          // 7
    MINUS,         // 8
    MULT,          // 9
    DIV,           // 10
    MOD,           // 11
    REPEAT,
    PRINT,
    END_OF_INPUT_TOKEN
} TOKEN_TYPE;

typedef struct token
{
    TOKEN_TYPE type;
    char *strVal;
} TOKEN;

typedef struct
{
    int numberOfStates;
    int numberOfClasses;
    char **inputSymbolClasses;
    char **table;
} TRANS_TABLE_TYPE;

TRANS_TABLE_TYPE *scanInit();
void updateTypeIfKeyword(TOKEN *token);
int findIndexToClass(TRANS_TABLE_TYPE *transitionTable, char c);
TOKEN* scanner(TRANS_TABLE_TYPE *outputTable);

void ungetToken(TOKEN **);
void freeToken(TOKEN **);

#define BUF_SIZE 128
#define MAX_LINE_LENGTH 256

#endif

// Lab5
// main.c (driver for a scanner test)
//

#include "scanner.h"
#include <unistd.h>

int main(int argc, char** argv)
{
    TOKEN *token = NULL;
    char *token2str[] = {"INVALID", "NUMBER", "IDENT", "ASSIGNMENT", "SEMICOLON", "LPAREN", "RPAREN", "PLUS", "MINUS", "MULT", "DIV", "MOD", "REPEAT", "PRINT", "END_OF_INPUT"};

    int savedstdin = dup(0); // save stdin file descriptor

    freopen(argv[1], "r", stdin); // possibly switch stdin to get the table from the file

    TRANS_TABLE_TYPE *transitionTable = scanInit();

    dup2(savedstdin, 0); // restore the original stdin, so we can read the input data from stdin

    while ((token = scanner(transitionTable)) != NULL)
    {
        if (token->strVal == NULL)
            printf("{%s} ", token2str[token->type]);
        else
            printf("{%s, %s} ", token2str[token->type], token->strVal);
        freeToken(&token);
        fflush(stdout);
    }
    printf("\n");
}