Question
Write C or C++ Program!
Write a tokenizer. The tokenizer should input a stream of ASCII characters and output the token, its line number in the stream, its type, and its value. You must write the code from scratch without the use of any lexicographical parsing libraries or utilities.
Tokenizer should recognize the following:
1. A few keywords: "if", "else", "for", "while"
2. A few single-character symbols: '&', '|', '+', '*', ':', ';'
3. Labels (alpha-numerical strings)
4. Integers (including negative integers)
5. Floating-point numbers in radix notation (such as pi, i.e. 3.14159265)
6. Floating-point numbers in exponential notation (such as Avogadro's number, i.e. 6.022140857E23)
Limit the solution to at most 100 lines of code and include an I/O example. The input must be read from a file.
Example code provided with the question:
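The six token classes listed above can be told apart once a complete lexeme has been collected. The following is only a minimal sketch of that classification step, not a full solution: the `TokType` names and the `classify` function are hypothetical, and it assumes that a label must start with a letter and that an exponent may follow either an integer or a radix mantissa, neither of which the question states explicitly.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token types; the names are illustrative only. */
enum TokType { T_ERROR, T_KEYWORD, T_SYMBOL, T_LABEL, T_INT, T_FLOAT, T_EXP };

/* Classify one complete lexeme (no surrounding whitespace). */
enum TokType classify(const char *s) {
    static const char *keywords[] = { "if", "else", "for", "while" };
    enum TokType t = T_INT;
    size_t i = 0, digits = 0, k;

    if (s[0] == '\0') return T_ERROR;

    /* Single-character symbols (item 2). */
    if (s[1] == '\0' && strchr("&|+*:;", s[0])) return T_SYMBOL;

    /* Keywords (item 1) take precedence over labels. */
    for (k = 0; k < sizeof keywords / sizeof keywords[0]; k++)
        if (strcmp(s, keywords[k]) == 0) return T_KEYWORD;

    /* Assumption: a label is a letter followed by letters or digits. */
    if (isalpha((unsigned char)s[0])) {
        for (i = 1; s[i] != '\0'; i++)
            if (!isalnum((unsigned char)s[i])) return T_ERROR;
        return T_LABEL;
    }

    /* Numbers: optional '-', digits, optional ".digits", optional E[+-]digits. */
    if (s[i] == '-') i++;
    while (isdigit((unsigned char)s[i])) { i++; digits++; }
    if (digits == 0) return T_ERROR;

    if (s[i] == '.') {                       /* radix notation (item 5) */
        i++;
        digits = 0;
        while (isdigit((unsigned char)s[i])) { i++; digits++; }
        if (digits == 0) return T_ERROR;
        t = T_FLOAT;
    }
    if (s[i] == 'E' || s[i] == 'e') {        /* exponential notation (item 6) */
        i++;
        if (s[i] == '+' || s[i] == '-') i++;
        digits = 0;
        while (isdigit((unsigned char)s[i])) { i++; digits++; }
        if (digits == 0) return T_ERROR;
        t = T_EXP;
    }
    /* Anything left over makes the lexeme invalid. */
    return s[i] == '\0' ? t : T_ERROR;
}
```

Under these assumptions, `classify("-42")` yields `T_INT`, `classify("3.14159265")` yields `T_FLOAT`, and `classify("6.022140857E23")` yields `T_EXP`. A real tokenizer would still need to decide where each lexeme ends, for example whether `-` directly after a label starts a negative number.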
#include <stdio.h>
#include <string.h>

// ---------------------------------------------------------------------
// this trivial program reads a stream and outputs tokens
// valid tokens:
//   0: error
//   1: whitespace (blank, tab, CR, LF)
//   2: word (lowercase only)
//   3: number (unsigned decimal integer)
// ---------------------------------------------------------------------
#define STATE_ERROR      0
#define STATE_WHITESPACE 1
#define STATE_WORD       2
#define STATE_NUMBER     3

char szWhiteSpace[]=" \t\r\n";
char szWord[]="abcdefghijklmnopqrstuvwxyz";
char szNumber[]="0123456789";
char *szStates[]={"ERROR", "WHITESPACE", "WORD", "NUMBER"};

int main(void){
  int c;   // int, not char, so that EOF can be told apart from real characters
  char szToken[256];
  int nTokenSize=0;
  int nChars=0, nTokens=0, nLines=1;
  int nCurState=STATE_ERROR, nNextState=STATE_ERROR;

  while((c=getc(stdin))) {   // a NUL byte also ends the loop
    if(c==EOF) break;
    nChars++;

    // classify the character just read
    if(strchr(szWord, c)) nNextState=STATE_WORD;
    else if(strchr(szNumber, c)) nNextState=STATE_NUMBER;
    else if(strchr(szWhiteSpace, c)) nNextState=STATE_WHITESPACE;
    else nNextState=STATE_ERROR;

    if(nChars==1) nCurState=nNextState;

    // uncomment the following line to debug
    // printf("line %4d, char %4d: %c [%02X] (%d -> %d)\n", nLines, nChars, c, c, nCurState, nNextState);

    // same state: keep accumulating the current token (bounded by the buffer size)
    if(nNextState==nCurState) {
      if(nTokenSize<(int)sizeof(szToken)-1) szToken[nTokenSize++]=(char)c;
      if(c=='\n') nLines++;
      continue;
    }

    // state change: terminate and report the finished token
    szToken[nTokenSize]=0;
    if((nCurState==STATE_WORD)||(nCurState==STATE_NUMBER))
      printf("token %2d, line %2d: %10s (%s)\n", ++nTokens, nLines, szToken, szStates[nCurState]);

    // start the next token with the current character
    nTokenSize=0;
    szToken[nTokenSize++]=(char)c;
    nCurState=nNextState;
    if(c=='\n') nLines++;
  }

  // report a final token that is not followed by whitespace
  szToken[nTokenSize]=0;
  if(nTokenSize>0 && ((nCurState==STATE_WORD)||(nCurState==STATE_NUMBER)))
    printf("token %2d, line %2d: %10s (%s)\n", ++nTokens, nLines, szToken, szStates[nCurState]);

  return(nTokens);   // exit status is the number of tokens found
}
// ---------------------------------------------------------------------
------------------------------------------------------------------------
SAMPLE input:
------------------------------------------------------------------------
one
two
three

1 11 123 13456

one 1
two 2
------------------------------------------------------------------------
corresponding output: (./cs305_lex < in.txt)
------------------------------------------------------------------------
token  1, line  1:        one (WORD)
token  2, line  2:        two (WORD)
token  3, line  3:      three (WORD)
token  4, line  5:          1 (NUMBER)
token  5, line  5:         11 (NUMBER)
token  6, line  5:        123 (NUMBER)
token  7, line  5:      13456 (NUMBER)
token  8, line  7:        one (WORD)
token  9, line  7:          1 (NUMBER)
token 10, line  8:        two (WORD)
token 11, line  8:          2 (NUMBER)
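The sample run feeds the file through standard input with shell redirection. If the requirement that I/O be read from a file is taken to mean opening the file directly, the read loop could be fed from `fopen` instead. The sketch below is an assumption about how that might look: `scan_file` is a hypothetical helper that only counts lines, with a comment marking where the character classification from the example program would go.

```c
#include <stdio.h>

/* Hypothetical helper: reads the file at `path` one character at a time.
   Returns the number of lines seen, or -1 if the file cannot be opened. */
long scan_file(const char *path) {
    FILE *fp = fopen(path, "r");
    int c;              /* int so that EOF is distinguishable from any char */
    int last = '\n';    /* previous character; '\n' means "at start of a line" */
    long nLines = 0;

    if (fp == NULL) return -1;

    while ((c = getc(fp)) != EOF) {
        /* A tokenizer would classify c and update its state here. */
        if (last == '\n') nLines++;   /* first character of a new line */
        last = c;
    }
    fclose(fp);
    return nLines;
}
```

A `main` that takes the file name from `argv[1]` could call this helper and report an error when it returns -1. Counting a line when its first character arrives, rather than when a newline is seen, also counts a last line that has no trailing newline.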