For the remainder of the semester we will be building programs that interpret a small language. The language will have constants, a small number of keywords, and some operators.



The remainder of the semester will be broken into three pieces: 

Assignment 2 - Lexical analyzer

Assignment 3 - Parser

Assignment 4 - Interpreter


For Assignment 2, the lexical analyzer, you will be provided with a description of the lexical syntax of the language. You will produce a lexical analysis function, and a program to test it.


The lexical analysis function is named getTok. Its first argument is a reference to an istream that the function should read from. Its second argument is a reference to an integer that holds the current line number; getTok should increment this integer every time it reads a newline. getTok returns a Tok, which is a class that contains a Token, a string for the lexeme, and the line number on which the Tok was found.


A header file, lex.h, will be provided for you. It contains a declaration for the Tok class, and a declaration for all of the Token values. You MUST use the header file that is provided. You may NOT change it. 


The lexical rules of the language are as follows: 

1. The language has identifiers, which are defined to be a letter followed by zero or more letters or digits. This will be the Token IDENT. 

2. The language has integer constants, which are defined to be one or more digits with an optional leading sign. This will be the Token ICONST. 

3. The language has string constants, which are a double-quoted sequence of characters, all on the same line. This will be the Token SCONST. 

4. A string constant can include escape sequences: a backslash followed by a character. The sequence \n should be interpreted as a newline. The sequence \\ should be interpreted as a backslash. All other escapes should simply be interpreted as the character after the backslash. 

5. The language has reserved the keywords print, println, repeat, begin, end. They will be Tokens PRINT PRINTLN REPEAT BEGIN END. 

6. The language has several operators. They are + - * / = ( ) which will be, in the same order, the Tokens PLUS MINUS STAR SLASH EQ LPAREN RPAREN.

