LANGUAGE PROCESSORS LABORATORY

Last updated on 3rd March 2003.


Objective: The students will undertake a sequence of experiments aimed at the design and implementation of the various phases of a compiler for the Pascal like programming language given below. This subset includes a sufficiently rich collection of data types and control structures.

Here is the grammar of the source language.


Experiment No. 1

Date of evaluation: 30th January 2003

Title: Design and implementation of a scanner

Guidelines on how to present the results Answers to queries



(Romil)

1. In the grammer specified two rules are there:

type definition ::= identifier = type
relational operator:= = | <> | < | > | <= | >=

On encountering '=' what do we return its tokentype as? RELOP or something else? How do we decide?

Return the keyword as relop. In the parser, verify that the relop has the specific value '=' when used in the context of type definition.


2.  factor ::= variable | constant | ( expression ) | function call
     constant ::= integer constant | real constant

Since string constant is not there in definition, I assume that presence of string in the code will be an error. Is my assumption correct, or the string constant was not mentioned by mistake.

I have added string constants to the grammar. Later on I shall add a couple of predefined functions with which you can do meaningful things.



(Nikhil Sethi)

1. To implement boolean we should have two keywords true and false.

I have added these to the grammar.


2. Should we treat operators "and" and "or" as boolean operator (in the grammar they are named as addition and multiplication ops respectively).

They are additive and multiplicative operators. This means that they are analogous to '+' and '*'. However they are boolean operators.


3. Enumerated types are not defined in the grammar.

You seem unhappy! Add them if you want.  However they are not a must.


4. There is no unary minus in the grammar. In the lab we discussed this with you and decided it was not feasible to detect negative numbers.   So we return -6+7 as four tokens "-","6","+","7" and not as "-6", "+", "7".  To implement this there should be a unary minus in the grammar.

This is an interesting point. You have seen a Fortran DO whose detection requires a trailing context. A token is a DO if it is follwed some time later by a comma. Negative numbers seem to reqire a leading context.  - 43 is a negative number if it is not preceded by an identifier or a number. However LEX does not have provisions of defining a token with a leading context.

So dont detect negative numbers. Make a number negative with a unary minus which I have added to the grammar.


5. Are we supposed to implement strings/characters in the grammar.

Strings, yes. Characters, no.



Experiment No. 2

Date of evaluation: 13th February 2003

Title: Design and implementation of a skeletal parser

The objective of this assignment is familiarization with parser generators and development of a LALR parser for the source language. For this assignement you will use the parser generator  YACC (alternatively Bison) and the grammar specification given above.

Some changes are required in the structure of your compiler when you interface your scanner with the parser. The symbol table routines should now become a part of the parser. The parser makes a call to the scanner whenever it needs to look at the next token. After returning from the scanner, the parser must update the appropriate table. For example an identifier should be entered in the symbol table, a constant in the constants' table, etc.

Guidelines on how to present the results




 
 

Experiment No. 3

Date of evaluation: 20th March 2003

Title: Declaration processing, scope and type analysis.

The objective of this assignment is to perform semantic analysis such as type and scope analysis and declaration processing, and integrate such analyses with the parser.

Notes on how to conduct the experiments:

Guidelines on how to present the results
 





 
 

Experiment No. 4

Date of evaluation: 10th April 2003

Title: Code Generation.

For the original grammar, Generate code for a restricted subset of C the original grammar. You could base your code generation strategy on the naive code generation strategy given in ASU, or the simple code generator outlined in the slides. There is no dearth of material on x86 assembly language. The specific assembly language that you have to generate code for is the GNU assembly language (GAS). Here is a short reference on GAS. You can get further information by doing 'info as' on  Linux systems.  Here is another reference on the SunOS assembly language.

Guidelines on how to present the results: