Plov003 : first attempt for a parser

After having made a first version of the symbol table and a means to process symbols, here is the first attempt to create a parser for Plov. At this moment, I think this is a dead end, but since this website is not only about successes, I want to share this failure with you as well.

Plov002 : attempting to generate a parser

Below is the stepwise refined source of Plov002. This source will most likely not compile. At this moment, I have the impression that this is not the right way to get the parsing engine running. See for yourself:

MODULE Plov002;

(*  Plov is the PL/0 compiler as modified by Verhoeven				*)
(*  Ver	  Comment						:     Date  	*)
(*  ----- -----------------------------------------------------   ------------ 	*)
(*  0.01  First version: read symbols and store them in a list	  01 Oct 2007	*)
(*  0.02  Introduce a limited symbol processor					*)

IMPORT	InOut, MemPools, Strings, SYSTEM;


TYPE	Identifier	= ARRAY [0..31] OF CHAR;
	SymbolType	= (Program, Procedure, Constant, Variable);
	SymbolPtr	= POINTER TO SymbolNode;
	SymbolNode	= RECORD
			    Name	: Identifier;
			    previous,
			    next	: SymbolPtr;
			    type	: SymbolType;
			    value,
			    address	: CARDINAL
			  END;

VAR	string		: Identifier;
	token		: SymbolType;
	Symbols		: MemPools.MemPool;
	firstSymbol,
	lastSymbol,
	thisSymbol	: SymbolPtr;
	currentType	: SymbolType;
	currentValue	: CARDINAL;


PROCEDURE ErrorMessage (n  : CARDINAL);

BEGIN
   CASE  n  OF
      1 : InOut.WriteString ("Undefined keyword : ");
      	  InOut.WriteString (token)					|
      2 : InOut.WriteString ("PROGRAM name already defined")		|
      3 : InOut.WriteString ("END name does not match PROGRAM name")
   ELSE
      InOut.WriteString ("Unknown error detected.")
   END;
   InOut.WriteLn
END ErrorMessage;


PROCEDURE StoreSymbol (str   : Identifier) : BOOLEAN;

VAR	thisOne, nextOne	: SymbolPtr;

BEGIN
   thisOne := firstSymbol;
   LOOP
      IF  Strings.StrEq (thisOne^.Name, str)  THEN  RETURN FALSE  END;
      IF  thisOne^.next = NIL  THEN  EXIT  END;
      thisOne := thisOne^.next
   END;
   MemPools.PoolAllocate (Symbols, nextOne, SYSTEM.TSIZE (SymbolNode));
   thisOne^.next := nextOne;
   WITH  nextOne^  DO
      Name := str;
      type := currentType;
      next := NIL;
      previous := thisOne
   END;
   lastSymbol := nextOne;
   RETURN TRUE
END StoreSymbol;

(*
PROCEDURE PrintNames;

VAR	  thisOne    		: SymbolPtr;

BEGIN
   thisOne := firstSymbol;
   LOOP
      InOut.WriteString (thisOne^.Name);		InOut.WriteLn;
      IF  thisOne^.next = NIL  THEN
         EXIT
      ELSE
         thisOne := thisOne^.next
      END
   END
END PrintNames;
*)

PROCEDURE Program;

BEGIN
   InOut.ReadString (string);
   IF  FirstSymbol^.Name = "|"  THEN
      WITH  FirstSymbol^  DO
	 Name := string;
	 type := Program
      END
   ELSE
      ErrorMessage (2);
      HALT
   END
END Program;


PROCEDURE Constant;

BEGIN
END Constant;


PROCEDURE Variable;

BEGIN
END Variable;


PROCEDURE Procedure;

BEGIN
END Procedure;


PROCEDURE Loop;

BEGIN
END Loop;


PROCEDURE If;

BEGIN
END If;


PROCEDURE Exit;

BEGIN
END Exit;


PROCEDURE Return;

BEGIN
END Return;


PROCEDURE Begin;

BEGIN
   INC (beginDepth);
END Begin;


PROCEDURE End;

BEGIN
   InOut.ReadString (string);
   DEC (beginDepth);
   IF  beginDepth = 0  THEN
      Exhausted := TRUE;
      IF  Strings.StrEq (string, firstSymbol^.Name) = FALSE  THEN
         ErrorMessage (3);
	 HALT
      END
   END;
   
END End;


PROCEDURE Condition;

BEGIN
END Condition;


PROCEDURE Assignment;

BEGIN
END Assignment;


PROCEDURE GetSymbol;

BEGIN
   InOut.ReadString (string);
   IF	  Strings.StrEq (string, "PROGRAM")	THEN	Program
   ELSIF  Strings.StrEq (string, "CONSTANT")	THEN	Constant
   ELSIF  Strings.StrEq (string, "VARIABLE")	THEN	Variable
   ELSIF  Strings.StrEq (string, "PROCEDURE")	THEN	Procedure
   ELSIF  Strings.StrEq (string, "BEGIN")	THEN	Begin
   ELSIF  Strings.StrEq (string, "LOOP")	THEN	Loop
   ELSIF  Strings.StrEq (string, "IF")		THEN	If
   ELSIF  Strings.StrEq (string, "EXIT")	THEN	Exit
   ELSIF  Strings.StrEq (string, "RETURN")	THEN	Return
   ELSIF  Strings.StrEq (string, "END")		THEN	End
   ELSE
      Assignment
   END
END GetSymbol;


PROCEDURE Init;

BEGIN
   MemPools.NewPool (Symbols);
   MemPools.PoolAllocate (Symbols, firstSymbol, SYSTEM.TSIZE (SymbolNode));
   WITH  firstSymbol^  DO
      Name := "|";
      next := NIL
   END;
   currentType := Program;
   beginDepth  := 0
END Init;


BEGIN
   Init;

   REPEAT  
      GetSymbol
   UNTIL  Exhausted;

   MemPools.KillPool (Symbols)
END Plov002.
   

The new symbol table

As you can see, I extended the symbol table with a new field: 'Previous' which enables going back in the list. This is the preferred method for searching for symbols.

Plov001   Plov002
RECORD
   Name		: Identifier;
   next		: SymbolPtr;
   type		: SymbolType;
   value,
   address	: CARDINAL
END;
      
 RECORD
    Name	: Identifier;
    previous,
    next	: SymbolPtr;
    type	: SymbolType;
    value,
    address	: CARDINAL
END;
      

The parsing engine

My parser will not be looking ahead one character. It will look ahead one SYMBOL. So I came up with the following for a first parsing engine:

PROCEDURE GetSymbol;

BEGIN
   InOut.ReadString (string);
   IF	  Strings.StrEq (string, "PROGRAM")	THEN	Program
   ELSIF  Strings.StrEq (string, "CONSTANT")	THEN	Constant
   ELSIF  Strings.StrEq (string, "VARIABLE")	THEN	Variable
   ELSIF  Strings.StrEq (string, "PROCEDURE")	THEN	Procedure
   ELSIF  Strings.StrEq (string, "BEGIN")	THEN	Begin
   ELSIF  Strings.StrEq (string, "LOOP")	THEN	Loop
   ELSIF  Strings.StrEq (string, "IF")		THEN	If
   ELSIF  Strings.StrEq (string, "EXIT")	THEN	Exit
   ELSIF  Strings.StrEq (string, "RETURN")	THEN	Return
   ELSIF  Strings.StrEq (string, "END")		THEN	End
   ELSE
      Assignment
   END
END GetSymbol;
   
I get a symbol from the source and then see how it fits in with the language. In the big IF statement, all the words are listed which may be at the beginning of a line. If such a keyword is recognized, the compiler issues a command with a similar name as the keyword and then that procedure handles the syntax checks. This sounds familiar when you read the Wirth books. It smells like recursive descent.

The parsing loop itself can be kept very simple:

Init;

REPEAT  
   GetSymbol
UNTIL  Exhausted;

MemPools.KillPool (Symbols)
   
It can't get much simpler! Symbols are fetched from the source file and after the last symbol was read, the 'Exhausted' flag is raised and execution stops. The problem now is, how to get the order of the program in it.... I managed to prorcess the PROGRAM keyword by checking for the first entry of the symbol table. If the '|' token is still there, there was not yet a PROGRAM keyword. But this is a trick.
With this scheme, I need to set progress variables and these will complicate making of the program. And it will virtually rule out serious modifications and code maintenance.

In his books, Wirth uses procedures with names similar to the keywords. And he uses the keywords in EBNF sequence in a parsing procedure. That will be the next step.

This is the reason for not completing this source. And hence: if you try to compile Plov002 you will get lots of work to get it accepted by Mocka.

Page created on 2 October 2007 and

Page equipped with FroogleBuster technology