Plov003 : first attempt for a parser
After having made a first version of the symbol table and a means to process symbols, here is the first attempt to create a parser for Plov. At this moment, I think this is a dead end, but since this website is not only about successes, I want to share this failure with you as well.
Plov002 : attempting to generate a parser
Below is the stepwise refined source of Plov002. This source will most likely not compile. At this moment, I have the impression that this is not the right way to get the parsing engine running. See for yourself:
MODULE Plov002;
(* Plov is the PL/0 compiler as modified by Verhoeven *)
(* Ver Comment : Date *)
(* ----- ----------------------------------------------------- ------------ *)
(* 0.01 First version: read symbols and store them in a list 01 Oct 2007 *)
(* 0.02 Introduce a limited symbol processor *)
IMPORT InOut, MemPools, Strings, SYSTEM;
TYPE Identifier = ARRAY [0..31] OF CHAR;
SymbolType = (Program, Procedure, Constant, Variable);
SymbolPtr = POINTER TO SymbolNode;
SymbolNode = RECORD
Name : Identifier;
previous,
next : SymbolPtr;
type : SymbolType;
value,
address : CARDINAL
END;
VAR string : Identifier;
token : SymbolType;
Symbols : MemPools.MemPool;
firstSymbol,
lastSymbol,
thisSymbol : SymbolPtr;
currentType : SymbolType;
currentValue : CARDINAL;
PROCEDURE ErrorMessage (n : CARDINAL);
BEGIN
CASE n OF
1 : InOut.WriteString ("Undefined keyword : ");
InOut.WriteString (token) |
2 : InOut.WriteString ("PROGRAM name already defined") |
3 : InOut.WriteString ("END name does not match PROGRAM name")
ELSE
InOut.WriteString ("Unknown error detected.")
END;
InOut.WriteLn
END ErrorMessage;
PROCEDURE StoreSymbol (str : Identifier) : BOOLEAN;
VAR thisOne, nextOne : SymbolPtr;
BEGIN
thisOne := firstSymbol;
LOOP
IF Strings.StrEq (thisOne^.Name, str) THEN RETURN FALSE END;
IF thisOne^.next = NIL THEN EXIT END;
thisOne := thisOne^.next
END;
MemPools.PoolAllocate (Symbols, nextOne, SYSTEM.TSIZE (SymbolNode));
thisOne^.next := nextOne;
WITH nextOne^ DO
Name := str;
type := currentType;
next := NIL;
previous := thisOne
END;
lastSymbol := nextOne;
RETURN TRUE
END StoreSymbol;
(*
PROCEDURE PrintNames;
VAR thisOne : SymbolPtr;
BEGIN
thisOne := firstSymbol;
LOOP
InOut.WriteString (thisOne^.Name); InOut.WriteLn;
IF thisOne^.next = NIL THEN
EXIT
ELSE
thisOne := thisOne^.next
END
END
END PrintNames;
*)
PROCEDURE Program;
BEGIN
InOut.ReadString (string);
IF FirstSymbol^.Name = "|" THEN
WITH FirstSymbol^ DO
Name := string;
type := Program
END
ELSE
ErrorMessage (2);
HALT
END
END Program;
PROCEDURE Constant;
BEGIN
END Constant;
PROCEDURE Variable;
BEGIN
END Variable;
PROCEDURE Procedure;
BEGIN
END Procedure;
PROCEDURE Loop;
BEGIN
END Loop;
PROCEDURE If;
BEGIN
END If;
PROCEDURE Exit;
BEGIN
END Exit;
PROCEDURE Return;
BEGIN
END Return;
PROCEDURE Begin;
BEGIN
INC (beginDepth);
END Begin;
PROCEDURE End;
BEGIN
InOut.ReadString (string);
DEC (beginDepth);
IF beginDepth = 0 THEN
Exhausted := TRUE;
IF Strings.StrEq (string, firstSymbol^.Name) = FALSE THEN
ErrorMessage (3);
HALT
END
END;
END End;
PROCEDURE Condition;
BEGIN
END Condition;
PROCEDURE Assignment;
BEGIN
END Assignment;
PROCEDURE GetSymbol;
BEGIN
InOut.ReadString (string);
IF Strings.StrEq (string, "PROGRAM") THEN Program
ELSIF Strings.StrEq (string, "CONSTANT") THEN Constant
ELSIF Strings.StrEq (string, "VARIABLE") THEN Variable
ELSIF Strings.StrEq (string, "PROCEDURE") THEN Procedure
ELSIF Strings.StrEq (string, "BEGIN") THEN Begin
ELSIF Strings.StrEq (string, "LOOP") THEN Loop
ELSIF Strings.StrEq (string, "IF") THEN If
ELSIF Strings.StrEq (string, "EXIT") THEN Exit
ELSIF Strings.StrEq (string, "RETURN") THEN Return
ELSIF Strings.StrEq (string, "END") THEN End
ELSE
Assignment
END
END GetSymbol;
PROCEDURE Init;
BEGIN
MemPools.NewPool (Symbols);
MemPools.PoolAllocate (Symbols, firstSymbol, SYSTEM.TSIZE (SymbolNode));
WITH firstSymbol^ DO
Name := "|";
next := NIL
END;
currentType := Program;
beginDepth := 0
END Init;
BEGIN
Init;
REPEAT
GetSymbol
UNTIL Exhausted;
MemPools.KillPool (Symbols)
END Plov002.
The new symbol table
As you can see, I extended the symbol table with a new field: 'Previous' which enables going back in the list. This is the preferred method for searching for symbols.
| Plov001 | Plov002 | |
|---|---|---|
RECORD
Name : Identifier;
next : SymbolPtr;
type : SymbolType;
value,
address : CARDINAL
END;
|
RECORD
Name : Identifier;
previous,
next : SymbolPtr;
type : SymbolType;
value,
address : CARDINAL
END;
|
The parsing engine
My parser will not be looking ahead one character. It will look ahead one SYMBOL. So I came up with the following for a first parsing engine:
PROCEDURE GetSymbol;
BEGIN
InOut.ReadString (string);
IF Strings.StrEq (string, "PROGRAM") THEN Program
ELSIF Strings.StrEq (string, "CONSTANT") THEN Constant
ELSIF Strings.StrEq (string, "VARIABLE") THEN Variable
ELSIF Strings.StrEq (string, "PROCEDURE") THEN Procedure
ELSIF Strings.StrEq (string, "BEGIN") THEN Begin
ELSIF Strings.StrEq (string, "LOOP") THEN Loop
ELSIF Strings.StrEq (string, "IF") THEN If
ELSIF Strings.StrEq (string, "EXIT") THEN Exit
ELSIF Strings.StrEq (string, "RETURN") THEN Return
ELSIF Strings.StrEq (string, "END") THEN End
ELSE
Assignment
END
END GetSymbol;
I get a symbol from the source and then see how it fits in with the language. In the big IF statement, all the
words are listed which may be at the beginning of a line. If such a keyword is recognized, the compiler issues
a command with a similar name as the keyword and then that procedure handles the syntax checks. This sounds
familiar when you read the Wirth books. It smells like recursive descent.
The parsing loop itself can be kept very simple:
Init; REPEAT GetSymbol UNTIL Exhausted; MemPools.KillPool (Symbols)It can't get much simpler! Symbols are fetched from the source file and after the last symbol was read, the 'Exhausted' flag is raised and execution stops. The problem now is, how to get the order of the program in it.... I managed to prorcess the PROGRAM keyword by checking for the first entry of the symbol table. If the '|' token is still there, there was not yet a PROGRAM keyword. But this is a trick.
In his books, Wirth uses procedures with names similar to the keywords. And he uses the keywords in EBNF sequence in a parsing procedure. That will be the next step.
This is the reason for not completing this source. And hence: if you try to compile Plov002 you will get lots of work to get it accepted by Mocka.
Page created on 2 October 2007 and
Page equipped with FroogleBuster technology