PLOV : Pitfalls
During the making of the PLOV compiler I ran into many a pitfall. Most pits were rather shallow but still: you need to get out again and get on your feet again. This section is about the pitfalls I recollect as such and I am sharing them with the world.
When I found myself in yet another pit, I was very lucky to have read the Wirth books about compiler construction. While reading these books I understood WHAT Niklaus explained but not always WHY. At the bottom of the pit, you also get an understanding of the latter aspect: the WHY of certain measures taken in the act of building a compiler...
Compiletime versus runtime
When going through a source text you have to distinguish between compiletime and runtime. The compiled source will be used at runtime, and the code generator needs to take this into account to a certain degree. The semantic checker, however, is not interested in runtime at all; it only deals with compiletime. One example: the LOOP/END and the ways to get out of it.
LOOP
  Kelvin := Kelvin + ee
  LOOP
    Fahr := ee + x
    IF Fahr > 12 THEN EXIT END
    Fahr := Fahr - 1
  END
  IF Kelvin = 10000 THEN EXIT END
  Kelvin := ee
END
A nested LOOP, and two EXIT statements. A LOOP/END is in fact a simple 'GOTO LOOPstart'. The trick is to get out again, and there are two methods. When used inside a PROCEDURE you can eject through the RETURN instruction. In most cases, however, you only have the EXIT keyword to leave a LOOP. And the EXIT instruction must be coded as a JUMP TO LOOP-END. But LOOP-END is not known yet! And it changes in the nested loop! So on entry, a LOOPcount counter is incremented to identify all labels dealing with LOOPs. With unnested LOOPs this is sufficient. But with nested LOOPs you also need a LOOPdepth counter, incremented on entry and decremented again when the LOOP is abandoned.
Note the word 'abandoned' here. This is a runtime word. In compiletime you just go through the source and you do not leave a LOOP. In compiletime, LOOPs are terminated by an END, simply because the EBNF dictates it.
But my mind was in glitch mode again, so I came up with a criterion that involved the LOOPdepth count. So in my parser I had the line
LOOPdepth := 0
in the procedure that dealt with the RETURN keyword... Nonsense of course. The parser just walks through the source! And for the parser a LOOP ends at the corresponding END! Even when an EXIT statement is found, the parser is still INSIDE the LOOP until it finds the END.
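A minimal sketch of why that reset goes wrong, written in Python rather than Modula-2 so it can be run on its own. The flat token list and the check_exits function are made up for illustration. The text above does not say which symptom the stray reset produced; the sketch assumes the parser rejects an EXIT whenever LOOPdepth is zero, and shows that an EXIT which is textually inside a LOOP then gets rejected after a RETURN.

```python
def check_exits(tokens, reset_depth_on_return):
    """Walk a flat token list and report every EXIT the checker would reject."""
    loop_depth = 0
    errors = []
    for position, token in enumerate(tokens):
        if token == "LOOP":
            loop_depth += 1
        elif token == "END":
            # In this toy stream every END closes a LOOP.
            loop_depth = max(loop_depth - 1, 0)
        elif token == "RETURN" and reset_depth_on_return:
            # The mistaken run-time reasoning: "RETURN leaves all loops".
            loop_depth = 0
        elif token == "EXIT" and loop_depth == 0:
            errors.append(f"token {position}: EXIT without LOOP")
    return errors


# LOOP ... RETURN ... EXIT ... END: the EXIT is textually inside the LOOP.
tokens = ["LOOP", "RETURN", "EXIT", "END"]
print(check_exits(tokens, reset_depth_on_return=True))
print(check_exits(tokens, reset_depth_on_return=False))
```

With the reset enabled the perfectly legal EXIT is flagged; without it the list of errors stays empty, because the depth only changes at LOOP and at its END.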
So this is what the LOOP keyword boils down to:
PROCEDURE isLOOP;
VAR LC : CARDINAL;
BEGIN
  INC (LOOPcount);
  LC := LOOPcount;
  CG ("LABEL LOOP-"); TextIO.PutCard (outFile, LC, 1);
  CG ("");
By using the LC counter, which is local to THIS OCCURRENCE of LOOP, I have the LOOP entry point coupled to the LOOP exit point. When a new LOOP is entered, it works with its own local copy of LC.
  INC (LOOPdepth); EXITcount [LOOPdepth] := LOOPcount;
EXITcount stores the numbers of the exit points, one per nesting depth. LC is local to isLOOP, and isEXIT is a separate procedure, so it cannot see the LC variable...
  GetSymbol;
  StatementSequence;
  IF Strings.StrEq (token, "END") THEN
    DEC (LOOPdepth);
    CG ("JUMP TO LOOP-"); TextIO.PutCard (outFile, LC, 1);
    CG ("")
  ELSE
    ErrorMessage (8) (* END expected *)
  END;
  CG ("LABEL XLOOP-"); TextIO.PutCard (outFile, LC, 1);
  CG ("");
  GetSymbol
END isLOOP;
PROCEDURE isEXIT;
BEGIN
  IF LOOPdepth = 0 THEN
    ErrorMessage (11) (* EXIT without LOOP *)
  ELSE
    CG ("JUMP TO XLOOP-"); TextIO.PutCard (outFile, EXITcount [LOOPdepth], 1);
    CG ("")
  END;
  GetSymbol
END isEXIT;

PROCEDURE isRETURN;
BEGIN
  IF PROCdepth = 0 THEN
    ErrorMessage (19) (* RETURN without PROCEDURE *)
  END;
  CG ("RETURN"); CG ("");
  GetSymbol
END isRETURN;
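To see the label numbering at work, the procedures above can be imitated in a few lines of Python. This is only a toy sketch, not the PLOV parser: the class name LoopCodeGen, the token list and the use of a Python list in place of the EXITcount array are assumptions for illustration, and IF statements are left out so that every END in the token stream closes a LOOP.

```python
class LoopCodeGen:
    """Toy stand-in for the isLOOP / isEXIT pair; not the real PLOV parser."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.loop_count = 0   # LOOPcount: numbers every LOOP in the source
        self.exit_stack = []  # plays the role of EXITcount [1..LOOPdepth]
        self.out = []         # generated PALO-like lines

    def token(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def statement_sequence(self):
        while self.token() not in (None, "END"):
            if self.token() == "LOOP":
                self.is_loop()
            elif self.token() == "EXIT":
                self.is_exit()
            else:
                # Any other statement is copied through unchanged.
                self.out.append(self.token())
                self.pos += 1

    def is_loop(self):
        self.loop_count += 1
        lc = self.loop_count          # local to THIS occurrence of LOOP
        self.out.append(f"LABEL LOOP-{lc}")
        self.exit_stack.append(lc)    # INC (LOOPdepth); EXITcount [...] := ...
        self.pos += 1                 # skip LOOP
        self.statement_sequence()
        if self.token() != "END":
            raise SyntaxError("END expected")
        self.exit_stack.pop()         # the LOOP ends at its END, nowhere else
        self.out.append(f"JUMP TO LOOP-{lc}")
        self.out.append(f"LABEL XLOOP-{lc}")
        self.pos += 1                 # skip END

    def is_exit(self):
        if not self.exit_stack:
            raise SyntaxError("EXIT without LOOP")
        self.out.append(f"JUMP TO XLOOP-{self.exit_stack[-1]}")
        self.pos += 1                 # skip EXIT


source = ["LOOP", "a", "LOOP", "b", "EXIT", "c", "END", "EXIT", "d", "END"]
gen = LoopCodeGen(source)
gen.statement_sequence()
print("\n".join(gen.out))
```

The inner EXIT jumps to XLOOP-2 and the outer one to XLOOP-1, although both are handled by the same is_exit code: the top of the stack always names the innermost LOOP the parser is textually inside.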
The IF THEN reversal
Let's have a look at a simple IF construct:
IF Kelvin = 300 THEN Kelvin := 10 END
The code generator was kind enough to translate this into the following PALO:
FETCH VALUE OF Kelvin
STORE 300
IF NOT EQUAL JUMP TO LABEL-4
STORE ADDRESS OF Kelvin
STORE 10
SAVE RESULT
LABEL-4
Note the reversal of the condition. My PLOV source says 'IF Kelvin equals 300' and the compiler changes this to IF NOT EQUAL JUMP TO LABEL-4. Which is logical. The most natural thing is to continue execution right after the condition, with the THEN part. So the program flow must be conditionally deflected around that part, and that only makes sense with the inverted condition: the jump is taken exactly when the condition does NOT hold...
For the time being it works. But I need to rework this along the lines of the LOOP/END, with a dedicated IFcount variable, to easily allow nested IF statements.
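One possible shape of such a rework, sketched in Python so it can be run standalone. It is not the PLOV implementation: the tuple encoding of statements, the function name gen_statements and the label name XIF-n (modelled on XLOOP-n) are all assumptions. Only the '=' comparison is handled, because IF NOT EQUAL is the only conditional jump shown in the PALO listing above.

```python
import itertools


def gen_statements(statements, if_counter, out):
    """Emit PALO-like lines for assignments and (possibly nested) IF ... = ..."""
    for stmt in statements:
        if stmt[0] == "assign":
            _, name, value = stmt
            out.extend([f"STORE ADDRESS OF {name}", f"STORE {value}", "SAVE RESULT"])
        elif stmt[0] == "if_equal":
            _, name, value, body = stmt
            n = next(if_counter)      # local to THIS occurrence of IF
            out.extend([
                f"FETCH VALUE OF {name}",
                f"STORE {value}",
                # Inverted condition: skip the body when the test fails.
                f"IF NOT EQUAL JUMP TO XIF-{n}",   # XIF-n is a hypothetical label name
            ])
            gen_statements(body, if_counter, out)
            out.append(f"LABEL XIF-{n}")
        else:
            raise ValueError(f"unknown statement kind: {stmt[0]}")
    return out


# IF Kelvin = 300 THEN IF Fahr = 12 THEN Fahr := 0 END Kelvin := 10 END
program = [
    ("if_equal", "Kelvin", 300, [
        ("if_equal", "Fahr", 12, [("assign", "Fahr", 0)]),
        ("assign", "Kelvin", 10),
    ]),
]
palo = gen_statements(program, itertools.count(1), [])
print("\n".join(palo))
```

Because each IF keeps its own number in a local variable while the nested IF is being generated, the inner statement closes with XIF-2 before the outer one closes with XIF-1, the same coupling of entry and exit that LC provides for LOOP.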
Page created on 24 August 2008 and