Reading from file

My favorite method of reading from a file is by redirecting input, using InOut.RedirectInput. This way, all the handy InOut type handling routines are available for reading from the file.

But big boys may stumble into situations where they need to read from more than one file. Then redirecting input runs out of luck and you definitely need the FileSystem functions to have several files open.

If you have read the 'Strings' section, you may have found that I was not too pleasantly surprised to find out that file writes were done with Java Unicode tokens. Not ASCII characters. And my own limited mind is too stupid to handle Unicode tokens. But then, I'm not from India, so what could you expect? Not much!
So I did a test and was (again) unpleasantly surprised.


dum1: copying a file to screen

I wrote a small source to copy a small file to screen. A sort of cat function. Here's the source:

MODULE dum1;

IMPORT	FileSystem, InOut;

VAR	inF	: FileSystem.File;
	ch	: CHAR;

BEGIN
  FileSystem.Lookup (inF, "dum1.mod", FALSE);
  IF  inF.res = FileSystem.done   THEN
    InOut.WriteString ("File 'dum1.mod' opened");	InOut.WriteLn;
    FileSystem.SetRead (inF);
    FileSystem.ReadChar (inF, ch);
    WHILE  inF.eof = FALSE  DO
      InOut.Write (ch);
      FileSystem.ReadChar (inF, ch)
    END;
    FileSystem.Close (inF)
  ELSE
    InOut.WriteString ("File not found");
    InOut.WriteLn
  END;
  InOut.WriteLn
END dum1.
   
The line in red was added later. Not that it mattered though. As you can see, the program tries to read its own source code and display it on screen. This is the output:
jan@Beryllium:~/modula/mhc/3$ java dum1
File 'dum1.mod' opened
????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????
jan@Beryllium:~/modula/mhc/3$ java dum1 | wc
      2       4     288
jan@Beryllium:~/modula/mhc/3$ ls -l dum1.mod
-rw-r--r-- 1 jan users 529 2010-07-17 22:02 dum1.mod
   
That's NOT my source code. It's a bunch of question marks. And although the source file was 529 bytes long, wc counts only 288 - 23 - 1 = 264 question marks (23 bytes go to the 'opened' line and 1 to the final newline). Now, 264 is half of 529, rounded down. So apparently, each pair of original characters was replaced by one question mark. This is an indication that, again, the function tries to read Unicode tokens. Not what I want. I'm not from India, so I cannot do that.
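The byte count can be checked with a few lines of plain Java. This is only a sketch under an assumption: that the reader behind ReadChar consumes two bytes per character, the way java.io.DataInput.readChar does. The class and method names (PairCount, countTwoByteChars, expectedOutputBytes) are made up for this illustration and are not part of mhc.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class PairCount {
    // Counts how many two-byte chars can be read from `size` bytes of input.
    static int countTwoByteChars(int size) {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 'M');   // any ASCII filler will do
        int count = 0;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                in.readChar();           // consumes two bytes, or throws at end of input
                count++;
            }
        } catch (EOFException e) {
            // fewer than two bytes were left: normal end of the loop
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Bytes wc should see, assuming one '?' byte per char read:
    // header line + question marks + final newline.
    static int expectedOutputBytes(int fileSize) {
        String header = "File 'dum1.mod' opened\n";
        return header.length() + countTwoByteChars(fileSize) + 1;
    }

    public static void main(String[] args) {
        System.out.println(countTwoByteChars(529));    // prints 264
        System.out.println(expectedOutputBytes(529));  // prints 288
    }
}
```

Under that assumption the numbers line up with what wc reported: 264 characters from a 529 byte file, and 288 bytes of output in total. The odd last byte of the file is simply lost.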


Looking up the sources

So it's time again for a look in the source code of FileSystem.mod (the irrelevant parts are removed):

PROCEDURE ReadChar( VAR f : File; VAR ch : CHAR );
BEGIN
  SYSTEM.INLINE
  (
    ch.val = (char)0;
    if (f.id instanceof java.io.InputStreamReader)
    ...
    else if (f.id instanceof java.io.OutputStreamWriter)
    ...
    else if (f.id instanceof java.io.RandomAccessFile)
    ...
    else
  );
END ReadChar;
   
InputStreamReader. My bully. The function is called 'ReadChar' but should be called 'ReadUnicodeChar'. It reads a word and then does some magic. And then it kicks me in the butt. Hmm. What would the function look like in InOut.Read?
PROCEDURE Read( VAR ch : CHAR );
BEGIN
  FileSystem.ReadChar( in, ch );
  Done := (in.res = FileSystem.done) AND NOT in.eof;
  termCH := ch;
END Read;
   
Again, like in the previous case, Read just uses the very same procedure that has fooled me. Apparently, the way a file is opened determines what the definition of a character is.


How are read access files opened?

InOut.RedirectInput does it like this:

PROCEDURE RedirectInput( from : ARRAY OF CHAR );
BEGIN
  ...
  SYSTEM.INLINE
  ( try
    { java.lang.String s = new java.lang.String( from, 0, i );
      in.id = new java.io.FileReader( s );
      Done = true;
    }
    catch (java.io.FileNotFoundException e)
    { ... }
  );
  IF Done THEN
  ...
  END;
END RedirectInput;
   
This is the good news. InOut.RedirectInput uses java.io.FileReader (the input equivalent of java.io.FileWriter). Now let's see what FileSystem.Lookup uses:
PROCEDURE Lookup( VAR f : File; FileName : ARRAY OF CHAR; isNew : BOOLEAN );
BEGIN
  ...
  SYSTEM.INLINE  
  ( java.io.File handle = new java.io.File( new java.lang.String( FileName ) );
    ...
    if (handle != null)
    { try
      { f.id = new java.io.RandomAccessFile( handle, "rw" ); }
      ...
  );
END Lookup;
   
It just seems to open the file in Random Access Mode. Not very worrying. Apparently, however, I need to create a new function, called OpenReader, that opens a file as a java.io.FileReader function (or mode, or whatever it's called today).
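To see why the kind of handle matters, here is a small stand-alone Java sketch. It does NOT reproduce the elided parts of FileSystem.ReadChar; it only assumes that a reader handle is read with Reader.read and a random access handle with RandomAccessFile.readChar, which is a guess based on the instanceof tests shown earlier. The names HandleDispatch, readAll and demo are hypothetical.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class HandleDispatch {
    // Reads everything from the handle, choosing the read call by handle type.
    static String readAll(Object id) {
        StringBuilder sb = new StringBuilder();
        try {
            if (id instanceof InputStreamReader) {
                InputStreamReader r = (InputStreamReader) id;
                int c;
                while ((c = r.read()) != -1) {
                    sb.append((char) c);        // one decoded character per call
                }
            } else if (id instanceof RandomAccessFile) {
                RandomAccessFile raf = (RandomAccessFile) id;
                try {
                    while (true) {
                        sb.append(raf.readChar());  // two raw bytes per call
                    }
                } catch (EOFException e) {
                    // end of file reached
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Writes "MODULE" to a temp file and reads it back through both handle types.
    static String[] demo() {
        try {
            File tmp = File.createTempFile("dispatch", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), "MODULE".getBytes(StandardCharsets.US_ASCII));
            try (FileReader fr = new FileReader(tmp);
                 RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
                return new String[] { readAll(fr), readAll(raf) };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]);                                // prints MODULE
        System.out.println(result[1].length());                       // prints 3
        System.out.println(Integer.toHexString(result[1].charAt(0))); // prints 4d4f
    }
}
```

The FileReader handle returns the six letters as written. The RandomAccessFile handle glues 'M' (0x4D) and 'O' (0x4F) into the single character 0x4D4F, and so on, which would explain both the halved count and the unprintable output.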

If you are familiar with the Oberon language (the successor of Modula-2) you may be surprised by the similarities between Oberon and Java naming conventions. And Oberon pre-dates Java... So it looks like the Java designers were to a certain degree influenced by Wirth's Oberon language. But Wirth was not an American so he was ignored. And now we have to make do with this monster, called Java.


The workaround

I can't explain exactly what I needed to show (everyone has this once in a while, and today was my turn again) but I applied a small hack to the source of dum1.mod:

MODULE dum1;

IMPORT	FileSystem, InOut, SYSTEM;

VAR	inF	: FileSystem.File;
	ch	: CHAR;

BEGIN
(*  FileSystem.Lookup (inF, "dum1.mod", FALSE);	*)
  SYSTEM.INLINE (
    try
    {  inF.id = new java.io.FileReader ("dum1.mod");  }
    catch (java.io.FileNotFoundException e)
    {  inF.id = 0;  }
  );
  IF  inF.res = FileSystem.done   THEN
    InOut.WriteString ("File 'dum1.mod' opened");	InOut.WriteLn;
    FileSystem.SetRead (inF);
    FileSystem.ReadChar (inF, ch);
    WHILE  inF.eof = FALSE  DO
      InOut.Write (ch);
      FileSystem.ReadChar (inF, ch)
    END;
    FileSystem.Close (inF)
  ELSE
    InOut.WriteString ("File not found");
    InOut.WriteLn
  END;
  InOut.WriteLn
END dum1.
   
It's all in the inline section. I feel ten years younger again: programming in FST Modula-2 and doing the naughty things in 8086 assembly. Now I'm doing similar things in Java, which is the true universal assembler of this century. I will tell you how this source fared: it compiled to Java bytecode, but only after I added some Java ornamentation. I packaged the lot in a try/catch pair and added a constant value for stdin (I just hope it was really '0'). Now both mhc and javac no longer object.

See it run:
jan@Beryllium:~/modula/mhc/3$ java dum1
File 'dum1.mod' opened
MODULE dum1;

IMPORT  FileSystem, InOut, SYSTEM;

VAR     inF     : FileSystem.File;
        ch      : CHAR;

BEGIN
  (*  FileSystem.Lookup (inF, "dum1.mod", FALSE); *)
  SYSTEM.INLINE
  (
    try
    {
      inF.id = new java.io.FileReader ("dum1.mod");
    }
    catch (java.io.FileNotFoundException e)
    {
      inF.id = 0;
    }
  );
  IF  inF.res = FileSystem.done   THEN
    InOut.WriteString ("File 'dum1.mod' opened");       InOut.WriteLn;
    FileSystem.SetRead (inF);
    FileSystem.ReadChar (inF, ch);
    WHILE  inF.eof = FALSE  DO
      InOut.Write (ch);
      FileSystem.ReadChar (inF, ch)
    END;
    FileSystem.Close (inF)
  ELSE
    InOut.WriteString ("File not found");
    InOut.WriteLn
  END;
  InOut.WriteLn
END dum1.
jan@Beryllium:~/modula/mhc/3$
   
Apparently, inF.res equals FileSystem.done by default, because the inline hack only sets inF.id and never touches inF.res. Not sure this is the true Modula-2 approach. But we have the source of this module so it's quite easy to adapt it. For the time being, things will not be changed though. Only an addition will be made, a procedure named OpenReader which opens a file in FileReader access mode.
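What the Java side of such an OpenReader might look like is sketched below. This is not the procedure that ended up in FileSystem.mod; it is a guess at its core, written as plain Java so it can be tried on its own. The Handle class is a hypothetical stand-in for FileSystem.File with just two fields, and the boolean done stands in for "res = FileSystem.done". The point of the sketch is that success and failure are both recorded explicitly instead of relying on a default value.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class OpenReaderSketch {
    // Hypothetical stand-in for FileSystem.File: only the fields used here.
    static class Handle {
        Object id;      // the underlying Java stream object, or null
        boolean done;   // stands in for "res = FileSystem.done"
    }

    // Opens fileName for character reading and records the outcome.
    static Handle openReader(String fileName) {
        Handle f = new Handle();
        try {
            f.id = new FileReader(fileName);
            f.done = true;
        } catch (FileNotFoundException e) {
            f.id = null;
            f.done = false;   // lets the caller reach its "File not found" branch
        }
        return f;
    }

    // Tries a missing file and a freshly written temp file; returns a summary.
    static String demo() {
        try {
            Handle missing = openReader("no-such-file-for-openreader-sketch.mod");
            File tmp = File.createTempFile("dum1", ".mod");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), "MODULE dum1;".getBytes(StandardCharsets.US_ASCII));
            Handle found = openReader(tmp.getPath());
            try (FileReader r = (FileReader) found.id) {
                return missing.done + " " + found.done + " " + (char) r.read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints: false true M
    }
}
```

Inside mhc the body of openReader would presumably sit in a SYSTEM.INLINE section, much like the hack above, with the result stored in the res field of the file record.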

Page created on 17 July 2010.
