CGI and datafiles.
Simple CGI executables need only some data handed over to them via the POST or GET methods, and all results are returned via the standard output channel (back into the webserver for further processing). But slightly more complex CGIs need some memory, to be able to retrieve or reconstruct a previous state. A page counter, for example. So the bigger CGI programs need to be able to access files on disk. That's the subject of this topic.
Case 1: See if we can access files.
The first program is about being able to open a file and, if the file does not exist, report that to the user. We will use the by now familiar method of converting the webbrowser screen to a VT-100 terminal. Here's the source of the first program (called 'rooster', which is a Dutch word):
MODULE rooster;

IMPORT ASCII, InOut, TextIO, cgi;

PROCEDURE Init;
VAR   File : TextIO.File;
      ch   : CHAR;
BEGIN
   TextIO.OpenInput (File, "data.test");
   IF NOT TextIO.Done () THEN
      (* The file could not be opened: tell the user and stop. *)
      InOut.WriteString ("Cannot open file. Aborting.");
      InOut.WriteLn;
      HALT
   ELSE
      (* Copy the file to standard output, one character at a time. *)
      REPEAT
         TextIO.GetChar (File, ch);
         InOut.Write (ch)
      UNTIL TextIO.EOF (File) = TRUE;
      InOut.WriteBf
   END;
   TextIO.Close (File)
END Init;

BEGIN
   cgi.InformServer (cgi.Text);
   Init;
END rooster.
It's a simple program and it compiles (with the 9905m compiler, to which I have converted completely). As you
can see: only qualified imports. It takes some more typing, but the program gets another look. Perhaps the C++
gurus will have fewer problems reading this kind of source...
The output of rooster
Below is a copy of the screen of the Mocka IDE, showing how the module is compiled and what it does. Remember: we are still in a dry run. The CGI executable is running in console mode, without any attachment to a webbrowser or webserver. As I said before: I converted to Mocka 9905m, which was modified by Dr Maurer.
jan@beryllium:~/modula/cgi$ mocka
Mocka 9905m
>> i rooster
>> p rooster
.. Compiling Program Module rooster I/0002 II/0002
.. Linking rooster
>> rooster
Content-type:text/plain
Hi there, This is the content of the file 'data.test'. See if we get to see it or not....
>> mv data.test Data.test
>> rooster
Content-type:text/plain
Cannot open file. Aborting.
>> q
jan@beryllium:~/modula/cgi$

A short file and a short outcome. But it works, on the command line. Now see if we can get it to work via the webserver. And it does! It's cool and cruel to see these silly magic words on the VT-100 screen. It took some effort and research to get things done, but now it works.
The tricks needed to get the thing working:
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
<Directory "/usr/lib/cgi-bin">
    AllowOverride None
    Options ExecCGI -MultiViews +SymLinksIfOwnerMatch
    Order allow,deny
    Allow from all
</Directory>
Look for the line containing the words 'ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/' or something similar. In my
case, the CGI executables need to be in '/usr/lib/cgi-bin'.
The rest was easy. Copy the files, make sure they have the right privileges and start the file as usual.
You need a line, similar to

<a href="/cgi-bin/rooster" Target = "main">Try rooster</a>

as I did in the navigator (right-click the mouse in the navigator frame, choose 'This frame' and 'View frame source' to see this same line) or could have done in just about any HTML file.
Case 2: Creating files.
The second test is trying to write files to disk. I want to open a file, display the contents and then write back something else to the same file. So that, if we read the file next time, we can see that things have changed in between. So I made the following executable:
MODULE rooster2;

IMPORT ASCII, InOut, TextIO, cgi;

PROCEDURE Init;
VAR   File     : TextIO.File;
      ch, Ch   : CHAR;
      count, n : CARDINAL;
BEGIN
   TextIO.OpenInput (File, "Counter.data");
   IF NOT TextIO.Done () THEN
      InOut.WriteString ("Cannot open file. Aborting.");
      InOut.WriteLn;
      HALT
   END;
   count := 0;
   n := 0;
   (* Show the file and remember its first character in Ch. *)
   REPEAT
      TextIO.GetChar (File, ch);
      InOut.Write (ch);
      IF count = 0 THEN Ch := ch END;
      INC (count)
   UNTIL TextIO.EOF (File) = TRUE;
   TextIO.Close (File);
   InOut.WriteLn;
   (* Rewrite the file with the next letter, count + 1 characters long. *)
   TextIO.OpenOutput (File, "Counter.data");
   IF Ch = 'Z' THEN Ch := 'A' ELSE INC (Ch) END;
   REPEAT
      TextIO.PutChar (File, Ch);
      INC (n)
   UNTIL n > count;
   TextIO.Close (File)
END Init;

BEGIN
   cgi.InformServer (cgi.Text);
   Init;
END rooster2.
This executable compiles without errors and it does what it was intended to do:
jan@beryllium:~/modula/cgi$ mocka
Mocka 9905m
>> i rooster2
>> p rooster2
>> .. Compiling Program Module rooster2 I/0002 II/0002
.. Linking rooster2
>> rooster2
Content-type:text/plain
AAAAAAAAAAAAAAAAAAAA
>> rooster2
Content-type:text/plain
BBBBBBBBBBBBBBBBBBBBBB
>>

As you can see, on the command line, this program works. Now see if it runs through Apache.
beryllium:/usr/lib/cgi-bin# chmod 666 Counter.data

After this, things were settled. Program 'rooster2' just didn't have the right permissions on its datafile: mode 666 lets every user, including the one the webserver runs as, read and write 'Counter.data'. In the meantime, we have proven that CGI programs can open files for reading and writing in the '/cgi-bin' directory. I also built in a safety precaution for the file 'Counter.data'. If the character becomes 'Z', it wraps back to 'A', but the count keeps on incrementing.
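What the chmod above changes can be made visible with a small sketch. It is written in Python rather than Modula-2, purely as an illustration. The temporary file stands in for 'Counter.data' and the helper name mode_string is made up for this example.

```python
import os
import stat
import tempfile


def mode_string(path):
    """Return the permission string that 'ls -l' would show for path."""
    return stat.filemode(os.stat(path).st_mode)


# A temporary file stands in for 'Counter.data' (an assumption for this demo).
fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, 0o644)   # owner may write, everybody else may only read
before = mode_string(path)

os.chmod(path, 0o666)   # same as 'chmod 666': everybody may read and write
after = mode_string(path)

print(before, "->", after)
os.remove(path)
```

With mode 644 only the owner of the file can write to it, so a webserver running under another user name is locked out. Mode 666 opens the file to everybody, which is convenient but also the reason to look for something safer.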
Case 3: A simple counter.
Now that we can read and write data to files, it becomes interesting to write some more useful data to a file. So I made the 'log' module as shown below. It reads a number (in ASCII) from a file and processes it:
MODULE log;

IMPORT cgi, InOut, TextIO, Strings;

VAR   File   : TextIO.File;
      number : CARDINAL;

BEGIN
   cgi.InformServer (cgi.Text);
   TextIO.OpenInput (File, 'CounterFile');
   IF TextIO.Done () = FALSE THEN
      number := 0                       (* no file yet: start from zero *)
   ELSE
      TextIO.GetCard (File, number);
      TextIO.Close (File)
   END;
   INC (number);
   (* Write the new count back, replacing the old contents. *)
   TextIO.OpenOutput (File, 'CounterFile');
   TextIO.PutCard (File, number, 15);
   TextIO.Close (File);
   InOut.WriteString ("Access counter now is :");
   InOut.WriteCard (number, 15);
   InOut.WriteLn;
   InOut.WriteBf
END log.
If run in a text console, without a webserver in between, it works as could be expected:
jan@beryllium:~/modula/cgi$ mocka
Mocka 9905m
>> i log
>> p log
>> .. Compiling Program Module log I/0002 II/0002
.. Linking log
>> rm CounterFile
>> log
Content-type:text/plain
Access counter now is : 1
>> log
Content-type:text/plain
Access counter now is : 2
>>

This is promising. Combine this with the previous experiences and we have to:

  o <a href="/cgi-bin/log" Target = "main">log</a>

And now do clicka-di-click on the word 'log' in your navigator frame. Each time you click, a new VT-100 console is presented in your webbrowser screen. Isn't life wonderful?
I do have to admit one thing: I was unable to let the 'log' CGI executable create the file 'CounterFile' by itself. On the command line it does as I programmed. But when in the CGI directory, 'log' simply will not create the file in '/cgi-bin'. Most probably this is caused by some permission property I have overlooked so far. For the time being, let's be cheerful that we can make a very simple and fast page counter.
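One possible explanation, offered as an assumption and not verified on this machine: creating a new file needs write and search permission on the directory itself, and '/usr/lib/cgi-bin' may simply not be writable for the user the webserver runs as. A minimal sketch of such a check, in Python for illustration only (the helper name can_create_file_in is made up):

```python
import os
import tempfile


def can_create_file_in(directory):
    """True if the current process may add new files to directory.

    Creating a file needs write and search (execute) permission on the
    directory itself. os.access checks this for the real user id.
    """
    return os.access(directory, os.W_OK | os.X_OK)


# Demo on a scratch directory that we own. On the machine described here,
# '/usr/lib/cgi-bin' would be the interesting argument, checked while
# running as the webserver user.
scratch = tempfile.mkdtemp()
writable = can_create_file_in(scratch)
print(scratch, "writable:", writable)
os.rmdir(scratch)
```

This would also fit the earlier observation that an existing file with mode 666 can be rewritten: changing a file only needs permission on the file, while creating one needs permission on the directory.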
Case 4: the CGI executable outside /cgi-bin.
If I cannot have the CGI executable create a file in '/cgi-bin', the logical solution is to put my CGI program
outside the cgi directory and run it there, where the datafiles live. So I parked the 'log' program in a
random directory and made an HTML reference to it.
On the right, you see what happens when you try to run that program.
It won't run at all, since it is not in the sole directory that is allowed to contain executables. Your webbrowser (in this case Firefox) will show a window in which you have two choices, but we are looking for the third option, which seems to be prohibited.
The unsafe workaround.
Based on the previous tests, I managed to come up with a workaround that works on my own computer which has an
Apache webserver running by default. Do not try this on the server of a commercial webhost.
The datafile cannot be created due to permission problems. My executable is able to run, but that's all. It
needs to be more in control, so I gave it some. As user 'root' I set the SUID bit to make the program acquire
God-status in the Unix world. Below you see what I did. For this moment I ask you to test this case on your
own private server.
beryllium:/usr/lib/cgi-bin# chmod 4755 log
beryllium:/usr/lib/cgi-bin# ls -lh
total 28K
-rw-rw-rw- 1 root root 15 2006-06-29 22:47 CounterFile
-rwsr-xr-x 1 root root 28K 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin# rm CounterFile
beryllium:/usr/lib/cgi-bin# ls -lh
total 28K
-rwsr-xr-x 1 root root 28K 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin#

At this point, I ran the 'log' program via the navigator frame. It produced the correct result (1) so it worked and now I'm curious whether it also created the file 'CounterFile'. I won't keep you in suspense:
beryllium:/usr/lib/cgi-bin# ls -lh
total 28K
-rw-r--r-- 1 root www-data 15 2006-06-30 01:35 CounterFile
-rwsr-xr-x 1 root root 28K 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin#

As you can see, raising the permissions of the executable enabled it to create the required file. But the 'ls' command also tells us that the file is not only owned by root. The group it belongs to is 'www-data'. Now, perhaps, this has some potential for alternatives:
beryllium:/usr/lib/cgi-bin# chmod 755 log
beryllium:/usr/lib/cgi-bin# ls -lh
total 28K
-rw-r--r-- 1 root www-data 15 2006-06-30 01:35 CounterFile
-rwxr-xr-x 1 root root 28K 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin# rm CounterFile
beryllium:/usr/lib/cgi-bin# chown root:www-data log
beryllium:/usr/lib/cgi-bin# ls -lh
total 28K
-rwxr-xr-x 1 root www-data 28K 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin#

At this point, I ran the 'log' program via the navigator frame. It produced a blank screen...
beryllium:/usr/lib/cgi-bin# ls -l
total 28K
-rwxr-xr-x 1 root www-data 27784 2006-06-29 22:21 log
beryllium:/usr/lib/cgi-bin#

Not successful. The executable definitely needs the SUID bit. This puts me in a troublesome situation. I now have the key that enables the CGI executable to create files. But I doubt whether I can do something similar at the webhost where this website is running. I'm just a user there (UID = 'frutt').
As usual, the easiest way to see if it is possible is to just try it. No guts, no glory. So I started an ncftp
session and changed directory to /cgi-bin. The system did not object when I issued a 'chmod 4755 log' command
and a following 'ls -lh' showed that the file 'log' really was 'SetUID' to user 'frutt'.
But when I ran the file, through a webbrowser, I got an internal server error. Apparently this is not the way
to do this thing.
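The two modes used in this section, 755 and 4755, differ only in the set-user-ID bit (octal 4000). A minimal sketch, in Python for illustration only, that decodes both modes into the strings 'ls' prints; the helper names describe and has_setuid are made up for this example:

```python
import stat


def describe(mode):
    """Return the 'ls -l' style string for a regular file with this mode."""
    return stat.filemode(stat.S_IFREG | mode)


def has_setuid(mode):
    """True if the set-user-ID bit (octal 4000) is part of mode."""
    return bool(mode & stat.S_ISUID)


for mode in (0o755, 0o4755):
    label = "setuid" if has_setuid(mode) else "no setuid"
    print(oct(mode), describe(mode), label)
```

The 's' in the owner's execute position is how 'ls' shows that the program will run with the identity of the file's owner instead of the identity of whoever started it.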
Case 5: Brute force.
If everything else fails, apply brute force. So I did. I made the 'llog' program which is a source for experimenting with file locations. It is very similar to 'log'. Here is the source of 'llog':
MODULE llog;

IMPORT cgi, InOut, TextIO, Strings;

VAR   File   : TextIO.File;
      number : CARDINAL;

BEGIN
   cgi.InformServer (cgi.Text);
   TextIO.OpenInput (File, '/CounterFile');
   IF TextIO.Done () = FALSE THEN
      number := 0
   ELSE
      TextIO.GetCard (File, number);
      TextIO.Close (File)
   END;
   INC (number);
   TextIO.OpenOutput (File, '/CounterFile');
   TextIO.PutCard (File, number, 15);
   TextIO.Close (File);
   InOut.WriteString ("Access counter now is :");
   InOut.WriteCard (number, 15);
   InOut.WriteLn;
   InOut.WriteBf
END llog.
The main difference is that I now added a slash ('/') in front of the filename. And after compiling, I set the
SUID bit of the executable. Then I ran it through Apache. Here's the result:
beryllium:/usr/lib/cgi-bin# cp /home/jan/modula/cgi/bin/llog .
beryllium:/usr/lib/cgi-bin# chmod 4755 llog

At this point, I ran the 'llog' program via the navigator frame. It showed that the counter was incremented.
beryllium:/usr/lib/cgi-bin# updatedb
beryllium:/usr/lib/cgi-bin# locate CounterFile
/CounterFile
/home/jan/internet/fruttenboel/cgi/CounterFile
/home/jan/modula/cgi/CounterFile
beryllium:/usr/lib/cgi-bin# ls / -lh
total 21K
drwxr-xr-x 2 root root 2.2K 2006-06-10 14:15 bin
drwxr-xr-x 3 root root 384 2006-06-10 13:52 boot
lrwxrwxrwx 1 root root 11 2006-06-09 20:27 cdrom -> media/cdrom
-rw-r--r-- 1 root www-data 15 2006-06-30 21:52 CounterFile

It's clear: the new file 'CounterFile' has been created in the root directory of the filesystem, just as I instructed it to do. This is a strong hint that a path starting with '/' is treated as an absolute path, counted from the root of the filesystem, rather than relative to the DocumentRoot (as I had hoped).
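The difference between the two filenames used so far can be sketched as follows, in Python for illustration only. The helper name resolve is made up, and taking '/usr/lib/cgi-bin' as the working directory of the CGI process is an assumption (Apache normally starts a CGI program in the directory that holds the executable).

```python
import posixpath


def resolve(name, working_dir):
    """Show where a file name ends up when it is opened from working_dir."""
    if posixpath.isabs(name):
        # A leading '/' means: count from the root of the filesystem.
        return name
    # Anything else is relative to the working directory of the process.
    return posixpath.normpath(posixpath.join(working_dir, name))


# '/usr/lib/cgi-bin' is the CGI directory from this page, used here as the
# assumed working directory of the CGI process.
print(resolve("CounterFile", "/usr/lib/cgi-bin"))
print(resolve("/CounterFile", "/usr/lib/cgi-bin"))
```

Neither form has anything to do with the DocumentRoot: that is a notion of the webserver's URL mapping, not of the operating system that opens the file.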
I changed 'llog' so that the filename was "~/Counterfile", recompiled, put the executable in /cgi-bin and ran the CGI program. Result: a blank screen. Apparently the program didn't run properly and no file was created. Time to go to my friends at Google again, this time with a very cunning search phrase.
Case 6: Gentle force.
On the net I found a page that sheds some light, albeit a dim light: at http://www.xcf.berkeley.edu/help-sessions/cgi/x220.html you can read the following:
Another important note regarding user identities as they relate to CGI programs is the fact that programs run as the user which started the process. So, a CGI program which is started by the web server will run with the identity of the web server, not with the identity of the user who created it. This is especially important for write-access to files. If a CGI program, written by the user "luser", relies on a file "cgidata" which luser has in her HTML-file directory, when it is run by the web server [with user-identity "www"], it will not have access to the "cgidata" file, unless luser made the file world-readable. If the CGI program is to write to the file, things get even worse. luser would not want to make the file world-writeable, since any other user on the machine could write to the file, which is a Bad Thing. But since the CGI program runs as www and not as luser, that is the only way to write to the file. Since most CGI application will want to write to files, there is a problem.
Fortunately, there exists a better solution. Unix has a special permission for programs called "setuid". This means that when the program is run, it runs with the identity of the user who owns the file, not the identity of the user executing the program. Thus, when the web server [user "www"] executes luser's program, it runs with luser's permissions. luser can then make the file user-write-able, and the program would have the ability to write to it, but other users would not. Use of the setuid bit is somewhat dangerous since it allows access to otherwise private files, but with care can be used to great advantage.
This seems to be the key to our solution. By applying a correct SetUID to the CGI executable, it can access files which are in the diskspace which is allocated to that specific user. It's worth some further investigation.
MODULE llog;

IMPORT cgi, InOut, TextIO, Strings;

VAR   File   : TextIO.File;
      number : CARDINAL;

BEGIN
   cgi.InformServer (cgi.Text);
   TextIO.OpenInput (File, '/home/jan/CounterFile');
   IF TextIO.Done () = FALSE THEN
      number := 0
   ELSE
      TextIO.GetCard (File, number);
      TextIO.Close (File)
   END;
   INC (number);
   TextIO.OpenOutput (File, '/home/jan/CounterFile');
   TextIO.PutCard (File, number, 15);
   TextIO.Close (File);
   InOut.WriteString ("Access counter now is :");
   InOut.WriteCard (number, 15);
   InOut.WriteLn;
   InOut.WriteBf
END llog.
I changed the ownership and privileges of the executables in the /cgi-bin as follows:
jan@beryllium:~/data$ ls -l ~
-rw-r--r-- 1 jan www-data 15 2006-07-01 00:34 CounterFile

The file is created 'by me', as a member of another group ('www-data'). Thanks to the SetUID feature I can leave the file permissions at the default (and safer) 644: rw-r--r--. Quite a relief.
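The listing fits the usual Unix model: every process has a real user id (who started it) and an effective user id (whose permissions it uses), and the SetUID bit only replaces the second one. A minimal sketch that prints both, in Python for illustration only. Note that Linux honours the SetUID bit for compiled executables such as 'llog' but ignores it for interpreted scripts, so this sketch merely shows where the two numbers come from; the helper name identity is made up.

```python
import os


def identity():
    """Return (real uid, effective uid) of the running process."""
    return os.getuid(), os.geteuid()


real, effective = identity()
print("real uid:", real, "effective uid:", effective)
if real != effective:
    # This is the situation of a SetUID executable started by the webserver.
    print("running with a borrowed (SetUID) identity")
```

New files get the effective user id as their owner, which would explain why 'CounterFile' belongs to user 'jan' while its group is still that of the webserver.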
Case 7: Does 'Tilde' work after SetUID?
Now that we have come this far, the question is: does a program that is owned by me, and whose UID is forced to mine, have the brains to also use my home directory structure for storing files?
To check this, I changed llog as follows (only the changed lines are listed):
TextIO.OpenInput (File, '~/CounterFile');
TextIO.OpenOutput (File, '~/CounterFile');

After compiling and chmodding, the disappointment follows: the white screen appears. And it remains white. In short: it doesn't work.
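A likely reason, stated as an assumption since the Mocka library source was not checked here: the tilde is a convenience of the command shell. The shell replaces '~' by the home directory before a program ever sees the name. A filename handed straight to the operating system is taken literally, so '~' is just a directory called '~' in the working directory. A minimal sketch in Python, for illustration only; the home directory '/home/jan' is taken from this page and set here for the demo.

```python
import os
import posixpath
import tempfile

# Home directory from this page, set here only for the demo.
os.environ["HOME"] = "/home/jan"

literal = "~/CounterFile"

# What a shell (or an explicit library call) makes of the name.
expanded = posixpath.expanduser(literal)
print(expanded)

# Without expansion, '~' is an ordinary directory name. In an empty
# working directory no such directory exists, so the open fails.
workdir = tempfile.mkdtemp()
try:
    open(os.path.join(workdir, literal), "w").close()
    created = True
except FileNotFoundError:
    created = False
print("opened without expansion:", created)
os.rmdir(workdir)
```

If this is indeed what happens, the program has to build the full path itself, for instance by writing out '/home/jan/CounterFile' as was done in Case 6.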
Conclusions:
Experimental results:
Page created on 28 June 2006 and