1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
2 <HTML>
4 <HEAD>
6 <TITLE>CGIscriptor 2.0 Manual</TITLE>
9 </HEAD>
11 <BODY>
13 <H1 ALIGN="CENTER">
14 <I>CGIscriptor 2.0</I>: An implementation of integrated server side CGI scripts
15 </H1>
17 <UL>
18 <P>
19 <LI><A HREF="#HYPE">HYPE</A>
20 <LI><A HREF="#HOWITWORKS">THIS IS HOW IT WORKS</A>
21 <LI><A HREF="#HTML4">HTML 4 COMPLIANCE</A>
22 <LI><A HREF="#SECURITY">SECURITY</A>
23 </P>
25 <P>
26 <LI><A HREF="#MANUAL">USER MANUAL</A>
27 <UL>
28 <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
29 <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
30 <LI><A HREF="#BINFILES">NON-HTML FILES</A>
31 <LI><A HREF="#META">THE META TAG</A>
32 <LI><A HREF="#DIV">THE DIV/INS TAG</A>
33 <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
34 <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
35 <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
36 <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
37 <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
38 <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
39 <LI><A HREF="#APPLIC">APPLICATION MIME TYPES</A>
40 <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
41 <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
42 <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
43 <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
44 <LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
45 <LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
46 <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
47 <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
48 </UL>
49 <LI><A HREF="#license">license</A>
50 </P>
52 </UL>
54 <A NAME="HYPE"><H2 ALIGN="CENTER">HYPE</H2></A>
56 <P>
57 CGIscriptor merges plain ASCII HTML files transparently and safely
58 with CGI variables, in-line PERL code, shell commands, and executable
59 scripts in many languages (on-line and real-time). It combines the
60 "ease of use" of HTML files with the versatility of specialized
61 scripts and PERL programs. It hides all the specifics and
62 idiosyncrasies of correct output and CGI coding and naming. Scripts
63 do not have to be aware of HTML, HTTP, or CGI conventions just as HTML
64 files can be ignorant of scripts and the associated values. CGIscriptor
65 complies with the W3C HTML 4.0 recommendations.
66 </P>
68 <P>
69 In addition to its use as a WWW embedded CGI processor, it can
70 be used as a command-line document preprocessor (text-filter).
71 </P>
73 <A NAME="HOWITWORKS"><H2 ALIGN="CENTER">THIS IS HOW IT WORKS</H2></A>
75 <P>
76 The aim of CGIscriptor is to execute "plain" scripts inside a text file
77 using any required CGI parameters and environment variables. It
78 is optimized to transparently process HTML files inside a WWW server.
79 The native language is Perl, but many other scripting languages
80 can be used.
81 </P>
83 <P>
84 CGIscriptor reads text files from the requested input file (i.e., from
85 $YOUR_HTML_FILES$PATH_INFO) and writes them to &lt;STDOUT&gt; (i.e., the client
86 requesting the service) preceded by the obligatory
87 "Content-type: text/html\n\n" or "Content-type: text/plain\n\n" string
88 (except for "raw" files which supply their own Content-type message
89 and only if the SERVER_PROTOCOL contains HTTP, FTP, GOPHER, MAIL, or MIME).
90 </P>
92 <P>
93 When CGIscriptor encounters an embedded script, indicated by an HTML4 tag
94 </P>
96 <PRE>
97 &lt;SCRIPT TYPE="text/ssperl" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
98 PERL script
99 &lt;/SCRIPT&gt;
100 </PRE>
104 <PRE>
105 &lt;SCRIPT TYPE="text/osshell" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
106 OS Shell script
107 &lt;/SCRIPT&gt;
108 </PRE>
111 construct (anything between []-brackets is optional, other MIME-types are
112 supported), the embedded script is removed and both the contents of the
113 source file (i.e., "do 'ScriptSource'") AND the script are evaluated as a
114 PERL program (i.e., by eval()), a shell script (i.e., by a "safe" version
115 of `Command`, qx) or an external interpreter. The output of the eval()
116 function takes the place of the original &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
117 construct in the output string. Any CGI parameters declared by the CGI
118 attribute are available as simple perl variables, and can subsequently
119 be made available as variables to other scripting languages (e.g., bash,
120 python, or lisp).
121 </P>
124 Example: printing "Hello World"
125 </P>
127 <PRE>
128 &lt;HTML&gt;&lt;HEAD&gt;&lt;TITLE&gt;Hello World&lt;/TITLE&gt;&lt;/HEAD&gt;
129 &lt;BODY&gt;
130 &lt;H1&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;"Hello World"&lt;/SCRIPT&gt;&lt;/H1&gt;
131 &lt;/BODY&gt;&lt;/HTML&gt;
132 </PRE>
135 Save this in a file, hello.html, in the directory you indicated with
136 $YOUR_HTML_FILES and access http://your_server/SHTML/hello.html
137 (or to whatever name you use as an alias for CGIscriptor.pl).
138 This is really ALL you need to do to get going.
139 </P>
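<P>
The substitution step described above can be modelled in a few lines. The
following is only a Python sketch of the idea, not CGIscriptor's Perl code:
the names expand_scripts and toy_evaluate are hypothetical, and toy_evaluate
is a stand-in for Perl's eval() that understands nothing but a double-quoted
string literal.
</P>

```python
import re

# Matches <SCRIPT TYPE="text/ssperl" ...>body</SCRIPT>, case-insensitively.
SCRIPT_RE = re.compile(
    r'<SCRIPT\s+TYPE=["\']?text/ssperl["\']?[^>]*>(.*?)</SCRIPT>',
    re.IGNORECASE | re.DOTALL,
)

def expand_scripts(page, evaluate):
    """Replace each embedded script by the text returned from evaluate(body)."""
    return SCRIPT_RE.sub(lambda m: str(evaluate(m.group(1))), page)

def toy_evaluate(body):
    # Hypothetical stand-in for eval(): handles a double-quoted literal only.
    return body.strip().strip('"')

page = '<H1><SCRIPT TYPE="text/ssperl">"Hello World"</SCRIPT></H1>'
print(expand_scripts(page, toy_evaluate))
```

<P>
The client only ever sees the text that took the place of the script, which
is why the embedded code stays invisible in the delivered page.
</P>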
142 You can use any values that are delivered in CGI-compliant form (i.e.,
143 the "?name=value" type URL additions) transparently as "$name" variables
144 in your scripts IFF you have declared them beforehand in a META or SCRIPT tag, e.g.:
145 </P>
147 <PRE>
148 &lt;META CONTENT="text/ssperl; CGI='$name = `default value`'
149 [SRC='ScriptSource']"&gt;
150 </PRE>
152 <PRE>
153 &lt;SCRIPT TYPE=text/ssperl CGI="$name = 'default value'"
154 [SRC='ScriptSource']&gt;
155 </PRE>
158 After such a 'CGI' attribute, you can use $name as an ordinary PERL variable
159 (the ScriptSource file is immediately evaluated with "do 'ScriptSource'").
160 The CGIscriptor script allows you to write ordinary HTML files which will
161 include dynamic CGI aware (run time) features, such as on-line answers
162 to specific CGI requests, queries, or the results of calculations.
163 </P>
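<P>
The effect of declaring a variable with a default can be illustrated with a
small Python model. This is a sketch under assumptions, not the actual
implementation: declared_values is a hypothetical helper, and the
declarations are passed as a ready-made dictionary instead of being parsed
from a CGI attribute.
</P>

```python
from urllib.parse import parse_qs

def declared_values(query_string, declarations):
    """Return values only for declared names, falling back to their defaults."""
    supplied = parse_qs(query_string, keep_blank_values=True)
    values = {}
    for name, default in declarations.items():
        # Use the first supplied value, or the declared default when none was sent.
        values[name] = supplied[name][0] if name in supplied else default
    return values

# "secret" was never declared, so it does not reach the script.
print(declared_values("Question=Why%3F&secret=1", {"Question": "default value"}))
```

<P>
Undeclared parameters are simply never handed to the script, which is the
point of requiring a declaration first.
</P>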
166 For example, if you wanted to answer questions of clients, you could write
167 a Perl program called "Answer.pl" with a function "AnswerQuestion()"
168 that prints out the answer to requests given as arguments. You then write
169 a HTML page "Respond.html" containing the following fragment:
170 </P>
172 <hr>
173 <PRE>
174 &lt;CENTER&gt;
175 The Answer to your question
176 &lt;META CONTENT="text/ssperl; CGI='$Question'"&gt;
177 &lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;$Question&lt;/SCRIPT&gt;&lt;/h3&gt;
179 &lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./PATH/Answer.pl"&gt;
180 AnswerQuestion($Question);
181 &lt;/SCRIPT&gt;&lt;/h3&gt;
182 &lt;/CENTER&gt;
183 &lt;FORM ACTION=Respond.html METHOD=GET&gt;
184 Next question: &lt;INPUT NAME="Question" TYPE=TEXT SIZE=40&gt;&lt;br&gt;
185 &lt;INPUT TYPE=SUBMIT VALUE="Ask"&gt;
186 &lt;/FORM&gt;
187 </PRE>
188 <hr>
191 The output could look like the following (in HTML-speak):
192 </P>
194 <hr>
195 <PRE>
196 <CENTER>
197 The Answer to your question
198 <h3>What is the capital of the Netherlands?</h3>
200 <h3>Amsterdam</h3>
201 </CENTER>
202 <FORM ACTION=Respond.html METHOD=GET>
203 Next question: <INPUT NAME="Question" TYPE=TEXT SIZE=40><br>
204 <INPUT TYPE=SUBMIT VALUE="Ask"></FORM>
205 </PRE>
206 <hr>
209 Note that the script "Answer.pl" knows nothing about CGI or HTML;
210 it just prints out answers to arguments. Likewise, the text has no
211 provisions for scripts or CGI-like constructs. Also, it is completely
212 trivial to extend this "program" to use the "Answer" later in the page
213 to call up other information or pictures/sounds. The final text never
214 shows any cue as to what the original "source" looked like, i.e.,
215 where you store your scripts and how they are called.
216 </P>
219 There are some extras. The files called in a SRC= attribute
220 can access the CGI variables declared in the preceding META tag through
221 the @ARGV array. Executable files are called as:
222 `file '$ARGV[0]' ... ` (e.g., `Answer.pl \'$Question\'`;)
223 The files called from SRC can even be (CGIscriptor) html files which are
224 processed in-line. Furthermore, the SRC= tag can contain a perl block
225 that is evaluated. That is,
226 </P>
228 <PRE>
229 &lt;META CONTENT="text/ssperl; CGI='$Question' SRC='{$Question}'"&gt;
230 </PRE>
233 will result in the evaluation of "print do {$Question};" and the VALUE
234 of $Question will be printed. Note that these "SRC-blocks" can be
235 preceded and followed by other file names, but only a single block is
236 allowed in a SRC= tag.
237 </P>
240 One of the major hassles of dynamic WWW pages is the fact that several
241 mutually incompatible browsers and platforms must be supported. For example,
242 the way sound is played automatically is different for Netscape and
243 Internet Explorer, and for each browser it is different again on
244 Unix, MacOS, and Windows. Really dangerous is processing user-supplied
245 (form-) values to construct email addresses, file names, or database
246 queries. All Apache WWW-server exploits reported in the media are
247 based on faulty CGI-scripts that didn't check their user-data properly.
248 </p>
251 There is no panacea for these problems, but a lot of work and trouble
252 can be saved by allowing easy and transparent control over which
253 &lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks are executed on what CGI-data. CGIscriptor
254 supplies such a method in the form of a pair of attributes:
255 IF='...condition...' and UNLESS='...condition...'. When added to a
256 script tag, the whole block (including the SRC attribute) will be
257 ignored if the condition is false (IF) or true (UNLESS).
258 For example, the following block will NOT be evaluated if the value
259 of the CGI variable FILENAME is NOT a valid filename:
260 </p>
262 <pre>
263 &lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME' IF='CGIscriptor::CGIsafeFileName($FILENAME)'&gt;
264 .....
265 &lt;/SCRIPT&gt;
266 </pre>
269 (the function CGIsafeFileName(String) returns an empty string ("")
270 if the String argument is not a valid filename).
271 The UNLESS attribute is the mirror image of IF.
272 </p>
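<P>
The gating behaviour of IF and UNLESS can be sketched as follows. This is a
Python illustration only: safe_filename and run_block are hypothetical names,
the character whitelist is an assumption (CGIscriptor keeps its own set in
$FileAllowedChars), and the rejection of ".." is an extra precaution added
for the example.
</P>

```python
import re

# Assumed whitelist: word characters, dot, slash, tilde, and hyphen.
ALLOWED = re.compile(r'[\w./~-]+')

def safe_filename(name):
    """Return name if it uses only allowed characters, else an empty string."""
    if ALLOWED.fullmatch(name) and ".." not in name:
        return name
    return ""

def run_block(condition, block, unless=False):
    """Run block() only when the IF condition holds (or fails, for UNLESS)."""
    wanted = not condition if unless else bool(condition)
    return block() if wanted else ""

print(run_block(safe_filename("report.html"), lambda: "block executed"))
```

<P>
Because an invalid name yields an empty (false) value, the block is skipped
before any of its code has a chance to touch the suspect input.
</P>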
275 A user manual follows the HTML 4 and security paragraphs below.
276 </P>
279 <A NAME="HTML4"><H2 ALIGN="CENTER">HTML 4 COMPLIANCE</H2></A>
282 In general, CGIscriptor.pl complies with the HTML 4 recommendations of
283 the W3C. This means that any software to manage Web sites will be able
284 to handle CGIscriptor files, as will web agents.
285 </P>
288 All script code should be placed between &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, the
289 script type is indicated with TYPE="mime-type", the LANGUAGE
290 feature is ignored, and a SRC feature is implemented. All CGI specific
291 features are delegated to the CGI attribute.
292 </P>
295 However, the behavior deviates from the W3C recommendations at some
296 points. Most notably:
297 </P>
299 <DL>
300 <dt>0- The scripts are executed at the server side, invisible to the
301 client (i.e., the browser)
302 <dt>1- The mime-types are personal and idiosyncratic, but can be adapted.
303 <dt>2- Code in the body of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tag-pair is still evaluated
304 when a SRC feature is present.
305 <dt>3- The SRC feature reads a list of files.
306 <dt>4- The files in a SRC feature are processed according to file type.
307 <dt>5- The SRC feature evaluates inline Perl code.
308 <dt>6- Processed META, INS, and DIV tags are removed from the output document.
309 <dt>7- All attributes of the processed META tags, except CONTENT, are ignored
310 (i.e., deleted from the output).
311 <dt>8- META tags can be placed ANYWHERE in the document.
312 <dt>9- Through the SRC feature, META tags can have visible output in the
313 document.
314 <dt>10- The CGI attribute that declares CGI parameters, can be used
315 inside the &lt;SCRIPT&gt; tag.
316 <dt>11- Use of an extended quote set, i.e., '', "", ``, (), {}, []
317 and their \-slashed combinations: \'\', \"\", \`\`, \(\),
318 \{\}, \[\].
319 <dt>12- IF and UNLESS attributes to &lt;SCRIPT&gt;, &lt;META&gt;,
320 &lt;INS&gt;, and &lt;DIV&gt; tags.
321 <dt>13- &lt;DIV&gt; tags cannot be nested, &lt;DIV&gt; tags are not
322 rendered with new-lines.
323 <dt>14- The XML style &lt;TAG .... /&gt; is recognized and handled correctly.
324 (i.e., no content is processed)
325 </DL>
328 The reasons for these choices are:
329 </P>
332 You can still write completely HTML4 compliant documents. CGIscriptor
333 will not force you to write "deviant" code. However, it allows you to
334 do so (which is, in fact, just as bad). The prime design principle
335 was to allow users to include plain Perl code. The code itself should
336 be "enhancement free". Therefore, extra features were needed to
337 supply easy access to CGI and Web site components. For security
338 reasons these have to be declared explicitly. The SRC feature
339 transparently manages access to external files, especially the safe
340 use of executable files.
341 </P>
344 The CGI attribute handles the declarations of external (CGI) variables
345 in the SCRIPT and META tags.<BR>
346 EVERYTHING THE CGI ATTRIBUTE AND THE META TAG DO CAN BE DONE INSIDE
347 A &lt;SCRIPT&gt;&lt;/SCRIPT&gt; TAG CONSTRUCT.
348 </P>
351 The reason the IF, UNLESS, and SRC attributes (and their Perl code evaluation)
352 were built into the META and SCRIPT tags is part laziness, part security. The SRC
353 blocks allow more compact documents and easier debugging. The values of the
354 CGI variables can be immediately screened for security by IF or UNLESS
355 conditions, and even SRC attributes (e.g., email addresses and file names), and
356 a few commands can be called without having to add another Perl TAG pair.
357 This is especially important for documents that require the use of other
358 (restricted) "scripting" languages that lack transparent control structures.
359 </P>
362 <A NAME="SECURITY"><H2 ALIGN="CENTER">SECURITY</H2></A>
365 Your WWW site is a few keystrokes away from a few hundred million internet
366 users. A fair percentage of these users know more about your computer
367 than you do. And some of these just might have bad intentions.
368 </P>
371 To ensure uncompromised operation of your server and platform, several
372 features are incorporated in CGIscriptor.pl to enhance security.
373 First of all, you should check the source of this program. No security
374 measures will help you when you download programs from anonymous sources.
375 If you want to use THIS file, please make sure that it is uncompromised.
376 The best way to do this is to contact the source and try to determine
377 whether s/he is reliable (and accountable).
378 </P>
381 BE AWARE THAT ANY PROGRAMMER CAN CHANGE THIS PROGRAM IN SUCH A WAY THAT
382 IT WILL SET THE DOORS TO YOUR SYSTEM WIDE OPEN
383 </P>
386 I would like to ask any user who finds bugs that could compromise
387 security to report them to me (and any other bug too,
388 Email: R.J.J.H.vanSon@uva.nl or ifa@hum.uva.nl).
389 </P>
391 <H2 ALIGN="CENTER">Security features</H2>
393 <dl>
394 <dt>1 Invisibility
395 <dd>The inner workings of the HTML source files are completely hidden
396 from the client. Only the HTTP header and the ever changing content
397 of the output distinguish it from the output of a plain, fixed HTML
398 file. Names, structures, and arguments of the "embedded" scripts
399 are invisible to the client. Error output is suppressed except
400 during debugging (user configurable).
402 <dt>2 Separate directory trees
403 <dd>Directories containing Inline text and script files can reside on
404 separate trees, distinct from those of the HTTP server. This means
405 that NEITHER the text files, NOR the script files can be read by
406 clients other than through CGIscriptor.pl, UNLESS they are
407 EXPLICITLY made available.
409 <dt>3 Requests are NEVER "evaluated"
410 <dd>All client supplied values are used as literal values (''-quoted).
411 Client supplied ''-quotes are ALWAYS removed. Therefore, as long as the
412 embedded scripts do NOT themselves evaluate these values, clients CANNOT
413 supply executable commands. Be sure to AVOID scripts like:
415 <PRE>
416 &lt;META CONTENT="text/ssperl; CGI='$UserValue'"&gt;
417 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 $UserValue`;&lt;/SCRIPT&gt;
418 </PRE>
421 These are a recipe for disaster. However, the following quoted
422 form should be safe (but is still not advised):
423 </P>
425 <PRE>
426 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 \'$UserValue\'`;&lt;/SCRIPT&gt;
427 </PRE>
430 A special function, SAFEqx(), will automatically do exactly this,
431 e.g., SAFEqx('ls -1 $UserValue') will execute `ls -1 \'$UserValue\'`
432 with $UserValue interpolated. I recommend using SAFEqx() instead
433 of backticks whenever you can. The OS shell scripts inside
434 </P>
436 <PRE>
437 &lt;SCRIPT TYPE="text/osshell"&gt;ls -1 $UserValue&lt;/SCRIPT&gt;
438 </PRE>
441 are handled by SAFEqx and automatically ''-quoted.
442 </P>
444 <dt>4 Logging of requests
445 <dd>All requests can be logged separately from the Host server. The level of
446 detail is user configurable: Including or excluding the actual queries.
447 This allows for the inspection of (im-) proper use.
449 <dt>5 Access control: Clients
450 <dd>The Remote addresses can be checked against a list of authorized
451 (i.e., accepted) or non-authorized (i.e., rejected) clients. Both
452 REMOTE_HOST and REMOTE_ADDR are tested so clients without a proper
453 HOST name can be (in-) excluded by their IP-address. Client patterns
454 containing all numbers and dots are considered IP-addresses, all others
455 domain names. No wild-cards or regexps are allowed, only partial
456 addresses.<br>
457 Matching of names is done from the back to the front (domain first,
458 i.e., $REMOTE_HOST =~ /\Q$pattern\E$/is), so including ".edu" will
459 accept or reject all clients from the domain EDU. Matching of
460 IP-addresses is done from the front to the back (network first, i.e.,
461 $REMOTE_ADDR =~ /^\Q$pattern\E/is), so including "128." will (in-)
462 exclude all clients whose IP-address starts with 128.
463 There are two special symbols: "-" matches HOSTs with no name and "*"
464 matches ALL HOSTS/clients.<br>
467 For those needing more expressive power, lines starting with
468 "-e" are evaluated by the perl eval() function. E.g.,
469 '-e $REMOTE_HOST =~ /\.edu$/is;' will accept/reject clients from the
470 domain '.edu'.
471 </P>
473 <dt>6 Access control: Files
474 <dd>In principle, CGIscriptor could read ANY file in the directory
475 tree as discussed in 2. However, for security reasons this is
476 restricted to text files. It can be made more restricted by entering
477 a global file pattern (e.g., ".html"). This is done by default.
478 For each client requesting access, the file pattern(s) can be made
479 more restrictive than the global pattern by entering client specific
480 file patterns in the Access Control files (see 5).
481 For example: if the ACCEPT file contained the lines
483 <PRE>
484 * DEMO
485 .hum.uva.nl LET
486 145.18.230.
487 </PRE>
490 Then all clients could request paths containing "DEMO" or "demo", e.g.
491 "/my/demo/file.html" ($PATH_INFO =~ /\Q$pattern\E/). Clients from
492 *.hum.uva.nl could also request paths containing "LET" or "let", e.g.
493 "/my/let/file.html", and clients from the local cluster
494 145.18.230.[0-9]+ could access ALL files.
495 Again, for those needing more expressive power, lines starting with
496 "-e" are evaluated. For instance:
497 '-e $REMOTE_HOST =~ /\.edu$/is && $PATH_INFO =~ m@/DEMO/@is;'
498 will accept/reject requests for files from the directory "/demo/" from
499 clients from the domain '.edu'.
500 </P>
503 <dt>7 Query length limiting
504 <dd>The length of the Query string can be limited. If CONTENT_LENGTH is larger
505 than this limit, the request is rejected. The combined length of the
506 Query string and the POST input is checked before any processing is done.
507 This will prevent clients from overloading the scripts.
508 The actual, combined, Query Size is accessible as a variable through
509 $CGI_Content_Length.
510 </P>
513 <dt>8 Illegal filenames, paths, and protected directories
514 <dd>One of the primary security concerns in handling CGI-scripts is the
515 use of "funny" characters in the requests that con scripts into executing
516 malicious commands. Examples are inserting ';', null bytes, or &lt;newline&gt; characters
517 in URLs and filenames, followed by executable commands. A special
518 variable $FileAllowedChars stores a string of all allowed characters.
519 Any request that translates to a filename with a character OUTSIDE
520 this set will be rejected.<br>
521 In general, all (readable) files in the ServerRoot tree are accessible.
522 This might not be what you want. For instance, your ServerRoot directory
523 might be the working directory of a CVS project and contain sensitive
524 information (e.g., the password to get to the repository). You can block
525 access to these subdirectories by adding the corresponding patterns to
526 the $BlockPathAccess variable. For instance, $BlockPathAccess = '/CVS/'
527 will block any request whose path contains '/CVS/', i.e.:<br>
528 <pre>
529 die if $BlockPathAccess && $ENV{'PATH_INFO'} =~ m@$BlockPathAccess@;
530 </pre>
531 </P>
534 <dt>9 The execution of code blocks can be controlled in a transparent way
535 by adding IF or UNLESS conditions in the tags themselves.
536 <dd>That is, a simple check of the validity of filenames or email
537 addresses can be done before any code is executed.
538 </p>
540 </dl>
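<P>
The quoting rule from feature 3 (client-supplied ''-quotes are removed and
the value is then ''-quoted) can be illustrated in Python. This is a sketch
of the idea only and not the SAFEqx() source: quote_value and build_command
are hypothetical names, and the command is merely built as a string, never
executed.
</P>

```python
import re

def quote_value(value):
    """Drop client-supplied single quotes, then wrap the value in single quotes."""
    return "'" + value.replace("'", "") + "'"

def build_command(template, variables):
    """Replace each known $name in template by its quoted value, in one pass."""
    def substitute(match):
        name = match.group(1)
        # Unknown names are left untouched rather than guessed at.
        return quote_value(variables[name]) if name in variables else match.group(0)
    return re.sub(r'\$(\w+)', substitute, template)

print(build_command("ls -1 $UserValue", {"UserValue": "docs; rm -rf '/'"}))
```

<P>
Inside single quotes the shell treats ';' as ordinary text, so the injected
command separator arrives as part of one harmless argument.
</P>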
542 <hr>
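<P>
The client matching rules from feature 5 can be summarised in a short Python
sketch. It assumes one pattern per call and a hypothetical function name,
client_matches; the "-e" lines that are evaluated as Perl are deliberately
left out.
</P>

```python
import re

def client_matches(pattern, remote_host, remote_addr):
    """Test one access-list pattern against a client's host name and IP address."""
    if pattern == "*":                      # matches every client
        return True
    if pattern == "-":                      # matches clients without a host name
        return not remote_host
    if re.fullmatch(r"[0-9.]+", pattern):   # digits and dots only: an IP prefix
        return remote_addr.startswith(pattern)
    # Anything else is a domain-name suffix, compared case-insensitively.
    return remote_host.lower().endswith(pattern.lower())

print(client_matches(".edu", "cs.example.EDU", "192.0.2.7"))
```

<P>
Names are compared from the end and addresses from the start, so in both
cases the most general part of the identifier is what a partial pattern pins
down.
</P>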
544 <A NAME="MANUAL"><H1 ALIGN="CENTER">USER MANUAL</H1></A>
546 <UL>
547 <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
548 <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
549 <LI><A HREF="#BINFILES">NON-HTML FILES</A>
550 <LI><A HREF="#META">THE META TAG</A>
551 <LI><A HREF="#DIV">THE DIV/INS TAG</A>
552 <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
553 <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
554 <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
555 <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
556 <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
557 <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
558 <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
559 <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
560 <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
561 <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
562 <LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
563 <LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
564 <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
565 <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
566 </UL>
568 <A NAME="INTRODUCTION"><H2 ALIGN="CENTER">INTRODUCTION</H2></A>
571 CGIscriptor removes embedded scripts, indicated by HTML 4 type
572 &lt;SCRIPT TYPE='text/ssperl'&gt; &lt;/SCRIPT&gt; or &lt;SCRIPT TYPE='text/osshell'&gt;
573 &lt;/SCRIPT&gt; constructs. The contents of the directive are executed by
574 the PERL eval() and `` functions (in a separate name space). The
575 result of the eval() function replaces the &lt;SCRIPT&gt; &lt;/SCRIPT&gt; construct
576 in the output file. You can use the values that are delivered in
577 CGI-compliant form (i.e., the "?name=value&.." type URL additions)
578 transparently as "$name" variables in your directives after they are
579 defined in a &lt;META&gt; or &lt;SCRIPT&gt; tag.
580 If you define the variable "$CGIscriptorResults" in a CGI attribute, all
581 subsequent &lt;SCRIPT&gt; and &lt;META&gt; results (including the defining
582 tag) will also be pushed onto a stack: @CGIscriptorResults. This list
583 behaves like any other, ordinary list and can be manipulated.
584 </P>
587 Both GET and POST requests are accepted. These two methods are treated
588 equally. Variables, i.e., those values that are determined when a file is
589 processed, are indicated in the CGI attribute by $&lt;name&gt; or
590 $&lt;name&gt;=&lt;default&gt; in which &lt;name&gt; is the name of the
591 variable and &lt;default&gt; is the value used when there is NO current CGI
592 value for &lt;name&gt; (you can use white-spaces in
593 $&lt;name&gt;=&lt;default&gt; but really DO make sure that the default
594 value is followed by white space or is quoted). Names can contain any
595 alphanumeric characters and _ (i.e., names match /[\w]+/).<br>
596 If the <i>Content-type:</i> is 'multipart/*', the input is treated as a
597 MIME multipart message and automatically delimited. CGI variables get the
598 "raw" (i.e., undecoded) body of the corresponding message part.
599 </P>
602 Variables can be CGI variables, i.e., those from the QUERY_STRING,
603 environment variables, e.g., REMOTE_USER, REMOTE_HOST, or REMOTE_ADDR,
604 or predefined values, e.g., CGI_Decoded_QS (The complete, decoded,
605 query string), CGI_Content_Length (the length of the decoded query
606 string), CGI_Year, CGI_Month, CGI_Time, and CGI_Hour (the current
607 date and time).
608 </P>
611 All these are available when defined in a CGI attribute. All environment
612 variables are accessible as $ENV{'name'}. So, to access the REMOTE_HOST
613 and the REMOTE_USER, use, e.g.:
614 </P>
616 <PRE>
617 &lt;SCRIPT TYPE='text/ssperl'&gt;
618 ($ENV{'REMOTE_HOST'}||"-")." $ENV{'REMOTE_USER'}"
619 &lt;/SCRIPT&gt;
620 </PRE>
623 (This will print a "-" if REMOTE_HOST is not known)
624 Another way to do this is:
625 </P>
627 <PRE>
628 &lt;META CONTENT="text/ssperl; CGI='$REMOTE_HOST = - $REMOTE_USER'"&gt;
629 &lt;SCRIPT TYPE='text/ssperl'&gt;"$REMOTE_HOST $REMOTE_USER"&lt;/SCRIPT&gt;
630 </PRE>
634 <PRE>
635 &lt;META CONTENT='text/ssperl; CGI="$REMOTE_HOST = - $REMOTE_USER"
636 SRC={"$REMOTE_HOST $REMOTE_USER\n"}'&gt;
637 </PRE>
640 This is possible because ALL environment variables are available as
641 CGI variables. The environment variables take precedence over CGI
642 names in case of a "name clash". For instance:
643 </P>
645 <PRE>
646 &lt;META CONTENT="text/ssperl; CGI='$HOME' SRC={$HOME}"&gt;
647 </PRE>
650 Will print the current HOME directory (environment) irrespective of whether
651 there is a CGI variable from the query
652 (e.g., Where do you live? &lt;INPUT TYPE="TEXT" NAME="HOME"&gt;)
653 THIS IS A SECURITY FEATURE. It prevents clients from changing
654 the values of defined environment variables (e.g., by supplying
655 a bogus $REMOTE_ADDR). Although $ENV{} is not changed by the META tags,
656 it would make the use of declared variables insecure. You can still
657 access CGI variables after a name clash with
658 CGIscriptor::CGIparseValue(&lt;name&gt;).
659 </P>
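<P>
The precedence rule for a name clash can be written down as a tiny lookup.
The Python sketch below uses a hypothetical function, resolve, with the
environment and the query passed in as plain dictionaries.
</P>

```python
def resolve(name, environment, query, default=""):
    """Environment wins over the query string, which wins over the default."""
    if name in environment:
        return environment[name]
    if name in query:
        return query[name]
    return default

# A client-supplied HOME cannot override the server's own HOME.
print(resolve("HOME", {"HOME": "/home/www"}, {"HOME": "Paris"}))
```

<P>
Checking the environment first is what keeps a client from overriding values
such as REMOTE_ADDR through the query string.
</P>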
662 Some CGI variables are present several times in the query string
663 (e.g., from multiple selections). These should be defined as
664 @VARIABLENAME=default in the CGI attribute. The list @VARIABLENAME
665 will contain ALL VARIABLENAME values from the query, or a single
666 default value. If there is an ENVIRONMENT variable of the
667 same name, it will be used instead of the default AND the query
668 values. The corresponding function is
669 CGIscriptor::CGIparseValueList(&lt;name&gt;)
670 </P>
673 CGI variables collected in a @VARIABLENAME list are unordered.
674 When more structured variables are needed, a hash table can be used.
675 A variable defined as %VARIABLE=default will collect all
676 CGI-parameter values whose names start with 'VARIABLE' in a hash table
677 with the remainder of the name as a key. For instance, %PERSON will
678 collect PERSONname='John Doe', PERSONbirthdate='01 Jan 00', and
679 PERSONspouse='Alice' into a hash table %PERSON such that
680 $PERSON{'spouse'} equals 'Alice'. Any default value or environment
681 value will be stored under the "" key. If there is an ENVIRONMENT
682 variable of the same name, it will be used instead of the default
683 AND the query values. The corresponding function is
684 CGIscriptor::CGIparseValueHash(&lt;name&gt;)
685 </P>
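<P>
How a %VARIABLE declaration gathers its values can be sketched in Python.
The function name collect_hash is hypothetical, and one detail is an
assumption: the default is kept under the "" key next to any query values,
which the text above does not spell out.
</P>

```python
def collect_hash(prefix, query, environment=None, default=None):
    """Gather query parameters whose names start with prefix into a dict."""
    environment = environment or {}
    if prefix in environment:
        # An environment variable of the same name replaces default and query values.
        return {"": environment[prefix]}
    table = {}
    if default is not None:
        table[""] = default          # assumption: default stays under the "" key
    for name, value in query.items():
        if name.startswith(prefix):
            table[name[len(prefix):]] = value   # remainder of the name is the key
    return table

query = {"PERSONname": "John Doe", "PERSONspouse": "Alice", "Other": "x"}
print(collect_hash("PERSON", query)["spouse"])
```

<P>
Stripping the common prefix is what turns flat form field names into keys of
one structured value.
</P>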
688 This method of first declaring your environment and CGI variables
689 before being able to use them in the scripts might seem somewhat
690 clumsy, but it protects you from inadvertently printing out the values of
691 system environment variables when their names coincide with those used
692 in the CGI forms. It also prevents "clients" from supplying CGI parameter
693 values for your private variables.
694 THIS IS A SECURITY FEATURE!
695 </P>
697 <A NAME="NON-HTML"><H2 ALIGN="CENTER">NON-HTML CONTENT TYPES</H2></A>
700 Normally, CGIscriptor prints the standard "Content-type: text/html\n\n"
701 message before anything is printed. This has been extended to include
702 plain text (.txt) files, for which the Content-type (MIME type)
703 'text/plain' is printed. In all other respects, text files are treated as
704 HTML files (this can be switched off by removing '.txt' from the
705 $FilePattern variable). When the content type should be something else,
706 e.g., with multipart files, use the $RawFilePattern (.xmr, see also next
707 item). CGIscriptor will not print a Content-type message for this file type
708 (which must supply its OWN Content-type message). Raw files must still
709 conform to the &lt;SCRIPT&gt;&lt;/SCRIPT&gt; and &lt;META&gt; tag
710 specifications.
711 </P>
713 <A NAME="BINFILES"><H2 ALIGN="CENTER">NON-HTML FILES</H2></A>
716 CGIscriptor is intended to process HTML and text files only. You can
717 create documents of any mime-type on-the-fly using "raw" text files, e.g.,
718 with the .xmr extension. However, CGIscriptor will not process binary files
719 of any type, e.g., pictures or sounds. Given the sheer number of formats, I
720 do not have any intention to do so. However, an escape route has been
721 provided. You can construct a genuine raw (.xmr) text file that contains
722 the perl code to service any file type you want. If the global
723 $BinaryMapFile variable contains the path to this file (e.g.,
724 /BinaryMapFile.xmr), this file will be called whenever an unsupported
725 (non-HTML) file type is requested. The path to the requested binary file
726 is stored in $ENV{'CGI_BINARY_FILE'} and can be used like any other
727 CGI-variable. Servicing binary files then becomes supplying the correct
728 Content-type (e.g., print "Content-type: image/jpeg\n\n";) and reading the
729 file and writing it to STDOUT (e.g., using sysread() and syswrite()).
730 </P>
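<P>
The two steps of servicing a binary file (header first, then the raw bytes)
look roughly like this. The sketch is in Python instead of the Perl a real
.xmr file would contain, and serve_binary and its parameters are
hypothetical names.
</P>

```python
def serve_binary(path, content_type, out, chunk_size=8192):
    """Write a CGI header and then the raw bytes of the file at path to out."""
    out.write(("Content-type: %s\n\n" % content_type).encode("ascii"))
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:            # an empty read marks the end of the file
                break
            out.write(chunk)
```

<P>
In a CGI setting, out would be the binary standard output (sys.stdout.buffer
in Python). Copying in chunks keeps memory use flat for large files.
</P>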
732 <A NAME="META"><H2 ALIGN="CENTER">THE META TAG</H2></A>
735 All attributes of a META tag are ignored, except the
736 CONTENT='text/ssperl; CGI=" ... " [SRC=" ... "]' attribute. The string
737 inside the quotes following the CONTENT= indication (white-space is
738 ignored, "'` (){}[]-quotes are allowed, plus their \ versions) MUST
739 start with any of the CGIscriptor mime-types (e.g.: text/ssperl or
740 text/osshell) and a comma or semicolon.
741 The quoted string following CGI= contains a white-space separated list
742 of declarations of the CGI (and Environment) values and default values
743 used when no CGI values are supplied by the query string.
744 </P>
747 If the default value is a longer string containing special characters,
748 possibly spanning several lines, the string must be enclosed in quotes.
749 You may use any pair of quotes or brackets from the list '', "", ``, (),
750 [], or {} to distinguish default values (or preceded by \, e.g., \(...\)
751 is different from (...)). The outermost pair will always be used and any
752 other quotes inside the string are considered to be part of the string
753 value, e.g.,
754 </P>
756 <PRE>
757 $Value = {['this'
758 "and" (this)]}
759 </PRE>
762 will result in $Value getting the default value
763 </P>
765 <PRE>
766 ['this'
767 "and" (this)]
768 </PRE>
771 (NOTE that the newline is part of the default value!).
772 </P>
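<P>
The "outermost pair" rule can be sketched in a few lines of Perl. The helper
strip_outer_quotes is hypothetical and only handles the unescaped pairs; it
is not the parser used by CGIscriptor.
</P>

```perl
use strict;
use warnings;

# Opening quote or bracket => matching closing character.
my %closing_for = (
    "'" => "'", '"' => '"', '`' => '`',
    '(' => ')', '[' => ']', '{' => '}',
);

# Hypothetical helper: remove only the outermost quote pair of a default
# value; everything inside it, quotes included, is kept.
sub strip_outer_quotes {
    my ($text) = @_;
    return $text if length($text) < 2;
    my $open  = substr($text, 0, 1);
    my $close = substr($text, -1, 1);
    if (exists $closing_for{$open} && $closing_for{$open} eq $close) {
        return substr($text, 1, length($text) - 2);
    }
    return $text;    # no enclosing pair: use the value as is
}

# The example from the text, including the newline.
my $default = strip_outer_quotes(qq({['this'\n"and" (this)]}));
print "$default\n";
```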
775 Internally, for defining and initializing CGI (ENV) values, the META
776 and SCRIPT tags use the function "defineCGIvariable($name, $default)"
777 (scalars) and "defineCGIvariableList($name, $default)" (lists).
778 These functions can be used inside scripts as
779 "CGIscriptor::defineCGIvariable($name, $default)" and
780 "CGIscriptor::defineCGIvariableList($name, $default)".
781 </P>
The CGI attribute is processed identically when used inside
the &lt;SCRIPT&gt; tag. However, this use does not conform to the
HTML 4.0 specification of the W3C.
787 </P>
789 <A NAME="DIV"><H2 ALIGN="CENTER">THE DIV/INS TAG</H2></A>
792 There is a problem when constructing html files containing
793 server-side perl scripts with standard HTML tools. These
794 tools will refuse to process any text between
795 &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
796 tags. This is quite annoying when you want to use large
797 HTML templates where you will fill in values.
798 </P>
For this purpose, CGIscriptor will read the neutral
&lt;DIV CLASS="ssperl" ID="varname"&gt;&lt;/DIV&gt; or
&lt;INS CLASS="ssperl" ID="varname"&gt;&lt;/INS&gt;
tag (in Cascading Style Sheet manner). Note that "varname" has
NO '$' before it; it is a bare name. Any text between
these &lt;DIV ...&gt;&lt;/DIV&gt; or
&lt;INS ...&gt;&lt;/INS&gt; tags will be assigned
to '$varname' as is (i.e., as a literal). No
809 processing or interpolation will be performed.
810 There is also NO nesting possible. Do NOT nest
811 &lt;/DIV&gt; inside a &lt;DIV&gt;&lt;/DIV&gt;!
812 Moreover, DIV tags do NOT ensure a block structure in
813 the final rendering (i.e., no empty lines).
814 </P>
817 Note that &lt;DIV CLASS="ssperl" ID="varname"/&gt;
818 is handled the XML way. No content is processed,
819 but varname is defined, and any SRC directives are
820 processed.
821 </P>
824 You can use $varname like any other variable name.
825 However, $varname is NOT a CGI variable and will be
826 completely internal to your script. There is NO
827 interaction between $varname and the outside world.
828 </P>
To interpolate a DIV derived text, you can use:
<pre>
$varname =~ s/([\]])/\\$1/g;      # Escape ']' so it cannot end the qq[] string
$varname = eval("qq[$varname]");  # Interpolate all values
</pre>
836 </P>
839 The DIV tag will process IF, UNLESS, CGI and SRC attributes.
840 The SRC files will be pre-pended to the body
841 text of the tag.
842 </p>
844 <A NAME="IFUNLESS"><H2 ALIGN="CENTER">
845 CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES
846 </H2></A>
It is often necessary to include code-blocks that should be executed
conditionally, e.g., only for certain browsers or operating systems.
851 Furthermore, quite often sanity and security checks are necessary
852 before user (form) data can be processed, e.g., with respect to
853 email addresses and filenames.
854 </p>
857 Checks added to the code are often difficult to find, interpret or
maintain and in general mess up the code flow. This kind of confusion
859 is dangerous. Also, for many of the supported "foreign" scripting
860 languages, adding these checks is cumbersome or even impossible.
861 </p>
864 As a uniform method for asserting the correctness of "context", two
865 attributes are added to all supported tags: IF and UNLESS.
They both evaluate their value and block execution when the
result is &lt;FALSE&gt; (IF) or &lt;TRUE&gt; (UNLESS) in Perl, e.g.,
UNLESS='$NUMBER \&gt; 100;' blocks execution if $NUMBER &gt; 100. Note that
869 the backslash in the '\&gt;' is removed and only used to differentiate
870 this conditional '&gt;' from the tag-closing '&gt;'. For symmetry, the
871 backslash in '\&lt;' is also removed. Inside these conditionals,
872 ~/ and ./ are expanded to their respective directory root paths.
873 </p>
876 For example, the following tag will be ignored when the filename is
877 invalid:
878 </p>
880 <pre>
881 &lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME'
882 IF='CGIscriptor::CGIsafeFileName($FILENAME);'&gt;
884 &lt;/SCRIPT&gt;
885 </pre>
888 The IF and UNLESS values must be quoted. The same quotes are supported
889 as with the other attributes. The SRC attribute is ignored when IF and
890 UNLESS block execution.
891 </p>
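<p>
The blocking rule can be sketched as follows. The helper tag_is_executed is
hypothetical and only mimics the behavior described above (backslash removal
followed by evaluation as Perl); it does not expand ~/ and ./ and is not the
code used by CGIscriptor.
</p>

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a tag with the given IF and/or
# UNLESS attribute values would be executed.
sub tag_is_executed {
    my (%attribute) = @_;
    no strict 'vars';    # conditions refer to undeclared package variables

    for my $name ('IF', 'UNLESS') {
        next unless defined $attribute{$name};
        (my $condition = $attribute{$name}) =~ s/\\([<>])/$1/g;  # '\>' becomes '>'
        my $is_true = eval($condition) ? 1 : 0;                  # evaluate as Perl
        return 0 if $name eq 'IF'     && !$is_true;   # IF blocks when false
        return 0 if $name eq 'UNLESS' &&  $is_true;   # UNLESS blocks when true
    }
    return 1;
}

our $NUMBER = 250;
print tag_is_executed(UNLESS => '$NUMBER \> 100;') ? "executed\n" : "blocked\n";
```

<p>
With $NUMBER set to 250 the condition is true, so the UNLESS attribute blocks
the tag and the sketch prints "blocked".
</p>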
893 <A NAME="SRC"><H2 ALIGN="CENTER">
894 THE MAGIC SOURCE ATTRIBUTE (SRC=)</H2></A>
The SRC attribute inside tags accepts a list of filenames and URLs
separated by commas (",") or semicolons (";").
899 </P>
902 ALL the variable values defined in the CGI attribute are available in
903 @ARGV as if the file was executed from the command line, in
904 the exact order in which they were declared in the preceding CGI
905 attribute.
906 </P>
909 First, a SRC={}-block will be evaluated as if the code inside the
910 block was part of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; construct, i.e.,
"print do { code };'';" or `code` (i.e., SAFEqx('code')).
912 Only a single block is evaluated. Note that this is processed less
913 efficiently than &lt;SCRIPT&gt; &lt;/SCRIPT&gt; blocks. Type of evaluation
914 depends on the content-type: Perl for text/ssperl and OS shell for
915 text/osshell. For other mime types (scripting languages), anything in
916 the source block is put in front of the code block "inside" the tag.
917 </P>
920 Second, executable files (i.e., -x filename != 0) are evaluated as:
921 print `filename \'$ARGV[0]\' \'$ARGV[1]\' ...`
That is, you can actually call executables safely from the SRC tag.
923 </P>
926 Third, text files that match the file pattern, used by CGIscriptor to
927 check whether files should be processed ($FilePattern), are
928 processed in-line (i.e., recursively) by CGIscriptor as if the code
929 was inserted in the original source file. Recursions, i.e., calling
a file inside itself, are blocked. If you need them, you have to code
them explicitly using "main::ProcessFile($file_path)".
932 </P>
935 Fourth, Perl text files (i.e., -T filename != 0) are evaluated as:
936 "do FileName;'';".
937 </P>
Last, URLs (i.e., starting with 'HTTP://', 'FTP://', 'GOPHER://', 'TELNET://',
941 'WHOIS://' etc.) are loaded and printed. The loading and handling of &lt;BASE&gt;
942 and document header is done by main::GET_URL($URL [, 0]). You can enter your own
943 code (default is <i>curl</i>, <i>snarf</i>, or <i>wget</i> and some
944 post-processing to add a &lt;BASE&gt; tag).
945 </P>
948 There are two pseudo-file names: PREFIX and POSTFIX. These implement
949 a switch from prefixing the SRC code/files (PREFIX, default) before the content of
950 the tag to appending the code after the content of the tag (POSTFIX). The switches
951 are done in the order in which the PREFIX and POSTFIX labels are encountered.
952 You can mix PREFIX and POSTFIX labels in any order with the SRC files.
953 Note that the ORDER of file execution is determined for prefixed and
postfixed files separately.
958 File paths can be preceded by the URL protocol prefix "file://". This
959 is simply STRIPPED from the name.
960 </P>
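<P>
How a SRC list is divided over the PREFIX and POSTFIX positions can be
sketched as follows. The helper split_src_list is hypothetical; it ignores
SRC={}-blocks and only illustrates the separators, the two pseudo-file names
and the stripping of "file://".
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: split a SRC value into the entries placed before
# (PREFIX) and after (POSTFIX) the content of the tag.
sub split_src_list {
    my ($src) = @_;
    my (@prefix, @postfix);
    my $target = \@prefix;                      # PREFIX is the default

    for my $item (split(/\s*[,;]\s*/, $src)) {
        $item =~ s/^\s+|\s+$//g;                # trim surrounding white-space
        next if $item eq '';
        if    ($item eq 'PREFIX')  { $target = \@prefix;  next; }
        elsif ($item eq 'POSTFIX') { $target = \@postfix; next; }
        $item =~ s{^file://}{};                 # "file://" is simply stripped
        push(@$target, $item);
    }
    return (\@prefix, \@postfix);
}

my ($before, $after) =
    split_src_list('./header.pl, POSTFIX, file://./footer.pl; PREFIX; ~/intro.html');
print "prefix:  @$before\n";    # prefix:  ./header.pl ~/intro.html
print "postfix: @$after\n";     # postfix: ./footer.pl
```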
963 Example:
964 </P>
The request
"http://cgi-bin/Action_Forms.pl/Statistics/Sign_Test.html?positive=8&amp;negative=22"
will result in printing "${SS_PUB}/Statistics/Sign_Test.html"
with QUERY_STRING = "positive=8&amp;negative=22"
</P>
974 on encountering the lines:
975 </P>
977 <PRE>
978 &lt;META CONTENT="text/osshell; CGI='$positive=11 $negative=3'"&gt;
979 &lt;b&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./Statistics/SignTest.pl"&gt;
980 &lt;/SCRIPT&gt;&lt;/b&gt;&lt;p&gt;"
981 </PRE>
983 This line will be processed as:
985 <PRE>
986 "&lt;b&gt;`${SS_SCRIPT}/Statistics/SignTest.pl '8' '22'`&lt;/b&gt;&lt;p&gt;"
987 </PRE>
In which "${SS_SCRIPT}/Statistics/SignTest.pl" is an executable script.
This line will end up printed as:
992 </P>
994 <PRE>
995 "&lt;b&gt;p &lt;= 0.0161&lt;/b&gt;&lt;p&gt;"
996 </PRE>
999 Note that the META tag itself will never be printed, and is invisible to
1000 the outside world.
1001 </P>
1004 The SRC files in a DIV/INS tag will be added (pre-pended) to the body
1005 of the &lt;DIV&gt;&lt;/DIV&gt; tag. Blocks are NOT executed!
1006 </P>
1008 <A NAME="ROOT"><H2 ALIGN="CENTER">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</H2></A>
Inside &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, filepaths starting
with "~/" are replaced by "$YOUR_HTML_FILES/". This way, files in the
public directories can be accessed without direct reference to the
1014 actual paths. Filepaths starting with "./" are replaced by
1015 "$YOUR_SCRIPTS/" and this should only be used for scripts.
1016 The "$YOUR_SCRIPTS" directory is added to @INC so, e.g., the
1017 'require' command will load from the "$YOUR_SCRIPTS" directory.
1018 </P>
1021 <b>Note:</b> this replacement can seriously affect Perl scripts. Watch
1022 out for constructs like $a =~ s/aap\./noot./g, use
1023 $a =~ s@aap\.@noot.@g instead.
1024 </P>
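<P>
The replacement and its side effect on regular expressions can be sketched as
follows. The helper expand_roots and its two patterns are assumptions made
for this illustration; the patterns used by CGIscriptor itself may differ.
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: expand "~/" and "./" unless they are preceded by
# a word character, '.', '/' or '~' (so "../" is left alone).
sub expand_roots {
    my ($code, $html_root, $script_root) = @_;
    $code =~ s{(^|[^\w./~])~/}{$1$html_root/}g;
    $code =~ s{(^|[^\w./~])\./}{$1$script_root/}g;
    return $code;
}

my @roots = ('/srv/html', '/srv/scripts');      # example directories

# Intended use: a path inside a script.
print expand_roots('require "./lib.pl";', @roots), "\n";

# Side effect: the "./" in this substitution is replaced too ...
print expand_roots('$a =~ s/aap\./noot./g', @roots), "\n";

# ... while other delimiters keep the expression intact.
print expand_roots('$a =~ s@aap\.@noot.@g', @roots), "\n";
```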
1027 CGIscriptor.pl will assign the values of $SS_PUB and $SS_SCRIPT
1028 (i.e., $YOUR_HTML_FILES and $YOUR_SCRIPTS) to the environment variables
1029 $SS_PUB and $SS_SCRIPT. These can be accessed by the scripts that are
1030 executed. The "$SS_SCRIPT" ($YOUR_SCRIPTS) directory is added to
1031 @INC so, e.g., the 'require' command will load from the "$SS_SCRIPT"
1032 directory.<br>
Values not preceded by $, ~/, or ./ are used as literals.
1034 </P>
1036 <A NAME="OSSHELL"><H2 ALIGN="CENTER">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</H2></A>
1039 OS scripts are executed by a "safe" version of the `` operator (i.e.,
1040 SAFEqx(), see also below) and any output is printed. CGIscriptor will
1041 interpolate the script and replace all user-supplied CGI-variables by
1042 their ''-quoted values (actually, all variables defined in CGI attributes are
quoted). Other Perl variables are interpolated in a simple fashion, i.e.,
1044 $scalar by their value, @list by join(' ', @list), and %hash by their
1045 name=value pairs. Complex references, e.g., @$variable, are all
1046 evaluated in a scalar context. Quotes should be used with care.
1047 NOTE: the results of the shell script evaluation will appear in the
1048 @CGIscriptorResults stack just as any other result.
1049 </P>
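<P>
The quoting of CGI variables can be sketched as follows. Both helper names
are hypothetical, only scalar variables are handled, and the escaping of an
embedded single quote as '\'' is an assumption of this sketch, not a
description of what SAFEqx() does.
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: return the replacement text for one "$name".
sub quoted_value {
    my ($backslash, $name, $cgi) = @_;
    return '$' . $name if $backslash ne '';          # "\$name": keep literally
    return '$' . $name unless exists $cgi->{$name};  # not a CGI variable
    (my $value = $cgi->{$name}) =~ s/'/'\\''/g;      # assumption: escape embedded '
    return "'" . $value . "'";
}

# Hypothetical helper: replace every CGI variable in a shell script by
# its single-quoted value.
sub quote_cgi_variables {
    my ($script, %cgi) = @_;
    $script =~ s/(\\?)\$(\w+)/quoted_value($1, $2, \%cgi)/ge;
    return $script;
}

print quote_cgi_variables('grep $pattern \$HOME/log',
                          pattern => 'a; rm -rf *'), "\n";
```

<P>
The sketch prints grep 'a; rm -rf *' $HOME/log, so the client-supplied value
reaches the shell as a single argument.
</P>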
All occurrences of $@% that should NOT be interpolated must be
preceded by a "\". Interpolation can be switched off completely by
setting $CGIscriptor::NoShellScriptInterpolation = 1
(set to 0 or undef to switch interpolation on again),
e.g.,
1057 </P>
1059 <PRE>
1060 &lt;SCRIPT TYPE="text/ssperl"&gt;
1061 $CGIscriptor::NoShellScriptInterpolation = 1;
1062 &lt;/SCRIPT&gt;
1063 </PRE>
1065 <A NAME="TRANSLATIONS">
<H2 ALIGN="CENTER">RUN TIME TRANSLATION OF INPUT FILES</H2></A>
Translation tables allow general and global conversions of input files using
regular expressions. This is very handy (but costly) for rewriting legacy
pages to a new format. Select the files to translate with <br>
my $TranslationPaths = 'filepattern';<br>
For efficiency, define<br>
$TranslationPaths = ''; when not using translations.<br>
Each translation is a [$pattern, $replacement] pair, in which $pattern is a
general regular expression.
</p>
1079 Define:</p>
<pre>
my $TranslationPaths = 'filepattern'; # Pattern matching PATH_INFO

push(@TranslationTable, ['pattern', 'replacement']);
# e.g. (for Ruby Rails):
push(@TranslationTable, ['&lt;%=', '&lt;SCRIPT TYPE="text/ssruby"&gt;']);
push(@TranslationTable, ['%&gt;', '&lt;/SCRIPT&gt;']);

# Runs:
my $currentRegExp;
foreach $currentRegExp (@TranslationTable)
{
    my ($pattern, $replacement) = @$currentRegExp;
    $$text =~ s!$pattern!$replacement!msg;
}
</pre>
1097 <A NAME="LANGUAGES">
1098 <H2 ALIGN="CENTER">EVALUATION OF OTHER SCRIPTING LANGUAGES</H2>
1099 </A>
Adding a MIME-type and an interpreter command to
%ScriptingLanguages will automatically catch any other
scripting language in the standard
&lt;SCRIPT TYPE="[mime]"&gt;&lt;/SCRIPT&gt; manner.
E.g., adding: $ScriptingLanguages{'text/sspython'} = 'python';
will actually execute the following code in an HTML page
(ignore 'REMOTE_HOST' for the moment):
1109 </P>
1111 <PRE>
1112 &lt;SCRIPT TYPE="text/sspython"&gt;
1113 # A Python script
1114 x = ["A","real","python","script","Hello","World","and", REMOTE_HOST]
1115 print x[4:8] # Prints the list ["Hello","World","and", REMOTE_HOST]
1116 &lt;/SCRIPT&gt;
1117 </PRE>
1120 The script code is NOT interpolated by perl, EXCEPT for those
1121 interpreters that cannot handle variables themselves.
1122 Currently, several interpreters are pre-installed:
1123 </P>
1125 <PRE>
1126 Perl test - "text/testperl" =&gt; 'perl',
1127 Python - "text/sspython" =&gt; 'python',
1128 Ruby - "text/ssruby" =&gt; 'ruby',
1129 Tcl - "text/sstcl" =&gt; 'tcl',
1130 Awk - "text/ssawk" =&gt; 'awk -f-',
1131 Gnu Lisp - "text/sslisp" =&gt; 'rep | tail +5 '.
1132 # "| egrep -v '&gt; |^rep. |^nil\\\$'",
1133 Gnu Prolog- "text/ssprolog" =&gt; 'gprolog',
M4 macros - "text/ssm4" =&gt; 'm4',
Bourne shell - "text/sh" =&gt; 'sh',
1136 Bash - "text/bash" =&gt; 'bash',
1137 C-shell - "text/csh" =&gt; 'csh',
1138 Korn shell- "text/ksh" =&gt; 'ksh',
1139 Praat - "text/sspraat" =&gt; "praat - | sed 's/Praat &gt; //g'",
1140 R - "text/ssr" =&gt; "R --vanilla --slave | sed 's/^[\[0-9\]*] //g'",
1141 REBOL - "text/ssrebol" =&gt;
1142 "rebol --quiet|egrep -v '^[&gt; ]* == '|sed 's/^\s*\[&gt; \]* //g'",
1143 PostgreSQL- "text/postgresql" =&gt; 'psql 2&gt;/dev/null',
1144 (psql)
1145 </PRE>
1148 Note that the "value" of $ScriptingLanguages{mime} must be a command
1149 that reads Standard Input and writes to standard output. Any extra
output of interactive interpreters (banners, echoes, prompts)
1151 should be removed by piping the output through 'tail', 'grep',
1152 'sed', or even 'awk' or 'perl'.
1153 </P>
1156 For access to CGI variables there is a special hashtable:
1157 %ScriptingCGIvariables.
1158 CGI variables can be accessed in three ways.
1159 <dl>
1160 <dt>1. If the mime type is not present in %ScriptingCGIvariables,
1161 nothing is done and the script itself should parse the relevant
1162 environment variables.
<dt>2. If the mime type IS present in %ScriptingCGIvariables, but its
1164 value is empty, e.g., $ScriptingCGIvariables{"text/sspraat"} = '';,
1165 the script text is interpolated by perl. That is, all $var, @array,
1166 %hash, and \-slashes are replaced by their respective values.
1167 <dt>3. In all other cases, the CGI and environment variables are added
1168 in front of the script according to the format stored in
1169 %ScriptingCGIvariables. That is, the following (pseudo-)code is
1170 executed for each CGI- or Environment variable defined in the CGI-tag:
1171 printf(INTERPRETER, $ScriptingCGIvariables{$mime}, $CGI_NAME, $CGI_VALUE);
1172 </dl>
1173 </P>
1176 For instance, "text/testperl" =&gt; '$%s = "%s";' defines variable
1177 definitions for Perl, and "text/sspython" =&gt; '%s = "%s"' for Python
(note that these definitions are not safe; the real ones contain '-quotes).
1179 </P>
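<P>
The third way of passing CGI variables amounts to one sprintf() call per
variable. The sketch below uses three of the formats listed in this section;
the helper definition_lines is hypothetical and returns the lines as a string
instead of printing them to the interpreter.
</P>

```perl
use strict;
use warnings;

# Illustrative subset of the formats listed in this section.
my %format_for = (
    'text/sspython' => "%s = '%s'",
    'text/sstcl'    => 'set %s "%s"',
    'text/ssr'      => '%s <- "%s";',
);

# Hypothetical helper: build the definition lines that are put in front
# of a script block. @pairs is (name, value, name, value, ...).
sub definition_lines {
    my ($mime, @pairs) = @_;
    my $format = $format_for{$mime};
    return '' unless defined $format && $format ne '';
    my $lines = '';
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $lines .= sprintf($format, $name, $value) . "\n";
    }
    return $lines;
}

print definition_lines('text/sspython', positive => 8, negative => 22);
```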
1182 THIS WILL NOT WORK FOR @VARIABLES, the (empty) $VARIABLES will be used
1183 instead.
1184 </P>
1187 The $CGI_VALUE parameters are "shrubed" of all control characters
1188 and quotes (by &shrubCGIparameter($CGI_VALUE)). Control characters
1189 are replaced by \0&lt;octal ascii value&gt; and quotes by their HTML character
1190 value (&#8217; -&gt; &amp;#8217; &#8216; -&gt; &amp;#8216;
1191 &quot; -&gt; &amp;quot;). For example:
1192 if a client would supply the string value (in standard perl)
1193 </P>
1196 <PRE>"/dev/null';\nrm -rf *;\necho '"</PRE>
1197 it would be processed as
1198 <PRE>'/dev/null&amp;#8217;;\015rm -rf *;\015echo &amp;#8217;'</PRE>
1199 (e.g., sh or bash would process the latter more according to your
1200 intentions).<br>
If your interpreter requires different protection measures, you will
1202 have to supply these in %main::SHRUBcharacterTR (string =&gt; translation),
1203 e.g.,
1205 <PRE>
1206 $SHRUBcharacterTR{"\'"} = "&amp;#8217;";
1207 </PRE>
1208 </P>
1211 Currently, the following definitions are used:
1212 </P>
1214 <PRE>
1215 %ScriptingCGIvariables = (
1216 "text/testperl" =&gt; "\$\%s = '\%s';", # Perl $VAR = 'value' (for testing)
1217 "text/sspython" =&gt; "\%s = '\%s'", # Python VAR = 'value'
1218 "text/ssruby" =&gt; '@%s = "%s"', # Ruby @VAR = "value"
1219 "text/sstcl" =&gt; 'set %s "%s"', # TCL set VAR "value"
1220 "text/ssawk" =&gt; '%s = "%s";', # Awk VAR = "value";
1221 "text/sslisp" =&gt; '(setq %s "%s")', # Gnu lisp (rep) (setq VAR "value")
1222 "text/ssprolog" =&gt; '', # Gnu prolog (interpolated)
1223 "text/ssm4" =&gt; "define(`\%s', `\%s')", # M4 macro's define(`VAR', `value')
"text/sh" =&gt; "\%s='\%s';", # Bourne shell VAR='value';
"text/bash" =&gt; "\%s='\%s';", # Bourne-again shell VAR='value';
1226 "text/csh" =&gt; "\$\%s = '\%s';", # C shell $VAR = 'value';
1227 "text/ksh" =&gt; "\$\%s = '\%s';", # Korn shell $VAR = 'value';
1228 "text/sspraat" =&gt; '', # Praat (interpolation)
1229 "text/ssr" =&gt; '%s &lt;- "%s";', # R VAR &lt;- "value";
1230 "text/ssrebol" =&gt; '%s: copy "%s"', # REBOL VAR: copy "value"
1231 "text/postgresql" =&gt; '', # PostgreSQL (interpolation)
"" =&gt; ""
);
1234 </PRE>
Four tables allow fine-tuning of interpreters with code that should be
1238 added before and after each code block:
1239 </P>
1242 Code added before each script block
1243 </P>
1245 <PRE>
1246 %ScriptingPrefix = (
1247 "text/testperl" =&gt; "\# Prefix Code;", # Perl script testing
"text/ssm4" =&gt; 'divert(0)' # M4 macros (open STDOUT)
);
1250 </PRE>
1253 Code added at the end of each script block
1254 </P>
1256 <PRE>
1257 %ScriptingPostfix = (
1258 "text/testperl" =&gt; "\# Postfix Code;", # Perl script testing
"text/ssm4" =&gt; 'divert(-1)' # M4 macros (block STDOUT)
);
1261 </PRE>
1264 Initialization code, inserted directly after opening (NEVER interpolated)
1265 </P>
1267 <PRE>
1268 %ScriptingInitialization = (
1269 "text/testperl" =&gt; "\# Initialization Code;", # Perl script testing
1270 "text/ssawk" =&gt; 'BEGIN {', # Server Side awk scripts
1271 "text/sslisp" =&gt; '(prog1 nil ', # Lisp (rep)
"text/ssm4" =&gt; 'divert(-1)' # M4 macros (block STDOUT)
);
1274 </PRE>
1277 Cleanup code, inserted before closing (NEVER interpolated)
1278 </P>
1280 <PRE>
1281 %ScriptingCleanup = (
1282 "text/testperl" =&gt; "\# Cleanup Code;", # Perl script testing
1283 "text/sspraat" =&gt; 'Quit',
1284 "text/ssawk" =&gt; '};', # Server Side awk scripts
"text/sslisp" =&gt; '(princ "\n" standard-output)).', # Closing print to rep
"text/postgresql" =&gt; '\q',
);
1288 </PRE>
1291 The SRC attribute is NOT magical for these interpreters. In short,
all code inside a source file or {} block is written verbatim
1293 to the interpreter. No (pre-)processing or executional magic is done.
1294 </P>
A serious shortcoming of the described mechanism for handling other
1298 (scripting) languages, with respect to standard perl scripts
1299 (i.e., 'text/ssperl'), is that the code is only executed when
1300 the pipe to the interpreter is closed. So the pipe has to be
1301 closed at the end of each block. This means that the state of the
1302 interpreter (e.g., all variable values) is lost after the closing of
1303 the next &lt;/SCRIPT&gt; tag. The standard 'text/ssperl' scripts retain
1304 all values and definitions.
1305 </P>
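<P>
The loss of state between blocks can be demonstrated with a short sketch. The
helper run_block is hypothetical and uses the core module IPC::Open2 to
capture the output; the running perl binary ($^X) merely stands in for an
arbitrary interpreter that reads its script from standard input.
</P>

```perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical helper: pipe one script block into a fresh interpreter
# process and return everything that process printed.
sub run_block {
    my ($interpreter, $code) = @_;
    my $pid = open2(my $from_child, my $to_child, $interpreter);
    print $to_child $code;
    close($to_child);                 # the script only runs after end-of-input
    my $output = do { local $/; <$from_child> };
    close($from_child);
    waitpid($pid, 0);
    return defined $output ? $output : '';
}

# Two blocks, two processes: the value of $x does not survive.
print run_block($^X, 'our $x = 42; print "x=$x\n";');
print run_block($^X, 'our $x; print defined($x) ? "kept\n" : "lost\n";');
```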
1308 <A NAME="APPLIC"><H2 ALIGN="CENTER">APPLICATION MIME TYPES</H2></A>
To ease some important auxiliary functions from within the
1312 html pages I have added them as MIME types. This uses
1313 the mechanism that is also used for the evaluation of
1314 other scripting languages, with interpolation of CGI
1315 parameters (and perl-variables). Actually, these are
1316 defined exactly like any other "scripting language".
1317 </P>
1319 <dl>
1320 <dt>text/ssdisplay:
1321 <dd>display some (HTML) text with interpolated
1322 variables (uses `cat`).
1323 <dt>text/sslogfile:
1324 <dd>write (append) the interpolated block to the file
1325 mentioned on the first, non-empty line
1326 (the filename can be preceded by 'File: ',
1327 note the space after the ':',
1328 uses `awk .... &gt;&gt; &lt;filename&gt;`).
1329 <dt>text/ssmailto:
1330 <dd>send email directly from within the script block.
1331 The first line of the body must contain
1332 To:Name@Valid.Email.Address
(note: NO space between 'To:' and the email address)
1334 For other options see the mailto man pages.
1335 It works by directly sending the (interpolated)
1336 content of the text block to a pipe into the
1337 Linux program 'mailto'.
1338 </dl>
1341 In these script blocks, all Perl variables will be
1342 replaced by their values. All CGI variables are cleaned before
1343 they are used. These CGI variables must be redefined with a
1344 CGI attribute to restore their original values.
1345 In general, this will be more secure than constructing
e.g., your own email command lines. For instance, Mailto will
not execute any odd (forged) email address, but just stops
when the email address is invalid. Awk, however, will construct
1349 any filename you give it (e.g. '&lt;File;rm\\\040-f' would end up
1350 as a "valid" UNIX filename). Note that it will also gladly
1351 store this file anywhere (/../../../etc/passwd will work!).
1352 Use the CGIscriptor::CGIsafeFileName() function to clean the
1353 filename.
1354 </P>
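<P>
A filename check of this kind could look like the sketch below. The helper
safe_file_name is hypothetical: the set of allowed characters is an
assumption, and the actual CGIscriptor::CGIsafeFileName() may accept or
reject other names.
</P>

```perl
use strict;
use warnings;

# Hypothetical check: return the name when it looks safe, '' otherwise.
sub safe_file_name {
    my ($name) = @_;
    return '' unless defined $name && length($name);
    return '' unless $name =~ m{\A[\w./~-]+\z};   # letters, digits, _ . / ~ - only
    return '' if $name =~ m{(\A|/)\.\.(/|\z)};    # no ".." path component
    return $name;
}

for my $candidate ('logs/visits.txt', '/../../../etc/passwd', '<File;rm -f') {
    my $verdict = safe_file_name($candidate) eq '' ? 'rejected' : 'accepted';
    print "$candidate: $verdict\n";
}
```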
1356 <A NAME="PIPES"><H2 ALIGN="CENTER">SHELL SCRIPT PIPING</H2></A>
1359 If a shell script starts with the UNIX style "#! &lt;shell command&gt; \n"
1360 line, the rest of the shell script is piped into the indicated command,
1361 i.e.,
1362 open(COMMAND, "| command");print COMMAND $RestOfScript;
1363 </P>
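<P>
Splitting a script into the "#!" command and the text that is piped into it
can be sketched as follows. The helper split_shebang is hypothetical and only
shows the parsing step, not the actual piping.
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: return the command named on a leading "#!" line
# and the rest of the script, or (undef, $script) without such a line.
sub split_shebang {
    my ($script) = @_;
    if ($script =~ /\A#![ \t]*([^\n]*?)[ \t]*\n(.*)\z/s) {
        return ($1, $2);
    }
    return (undef, $script);
}

my ($command, $rest) = split_shebang("#! /usr/bin/tcl \nputs hello\n");
print "pipe into: $command\n";    # pipe into: /usr/bin/tcl
print "script:    $rest";         # script:    puts hello
```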
1366 In many ways this is equivalent to the MIME-type profiling for
1367 evaluating other scripting languages as discussed above. The
1368 difference breaks down to convenience. Shell script piping is a
1369 "raw" implementation. It allows you to control all aspects of
1370 execution. Using the MIME-type profiling is easier, but has a
1371 lot of defaults built in that might get in the way. Another
1372 difference is that shell script piping uses the SAFEqx() function,
1373 and MIME-type profiling does not.
1374 </P>
1377 Execution of shell scripts is under the control of the Perl Script blocks
in the document. The MIME-type triggered execution of &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
1379 blocks can be simulated easily. You can switch to a different shell, e.g. tcl,
1380 completely by executing the following Perl commands inside your document:
1381 </P>
1383 <PRE>
1384 &lt;SCRIPT TYPE="text/ssperl"&gt;
1385 $main::ShellScriptContentType = "text/ssTcl"; # Yes, you can do this
1386 CGIscriptor::RedirectShellScript('/usr/bin/tcl'); # Pipe to Tcl
1387 $CGIscriptor::NoShellScriptInterpolation = 1;
1388 &lt;/SCRIPT&gt;
1389 </PRE>
1392 After this script is executed, CGIscriptor will parse scripts of
1393 TYPE="text/ssTcl" and pipe their contents into '|/usr/bin/tcl'
1394 WITHOUT interpolation (i.e., NO substitution of Perl variables).
1395 The crucial function is :
1396 </P>
1398 <PRE>
1399 CGIscriptor::RedirectShellScript('/usr/bin/tcl')
1400 </PRE>
After executing this function, all shell scripts AND all
calls to SAFEqx() are piped into '|/usr/bin/tcl'. If the argument
1405 of RedirectShellScript is empty, e.g., '', the original (default)
1406 value is reset.
1407 </P>
The standard output, STDOUT, of any pipe is sent to the client.
Currently, you should be careful with quotes in such a piped script.
The result of a pipe is NOT put on the @CGIscriptorResults stack.
1413 As a result, you do not have access to the output of any piped (#!)
1414 process! If you want such access, execute
1415 </P>
1417 <PRE>
&lt;SCRIPT TYPE="text/osshell"&gt;echo "script"|command&lt;/SCRIPT&gt;
1419 </PRE>
<P>
or
</P>
1425 <PRE>
1426 &lt;SCRIPT TYPE="text/ssperl"&gt;
1427 $resultvar = SAFEqx('echo "script"|command');
&lt;/SCRIPT&gt;
1429 </PRE>
1432 Safety is never complete. Although SAFEqx() prevents some of the
1433 most obvious forms of attacks and security slips, it cannot prevent
1434 them all. Especially, complex combinations of quotes and intricate
1435 variable references cannot be handled safely by SAFEqx. So be on
1436 guard.
1437 </P>
1439 <A NAME="SSPERL"><H2 ALIGN="CENTER">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</H2></A>
1442 All PERL scripts are evaluated inside a PERL package. This package
1443 has a separate name space. This isolated name space protects the
1444 CGIscriptor.pl program against interference from user code. However,
1445 some variables, e.g., $_, are global and cannot be protected. You are
1446 advised NOT to use such global variable names. You CAN write
1447 directives that directly access the variables in the main program.
1448 You do so at your own risk (there is definitely enough rope available
1449 to hang yourself). The behavior of CGIscriptor becomes undefined if
1450 you change its private variables during run time. The PERL code
1451 directives are used as in:
1452 </P>
1454 <PRE>
1455 $Result = eval($directive); print $Result;'';
1456 </PRE>
1459 ($directive contains all text between &lt;SCRIPT&gt;&lt;/SCRIPT&gt;).
That is, the &lt;directive&gt; is treated as a ''-quoted string and
1461 the result is treated as a scalar. To prevent the VALUE of the code
1462 block from appearing on the client's screen, end the directive with
1463 ';""&lt;/SCRIPT&gt;'. Evaluated directives return the last value, just as
1464 eval(), blocks, and subroutines, but only as a scalar.
1465 </P>
1468 IMPORTANT: All PERL variables defined are persistent. Each &lt;SCRIPT&gt;
1469 &lt;/SCRIPT&gt; construct is evaluated as a {}-block with associated scope
1470 (e.g., for "my $var;" declarations). This means that values assigned
1471 to a PERL variable can be used throughout the document unless they
1472 were declared with "my". The following will actually work as intended
1473 (note that the ``-quotes in this example are NOT evaluated, but used
1474 as simple quotes):
1475 </P>
1477 <PRE>
1478 &lt;META CONTENT="text/ssperl; CGI=`$String='abcdefg'`"&gt;
1479 anything ...
1480 &lt;SCRIPT TYPE="text/ssperl"&gt;@List = split('', $String);&lt;/SCRIPT&gt;
1481 anything ...
1482 &lt;SCRIPT TYPE="text/ssperl"&gt;join(", ", @List[1..$#List]);&lt;/SCRIPT&gt;
1483 </PRE>
1486 The first &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt; construct will return the
1487 value scalar(@List), the second &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt;
1488 construct will print the elements of $String separated by commas, leaving
1489 out the first element, i.e., $List[0].
1490 </P>
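<P>
The persistence of variables and the scalar result of each block can be tried
out with a small stand-in for the evaluator. The helper run_directive and the
package name Sandbox are assumptions of this sketch; the real evaluator does
more (e.g., it prints the result).
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-in for the evaluator: run one directive inside a
# separate package and return its scalar result as text.
sub run_directive {
    my ($directive) = @_;
    no strict;       # user code declares nothing
    no warnings;
    my $result = eval("package Sandbox; $directive");
    return defined $result ? $result : '';
}

run_directive('$String = "abcdefg"; "";');                   # like the CGI default
print run_directive('@List = split("", $String);'), "\n";    # 7
print run_directive('join(", ", @List[1..$#List]);'), "\n";  # b, c, d, e, f, g
print run_directive('$Hidden = 5; "";'), "\n";               # empty line
```

<P>
The list assigned in the first block is still available in the second one,
and the trailing "" in the last call suppresses the value of the block.
</P>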
1493 Another warning: './' and '~/' are ALWAYS replaced by the values of
$YOUR_SCRIPTS and $YOUR_HTML_FILES, respectively. This can interfere
with pattern matching, e.g., $a =~ s/aap\./noot\./g will result in the
evaluation of $a =~ s/aap\\${YOUR_SCRIPTS}noot\./g. Use
1497 s@<i>regexp</i>@<i>replacement</i>@g instead.
1498 </p>
1500 <A NAME="USEREXTENSIONS"><H2 ALIGN="CENTER">USER EXTENSIONS</H2></A>
1503 A CGIscriptor package is attached to the bottom of this file. With
1504 this package you can personalize your version of CGIscriptor by
1505 including often used perl routines. These subroutines can be
1506 accessed by prefixing their names with CGIscriptor::, e.g.,
1507 </P>
1509 <PRE>
1510 &lt;SCRIPT TYPE="text/ssperl"&gt;
1511 CGIscriptor::ListDocs("/Books/*") # List all documents in /Books
1512 &lt;/SCRIPT&gt;
1513 </PRE>
1516 It already contains some useful subroutines for Document Management.
1517 As it is a separate package, it has its own namespace, isolated from
1518 both the evaluator and the main program. To access variables from
1519 the document &lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks, use $CGIexecute::&lt;var&gt;.
1520 </P>
1523 Currently, the following functions are implemented
1524 (precede them with CGIscriptor::, see below for more information)
1525 </P>
1527 <UL>
1528 <LI>SAFEqx ('String') -&gt; result of qx/"String"/ # Safe application of ``-quotes<br>
1529 Is used by text/osshell Shell scripts. Protects all CGI
1530 (client-supplied) values with single quotes before executing the
1531 commands (one of the few functions that also works WITHOUT CGIscriptor::
1532 in front)
<LI>defineCGIvariable ($name[, $default]) -&gt; 0/1 (i.e.,
1534 failure/success)<br>
1535 Is used by the META tag to define and initialize CGI and ENV
1536 name/value pairs. Tries to obtain an initializing value from (in
1537 order):<br>
1538 $ENV{$name}<br>
1539 The Query string<br>
1540 The default value given (if any)<br>
1541 (one of the few functions that also works WITHOUT CGIscriptor::
1542 in front)
<LI>CGIsafeFileName (FileName) -&gt; FileName or ""<br>
1544 Check a string against the Allowed File Characters (and ../ /..).
1545 Returns an empty string for unsafe filenames.
<LI>CGIsafeEmailAddress (Email) -&gt; Email or ""<br>
1547 Check a string against correct email address pattern.
1548 Returns an empty string for unsafe addresses.
<LI>RedirectShellScript ('CommandString') -&gt; FILEHANDLE or undef<br>
1550 Open a named PIPE for SAFEqx to receive ALL shell scripts
1551 <LI>URLdecode (URL encoded string) -&gt; plain string # Decode URL encoded argument<br>
1552 <LI>URLencode (plain string) -&gt; URL encoded string # Encode argument as URL code<br>
1553 <LI>CGIparseValue (ValueName [, URL_encoded_QueryString]) -&gt; Decoded value<br>
1554 Extract the value of a CGI variable from the global or a private
1555 URL-encoded query (multipart POST raw, NOT decoded)
1556 <li>CGIparseValueList (ValueName [, URL_encoded_QueryString])
1557 -&gt; List of decoded values.<br>
1558 As CGIparseValue, but now assembles ALL values of ValueName into a list.
<LI>CGIparseHeader (ValueName [, URL_encoded_QueryString]) -&gt; Header<br>
1560 Extract the header of a multipart CGI variable from the global or a private
1561 URL-encoded query ("" when not a multipart variable or absent)
1562 <LI>CGIparseForm ([URL_encoded_QueryString]) -&gt; Decoded Form<br>
1563 Decode the complete global URL-encoded query or a private
1564 URL-encoded query
1565 <LI>read_url(URL)<br>
1566 Returns the page from URL (with added base tag, both FTP and HTTP)
1567 Uses main::GET_URL(URL, 1) to get at the command to read the URL.
1568 <LI>BrowseDirs(RootDirectory [, Pattern, Startdir, CGIname]) # print browsable directories
1569 <LI>ListDocs(Pattern [,ListType]) # Prints a nested HTML directory listing of
1570 all documents, e.g., ListDocs("/*", "dl");.<br>
1571 <LI>HTMLdocTree(Pattern [,ListType]) # Prints a nested HTML listing of all
1572 local links starting from a given document, e.g.,
1573 HTMLdocTree("/Welcome.html", "dl");<br>
1574 </UL>
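<P>
To give an impression of what the URL related functions in this list do, the
sketch below re-implements decoding, encoding and the extraction of a single
value. The names url_decode, url_encode and parse_value are hypothetical;
this is not the code shipped in the CGIscriptor package and it does not
handle multipart POST data.
</P>

```perl
use strict;
use warnings;

# Decode a URL-encoded string.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                                  # '+' encodes a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;      # %XX encodes one byte
    return $text;
}

# Encode every byte outside a conservative set of safe characters.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# First value of $name in a URL-encoded query string, '' when absent.
sub parse_value {
    my ($name, $query) = @_;
    for my $pair (split(/[&;]/, $query)) {
        my ($key, $value) = split(/=/, $pair, 2);
        next unless defined $key && url_decode($key) eq $name;
        return url_decode(defined $value ? $value : '');
    }
    return '';
}

print parse_value('negative', 'positive=8&negative=22'), "\n";   # 22
```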
1576 <A NAME="RESULTSSTACK"><H2 ALIGN="CENTER">THE RESULTS STACK: @CGIscriptorResults</H2></A>
1579 If the pseudo-variable "$CGIscriptorResults" has been defined in a
1580 META tag, all subsequent SCRIPT and META results are pushed
1581 on the @CGIscriptorResults stack. This list is just another
1582 Perl variable and can be used and manipulated like any other list.
1583 $CGIscriptorResults[-1] is always the last result.
1584 This is only of limited use, e.g., to use the results of an OS shell
1585 script inside a Perl script. Will NOT contain the results of Pipes
1586 or code from MIME-profiling.
1587 </P>
<A NAME="CGIPREDEFINED"><H2 ALIGN="CENTER">USEFUL CGI PREDEFINED VARIABLES (DO NOT ASSIGN TO THESE)</H2></A>
1591 <ul>
1592 <li>$CGI_HOME - The ServerRoot directory
1593 <li>$CGI_Decoded_QS - The complete decoded Query String
1594 <li>$CGI_Content_Length - The ACTUAL length of the Query String
1595 <li>$CGI_Date - Current date and time
1596 <li>$CGI_Year $CGI_Month $CGI_Day $CGI_WeekDay - Current Date
1597 <li>$CGI_Time - Current Time
1598 <li>$CGI_Hour $CGI_Minutes $CGI_Seconds - Current Time, split
<li>GMT Date/Time:
1600 <li>$CGI_GMTYear $CGI_GMTMonth $CGI_GMTDay $CGI_GMTWeekDay $CGI_GMTYearDay
1601 <li>$CGI_GMTHour $CGI_GMTMinutes $CGI_GMTSeconds $CGI_GMTisdst
1602 </ul>
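<P>
The names of the GMT variables resemble the fields returned by Perl's
built-in gmtime function. Assuming that correspondence (it is not
confirmed here, and CGIscriptor may format its values differently),
comparable values could be obtained in plain Perl roughly as follows.
All variable names in this sketch are local to the example.
</P>

<PRE>
use strict;
use warnings;

# gmtime returns nine fields in this fixed order
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime(time);

# Perl counts years from 1900 and months from 0
my $full_year = $year + 1900;
my $month     = $mon + 1;

printf("GMT %04d-%02d-%02d %02d:%02d:%02d (day %d of the year)\n",
       $full_year, $month, $mday, $hour, $min, $sec, $yday + 1);
</PRE>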
1604 <A NAME="ENVIRONMENT"><H2 ALIGN="CENTER">USEFUL CGI ENVIRONMENT VARIABLES</H2></A>
1607 Variables accessible (in APACHE) as $ENV{"&lt;name&gt;"}
1608 (see: "http://hoohoo.ncsa.uiuc.edu/cgi/env.html"):
1609 </P>
1611 <UL>
1612 <LI>QUERY_STRING - The query part of URL, that is, everything that follows the
1613 question mark.
1614 <LI>PATH_INFO - Extra path information given after the script name
1615 <LI>PATH_TRANSLATED - Extra pathinfo translated through the rule system.
1616 (This doesn't always make sense.)
1617 <LI>REMOTE_USER - If the server supports user authentication, and the script is
1618 protected, this is the username they have authenticated as.
1619 <LI>REMOTE_HOST - The hostname making the request. If the server does not have
1620 this information, it should set REMOTE_ADDR and leave this unset
1621 <LI>REMOTE_ADDR - The IP address of the remote host making the request.
1622 <LI>REMOTE_IDENT - If the HTTP server supports RFC 931 identification, then this
1623 variable will be set to the remote user name retrieved from
1624 the server. Usage of this variable should be limited to logging
1625 only.
1626 <LI>AUTH_TYPE - If the server supports user authentication, and the script
1627 is protected, this is the protocol-specific authentication
1628 method used to validate the user.
1629 <LI>CONTENT_TYPE - For queries which have attached information, such as HTTP
1630 POST and PUT, this is the content type of the data.
1631 <LI>CONTENT_LENGTH - The length of the said content as given by the client.
1632 <LI>SERVER_SOFTWARE - The name and version of the information server software
1633 answering the request (and running the gateway).
1634 Format: name/version
1635 <LI>SERVER_NAME - The server's hostname, DNS alias, or IP address as it
1636 would appear in self-referencing URLs
1637 <LI>GATEWAY_INTERFACE - The revision of the CGI specification to which this
1638 server complies. Format: CGI/revision
1639 <LI>SERVER_PROTOCOL - The name and revision of the information protocol this
1640 request came in with. Format: protocol/revision
1641 <LI>SERVER_PORT - The port number to which the request was sent.
1642 <LI>REQUEST_METHOD - The method with which the request was made. For HTTP,
1643 this is "GET", "HEAD", "POST", etc.
1644 <LI>SCRIPT_NAME - A virtual path to the script being executed, used for
1645 self-referencing URLs.
1646 <LI>HTTP_ACCEPT - The MIME types which the client will accept, as given by
1647 HTTP headers. Other protocols may need to get this
1648 information from elsewhere. Each item in this list should
1649 be separated by commas as per the HTTP spec.
1650 Format: type/subtype, type/subtype
1651 <LI>HTTP_USER_AGENT - The browser the client is using to send the request.
1652 General format: software/version library/version.
1653 </UL>
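<P>
A minimal sketch of reading some of the variables listed above from
Perl. The helper name env_value and the "(not set)" placeholder are
invented for this example. When run outside a web server, most of these
variables will simply be reported as not set.
</P>

<PRE>
use strict;
use warnings;

# Return the value of an environment variable, or a placeholder text
sub env_value {
    my ($name) = @_;
    return defined($ENV{$name}) ? $ENV{$name} : "(not set)";
}

# Print a few of the CGI variables described in the list
foreach my $name ("REQUEST_METHOD", "QUERY_STRING", "PATH_INFO", "REMOTE_ADDR") {
    print "$name = ", env_value($name), "\n";
}
</PRE>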
1655 <A NAME="RUNNING"><H2 ALIGN="CENTER">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</H2></A>
1658 CGIscriptor.pl will run on any WWW server that runs Perl scripts.
1659 Just add a line like the following to your srm.conf file
1660 (Apache example):
1661 </P>
1663 <pre>
1664 ScriptAlias /SHTML/ /real-path/CGIscriptor.pl/
1665 </pre>
1668 URLs that refer to http://www.your.address/SHTML/... will now be handled
1669 by CGIscriptor.pl, which can use a private directory tree (default is the
1670 DOCUMENT_ROOT directory tree, but it can be anywhere, see manual).
1671 </P>
1674 If your hosting ISP won't let you add ScriptAlias lines, you can use
1675 the following "rewrite"-based "scriptalias" in .htaccess
1676 (from Gerd Franke):
1677 </P>
1679 <pre>
1680 RewriteEngine On
1681 RewriteBase /
1682 RewriteCond %{REQUEST_FILENAME} \.html$
1683 RewriteCond %{SCRIPT_FILENAME} !cgiscriptor\.pl$
1684 RewriteCond %{REQUEST_FILENAME} -f
1685 RewriteRule ^(.*)$ /cgi-bin/cgiscriptor.pl/$1?%{QUERY_STRING}
1686 </Pre>
1689 Everything with the extension ".html" and not including "cgiscriptor.pl"
1690 in the URL and where the file "path/filename.html" exists is redirected
1691 to "/cgi-bin/cgiscriptor.pl/path/filename.html?query".
1692 The user configuration should get the same path-level as the
1693 .htaccess-file:
1694 </P>
1696 <pre>
1697 # Just enter your own directory path here
1698 $YOUR_HTML_FILES = "$ENV{'DOCUMENT_ROOT'}";
1699 # use DOCUMENT_ROOT only, if .htaccess lies in the root-directory.
1700 </Pre>
1703 If this .htaccess goes in a specific directory, the path to this
1704 directory must be added to $ENV{'DOCUMENT_ROOT'}.
1705 </p>
1708 The CGIscriptor file contains all documentation as comments. These comments
1709 can be removed to speed up loading (e.g., `egrep -v '^#' CGIscriptor.pl &gt;
1710 leanScriptor.pl`). A bare bones version of CGIscriptor.pl, lacking
1711 documentation, most comments, access control, example functions etc.
1712 (but still with the copyright notice and some minimal documentation)
1713 can be obtained by calling CGIscriptor.pl on the command line with the
1714 '-slim' command line argument, e.g.,
1715 </p>
1717 <PRE>
1718 &gt;CGIscriptor.pl -slim &gt; slimCGIscriptor.pl
1719 </PRE>
1722 CGIscriptor.pl can be run from the command line with &lt;path&gt; and &lt;query&gt; as
1723 arguments, as `CGIscriptor.pl &lt;path&gt; &lt;query&gt;`, inside a perl script with
1724 'do CGIscriptor.pl' after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING},
1725 or CGIscriptor.pl can be loaded with 'require "/real-path/CGIscriptor.pl"'.
1726 In the latter case, requests are processed by 'Handle_Request();'
1727 (again after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING}).
1728 </P>
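<P>
The 'require' variant described above might look roughly like the sketch
below. The document path and query string are hypothetical sample
values, "/real-path/CGIscriptor.pl" is the same placeholder used in the
text, and the file test is only added so that the sketch does not fail
when the placeholder path does not exist.
</P>

<PRE>
use strict;
use warnings;

# Hypothetical document path and query; adjust to your own files
$ENV{PATH_INFO}    = "/index.html";
$ENV{QUERY_STRING} = "name=value";

# Placeholder path, replace with the real location of CGIscriptor.pl
my $scriptor = "/real-path/CGIscriptor.pl";

if (-e $scriptor) {
    require $scriptor;   # load CGIscriptor.pl
    Handle_Request();    # process the request described in %ENV
} else {
    print "CGIscriptor.pl not found at $scriptor\n";
}
</PRE>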
1731 The --help command line switch will print the manual.
1732 </p>
1735 Using the command line execution option, CGIscriptor.pl can be used as a document
1736 (meta-)preprocessor. If the first argument is '-', STDIN will be read. For example:
1737 </P>
1739 <PRE>
1740 &gt; cat MyDynamicDocument.html | CGIscriptor.pl - '[QueryString]' &gt; MyStaticFile.html
1741 </PRE>
1744 This command line will produce a STATIC file with the DYNAMIC content of
1745 MyDynamicDocument.html "interpolated". This option would be very dangerous if it
1746 were available over the internet. If someone could sneak a
1747 'http://www.your.domain/-' URL past your server, CGIscriptor could EXECUTE
1748 any POSTED content. Therefore, for security reasons, STDIN will NOT
1749 be read if ANY of the HTTP server environment variables is set (e.g., SERVER_PORT,
1750 SERVER_PROTOCOL, SERVER_NAME, SERVER_SOFTWARE, HTTP_USER_AGENT,
1751 REMOTE_ADDR).<br>
1752 This block on processing STDIN on HTTP requests can be lifted by setting
1753 <pre>
1754 $BLOCK_STDIN_HTTP_REQUEST = 0;
1755 </pre>
1756 in the security configuration. Be careful when doing this:
1757 it can be very dangerous.
1758 </P>
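<P>
The rule described above can be sketched in plain Perl as follows. This
is an illustration of the stated logic only, not the code used inside
CGIscriptor.pl. The function name stdin_allowed is invented for the
example, and the list contains just the variables named in the text.
</P>

<PRE>
use strict;
use warnings;

# Variables named in the text; if any is set, assume an HTTP request
my @server_vars = ("SERVER_PORT", "SERVER_PROTOCOL", "SERVER_NAME",
                   "SERVER_SOFTWARE", "HTTP_USER_AGENT", "REMOTE_ADDR");

# Returns 1 when reading STDIN is acceptable, 0 otherwise
sub stdin_allowed {
    my ($block_http) = @_;
    return 1 unless $block_http;            # block lifted in the configuration
    foreach my $name (@server_vars) {
        return 0 if defined($ENV{$name});   # looks like a server environment
    }
    return 1;
}

my $BLOCK_STDIN_HTTP_REQUEST = 1;
my $verdict = stdin_allowed($BLOCK_STDIN_HTTP_REQUEST)
    ? "STDIN may be read" : "STDIN is blocked";
print "$verdict\n";
</PRE>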
1761 Running demos and more information can be found at
1762 http://www.fon.hum.uva.nl/~rob/OSS/OSS.html
1763 </P>
1766 A pocket-size HTTP daemon, CGIservlet.pl, is available from my web site
1767 or CPAN that can use CGIscriptor.pl as the base of a µWWW server and
1768 demonstrates its use.
1769 </P>
1771 <A NAME="NON-UNIX"><H2 ALIGN="CENTER">NON-UNIX PLATFORMS</H2></A>
1774 CGIscriptor.pl was mainly developed and tested on UNIX. However, as I
1775 coded part of the time on an Apple Macintosh under MacPerl, I made sure
1776 CGIscriptor did run under MacPerl (with command line options). But only as
1777 an independent script, not as part of an HTTP server. I have used it
1778 under Apache in Windows XP.
1779 </P>
1781 <A NAME="license"><H2 ALIGN="CENTER">license</H2></A>
1784 This program is free software; you can redistribute it and/or
1785 modify it under the terms of the GNU General Public License
1786 as published by the Free Software Foundation; either version 2
1787 of the License, or (at your option) any later version.
1788 </P>
1791 This program is distributed in the hope that it will be useful,
1792 but WITHOUT ANY WARRANTY; without even the implied warranty of
1793 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1794 GNU General Public License for more details.
1795 </P>
1798 You should have received a copy of the GNU General Public License
1799 along with this program; if not, write to the Free Software
1800 Foundation, Inc., 59 Temple Place - Suite 330,
1801 Boston, MA 02111-1307, USA.
1802 </P>
1804 <PRE>
1805 Author: Rob van Son
1806 email:
1807 R.J.J.H.vanSon@uva.nl
1808 University of Amsterdam
1810 Date: May 22, 2000
1811 Ver: 2.0
1812 Env: Perl 5.002
1813 </PRE>
1814 </BODY>
1816 </HTML>