1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
2 <HTML>
4 <HEAD>
6 <TITLE>CGIscriptor 2.0 Manual</TITLE>
9 </HEAD>
11 <BODY>
13 <H1 ALIGN="CENTER">
14 <I>CGIscriptor 2.4</I>: An implementation of integrated server side CGI scripts
15 </H1>
17 <UL>
18 <P>
19 <LI><A HREF="#HYPE">HYPE</A>
20 <LI><A HREF="#HOWITWORKS">THIS IS HOW IT WORKS</A>
21 <LI><A HREF="#HTML4">HTML 4 COMPLIANCE</A>
22 <LI><A HREF="#SECURITY">SECURITY</A>
23 </P>
25 <P>
26 <LI><A HREF="#MANUAL">USER MANUAL</A>
27 <UL>
28 <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
29 <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
30 <LI><A HREF="#BINFILES">NON-HTML FILES</A>
31 <LI><A HREF="#META">THE META TAG</A>
32 <LI><A HREF="#DIV">THE DIV/INS TAG</A>
33 <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
34 <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
35 <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
36 <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
37 <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
38 <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
39 <LI><A HREF="#APPLIC">APPLICATION MIME TYPES</A>
40 <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
41 <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
42 <LI><A HREF="#SESSIONTICKETS">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</A>
43 <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
44 <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
<LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
<LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
47 <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
48 <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
49 </UL>
50 <LI><A HREF="#license">license</A>
51 </P>
53 </UL>
55 <A NAME="HYPE"><H2 ALIGN="CENTER">HYPE</H2></A>
57 <P>
CGIscriptor merges plain ASCII HTML files transparently and safely
with CGI variables, in-line PERL code, shell commands, and executable
scripts in many languages (on-line and real-time). It combines the
"ease of use" of HTML files with the versatility of specialized
scripts and PERL programs. It hides all the specifics and
idiosyncrasies of correct output and CGI coding and naming. Scripts
do not have to be aware of HTML, HTTP, or CGI conventions, just as HTML
files can be ignorant of scripts and the associated values. CGIscriptor
complies with the W3C HTML 4.0 recommendations.
67 </P>
69 <P>
In addition to its use as a WWW embedded CGI processor, it can
be used as a command-line document preprocessor (text filter).
72 </P>
74 <A NAME="HOWITWORKS"><H2 ALIGN="CENTER">THIS IS HOW IT WORKS</H2></A>
76 <P>
The aim of CGIscriptor is to execute "plain" scripts inside a text file
using any required CGI parameters and environment variables. It
is optimized to transparently process HTML files inside a WWW server.
The native language is Perl, but many other scripting languages
can be used.
82 </P>
84 <P>
85 CGIscriptor reads text files from the requested input file (i.e., from
86 $YOUR_HTML_FILES$PATH_INFO) and writes them to &lt;STDOUT&gt; (i.e., the client
87 requesting the service) preceded by the obligatory
88 "Content-type: text/html\n\n" or "Content-type: text/plain\n\n" string
89 (except for "raw" files which supply their own Content-type message
90 and only if the SERVER_PROTOCOL contains HTTP, FTP, GOPHER, MAIL, or MIME).
91 </P>
93 <P>
94 When CGIscriptor encounters an embedded script, indicated by an HTML4 tag
95 </P>
97 <PRE>
98 &lt;SCRIPT TYPE="text/ssperl" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
99 PERL script
100 &lt;/SCRIPT&gt;
101 </PRE>
105 <PRE>
106 &lt;SCRIPT TYPE="text/osshell" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
107 OS Shell script
108 &lt;/SCRIPT&gt;
109 </PRE>
112 construct (anything between []-brackets is optional, other MIME-types are
113 supported), the embedded script is removed and both the contents of the
114 source file (i.e., "do 'ScriptSource'") AND the script are evaluated as a
115 PERL program (i.e., by eval()), a shell script (i.e., by a "safe" version
116 of `Command`, qx) or an external interpreter. The output of the eval()
117 function takes the place of the original &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
118 construct in the output string. Any CGI parameters declared by the CGI
119 attribute are available as simple perl variables, and can subsequently
120 be made available as variables to other scripting languages (e.g., bash,
121 python, or lisp).
122 </P>
125 Example: printing "Hello World"
126 </P>
128 <PRE>
129 &lt;HTML>&lt;HEAD>&lt;TITLE>Hello World&lt;/TITLE&gt;
130 &lt;BODY&gt;
131 &lt;H1&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;"Hello World"&lt;/SCRIPT&gt;&lt;/H1&gt;
132 &lt;/BODY&gt;&lt;/HTML&gt;
133 </PRE>
<P>
Save this in a file, hello.html, in the directory you indicated with
$YOUR_HTML_FILES and access http://your_server/SHTML/hello.html
(or whatever name you use as an alias for CGIscriptor.pl).
This is really ALL you need to do to get going.
140 </P>
<P>
You can use any values that are delivered in CGI-compliant form (i.e.,
the "?name=value" type URL additions) transparently as "$name" variables
in your scripts if and only if you have declared them beforehand in a META
or SCRIPT tag, e.g.:
146 </P>
148 <PRE>
149 &lt;META CONTENT="text/ssperl; CGI='$name = `default value`'
150 [SRC='ScriptSource']"&gt;
151 </PRE>
153 <PRE>
154 &lt;SCRIPT TYPE=text/ssperl CGI="$name = 'default value'"
155 [SRC='ScriptSource']&gt;
156 </PRE>
159 After such a 'CGI' attribute, you can use $name as an ordinary PERL variable
160 (the ScriptSource file is immediately evaluated with "do 'ScriptSource'").
161 The CGIscriptor script allows you to write ordinary HTML files which will
162 include dynamic CGI aware (run time) features, such as on-line answers
163 to specific CGI requests, queries, or the results of calculations.
164 </P>
167 For example, if you wanted to answer questions of clients, you could write
168 a Perl program called "Answer.pl" with a function "AnswerQuestion()"
169 that prints out the answer to requests given as arguments. You then write
170 a HTML page "Respond.html" containing the following fragment:
171 </P>
173 <hr>
174 <PRE>
&lt;CENTER&gt;
The Answer to your question
&lt;META CONTENT="text/ssperl; CGI='$Question'"&gt;
&lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;$Question&lt;/SCRIPT&gt;&lt;/h3&gt;

&lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./PATH/Answer.pl"&gt;
AnswerQuestion($Question);
&lt;/SCRIPT&gt;&lt;/h3&gt;
&lt;/CENTER&gt;
&lt;FORM ACTION=Respond.html METHOD=GET&gt;
Next question: &lt;INPUT NAME="Question" TYPE=TEXT SIZE=40&gt;&lt;br&gt;
&lt;INPUT TYPE=SUBMIT VALUE="Ask"&gt;
&lt;/FORM&gt;
188 </PRE>
189 <hr>
192 The output could look like the following (in HTML-speak):
193 </P>
195 <hr>
196 <PRE>
<CENTER>
The Answer to your question
<h3>What is the capital of the Netherlands?</h3>

<h3>Amsterdam</h3>
</CENTER>
<FORM ACTION=Respond.html METHOD=GET>
Next question: <INPUT NAME="Question" TYPE=TEXT SIZE=40><br>
<INPUT TYPE=SUBMIT VALUE="Ask">
</FORM>
206 </PRE>
207 <hr>
<P>
Note that the script "Answer.pl" knows nothing about CGI or HTML;
it just prints out answers to arguments. Likewise, the text has no
provisions for scripts or CGI-like constructs. Also, it is completely
trivial to extend this "program" to use the "Answer" later in the page
to call up other information or pictures/sounds. The final text never
shows any cue as to what the original "source" looked like, i.e.,
where you store your scripts and how they are called.
217 </P>
<P>
There are some extras. Files called in a SRC= attribute can access the
CGI variables declared in the preceding META tag through the @ARGV array.
Executable files are called as:
`file '$ARGV[0]' ... ` (e.g., `Answer.pl \'$Question\'`;)
The files called from SRC can even be (CGIscriptor) html files, which are
processed in-line. Furthermore, the SRC= attribute can contain a perl block
that is evaluated. That is,
227 </P>
229 <PRE>
230 &lt;META CONTENT="text/ssperl; CGI='$Question' SRC='{$Question}'"&gt;
231 </PRE>
234 will result in the evaluation of "print do {$Question};" and the VALUE
235 of $Question will be printed. Note that these "SRC-blocks" can be
236 preceded and followed by other file names, but only a single block is
237 allowed in a SRC= tag.
238 </P>
<p>
One of the major hassles of dynamic WWW pages is the fact that several
mutually incompatible browsers and platforms must be supported. For example,
the way sound is played automatically is different for Netscape and
Internet Explorer, and for each browser it is different again on
Unix, MacOS, and Windows. Really dangerous is processing user-supplied
(form-) values to construct email addresses, file names, or database
queries. All Apache WWW-server exploits reported in the media are
based on faulty CGI-scripts that didn't check their user-data properly.
249 </p>
<p>
There is no panacea for these problems, but a lot of work and problems
can be saved by allowing easy and transparent control over which
&lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks are executed on what CGI-data. CGIscriptor
supplies such a method in the form of a pair of attributes:
IF='...condition...' and UNLESS='...condition...'. When added to a
script tag, the whole block (including the SRC attribute) will be
ignored if the condition is false (IF) or true (UNLESS).
For example, the following block will NOT be evaluated if the value
of the CGI variable FILENAME is NOT a valid filename:
261 </p>
263 <pre>
264 &lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME' IF='CGIscriptor::CGIsafeFileName($FILENAME)'&gt;
265 .....
266 &lt;/SCRIPT&gt;
267 </pre>
270 (the function CGIsafeFileName(String) returns an empty string ("")
271 if the String argument is not a valid filename).
272 The UNLESS attribute is the mirror image of IF.
273 </p>
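<P>
The gating rule above can be sketched as follows. This is a minimal Python
illustration, not CGIscriptor's own Perl code: the names block_is_evaluated
and safe_file_name, and the allowed character set, are assumptions made for
this example only.
</P>

```python
import string

# Assumed allow-list for this example; CGIscriptor configures its own set.
ALLOWED_CHARS = set(string.ascii_letters + string.digits + "._-/")


def safe_file_name(name):
    """Return the name unchanged if every character is allowed, else ""."""
    if name and all(ch in ALLOWED_CHARS for ch in name):
        return name
    return ""


def block_is_evaluated(if_value=None, unless_value=None):
    """Apply the IF/UNLESS rule: skip the block when IF is false or UNLESS is true."""
    if if_value is not None and not if_value:
        return False
    if unless_value is not None and unless_value:
        return False
    return True
```

<P>
Here block_is_evaluated(if_value=safe_file_name(filename)) mirrors the
IF='CGIscriptor::CGIsafeFileName($FILENAME)' attribute: an empty string from
the check means the block is skipped.
</P>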
276 A user manual follows the HTML 4 and security paragraphs below.
277 </P>
280 <A NAME="HTML4"><H2 ALIGN="CENTER">HTML 4 COMPLIANCE</H2></A>
283 In general, CGIscriptor.pl complies with the HTML 4 recommendations of
284 the W3C. This means that any software to manage Web sites will be able
285 to handle CGIscriptor files, as will web agents.
286 </P>
289 All script code should be placed between &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, the
290 script type is indicated with TYPE="mime-type", the LANGUAGE
291 feature is ignored, and a SRC feature is implemented. All CGI specific
292 features are delegated to the CGI attribute.
293 </P>
296 However, the behavior deviates from the W3C recommendations at some
297 points. Most notably:
298 </P>
300 <DL>
301 <dt>0- The scripts are executed at the server side, invisible to the
302 client (i.e., the browser)
303 <dt>1- The mime-types are personal and idiosyncratic, but can be adapted.
304 <dt>2- Code in the body of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tag-pair is still evaluated
305 when a SRC feature is present.
306 <dt>3- The SRC feature reads a list of files.
307 <dt>4- The files in a SRC feature are processed according to file type.
308 <dt>5- The SRC feature evaluates inline Perl code.
309 <dt>6- Processed META, INS, and DIV tags are removed from the output document.
310 <dt>7- All attributes of the processed META tags, except CONTENT, are ignored
311 (i.e., deleted from the output).
312 <dt>8- META tags can be placed ANYWHERE in the document.
313 <dt>9- Through the SRC feature, META tags can have visible output in the
314 document.
315 <dt>10- The CGI attribute that declares CGI parameters, can be used
316 inside the &lt;SCRIPT&gt; tag.
317 <dt>11- Use of an extended quote set, i.e., '', "", ``, (), {}, []
318 and their \-slashed combinations: \'\', \"\", \`\`, \(\),
319 \{\}, \[\].
320 <dt>12- IF and UNLESS attributes to &lt;SCRIPT&gt;, &lt;META&gt;,
321 &lt;INS&gt;, and &lt;DIV&gt; tags.
322 <dt>13- &lt;DIV&gt; tags cannot be nested, &lt;DIV&gt; tags are not
323 rendered with new-lines.
324 <dt>14- The XML style &lt;TAG .... /&gt; is recognized and handled correctly.
325 (i.e., no content is processed)
326 </DL>
329 The reasons for these choices are:
330 </P>
333 You can still write completely HTML4 compliant documents. CGIscriptor
334 will not force you to write "deviant" code. However, it allows you to
335 do so (which is, in fact, just as bad). The prime design principle
336 was to allow users to include plain Perl code. The code itself should
337 be "enhancement free". Therefore, extra features were needed to
338 supply easy access to CGI and Web site components. For security
339 reasons these have to be declared explicitly. The SRC feature
340 transparently manages access to external files, especially the safe
341 use of executable files.
342 </P>
<P>
The CGI attribute handles the declarations of external (CGI) variables
in the SCRIPT and META tags.<BR>
347 EVERYTHING THE CGI ATTRIBUTE AND THE META TAG DO CAN BE DONE INSIDE
348 A &lt;SCRIPT&gt;&lt;/SCRIPT&gt; TAG CONSTRUCT.
349 </P>
<P>
The reason the IF, UNLESS, and SRC attributes (and the SRC attribute's Perl
code evaluation) were built into the META and SCRIPT tags is part laziness,
part security. The SRC blocks allow more compact documents and easier
debugging. The values of the CGI variables can be immediately screened for
security by IF or UNLESS conditions, and even SRC attributes (e.g., email
addresses and file names), and a few commands can be called without having
to add another Perl TAG pair. This is especially important for documents
that require the use of other (restricted) "scripting" languages that lack
transparent control structures.
360 </P>
363 <A NAME="SECURITY"><H2 ALIGN="CENTER">SECURITY</H2></A>
366 Your WWW site is a few keystrokes away from a few hundred million internet
367 users. A fair percentage of these users knows more about your computer
368 than you do. And some of these just might have bad intentions.
369 </P>
<P>
To ensure uncompromised operation of your server and platform, several
features are incorporated in CGIscriptor.pl to enhance security.
First of all, you should check the source of this program. No security
measures will help you when you download programs from anonymous sources.
If you want to use THIS file, please make sure that it is uncompromised.
The best way to do this is to contact the source and try to determine
whether s/he is reliable (and accountable).
379 </P>
382 BE AWARE THAT ANY PROGRAMMER CAN CHANGE THIS PROGRAM IN SUCH A WAY THAT
383 IT WILL SET THE DOORS TO YOUR SYSTEM WIDE OPEN
384 </P>
387 I would like to ask any user who finds bugs that could compromise
388 security to report them to me (and any other bug too,
389 Email: R.J.J.H.vanSon@gmail.com or ifa@hum.uva.nl).
390 </P>
392 <H2 ALIGN="CENTER">Security features</H2>
394 <dl>
395 <dt>1 Invisibility
396 <dd>The inner workings of the HTML source files are completely hidden
397 from the client. Only the HTTP header and the ever changing content
398 of the output distinguish it from the output of a plain, fixed HTML
399 file. Names, structures, and arguments of the "embedded" scripts
400 are invisible to the client. Error output is suppressed except
401 during debugging (user configurable).
403 <dt>2 Separate directory trees
<dd>Directories containing Inline text and script files can reside on
separate trees, distinct from those of the HTTP server. This means
that NEITHER the text files, NOR the script files can be read by
clients other than through CGIscriptor.pl, UNLESS they are
EXPLICITLY made available.
410 <dt>3 Requests are NEVER "evaluated"
411 <dd>All client supplied values are used as literal values (''-quoted).
412 Client supplied ''-quotes are ALWAYS removed. Therefore, as long as the
413 embedded scripts do NOT themselves evaluate these values, clients CANNOT
414 supply executable commands. Be sure to AVOID scripts like:
416 <PRE>
417 &lt;META CONTENT="text/ssperl; CGI='$UserValue'"&gt;
418 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 $UserValue`;&lt;/SCRIPT&gt;
419 </PRE>
<P>
These are a recipe for disaster. However, the following quoted
form should be safe (but is still not advised):
424 </P>
426 <PRE>
427 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 \'$UserValue\'`;&lt;/SCRIPT&gt;
428 </PRE>
<P>
A special function, SAFEqx(), will automatically do exactly this,
e.g., SAFEqx('ls -1 $UserValue') will execute `ls -1 \'$UserValue\'`
with $UserValue interpolated. I recommend using SAFEqx() instead
of backticks whenever you can. The OS shell scripts inside
435 </P>
437 <PRE>
438 &lt;SCRIPT TYPE="text/osshell"&gt;ls -1 $UserValue&lt;/SCRIPT&gt;
439 </PRE>
<P>
are handled by SAFEqx and automatically ''-quoted.
443 </P>
445 <dt>4 Logging of requests
446 <dd>All requests can be logged separate from the Host server. The level of
447 detail is user configurable: Including or excluding the actual queries.
448 This allows for the inspection of (im-) proper use.
450 <dt>5 Access control: Clients
<dd>The Remote addresses can be checked against a list of authorized
(i.e., accepted) or non-authorized (i.e., rejected) clients. Both
REMOTE_HOST and REMOTE_ADDR are tested so clients without a proper
HOST name can be (in-) excluded by their IP-address. Client patterns
consisting only of digits and dots are considered IP-addresses, all others
domain names. No wild-cards or regexps are allowed, only partial
addresses.<br>
Matching of names is done from the back to the front (domain first,
i.e., $REMOTE_HOST =~ /\Q$pattern\E$/is), so including ".edu" will
accept or reject all clients from the domain EDU. Matching of
IP-addresses is done from the front to the back (network part first, i.e.,
$REMOTE_ADDR =~ /^\Q$pattern\E/is), so including "128." will (in-)
exclude all clients whose IP-address starts with 128.
There are two special symbols: "-" matches HOSTs with no name and "*"
matches ALL HOSTS/clients.<br>

<P>
For those needing more expressive power, lines starting with
"-e" are evaluated by the perl eval() function. E.g.,
'-e $REMOTE_HOST =~ /\.edu$/is;' will accept/reject clients from the
domain '.edu'.
472 </P>
474 <dt>6 Access control: Files
<dd>In principle, CGIscriptor could read ANY file in the directory
tree as discussed in 2. However, for security reasons this is
restricted to text files. It can be made more restricted by entering
a global file pattern (e.g., ".html"). This is done by default.
For each client requesting access, the file pattern(s) can be made
more restrictive than the global pattern by entering client specific
file patterns in the Access Control files (see 5).
For example: if the ACCEPT file contained the lines
484 <PRE>
485 * DEMO
486 .hum.uva.nl LET
487 145.18.230.
488 </PRE>
<p>
Then all clients could request paths containing "DEMO" or "demo", e.g.,
"/my/demo/file.html" ($PATH_INFO =~ /\Q$pattern\E/). Clients from
*.hum.uva.nl could also request paths containing "LET" or "let", e.g.,
"/my/let/file.html", and clients from the local cluster
145.18.230.[0-9]+ could access ALL files.
Again, for those needing more expressive power, lines starting with
"-e" are evaluated. For instance: <br />
'-e $REMOTE_HOST =~ /\.edu$/is && $PATH_INFO =~ m@/DEMO/@is;' <br />
will accept/reject requests for files from the directory "/demo/" from
clients from the domain '.edu'.<br />
Path selections starting with ! or 'not' will be inverted. That is:
502 </p>
503 <PRE>
504 * not .wav
505 </PRE>
507 Will match all file and path names that do NOT contain '.wav'
508 </P>
510 <dt>7 Access control: Server side session tickets
511 <dd>Specific paths can be controlled by Session Tickets which must be
512 present as a CGI or Cookie value in the request. These paths
513 are defined in %TicketRequiredPatterns as pairs of:<br />
514 ('regexp' =&gt; 'SessionPath\tPasswordPath\tLogin.html\tExpiration').<br />
515 Session Tickets are stored in a separate directory (SessionPath, e.g.,
516 "Private/.Session") as files with the exact same name of the TICKET
517 variable value.
518 The following is an example of a SESSION ticket:
519 <pre>
520 Type: SESSION
521 IPaddress: 127.0.0.1
522 AllowedPaths: ^/Private/Name/
523 DeniedPaths: ^/Private/CreateUser\.
524 Expires: +3600
525 Username: test
527 </pre>
528 Other content can follow. <br />
529 <br />
It is advised that Session Tickets should expire and be deleted
after some (idle) time. The IP address should be the IP number at login, and
the ticket will be rejected if it is presented from another IP address.
AllowedPaths and DeniedPaths are perl regexps. Be careful how they match. Make sure to delimit
the names to prevent access to overlapping names, e.g., "^/Private/Rob" will also
match "/Private/Robert", however, "^/Private/Rob/" will not. Expires is the
time the ticket will remain valid after creation (file ctime). Time can be given
in s[econds] (default), m[inutes], h[ours], or d[ays], e.g., "24h" means 24 hours.
Only the <em>Type:</em> field needs to be present.<br />
<br />
Next to Session Tickets, there are four other types of ticket files:<br />
541 - LOGIN tickets store information about a current login request<br />
542 - PASSWORD tickets store account information to authorize login requests<br />
543 - IPADDRESS tickets for IP address-only checks<br />
544 - CHALLENGE tickets for challenge tasks for every request<br />
545 </p>
547 <dt>8 Query length limiting
548 <dd>The length of the Query string can be limited. If CONTENT_LENGTH is larger
549 than this limit, the request is rejected. The combined length of the
550 Query string and the POST input is checked before any processing is done.
551 This will prevent clients from overloading the scripts.
552 The actual, combined, Query Size is accessible as a variable through
553 $CGI_Content_Length.
554 </P>
557 <dt>9 Illegal filenames, paths, and protected directories
<dd>One of the primary security concerns in handling CGI-scripts is the
use of "funny" characters in the requests that con scripts into executing
malicious commands. Examples are inserting ';', null bytes, or &lt;newline&gt; characters
in URLs and filenames, followed by executable commands. A special
variable $FileAllowedChars stores a string of all allowed characters.
Any request that translates to a filename with a character OUTSIDE
this set will be rejected.<br>
In general, all (readable) files in the ServerRoot tree are accessible.
This might not be what you want. For instance, your ServerRoot directory
might be the working directory of a CVS project and contain sensitive
information (e.g., the password to get to the repository). You can block
access to these subdirectories by adding the corresponding patterns to
the $BlockPathAccess variable. For instance, $BlockPathAccess = '/CVS/'
will block any request that contains '/CVS/', i.e.:<br>
572 <pre>
573 die if $BlockPathAccess && $ENV{'PATH_INFO'} =~ m@$BlockPathAccess@;
574 </pre>
575 </P>
578 <dt>10 The execution of code blocks can be controlled in a transparent way
579 by adding IF or UNLESS conditions in the tags themselves.
580 <dd>That is, a simple check of the validity of filenames or email
581 addresses can be done before any code is executed.
582 </p>
584 </dl>
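<P>
Two of the rules in the list above, client pattern matching (item 5) and the
ticket Expires notation (item 7), can be sketched as follows. This is a
Python illustration under stated assumptions, not the Perl code of
CGIscriptor: the function names are hypothetical, a host "with no name" is
assumed to arrive as an empty string, and a leading "+" in Expires is
assumed to be optional.
</P>

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def client_matches(pattern, remote_host, remote_addr):
    """Match one access-control pattern against a client's host name and IP."""
    if pattern == "*":                      # matches all clients
        return True
    if pattern == "-":                      # matches hosts without a name (assumed "")
        return remote_host == ""
    if re.fullmatch(r"[0-9.]+", pattern):   # only digits and dots: an IP prefix
        return remote_addr.startswith(pattern)
    # Anything else is a domain name, compared from the back, ignoring case.
    return remote_host.lower().endswith(pattern.lower())


def expires_to_seconds(value):
    """Convert an Expires value such as "24h" or "+3600" to seconds."""
    match = re.fullmatch(r"\+?(\d+)\s*([a-z]*)", value.strip().lower())
    if match is None:
        raise ValueError("unrecognized Expires value: %r" % value)
    number, unit = match.groups()
    key = unit[:1] or "s"                   # seconds are the default unit
    if key not in UNIT_SECONDS:
        raise ValueError("unknown time unit: %r" % unit)
    return int(number) * UNIT_SECONDS[key]
```

<P>
The suffix and prefix comparisons correspond to the anchored regular
expressions quoted in item 5; plain string comparison is used here because
the patterns are matched literally (\Q...\E).
</P>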
586 <hr>
588 <A NAME="MANUAL"><H1 ALIGN="CENTER">USER MANUAL</H1></A>
590 <UL>
591 <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
592 <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
593 <LI><A HREF="#BINFILES">NON-HTML FILES</A>
594 <LI><A HREF="#META">THE META TAG</A>
595 <LI><A HREF="#DIV">THE DIV/INS TAG</A>
596 <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
597 <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
598 <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
599 <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
600 <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
601 <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
602 <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
603 <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
604 <LI><A HREF="#SESSIONTICKETS">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</A>
605 <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
606 <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
<LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
<LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
609 <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
610 <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
611 </UL>
613 <A NAME="INTRODUCTION"><H2 ALIGN="CENTER">INTRODUCTION</H2></A>
616 CGIscriptor removes embedded scripts, indicated by an HTML 4 type
617 &lt;SCRIPT TYPE='text/ssperl'&gt; &lt;/SCRIPT&gt; or &lt;SCRIPT TYPE='text/osshell'&gt;
618 &lt;/SCRIPT&gt; constructs. The contents of the directive are executed by
619 the PERL eval() and `` functions (in a separate name space). The
620 result of the eval() function replaces the &lt;SCRIPT&gt; &lt;/SCRIPT&gt; construct
621 in the output file. You can use the values that are delivered in
622 CGI-compliant form (i.e., the "?name=value&.." type URL additions)
623 transparently as "$name" variables in your directives after they are
624 defined in a &lt;META&gt; or &lt;SCRIPT&gt; tag.
625 If you define the variable "$CGIscriptorResults" in a CGI attribute, all
626 subsequent &lt;SCRIPT&gt; and &lt;META&gt; results (including the defining
627 tag) will also be pushed onto a stack: @CGIscriptorResults. This list
628 behaves like any other, ordinary list and can be manipulated.
629 </P>
<P>
Both GET and POST requests are accepted. These two methods are treated
equally. Variables, i.e., those values that are determined when a file is
634 processed, are indicated in the CGI attribute by $&lt;name&gt; or
635 $&lt;name&gt;=&lt;default&gt; in which &lt;name&gt; is the name of the
636 variable and &lt;default&gt; is the value used when there is NO current CGI
637 value for &lt;name&gt; (you can use white-spaces in
638 $&lt;name&gt;=&lt;default&gt; but really DO make sure that the default
639 value is followed by white space or is quoted). Names can contain any
640 alphanumeric characters and _ (i.e., names match /[\w]+/).<br>
641 If the <i>Content-type:</i> is 'multipart/*', the input is treated as a
642 MIME multipart message and automatically delimited. CGI variables get the
643 "raw" (i.e., undecoded) body of the corresponding message part.
644 </P>
647 Variables can be CGI variables, i.e., those from the QUERY_STRING,
648 environment variables, e.g., REMOTE_USER, REMOTE_HOST, or REMOTE_ADDR,
649 or predefined values, e.g., CGI_Decoded_QS (The complete, decoded,
650 query string), CGI_Content_Length (the length of the decoded query
651 string), CGI_Year, CGI_Month, CGI_Time, and CGI_Hour (the current
652 date and time).
653 </P>
656 All these are available when defined in a CGI attribute. All environment
657 variables are accessible as $ENV{'name'}. So, to access the REMOTE_HOST
658 and the REMOTE_USER, use, e.g.:
659 </P>
661 <PRE>
662 &lt;SCRIPT TYPE='text/ssperl'&gt;
663 ($ENV{'REMOTE_HOST'}||"-")." $ENV{'REMOTE_USER'}"
664 &lt;/SCRIPT&gt;
665 </PRE>
668 (This will print a "-" if REMOTE_HOST is not known)
669 Another way to do this is:
670 </P>
672 <PRE>
673 &lt;META CONTENT="text/ssperl; CGI='$REMOTE_HOST = - $REMOTE_USER'"&gt;
674 &lt;SCRIPT TYPE='text/ssperl'&gt;"$REMOTE_HOST $REMOTE_USER"&lt;/SCRIPT&gt;
675 </PRE>
679 <PRE>
680 &lt;META CONTENT='text/ssperl; CGI="$REMOTE_HOST = - $REMOTE_USER"
681 SRC={"$REMOTE_HOST $REMOTE_USER\n"}'&gt;
682 </PRE>
685 This is possible because ALL environment variables are available as
686 CGI variables. The environment variables take precedence over CGI
687 names in case of a "name clash". For instance:
688 </P>
690 <PRE>
691 &lt;META CONTENT="text/ssperl; CGI='$HOME' SRC={$HOME}"&gt;
692 </PRE>
<P>
Will print the current HOME directory (environment) irrespective of whether
there is a CGI variable from the query
(e.g., Where do you live? &lt;INPUT TYPE="TEXT" NAME="HOME"&gt;).
THIS IS A SECURITY FEATURE. It prevents clients from changing
the values of defined environment variables (e.g., by supplying
a bogus $REMOTE_ADDR). Although $ENV{} itself is not changed by the META
tags, letting a query override such names would make the use of declared
variables insecure. You can still access CGI variables after a name clash with
CGIscriptor::CGIparseValue(&lt;name&gt;).
704 </P>
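<P>
The precedence rule described above (environment first, then the query,
then the declared default) can be sketched as follows. This is a Python
illustration only; resolve_declared is a hypothetical name and the two
dictionaries stand in for the environment and the parsed query string.
</P>

```python
def resolve_declared(name, environment, query, default=""):
    """Return the value a declared variable would get.

    Environment values win over client-supplied query values, so a
    request cannot overwrite names such as REMOTE_ADDR or HOME.
    """
    if name in environment:
        return environment[name]
    if name in query:
        return query[name]
    return default
```

<P>
With environment = {"HOME": "/home/www"} and query = {"HOME": "Amsterdam"},
resolve_declared("HOME", environment, query) returns "/home/www".
</P>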
707 Some CGI variables are present several times in the query string
708 (e.g., from multiple selections). These should be defined as
709 @VARIABLENAME=default in the CGI attribute. The list @VARIABLENAME
710 will contain ALL VARIABLENAME values from the query, or a single
711 default value. If there is an ENVIRONMENT variable of the
712 same name, it will be used instead of the default AND the query
713 values. The corresponding function is
714 CGIscriptor::CGIparseValueList(&lt;name&gt;)
715 </P>
<P>
CGI variables collected in a @VARIABLENAME list are unordered.
When more structured variables are needed, a hash table can be used.
A variable defined as %VARIABLE=default will collect all
CGI-parameter values whose names start with 'VARIABLE' in a hash table
with the remainder of the name as a key. For instance, %PERSON will
723 collect PERSONname='John Doe', PERSONbirthdate='01 Jan 00', and
724 PERSONspouse='Alice' into a hash table %PERSON such that
725 $PERSON{'spouse'} equals 'Alice'. Any default value or environment
726 value will be stored under the "" key. If there is an ENVIRONMENT
727 variable of the same name, it will be used instead of the default
728 AND the query values. The corresponding function is
729 CGIscriptor::CGIparseValueHash(&lt;name&gt;)
730 </P>
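<P>
The collection of prefixed parameters into a hash table can be sketched as
follows. This is a Python illustration, not the implementation of
CGIscriptor::CGIparseValueHash; collect_hash is a hypothetical name, and
each parameter is assumed to occur once in the query.
</P>

```python
def collect_hash(prefix, environment, query, default=None):
    """Collect query parameters whose names start with prefix.

    The remainder of each name becomes the key. A default value is
    stored under the "" key. An environment variable named exactly
    like the prefix replaces both the default and the query values.
    """
    if prefix in environment:
        return {"": environment[prefix]}
    collected = {}
    if default is not None:
        collected[""] = default
    for name, value in query.items():
        if name.startswith(prefix):
            collected[name[len(prefix):]] = value
    return collected
```

<P>
For the %PERSON example above, a query carrying PERSONname, PERSONbirthdate,
and PERSONspouse yields a table with the keys 'name', 'birthdate', and
'spouse'.
</P>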
<P>
This method of first declaring your environment and CGI variables
before being able to use them in the scripts might seem somewhat
clumsy, but it protects you from inadvertently printing out the values of
system environment variables when their names coincide with those used
in the CGI forms. It also prevents "clients" from supplying CGI parameter
values for your private variables.
739 THIS IS A SECURITY FEATURE!
740 </P>
742 <A NAME="NON-HTML"><H2 ALIGN="CENTER">NON-HTML CONTENT TYPES</H2></A>
745 Normally, CGIscriptor prints the standard "Content-type: text/html\n\n"
746 message before anything is printed. This has been extended to include
747 plain text (.txt) files, for which the Content-type (MIME type)
748 'text/plain' is printed. In all other respects, text files are treated as
749 HTML files (this can be switched off by removing '.txt' from the
750 $FilePattern variable). When the content type should be something else,
751 e.g., with multipart files, use the $RawFilePattern (.xmr, see also next
752 item). CGIscriptor will not print a Content-type message for this file type
753 (which must supply its OWN Content-type message). Raw files must still
754 conform to the &lt;SCRIPT&gt;&lt;/SCRIPT&gt; and &lt;META&gt; tag
755 specifications.
756 </P>
758 <A NAME="BINFILES"><H2 ALIGN="CENTER">NON-HTML FILES</H2></A>
<P>
CGIscriptor is intended to process HTML and text files only. You can
create documents of any mime-type on-the-fly using "raw" text files, e.g.,
with the .xmr extension. However, CGIscriptor will not process binary files
of any type, e.g., pictures or sounds. Given the sheer number of formats, I
do not have any intention to do so. However, an escape route has been
provided. You can construct a genuine raw (.xmr) text file that contains
the perl code to service any file type you want. If the global
$BinaryMapFile variable contains the path to this file (e.g.,
/BinaryMapFile.xmr), this file will be called whenever an unsupported
(non-HTML) file type is requested. The path to the requested binary file
is stored in $ENV{'CGI_BINARY_FILE'} and can be used like any other
CGI-variable. Servicing binary files then becomes supplying the correct
Content-type (e.g., print "Content-type: image/jpeg\n\n";) and reading the
file and writing it to STDOUT (e.g., using sysread() and syswrite()).
775 </P>
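<P>
A minimal sketch of what the Perl code in such a raw file could do is
given below. The subroutine name serve_binary_file, the 8192 byte buffer
and the mapping of .jpg files to image/jpeg are assumptions chosen for
this example only. The header is written with syswrite() as well, so
buffered and unbuffered output are not mixed on one file handle.
</P>
<PRE>
use strict;
use warnings;

# Hypothetical helper: write a Content-type header followed by the raw
# bytes of $path to the file handle $out. Returns 0 if $path is unreadable.
sub serve_binary_file {
    my ($path, $content_type, $out) = @_;
    open(my $in, '&lt;', $path) or return 0;
    binmode($in);
    binmode($out);
    syswrite($out, "Content-type: $content_type\n\n");
    my $buffer;
    while (my $count = sysread($in, $buffer, 8192)) {
        syswrite($out, $buffer, $count);
    }
    close($in);
    return 1;
}

# Assumed use inside the raw (.xmr) file; the type mapping is illustrative.
my $file = $ENV{'CGI_BINARY_FILE'};
if (defined $file and $file =~ /\.jpe?g$/i) {
    serve_binary_file($file, 'image/jpeg', \*STDOUT);
}
</PRE>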
<A NAME="META"><H2 ALIGN="CENTER">THE META TAG</H2></A>
<P>
All attributes of a META tag are ignored, except the
CONTENT='text/ssperl; CGI=" ... " [SRC=" ... "]' attribute. The string
inside the quotes following the CONTENT= indication (white-space is
ignored, "'` (){}[]-quotes are allowed, plus their \ versions) MUST
start with any of the CGIscriptor mime-types (e.g.: text/ssperl or
text/osshell) and a comma or semicolon.
The quoted string following CGI= contains a white-space separated list
of declarations of the CGI (and Environment) values and default values
used when no CGI values are supplied by the query string.
</P>
<P>
If the default value is a longer string containing special characters,
possibly spanning several lines, the string must be enclosed in quotes.
You may use any pair of quotes or brackets from the list '', "", ``, (),
[], or {} to distinguish default values (or preceded by \, e.g., \(...\)
is different from (...)). The outermost pair will always be used and any
other quotes inside the string are considered to be part of the string
value, e.g.,
</P>
<PRE>
$Value = {['this'
"and" (this)]}
</PRE>
<P>
will result in $Value getting the default value
</P>
<PRE>
['this'
"and" (this)]
</PRE>
<P>
(NOTE that the newline is part of the default value!).
</P>
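<P>
The "outermost pair" rule can be sketched as follows. The helper
strip_outer_quotes is hypothetical and only illustrates the rule for a
value that has already been isolated; it does not handle the \-preceded
variants and is not the parser used by CGIscriptor.
</P>
<PRE>
use strict;
use warnings;

# Closing character for every supported opening quote or bracket.
my %closing = ("'" =&gt; "'", '"' =&gt; '"', '`' =&gt; '`',
               '(' =&gt; ')', '[' =&gt; ']', '{' =&gt; '}');

# Hypothetical helper: remove only the outermost quote or bracket pair.
# Everything inside it, including other quotes, is kept as part of the value.
sub strip_outer_quotes {
    my ($text) = @_;
    my $open = substr($text, 0, 1);
    return $text unless length($text) &gt; 1 and exists $closing{$open};
    return $text unless substr($text, -1) eq $closing{$open};
    return substr($text, 1, length($text) - 2);
}

my $Value = strip_outer_quotes(qq({['this'\n"and" (this)]}));
print "$Value\n";    # prints the two lines ['this' and "and" (this)]
</PRE>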
<P>
Internally, for defining and initializing CGI (ENV) values, the META
and SCRIPT tags use the function "defineCGIvariable($name, $default)"
(scalars) and "defineCGIvariableList($name, $default)" (lists).
These functions can be used inside scripts as
"CGIscriptor::defineCGIvariable($name, $default)" and
"CGIscriptor::defineCGIvariableList($name, $default)".
</P>
<P>
The CGI attribute will be processed identically when used inside
the &lt;SCRIPT&gt; tag. However, this use is not according to the
HTML 4.0 specifications of the W3C.
</P>
<A NAME="DIV"><H2 ALIGN="CENTER">THE DIV/INS TAG</H2></A>
<P>
There is a problem when constructing html files containing
server-side perl scripts with standard HTML tools. These
tools will refuse to process any text between
&lt;SCRIPT&gt;&lt;/SCRIPT&gt;
tags. This is quite annoying when you want to use large
HTML templates where you will fill in values.
</P>
<P>
For this purpose, CGIscriptor will read the neutral
&lt;DIV CLASS="ssperl" ID="varname"&gt;&lt;/DIV&gt; or
&lt;INS CLASS="ssperl" ID="varname"&gt;&lt;/INS&gt;
tag (in Cascading Style Sheet manner). Note that "varname" has
NO '$' before it; it is a bare name. Any text between
these &lt;DIV ...&gt;&lt;/DIV&gt; or
&lt;INS ...&gt;&lt;/INS&gt; tags will be assigned
to '$varname' as is (i.e., as a literal). No
processing or interpolation will be performed.
There is also NO nesting possible. Do NOT nest
&lt;/DIV&gt; inside a &lt;DIV&gt;&lt;/DIV&gt;!
Moreover, DIV tags do NOT ensure a block structure in
the final rendering (i.e., no empty lines).
</P>
<P>
Note that &lt;DIV CLASS="ssperl" ID="varname"/&gt;
is handled the XML way. No content is processed,
but varname is defined, and any SRC directives are
processed.
</P>
<P>
You can use $varname like any other variable name.
However, $varname is NOT a CGI variable and will be
completely internal to your script. There is NO
interaction between $varname and the outside world.
</P>
<P>
To interpolate a DIV derived text, you can use:
</P>
<pre>
$varname =~ s/([\]])/\\$1/g;      # Mark ']'-quotes
$varname = eval("qq[$varname]");  # Interpolate all values
</pre>
<p>
The DIV tag will process IF, UNLESS, CGI and SRC attributes.
The SRC files will be pre-pended to the body
text of the tag.
</p>
<A NAME="IFUNLESS"><H2 ALIGN="CENTER">
CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES
</H2></A>
<p>
It is often necessary to include code-blocks that should be executed
conditionally, e.g., only for certain browsers or operating systems.
Furthermore, quite often sanity and security checks are necessary
before user (form) data can be processed, e.g., with respect to
email addresses and filenames.
</p>
<p>
Checks added to the code are often difficult to find, interpret or
maintain and in general mess up the code flow. This kind of confusion
is dangerous. Also, for many of the supported "foreign" scripting
languages, adding these checks is cumbersome or even impossible.
</p>
<p>
As a uniform method for asserting the correctness of "context", two
attributes are added to all supported tags: IF and UNLESS.
They both evaluate their value and block execution when the
result is &lt;FALSE&gt; (IF) or &lt;TRUE&gt; (UNLESS) in Perl, e.g.,
UNLESS='$NUMBER \&gt; 100;' blocks execution if $NUMBER &gt; 100. Note that
the backslash in the '\&gt;' is removed and only used to differentiate
this conditional '&gt;' from the tag-closing '&gt;'. For symmetry, the
backslash in '\&lt;' is also removed. Inside these conditionals,
~/ and ./ are expanded to their respective directory root paths.
</p>
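<p>
The decision rule can be sketched as below. The subroutine
tag_is_enabled and its argument list are hypothetical; in particular,
passing $NUMBER as a parameter only serves to make the example
self-contained, and the expansion of ~/ and ./ is left out.
</p>
<pre>
use strict;
use warnings;

# Hypothetical sketch: returns 1 when a tag should be executed, 0 when
# IF or UNLESS blocks it. Conditions are Perl expressions in strings.
sub tag_is_enabled {
    my ($if, $unless, $NUMBER) = @_;
    foreach my $condition ($if, $unless) {
        next unless defined $condition;
        $condition =~ s/\\([&lt;&gt;])/$1/g;    # remove the protecting backslash
    }
    return 0 if defined $if     and not eval($if);       # IF blocks on FALSE
    return 0 if defined $unless and     eval($unless);   # UNLESS blocks on TRUE
    return 1;
}

print tag_is_enabled(undef, '$NUMBER \&gt; 100;', 50), "\n";     # prints: 1
print tag_is_enabled(undef, '$NUMBER \&gt; 100;', 150), "\n";    # prints: 0
</pre>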
<p>
For example, the following tag will be ignored when the filename is
invalid:
</p>
<pre>
&lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME'
IF='CGIscriptor::CGIsafeFileName($FILENAME);'&gt;
&lt;/SCRIPT&gt;
</pre>
<p>
The IF and UNLESS values must be quoted. The same quotes are supported
as with the other attributes. The SRC attribute is ignored when IF or
UNLESS blocks execution.
</p>
<A NAME="SRC"><H2 ALIGN="CENTER">
THE MAGIC SOURCE ATTRIBUTE (SRC=)</H2></A>
<P>
The SRC attribute inside tags accepts a list of filenames and URLs
separated by commas (",") or semicolons (";").
</P>
<P>
ALL the variable values defined in the CGI attribute are available in
@ARGV as if the file was executed from the command line, in
the exact order in which they were declared in the preceding CGI
attribute.
</P>
<P>
First, a SRC={}-block will be evaluated as if the code inside the
block was part of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; construct, i.e.,
"print do { code };'';" or `code` (i.e., SAFEqx('code')).
Only a single block is evaluated. Note that this is processed less
efficiently than &lt;SCRIPT&gt; &lt;/SCRIPT&gt; blocks. The type of evaluation
depends on the content-type: Perl for text/ssperl and OS shell for
text/osshell. For other mime types (scripting languages), anything in
the source block is put in front of the code block "inside" the tag.
</P>
<P>
Second, executable files (i.e., -x filename != 0) are evaluated as:
print `filename \'$ARGV[0]\' \'$ARGV[1]\' ...`
That is, you can actually call executables safely from the SRC tag.
</P>
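<P>
How such a command line with ''-quoted arguments can be put together is
sketched below. The helper quoted_command is hypothetical, and the way an
embedded ' is written as '\'' is an assumption of this sketch, not
necessarily what CGIscriptor does internally.
</P>
<PRE>
use strict;
use warnings;

# Hypothetical helper: build "filename 'arg1' 'arg2' ..." for the shell.
# An embedded ' is written as '\'' so it cannot end the quoting early.
sub quoted_command {
    my ($filename, @arguments) = @_;
    my @quoted = map { (my $copy = $_) =~ s/'/'\\''/g; "'$copy'" } @arguments;
    return join(' ', $filename, @quoted);
}

print quoted_command('./Statistics/SignTest.pl', 8, 22), "\n";
# prints: ./Statistics/SignTest.pl '8' '22'
</PRE>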
<P>
Third, text files that match the file pattern, used by CGIscriptor to
check whether files should be processed ($FilePattern), are
processed in-line (i.e., recursively) by CGIscriptor as if the code
was inserted in the original source file. Recursions, i.e., calling
a file inside itself, are blocked. If you need them, you have to code
them explicitly using "main::ProcessFile($file_path)".
</P>
<P>
Fourth, Perl text files (i.e., -T filename != 0) are evaluated as:
"do FileName;'';".
</P>
<P>
Last, URLs (i.e., starting with 'HTTP://', 'FTP://', 'GOPHER://', 'TELNET://',
'WHOIS://' etc.) are loaded and printed. The loading and handling of &lt;BASE&gt;
and document header is done by main::GET_URL($URL [, 0]). You can enter your own
code (default is <i>curl</i>, <i>snarf</i>, or <i>wget</i> and some
post-processing to add a &lt;BASE&gt; tag).
</P>
<P>
There are two pseudo-file names: PREFIX and POSTFIX. These implement
a switch from prefixing the SRC code/files (PREFIX, default) before the content of
the tag to appending the code after the content of the tag (POSTFIX). The switches
are done in the order in which the PREFIX and POSTFIX labels are encountered.
You can mix PREFIX and POSTFIX labels in any order with the SRC files.
Note that the ORDER of file execution is determined for prefixed and
postfixed files separately.
</P>
<P>
File paths can be preceded by the URL protocol prefix "file://". This
is simply STRIPPED from the name.
</P>
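<P>
The effect of the PREFIX and POSTFIX labels on a SRC list can be sketched
as follows. The subroutine split_src_list and the file names are made up
for this example; the real SRC handling does considerably more (blocks,
executables, URLs).
</P>
<PRE>
use strict;
use warnings;

# Hypothetical sketch: split a SRC list into the entries that go before
# the tag content (PREFIX, the default) and those that go after (POSTFIX).
sub split_src_list {
    my ($src) = @_;
    my (@prefix, @postfix);
    my $target = \@prefix;
    foreach my $entry (split(/\s*[,;]\s*/, $src)) {
        $entry =~ s/^\s+|\s+$//g;
        next if $entry eq '';
        if    ($entry eq 'PREFIX')  { $target = \@prefix;  next; }
        elsif ($entry eq 'POSTFIX') { $target = \@postfix; next; }
        $entry =~ s!^file://!!;       # the file:// prefix is simply stripped
        push(@$target, $entry);
    }
    return (\@prefix, \@postfix);
}

my ($before, $after) =
    split_src_list('header.html, POSTFIX, footer.html; PREFIX; file://menu.html');
print "before: @$before\n";    # before: header.html menu.html
print "after: @$after\n";      # after: footer.html
</PRE>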
<P>
Example:
</P>
<P>
The request
"http://cgi-bin/Action_Forms.pl/Statistics/Sign_Test.html?positive=8&amp;negative=22"
will result in printing "${SS_PUB}/Statistics/Sign_Test.html"
with QUERY_STRING = "positive=8&amp;negative=22"
</P>
<P>
on encountering the lines:
</P>
<PRE>
&lt;META CONTENT="text/osshell; CGI='$positive=11 $negative=3'"&gt;
&lt;b&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./Statistics/SignTest.pl"&gt;
&lt;/SCRIPT&gt;&lt;/b&gt;&lt;p&gt;
</PRE>
<P>
This line will be processed as:
</P>
<PRE>
"&lt;b&gt;`${SS_SCRIPT}/Statistics/SignTest.pl '8' '22'`&lt;/b&gt;&lt;p&gt;"
</PRE>
<P>
Here, "${SS_SCRIPT}/Statistics/SignTest.pl" is an executable script.
This line will end up printed as:
</P>
<PRE>
"&lt;b&gt;p &lt;= 0.0161&lt;/b&gt;&lt;p&gt;"
</PRE>
<P>
Note that the META tag itself will never be printed, and is invisible to
the outside world.
</P>
<P>
The SRC files in a DIV/INS tag will be added (pre-pended) to the body
of the &lt;DIV&gt;&lt;/DIV&gt; tag. Blocks are NOT executed!
</P>
<A NAME="ROOT"><H2 ALIGN="CENTER">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</H2></A>
<P>
Inside &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, filepaths starting
with "~/" are replaced by "$YOUR_HTML_FILES/". This way, files in the
public directories can be accessed without direct reference to the
actual paths. Filepaths starting with "./" are replaced by
"$YOUR_SCRIPTS/"; this should only be used for scripts.
The "$YOUR_SCRIPTS" directory is added to @INC so, e.g., the
'require' command will load from the "$YOUR_SCRIPTS" directory.
</P>
<P>
<b>Note:</b> this replacement can seriously affect Perl scripts. Watch
out for constructs like $a =~ s/aap\./noot./g; use
$a =~ s@aap\.@noot.@g instead.
</P>
<P>
CGIscriptor.pl will assign the values of $SS_PUB and $SS_SCRIPT
(i.e., $YOUR_HTML_FILES and $YOUR_SCRIPTS) to the environment variables
$SS_PUB and $SS_SCRIPT. These can be accessed by the scripts that are
executed. The "$SS_SCRIPT" ($YOUR_SCRIPTS) directory is added to
@INC so, e.g., the 'require' command will load from the "$SS_SCRIPT"
directory.<br>
Values not preceded by $, ~/, or ./ are used as literals.
</P>
<A NAME="OSSHELL"><H2 ALIGN="CENTER">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</H2></A>
<P>
OS scripts are executed by a "safe" version of the `` operator (i.e.,
SAFEqx(), see also below) and any output is printed. CGIscriptor will
interpolate the script and replace all user-supplied CGI-variables by
their ''-quoted values (actually, all variables defined in CGI attributes are
quoted). Other Perl variables are interpolated in a simple fashion, i.e.,
$scalar by their value, @list by join(' ', @list), and %hash by their
name=value pairs. Complex references, e.g., @$variable, are all
evaluated in a scalar context. Quotes should be used with care.
NOTE: the results of the shell script evaluation will appear in the
@CGIscriptorResults stack just as any other result.
</P>
<P>
All occurrences of $@% that should NOT be interpolated must be
preceded by a "\". Interpolation can be switched off completely by
setting $CGIscriptor::NoShellScriptInterpolation = 1
(set to 0 or undef to switch interpolation on again),
e.g.,
</P>
<PRE>
&lt;SCRIPT TYPE="text/ssperl"&gt;
$CGIscriptor::NoShellScriptInterpolation = 1;
&lt;/SCRIPT&gt;
</PRE>
<A NAME="TRANSLATIONS">
<H2 ALIGN="CENTER">RUN TIME TRANSLATION OF INPUT FILES</H2>
</A>
<p>
Run time translation allows general and global conversions of files using
regular expressions. This is very handy (but costly) for rewriting legacy
pages to a new format.
Select the files to use it on with <br>
my $TranslationPaths = 'filepattern';<br>
For efficiency, define<br>
$TranslationPaths = '';<br>
when not using translations.
Each translation is a general regular expression pair: [$pattern, $replacement]
</p>
<p>
Define:</p>
<pre>
my $TranslationPaths = 'filepattern'; # Pattern matching PATH_INFO

push(@TranslationTable, ['pattern', 'replacement']);
# e.g. (for Ruby Rails):
push(@TranslationTable, ['&lt;%=', '&lt;SCRIPT TYPE="text/ssruby"&gt;']);
push(@TranslationTable, ['%&gt;', '&lt;/SCRIPT&gt;']);

# Runs:
my $currentRegExp;
foreach $currentRegExp (@TranslationTable)
{
    my ($pattern, $replacement) = @$currentRegExp;
    $$text =~ s!$pattern!$replacement!msg;
}
</pre>
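<p>
As a worked example, the two Ruby Rails entries can be applied to a
single line of input. The subroutine name translate_text and the sample
text are assumptions made for this sketch; it works on a copy of the text
instead of the $$text reference used above.
</p>
<pre>
use strict;
use warnings;

my @TranslationTable;
push(@TranslationTable, ['&lt;%=', '&lt;SCRIPT TYPE="text/ssruby"&gt;']);
push(@TranslationTable, ['%&gt;',  '&lt;/SCRIPT&gt;']);

# Apply every [pattern, replacement] pair to the text, in table order.
sub translate_text {
    my ($text) = @_;
    foreach my $currentRegExp (@TranslationTable) {
        my ($pattern, $replacement) = @$currentRegExp;
        $text =~ s!$pattern!$replacement!msg;
    }
    return $text;
}

print translate_text('Hello &lt;%= @name %&gt;!'), "\n";
# prints: Hello &lt;SCRIPT TYPE="text/ssruby"&gt; @name &lt;/SCRIPT&gt;!
</pre>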
<A NAME="LANGUAGES">
<H2 ALIGN="CENTER">EVALUATION OF OTHER SCRIPTING LANGUAGES</H2>
</A>
<P>
Adding a MIME-type and an interpreter command to
%ScriptingLanguages will automatically catch any other
scripting language in the standard
&lt;SCRIPT TYPE="[mime]"&gt;&lt;/SCRIPT&gt; manner.
E.g., adding: $ScriptingLanguages{'text/sspython'} = 'python';
will actually execute the following code in an HTML page
(ignore 'REMOTE_HOST' for the moment):
</P>
<PRE>
&lt;SCRIPT TYPE="text/sspython"&gt;
# A Python script
x = ["A","real","python","script","Hello","World","and", REMOTE_HOST]
print x[4:8] # Prints the list ["Hello","World","and", REMOTE_HOST]
&lt;/SCRIPT&gt;
</PRE>
<P>
The script code is NOT interpolated by perl, EXCEPT for those
interpreters that cannot handle variables themselves.
Currently, several interpreters are pre-installed:
</P>
<PRE>
Perl test    - "text/testperl"   =&gt; 'perl',
Python       - "text/sspython"   =&gt; 'python',
Ruby         - "text/ssruby"     =&gt; 'ruby',
Tcl          - "text/sstcl"      =&gt; 'tcl',
Awk          - "text/ssawk"      =&gt; 'awk -f-',
Gnu Lisp     - "text/sslisp"     =&gt; 'rep | tail +5 '.
               # "| egrep -v '&gt; |^rep. |^nil\\\$'",
Gnu Prolog   - "text/ssprolog"   =&gt; 'gprolog',
M4 macro's   - "text/ssm4"       =&gt; 'm4',
Bourne shell - "text/sh"         =&gt; 'sh',
Bash         - "text/bash"       =&gt; 'bash',
C-shell      - "text/csh"        =&gt; 'csh',
Korn shell   - "text/ksh"        =&gt; 'ksh',
Praat        - "text/sspraat"    =&gt; "praat - | sed 's/Praat &gt; //g'",
R            - "text/ssr"        =&gt; "R --vanilla --slave | sed 's/^[\[0-9\]*] //g'",
REBOL        - "text/ssrebol"    =&gt;
               "rebol --quiet|egrep -v '^[&gt; ]* == '|sed 's/^\s*\[&gt; \]* //g'",
PostgreSQL   - "text/postgresql" =&gt; 'psql 2&gt;/dev/null',
(psql)
</PRE>
<P>
Note that the "value" of $ScriptingLanguages{mime} must be a command
that reads Standard Input and writes to standard output. Any extra
output of interactive interpreters (banners, echoes, prompts)
should be removed by piping the output through 'tail', 'grep',
'sed', or even 'awk' or 'perl'.
</P>
<P>
For access to CGI variables there is a special hash table:
%ScriptingCGIvariables.
CGI variables can be accessed in three ways.
</P>
<dl>
<dt>1. If the mime type is not present in %ScriptingCGIvariables,
nothing is done and the script itself should parse the relevant
environment variables.
<dt>2. If the mime type IS present in %ScriptingCGIvariables, but its
value is empty, e.g., $ScriptingCGIvariables{"text/sspraat"} = '';,
the script text is interpolated by perl. That is, all $var, @array,
%hash, and \-slashes are replaced by their respective values.
<dt>3. In all other cases, the CGI and environment variables are added
in front of the script according to the format stored in
%ScriptingCGIvariables. That is, the following (pseudo-)code is
executed for each CGI- or Environment variable defined in the CGI-tag:
printf(INTERPRETER, $ScriptingCGIvariables{$mime}, $CGI_NAME, $CGI_VALUE);
</dl>
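<P>
The three cases can be sketched with sprintf(). The helper
cgi_variable_prefix, the mime type "text/x-none" and the choice to return
the definitions as a string (instead of printing them to the interpreter)
are assumptions made for this example; values are used as they are,
without the cleaning that CGIscriptor applies.
</P>
<PRE>
use strict;
use warnings;

# A few formats as listed in this manual; the table is deliberately short.
my %ScriptingCGIvariables = (
    "text/sspython" =&gt; "\%s = '\%s'",
    "text/sstcl"    =&gt; 'set %s "%s"',
    "text/sspraat"  =&gt; '',
);

# Hypothetical helper. Returns undef when the mime type is unknown (case 1),
# '' when the script text itself is interpolated (case 2), and otherwise
# one definition line per variable (case 3).
sub cgi_variable_prefix {
    my ($mime, @pairs) = @_;        # @pairs = (name, value, name, value, ...)
    return unless exists $ScriptingCGIvariables{$mime};
    my $format = $ScriptingCGIvariables{$mime};
    return '' if $format eq '';
    my $prefix = '';
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $prefix .= sprintf($format, $name, $value) . "\n";
    }
    return $prefix;
}

print cgi_variable_prefix('text/sstcl', 'positive', 8, 'negative', 22);
# prints two lines: set positive "8" and set negative "22"
</PRE>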
<P>
For instance, "text/testperl" =&gt; '$%s = "%s";' defines variable
definitions for Perl, and "text/sspython" =&gt; '%s = "%s"' for Python
(note that these definitions are not safe, the real ones contain '-quotes).
</P>
<P>
THIS WILL NOT WORK FOR @VARIABLES, the (empty) $VARIABLES will be used
instead.
</P>
<P>
The $CGI_VALUE parameters are "shrubbed" of all control characters
and quotes (by &amp;shrubCGIparameter($CGI_VALUE)). Control characters
are replaced by \0&lt;octal ascii value&gt; and quotes by their HTML character
value (&#8217; -&gt; &amp;#8217;, &#8216; -&gt; &amp;#8216;,
&quot; -&gt; &amp;quot;). For example:
if a client would supply the string value (in standard perl)
</P>
<PRE>"/dev/null';\nrm -rf *;\necho '"</PRE>
<P>
it would be processed as
</P>
<PRE>'/dev/null&amp;#8217;;\012rm -rf *;\012echo &amp;#8217;'</PRE>
<P>
(e.g., sh or bash would process the latter more according to your
intentions).<br>
If your interpreter requires different protection measures, you will
have to supply these in %main::SHRUBcharacterTR (string =&gt; translation),
e.g.,
</P>
<PRE>
$SHRUBcharacterTR{"\'"} = "&amp;#8217;";
</PRE>
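<P>
The cleaning rules described above can be approximated in a few lines.
This is a hypothetical re-implementation for illustration, NOT the code
of shrubCGIparameter(); the name shrub_value and the fixed table of three
quote characters are assumptions.
</P>
<PRE>
use strict;
use warnings;

# Quotes and the HTML character values that replace them.
my %quote_entity = ("'" =&gt; '&amp;#8217;', '`' =&gt; '&amp;#8216;', '"' =&gt; '&amp;quot;');

# Hypothetical sketch of the rules: control characters become
# \0 followed by their octal ascii value, quotes become HTML entities.
sub shrub_value {
    my ($value) = @_;
    $value =~ s/([\x00-\x1F])/sprintf("\\0%o", ord($1))/ge;
    $value =~ s/(['`"])/$quote_entity{$1}/g;
    return $value;
}

print shrub_value("/dev/null';\nrm -rf *;\necho '"), "\n";
# prints: /dev/null&amp;#8217;;\012rm -rf *;\012echo &amp;#8217;
</PRE>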
<P>
Currently, the following definitions are used:
</P>
<PRE>
%ScriptingCGIvariables = (
    "text/testperl" =&gt; "\$\%s = '\%s';", # Perl $VAR = 'value' (for testing)
    "text/sspython" =&gt; "\%s = '\%s'", # Python VAR = 'value'
    "text/ssruby" =&gt; '@%s = "%s"', # Ruby @VAR = "value"
    "text/sstcl" =&gt; 'set %s "%s"', # TCL set VAR "value"
    "text/ssawk" =&gt; '%s = "%s";', # Awk VAR = "value";
    "text/sslisp" =&gt; '(setq %s "%s")', # Gnu lisp (rep) (setq VAR "value")
    "text/ssprolog" =&gt; '', # Gnu prolog (interpolated)
    "text/ssm4" =&gt; "define(`\%s', `\%s')", # M4 macro's define(`VAR', `value')
    "text/sh" =&gt; "\%s='\%s';", # Bourne shell VAR='value';
    "text/bash" =&gt; "\%s='\%s';", # Bourne-again shell VAR='value';
    "text/csh" =&gt; "\$\%s = '\%s';", # C shell $VAR = 'value';
    "text/ksh" =&gt; "\$\%s = '\%s';", # Korn shell $VAR = 'value';
    "text/sspraat" =&gt; '', # Praat (interpolation)
    "text/ssr" =&gt; '%s &lt;- "%s";', # R VAR &lt;- "value";
    "text/ssrebol" =&gt; '%s: copy "%s"', # REBOL VAR: copy "value"
    "text/postgresql" =&gt; '', # PostgreSQL (interpolation)
    "" =&gt; ""
);
</PRE>
<P>
Four tables allow fine-tuning of the interpreter with code that should be
added before and after each code block:
</P>
<P>
Code added before each script block
</P>
<PRE>
%ScriptingPrefix = (
    "text/testperl" =&gt; "\# Prefix Code;", # Perl script testing
    "text/ssm4" =&gt; 'divert(0)' # M4 macro's (open STDOUT)
);
</PRE>
<P>
Code added at the end of each script block
</P>
<PRE>
%ScriptingPostfix = (
    "text/testperl" =&gt; "\# Postfix Code;", # Perl script testing
    "text/ssm4" =&gt; 'divert(-1)' # M4 macro's (block STDOUT)
);
</PRE>
<P>
Initialization code, inserted directly after opening (NEVER interpolated)
</P>
<PRE>
%ScriptingInitialization = (
    "text/testperl" =&gt; "\# Initialization Code;", # Perl script testing
    "text/ssawk" =&gt; 'BEGIN {', # Server Side awk scripts
    "text/sslisp" =&gt; '(prog1 nil ', # Lisp (rep)
    "text/ssm4" =&gt; 'divert(-1)' # M4 macro's (block STDOUT)
);
</PRE>
<P>
Cleanup code, inserted before closing (NEVER interpolated)
</P>
<PRE>
%ScriptingCleanup = (
    "text/testperl" =&gt; "\# Cleanup Code;", # Perl script testing
    "text/sspraat" =&gt; 'Quit',
    "text/ssawk" =&gt; '};', # Server Side awk scripts
    "text/sslisp" =&gt; '(princ "\n" standard-output)).', # Closing print to rep
    "text/postgresql" =&gt; '\q',
);
</PRE>
<P>
The SRC attribute is NOT magical for these interpreters. In short,
all code inside a source file or {} block is written verbatim
to the interpreter. No (pre-)processing or executional magic is done.
</P>
<P>
A serious shortcoming of the described mechanism for handling other
(scripting) languages, with respect to standard perl scripts
(i.e., 'text/ssperl'), is that the code is only executed when
the pipe to the interpreter is closed. So the pipe has to be
closed at the end of each block. This means that the state of the
interpreter (e.g., all variable values) is lost after the closing of
the next &lt;/SCRIPT&gt; tag. The standard 'text/ssperl' scripts retain
all values and definitions.
</P>
<A NAME="APPLIC"><H2 ALIGN="CENTER">APPLICATION MIME TYPES</H2></A>
<P>
To ease some important auxiliary functions from within the
html pages I have added them as MIME types. This uses
the mechanism that is also used for the evaluation of
other scripting languages, with interpolation of CGI
parameters (and perl-variables). Actually, these are
defined exactly like any other "scripting language".
</P>
<dl>
<dt>text/ssdisplay:
<dd>display some (HTML) text with interpolated
variables (uses `cat`).
<dt>text/sslogfile:
<dd>write (append) the interpolated block to the file
mentioned on the first, non-empty line
(the filename can be preceded by 'File: ',
note the space after the ':',
uses `awk .... &gt;&gt; &lt;filename&gt;`).
<dt>text/ssmailto:
<dd>send email directly from within the script block.
The first line of the body must contain
To:Name@Valid.Email.Address
(note: NO space between 'To:' and the email address).
For other options see the mailto man pages.
It works by directly sending the (interpolated)
content of the text block to a pipe into the
Linux program 'mailto'.
</dl>
<P>
In these script blocks, all Perl variables will be
replaced by their values. All CGI variables are cleaned before
they are used. These CGI variables must be redefined with a
CGI attribute to restore their original values.
In general, this will be more secure than constructing,
e.g., your own email command lines. For instance, mailto will
not execute any odd (forged) email address, but just stops
when the email address is invalid. In contrast, awk will construct
any filename you give it (e.g. '&lt;File;rm\\\040-f' would end up
as a "valid" UNIX filename). Note that it will also gladly
store this file anywhere (/../../../etc/passwd will work!).
Use the CGIscriptor::CGIsafeFileName() function to clean the
filename.
</P>
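<P>
To make clear what kind of filenames are dangerous here, a deliberately
strict check is sketched below. It is a hypothetical stand-in written for
this example and NOT the implementation of
CGIscriptor::CGIsafeFileName(), which should be used in real pages; the
name looks_like_safe_filename and the accepted character set are
assumptions.
</P>
<PRE>
use strict;
use warnings;

# Hypothetical conservative check: accept only plain relative paths.
sub looks_like_safe_filename {
    my ($name) = @_;
    return 0 unless defined $name and length($name);
    return 0 if substr($name, 0, 1) eq '/';              # no absolute paths
    return 0 unless $name =~ m{\A[A-Za-z0-9_./-]+\z};    # plain characters only
    foreach my $part (split(m{/}, $name)) {
        return 0 if $part eq '..';                       # no parent directories
    }
    return 1;
}

foreach my $candidate ('logs/access.log', '/../../../etc/passwd', '&lt;File;rm\\040-f') {
    printf("%-22s %s\n", $candidate,
           looks_like_safe_filename($candidate) ? 'accepted' : 'REJECTED');
}
</PRE>
<P>
Only the first of the three candidates is accepted.
</P>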
<A NAME="PIPES"><H2 ALIGN="CENTER">SHELL SCRIPT PIPING</H2></A>
<P>
If a shell script starts with the UNIX style "#! &lt;shell command&gt; \n"
line, the rest of the shell script is piped into the indicated command,
i.e.,
open(COMMAND, "| command");print COMMAND $RestOfScript;
</P>
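<P>
Separating the "#!" line from the rest of the script can be sketched as
follows. The helper split_pipe_command is hypothetical and only parses
the text; the pipe itself is shown as a comment.
</P>
<PRE>
use strict;
use warnings;

# Hypothetical helper: split a shell script into the command named on a
# leading "#!" line and the remaining text that would be piped into it.
sub split_pipe_command {
    my ($script) = @_;
    if ($script =~ /\A#![ \t]*([^\n]*?)[ \t]*\n(.*)\z/s and length($1)) {
        return ($1, $2);
    }
    return (undef, $script);    # no "#!" line: nothing to pipe
}

my ($command, $rest) = split_pipe_command("#! /usr/bin/sort -r \nb\na\nc\n");
print "command: $command\n";    # prints: command: /usr/bin/sort -r
# CGIscriptor would now do the equivalent of:
#   open(COMMAND, "| $command"); print COMMAND $rest;
</PRE>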
<P>
In many ways this is equivalent to the MIME-type profiling for
evaluating other scripting languages as discussed above. The
difference comes down to convenience. Shell script piping is a
"raw" implementation. It allows you to control all aspects of
execution. Using the MIME-type profiling is easier, but has a
lot of defaults built in that might get in the way. Another
difference is that shell script piping uses the SAFEqx() function,
and MIME-type profiling does not.
</P>
<P>
Execution of shell scripts is under the control of the Perl Script blocks
in the document. The MIME-type triggered execution of &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
blocks can be simulated easily. You can switch to a different shell, e.g. tcl,
completely by executing the following Perl commands inside your document:
</P>
<PRE>
&lt;SCRIPT TYPE="text/ssperl"&gt;
$main::ShellScriptContentType = "text/ssTcl"; # Yes, you can do this
CGIscriptor::RedirectShellScript('/usr/bin/tcl'); # Pipe to Tcl
$CGIscriptor::NoShellScriptInterpolation = 1;
&lt;/SCRIPT&gt;
</PRE>
<P>
After this script is executed, CGIscriptor will parse scripts of
TYPE="text/ssTcl" and pipe their contents into '|/usr/bin/tcl'
WITHOUT interpolation (i.e., NO substitution of Perl variables).
The crucial function is:
</P>
<PRE>
CGIscriptor::RedirectShellScript('/usr/bin/tcl')
</PRE>
<P>
After executing this function, all shell scripts (AND all
calls to SAFEqx()) are piped into '|/usr/bin/tcl'. If the argument
of RedirectShellScript is empty, e.g., '', the original (default)
value is reset.
</P>
<P>
The standard output, STDOUT, of any pipe is sent to the client.
Currently, you should be careful with quotes in such a piped script.
The results of a pipe are NOT put on the @CGIscriptorResults stack.
As a result, you do not have access to the output of any piped (#!)
process! If you want such access, execute
</P>
<PRE>
&lt;SCRIPT TYPE="text/osshell"&gt;echo "script"|command&lt;/SCRIPT&gt;
</PRE>
<P>
or
</P>
<PRE>
&lt;SCRIPT TYPE="text/ssperl"&gt;
$resultvar = SAFEqx('echo "script"|command');
&lt;/SCRIPT&gt;
</PRE>
<P>
Safety is never complete. Although SAFEqx() prevents some of the
most obvious forms of attacks and security slips, it cannot prevent
them all. In particular, complex combinations of quotes and intricate
variable references cannot be handled safely by SAFEqx. So be on
guard.
</P>
<A NAME="SSPERL"><H2 ALIGN="CENTER">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</H2></A>
<P>
All PERL scripts are evaluated inside a PERL package. This package
has a separate name space. This isolated name space protects the
CGIscriptor.pl program against interference from user code. However,
some variables, e.g., $_, are global and cannot be protected. You are
advised NOT to use such global variable names. You CAN write
directives that directly access the variables in the main program.
You do so at your own risk (there is definitely enough rope available
to hang yourself). The behavior of CGIscriptor becomes undefined if
you change its private variables during run time. The PERL code
directives are used as in:
</P>
<PRE>
$Result = eval($directive); print $Result;'';
</PRE>
<P>
($directive contains all text between &lt;SCRIPT&gt;&lt;/SCRIPT&gt;).
That is, the &lt;directive&gt; is treated as a ''-quoted string and
the result is treated as a scalar. To prevent the VALUE of the code
block from appearing on the client's screen, end the directive with
';""&lt;/SCRIPT&gt;'. Evaluated directives return the last value, just as
eval(), blocks, and subroutines, but only as a scalar.
</P>
<P>
IMPORTANT: All PERL variables defined are persistent. Each &lt;SCRIPT&gt;
&lt;/SCRIPT&gt; construct is evaluated as a {}-block with associated scope
(e.g., for "my $var;" declarations). This means that values assigned
to a PERL variable can be used throughout the document unless they
were declared with "my". The following will actually work as intended
(note that the ``-quotes in this example are NOT evaluated, but used
as simple quotes):
</P>
<PRE>
&lt;META CONTENT="text/ssperl; CGI=`$String='abcdefg'`"&gt;
anything ...
&lt;SCRIPT TYPE="text/ssperl"&gt;@List = split('', $String);&lt;/SCRIPT&gt;
anything ...
&lt;SCRIPT TYPE="text/ssperl"&gt;join(", ", @List[1..$#List]);&lt;/SCRIPT&gt;
</PRE>
<P>
The first &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt; construct will return the
value scalar(@List), the second &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt;
construct will print the elements of $String separated by commas, leaving
out the first element, i.e., $List[0].
</P>
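<P>
The evaluation, the scalar result, the ';""' idiom and the persistence
of variables can all be tried outside CGIscriptor with a small stand-in.
The subroutine run_directive and the package name UserSpace are made up
for this sketch; CGIscriptor itself prints the result instead of
returning it.
</P>
<PRE>
use strict;
use warnings;

# Hypothetical stand-in for the evaluation step: every directive is run by
# eval() inside one package, so package variables persist between calls.
sub run_directive {
    my ($directive) = @_;
    package UserSpace;      # isolated name space (the name is made up)
    no strict;              # user code may use undeclared variables
    my $result = eval($directive);
    return defined($result) ? "$result" : '';
}

run_directive('$String = "abcdefg";""');                      # like the CGI default
print run_directive('@List = split("", $String);'), "\n";     # prints: 7
print run_directive('join(", ", @List[1..$#List]);'), "\n";   # prints: b, c, d, e, f, g
print run_directive('$x = 5;""'), "\n";                       # prints an empty line
</PRE>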
<p>
Another warning: './' and '~/' are ALWAYS replaced by the values of
$YOUR_SCRIPTS and $YOUR_HTML_FILES, respectively. This can interfere
with pattern matching, e.g., $a =~ s/aap\./noot\./g will result in the
evaluation of $a =~ s/aap\\${YOUR_SCRIPTS}noot\./g. Use
s@<i>regexp</i>@<i>replacement</i>@g instead.
</p>
<A NAME="SESSIONTICKETS"><H2 ALIGN="CENTER">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</H2></A>
<p>
An infrastructure for user account authorization and file access control
is available. Each request is matched against a list of URL path patterns.
If the request matches, a Session Ticket is required to access the URL.
This Session Ticket should be present as a CGI parameter or Cookie, e.g.:
</p>
<p>
CGI: SESSIONTICKET=&lt;value&gt;<br />
Cookie: CGIscriptorSESSION=&lt;value&gt;</p>
<p>
The example implementation stores Session Tickets as files in a local
directory. To create Session Tickets, a Login request must be given
with a LOGIN=&lt;value&gt; CGI parameter, a user name and a (doubly hashed)
password. The user name and (singly hashed) password are stored in a
PASSWORD ticket with the same name as the user account (name cleaned up
for security).
</p>
1564 The example session model implements 4 functions:
1565 <ol>
1566 <li>Login<br />
1567 The password is hashed with the user name and server side salt, and then
1568 hashed with REMOTE_HOST and a random salt. Client and Server both perform
1569 these actions and the Server only grants access if restults are the same.
1570 The server side only stores the password hashed with the user name and
1571 server side salt. Neither the plain password, nor the hashed password is
1572 ever exchanged. Only values hashed with the one-time salt are exchanged.
1573 </li>
1574 <li>Session<br />
1575 For every access to a restricted URL, the Session Ticket is checked before
1576 access is granted. There are three session modes. The first uses a fixed
1577 Session Ticket that is stored as a cookie value in the browser (actually,
1578 as a sessionStorage value). The second uses only the IP address at login
1579 to authenticate requests. The third
1580 is a Challenge mode, where the client has to calculate the value of the
1581 next one-time Session Ticket from a value derived from the password and
1582 a random string.
1583 </li>
1584 <li>Password Change<br />
1585 A new password is hashed with the user name and server side salt, and
1586 then encrypted (XORed)
1587 with the old password hashed with the user name and salt and rehashed with
1588 the login ticket number. Ath the server side this operation is reversed.
1589 Again, the stored password value is never exchanged unencrypted.
1590 </li>
1591 <li>New Account<br />
1592 The text of a new account (Type: PASSWORD) file is constructed from
1593 the new username (CGI: <em>NEWUSERNAME</em>, converted to lowercase) and
1594 hashed new password (CGI: <em>NEWPASSWORD</em>).
1595 The same process is used to encrypt
1596 the new password as is used for the Password Change function.
1597 Again, the stored password value is never exchanged unencrypted.
Some default settings are encoded. For display in the browser, the new password
1599 is reencrypted (XORed) with a special key, the old password hash
1600 hashed with a session specific random hex value sent initially with the
1601 session login ticket ($RANDOMSALT).
<br />For example, for user <em>NewUser</em>
1603 and password <em>NewPassword</em>:
<pre>
Type: PASSWORD
Username: newuser
Password: 19afeadfba8d5dcd252e157fafd3010859f8762b87682b6b6cdb3e565194fa91
IPaddress: 127\.0\.0\.1
AllowedPaths: ^/Private/[\w\-]+\.html?
AllowedPaths: ^/Private/newuser/
Salt: e93cf858a1d5626bf095ea5c25df990dfa969ff5a5dc908b22c9a5229b525f65
Session: SESSION
Date: Fri Jun 29 12:46:22 2012
Time: 1340973982
Signature: 676c35d3aa63540293ea5442f12872bfb0a22665b504f58f804582493b6ef04e
</pre>
1617 The password is created with the commands:
<pre>
printf '%s' 'NewPasswordnewuser970e68017413fb0ea84d7fe3c463077636dd6d53486910d4a53c693dd4109b1a'|shasum -a 256
</pre>
If the CPAN module Digest is installed, it is used instead of the commands.
1622 However, the password account files are protected against unauthorized change.
1623 To obtain a valid Password account, the following command should be given:
<pre>
perl CGIscriptor.pl --managelogin salt=Private/.Passwords/SALT \
  masterkey='Sherlock investigates oleander curry in Bath' \
  password='NewPassword' \
  Private/.Passwords/newuser
</pre>
1630 </li>
1631 </ol>
1632 </p>
1633 <H3 ALIGN="CENTER">Implementation</H3>
1635 The session authentication mechanism is based on the exchange of ticket
1636 identifiers. A ticket identifier is just a string of characters, a name
1637 or a random 64 character hexadecimal string. Authentication is based
1638 on a (password derived) shared secret and the ability to calculate ticket
1639 identifiers from this shared secret. Ticket identifiers should be
"safe" filenames (except user names). There are five types of tickets:
1641 <ul>
1642 <li>PASSWORD: User account descriptors, including a user name and password</li>
1643 <li>LOGIN: Temporary anonymous tickets used during login</li>
1644 <li>IPADDRESS: Authentication tokens that allow access based on the IP address of the request</li>
1645 <li>SESSION: Reusable authentication tokens</li>
1646 <li>CHALLENGE: One-time authentication tokens</li>
1647 </ul>
1648 All tickets can have an expiration date in the form of a time duration
1649 from creation, in seconds, minutes, hours, or days (<em>+duration</em>[smhd]).
1650 An absolute time can be given in seconds since the epoch of the server host.
1651 Note that expiration times of CHALLENGE authentication tokens are calculated
1652 from the last access time. Accounts can include a maximal lifetime
1653 for session tickets (MaxLifetime).
1654 </p>
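<p>
The expiration notation above can be illustrated with a small sketch. It is
not the code used by CGIscriptor.pl: the function name <em>expirationTime</em>
and the table <em>UNIT_SECONDS</em> are hypothetical, and the sketch assumes
that a relative duration always carries one of the unit letters s, m, h, or d.
</p>

```javascript
// Hypothetical helper, not part of CGIscriptor.pl.
// "+30m" style values are relative to the creation time of the ticket;
// a plain number is taken as absolute seconds since the epoch.
const UNIT_SECONDS = { s: 1, m: 60, h: 3600, d: 86400 };

function expirationTime(value, createdAt) {
  const relative = /^\+(\d+)([smhd])$/.exec(value);
  if (relative) {
    return createdAt + Number(relative[1]) * UNIT_SECONDS[relative[2]];
  }
  if (/^\d+$/.test(value)) {
    return Number(value); // absolute seconds since the epoch
  }
  throw new Error("Unrecognized expiration value: " + value);
}

const created = 1340973982; // the "Time:" value of the example ticket
console.log(expirationTime("+30m", created)); // 1340975782
console.log(expirationTime("+2d", created));  // 1341146782
```

<p>
For a CHALLENGE ticket, the time of the last access would be passed in place
of the creation time.
</p>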
1656 A Login page should create a LOGIN ticket file locally and send a
1657 server specific salt, a Random salt, and a LOGIN ticket
1658 identifier. The server side compares the username and hashed password,
1659 actually hashed(hashed(password+serversalt)+Random salt) from the client with
1660 the values it calculates from the stored Random salt from the LOGIN
1661 ticket and the hashed(password+serversalt) from the PASSWORD ticket. If
1662 successful, a new SESSION ticket is generated as a (double) hash sum of the stored
1663 password and the LOGIN ticket, i.e.
1664 LoginTicket = hashed(hashed(password+serversalt)+REMOTE_HOST+Random salt) and
1665 SessionTicket = hashed(hashed(LoginTicket).LoginTicket). This SESSION
1666 ticket should also be generated by the client and stored as
1667 sessionStorage and cookie values as needed. The Username, IP address and
1668 Path are available as $LoginUsername, $LoginIPaddress, and $LoginPath,
1669 respectively.
1670 </p>
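<p>
The comparison made at login can be sketched as follows, written for Node.js.
This is only an illustration under assumptions: the function names are
hypothetical, the concatenation order follows the client side JavaScript
listed under Technical matters, and the REMOTE_HOST component is left out for
brevity. The constant-time comparison is a choice of this sketch, not
something the manual prescribes.
</p>

```javascript
const crypto = require("crypto");

// SHA-256 of a string, returned as a lowercase hex string
function sha256hex(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// The value kept in the PASSWORD ticket: hash of password, user name, server salt
function storedPasswordHash(plaintext, username, serverSalt) {
  return sha256hex(plaintext + username.toLowerCase() + serverSalt);
}

// The one-time value the client sends back for a given Random salt
function loginResponse(passwordHash, randomSalt) {
  return sha256hex(randomSalt + passwordHash);
}

// Server side (hypothetical): recompute the value from the stored hash and
// the Random salt of the LOGIN ticket, then compare with what was received
function verifyLogin(received, storedHash, randomSalt) {
  const expected = Buffer.from(loginResponse(storedHash, randomSalt), "hex");
  const actual = Buffer.from(received, "hex");
  return actual.length === expected.length &&
         crypto.timingSafeEqual(actual, expected);
}

const stored = storedPasswordHash("NewPassword", "NewUser", "example-server-salt");
const answer = loginResponse(stored, "example-random-salt");
console.log(verifyLogin(answer, stored, "example-random-salt")); // true
```

<p>
Only <em>answer</em> travels over the network; neither the plain password nor
the stored hash does.
</p>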
1672 The CHALLENGE protocol stores the single hashed version of the SESSION tickets.
1673 However, this value is not exchanged, but kept secret in the JavaScript
1674 <em>sessionStorage</em> object. Instead, every page returned from the
1675 server will contain a one-time Challenge value ($CHALLENGETICKET) which
1676 has to be hashed with the stored value to return the current ticket
1677 id string.
1678 </p>
1680 In the current example implementation, all random values are created as
1681 full, 256 bit SHA256 hash values (Hex strings) of 64 bytes read from
1682 /dev/urandom.
1683 </p>
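<p>
A comparable random value can be produced in Node.js as sketched below. The
function name <em>randomHexValue</em> is hypothetical, and the sketch assumes
that <em>crypto.randomBytes</em>, which draws on the random source of the
operating system, is an acceptable stand-in for reading /dev/urandom directly.
</p>

```javascript
const crypto = require("crypto");

// Hypothetical equivalent of "SHA256 of 64 bytes read from /dev/urandom"
function randomHexValue() {
  const seed = crypto.randomBytes(64); // 64 random bytes from the OS
  return crypto.createHash("sha256").update(seed).digest("hex");
}

// Prints 64 hexadecimal characters, different on every call
console.log(randomHexValue());
```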
1684 <H3 ALIGN="CENTER">Authorization</H3>
A limited level of authorization tuning is built into the login system.
Each account file (PASSWORD ticket file) can contain a number of
<em>Capabilities</em> lines. These control special privileges. The
1689 Capabilities can be checked inside the HTML pages as part of the
1690 ticket information. Two privileges are handled internally:
1691 <em>CreateUser</em> and <em>VariableREMOTE_ADDR</em>.
1692 <em>CreateUser</em> allows the logged in user to create a new user account.
1693 With <em>VariableREMOTE_ADDR</em>, the session of the logged in user is
not limited to the Remote IP address from which the initial log-in took
place. Sessions can hop from one apparent (proxy) IP address to another,
1696 e.g., when using <a href="https://www.torproject.org/">Tor</a>. Any
1697 IPaddress patterns given in the PASSWORD ticket file remain in effect
1698 during the session. For security reasons, the <em>VariableREMOTE_ADDR</em>
1699 capability is only effective if the session type is <em>CHALLENGE</em>.
1700 </p>
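<p>
How such a check could look is sketched below. The parser assumes the
"Key: value" layout of the example PASSWORD ticket shown earlier and assumes
one capability per <em>Capabilities</em> line; both the layout of these lines
and all function names are assumptions of this sketch, not taken from
CGIscriptor.pl.
</p>

```javascript
// Parse "Key: value" lines into a map from key to a list of values
// (keys such as AllowedPaths can occur more than once).
function parseTicket(text) {
  const fields = new Map();
  for (const line of text.split(/\r?\n/)) {
    const match = /^\s*([^:\s]+)\s*:\s*(.*?)\s*$/.exec(line);
    if (!match) continue; // skip blank or malformed lines
    const [, key, value] = match;
    if (!fields.has(key)) fields.set(key, []);
    fields.get(key).push(value);
  }
  return fields;
}

// True if the ticket lists the named capability
function hasCapability(fields, name) {
  return (fields.get("Capabilities") || []).includes(name);
}

// VariableREMOTE_ADDR is only honoured for CHALLENGE sessions
function mayChangeAddress(fields) {
  const session = (fields.get("Session") || [])[0];
  return session === "CHALLENGE" &&
         hasCapability(fields, "VariableREMOTE_ADDR");
}

const ticket = parseTicket(
  "Type: PASSWORD\nUsername: newuser\nSession: CHALLENGE\n" +
  "Capabilities: CreateUser\nCapabilities: VariableREMOTE_ADDR\n");
console.log(hasCapability(ticket, "CreateUser")); // true
console.log(mayChangeAddress(ticket));            // true
```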
1702 <H3 ALIGN="CENTER">Security considerations with Session tickets</H3>
1704 For strong security, please use end-to-end encryption. This can be
achieved using a VPN (Virtual Private Network), SSH tunnel, or an HTTPS
1706 capable server with OpenSSL. The session ticket system of CGIscriptor.pl
1707 is intended to be used as a simple authentication mechanism WITHOUT
1708 END-TO-END ENCRYPTION. The authenticating mechanism tries to use some
1709 simple means to protect the authentication process from eavesdropping.
For this it uses a secure hash function, SHA256. For all practical purposes,
it is impossible to "decrypt" a SHA256 sum. But this login scheme is
only as secure as your browser, which, in general, is not very secure.
1713 </p>
1715 One fundamental weakness of the implemented procedure is that the Client obtains
1716 the code to encrypt the passwords from the server. It is the JavaScript
1717 code in the HTML pages. An attacker who could place himself between Server
1718 and Client, a <em>man in the middle attack (MITM)</em>, could change the code to
1719 reveal the plaintext password and other information. There is no real
1720 protection against this attack without end-to-end encryption and
1721 authentication. A simple, but rather cumbersome, way to check for such
attacks would be to store known good copies of the pages (downloaded
1723 with a browser or automatically with <em>curl</em> or <em>wget</em>) and
1724 then use other tools to download new pages at random intervals and compare
1725 them to the old pages. For instance, the following line would remove the
1726 variable ticket codes and give a fixed SHA256 sum for the original
1727 <em>Login.html</em> page+code:
<pre>
curl http://localhost:8080/Private/index.html | sed 's/=\"[a-z0-9]\{64\}\"/=""/g' | shasum -a 256
</pre>
1731 A simple <em>diff</em> command between old and new files should give
1732 only differences in half a dozen lines, where only hexadecimal salt
1733 values will actually differ.
1734 </p>
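<p>
The same check can be written as a small Node.js function, for instance to
compare a freshly downloaded page with a stored copy. This is a sketch that
mirrors the <em>sed</em> expression above; the function name
<em>pageFingerprint</em> is hypothetical.
</p>

```javascript
const crypto = require("crypto");

// Blank out every ="..." value of exactly 64 lowercase letters or digits
// (the variable tickets and salts), then hash what remains of the page.
function pageFingerprint(html) {
  const stripped = html.replace(/="[a-z0-9]{64}"/g, '=""');
  return crypto.createHash("sha256").update(stripped, "utf8").digest("hex");
}

const saltA = "a".repeat(64);
const saltB = "b".repeat(64);
const pageA = '<input id="SERVERSALT" value="' + saltA + '">';
const pageB = '<input id="SERVERSALT" value="' + saltB + '">';
// Pages that differ only in their salt values give the same fingerprint
console.log(pageFingerprint(pageA) === pageFingerprint(pageB)); // true
```

<p>
Any change to the JavaScript code in the page changes the fingerprint, while
the one-time values do not.
</p>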
1736 A sort of solution for the MITM attack problem that <em>might</em> protect at
1737 least the plaintext password would be to run a trusted web
1738 page from local storage to handle password input. The solution would be
1739 to add a hidden iFrame tag loading the untrusted page from the URL and
1740 extract the needed ticket and salt values. Then run the stored, trusted,
1741 code with these values. It is not (yet) possible to set the
1742 required session storage inside the browser, so this method only works
1743 for IPADDRESS sessions and plain SESSION tickets. There are many security
1744 problems with this "solution".
1745 </p>
1747 If you are able to ascertain the integrity of the login page using any
1748 of the above methods, you can check whether the IP address seen by the
1749 login server is indeed the IP address of your computer. The IP address
1750 of the REMOTE_HOST (your visible IP address) is part of the login
"password". It is stored in the login page as a CLIENTIPADDRESS. It can
be inspected by clicking the "Check IP address" box. Provided the
MITM attacker cannot spoof your IP address, you can ensure that the login
1754 server sees your IP address and not that of an attacker.
1755 </p>
1757 Humans tend to reuse passwords. A compromise of a site running
1758 CGIscriptor.pl could therefore lead to a compromise of user accounts at
1759 other sites. Therefore, plain text passwords are never stored, used, or
1760 exchanged. Instead, the plain password and user name are "encrypted" with
a server side salt value. Actually, all are concatenated and hashed
1762 with a one-way secure hash function (SHA256) into a single string.
1763 Whenever the word "password" is used, this hash sum is meant. Note that
1764 the salts are generated from /dev/urandom. You should check whether the
1765 implementation of /dev/urandom on your platform is secure before
1766 relying on it. This might be a problem when running CGIscriptor under
1767 Cygwin on MS Windows.<br />
1768 <em>Note: no attempt is made to slow down the password hash, so bad
1769 passwords can be cracked by brute force</em>
1770 </p>
As the (hashed) passwords are all that is needed to authenticate at the site,
these should not be stored in this form. A site specific passphrase
can be entered as an environment variable ($ENV{'CGIMasterKey'}). This
phrase is hashed with the server side salt and the result is hashed with
the user name and then XORed with the password when it is stored. Also, to
1777 detect changes to the account (PASSWORD) and session tickets, a
1778 (HMAC) hash of some of the contents of the ticket with the server salt and
1779 CGIMasterKey is stored in each ticket.
1780 </p>
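<p>
The storage step can be sketched in JavaScript for Node.js as shown below. It
follows the order of operations of the server side Perl code listed under
Technical matters, but it is a sketch with hypothetical function names and
placeholder salt values, not the actual implementation. It shows why the
stored value can be recovered: XORing with the same key twice gives back the
original hash.
</p>

```javascript
const crypto = require("crypto");

// SHA-256 of a string, returned as a lowercase hex string
function sha256hex(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// XOR two hex strings digit by digit; a shorter string is padded with '0'
function xorHex(hex1, hex2) {
  const length = Math.max(hex1.length, hex2.length);
  let result = "";
  for (let i = 0; i < length; i++) {
    const d1 = parseInt(hex1.charAt(i) || "0", 16);
    const d2 = parseInt(hex2.charAt(i) || "0", 16);
    result += (d1 ^ d2).toString(16);
  }
  return result;
}

// Key derived from the master key, the server salt, and the user name
function cryptKey(masterKey, serverSalt, username) {
  return sha256hex(username + sha256hex(masterKey + serverSalt));
}

const key = cryptKey("Sherlock investigates oleander curry in Bath",
                     "example-server-salt", "newuser");
const passwordHash = sha256hex("NewPasswordnewuserexample-server-salt");
const stored = xorHex(key, passwordHash); // value written to the ticket
const recovered = xorHex(key, stored);    // XOR with the same key undoes it
console.log(recovered === passwordHash);  // true
```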
Creating a valid (hashed) password, encrypting it with the CGIMasterKey, and
constructing a signature of the ticket are non-trivial tasks. They have to be
redone with every change of the ticket file or of the CGIMasterKey. CGIscriptor
can do this from the command line with the command:
<pre>
perl CGIscriptor.pl --managelogin salt=Private/.Passwords/SALT \
  masterkey='Sherlock investigates oleander curry in Bath' \
  password='There is no password like more password' \
  admin
</pre>
1792 CGIscriptor will exit after this command with the first option being
1793 <em>--managelogin</em>. Options have the form:
1794 <ul>
<li>salt=[file or string]<br />Server salt value to use instead of the value
stored in the ticket file. Will replace the stored value if a new
password is given. If you change the server salt, you have to
reset all the passwords. There is <em>absolutely no</em> procedure known
to recover plaintext passwords, except asking the account holders.
You are strongly advised to make a backup before you apply such a change</li>
<li>masterkey=[file or string]<br />CGIMasterKey used to read and decrypt
the ticket</li>
<li>newmasterkey=[file or string]<br />CGIMasterKey used to encrypt, sign,
and write the ticket. Defaults to the masterkey. If you change
the masterkey, you will have to reset all the accounts. You are strongly
advised to make a backup before you apply such a change</li>
1807 <li>password=[file or string]<br />New plaintext password</li>
1808 </ul>
When the value of an option is an existing file path, the first line of
1810 that file is used. Options are followed by one or more paths plus names
1811 of existing ticket files. Each password option is only used for a single
1812 ticket file. It is most definitely a bad idea to use a password that is
1813 identical to an existing filepath, as the file will be read instead. Be
1814 aware that the name of the file should be a cleaned up version of the
1815 Username. This will not be checked.
1816 </p>
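<p>
The "file or string" rule explains the warning about passwords that look like
file paths. A minimal sketch of that rule in Node.js, with the hypothetical
function name <em>resolveOptionValue</em> (CGIscriptor.pl itself is written in
Perl and may differ in detail):
</p>

```javascript
const fs = require("fs");

// Hypothetical sketch: if the option value names an existing file,
// the first line of that file is used; otherwise the string itself.
function resolveOptionValue(value) {
  if (fs.existsSync(value) && fs.statSync(value).isFile()) {
    return fs.readFileSync(value, "utf8").split(/\r?\n/)[0];
  }
  return value;
}

// A passphrase that is not a file path is used as given
console.log(resolveOptionValue("There is no password like more password"));
```

<p>
With this rule, a password option whose value happens to equal an existing
path silently becomes the first line of that file.
</p>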
1818 For the authentication and a change of password, the (old) password
1819 is used to "encrypt" a random one-time token or the new password,
1820 respectively. For authentication, decryption is not needed, so a secure
1821 hash function (SHA256) is used to create a one-way hash sum "encryption".
A new password must be decrypted. New passwords are encrypted by XORing
1823 them with the old password.
1824 </p>
1826 <h3 align=CENTER>Strong Passwords: It is so easy</h3>
1827 <p align=CENTER><em>If you only could see what you are typing</em></p>
1828 <p >
1829 Your password might be vulnerable to
1830 <a href="https://en.wikipedia.org/wiki/Brute_force_attack">
1831 <em>brute force</em></a> guessing. Protections against such attacks are
1832 costly in terms of code complexity, bugs, and execution time.
However, there is a very simple and secure countermeasure. See the
1834 <a href="http://xkcd.com/936/" target="_blank">XKCD comic</a>. The phrase,
1835 <em>There is no password like more password</em> would
1836 be both much easier to remember, and still stronger than
1837 <em>h4]D%@m:49</em>, at least before this phrase was pasted as an example
1838 on the Internet.<br />
1839 For the procedures used at this site, a basic computer setup can check
1840 in the order of a billion passwords per second. You need a password (or
1841 phrase) strength in the order of 56 bits to be a little secure (one year
on a single computer). One of the largest networks in the world, Bitcoin
1843 mining, can check some 12 terahashes per second (June 2012). This
1844 corresponds to checking 6 times 10<sup>12</sup> passwords per second.
It would take a password strength of ~68 bits to keep the equivalent of
1846 the Bitcoin computer network occupied for around a year before it found
1847 a match.<br />
1848 Please be so kind and add the name of your favorite flower, dish,
1849 fictional character, or small town to your password. Say,
1850 <em>Oleander</em>, <em>Curry</em>, <em>Sherlock</em>, or <em>Bath</em>, UK
1851 (each adds ~12 bits) or even the phrase <em>Sherlock investigates oleander
1852 curry in Bath</em> (adds &gt; 56 bits, note that oleander is <em>poisonous</em>,
1853 so do not try this curry at home). That would be more effective than
1854 adding a thousand rounds of encryption.
Typing long passwords without seeing what you are typing is problematic.
So a button should be included to make the password visible.
1857 </p>
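<p>
The arithmetic behind these estimates can be checked with a few lines of
JavaScript. The function name <em>yearsToExhaust</em> is hypothetical; the
guessing rates are the ones quoted above. The numbers give the time to try
<em>every</em> password of the given strength; on average a match is found in
half that time, which is where "around a year" comes from.
</p>

```javascript
// Years needed to try every password of a given strength (in bits)
// at a given number of guesses per second.
const SECONDS_PER_YEAR = 365.25 * 24 * 3600;

function yearsToExhaust(bits, guessesPerSecond) {
  return Math.pow(2, bits) / guessesPerSecond / SECONDS_PER_YEAR;
}

// Single computer, a billion guesses per second, 56 bits
console.log(yearsToExhaust(56, 1e9).toFixed(1));  // "2.3"
// 6 x 10^12 guesses per second, 68 bits
console.log(yearsToExhaust(68, 6e12).toFixed(1)); // "1.6"
```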
1858 <h3 align=CENTER>Technical matters</h3>
1860 Client side JavaScript code definitions. Variable names starting with '$'
1861 are CGIscriptor CGI variables. Some of the hashes could be strengthened
1862 by switching to HMAC signatures. However, the security issues of
maintaining parallel functions for HMAC in both Perl and JavaScript seem
1864 to be more serious than the attack vectors against the hashes. But HMAC
1865 is indeed used for the ticket signatures.
1866 </p>
1867 <pre>
// On Login
function HashPlaintextPassword() {
    var plaintextpassword = document.getElementById('PASSWORD');
    var serversalt = document.getElementById('SERVERSALT');
    var username = document.getElementById('CGIUSERNAME');
    return hex_sha256(plaintextpassword.value+username.value.toLowerCase()+serversalt.value);
}
var randomsalt = "$RANDOMSALT"; // From CGIscriptor
var loginticket = "$LOGINTICKET"; // From CGIscriptor
// Hash plaintext password
var password = HashPlaintextPassword();
// Authorize login
var hashedpassword = hex_sha256(randomsalt+password);
// Sessionticket
var sessionticket = hex_sha256(loginticket+password);
sessionStorage.setItem("CGIscriptorPRIVATE", sessionticket);
// Secretkey for encrypting new passwords, acts like a one-time pad
// Is set anew with <em>every</em> login, i.e., also with password changes
// and for each create new user request
var secretkey = hex_sha256(randomsalt+loginticket+password);
sessionStorage.setItem("CGIscriptorSECRET", secretkey);

// For a SESSION type request
sessionticket = hex_sha256(sessionStorage.getItem("CGIscriptorPRIVATE"));
createCookie("CGIscriptorSESSION",sessionticket, 0, "");

// For a CHALLENGE type request
var sessionset = "$CHALLENGETICKET"; // From CGIscriptor
var sessionkey = sessionStorage.getItem("CGIscriptorPRIVATE");
sessionticket = hex_sha256(sessionset+sessionkey);
createCookie("CGIscriptorCHALLENGE",sessionticket, 0, "");

// For transmitting a new password
function HashPlaintextNewPassword() {
    var plaintextpassword = document.getElementById('NEWPASSWORD');
    var serversalt = document.getElementById('SERVERSALT');
    var username = document.getElementById('NEWUSERNAME');
    return hex_sha256(plaintextpassword.value+username.value.toLowerCase()+serversalt.value);
}

var newpassword = document.getElementById('NEWPASSWORD');
var newpasswordrep = document.getElementById('NEWPASSWORDREP');
// Hash plaintext password
newpassword.value = HashPlaintextNewPassword();
var secretkey = sessionStorage.getItem("CGIscriptorSECRET");

// Encrypt the hashed new password by XORing it with the secret key
var encrypted = XOR_hex_strings(secretkey, newpassword.value);
newpassword.value = encrypted;
newpasswordrep.value = encrypted;

// XOR of hexadecimal strings of equal length
// (a shorter string is padded with '0' digits at the end)
function XOR_hex_strings(hex1, hex2) {
    var resultHex = "";
    var maxlength = Math.max(hex1.length, hex2.length);

    for(var i=0; i &lt; maxlength; ++i) {
        var h1 = hex1.charAt(i);
        if(! h1) h1='0';
        var h2 = hex2.charAt(i);
        if(! h2) h2 ='0';
        var d1 = parseInt(h1,16);
        var d2 = parseInt(h2,16);
        var resultD = d1^d2;
        resultHex = resultHex+resultD.toString(16);
    }
    return resultHex;
}
1935 </pre>
1937 Password encryption based on <em>$ENV{'CGIMasterKey'}</em>.
1938 Server side Perl code:
1939 </p>
1940 <pre>
# Password encryption
my $masterkey = $ENV{'CGIMasterKey'};
my $hash1 = hash_string($masterkey.$serversalt);
my $CryptKey = hash_string($username.$hash1);
$password = XOR_hex_strings($CryptKey,$password);

# Key for HMAC signing (derived in the same way as $CryptKey)
my $HMACKey = hash_string($username.$hash1);
1950 </pre>
1952 <A NAME="USEREXTENSIONS"><H2 ALIGN="CENTER">USER EXTENSIONS</H2></A>
1955 A CGIscriptor package is attached to the bottom of this file. With
1956 this package you can personalize your version of CGIscriptor by
1957 including often used perl routines. These subroutines can be
1958 accessed by prefixing their names with CGIscriptor::, e.g.,
1959 </P>
<PRE>
&lt;SCRIPT TYPE="text/ssperl"&gt;
CGIscriptor::ListDocs("/Books/*") # List all documents in /Books
&lt;/SCRIPT&gt;
</PRE>
1968 It already contains some useful subroutines for Document Management.
1969 As it is a separate package, it has its own namespace, isolated from
1970 both the evaluator and the main program. To access variables from
1971 the document &lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks, use $CGIexecute::&lt;var&gt;.
1972 </P>
1975 Currently, the following functions are implemented
1976 (precede them with CGIscriptor::, see below for more information)
1977 </P>
1979 <UL>
1980 <LI>SAFEqx ('String') -&gt; result of qx/"String"/ # Safe application of ``-quotes<br>
1981 Is used by text/osshell Shell scripts. Protects all CGI
1982 (client-supplied) values with single quotes before executing the
1983 commands (one of the few functions that also works WITHOUT CGIscriptor::
1984 in front)
<LI>defineCGIvariable ($name[, $default]) -&gt; 0/1 (i.e.,
1986 failure/success)<br>
1987 Is used by the META tag to define and initialize CGI and ENV
1988 name/value pairs. Tries to obtain an initializing value from (in
1989 order):<br>
1990 $ENV{$name}<br>
1991 The Query string<br>
1992 The default value given (if any)<br>
1993 (one of the few functions that also works WITHOUT CGIscriptor::
1994 in front)
<LI>CGIsafeFileName (FileName) -&gt; FileName or ""<br>
Check a string against the Allowed File Characters (and ../ /..).
Returns an empty string for unsafe filenames.
<LI>CGIsafeEmailAddress (Email) -&gt; Email or ""<br>
Check a string against correct email address pattern.
Returns an empty string for unsafe addresses.
2001 <LI>RedirectShellScript ('CommandString') -&gt; FILEHANDLER or undef<br>
2002 Open a named PIPE for SAFEqx to receive ALL shell scripts
2003 <LI>URLdecode (URL encoded string) -&gt; plain string # Decode URL encoded argument<br>
2004 <LI>URLencode (plain string) -&gt; URL encoded string # Encode argument as URL code<br>
2005 <LI>CGIparseValue (ValueName [, URL_encoded_QueryString]) -&gt; Decoded value<br>
2006 Extract the value of a CGI variable from the global or a private
2007 URL-encoded query (multipart POST raw, NOT decoded)
2008 <li>CGIparseValueList (ValueName [, URL_encoded_QueryString])
2009 -&gt; List of decoded values.<br>
2010 As CGIparseValue, but now assembles ALL values of ValueName into a list.
<LI>CGIparseHeader (ValueName [, URL_encoded_QueryString]) -&gt; Header<br>
2012 Extract the header of a multipart CGI variable from the global or a private
2013 URL-encoded query ("" when not a multipart variable or absent)
2014 <LI>CGIparseForm ([URL_encoded_QueryString]) -&gt; Decoded Form<br>
2015 Decode the complete global URL-encoded query or a private
2016 URL-encoded query
2017 <LI>read_url(URL)<br>
Returns the page from URL (with added base tag, both FTP and HTTP).
Uses main::GET_URL(URL, 1) to get at the command to read the URL.
2020 <LI>BrowseDirs(RootDirectory [, Pattern, Startdir, CGIname]) # print browsable directories
2021 <LI>ListDocs(Pattern [,ListType]) # Prints a nested HTML directory listing of
2022 all documents, e.g., ListDocs("/*", "dl");.<br>
2023 <LI>HTMLdocTree(Pattern [,ListType]) # Prints a nested HTML listing of all
2024 local links starting from a given document, e.g.,
2025 HTMLdocTree("/Welcome.html", "dl");<br>
2026 </UL>
2028 <A NAME="RESULTSSTACK"><H2 ALIGN="CENTER">THE RESULTS STACK: @CGIscriptorResults</H2></A>
2031 If the pseudo-variable "$CGIscriptorResults" has been defined in a
2032 META tag, all subsequent SCRIPT and META results are pushed
2033 on the @CGIscriptorResults stack. This list is just another
2034 Perl variable and can be used and manipulated like any other list.
2035 $CGIscriptorResults[-1] is always the last result.
2036 This is only of limited use, e.g., to use the results of an OS shell
2037 script inside a Perl script. Will NOT contain the results of Pipes
2038 or code from MIME-profiling.
2039 </P>
<A NAME="CGIPREDEFINED"><H2 ALIGN="CENTER">USEFUL CGI PREDEFINED VARIABLES (DO NOT ASSIGN TO THESE)</H2></A>
2043 <ul>
2044 <li>$CGI_HOME - The ServerRoot directory
2045 <li>$CGI_Decoded_QS - The complete decoded Query String
2046 <li>$CGI_Content_Length - The ACTUAL length of the Query String
2047 <li>$CGI_Date - Current date and time
2048 <li>$CGI_Year $CGI_Month $CGI_Day $CGI_WeekDay - Current Date
2049 <li>$CGI_Time - Current Time
2050 <li>$CGI_Hour $CGI_Minutes $CGI_Seconds - Current Time, split
2051 GMT Date/Time:
2052 <li>$CGI_GMTYear $CGI_GMTMonth $CGI_GMTDay $CGI_GMTWeekDay $CGI_GMTYearDay
2053 <li>$CGI_GMTHour $CGI_GMTMinutes $CGI_GMTSeconds $CGI_GMTisdst
2054 </ul>
<A NAME="ENVIRONMENT"><H2 ALIGN="CENTER">USEFUL CGI ENVIRONMENT VARIABLES</H2></A>
2059 Variables accessible (in APACHE) as $ENV{"&lt;name&gt;"}
2060 (see: "http://hoohoo.ncsa.uiuc.edu/cgi/env.html"):
2061 </P>
2063 <UL>
2064 <LI>QUERY_STRING - The query part of URL, that is, everything that follows the
2065 question mark.
2066 <LI>PATH_INFO - Extra path information given after the script name
2067 <LI>PATH_TRANSLATED - Extra pathinfo translated through the rule system.
2068 (This doesn't always make sense.)
2069 <LI>REMOTE_USER - If the server supports user authentication, and the script is
2070 protected, this is the username they have authenticated as.
2071 <LI>REMOTE_HOST - The hostname making the request. If the server does not have
2072 this information, it should set REMOTE_ADDR and leave this unset
2073 <LI>REMOTE_ADDR - The IP address of the remote host making the request.
2074 <LI>REMOTE_IDENT - If the HTTP server supports RFC 931 identification, then this
2075 variable will be set to the remote user name retrieved from
2076 the server. Usage of this variable should be limited to logging
2077 only.
2078 <LI>AUTH_TYPE - If the server supports user authentication, and the script
2079 is protected, this is the protocol-specific authentication
2080 method used to validate the user.
2081 <LI>CONTENT_TYPE - For queries which have attached information, such as HTTP
2082 POST and PUT, this is the content type of the data.
2083 <LI>CONTENT_LENGTH - The length of the said content as given by the client.
2084 <LI>SERVER_SOFTWARE - The name and version of the information server software
2085 answering the request (and running the gateway).
2086 Format: name/version
2087 <LI>SERVER_NAME - The server's hostname, DNS alias, or IP address as it
2088 would appear in self-referencing URLs
2089 <LI>GATEWAY_INTERFACE - The revision of the CGI specification to which this
2090 server complies. Format: CGI/revision
2091 <LI>SERVER_PROTOCOL - The name and revision of the information protocol this
2092 request came in with. Format: protocol/revision
2093 <LI>SERVER_PORT - The port number to which the request was sent.
2094 <LI>REQUEST_METHOD - The method with which the request was made. For HTTP,
2095 this is "GET", "HEAD", "POST", etc.
2096 <LI>SCRIPT_NAME - A virtual path to the script being executed, used for
2097 self-referencing URLs.
2098 <LI>HTTP_ACCEPT - The MIME types which the client will accept, as given by
2099 HTTP headers. Other protocols may need to get this
2100 information from elsewhere. Each item in this list should
2101 be separated by commas as per the HTTP spec.
2102 Format: type/subtype, type/subtype
2103 <LI>HTTP_USER_AGENT - The browser the client is using to send the request.
2104 General format: software/version library/version.
2105 </UL>
2107 <A NAME="RUNNING"><H2 ALIGN="CENTER">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</H2></A>
2110 CGIscriptor.pl will run on any WWW server that runs Perl scripts,
2111 just add a line like the following to your srm.conf file
2112 (Apache example):
2113 </P>
<pre>
ScriptAlias /SHTML/ /real-path/CGIscriptor.pl/
</pre>
URLs that refer to http://www.your.address/SHTML/... will now be handled
2121 by CGIscriptor.pl, which can use a private directory tree (default is the
2122 DOCUMENT_ROOT directory tree, but it can be anywhere, see manual).
2123 </P>
2126 If your hosting ISP won't let you add ScriptAlias lines you can use
2127 the following "rewrite"-based "scriptalias" in .htaccess
2128 (from Gerd Franke)
2129 </P>
<pre>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} .html$
RewriteCond %{SCRIPT_FILENAME} !cgiscriptor.pl$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ /cgi-bin/cgiscriptor.pl/$1?%{QUERY_STRING}
</Pre>
Everything with the extension ".html" and not including "cgiscriptor.pl"
in the url and where the file "path/filename.html" exists is redirected
to "/cgi-bin/cgiscriptor.pl/path/filename.html?query".
2144 The user configuration should get the same path-level as the
2145 .htaccess-file:
2146 </P>
<pre>
# Just enter your own directory path here
$YOUR_HTML_FILES = "$ENV{'DOCUMENT_ROOT'}";
# use DOCUMENT_ROOT only, if .htaccess lies in the root-directory.
</Pre>
2155 If this .htaccess goes in a specific directory, the path to this
2156 directory must be added to $ENV{'DOCUMENT_ROOT'}.
2157 </p>
2160 The CGIscriptor file contains all documentation as comments. These comments
can be removed to speed up loading (e.g., `egrep -v '^#' CGIscriptor.pl &gt;
leanScriptor.pl`). A bare bones version of CGIscriptor.pl, lacking
2163 documentation, most comments, access control, example functions etc.
2164 (but still with the copyright notice and some minimal documentation)
2165 can be obtained by calling CGIscriptor.pl on the command line with the
2166 '-slim' command line argument, e.g.,
2167 </p>
<PRE>
&gt;CGIscriptor.pl -slim &gt; slimCGIscriptor.pl
</PRE>
2174 CGIscriptor.pl can be run from the command line with &lt;path&gt; and &lt;query&gt; as
2175 arguments, as `CGIscriptor.pl &lt;path&gt; &lt;query&gt;`, inside a perl script with
2176 'do CGIscriptor.pl' after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING},
2177 or CGIscriptor.pl can be loaded with 'require "/real-path/CGIscriptor.pl"'.
2178 In the latter case, requests are processed by 'Handle_Request();'
2179 (again after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING}).
2180 </P>
2183 The --help command line switch will print the manual.
2184 </p>
2187 Using the command line execution option, CGIscriptor.pl can be used as a document
2188 (meta-)preprocessor. If the first argument is '-', STDIN will be read. For example:
2189 </P>
<PRE>
&gt; cat MyDynamicDocument.html | CGIscriptor.pl - '[QueryString]' &gt; MyStaticFile.html
</PRE>
This command line will produce a STATIC file with the DYNAMIC content of
MyDynamicDocument.html "interpolated". This option would be very dangerous when
available over the internet. If someone could sneak a
'http://www.your.domain/-' URL past your server, CGIscriptor could EXECUTE
any POSTED content. Therefore, for security reasons, STDIN will NOT
2201 be read if ANY of the HTTP server environment variables is set (e.g., SERVER_PORT,
2202 SERVER_PROTOCOL, SERVER_NAME, SERVER_SOFTWARE, HTTP_USER_AGENT,
2203 REMOTE_ADDR).<br>
This block on processing STDIN on HTTP requests can be lifted by setting
<pre>
$BLOCK_STDIN_HTTP_REQUEST = 0;
</pre>
in the security configuration. But be careful when doing this.
It can be very dangerous.
2210 </P>
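<p>
The rule can be summarized in a few lines. The following JavaScript sketch is
only a model of the behaviour described above: the function name
<em>mayReadStdin</em> is hypothetical, and the list of variables contains just
the examples named in the text, which may not be the complete list checked by
CGIscriptor.pl.
</p>

```javascript
// Environment variables that indicate the script runs under an HTTP server
// (only the examples named in the manual; the real list may be longer).
const HTTP_SERVER_VARIABLES = [
  "SERVER_PORT", "SERVER_PROTOCOL", "SERVER_NAME",
  "SERVER_SOFTWARE", "HTTP_USER_AGENT", "REMOTE_ADDR"
];

// Hypothetical model: STDIN is read only when the first argument is '-'
// and, unless the block is lifted, no HTTP server variable is set.
function mayReadStdin(firstArgument, env, blockStdinHttpRequest = true) {
  if (firstArgument !== "-") return false;
  const underHttpServer =
    HTTP_SERVER_VARIABLES.some((name) => env[name] !== undefined);
  return !(blockStdinHttpRequest && underHttpServer);
}

console.log(mayReadStdin("-", {}));                           // true
console.log(mayReadStdin("-", { REMOTE_ADDR: "127.0.0.1" })); // false
```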
Running demos and more information can be found at
2214 http://www.fon.hum.uva.nl/~rob/OSS/OSS.html
2215 </P>
2218 A pocket-size HTTP daemon, CGIservlet.pl, is available from my web site
2219 or CPAN that can use CGIscriptor.pl as the base of a µWWW server and
2220 demonstrates its use.
2221 </P>
2223 <A NAME="NON-UNIX"><H2 ALIGN="CENTER">NON-UNIX PLATFORMS</H2></A>
2226 CGIscriptor.pl was mainly developed and tested on UNIX. However, as I
2227 coded part of the time on an Apple Macintosh under MacPerl, I made sure
CGIscriptor did run under MacPerl (with command line options). But only as
an independent script, not as part of an HTTP server. I have used it
2230 under Apache in Windows XP.
2231 </P>
2233 <A NAME="license"><H2 ALIGN="CENTER">license</H2></A>
2236 This program is free software; you can redistribute it and/or
2237 modify it under the terms of the GNU General Public License
2238 as published by the Free Software Foundation; either version 2
2239 of the License, or (at your option) any later version.
2240 </P>
2243 This program is distributed in the hope that it will be useful,
2244 but WITHOUT ANY WARRANTY; without even the implied warranty of
2245 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
2246 GNU General Public License for more details.
2247 </P>
2250 You should have received a copy of the GNU General Public License
2251 along with this program; if not, write to the Free Software
2252 Foundation, Inc., 59 Temple Place - Suite 330,
2253 Boston, MA 02111-1307, USA.
2254 </P>
<PRE>
Author: Rob van Son
email:
R.J.J.H.vanSon@gmail.com
University of Amsterdam

Date: May 22, 2000
Ver: 2.0
Env: Perl 5.002
</PRE>
2266 </BODY>
2268 </HTML>