CGIscriptor.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
   2 <HTML>
   3
   4 <HEAD>
   5
   6 <TITLE>CGIscriptor 2.4 Manual</TITLE>
   7
   8
   9 </HEAD>
  10
  11 <BODY>
  12
  13 <H1 ALIGN="CENTER">
  14 <I>CGIscriptor 2.4</I>: An implementation of integrated server side CGI scripts
  15 </H1>
  16
  17 <UL>
  18     <P>
  19     <LI><A HREF="#HYPE">HYPE</A>
  20     <LI><A HREF="#HOWITWORKS">THIS IS HOW IT WORKS</A>
  21     <LI><A HREF="#HTML4">HTML 4 COMPLIANCE</A>
  22     <LI><A HREF="#SECURITY">SECURITY</A>
  23     </P>
  24
  25     <P>
  26     <LI><A HREF="#MANUAL">USER MANUAL</A>
  27     <UL>
  28         <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
  29         <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
  30         <LI><A HREF="#BINFILES">NON-HTML FILES</A>
  31         <LI><A HREF="#META">THE META TAG</A>
  32         <LI><A HREF="#DIV">THE DIV/INS TAG</A>
  33         <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
  34         <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
  35         <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
  36         <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
  37         <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
  38         <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
  39         <LI><A HREF="#APPLIC">APPLICATION MIME TYPES</A>
  40         <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
  41         <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
  42         <LI><A HREF="#SESSIONTICKETS">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</A>
  43         <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
  44         <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
  45         <LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
  46         <LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
  47         <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
  48         <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
  49     </UL>
  50     <LI><A HREF="#license">license</A>
  51     </P>
  52
  53     </UL>
  54
  55 <A NAME="HYPE"><H2 ALIGN="CENTER">HYPE</H2></A>
  56
  57 <P>
  58 CGIscriptor merges plain ASCII HTML files transparently and safely
  59 with CGI variables, in-line PERL code, shell commands, and executable
  60 scripts in many languages (on-line and real-time). It combines the
  61 "ease of use" of HTML files with the versatility of specialized
  62 scripts and PERL programs. It hides all the specifics and
  63 idiosyncrasies of correct output and CGI coding and naming. Scripts
  64 do not have to be aware of HTML, HTTP, or CGI conventions just as HTML
  65 files can be ignorant of scripts and the associated values. CGIscriptor
  66 complies with the W3C HTML 4.0 recommendations.
  67 </P>
  68
  69 <P>
  70 In addition to its use as a WWW embedded CGI processor, it can
  71 be used as a command-line document preprocessor (text-filter).
  72 </P>
  73
  74 <A NAME="HOWITWORKS"><H2 ALIGN="CENTER">THIS IS HOW IT WORKS</H2></A>
  75
  76 <P>
  77 The aim of CGIscriptor is to execute "plain" scripts inside a text file
  78 using any required CGI parameters and environment variables. It
  79 is optimized to transparently process HTML files inside a WWW server.
  80 The native language is Perl, but many other scripting languages
  81 can be used.
  82 </P>
  83
  84 <P>
  85 CGIscriptor reads text from the requested input file (i.e., from
  86 $YOUR_HTML_FILES$PATH_INFO) and writes them to &lt;STDOUT&gt; (i.e., the client
  87 requesting the service) preceded by the obligatory
  88 "Content-type: text/html\n\n" or "Content-type: text/plain\n\n" string
  89 (except for "raw" files which supply their own Content-type message
  90 and only if the SERVER_PROTOCOL contains HTTP, FTP, GOPHER, MAIL, or MIME).
  91 </P>
  92
  93 <P>
  94 When CGIscriptor encounters an embedded script, indicated by an HTML4 tag
  95 </P>
  96
  97 <PRE>
  98 &lt;SCRIPT TYPE="text/ssperl" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
  99 PERL script
 100 &lt;/SCRIPT&gt;
 101 </PRE>
 102
 103 or
 104
 105 <PRE>
 106 &lt;SCRIPT TYPE="text/osshell" [CGI="$name='default value'"] [SRC="ScriptSource"]&gt;
 107 OS Shell script
 108 &lt;/SCRIPT&gt;
 109 </PRE>
 110
 111 <P>
 112 construct (anything between []-brackets is optional, other MIME-types are
 113 supported), the embedded script is removed and both the contents of the
 114 source file (i.e., "do 'ScriptSource'") AND the script are evaluated as a
 115 PERL program (i.e., by eval()), a shell script (i.e., by a "safe" version
 116 of `Command`, qx) or an external interpreter. The output of the eval()
 117 function takes the place of the original &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
 118 construct in the output string. Any CGI parameters declared by the CGI
 119 attribute are available as simple perl variables, and can subsequently
 120 be made available as variables to other scripting languages (e.g., bash,
 121 python, or lisp).
 122 </P>
 123
 124 <P>
 125 Example: printing "Hello World"
 126 </P>
 127
 128 <PRE>
 129 &lt;HTML&gt;&lt;HEAD&gt;&lt;TITLE&gt;Hello World&lt;/TITLE&gt;
 130 &lt;BODY&gt;
 131 &lt;H1&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;"Hello World"&lt;/SCRIPT&gt;&lt;/H1&gt;
 132 &lt;/BODY&gt;&lt;/HTML&gt;
 133 </PRE>
 134
 135 <P>
 136 Save this in a file, hello.html, in the directory you indicated with
 137 $YOUR_HTML_FILES and access http://your_server/SHTML/hello.html
 138 (or to whatever name you use as an alias for CGIscriptor.pl).
 139 This is really ALL you need to do to get going.
 140 </P>
 141
 142 <P>
 143 You can use any values that are delivered in CGI-compliant form (i.e.,
 144 the "?name=value" type URL additions) transparently as "$name" variables
 145 in your scripts IFF you have declared them in a META or SCRIPT tag before, e.g.:
 146 </P>
 147
 148 <PRE>
 149 &lt;META CONTENT="text/ssperl; CGI='$name = `default value`'
 150 [SRC='ScriptSource']"&gt;
 151 </PRE>
 152 or
 153 <PRE>
 154 &lt;SCRIPT TYPE=text/ssperl CGI="$name = 'default value'"
 155 [SRC='ScriptSource']&gt;
 156 </PRE>
 157
 158 <P>
 159 After such a 'CGI' attribute, you can use $name as an ordinary PERL variable
 160 (the ScriptSource file is immediately evaluated with "do 'ScriptSource'").
 161 The CGIscriptor script allows you to write ordinary HTML files which will
 162 include dynamic CGI aware (run time) features, such as on-line answers
 163 to specific CGI requests, queries, or the results of calculations.
 164 </P>
 165
 166 <P>
 167 For example, if you wanted to answer questions of clients, you could write
 168 a Perl program called "Answer.pl" with a function "AnswerQuestion()"
 169 that prints out the answer to requests given as arguments. You then write
 170 a HTML page "Respond.html" containing the following fragment:
 171 </P>
 172
 173 <hr>
 174 <PRE>
 175 &lt;CENTER&gt;
 176 The Answer to your question
 177 &lt;META CONTENT="text/ssperl; CGI='$Question'"&gt;
 178 &lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl"&gt;$Question&lt;/SCRIPT&gt;&lt;/h3&gt;
 179 is
 180 &lt;h3&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./PATH/Answer.pl"&gt;
 181 AnswerQuestion($Question);
 182 &lt;/SCRIPT&gt;&lt;/h3&gt;
 183 &lt;/CENTER&gt;
 184 &lt;FORM ACTION=Respond.html METHOD=GET&gt;
 185 Next question: &lt;INPUT NAME="Question" TYPE=TEXT SIZE=40&gt;&lt;br&gt;
 186 &lt;INPUT TYPE=SUBMIT VALUE="Ask"&gt;
 187 &lt;/FORM&gt;
 188 </PRE>
 189 <hr>
 190
 191 <P>
 192 The output could look like the following (in HTML-speak):
 193 </P>
 194
 195 <hr>
 196 <PRE>
 197 <CENTER>
 198 The Answer to your question
 199 <h3>What is the capital of the Netherlands?</h3>
 200 is
 201 <h3>Amsterdam</h3>
 202 </CENTER>
 203 <FORM ACTION=Respond.html METHOD=GET>
 204 Next question: <INPUT NAME="Question" TYPE=TEXT SIZE=40><br>
 205 <INPUT TYPE=SUBMIT VALUE="Ask"></FORM>
 206 </PRE>
 207 <hr>
 208
 209 <P>
 210 Note that the program "Answer.pl" knows nothing about CGI or HTML;
 211 it just prints out answers to arguments. Likewise, the text has no
 212 provisions for scripts or CGI-like constructs. Also, it is completely
 213 trivial to extend this "program" to use the "Answer" later in the page
 214 to call up other information or pictures/sounds. The final text never
 215 shows any cue as to what the original "source" looked like, i.e.,
 216 where you store your scripts and how they are called.
 217 </P>
 218
 219 <P>
 220 There are some extras. The files called in a SRC= attribute
 221 can access the CGI variables declared in the preceding META tag through
 222 the @ARGV array. Executable files are called as:
 223 `file '$ARGV[0]' ... ` (e.g., `Answer.pl \'$Question\'`;)
 224 The files called from SRC can even be (CGIscriptor) html files which are
 225 processed in-line. Furthermore, the SRC= attribute can contain a perl block
 226 that is evaluated. That is,
 227 </P>
 228
 229 <PRE>
 230 &lt;META CONTENT="text/ssperl; CGI='$Question' SRC='{$Question}'"&gt;
 231 </PRE>
 232
 233 <P>
 234 will result in the evaluation of "print do {$Question};" and the VALUE
 235 of $Question will be printed. Note that these "SRC-blocks" can be
 236 preceded and followed by other file names, but only a single block is
 237 allowed in a SRC= attribute.
 238 </P>
 239
 240 <p>
 241 One of the major hassles of dynamic WWW pages is the fact that several
 242 mutually incompatible browsers and platforms must be supported. For example,
 243 the way sound is played automatically is different for Netscape and
 244 Internet Explorer, and for each browser it is different again on
 245 Unix, MacOS, and Windows. Really dangerous is processing user-supplied
 246 (form-) values to construct email addresses, file names, or database
 247 queries. All Apache WWW-server exploits reported in the media are
 248 based on faulty CGI-scripts that didn't check their user-data properly.
 249 </p>
 250
 251 <p>
 252 There is no panacea for these problems, but a lot of work and trouble
 253 can be saved by allowing easy and transparent control over which
 254 &lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks are executed on what CGI-data. CGIscriptor
 255 supplies such a method in the form of a pair of attributes:
 256 IF='...condition..' and UNLESS='...condition...'. When added to a
 257 script tag, the whole block (including the SRC attribute) will be
 258 ignored if the condition is false (IF) or true (UNLESS).
 259 For example, the following block will NOT be evaluated if the value
 260 of the  CGI variable FILENAME is NOT a valid filename:
 261 </p>
 262
 263 <pre>
 264 &lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME' IF='CGIscriptor::CGIsafeFileName($FILENAME)'&gt;
 265 .....
 266 &lt;/SCRIPT&gt;
 267 </pre>
 268
 269 <p>
 270 (the function CGIsafeFileName(String) returns an empty string ("")
 271 if the String argument is not a valid filename).
 272 The UNLESS attribute is the mirror image of IF.
 273 </p>
 274
 275 <P>
 276 A user manual follows the HTML 4 and security paragraphs below.
 277 </P>
 278
 279
 280 <A NAME="HTML4"><H2 ALIGN="CENTER">HTML 4 COMPLIANCE</H2></A>
 281
 282 <P>
 283 In general, CGIscriptor.pl complies with the HTML 4 recommendations of
 284 the W3C. This means that any software to manage Web sites will be able
 285 to handle CGIscriptor files, as will web agents.
 286 </P>
 287
 288 <P>
 289 All script code should be placed between &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, the
 290 script type is indicated with TYPE="mime-type", the LANGUAGE
 291 feature is ignored, and a SRC feature is implemented.  All CGI specific
 292 features are delegated to the CGI attribute.
 293 </P>
 294
 295 <P>
 296 However, the behavior deviates from the W3C recommendations at some
 297 points. Most notably:
 298 </P>
 299
 300 <DL>
 301     <dt>0- The scripts are executed at the server side, invisible to the
 302     client (i.e., the browser)
 303     <dt>1- The mime-types are personal and idiosyncratic, but can be adapted.
 304     <dt>2- Code in the body of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tag-pair is still evaluated
 305     when a SRC feature is present.
 306     <dt>3- The SRC feature reads a list of files.
 307     <dt>4- The files in a SRC feature are processed according to file type.
 308     <dt>5- The SRC feature evaluates inline Perl code.
 309     <dt>6- Processed META, INS, and DIV tags are removed from the output document.
 310     <dt>7- All attributes of the processed META tags, except CONTENT, are ignored
 311     (i.e., deleted from the output).
 312     <dt>8- META tags can be placed ANYWHERE in the document.
 313     <dt>9- Through the SRC feature, META tags can have visible output in the
 314     document.
 315     <dt>10- The CGI attribute that declares CGI parameters, can be used
 316     inside the &lt;SCRIPT&gt; tag.
 317     <dt>11- Use of an extended quote set, i.e., '', "", ``, (), {}, []
 318      and their \-slashed combinations: \'\', \"\", \`\`, \(\),
 319      \{\}, \[\].
 320     <dt>12- IF and UNLESS attributes to &lt;SCRIPT&gt;, &lt;META&gt;,
 321             &lt;INS&gt;, and &lt;DIV&gt; tags.
 322     <dt>13- &lt;DIV&gt; tags cannot be nested, &lt;DIV&gt; tags are not
 323         rendered with new-lines.
 324     <dt>14- The XML style &lt;TAG .... /&gt; is recognized and handled correctly.
 325         (i.e., no content is processed)
 326 </DL>
 327
 328 <P>
 329 The reasons for these choices are:
 330 </P>
 331
 332 <P>
 333 You can still write completely HTML4 compliant documents. CGIscriptor
 334 will not force you to write "deviant" code. However, it allows you to
 335 do so (which is, in fact, just as bad). The prime design principle
 336 was to allow users to include plain Perl code. The code itself should
 337 be "enhancement free". Therefore, extra features were needed to
 338 supply easy access to CGI and Web site components. For security
 339 reasons these have to be declared explicitly. The SRC feature
 340 transparently manages access to external files, especially the safe
 341 use of executable files.
 342 </P>
 343
 344 <P>
 345 The CGI attribute handles the declarations of external (CGI) variables
 346 in the SCRIPT and META tags.<BR>
 347 EVERYTHING THE CGI ATTRIBUTE AND THE META TAG DO CAN BE DONE INSIDE
 348 A &lt;SCRIPT&gt;&lt;/SCRIPT&gt; TAG CONSTRUCT.
 349 </P>
 350
 351 <P>
 352 The reason the IF, UNLESS, and SRC attributes (and their Perl code evaluation)
 353 were built into the META and SCRIPT tags is part laziness, part security. The SRC
 354 blocks allow more compact documents and easier debugging. The values of the
 355 CGI variables can be immediately screened for security by IF or UNLESS
 356 conditions, and even SRC attributes (e.g., email addresses and file names), and
 357 a few commands can be called without having to add another Perl TAG pair.
 358 This is especially important for documents that require the use of other
 359 (restricted) "scripting" languages that lack transparent control structures.
 360 </P>
 361
 362
 363 <A NAME="SECURITY"><H2 ALIGN="CENTER">SECURITY</H2></A>
 364
 365 <P>
 366 Your WWW site is a few keystrokes away from a few hundred million internet
 367 users. A fair percentage of these users know more about your computer
 368 than you do. And some of these just might have bad intentions.
 369 </P>
 370
 371 <P>
 372 To ensure uncompromised operation of your server and platform, several
 373 features are incorporated in CGIscriptor.pl to enhance security.
 374 First of all, you should check the source of this program. No security
 375 measures will help you when you download programs from anonymous sources.
 376 If you want to use THIS file, please make sure that it is uncompromised.
 377 The best way to do this is to contact the source and try to determine
 378 whether s/he is reliable (and accountable).
 379 </P>
 380
 381 <P>
 382 BE AWARE THAT ANY PROGRAMMER CAN CHANGE THIS PROGRAM IN SUCH A WAY THAT
 383 IT WILL SET THE DOORS TO YOUR SYSTEM WIDE OPEN
 384 </P>
 385
 386 <P>
 387 I would like to ask any user who finds bugs that could compromise
 388 security to report them to me (and any other bug too,
 389 Email: R.J.J.H.vanSon@gmail.com or ifa@hum.uva.nl).
 390 </P>
 391
 392 <H2 ALIGN="CENTER">Security features</H2>
 393
 394 <dl>
 395 <dt>1 Invisibility
 396 <dd>The inner workings of the HTML source files are completely hidden
 397 from the client. Only the HTTP header and the ever changing content
 398 of the output distinguish it from the output of a plain, fixed HTML
 399 file. Names, structures, and arguments of the "embedded" scripts
 400 are invisible to the client. Error output is suppressed except
 401 during debugging (user configurable).
 402
 403 <dt>2 Separate directory trees
 404 <dd>Directories containing Inline text and script files can reside on
 405 separate trees, distinct from those of the HTTP server. This means
 406 that NEITHER the text files, NOR the script files can be read by
 407 clients other than through CGIscriptor.pl, UNLESS they are
 408 EXPLICITLY made available.
 409
 410 <dt>3 Requests are NEVER "evaluated"
 411 <dd>All client supplied values are used as literal values (''-quoted).
 412 Client supplied ''-quotes are ALWAYS removed. Therefore, as long as the
 413 embedded scripts do NOT themselves evaluate these values, clients CANNOT
 414 supply executable commands. Be sure to AVOID scripts like:
 415
 416 <PRE>
 417 &lt;META CONTENT="text/ssperl; CGI='$UserValue'"&gt;
 418 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 $UserValue`;&lt;/SCRIPT&gt;
 419 </PRE>
 420
 421 <P>
 422 These are a recipe for disaster. However, the following quoted
 423 form should be safe (but is still not advised):
 424 </P>
 425
 426 <PRE>
 427 &lt;SCRIPT TYPE="text/ssperl"&gt;$dir = `ls -1 \'$UserValue\'`;&lt;/SCRIPT&gt;
 428 </PRE>
 429
 430 <P>
 431 A special function, SAFEqx(), will automatically do exactly this,
 432 e.g., SAFEqx('ls -1 $UserValue') will execute `ls -1 \'$UserValue\'`
 433 with $UserValue interpolated. I recommend using SAFEqx() instead
 434 of backticks whenever you can. The OS shell scripts inside
 435 </P>
 436
 437 <PRE>
 438 &lt;SCRIPT TYPE="text/osshell"&gt;ls -1 $UserValue&lt;/SCRIPT&gt;
 439 </PRE>
 440
 441 <P>
 442 are handled by SAFEqx and automatically ''-quoted.
 443 </P>
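
<P>
As a rough illustration of this quoting rule, here is a minimal sketch in
plain Perl. It is NOT the SAFEqx() source, and the helper name
quote_client_value is made up for this example. It only shows the idea
described above: client supplied ''-quotes are removed and the remaining
value is passed to the shell as one ''-quoted word.
</P>

<PRE>
use strict;
use warnings;

# Hypothetical helper, not the SAFEqx() source: quote one client value
# so that a shell treats it as a single literal argument.
sub quote_client_value {
    my ($value) = @_;
    $value =~ s/'//g;            # client supplied ''-quotes are always removed
    return "'" . $value . "'";   # the rest stays inside one ''-quoted word
}

my $UserValue = "docs; rm -rf '/tmp/x'";
print "ls -1 ", quote_client_value($UserValue), "\n";
# prints: ls -1 'docs; rm -rf /tmp/x'
</PRE>

<P>
Because the value can no longer close the ''-quotes, the ';' and the text
after it reach the command as part of one literal argument instead of
being run as a second command.
</P>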
 444
 445 <dt>4 Logging of requests
 446 <dd>All requests can be logged separately from the Host server. The level of
 447 detail is user configurable: including or excluding the actual queries.
 448 This allows for the inspection of (im-) proper use.
 449
 450 <dt>5 Access control: Clients
 451 <dd>The Remote addresses can be checked against a list of authorized
 452 (i.e., accepted) or non-authorized (i.e., rejected) clients. Both
 453 REMOTE_HOST and REMOTE_ADDR are tested so clients without a proper
 454 HOST name can be (in-) excluded by their IP-address. Client patterns
 455 containing all numbers and dots are considered IP-addresses, all others
 456 domain names. No wild-cards or regexps are allowed, only partial
 457 addresses.<br>
 458 Matching of names is done from the back to the front (domain first,
 459 i.e., $REMOTE_HOST =~ /\Q$pattern\E$/is), so including ".edu" will
 460 accept or reject all clients from the domain EDU. Matching of
 461 IP-addresses is done from the front to the back (domain first, i.e.,
 462 $REMOTE_ADDR =~ /^\Q$pattern\E/is), so including "128." will (in-)
 463 exclude all clients whose IP-address starts with 128.
 464 There are two special symbols: "-" matches HOSTs with no name and "*"
 465 matches ALL HOSTS/clients.<br>
 466
 467 <P>
 468 For those needing more expressive power, lines starting with
 469 "-e" are evaluated by the perl eval() function. E.g.,
 470 '-e $REMOTE_HOST =~ /\.edu$/is;' will accept/reject clients from the
 471 domain '.edu'.
 472 </P>
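
<P>
The matching rules for plain client patterns can be sketched as follows.
This is a minimal sketch in plain Perl and not CGIscriptor's own code: the
helper name client_matches is hypothetical, and it assumes that a host
without a name shows up as an empty REMOTE_HOST. The two regular
expressions are the ones quoted above.
</P>

<PRE>
use strict;
use warnings;

# Hypothetical helper, not CGIscriptor's own code: test one client pattern.
sub client_matches {
    my ($pattern, $host, $addr) = @_;
    $host = '' unless defined $host;
    return 1 if $pattern eq '*';                      # "*" matches all clients
    return ($host eq '' ? 1 : 0) if $pattern eq '-';  # "-" matches hosts without a name
    if ($pattern =~ /^[0-9.]+$/) {                    # only digits and dots: IP address
        return $addr =~ /^\Q$pattern\E/is ? 1 : 0;    # matched from the front
    }
    return $host =~ /\Q$pattern\E$/is ? 1 : 0;        # domain name: matched from the back
}

print client_matches('.edu', 'www.example.edu', '192.0.2.7'), "\n";     # prints 1
print client_matches('128.', 'host.example.org', '128.16.0.1'), "\n";   # prints 1
print client_matches('128.', 'host.example.org', '10.128.0.1'), "\n";   # prints 0
</PRE>

<P>
Note that the \Q...\E quoting keeps the dots in a pattern literal, which
is why no wild-cards or regexps are possible in these plain patterns.
</P>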
 473
 474 <dt>6 Access control: Files
 475 <dd>In principle, CGIscriptor could read ANY file in the directory
 476 tree as discussed in 2. However, for security reasons this is
 477 restricted to text files. It can be made more restricted by entering
 478 a global file pattern (e.g., ".html"). This is done by default.
 479 For each client requesting access, the file pattern(s) can be made
 480 more restrictive than the global pattern by entering client specific
 481 file patterns in the Access Control files (see 5).
 482 For example: if the ACCEPT file contained the lines
 483
 484 <PRE>
 485 *           DEMO
 486 .hum.uva.nl LET
 487 145.18.230.
 488 </PRE>
 489
 490 <P>
 491 Then all clients could request paths containing "DEMO" or "demo", e.g.
 492 "/my/demo/file.html" ($PATH_INFO =~ /\Q$pattern\E/). Clients from
 493 *.hum.uva.nl could also request paths containing "LET" or "let", e.g.
 494 "/my/let/file.html", and clients from the local cluster
 495 145.18.230.[0-9]+ could access ALL files.
 496 Again, for those needing more expressive power, lines starting with
 497 "-e" are evaluated. For instance: <br />
 498 '-e $REMOTE_HOST =~ /\.edu$/is && $PATH_INFO =~ m@/DEMO/@is;' <br />
 499 will accept/reject requests for files from the directory "/demo/" from
 500 clients from the domain '.edu'.<br />
 501 Path selections starting with ! or 'not' will be inverted. That is:
 502 </p>
 503 <PRE>
 504 *           not .wav
 505 </PRE>
 506 <p>
 507 will match all file and path names that do NOT contain '.wav'.
 508 </P>
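
<P>
A path selection with optional inversion could be tested along these
lines. This is a minimal sketch in plain Perl, not CGIscriptor's own
code: the helper name path_selected is hypothetical, and the
case-insensitive comparison is an assumption taken from the "DEMO" or
"demo" example above.
</P>

<PRE>
use strict;
use warnings;

# Hypothetical helper, not CGIscriptor's own code: test one path selection.
sub path_selected {
    my ($selection, $path) = @_;
    # A leading "!" or "not" inverts the selection.
    my $invert = ($selection =~ s/^\s*(?:!|not\s+)\s*//i) ? 1 : 0;
    # Literal substring test; an empty selection is found in every path.
    my $hit = (index(lc($path), lc($selection)) != -1) ? 1 : 0;
    return $invert ? 1 - $hit : $hit;
}

print path_selected('DEMO', '/my/demo/file.html'), "\n";       # prints 1
print path_selected('not .wav', '/sounds/beep.wav'), "\n";     # prints 0
print path_selected('not .wav', '/my/demo/file.html'), "\n";   # prints 1
</PRE>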
 509 <p>
 510 <dt>7 Access control: Server side session tickets
 511 <dd>Specific paths can be controlled by Session Tickets which must be
 512 present as a CGI or Cookie value in the request. These paths
 513 are defined in %TicketRequiredPatterns as pairs of:<br />
 514 ('regexp' =&gt; 'SessionPath\tPasswordPath\tLogin.html\tExpiration').<br />
 515 Session Tickets are stored in a separate directory (SessionPath, e.g.,
 516 "Private/.Session") as files with exactly the same name as the value of
 517 the TICKET variable.
 518 The following is an example of a SESSION ticket:
 519 <pre>
 520 Type: SESSION
 521 IPaddress: 127.0.0.1
 522 AllowedPaths: ^/Private/Name/
 523 DeniedPaths: ^/Private/CreateUser\.
 524 Expires: +3600
 525 Username: test
 526 ...
 527 </pre>
 528 Other content can follow. <br />
 529 <br />
 530 It is advised that Session Tickets should expire and be deleted
 531 after some (idle) time. The IP address should be the IP number at login, and
 532 the ticket will be rejected if it is presented from another IP address.
 533 AllowedPaths and DeniedPaths are perl regexps. Be careful how they match. Make sure to delimit
 534 the names to prevent access to overlapping names, e.g., "^/Private/Rob" will also
 535 match "/Private/Robert", however, "^/Private/Rob/" will not. Expires is the
 536 time the ticket will remain valid after creation (file ctime). Time can be given
 537 in s[econds] (default), m[inutes], h[ours], or d[ays], e.g., "24h" means 24 hours.
 538 Only the <em>Type:</em> field needs to be present.<br />
 539 <br />
 540 Next to Session Tickets, there are four other types of ticket files:<br />
 541 - LOGIN tickets store information about a current login request<br />
 542 - PASSWORD tickets store account information to authorize login requests<br />
 543 - IPADDRESS tickets for IP address-only checks<br />
 544 - CHALLENGE tickets for challenge tasks for every request<br />
 545 </p>
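
<P>
To make the Expires notation concrete, here is a minimal sketch in plain
Perl that converts such a value into seconds. It is not CGIscriptor's own
parser: the helper name expires_to_seconds is hypothetical, and accepting
an optional leading "+" is an assumption based on the "+3600" in the
example ticket.
</P>

<PRE>
use strict;
use warnings;

# Hypothetical helper, not CGIscriptor's own parser: turn an Expires value
# such as "+3600", "24h", or "2 days" into a number of seconds.
sub expires_to_seconds {
    my ($value) = @_;
    # optional "+", digits, then an optional unit starting with s, m, h, or d
    return unless $value =~ /^\+?(\d+)\s*(?:([smhd])[a-z]*)?$/i;
    my ($count, $unit) = ($1, defined($2) ? lc($2) : 's');   # seconds are the default
    my %seconds_per = ('s', 1, 'm', 60, 'h', 3600, 'd', 86400);
    return $count * $seconds_per{$unit};
}

print expires_to_seconds('+3600'), "\n";   # prints 3600
print expires_to_seconds('24h'), "\n";     # prints 86400
</PRE>

<P>
A ticket would then count as expired once the current time is later than
the creation time of the ticket file plus this number of seconds.
</P>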
 546 <P>
 547 <dt>8 Query length limiting
 548 <dd>The length of the Query string can be limited. If CONTENT_LENGTH is larger
 549 than this limit, the request is rejected. The combined length of the
 550 Query string and the POST input is checked before any processing is done.
 551 This will prevent clients from overloading the scripts.
 552 The actual, combined, Query Size is accessible as a variable through
 553 $CGI_Content_Length.
 554 </P>
 555
 556 <P>
 557 <dt>9 Illegal filenames, paths, and protected directories
 558 <dd>One of the primary security concerns in handling CGI-scripts is the
 559 use of "funny" characters in the requests that con scripts into executing
 560 malicious commands. Examples are inserting ';', null bytes, or &lt;newline&gt; characters
 561 in URLs and filenames, followed by executable commands. A special
 562 variable $FileAllowedChars stores a string of all allowed characters.
 563 Any request that translates to a filename with a character OUTSIDE
 564 this set will be rejected.<br>
 565 In general, all (readable text) files in the DocumentRoot tree are accessible.
 566 By default, executable files are rejected, but this can be reversed by setting
 567 the environment variable $ENV{USEFAT}=1 ($useFAT = 1). This allows using
 568 CGIscriptor on MS FAT filesystems.
 569 Full access to the tree might not be what you want. For instance, your ServerRoot directory
 570 might be the working directory of a CVS project and contain sensitive
 571 information (e.g., the password to get to the repository). You can block
 572 access to these subdirectories by adding the corresponding patterns to
 573 the $BlockPathAccess variable. For instance, $BlockPathAccess = '/CVS/'
 574 will block any request that contains '/CVS/', i.e.:<br>
 575 <pre>
 576 die if $BlockPathAccess && $ENV{'PATH_INFO'} =~ m@$BlockPathAccess@;
 577 </pre>
 578 </P>
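
<P>
The two checks of this item can be sketched together as follows. This is
a minimal sketch in plain Perl, not the code used inside CGIscriptor.pl:
the helper name path_is_acceptable is hypothetical, and the values given
to $FileAllowedChars and $BlockPathAccess are example values only (here
$FileAllowedChars is assumed to hold the inside of a character class).
</P>

<PRE>
use strict;
use warnings;

# Assumed example values; the real defaults are set inside CGIscriptor.pl.
my $FileAllowedChars = 'A-Za-z0-9_./-';
my $BlockPathAccess  = '/CVS/';

# Hypothetical helper: true when a requested path passes both checks.
sub path_is_acceptable {
    my ($path) = @_;
    # reject any character outside the allowed set (";", newline, null byte, ...)
    return 0 unless $path =~ m{\A[$FileAllowedChars]+\z};
    # reject paths that match a protected directory pattern
    return 0 if $BlockPathAccess ne '' and $path =~ m{$BlockPathAccess};
    return 1;
}

print path_is_acceptable('/my/demo/file.html'), "\n";      # prints 1
print path_is_acceptable('/my/file.html;rm -rf'), "\n";    # prints 0
print path_is_acceptable('/project/CVS/Root'), "\n";       # prints 0
</PRE>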
 579
 580 <P>
 581 <dt>10 The execution of code blocks can be controlled in a transparent way
 582     by adding IF or UNLESS conditions in the tags themselves.
 583     <dd>That is, a simple check of the validity of filenames or email
 584      addresses can be done before any code is executed.
 585 </p>
 586
 587 </dl>
 588
 589 <hr>
 590
 591 <A NAME="MANUAL"><H1 ALIGN="CENTER">USER MANUAL</H1></A>
 592
 593 <UL>
 594     <LI><A HREF="#INTRODUCTION">INTRODUCTION</A>
 595     <LI><A HREF="#NON-HTML">NON-HTML CONTENT TYPES</A>
 596     <LI><A HREF="#BINFILES">NON-HTML FILES</A>
 597     <LI><A HREF="#META">THE META TAG</A>
 598     <LI><A HREF="#DIV">THE DIV/INS TAG</A>
 599     <LI><A HREF="#IFUNLESS">CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES</A>
 600     <LI><A HREF="#SRC">THE MAGIC SOURCE ATTRIBUTE (SRC=)</A>
 601     <LI><A HREF="#ROOT">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</A>
 602     <LI><A HREF="#OSSHELL">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</A>
 603     <LI><A HREF="#TRANSLATIONS">RUN TIME TRANSLATION OF INPUT FILES</A>
 604     <LI><A HREF="#LANGUAGES">EVALUATION OF OTHER SCRIPTING LANGUAGES</A>
 605     <LI><A HREF="#PIPES">SHELL SCRIPT PIPING</A>
 606     <LI><A HREF="#SSPERL">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</A>
 607     <LI><A HREF="#SESSIONTICKETS">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</A>
 608     <LI><A HREF="#USEREXTENSIONS">USER EXTENSIONS</A>
 609     <LI><A HREF="#RESULTSSTACK">THE RESULTS STACK: @CGIscriptorResults</A>
 610     <LI><A HREF="#CGIPREDEFINED">USEFUL CGI PREDEFINED VARIABLES</A>
 611     <LI><A HREF="#ENVIRONMENT">USEFUL CGI ENVIRONMENT VARIABLES</A>
 612     <LI><A HREF="#RUNNING">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</A>
 613     <LI><A HREF="#NON-UNIX">NON-UNIX OS-PLATFORMS</A>
 614 </UL>
 615
 616 <A NAME="INTRODUCTION"><H2 ALIGN="CENTER">INTRODUCTION</H2></A>
 617
 618 <P>
 619 CGIscriptor removes embedded scripts, indicated by an HTML 4 type
 620 &lt;SCRIPT TYPE='text/ssperl'&gt; &lt;/SCRIPT&gt; or &lt;SCRIPT TYPE='text/osshell'&gt;
 621 &lt;/SCRIPT&gt; construct. The contents of the directive are executed by
 622 the PERL eval() and `` functions (in a separate name space). The
 623 result of the eval() function replaces the &lt;SCRIPT&gt; &lt;/SCRIPT&gt; construct
 624 in the output file. You can use the values that are delivered in
 625 CGI-compliant form (i.e., the "?name=value&.." type URL additions)
 626 transparently as "$name" variables in your directives after they are
 627 defined in a &lt;META&gt; or &lt;SCRIPT&gt; tag.
 628 If you define the variable "$CGIscriptorResults" in a CGI attribute, all
 629 subsequent &lt;SCRIPT&gt; and &lt;META&gt; results (including the defining
 630 tag) will also be pushed onto a stack: @CGIscriptorResults. This list
 631 behaves like any other, ordinary list and can be manipulated.
 632 </P>
 633
 634 <P>
 635 Both GET and POST requests are accepted. These two methods are treated
 636 equally. Variables, i.e., those values that are determined when a file is
 637 processed, are indicated in the CGI attribute by $&lt;name&gt; or
 638 $&lt;name&gt;=&lt;default&gt; in which  &lt;name&gt; is the name of the
 639 variable and &lt;default&gt; is the value used when there is NO current CGI
 640 value for &lt;name&gt; (you can use white-spaces in
 641 $&lt;name&gt;=&lt;default&gt; but really DO make sure that the default
 642 value is followed by white space or is quoted). Names can contain any
 643 alphanumeric characters and _ (i.e., names match /[\w]+/).<br>
 644 If the <i>Content-type:</i> is 'multipart/*', the input is treated as a
 645 MIME multipart message and automatically delimited. CGI variables get the
 646 "raw" (i.e., undecoded) body of the corresponding message part.
 647 </P>
 648
 649 <P>
 650 Variables can be CGI variables, i.e., those from the QUERY_STRING,
 651 environment variables, e.g., REMOTE_USER, REMOTE_HOST, or REMOTE_ADDR,
 652 or predefined values, e.g., CGI_Decoded_QS (The complete, decoded,
 653 query string), CGI_Content_Length (the length of the decoded query
 654 string), CGI_Year, CGI_Month, CGI_Time, and CGI_Hour (the current
 655 date and time).
 656 </P>
 657
 658 <P>
 659 All these are available when defined in a CGI attribute. All environment
 660 variables are accessible as $ENV{'name'}. So, to access the REMOTE_HOST
 661 and the REMOTE_USER, use, e.g.:
 662 </P>
 663
 664 <PRE>
 665 &lt;SCRIPT TYPE='text/ssperl'&gt;
 666 ($ENV{'REMOTE_HOST'}||"-")." $ENV{'REMOTE_USER'}"
 667 &lt;/SCRIPT&gt;
 668 </PRE>
 669
 670 <P>
 671 (This will print a "-" if REMOTE_HOST is not known)
 672 Another way to do this is:
 673 </P>
 674
 675 <PRE>
 676 &lt;META CONTENT="text/ssperl; CGI='$REMOTE_HOST = - $REMOTE_USER'"&gt;
 677 &lt;SCRIPT TYPE='text/ssperl'&gt;"$REMOTE_HOST $REMOTE_USER"&lt;/SCRIPT&gt;
 678 </PRE>
 679
 680 or
 681
 682 <PRE>
 683 &lt;META CONTENT='text/ssperl; CGI="$REMOTE_HOST = - $REMOTE_USER"
 684 SRC={"$REMOTE_HOST $REMOTE_USER\n"}'&gt;
 685 </PRE>
 686
 687 <P>
 688 This is possible because ALL environment variables are available as
 689 CGI variables. The environment variables take precedence over CGI
 690 names in case of a "name clash". For instance:
 691 </P>
 692
 693 <PRE>
 694 &lt;META CONTENT="text/ssperl; CGI='$HOME' SRC={$HOME}"&gt;
 695 </PRE>
 696
 697 <P>
 698 Will print the current HOME directory (environment) irrespective of whether
 699 there is a CGI variable from the query
 700 (e.g., Where do you live? &lt;INPUT TYPE="TEXT" NAME="HOME"&gt;)
 701 THIS IS A SECURITY FEATURE. It prevents clients from changing
 702 the values of defined environment variables (e.g., by supplying
 703 a bogus $REMOTE_ADDR). Although $ENV{} is not changed by the META tags,
 704 it would make the use of declared variables insecure. You can still
 705 access CGI variables after a name clash with
 706 CGIscriptor::CGIparseValue(&lt;name&gt;).
 707 </P>
 708
 709 <P>
 710 Some CGI variables are present several times in the query string
 711 (e.g., from multiple selections). These should be defined as
 712 @VARIABLENAME=default in the CGI attribute. The list @VARIABLENAME
 713 will contain ALL VARIABLENAME values from the query, or a single
 714 default value. If there is an ENVIRONMENT variable of the
 715 same name, it will be used instead of the default AND the query
 716 values. The corresponding function is
 717 CGIscriptor::CGIparseValueList(&lt;name&gt;)
 718 </P>
 719
 720 <P>
 721 CGI variables collected in a @VARIABLENAME list are unordered.
 722 When more structured variables are needed, a hash table can be used.
 723 A variable defined as %VARIABLE=default will collect all
 724 CGI-parameter values whose names start with 'VARIABLE' in a hash table
 725 with the remainder of the name as a key. For instance, %PERSON will
 726 collect PERSONname='John Doe', PERSONbirthdate='01 Jan 00', and
 727 PERSONspouse='Alice' into a hash table %PERSON such that
 728 $PERSON{'spouse'} equals 'Alice'. Any default value or environment
 729 value will be stored under the "" key. If there is an ENVIRONMENT
 730 variable of the same name, it will be used instead of the default
 731 AND the query values. The corresponding function is
 732 CGIscriptor::CGIparseValueHash(&lt;name&gt;)
 733 </P>
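
<P>
The hash declaration can be sketched in the same way. The field names
follow the PERSON example in the text; the default 'unknown' is an
assumption made for this illustration:
</P>

<PRE>
&lt;META CONTENT="text/ssperl; CGI='%PERSON=unknown'"&gt;
&lt;SCRIPT TYPE="text/ssperl"&gt;
# With ?PERSONname=John&amp;PERSONspouse=Alice the keys are 'name' and 'spouse'.
# The default value sits under the "" key, which is skipped here.
join("&lt;br&gt;\n", map { "$_: $PERSON{$_}" } grep { $_ ne "" } sort keys %PERSON);
&lt;/SCRIPT&gt;
</PRE>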
 734
 735 <P>
 736 This method of first declaring your environment and CGI variables
 737 before being able to use them in the scripts might seem somewhat
clumsy, but it protects you from inadvertently printing out the values of
 739 system environment variables when their names coincide with those used
 740 in the CGI forms. It also prevents "clients" from supplying CGI parameter
 741 values for your private variables.
 742 THIS IS A SECURITY FEATURE!
 743 </P>
 744
 745 <A NAME="NON-HTML"><H2 ALIGN="CENTER">NON-HTML CONTENT TYPES</H2></A>
 746
 747 <P>
 748 Normally, CGIscriptor prints the standard "Content-type: text/html\n\n"
 749 message before anything is printed.  This has been extended to include
 750 plain text (.txt) files, for which the Content-type (MIME type)
 751 'text/plain' is printed. In all other respects, text files are treated as
 752 HTML files (this can be switched off by removing '.txt' from the
 753 $FilePattern variable). When the content type should be something else,
 754 e.g., with multipart files, use the $RawFilePattern (.xmr, see also next
 755 item). CGIscriptor will not print a Content-type message for this file type
 756 (which must supply its OWN Content-type message). Raw files must still
 757 conform to the &lt;SCRIPT&gt;&lt;/SCRIPT&gt; and &lt;META&gt; tag
 758 specifications.
 759 </P>
 760
 761 <A NAME="BINFILES"><H2 ALIGN="CENTER">NON-HTML FILES</H2></A>
 762
 763 <P>
 764 CGIscriptor is intended to process HTML and text files only. You can
 765 create documents of any mime-type on-the-fly using "raw" text files, e.g.,
 766 with the .xmr extension. However, CGIscriptor will not process binary files
 767 of any type, e.g., pictures or sounds. Given the sheer number of formats, I
 768 do not have any intention to do so. However, an escape route has been
 769 provided. You can construct a genuine raw (.xmr) text file that contains
 770 the perl code to service any file type you want. If the global
 771 $BinaryMapFile variable contains the path to this file (e.g.,
/BinaryMapFile.xmr), this file will be called whenever an unsupported
(non-HTML) file type is requested. The path to the requested binary file
is stored in $ENV{'CGI_BINARY_FILE'} and can be used like any other
 775 CGI-variable. Servicing binary files then becomes supplying the correct
 776 Content-type (e.g., print "Content-type: image/jpeg\n\n";) and reading the
 777 file and writing it to STDOUT (e.g., using sysread() and syswrite()).
 778 </P>
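
<P>
A minimal sketch of such a raw file is given below. It is an illustration
under stated assumptions, not the file shipped with CGIscriptor: it serves
JPEG files only, reads the path directly from %ENV, and the buffer size of
8192 bytes is arbitrary. Output buffering is switched off so the header
printed with print cannot arrive after the data written with syswrite().
</P>

<PRE>
&lt;SCRIPT TYPE="text/ssperl"&gt;
# Hypothetical BinaryMapFile.xmr: serve JPEG files, ignore everything else
my $path = $ENV{'CGI_BINARY_FILE'};
if($path =~ /\.jpe?g$/i &amp;&amp; open(my $in, '&lt;', $path))
{
    binmode($in); binmode(STDOUT);
    local $| = 1;                               # flush the header immediately
    print "Content-type: image/jpeg\n\n";
    my $buffer;
    while(sysread($in, $buffer, 8192)) { syswrite(STDOUT, $buffer); };
    close($in);
};
"";                                             # print nothing else
&lt;/SCRIPT&gt;
</PRE>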
 779
 780 <A NAME="META"><H2 ALIGN="CENTER">THE META TAG</H2></A>
 781
 782 <P>
 783 All attributes of a META tag are ignored, except the
 784 CONTENT='text/ssperl; CGI=" ... " [SRC=" ... "]' attribute. The string
 785 inside the quotes following the CONTENT= indication (white-space is
 786 ignored, "'` (){}[]-quotes are allowed, plus their \ versions) MUST
 787 start with any of the CGIscriptor mime-types (e.g.: text/ssperl or
 788 text/osshell) and a comma or semicolon.
 789 The quoted string following CGI= contains a white-space separated list
 790 of declarations of the CGI (and Environment) values and default values
 791 used when no CGI values are supplied by the query string.
 792 </P>
 793
 794 <P>
 795 If the default value is a longer string containing special characters,
 796 possibly spanning several lines, the string must be enclosed in quotes.
 797 You may use any pair of quotes or brackets from the list '', "", ``, (),
 798 [], or {} to distinguish default values (or preceded by \, e.g., \(...\)
 799 is different from (...)). The outermost pair will always be used and any
 800 other quotes inside the string are considered to be part of the string
 801 value, e.g.,
 802 </P>
 803
 804 <PRE>
 805 $Value = {['this'
 806 "and" (this)]}
 807 </PRE>
 808
 809 <P>
 810 will result in $Value getting the default value
 811 </P>
 812
 813 <PRE>
 814 ['this'
 815 "and" (this)]
 816 </PRE>
 817
 818 <P>
 819 (NOTE that the newline is part of the default value!).
 820 </P>
 821
 822 <P>
 823 Internally, for defining and initializing CGI (ENV) values, the META
 824 and SCRIPT tags use the function "defineCGIvariable($name, $default)"
 825 (scalars) and "defineCGIvariableList($name, $default)" (lists).
 826 These functions can be used inside scripts as
 827 "CGIscriptor::defineCGIvariable($name, $default)" and
 828 "CGIscriptor::defineCGIvariableList($name, $default)".
 829 </P>
 830
 831 <P>
The CGI attribute will be processed identically when used inside
the &lt;SCRIPT&gt; tag. However, this use is not according to the
HTML 4.0 specifications of the W3C.
 835 </P>
 836
 837 <A NAME="DIV"><H2 ALIGN="CENTER">THE DIV/INS TAG</H2></A>
 838
 839 <P>
 840 There is a problem when constructing html files containing
 841 server-side perl scripts with standard HTML tools. These
 842 tools will refuse to process any text between
 843 &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
 844 tags. This is quite annoying when you want to use large
 845 HTML templates where you will fill in values.
 846 </P>
 847
 848 <P>
For this purpose, CGIscriptor will read the neutral
&lt;DIV CLASS="ssperl" ID="varname"&gt;&lt;/DIV&gt; or
&lt;INS CLASS="ssperl" ID="varname"&gt;&lt;/INS&gt;
tag (in Cascading Style Sheet manner). Note that "varname" has
NO '$' before it; it is a bare name. Any text between
these &lt;DIV ...&gt;&lt;/DIV&gt; or
&lt;INS ...&gt;&lt;/INS&gt; tags will be assigned
to '$varname' as is (i.e., as a literal). No
processing or interpolation will be performed.
Nesting is NOT possible either. Do NOT nest
&lt;/DIV&gt; inside a &lt;DIV&gt;&lt;/DIV&gt;!
Moreover, DIV tags do NOT ensure a block structure in
the final rendering (i.e., no empty lines).
 862 </P>
 863
 864 <P>
 865 Note that &lt;DIV CLASS="ssperl" ID="varname"/&gt;
 866 is handled the XML way. No content is processed,
 867 but varname is defined, and any SRC directives are
 868 processed.
 869 </P>
 870
 871 <P>
 872 You can use $varname like any other variable name.
 873 However, $varname is NOT a CGI variable and will be
 874 completely internal to your script. There is NO
 875 interaction between $varname and the outside world.
 876 </P>
 877
 878 <P>
 879 To interpolate a DIV derived text, you can use:
 880 <pre>
$varname =~ s/([\]])/\\$1/g; # Mark ']'-quotes
 882 $varname = eval("qq[$varname]"); # Interpolate all values
 883 </pre>
 884 </P>
 885
 886 <p>
 887 The DIV tag will process IF, UNLESS, CGI and SRC attributes.
 888 The SRC files will be pre-pended to the body
 889 text of the tag.
 890 </p>
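
<P>
Put together, a DIV template could be used as sketched below. The variable
names NAME and greeting are hypothetical, and the sketch assumes that only
text written by the page author is interpolated this way:
</P>

<PRE>
&lt;META CONTENT="text/ssperl; CGI='$NAME=visitor'"&gt;
&lt;DIV CLASS="ssperl" ID="greeting"&gt;
&lt;p&gt;Welcome back, $NAME!&lt;/p&gt;
&lt;/DIV&gt;
&lt;SCRIPT TYPE="text/ssperl"&gt;
$greeting =~ s/([\]])/\\$1/g;        # Mark ']'-quotes
$greeting = eval("qq[$greeting]");   # Interpolate all values
$greeting;                           # The last value is printed
&lt;/SCRIPT&gt;
</PRE>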
 891
 892 <A NAME="IFUNLESS"><H2 ALIGN="CENTER">
 893 CONDITIONAL PROCESSING: THE 'IF' AND 'UNLESS' ATTRIBUTES
 894 </H2></A>
 895
 896 <p>
 897 It is often necessary to include code-blocks that should be executed
conditionally, e.g., only for certain browsers or operating systems.
 899 Furthermore, quite often sanity and security checks are necessary
 900 before user (form) data can be processed, e.g., with respect to
 901 email addresses and filenames.
 902 </p>
 903
 904 <p>
 905 Checks added to the code are often difficult to find, interpret or
maintain and in general mess up the code flow. This kind of confusion
 907 is dangerous. Also, for many of the supported "foreign" scripting
 908 languages, adding these checks is cumbersome or even impossible.
 909 </p>
 910
 911 <p>
 912 As a uniform method for asserting the correctness of "context", two
 913 attributes are added to all supported tags: IF and UNLESS.
 914 They both evaluate their value and block execution when the
 915 result is &lt;FALSE&gt; (IF) or &lt;TRUE&gt; (UNLESS) in Perl, e.g.,
UNLESS='$NUMBER \&gt; 100;' blocks execution if $NUMBER &gt; 100. Note that
 917 the backslash in the '\&gt;' is removed and only used to differentiate
 918 this conditional '&gt;' from the tag-closing '&gt;'. For symmetry, the
 919 backslash in '\&lt;' is also removed. Inside these conditionals,
 920 ~/ and ./ are expanded to their respective directory root paths.
 921 </p>
 922
 923 <p>
 924 For example, the following tag will be ignored when the filename is
 925 invalid:
 926 </p>
 927
 928 <pre>
 929 &lt;SCRIPT TYPE='text/ssperl' CGI='$FILENAME'
 930 IF='CGIscriptor::CGIsafeFileName($FILENAME);'&gt;
 931 ...
 932 &lt;/SCRIPT&gt;
 933 </pre>
 934
 935 <p>
 936 The IF and UNLESS values must be quoted. The same quotes are supported
 937 as with the other attributes. The SRC attribute is ignored when IF and
 938 UNLESS block execution.
 939 </p>
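
<p>
A complementary sketch with UNLESS, following the rule that UNLESS blocks
execution when its expression is true. The variable NUMBER and its default
are hypothetical; note the backslash that protects the '&gt;' inside the tag:
</p>

<pre>
&lt;META CONTENT="text/ssperl; CGI='$NUMBER=0'"&gt;
&lt;SCRIPT TYPE="text/ssperl" UNLESS='$NUMBER \&gt; 100;'&gt;
# Reached only when $NUMBER is NOT larger than 100
"Processing $NUMBER items";
&lt;/SCRIPT&gt;
</pre>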
 940
 941 <A NAME="SRC"><H2 ALIGN="CENTER">
 942 THE MAGIC SOURCE ATTRIBUTE (SRC=)</H2></A>
 943
 944 <P>
The SRC attribute inside tags accepts a list of filenames and URLs
separated by commas "," (or semicolons ";").
 947 </P>
 948
 949 <P>
 950 ALL the variable values defined in the CGI attribute are available in
 951 @ARGV as if the file was executed from the command line, in
 952 the exact order in which they were declared in the preceding CGI
 953 attribute.
 954 </P>
 955
 956 <P>
 957 First, a SRC={}-block will be evaluated as if the code inside the
 958 block was part of a &lt;SCRIPT&gt;&lt;/SCRIPT&gt; construct, i.e.,
"print do { code };'';" or `code` (i.e., SAFEqx('code')).
 960 Only a single block is evaluated. Note that this is processed less
 961 efficiently than &lt;SCRIPT&gt; &lt;/SCRIPT&gt; blocks. Type of evaluation
 962 depends on the content-type: Perl for text/ssperl and OS shell for
 963 text/osshell. For other mime types (scripting languages), anything in
 964 the source block is put in front of the code block "inside" the tag.
 965 </P>
 966
 967 <P>
 968 Second, executable files (i.e., -x filename != 0) are evaluated as:
 969 print `filename \'$ARGV[0]\' \'$ARGV[1]\' ...`
That is, you can actually call executables safely from the SRC attribute.
 971 </P>
 972
 973 <P>
 974 Third, text files that match the file pattern, used by CGIscriptor to
 975 check whether files should be processed ($FilePattern), are
 976 processed in-line (i.e., recursively) by CGIscriptor as if the code
 977 was inserted in the original source file. Recursions, i.e., calling
 978 a file inside itself, are blocked. If you need them, you have to code
them explicitly using "main::ProcessFile($file_path)".
 980 </P>
 981
 982 <P>
 983 Fourth, Perl text files (i.e., -T filename != 0) are evaluated as:
 984 "do FileName;'';".
 985 </P>
 986
 987 <P>
Last, URLs (i.e., starting with 'HTTP://', 'FTP://', 'GOPHER://', 'TELNET://',
 989 'WHOIS://' etc.) are loaded and printed. The loading and handling of &lt;BASE&gt;
 990 and document header is done by main::GET_URL($URL [, 0]). You can enter your own
 991 code (default is <i>curl</i>, <i>snarf</i>, or <i>wget</i> and some
 992 post-processing to add a &lt;BASE&gt; tag).
 993 </P>
 994
 995 <P>
 996 There are two pseudo-file names: PREFIX and POSTFIX. These implement
 997 a switch from prefixing the SRC code/files (PREFIX, default) before the content of
 998 the tag to appending the code after the content of the tag (POSTFIX). The switches
 999 are done in the order in which the PREFIX and POSTFIX labels are encountered.
1000 You can mix PREFIX and POSTFIX labels in any order with the SRC files.
Note that the ORDER of file execution is determined for prefixed and
postfixed files separately.
</P>
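
<P>
As a sketch of the PREFIX and POSTFIX switches, the tag below would handle
one file before the content of the tag and one after it. The file names
Header.pl and Footer.pl are hypothetical:
</P>

<PRE>
&lt;SCRIPT TYPE="text/ssperl"
SRC="PREFIX, ./Header.pl, POSTFIX, ./Footer.pl"&gt;
"Body of the tag";
&lt;/SCRIPT&gt;
</PRE>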
1004
1005 <P>
1006 File paths can be preceded by the URL protocol prefix "file://". This
1007 is simply STRIPPED from the name.
1008 </P>
1009
1010 <P>
1011 Example:
1012 </P>
1013
1014 <P>
The request
"http://cgi-bin/Action_Forms.pl/Statistics/Sign_Test.html?positive=8&amp;negative=22"
will result in printing "${SS_PUB}/Statistics/Sign_Test.html"
with QUERY_STRING = "positive=8&amp;negative=22"
1019 </P>
1020
1021 <P>
1022 on encountering the lines:
1023 </P>
1024
1025 <PRE>
1026 &lt;META CONTENT="text/osshell; CGI='$positive=11 $negative=3'"&gt;
1027 &lt;b&gt;&lt;SCRIPT TYPE="text/ssperl" SRC="./Statistics/SignTest.pl"&gt;
&lt;/SCRIPT&gt;&lt;/b&gt;&lt;p&gt;
1029 </PRE>
1030
<P>
This line will be processed as:
</P>
1032
1033 <PRE>
1034 "&lt;b&gt;`${SS_SCRIPT}/Statistics/SignTest.pl '8' '22'`&lt;/b&gt;&lt;p&gt;"
1035 </PRE>
1036
1037 <P>
in which "${SS_SCRIPT}/Statistics/SignTest.pl" is an executable script.
This line will end up printed as:
1040 </P>
1041
1042 <PRE>
1043 "&lt;b&gt;p &lt;= 0.0161&lt;/b&gt;&lt;p&gt;"
1044 </PRE>
1045
1046 <P>
1047 Note that the META tag itself will never be printed, and is invisible to
1048 the outside world.
1049 </P>
1050
1051 <P>
1052 The SRC files in a DIV/INS tag will be added (pre-pended) to the body
1053 of the &lt;DIV&gt;&lt;/DIV&gt; tag. Blocks are NOT executed!
1054 </P>
1055
1056 <A NAME="ROOT"><H2 ALIGN="CENTER">THE CGISCRIPTOR ROOT DIRECTORIES ~/ AND ./</H2></A>
1057
1058 <P>
1059 Inside &lt;SCRIPT&gt;&lt;/SCRIPT&gt; tags, filepaths starting
1060 with "~/" are replaced by "$YOUR_HTML_FILES/", this way files in the
1061 public directories can be accessed without direct reference to the
1062 actual paths. Filepaths starting with "./" are replaced by
1063 "$YOUR_SCRIPTS/" and this should only be used for scripts.
1064 The "$YOUR_SCRIPTS" directory is added to @INC so, e.g., the
1065 'require' command will load from the "$YOUR_SCRIPTS" directory.
1066 </P>
1067
1068 <P>
1069 <b>Note:</b> this replacement can seriously affect Perl scripts. Watch
1070 out for constructs like $a =~ s/aap\./noot./g, use
1071 $a =~ s@aap\.@noot.@g instead.
1072 </P>
1073
1074 <P>
1075 CGIscriptor.pl will assign the values of $SS_PUB and $SS_SCRIPT
1076 (i.e., $YOUR_HTML_FILES and $YOUR_SCRIPTS) to the environment variables
1077 $SS_PUB and $SS_SCRIPT. These can be accessed by the scripts that are
1078 executed. The "$SS_SCRIPT" ($YOUR_SCRIPTS) directory is added to
1079 @INC so, e.g., the 'require' command will load from the "$SS_SCRIPT"
1080 directory.<br>
Values not preceded by $, ~/, or ./ are used as literals.
1082 </P>
1083
1084 <A NAME="OSSHELL"><H2 ALIGN="CENTER">OS SHELL SCRIPT EVALUATION (CONTENT-TYPE=TEXT/OSSHELL)</H2></A>
1085
1086 <P>
1087 OS scripts are executed by a "safe" version of the `` operator (i.e.,
1088 SAFEqx(), see also below) and any output is printed. CGIscriptor will
1089 interpolate the script and replace all user-supplied CGI-variables by
1090 their ''-quoted values (actually, all variables defined in CGI attributes are
quoted). Other Perl variables are interpolated in a simple fashion, i.e.,
1092 $scalar by their value, @list by join(' ', @list), and %hash by their
1093 name=value pairs. Complex references, e.g., @$variable, are all
1094 evaluated in a scalar context. Quotes should be used with care.
1095 NOTE: the results of the shell script evaluation will appear in the
1096 @CGIscriptorResults stack just as any other result.
1097 </P>
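
<P>
A small sketch of the quoting described above. The variable WORD and its
default are hypothetical. Because $WORD is declared in a CGI attribute, the
shell should receive its value as a ''-quoted string, e.g.,
echo You asked for: 'CGIscriptor'
</P>

<PRE>
&lt;META CONTENT="text/osshell; CGI='$WORD=CGIscriptor'"&gt;
&lt;SCRIPT TYPE="text/osshell"&gt;
echo You asked for: $WORD
&lt;/SCRIPT&gt;
</PRE>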
1098
1099 <P>
All occurrences of $@% that should NOT be interpolated must be
preceded by a "\". Interpolation can be switched off completely by
setting $CGIscriptor::NoShellScriptInterpolation = 1
(set to 0 or undef to switch interpolation on again),
e.g.,
1105 </P>
1106
1107 <PRE>
1108 &lt;SCRIPT TYPE="text/ssperl"&gt;
1109 $CGIscriptor::NoShellScriptInterpolation = 1;
1110 &lt;/SCRIPT&gt;
1111 </PRE>
1112
<A NAME="TRANSLATIONS">
<H2 ALIGN="CENTER">RUN TIME TRANSLATION OF INPUT FILES</H2>
</A>
1115
1116 <p>
1117 Allows general and global conversions of files using Regular Expressions.
1118 Very handy (but costly) to rewrite legacy pages to a new format.
1119 Select files to use it on with <br>
1120 my $TranslationPaths = 'filepattern';<br>
1121 This is costly. For efficiency, define:<br>
1122 $TranslationPaths = ''; when not using translations.<br>
1123 Accepts general regular expressions: [$pattern, $replacement]
1124 </p>
1125
1126 <p>
1127 Define:</p>
1128 <pre>
1129 my $TranslationPaths = 'filepattern'; # Pattern matching PATH_INFO
1130
1131 push(@TranslationTable, ['pattern', 'replacement']);
1132 # e.g. (for Ruby Rails):
1133 push(@TranslationTable, ['&lt;%=', '&lt;SCRIPT TYPE="text/ssruby"&gt;']);
1134 push(@TranslationTable, ['%&gt;', '&lt;/SCRIPT&gt;']);
1135
1136 # Runs:
1137 my $currentRegExp;
1138 foreach $currentRegExp (@TranslationTable)
1139 {
1140     my ($pattern, $replacement) = @$currentRegExp;
1141     $$text =~ s!$pattern!$replacement!msg;
1142 };
1143 </pre>
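
<p>
To see the effect of such a table outside CGIscriptor, the loop can be
wrapped in a small stand-alone Perl program. The subroutine name translate
and the sample text are assumptions made for this sketch:
</p>

<pre>
use strict;
use warnings;

# Apply every [pattern, replacement] pair to the text, in table order
sub translate
{
    my ($text, @table) = @_;    # $text is a reference to the page text
    foreach my $currentRegExp (@table)
    {
        my ($pattern, $replacement) = @$currentRegExp;
        $$text =~ s!$pattern!$replacement!msg;
    };
    return $$text;
}

my @TranslationTable = (
    ['&lt;%=', '&lt;SCRIPT TYPE="text/ssruby"&gt;'],
    ['%&gt;',  '&lt;/SCRIPT&gt;'],
);
my $page = "Total: &lt;%= 1 + 2 %&gt;\n";
print translate(\$page, @TranslationTable);
# prints: Total: &lt;SCRIPT TYPE="text/ssruby"&gt; 1 + 2 &lt;/SCRIPT&gt;
</pre>

<p>
Note that the patterns are regular expressions, so characters with a special
meaning in a regular expression must be escaped in the table.
</p>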
1144
1145 <A NAME="LANGUAGES">
1146 <H2 ALIGN="CENTER">EVALUATION OF OTHER SCRIPTING LANGUAGES</H2>
1147 </A>
1148
1149 <P>
1150 Adding a MIME-type and an interpreter command to
1151 %ScriptingLanguages automatically will catch any other
1152 scripting language in the standard
1153 &lt;SCRIPT TYPE="[mime]"&gt;&lt;/SCRIPT&gt; manner.
1154 E.g., adding: $ScriptingLanguages{'text/sspython'} = 'python';
will actually execute the following code in an HTML page
1156 (ignore 'REMOTE_HOST' for the moment):
1157 </P>
1158
1159 <PRE>
1160 &lt;SCRIPT TYPE="text/sspython"&gt;
1161 # A Python script
1162 x = ["A","real","python","script","Hello","World","and", REMOTE_HOST]
print(x[4:8]) # Prints the list ["Hello","World","and", REMOTE_HOST]
1164 &lt;/SCRIPT&gt;
1165 </PRE>
1166
1167 <P>
1168 The script code is NOT interpolated by perl, EXCEPT for those
1169 interpreters that cannot handle variables themselves.
1170 Currently, several interpreters are pre-installed:
1171 </P>
1172
1173 <PRE>
1174 Perl test -  "text/testperl" =&gt; 'perl',
1175 Python    -  "text/sspython" =&gt; 'python',
1176 Ruby      -  "text/ssruby"   =&gt; 'ruby',
1177 Tcl       -  "text/sstcl"    =&gt; 'tcl',
1178 Awk       -  "text/ssawk"    =&gt; 'awk -f-',
1179 Gnu Lisp  -  "text/sslisp"   =&gt; 'rep | tail +5 '.
1180 #                                 "| egrep -v '&gt; |^rep. |^nil\\\$'",
1181 Gnu Prolog-  "text/ssprolog" =&gt; 'gprolog',
1182 M4 macro's-  "text/ssm4"     =&gt; 'm4',
Bourne shell - "text/sh"     =&gt; 'sh',
1184 Bash      -  "text/bash"     =&gt; 'bash',
1185 C-shell   -  "text/csh"      =&gt; 'csh',
1186 Korn shell-  "text/ksh"      =&gt; 'ksh',
1187 Praat     -  "text/sspraat"    =&gt; "praat - | sed 's/Praat &gt; //g'",
1188 R         -  "text/ssr" =&gt; "R --vanilla --slave | sed 's/^[\[0-9\]*] //g'",
1189 REBOL     -   "text/ssrebol" =&gt;
1190               "rebol --quiet|egrep -v '^[&gt; ]* == '|sed 's/^\s*\[&gt; \]* //g'",
1191 PostgreSQL-  "text/postgresql" =&gt; 'psql 2&gt;/dev/null',
1192 (psql)
1193 </PRE>
1194
1195 <P>
1196 Note that the "value" of $ScriptingLanguages{mime} must be a command
1197 that reads Standard Input and writes to standard output. Any extra
output of interactive interpreters (banners, echoes, prompts)
1199 should be removed by piping the output through 'tail', 'grep',
1200 'sed', or even 'awk' or 'perl'.
1201 </P>
1202
1203 <P>
1204 For access to CGI variables there is a special hashtable:
1205 %ScriptingCGIvariables.
1206 CGI variables can be accessed in three ways.
1207 <dl>
1208 <dt>1. If the mime type is not present in %ScriptingCGIvariables,
1209 nothing is done and the script itself should parse the relevant
1210 environment variables.
<dt>2. If the mime type IS present in %ScriptingCGIvariables, but its
1212 value is empty, e.g., $ScriptingCGIvariables{"text/sspraat"}  = '';,
1213 the script text is interpolated by perl. That is, all $var, @array,
1214 %hash, and \-slashes are replaced by their respective values.
1215 <dt>3. In all other cases, the CGI and environment variables are added
1216 in front of the script according to the format stored in
1217 %ScriptingCGIvariables. That is, the following (pseudo-)code is
1218 executed for each CGI- or Environment variable defined in the CGI-tag:
1219 printf(INTERPRETER, $ScriptingCGIvariables{$mime}, $CGI_NAME, $CGI_VALUE);
1220 </dl>
1221 </P>
1222
1223 <P>
1224 For instance, "text/testperl" =&gt; '$%s = "%s";' defines variable
1225 definitions for Perl, and "text/sspython" =&gt; '%s = "%s"' for Python
(note that these definitions are not safe, the real ones contain '-quotes).
1227 </P>
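
<P>
The third case can be tried out in a stand-alone Perl sketch. It uses the
two simplified (unsafe) format strings quoted above; the subroutine name
variable_definitions and the sample values are assumptions of this example,
and the real code writes to the interpreter pipe instead of returning a
string:
</P>

<PRE>
use strict;
use warnings;

# Simplified format strings, as quoted in the text
my %ScriptingCGIvariables = (
    "text/testperl" =&gt; '$%s = "%s";',
    "text/sspython" =&gt; '%s = "%s"',
);

# Build the definition lines that would be put in front of a script block
sub variable_definitions
{
    my ($mime, %cgi) = @_;
    # No format stored (cases 1 and 2): nothing is added in front
    return "" unless $ScriptingCGIvariables{$mime};
    return join("", map { sprintf("$ScriptingCGIvariables{$mime}\n", $_, $cgi{$_}) }
                    sort keys %cgi);
}

print variable_definitions("text/sspython", positive =&gt; 8, negative =&gt; 22);
# prints:
# negative = "22"
# positive = "8"
</PRE>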
1228
1229 <P>
1230 THIS WILL NOT WORK FOR @VARIABLES, the (empty) $VARIABLES will be used
1231 instead.
1232 </P>
1233
1234 <P>
The $CGI_VALUE parameters are "shrubbed" of all control characters
and quotes (by &amp;shrubCGIparameter($CGI_VALUE)). Control characters
1237 are replaced by \0&lt;octal ascii value&gt; and quotes by their HTML character
1238 value (&#8217; -&gt; &amp;#8217; &#8216; -&gt; &amp;#8216;
1239 &quot; -&gt; &amp;quot;). For example:
1240 if a client would supply the string value  (in standard perl)
1241 </P>
1242
1243 <P>
1244 <PRE>"/dev/null';\nrm -rf *;\necho '"</PRE>
1245 it would be processed as
1246 <PRE>'/dev/null&amp;#8217;;\015rm -rf *;\015echo &amp;#8217;'</PRE>
1247 (e.g., sh or bash would process the latter more according to your
1248 intentions).<br>
If your interpreter requires different protection measures, you will
1250 have to supply these in %main::SHRUBcharacterTR (string =&gt; translation),
1251 e.g.,
1252
1253 <PRE>
1254 $SHRUBcharacterTR{"\'"} = "&amp;#8217;";
1255 </PRE>
1256 </P>
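
<P>
The idea behind this cleaning step can be illustrated with a stand-alone
sketch. This is NOT the actual shrubCGIparameter() code: the subroutine
shrub_sketch, the three-entry table, and the range of control characters
are assumptions chosen to mirror the description above.
</P>

<PRE>
use strict;
use warnings;

# Translation table in the style of %SHRUBcharacterTR (illustrative subset)
my %SHRUBcharacterTR = (
    "'" =&gt; '&amp;#8217;',
    "`" =&gt; '&amp;#8216;',
    '"' =&gt; '&amp;quot;',
);

sub shrub_sketch
{
    my ($value) = @_;
    # Replace each listed quote character by its HTML character entity
    $value =~ s/(['`"])/$SHRUBcharacterTR{$1}/g;
    # Replace control characters by \0 followed by their octal value
    $value =~ s/([\x00-\x1F\x7F])/sprintf("\\0%o", ord($1))/eg;
    return $value;
}

print shrub_sketch("/dev/null';\trm -rf *"), "\n";
# prints: /dev/null&amp;#8217;;\011rm -rf *
</PRE>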
1257
1258 <P>
1259 Currently, the following definitions are used:
1260 </P>
1261
1262 <PRE>
1263 %ScriptingCGIvariables = (
1264 "text/testperl" =&gt; "\$\%s = '\%s';",    # Perl          $VAR = 'value' (for testing)
1265 "text/sspython" =&gt; "\%s = '\%s'",       # Python        VAR = 'value'
1266 "text/ssruby"   =&gt; '@%s = "%s"',        # Ruby          @VAR = "value"
1267 "text/sstcl"    =&gt; 'set %s "%s"',       # TCL           set VAR "value"
1268 "text/ssawk"    =&gt; '%s = "%s";',        # Awk           VAR = "value";
1269 "text/sslisp"   =&gt; '(setq %s "%s")',   # Gnu lisp (rep) (setq VAR "value")
1270 "text/ssprolog" =&gt; '',                 # Gnu prolog    (interpolated)
1271 "text/ssm4"     =&gt; "define(`\%s', `\%s')", # M4 macro's define(`VAR', `value')
"text/sh"       =&gt; "\%s='\%s';",       # Bourne shell  VAR='value';
"text/bash"     =&gt; "\%s='\%s';",       # Bourne-again shell VAR='value';
1274 "text/csh"      =&gt; "\$\%s = '\%s';",   # C shell       $VAR = 'value';
1275 "text/ksh"      =&gt; "\$\%s = '\%s';",   # Korn shell    $VAR = 'value';
1276 "text/sspraat"  =&gt; '',                  # Praat         (interpolation)
1277 "text/ssr"      =&gt; '%s &lt;- "%s";',       # R             VAR &lt;- "value";
1278 "text/ssrebol"  =&gt; '%s: copy "%s"',     # REBOL         VAR: copy "value"
1279 "text/postgresql" =&gt; '',                # PostgreSQL    (interpolation)
1280 "" =&gt; ""
1281 );
1282 </PRE>
1283
1284 <P>
Four tables allow fine-tuning of the interpreter with code that should be
added before and after each code block:
1287 </P>
1288
1289 <P>
1290 Code added before each script block
1291 </P>
1292
1293 <PRE>
1294 %ScriptingPrefix = (
1295 "text/testperl" =&gt; "\# Prefix Code;",   # Perl script testing
1296 "text/ssm4"     =&gt;  'divert(0)'        # M4 macro's (open STDOUT)
1297 );
1298 </PRE>
1299
1300 <P>
1301 Code added at the end of each script block
1302 </P>
1303
1304 <PRE>
1305 %ScriptingPostfix = (
1306 "text/testperl" =&gt; "\# Postfix Code;",  # Perl script testing
1307 "text/ssm4"     =&gt;  'divert(-1)'       # M4 macro's (block STDOUT)
1308 );
1309 </PRE>
1310
1311 <P>
1312 Initialization code, inserted directly after opening (NEVER interpolated)
1313 </P>
1314
1315 <PRE>
1316 %ScriptingInitialization = (
1317 "text/testperl" =&gt; "\# Initialization Code;", # Perl script testing
1318 "text/ssawk"    =&gt; 'BEGIN {',                # Server Side awk scripts
1319 "text/sslisp"   =&gt; '(prog1 nil ',            # Lisp (rep)
1320 "text/ssm4"     =&gt;  'divert(-1)'             # M4 macro's (block STDOUT)
1321 );
1322 </PRE>
1323
1324 <P>
1325 Cleanup code, inserted before closing (NEVER interpolated)
1326 </P>
1327
1328 <PRE>
1329 %ScriptingCleanup = (
1330 "text/testperl" =&gt; "\# Cleanup Code;",  # Perl script testing
1331 "text/sspraat" =&gt; 'Quit',
1332 "text/ssawk"    =&gt; '};',        # Server Side awk scripts
"text/sslisp"   =&gt;  '(princ "\n" standard-output)).',  # Closing print to rep
1334 "text/postgresql" =&gt; '\q',
1335 );
1336 </PRE>
1337
1338 <P>
1339 The SRC attribute is NOT magical for these interpreters. In short,
all code inside a source file or {} block is written verbatim
1341 to the interpreter. No (pre-)processing or executional magic is done.
1342 </P>
1343
1344 <P>
A serious shortcoming of the described mechanism for handling other
1346 (scripting) languages, with respect to standard perl scripts
1347 (i.e., 'text/ssperl'), is that the code is only executed when
1348 the pipe to the interpreter is closed. So the pipe has to be
1349 closed at the end of each block. This means that the state of the
1350 interpreter (e.g., all variable values) is lost after the closing of
1351 the next &lt;/SCRIPT&gt; tag. The standard 'text/ssperl' scripts retain
1352 all values and definitions.
1353 </P>
1354
1355
1356 <A NAME="APPLIC"><H2 ALIGN="CENTER">APPLICATION MIME TYPES</H2></A>
1357
1358 <P>
To ease some important auxiliary functions from within the
1360 html pages I have added them as MIME types. This uses
1361 the mechanism that is also used for the evaluation of
1362 other scripting languages, with interpolation of CGI
1363 parameters (and perl-variables). Actually, these are
1364 defined exactly like any other "scripting language".
1365 </P>
1366
1367 <dl>
1368 <dt>text/ssdisplay:
1369 <dd>display some (HTML) text with interpolated
1370                  variables (uses `cat`).
1371 <dt>text/sslogfile:
1372 <dd>write (append) the interpolated block to the file
1373                  mentioned on the first, non-empty line
1374                  (the filename can be preceded by 'File: ',
1375                  note the space after the ':',
1376                  uses `awk .... &gt;&gt; &lt;filename&gt;`).
1377 <dt>text/ssmailto:
1378 <dd>send email directly from within the script block.
1379                  The first line of the body must contain
1380                  To:Name@Valid.Email.Address
                 (note: NO space between 'To:' and the email address)
1382                  For other options see the mailto man pages.
1383                  It works by directly sending the (interpolated)
1384                  content of the text block to a pipe into the
1385                  Linux program 'mailto'.
1386 </dl>
1387
1388 <P>
1389 In these script blocks, all Perl variables will be
1390 replaced by their values. All CGI variables are cleaned before
1391 they are used. These CGI variables must be redefined with a
1392 CGI attribute to restore their original values.
In general, this will be more secure than constructing,
e.g., your own email command lines. For instance, Mailto will
not execute any odd (forged) email address, but just stops
when the email address is invalid. Awk, however, will construct
any filename you give it (e.g. '&lt;File;rm\\\040-f' would end up
as a "valid" UNIX filename). Note that it will also gladly
store this file anywhere (/../../../etc/passwd will work!).
1400 Use the CGIscriptor::CGIsafeFileName() function to clean the
1401 filename.
1402 </P>
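
<P>
As a sketch, a log entry could be written as follows. The file path
/tmp/comments.log and the variable COMMENT are hypothetical. The filename is
fixed by the page author here; a filename taken from the query should first
be checked with CGIscriptor::CGIsafeFileName().
</P>

<PRE>
&lt;META CONTENT="text/ssperl; CGI='$COMMENT $REMOTE_HOST'"&gt;
&lt;SCRIPT TYPE="text/sslogfile"&gt;
File: /tmp/comments.log
$REMOTE_HOST: $COMMENT
&lt;/SCRIPT&gt;
</PRE>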
1403
1404 <A NAME="PIPES"><H2 ALIGN="CENTER">SHELL SCRIPT PIPING</H2></A>
1405
1406 <P>
1407 If a shell script starts with the UNIX style "#! &lt;shell command&gt; \n"
1408 line, the rest of the shell script is piped into the indicated command,
1409 i.e.,
1410 open(COMMAND, "| command");print COMMAND $RestOfScript;
1411 </P>
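
<P>
The mechanism can be tried in a stand-alone Perl sketch. The subroutine
pipe_script is an assumption of this example and leaves out everything
SAFEqx() does; it only shows the splitting of the "#!" line and the pipe:
</P>

<PRE>
use strict;
use warnings;

# Split off a leading "#! command" line and pipe the rest into that command
sub pipe_script
{
    my ($script) = @_;
    if($script =~ s/^\s*#!\s*([^\n]+?)\s*\n//)
    {
        my $command = $1;
        open(my $pipe, '|-', $command) or die "Cannot start '$command': $!";
        print $pipe $script;
        return close($pipe);    # true when the command exited successfully
    };
    return 0;                   # no #! line: nothing is piped
}

# The sort command receives the two remaining lines on its standard input
pipe_script("#! sort\npear\napple\n");
</PRE>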
1412
1413 <P>
1414 In many ways this is equivalent to the MIME-type profiling for
1415 evaluating other scripting languages as discussed above. The
1416 difference breaks down to convenience. Shell script piping is a
1417 "raw" implementation. It allows you to control all aspects of
1418 execution. Using the MIME-type profiling is easier, but has a
1419 lot of defaults built in that might get in the way. Another
1420 difference is that shell script piping uses the SAFEqx() function,
1421 and MIME-type profiling does not.
1422 </P>
1423
1424 <P>
1425 Execution of shell scripts is under the control of the Perl Script blocks
in the document. The MIME-type triggered execution of &lt;SCRIPT&gt;&lt;/SCRIPT&gt;
1427 blocks can be simulated easily. You can switch to a different shell, e.g. tcl,
1428 completely by executing the following Perl commands inside your document:
1429 </P>
1430
1431 <PRE>
1432 &lt;SCRIPT TYPE="text/ssperl"&gt;
1433 $main::ShellScriptContentType = "text/ssTcl";     # Yes, you can do this
1434 CGIscriptor::RedirectShellScript('/usr/bin/tcl'); # Pipe to Tcl
1435 $CGIscriptor::NoShellScriptInterpolation = 1;
1436 &lt;/SCRIPT&gt;
1437 </PRE>
1438
1439 <P>
1440 After this script is executed, CGIscriptor will parse scripts of
1441 TYPE="text/ssTcl" and pipe their contents into '|/usr/bin/tcl'
1442 WITHOUT interpolation (i.e., NO substitution of Perl variables).
1443 The crucial function is :
1444 </P>
1445
1446 <PRE>
1447 CGIscriptor::RedirectShellScript('/usr/bin/tcl')
1448 </PRE>
1449
1450 <P>
After executing this function, all shell scripts (AND all
calls to SAFEqx()) are piped into '|/usr/bin/tcl'. If the argument
1453 of RedirectShellScript is empty, e.g., '', the original (default)
1454 value is reset.
1455 </P>
1456
1457 <P>
The standard output, STDOUT, of any pipe is sent to the client.
Currently, you should be careful with quotes in such a piped script.
The results of a pipe are NOT put on the @CGIscriptorResults stack.
1461 As a result, you do not have access to the output of any piped (#!)
1462 process! If you want such access, execute
1463 </P>
1464
1465 <PRE>
&lt;SCRIPT TYPE="text/osshell"&gt;echo "script"|command&lt;/SCRIPT&gt;
1467 </PRE>
1468
1469 <P>
1470 or
1471 </P>
1472
1473 <PRE>
1474 &lt;SCRIPT TYPE="text/ssperl"&gt;
1475 $resultvar = SAFEqx('echo "script"|command');
&lt;/SCRIPT&gt;
1477 </PRE>
1478
1479 <P>
1480 Safety is never complete. Although SAFEqx() prevents some of the
1481 most obvious forms of attacks and security slips, it cannot prevent
1482 them all. Especially, complex combinations of quotes and intricate
1483 variable references cannot be handled safely by SAFEqx. So be on
1484 guard.
1485 </P>
1486
1487 <A NAME="SSPERL"><H2 ALIGN="CENTER">PERL CODE EVALUATION (CONTENT-TYPE=TEXT/SSPERL)</H2></A>
1488
1489 <P>
1490 All PERL scripts are evaluated inside a PERL package. This package
1491 has a separate name space. This isolated name space protects the
1492 CGIscriptor.pl program against interference from user code. However,
1493 some variables, e.g., $_, are global and cannot be protected. You are
1494 advised NOT to use such global variable names. You CAN write
1495 directives that directly access the variables in the main program.
1496 You do so at your own risk (there is definitely enough rope available
1497 to hang yourself). The behavior of CGIscriptor becomes undefined if
1498 you change its private variables during run time. The PERL code
1499 directives are used as in:
1500 </P>
1501
1502 <PRE>
1503 $Result = eval($directive); print $Result;'';
1504 </PRE>
1505
1506 <P>
1507 ($directive contains all text between &lt;SCRIPT&gt;&lt;/SCRIPT&gt;).
That is, the &lt;directive&gt; is treated as a ''-quoted string and
1509 the result is treated as a scalar. To prevent the VALUE of the code
1510 block from appearing on the client's screen, end the directive with
1511 ';""&lt;/SCRIPT&gt;'. Evaluated directives return the last value, just as
1512 eval(), blocks, and subroutines, but only as a scalar.
1513 </P>
1514
1515 <P>
1516 IMPORTANT: All PERL variables defined are persistent. Each &lt;SCRIPT&gt;
1517 &lt;/SCRIPT&gt; construct is evaluated as a {}-block with associated scope
1518 (e.g., for "my $var;" declarations). This means that values assigned
1519 to a PERL variable can be used throughout the document unless they
1520 were declared with "my". The following will actually work as intended
1521 (note that the ``-quotes in this example are NOT evaluated, but used
1522 as simple quotes):
1523 </P>
1524
1525 <PRE>
1526 &lt;META CONTENT="text/ssperl; CGI=`$String='abcdefg'`"&gt;
1527 anything ...
1528 &lt;SCRIPT TYPE="text/ssperl"&gt;@List = split('', $String);&lt;/SCRIPT&gt;
1529 anything ...
1530 &lt;SCRIPT TYPE="text/ssperl"&gt;join(", ", @List[1..$#List]);&lt;/SCRIPT&gt;
1531 </PRE>
1532
1533 <P>
1534 The first &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt; construct will return the
1535 value scalar(@List), the second &lt;SCRIPT TYPE="text/ssperl"&gt;&lt;/SCRIPT&gt;
1536 construct will print the elements of $String separated by commas, leaving
1537 out the first element, i.e., $List[0].
1538 </P>
1539
1540 <p>
Another warning: './' and '~/' are ALWAYS replaced by the values of
$YOUR_SCRIPTS and $YOUR_HTML_FILES, respectively. This can interfere
with pattern matching, e.g., $a =~ s/aap\./noot\./g will result in the
evaluation of $a =~ s/aap\\${YOUR_SCRIPTS}noot\./g. Use
s@<i>regexp</i>@<i>replacement</i>@g instead.
1546 </p>
1547
1548 <A NAME="SESSIONTICKETS"><H2 ALIGN="CENTER">SERVER SIDE SESSIONS AND ACCESS CONTROL (LOGIN)</H2></A>
1549 <p>
An infrastructure for user account authorization and file access control
is available. Each request is matched against a list of URL path patterns.
If the request matches, a Session Ticket is required to access the URL.
This Session Ticket should be present as a CGI parameter or Cookie, e.g.:
1554 </p>
1555 <p>
1556 CGI: SESSIONTICKET=&lt;value&gt;<br />
1557 Cookie: CGIscriptorSESSION=&lt;value&gt;</p>
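<p>
As an illustration only, a request handler could look for the ticket in both
places along the following lines. This is a JavaScript (Node.js) sketch, not
CGIscriptor's own Perl code. The function name <tt>findSessionTicket</tt> and
the order of precedence (CGI parameter before Cookie) are assumptions.
</p>

```javascript
// Hypothetical helper: look for a Session Ticket in the query string first,
// then in the Cookie header. The precedence is an assumption of this sketch.
function findSessionTicket(queryString, cookieHeader) {
  const fromQuery = new URLSearchParams(queryString || "").get("SESSIONTICKET");
  if (fromQuery) return fromQuery;
  for (const part of (cookieHeader || "").split(";")) {
    const pair = part.trim();
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // not a name=value pair
    if (pair.slice(0, eq) === "CGIscriptorSESSION") return pair.slice(eq + 1);
  }
  return null; // no ticket supplied
}
```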
1558 <p>
1559 The example implementation stores Session Tickets as files in a local
1560 directory. To create Session Tickets, a Login request must be given
1561 with a LOGIN=&lt;value&gt; CGI parameter, a user name and a (doubly hashed)
1562 password. The user name and (singly hashed) password are stored in a
1563 PASSWORD ticket with the same name as the user account (name cleaned up
1564 for security).
1565 </p>
1566 <p>
1567 The example session model implements 4 functions:
1568 <ol>
1569 <li>Login<br />
The password is hashed with the user name and server side salt, and then
hashed with REMOTE_HOST and a random salt. Client and Server both perform
these actions and the Server only grants access if the results are the same.
The server side only stores the password hashed with the user name and
server side salt. Neither the plain password nor the hashed password is
ever exchanged. Only values hashed with the one-time salt are exchanged.
1576 </li>
1577 <li>Session<br />
1578 For every access to a restricted URL, the Session Ticket is checked before
1579 access is granted. There are three session modes. The first uses a fixed
1580 Session Ticket that is stored as a cookie value in the browser (actually,
1581 as a sessionStorage value). The second uses only the IP address at login
1582 to authenticate requests. The third
1583 is a Challenge mode, where the client has to calculate the value of the
1584 next one-time Session Ticket from a value derived from the password and
1585 a random string.
1586 </li>
1587 <li>Password Change<br />
A new password is hashed with the user name and server side salt, and
then encrypted (XORed) with the old password hashed with the user name
and salt and rehashed with the login ticket number. At the server side
this operation is reversed.
Again, the stored password value is never exchanged unencrypted.
1593 </li>
1594 <li>New Account<br />
1595 The text of a new account (Type: PASSWORD) file is constructed from
1596 the new username (CGI: <em>NEWUSERNAME</em>, converted to lowercase) and
1597 hashed new password (CGI: <em>NEWPASSWORD</em>).
1598 The same process is used to encrypt
1599 the new password as is used for the Password Change function.
1600 Again, the stored password value is never exchanged unencrypted.
Some default settings are encoded. For display in the browser, the new password
is reencrypted (XORed) with a special key, the old password hash
hashed with a session specific random hex value sent initially with the
session login ticket ($RANDOMSALT).
<br />For example, for user <em>NewUser</em>
and password <em>NewPassword</em>:
1607 <pre>
1608 Type: PASSWORD
1609 Username: newuser
1610 Password: 19afeadfba8d5dcd252e157fafd3010859f8762b87682b6b6cdb3e565194fa91
1611 IPaddress: 127\.0\.0\.1
1612 AllowedPaths: ^/Private/[\w\-]+\.html?
1613 AllowedPaths: ^/Private/newuser/
1614 Salt: e93cf858a1d5626bf095ea5c25df990dfa969ff5a5dc908b22c9a5229b525f65
1615 Session: SESSION
1616 Date: Fri Jun 29 12:46:22 2012
1617 Time: 1340973982
1618 Signature: 676c35d3aa63540293ea5442f12872bfb0a22665b504f58f804582493b6ef04e
1619 </pre>
1620 The password is created with the commands:
1621 <pre>
1622 printf '%s' 'NewPasswordnewuser970e68017413fb0ea84d7fe3c463077636dd6d53486910d4a53c693dd4109b1a'|shasum -a 256
1623 </pre>
If the CPAN module Digest is installed, it is used instead of the commands.
However, the password account files are protected against unauthorized change.
To obtain a valid Password account, the following command should be given:
1627 <pre>
1628 perl CGIscriptor.pl --managelogin salt=Private/.Passwords/SALT \
1629   masterkey='Sherlock investigates oleander curry in Bath' \
1630   password='NewPassword' \
1631   Private/.Passwords/newuser
1632 </pre>
1633 </li>
1634 </ol>
1635 </p>
1636 <p>
There are four default accounts present: <em>testip, test, testchallenge</em>, and
<em>admin</em>. The first three have password <em>testing</em>, the last has password
<em>There is no password like more password</em>. The <em>admin</em>
account is disabled by default. You can enable it with a new password
using the <tt>--managelogin</tt> option. All four accounts are limited to local
(<em>localhost</em>) requests. When present, <em>testip, test</em>, and
<em>testchallenge</em> are reactivated with the default password <em>testing</em>
whenever a new SALT is automatically generated. It is advised that the <em>test</em>
accounts be removed when setting up a site.
1646 </p>
1647 <H3 ALIGN="CENTER">Implementation</H3>
1648 <p>
1649 The session authentication mechanism is based on the exchange of ticket
1650 identifiers. A ticket identifier is just a string of characters, a name
1651 or a random 64 character hexadecimal string. Authentication is based
1652 on a (password derived) shared secret and the ability to calculate ticket
1653 identifiers from this shared secret. Ticket identifiers should be
"safe" filenames (except user names). There are five types of tickets:
1655 <ul>
1656 <li>PASSWORD: User account descriptors, including a user name and password</li>
1657 <li>LOGIN: Temporary anonymous tickets used during login</li>
1658 <li>IPADDRESS: Authentication tokens that allow access based on the IP address of the request</li>
1659 <li>SESSION: Reusable authentication tokens</li>
1660 <li>CHALLENGE: One-time authentication tokens</li>
1661 </ul>
1662 All tickets can have an expiration date in the form of a time duration
1663 from creation, in seconds, minutes, hours, or days (<em>+duration</em>[smhd]).
1664 An absolute time can be given in seconds since the epoch of the server host.
1665 Note that expiration times of CHALLENGE authentication tokens are calculated
1666 from the last access time. Accounts can include a maximal lifetime
1667 for session tickets (MaxLifetime).
1668 </p>
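<p>
The expiration notation can be made concrete with a small JavaScript (Node.js)
sketch. It is not taken from CGIscriptor.pl: the function name
<tt>expiryTime</tt> is hypothetical, and the sketch assumes that a relative
duration always carries exactly one unit letter and that a bare number is an
absolute time in seconds since the epoch.
</p>

```javascript
// Seconds per unit for the +duration[smhd] notation.
const UNIT_SECONDS = { s: 1, m: 60, h: 3600, d: 86400 };

// Hypothetical helper: returns the absolute expiry time in epoch seconds.
// "created" is the ticket creation time in epoch seconds (for CHALLENGE
// tickets the text says the last access time is used instead).
function expiryTime(spec, created) {
  const relative = /^\+(\d+)([smhd])$/.exec(spec);
  if (relative) {
    return created + Number(relative[1]) * UNIT_SECONDS[relative[2]];
  }
  if (/^\d+$/.test(spec)) {
    return Number(spec); // already an absolute time in epoch seconds
  }
  throw new Error("Unrecognized expiry specification: " + spec);
}
```

<p>
Under these assumptions, <tt>expiryTime("+30m", created)</tt> yields a time
1800 seconds after <tt>created</tt>.
</p>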
1669 <p>
1670 A Login page should create a LOGIN ticket file locally and send a
1671 server specific salt, a Random salt, and a LOGIN ticket
1672 identifier. The server side compares the username and hashed password,
1673 actually hashed(hashed(password+serversalt)+Random salt) from the client with
1674 the values it calculates from the stored Random salt from the LOGIN
1675 ticket and the hashed(password+serversalt) from the PASSWORD ticket. If
1676 successful, a new SESSION ticket is generated as a (double) hash sum of the stored
1677 password and the LOGIN ticket, i.e.
1678 LoginTicket = hashed(hashed(password+serversalt)+REMOTE_HOST+Random salt) and
1679 SessionTicket = hashed(hashed(LoginTicket).LoginTicket). This SESSION
1680 ticket should also be generated by the client and stored as
1681 sessionStorage and cookie values as needed. The Username, IP address and
1682 Path are available as $LoginUsername, $LoginIPaddress, and $LoginPath,
1683 respectively.
1684 </p>
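<p>
The comparison made at login can be sketched in JavaScript (Node.js) as
follows. This is an illustration, not the code of CGIscriptor.pl. The function
names are hypothetical, the concatenation order follows the client side
listing under <em>Technical matters</em>, and REMOTE_HOST is left out for
brevity, so the values will not match those of a real installation.
</p>

```javascript
const crypto = require("crypto");

// Hex encoded SHA256 of a string.
function sha256hex(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Value kept in the PASSWORD ticket: password, user name and server salt hashed together.
function storedPassword(plain, username, serverSalt) {
  return sha256hex(plain + username.toLowerCase() + serverSalt);
}

// Value sent by the client at login (assumed order: random salt, then stored hash).
function loginResponse(stored, randomSalt) {
  return sha256hex(randomSalt + stored);
}

// Server side: recompute the response from the stored hash and the random
// salt remembered in the LOGIN ticket, then compare.
function checkLogin(clientResponse, storedOnServer, randomSalt) {
  return clientResponse === loginResponse(storedOnServer, randomSalt);
}
```

<p>
Only the output of <tt>loginResponse</tt> crosses the network in this sketch.
Because the random salt differs for every LOGIN ticket, a captured response is
of no use for a later login.
</p>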
1685 <p>
1686 The CHALLENGE protocol stores the single hashed version of the SESSION tickets.
1687 However, this value is not exchanged, but kept secret in the JavaScript
1688 <em>sessionStorage</em> object. Instead, every page returned from the
1689 server will contain a one-time Challenge value ($CHALLENGETICKET) which
1690 has to be hashed with the stored value to return the current ticket
1691 id string.
1692 </p>
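<p>
A minimal JavaScript (Node.js) sketch of the CHALLENGE calculation, assuming
the same concatenation order as the client side listing under <em>Technical
matters</em>. The function name <tt>challengeTicket</tt> is hypothetical.
</p>

```javascript
const crypto = require("crypto");

// Hex encoded SHA256 of a string.
function sha256hex(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// challenge:  the one-time $CHALLENGETICKET value embedded in the page.
// sessionKey: the secret value kept in sessionStorage, never sent itself.
function challengeTicket(challenge, sessionKey) {
  return sha256hex(challenge + sessionKey);
}
```

<p>
Each new challenge value gives a different ticket, so a ticket observed on the
network cannot be replayed for the next request.
</p>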
1693 <p>
1694 In the current example implementation, all random values are created as
1695 full, 256 bit SHA256 hash values (Hex strings) of 64 bytes read from
1696 /dev/urandom.
1697 </p>
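<p>
The same construction can be sketched in JavaScript (Node.js). Here
<tt>crypto.randomBytes</tt> stands in for reading /dev/urandom, and the
function name <tt>randomHexValue</tt> is hypothetical.
</p>

```javascript
const crypto = require("crypto");

// Draw 64 random bytes and condense them into a 256 bit SHA256 value,
// returned as a 64 character hexadecimal string.
function randomHexValue() {
  const raw = crypto.randomBytes(64); // stand-in for 64 bytes from /dev/urandom
  return crypto.createHash("sha256").update(raw).digest("hex");
}
```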
1698 <H3 ALIGN="CENTER">Authorization</H3>
1699 <p>
A limited level of authorization tuning is built into the login system.
Each account file (PASSWORD ticket file) can contain a number of
<em>Capabilities</em> lines. These control special privileges. The
Capabilities can be checked inside the HTML pages as part of the
ticket information. Two privileges are handled internally:
<em>CreateUser</em> and <em>VariableREMOTE_ADDR</em>.
<em>CreateUser</em> allows the logged in user to create a new user account.
With <em>VariableREMOTE_ADDR</em>, the session of the logged in user is
not limited to the Remote IP address from which the initial log-in took
place. Sessions can hop from one apparent (proxy) IP address to another,
1710 e.g., when using <a href="https://www.torproject.org/">Tor</a>. Any
1711 IPaddress patterns given in the PASSWORD ticket file remain in effect
1712 during the session. For security reasons, the <em>VariableREMOTE_ADDR</em>
1713 capability is only effective if the session type is <em>CHALLENGE</em>.
1714 </p>
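<p>
How such a check could look is sketched below in JavaScript (Node.js). The
"Name: value" layout follows the example account file shown earlier. That each
<em>Capabilities</em> line holds a single capability name, and that the session
type is found in the <em>Session</em> field, are assumptions of this sketch.
All function names are hypothetical.
</p>

```javascript
// Parse "Name: value" lines. Repeated names (e.g., AllowedPaths) accumulate.
function parseTicket(text) {
  const fields = new Map();
  for (const line of text.split("\n")) {
    const match = /^([A-Za-z_]+):\s*(.*)$/.exec(line.trim());
    if (!match) continue; // skip blank or malformed lines
    if (!fields.has(match[1])) fields.set(match[1], []);
    fields.get(match[1]).push(match[2]);
  }
  return fields;
}

// True if the ticket lists the named capability.
function hasCapability(fields, name) {
  return (fields.get("Capabilities") || []).includes(name);
}

// VariableREMOTE_ADDR is only effective for CHALLENGE sessions.
function mayChangeAddress(fields) {
  const isChallenge = (fields.get("Session") || []).includes("CHALLENGE");
  return isChallenge ? hasCapability(fields, "VariableREMOTE_ADDR") : false;
}
```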
1715
1716 <H3 ALIGN="CENTER">Security considerations with Session tickets</H3>
1717 <p>
For strong security, please use end-to-end encryption. This can be
achieved using a VPN (Virtual Private Network), SSH tunnel, or an HTTPS
capable server with OpenSSL. The session ticket system of CGIscriptor.pl
is intended to be used as a simple authentication mechanism WITHOUT
END-TO-END ENCRYPTION. The authenticating mechanism tries to use some
simple means to protect the authentication process from eavesdropping.
For this it uses a secure hash function, SHA256. For all practical purposes,
it is impossible to "decrypt" a SHA256 sum. But this login scheme is
only as secure as your browser, which, in general, is not very secure.
1727 </p>
1728 <p>
1729 One fundamental weakness of the implemented procedure is that the Client obtains
1730 the code to encrypt the passwords from the server. It is the JavaScript
1731 code in the HTML pages. An attacker who could place himself between Server
1732 and Client, a <em>man in the middle attack (MITM)</em>, could change the code to
1733 reveal the plaintext password and other information. There is no real
1734 protection against this attack without end-to-end encryption and
1735 authentication. A simple, but rather cumbersome, way to check for such
1736 attacks would be to store known good copies of the pages (downloaded
1737 with a browser or automatically with <em>curl</em> or <em>wget</em>) and
1738 then use other tools to download new pages at random intervals and compare
1739 them to the old pages. For instance, the following line would remove the
1740 variable ticket codes and give a fixed SHA256 sum for the original
1741 <em>Login.html</em> page+code:
1742 <pre>
1743 curl http://localhost:8080/Private/index.html | sed 's/=\"[a-z0-9]\{64\}\"/=""/g' | shasum -a 256
1744 </pre>
1745 A simple <em>diff</em> command between old and new files should give
1746 only differences in half a dozen lines, where only hexadecimal salt
1747 values will actually differ.
1748 </p>
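<p>
The same normalization can be done in a script. The JavaScript (Node.js)
sketch below mirrors the <em>sed</em> expression above on a page that has
already been downloaded. The function name <tt>stablePageDigest</tt> is
hypothetical, and the sketch assumes that all variable values appear as quoted
64 character lowercase hexadecimal attribute values.
</p>

```javascript
const crypto = require("crypto");

// Blank out quoted 64 character hex values (salts and tickets), then hash
// what is left. Pages that differ only in those values give the same digest.
function stablePageDigest(html) {
  const normalized = html.replace(/="[a-z0-9]{64}"/g, '=""');
  return crypto.createHash("sha256").update(normalized, "utf8").digest("hex");
}
```

<p>
A digest that differs from the one recorded for a known good copy indicates
that something other than the salts and tickets has changed.
</p>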
1749 <p>
1750 A sort of solution for the MITM attack problem that <em>might</em> protect at
1751 least the plaintext password would be to run a trusted web
1752 page from local storage to handle password input. The solution would be
1753 to add a hidden iFrame tag loading the untrusted page from the URL and
1754 extract the needed ticket and salt values. Then run the stored, trusted,
1755 code with these values. It is not (yet) possible to set the
1756 required session storage inside the browser, so this method only works
1757 for IPADDRESS sessions and plain SESSION tickets. There are many security
1758 problems with this "solution".
1759 </p>
1760 <p>
1761 If you are able to ascertain the integrity of the login page using any
1762 of the above methods, you can check whether the IP address seen by the
1763 login server is indeed the IP address of your computer. The IP address
1764 of the REMOTE_HOST (your visible IP address) is part of the login
"password". It is stored in the login page as a CLIENTIPADDRESS. It
can be inspected by clicking the "Check IP address" box. Provided the
MITM attacker cannot spoof your IP address, you can ensure that the login
server sees your IP address and not that of an attacker.
1769 </p>
1770 <p>
1771 Humans tend to reuse passwords. A compromise of a site running
1772 CGIscriptor.pl could therefore lead to a compromise of user accounts at
1773 other sites. Therefore, plain text passwords are never stored, used, or
1774 exchanged. Instead, the plain password and user name are "encrypted" with
1775 a server site salt value. Actually, all are concatenated and hashed
1776 with a one-way secure hash function (SHA256) into a single string.
1777 Whenever the word "password" is used, this hash sum is meant. Note that
1778 the salts are generated from /dev/urandom. You should check whether the
1779 implementation of /dev/urandom on your platform is secure before
1780 relying on it. This might be a problem when running CGIscriptor under
1781 Cygwin on MS Windows.<br />
1782 <em>Note: no attempt is made to slow down the password hash, so bad
1783 passwords can be cracked by brute force</em>
1784 </p>
1785 <p>
As the (hashed) passwords are all that is needed to authenticate at the site,
1787 these should not be stored in this form. A site specific passphrase
1788 can be entered as an environment variable ($ENV{'CGIMasterKey'}). This
1789 phrase is hashed with the server site salt and the result is hashed with
1790 the user name and then XORed with the password when it is stored. Also, to
1791 detect changes to the account (PASSWORD) and session tickets, a
1792 (HMAC) hash of some of the contents of the ticket with the server salt and
1793 CGIMasterKey is stored in each ticket.
1794 </p>
1795 <p>
Creating a valid (hashed) password, encrypting it with the CGIMasterKey, and
constructing a signature of the ticket are non-trivial. This has to be redone
with every change of the ticket file or of the CGIMasterKey. CGIscriptor
can do this from the command line with the command:
1800 <pre>
1801 perl CGIscriptor.pl --managelogin salt=Private/.Passwords/SALT \
1802   masterkey='Sherlock investigates oleander curry in Bath' \
1803   password='There is no password like more password' \
1804   admin
1805 </pre>
When the first option is <em>--managelogin</em>, CGIscriptor exits after
executing this command. Options have the form:
1808 <ul>
<li>salt=[file or string]<br />Server salt value to use instead of the value
    stored in the ticket file. Will replace the stored value if a new
    password is given. If you change the server salt, you have to
    reset all the passwords. There is <em>absolutely no</em> procedure known
    to recover plaintext passwords, except asking the account holders.
    You are strongly advised to make a backup before you apply such a change</li>
1815 <li>masterkey=[file or string]<br />CGIMasterKey used to read and decrypt
1816     the ticket</li>
<li>newmasterkey=[file or string]<br />CGIMasterKey used to encrypt, sign,
    and write the ticket. Defaults to the masterkey. If you change
    the masterkey, you will have to reset all the accounts. You are strongly
    advised to make a backup before you apply such a change</li>
1821 <li>password=[file or string]<br />New plaintext password</li>
1822 </ul>
When the value of an option is an existing file path, the first line of
1824 that file is used. Options are followed by one or more paths plus names
1825 of existing ticket files. Each password option is only used for a single
1826 ticket file. It is most definitely a bad idea to use a password that is
1827 identical to an existing filepath, as the file will be read instead. Be
1828 aware that the name of the file should be a cleaned up version of the
1829 Username. This will not be checked.
1830 </p>
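<p>
The "file or string" rule can be sketched in JavaScript (Node.js) as follows.
This is not the Perl code of CGIscriptor.pl, and the function name
<tt>resolveOptionValue</tt> is hypothetical. It also shows why a password that
happens to equal an existing file path is a bad idea: the file content is used
instead of the typed value.
</p>

```javascript
const fs = require("fs");

// If the value names an existing regular file, use the first line of that
// file. Otherwise use the value itself as a literal string.
function resolveOptionValue(value) {
  const isFile = fs.existsSync(value) ? fs.statSync(value).isFile() : false;
  if (!isFile) return value;
  return fs.readFileSync(value, "utf8").split(/\r?\n/)[0];
}
```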
1831 <p>
1832 For the authentication and a change of password, the (old) password
1833 is used to "encrypt" a random one-time token or the new password,
1834 respectively. For authentication, decryption is not needed, so a secure
1835 hash function (SHA256) is used to create a one-way hash sum "encryption".
A new password must be decrypted. New passwords are encrypted by XORing
them with the old password.
1838 </p>
1839
1840 <h3 align=CENTER>Strong Passwords: It is so easy</h3>
1841 <p align=CENTER><em>If you only could see what you are typing</em></p>
1842 <p >
1843 Your password might be vulnerable to
1844 <a href="https://en.wikipedia.org/wiki/Brute_force_attack">
1845 <em>brute force</em></a> guessing. Protections against such attacks are
1846 costly in terms of code complexity, bugs, and execution time.
1847 However, there is a very simple and secure counter measure. See the
1848 <a href="http://xkcd.com/936/" target="_blank">XKCD comic</a>. The phrase,
1849 <em>There is no password like more password</em> would
1850 be both much easier to remember, and still stronger than
1851 <em>h4]D%@m:49</em>, at least before this phrase was pasted as an example
1852 on the Internet.<br />
1853 For the procedures used at this site, a basic computer setup can check
1854 in the order of a billion passwords per second. You need a password (or
1855 phrase) strength in the order of 56 bits to be a little secure (one year
on a single computer). One of the largest networks in the world, Bitcoin
mining, can check some 12 terahashes per second (June 2012). This
corresponds to checking 6 times 10<sup>12</sup> passwords per second.
It would take a password strength of ~68 bits to keep the equivalent of
the Bitcoin computer network occupied for around a year before it found
a match.<br />
1862 Please be so kind and add the name of your favorite flower, dish,
1863 fictional character, or small town to your password. Say,
1864 <em>Oleander</em>, <em>Curry</em>, <em>Sherlock</em>, or <em>Bath</em>, UK
1865 (each adds ~12 bits) or even the phrase <em>Sherlock investigates oleander
1866 curry in Bath</em> (adds &gt; 56 bits, note that oleander is <em>poisonous</em>,
1867 so do not try this curry at home). That would be more effective than
1868 adding a thousand rounds of encryption.
Typing long passwords without seeing what you are typing is problematic.
So a button should be included to make the password visible.
1871 </p>
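<p>
The bit counts quoted above can be checked with a little arithmetic, here
written as a JavaScript (Node.js) sketch with hypothetical names. Guessing at a
given rate for a year exhausts a space of log<sub>2</sub>(rate times seconds)
bits. A match is expected, on average, after searching half the space, which
adds about one bit and brings the results close to the 56 and 68 bits
mentioned in the text.
</p>

```javascript
const SECONDS_PER_YEAR = 365 * 24 * 3600;

// Number of bits of password strength that can be searched completely
// when guessing at "rate" passwords per second for "seconds" seconds.
function bitsSearched(rate, seconds) {
  return Math.log2(rate * seconds);
}

// One computer at a billion passwords per second: about 54.8 bits per year.
const singleComputer = bitsSearched(1e9, SECONDS_PER_YEAR);

// A network checking 6 times 10^12 passwords per second: about 67.4 bits per year.
const bitcoinNetwork = bitsSearched(6e12, SECONDS_PER_YEAR);
```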
1872 <h3 align=CENTER>Technical matters</h3>
1873 <p>
Client side JavaScript code definitions. Variable names starting with '$'
are CGIscriptor CGI variables. Some of the hashes could be strengthened
by switching to HMAC signatures. However, the security issues of
maintaining parallel functions for HMAC in both Perl and JavaScript seem
to be more serious than the attack vectors against the hashes. But HMAC
is indeed used for the ticket signatures.
1880 </p>
1881 <pre>
// On Login
function HashPlaintextPassword() {
        var plaintextpassword = document.getElementById('PASSWORD');
        var serversalt = document.getElementById('SERVERSALT');
        var username = document.getElementById('CGIUSERNAME');
        return hex_sha256(plaintextpassword.value+username.value.toLowerCase()+serversalt.value);
}
var randomsalt = "$RANDOMSALT"; // From CGIscriptor
var loginticket = "$LOGINTICKET"; // From CGIscriptor
// Hash plaintext password
var password = HashPlaintextPassword();
// Authorize login
var hashedpassword = hex_sha256(randomsalt+password);
// Sessionticket
var sessionticket = hex_sha256(loginticket+password);
sessionStorage.setItem("CGIscriptorPRIVATE", sessionticket);
// Secretkey for encrypting new passwords, acts like a one-time pad
// Is set anew with every login, i.e., also with password changes
// and for each create new user request
var secretkey = hex_sha256(randomsalt+loginticket+password);
sessionStorage.setItem("CGIscriptorSECRET", secretkey);

// For a SESSION type request
sessionticket = hex_sha256(sessionStorage.getItem("CGIscriptorPRIVATE"));
createCookie("CGIscriptorSESSION",sessionticket, 0, "");

// For a CHALLENGE type request
var sessionset = "$CHALLENGETICKET"; // From CGIscriptor
var sessionkey = sessionStorage.getItem("CGIscriptorPRIVATE");
sessionticket = hex_sha256(sessionset+sessionkey);
createCookie("CGIscriptorCHALLENGE",sessionticket, 0, "");

// For transmitting a new password
function HashPlaintextNewPassword() {
        var plaintextpassword = document.getElementById('NEWPASSWORD');
        var serversalt = document.getElementById('SERVERSALT');
        var username = document.getElementById('NEWUSERNAME');
        return hex_sha256(plaintextpassword.value+username.value.toLowerCase()+serversalt.value);
}

var newpassword = document.getElementById('NEWPASSWORD');
var newpasswordrep = document.getElementById('NEWPASSWORDREP');
// Hash plaintext password
newpassword.value = HashPlaintextNewPassword();
var secretkey = sessionStorage.getItem("CGIscriptorSECRET");

// Encrypt the hashed new password by XORing it with the secret key
var encrypted = XOR_hex_strings(secretkey, newpassword.value);
newpassword.value = encrypted;
newpasswordrep.value = encrypted;

// XOR of hexadecimal strings, digit by digit
// (a shorter string is padded with zeros at the end)
function XOR_hex_strings(hex1, hex2) {
        var resultHex = "";
        var maxlength = Math.max(hex1.length, hex2.length);

        for(var i=0; i &lt; maxlength; ++i) {
                var h1 = hex1.charAt(i);
                if(! h1) h1='0';
                var h2 = hex2.charAt(i);
                if(! h2) h2 ='0';
                var d1 = parseInt(h1,16);
                var d2 = parseInt(h2,16);
                var resultD = d1^d2;
                resultHex = resultHex+resultD.toString(16);
        }
        return resultHex;
}
1949 </pre>
1950 <p>
1951 Password encryption based on <em>$ENV{'CGIMasterKey'}</em>.
1952 Server side Perl code:
1953 </p>
1954 <pre>
# Password encryption
my $masterkey = $ENV{'CGIMasterKey'};
my $hash1 = hash_string($masterkey.$serversalt);
my $CryptKey = hash_string($username.$hash1);
$password = XOR_hex_strings($CryptKey,$password);

# Key for HMAC signing
$hash1 = hash_string($masterkey.$serversalt);
my $HMACKey = hash_string($username.$hash1);
1963 my $HMACKey = hash_string($username.$hash1);
1964 </pre>
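<p>
The Perl fragment above can be mirrored in JavaScript (Node.js) to show why
the same operation both encrypts and decrypts the stored password. This is a
sketch under assumptions: <tt>hash_string</tt> is taken to be a hex encoded
SHA256, and the function names <tt>cryptKey</tt> and <tt>protect</tt> are
hypothetical.
</p>

```javascript
const crypto = require("crypto");

// Hex encoded SHA256 of a string (assumed equivalent of hash_string).
function sha256hex(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// XOR two hex strings digit by digit, padding the shorter one with zeros.
function xorHex(hex1, hex2) {
  const length = Math.max(hex1.length, hex2.length);
  let result = "";
  for (let i = 0; i !== length; i += 1) {
    const d1 = parseInt(hex1.charAt(i) || "0", 16);
    const d2 = parseInt(hex2.charAt(i) || "0", 16);
    result += (d1 ^ d2).toString(16);
  }
  return result;
}

// Key derived from the master key, the server salt and the user name.
function cryptKey(masterKey, serverSalt, username) {
  return sha256hex(username + sha256hex(masterKey + serverSalt));
}

// XOR is its own inverse: applying protect() twice returns the input.
function protect(hashedPassword, masterKey, serverSalt, username) {
  return xorHex(cryptKey(masterKey, serverSalt, username), hashedPassword);
}
```

<p>
Without the master key, the value written to the ticket file cannot be turned
back into the hashed password that is used to log in.
</p>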
1965
1966 <A NAME="USEREXTENSIONS"><H2 ALIGN="CENTER">USER EXTENSIONS</H2></A>
1967
1968 <P>
1969 A CGIscriptor package is attached to the bottom of this file. With
1970 this package you can personalize your version of CGIscriptor by
1971 including often used perl routines. These subroutines can be
1972 accessed by prefixing their names with CGIscriptor::, e.g.,
1973 </P>
1974
1975 <PRE>
1976 &lt;SCRIPT TYPE="text/ssperl"&gt;
1977 CGIscriptor::ListDocs("/Books/*") # List all documents in /Books
1978 &lt;/SCRIPT&gt;
1979 </PRE>
1980
1981 <P>
1982 It already contains some useful subroutines for Document Management.
1983 As it is a separate package, it has its own namespace, isolated from
1984 both the evaluator and the main program. To access variables from
1985 the document &lt;SCRIPT&gt;&lt;/SCRIPT&gt; blocks, use $CGIexecute::&lt;var&gt;.
1986 </P>
1987
1988 <P>
1989 Currently, the following functions are implemented
1990 (precede them with CGIscriptor::, see below for more information)
1991 </P>
1992
1993 <UL>
1994     <LI>SAFEqx ('String') -&gt; result of qx/"String"/ # Safe application of ``-quotes<br>
1995     Is used by text/osshell Shell scripts. Protects all CGI
1996     (client-supplied) values with single quotes before executing the
1997     commands (one of the few functions that also works WITHOUT CGIscriptor::
1998     in front)
    <LI>defineCGIvariable ($name[, $default]) -&gt; 0/1 (i.e.,
2000     failure/success)<br>
2001     Is used by the META tag to define and initialize CGI and ENV
2002     name/value pairs. Tries to obtain an initializing value from (in
2003     order):<br>
2004     $ENV{$name}<br>
2005     The Query string<br>
2006     The default value given (if any)<br>
2007     (one of the few functions that also works WITHOUT CGIscriptor::
2008     in front)
2009     <LI>CGIsafeFileName (FileName) -> FileName or ""<br>
2010     Check a string against the Allowed File Characters (and ../ /..).
2011     Returns an empty string for unsafe filenames.
2012     <LI>CGIsafeEmailAddress (Email) -> Email or ""<br>
2013     Check a string against correct email address pattern.
2014     Returns an empty string for unsafe addresses.
2015     <LI>RedirectShellScript ('CommandString') -&gt; FILEHANDLER or undef<br>
2016     Open a named PIPE for SAFEqx to receive ALL shell scripts
2017     <LI>URLdecode (URL encoded string) -&gt; plain string # Decode URL encoded argument<br>
2018     <LI>URLencode (plain string) -&gt; URL encoded string # Encode argument as URL code<br>
2019     <LI>CGIparseValue (ValueName [, URL_encoded_QueryString]) -&gt; Decoded value<br>
2020     Extract the value of a CGI variable from the global or a private
2021     URL-encoded query (multipart POST raw, NOT decoded)
2022     <li>CGIparseValueList (ValueName [, URL_encoded_QueryString])
2023      -&gt; List of decoded values.<br>
2024     As CGIparseValue, but now assembles ALL values of ValueName into a list.
2025     <LI>CGIparseHeader (ValueName [, URL_encoded_QueryString]) -> Header<br>
2026     Extract the header of a multipart CGI variable from the global or a private
2027     URL-encoded query ("" when not a multipart variable or absent)
2028     <LI>CGIparseForm ([URL_encoded_QueryString]) -&gt; Decoded Form<br>
2029     Decode the complete global URL-encoded query or a private
2030     URL-encoded query
2031     <LI>read_url(URL)<br>
    Returns the page from URL (with added base tag, both FTP and HTTP).
2033     Uses main::GET_URL(URL, 1) to get at the command to read the URL.
2034     <LI>BrowseDirs(RootDirectory [, Pattern, Startdir, CGIname]) # print browsable directories
2035     <LI>ListDocs(Pattern [,ListType])  # Prints a nested HTML directory listing of
2036     all documents, e.g., ListDocs("/*", "dl");.<br>
2037     <LI>HTMLdocTree(Pattern [,ListType])  # Prints a nested HTML listing of all
2038     local links starting from a given document, e.g.,
2039     HTMLdocTree("/Welcome.html", "dl");<br>
2040 </UL>
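<P>
To give a feel for what CGIparseValue and CGIparseValueList return, here is a
rough JavaScript (Node.js) analogue for URL-encoded queries. It is only an
illustration: the names <tt>parseValue</tt> and <tt>parseValueList</tt> are
hypothetical, multipart POST data is not handled, and returning the first
value for a repeated name is an assumption about CGIparseValue.
</P>

```javascript
// Decoded value of the first occurrence of "name" in a URL-encoded query,
// or null when the name is absent.
function parseValue(name, query) {
  return new URLSearchParams(query).get(name);
}

// Decoded values of ALL occurrences of "name", in order of appearance.
function parseValueList(name, query) {
  return new URLSearchParams(query).getAll(name);
}
```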
2041
2042 <A NAME="RESULTSSTACK"><H2 ALIGN="CENTER">THE RESULTS STACK: @CGIscriptorResults</H2></A>
2043
2044 <P>
2045 If the pseudo-variable "$CGIscriptorResults" has been defined in a
2046 META tag, all subsequent SCRIPT and META results are pushed
2047 on the @CGIscriptorResults stack. This list is just another
2048 Perl variable and can be used and manipulated like any other list.
2049 $CGIscriptorResults[-1] is always the last result.
2050 This is only of limited use, e.g., to use the results of an OS shell
2051 script inside a Perl script. Will NOT contain the results of Pipes
2052 or code from MIME-profiling.
2053 </P>
2054
<A NAME="CGIPREDEFINED"><H2 ALIGN="CENTER">USEFUL CGI PREDEFINED VARIABLES (DO NOT ASSIGN TO THESE)</H2></A>
2056
2057 <ul>
2058 <li>$CGI_HOME - The ServerRoot directory
2059 <li>$CGI_Decoded_QS - The complete decoded Query String
2060 <li>$CGI_Content_Length - The ACTUAL length of the Query String
2061 <li>$CGI_Date - Current date and time
2062 <li>$CGI_Year $CGI_Month $CGI_Day $CGI_WeekDay - Current Date
2063 <li>$CGI_Time - Current Time
2064 <li>$CGI_Hour $CGI_Minutes $CGI_Seconds - Current Time, split
2065 GMT Date/Time:
2066 <li>$CGI_GMTYear $CGI_GMTMonth $CGI_GMTDay $CGI_GMTWeekDay $CGI_GMTYearDay
2067 <li>$CGI_GMTHour $CGI_GMTMinutes $CGI_GMTSeconds $CGI_GMTisdst
2068 </ul>
2069
<A NAME="ENVIRONMENT"><H2 ALIGN="CENTER">USEFUL CGI ENVIRONMENT VARIABLES</H2></A>
2071
2072 <P>
2073 Variables accessible (in APACHE) as $ENV{"&lt;name&gt;"}
2074 (see: "http://hoohoo.ncsa.uiuc.edu/cgi/env.html"):
2075 </P>
2076
2077 <UL>
2078     <LI>QUERY_STRING - The query part of URL, that is, everything that follows the
2079     question mark.
2080     <LI>PATH_INFO    - Extra path information given after the script name
2081     <LI>PATH_TRANSLATED - Extra pathinfo translated through the rule system.
2082     (This doesn't always make sense.)
2083     <LI>REMOTE_USER  - If the server supports user authentication, and the script is
2084     protected, this is the username they have authenticated as.
2085     <LI>REMOTE_HOST  - The hostname making the request. If the server does not have
2086     this information, it should set REMOTE_ADDR and leave this unset
2087     <LI>REMOTE_ADDR  - The IP address of the remote host making the request.
2088     <LI>REMOTE_IDENT - If the HTTP server supports RFC 931 identification, then this
2089     variable will be set to the remote user name retrieved from
2090     the server. Usage of this variable should be limited to logging
2091     only.
2092     <LI>AUTH_TYPE    - If the server supports user authentication, and the script
2093     is protected, this is the protocol-specific authentication
2094     method used to validate the user.
2095     <LI>CONTENT_TYPE - For queries which have attached information, such as HTTP
2096     POST and PUT, this is the content type of the data.
2097     <LI>CONTENT_LENGTH - The length of the said content as given by the client.
2098     <LI>SERVER_SOFTWARE - The name and version of the information server software
2099     answering the request (and running the gateway).
2100     Format: name/version
2101     <LI>SERVER_NAME  - The server's hostname, DNS alias, or IP address as it
2102     would appear in self-referencing URLs
2103     <LI>GATEWAY_INTERFACE - The revision of the CGI specification to which this
2104     server complies. Format: CGI/revision
2105     <LI>SERVER_PROTOCOL - The name and revision of the information protocol this
2106     request came in with. Format: protocol/revision
2107     <LI>SERVER_PORT  - The port number to which the request was sent.
2108     <LI>REQUEST_METHOD - The method with which the request was made. For HTTP,
2109     this is "GET", "HEAD", "POST", etc.
2110     <LI>SCRIPT_NAME  - A virtual path to the script being executed, used for
2111     self-referencing URLs.
2112     <LI>HTTP_ACCEPT  - The MIME types which the client will accept, as given by
2113     HTTP headers. Other protocols may need to get this
2114     information from elsewhere. Each item in this list should
2115     be separated by commas as per the HTTP spec.
2116     Format: type/subtype, type/subtype
2117     <LI>HTTP_USER_AGENT - The browser the client is using to send the request.
2118     General format: software/version library/version.
2119 </UL>
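
<P>
As an illustration of how a script can read the variables listed above, here is a
minimal Perl sketch. It is not part of CGIscriptor.pl; the function name
describe_cgi_environment and the choice of variables are assumptions made
for this example only.
</P>

<PRE>
#!/usr/bin/perl
use strict;
use warnings;

# Return "NAME = value" lines for the requested CGI environment variables.
# Variables the server did not set are reported as "(not set)".
sub describe_cgi_environment {
    my @names = @_;
    return map {
        my $value = defined $ENV{$_} ? $ENV{$_} : '(not set)';
        "$_ = $value";
    } @names;
}

# Print a small plain-text report (hypothetical selection of variables)
print "Content-type: text/plain\n\n";
print "$_\n" for describe_cgi_environment(
    qw(REQUEST_METHOD PATH_INFO QUERY_STRING REMOTE_ADDR HTTP_USER_AGENT)
);
</PRE>

<P>
Which variables are actually filled in depends on the server and on the request,
so a script should always be prepared for unset values.
</P>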
2120
2121 <A NAME="RUNNING"><H2 ALIGN="CENTER">INSTRUCTIONS FOR RUNNING CGIscriptor ON UNIX</H2></A>
2122
2123 <P>
2124 CGIscriptor.pl will run on any WWW server that runs Perl scripts.
2125 Just add a line like the following to your srm.conf file
2126 (Apache example):
2127 </P>
2128
2129 <pre>
2130 ScriptAlias /SHTML/ /real-path/CGIscriptor.pl/
2131 </pre>
2132
2133 <p>
2134 URLs that refer to http://www.your.address/SHTML/... will now be handled
2135 by CGIscriptor.pl, which can use a private directory tree (default is the
2136 DOCUMENT_ROOT directory tree, but it can be anywhere, see manual).
2137 </P>
2138
2139 <p>
2140 If your hosting ISP won't let you add ScriptAlias lines you can use
2141 the following "rewrite"-based "scriptalias" in .htaccess
2142 (from Gerd Franke):
2143 </P>
2144
2145 <pre>
2146 RewriteEngine   On
2147 RewriteBase  /
2148 RewriteCond %{REQUEST_FILENAME} \.html$
2149 RewriteCond %{SCRIPT_FILENAME}  !cgiscriptor\.pl$
2150 RewriteCond %{REQUEST_FILENAME} -f
2151 RewriteRule     ^(.*)$  /cgi-bin/cgiscriptor.pl/$1?%{QUERY_STRING}
2152 </Pre>
2153
2154 <p>
2155 Everything with the extension ".html", without "cgiscriptor.pl"
2156 in the URL, and for which the file "path/filename.html" exists is redirected
2157 to "/cgi-bin/cgiscriptor.pl/path/filename.html?query".
2158 The user configuration should point to the same directory level as the
2159 .htaccess file:
2160 </P>
2161
2162 <pre>
2163 # Just enter your own directory path here
2164 $YOUR_HTML_FILES = "$ENV{'DOCUMENT_ROOT'}";
2165 # Use DOCUMENT_ROOT on its own only if .htaccess is in the root directory.
2166 </Pre>
2167
2168 <p>
2169 If the .htaccess file goes in a subdirectory, the path to that
2170 directory must be appended to $ENV{'DOCUMENT_ROOT'}.
2171 </p>
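
<P>
For example, assuming the .htaccess file is placed in a directory called
"manuals" directly below the document root (a hypothetical name chosen
for this illustration), the setting might look like this:
</P>

<PRE>
# Hypothetical example: .htaccess lives in DOCUMENT_ROOT/manuals
$YOUR_HTML_FILES = "$ENV{'DOCUMENT_ROOT'}/manuals";
</PRE>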
2172
2173 <p>
2174 The CGIscriptor file contains all documentation as comments. These comments
2175 can be removed to speed up loading (e.g., `egrep -v '^#' CGIscriptor.pl &gt;
2176 leanScriptor.pl`). A bare bones version of CGIscriptor.pl, lacking
2177 documentation, most comments, access control, example functions etc.
2178 (but still with the copyright notice and some minimal documentation)
2179 can be obtained by calling CGIscriptor.pl on the command line with the
2180 '-slim' command line argument, e.g.,
2181 </p>
2182
2183 <PRE>
2184 &gt;CGIscriptor.pl -slim &gt; slimCGIscriptor.pl
2185 </PRE>
2186
2187 <P>
2188 CGIscriptor.pl can be run from the command line with &lt;path&gt; and &lt;query&gt; as
2189 arguments, as `CGIscriptor.pl &lt;path&gt; &lt;query&gt;`, inside a perl script with
2190 'do CGIscriptor.pl' after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING},
2191 or CGIscriptor.pl can be loaded with 'require "/real-path/CGIscriptor.pl"'.
2192 In the latter case, requests are processed by 'Handle_Request();'
2193 (again after setting $ENV{PATH_INFO} and $ENV{QUERY_STRING}).
2194 </P>
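
<P>
The 'require' variant described above could look roughly like the following
sketch. The path, the document name and the query string are placeholders,
not values taken from a real installation; only 'require' and
'Handle_Request();' come from the description above.
</P>

<PRE>
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder values: replace with your own install path, document and query
my $cgiscriptor = '/real-path/CGIscriptor.pl';

$ENV{PATH_INFO}    = '/index.html';
$ENV{QUERY_STRING} = 'name=value';

if (-r $cgiscriptor) {
    require $cgiscriptor;   # load CGIscriptor.pl once
    Handle_Request();       # process the request described by %ENV
} else {
    warn "Cannot read $cgiscriptor\n";
}
</PRE>

<P>
To process several documents from one script, it should be enough to set
$ENV{PATH_INFO} and $ENV{QUERY_STRING} again before each further call to
'Handle_Request();'.
</P>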
2195
2196 <p>
2197 The --help command line switch will print the manual.
2198 </p>
2199
2200 <P>
2201 Using the command line execution option, CGIscriptor.pl can be used as a document
2202 (meta-)preprocessor. If the first argument is '-', STDIN will be read. For example:
2203 </P>
2204
2205 <PRE>
2206 &gt; cat MyDynamicDocument.html | CGIscriptor.pl - '[QueryString]' &gt; MyStaticFile.html
2207 </PRE>
2208
2209 <P>
2210 This command line will produce a STATIC file with the DYNAMIC content of
2211 MyDynamicDocument.html "interpolated". This option would be very dangerous if it were
2212 available over the internet. If someone could sneak a
2213 'http://www.your.domain/-' URL past your server, CGIscriptor could EXECUTE
2214 any POSTED content. Therefore, for security reasons, STDIN will NOT
2215 be read if ANY of the HTTP server environment variables is set (e.g., SERVER_PORT,
2216 SERVER_PROTOCOL, SERVER_NAME, SERVER_SOFTWARE, HTTP_USER_AGENT,
2217 REMOTE_ADDR).<br>
2218 This block on processing STDIN on HTTP requests can be lifted by setting
2219 <pre>
2220 $BLOCK_STDIN_HTTP_REQUEST = 0;
2221 </pre>
2222 in the security configuration. But be careful when doing this.
2223 It can be very dangerous.
2224 </P>
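
<P>
The rule for blocking STDIN can be pictured with the following minimal sketch.
It is an illustration only and not the code CGIscriptor.pl actually uses: the
function name stdin_allowed is made up, and "set" is assumed here to mean
that the variable exists in the environment.
</P>

<PRE>
use strict;
use warnings;

# Variables that a web server sets for CGI requests (taken from the text above)
my @HTTP_SERVER_VARIABLES = qw(
    SERVER_PORT SERVER_PROTOCOL SERVER_NAME
    SERVER_SOFTWARE HTTP_USER_AGENT REMOTE_ADDR
);

# Return 1 when STDIN may be read as a document: either the block is
# switched off, or none of the server variables is present in %env.
sub stdin_allowed {
    my ($block_stdin_http_request, %env) = @_;
    return 1 unless $block_stdin_http_request;
    return (grep { exists $env{$_} } @HTTP_SERVER_VARIABLES) ? 0 : 1;
}

print stdin_allowed(1, %ENV) ? "STDIN may be read\n" : "STDIN is blocked\n";
</PRE>

<P>
Run from a plain command line, this sketch reports that STDIN may be read. As
soon as one of the listed variables is present, it reports that STDIN is
blocked, unless the first argument (standing in for $BLOCK_STDIN_HTTP_REQUEST)
is 0.
</P>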
2225
2226 <P>
2227 Running demos and more information can be found at
2228 http://www.fon.hum.uva.nl/~rob/OSS/OSS.html
2229 </P>
2230
2231 <P>
2232 A pocket-size HTTP daemon, CGIservlet.pl, is available from my web site
2233 or CPAN. It can use CGIscriptor.pl as the base of a µWWW server and
2234 demonstrates its use.
2235 </P>
2236
2237 <A NAME="NON-UNIX"><H2 ALIGN="CENTER">NON-UNIX PLATFORMS</H2></A>
2238
2239 <P>
2240 CGIscriptor.pl was mainly developed and tested on UNIX. However, as I
2241 coded part of the time on an Apple Macintosh under MacPerl, I made sure
2242 CGIscriptor ran under MacPerl (with command line options), but only as
2243 an independent script, not as part of an HTTP server. I have also used it
2244 under Apache in Windows XP.
2245 </P>
2246
2247 <A NAME="license"><H2 ALIGN="CENTER">LICENSE</H2></A>
2248
2249 <P>
2250 This program is free software; you can redistribute it and/or
2251 modify it under the terms of the GNU General Public License
2252 as published by the Free Software Foundation; either version 2
2253 of the License, or (at your option) any later version.
2254 </P>
2255
2256 <P>
2257 This program is distributed in the hope that it will be useful,
2258 but WITHOUT ANY WARRANTY; without even the implied warranty of
2259 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
2260 GNU General Public License for more details.
2261 </P>
2262
2263 <P>
2264 You should have received a copy of the GNU General Public License
2265 along with this program; if not, write to the Free Software
2266 Foundation, Inc., 59 Temple Place - Suite 330,
2267 Boston, MA  02111-1307, USA.
2268 </P>
2269
2270 <PRE>
2271 Author: Rob van Son
2272         email:
2273         R.J.J.H.vanSon@gmail.com
2274         University of Amsterdam
2275
2276 Date:   May 22, 2000
2277 Ver:    2.0
2278 Env:    Perl 5.002
2279 </PRE>
2280 </BODY>
2281
2282 </HTML>