fangle.tm

   1 <TeXmacs|1.0.7.10>
   2
   3 <style|<tuple|book|fangle|header-book|tmdoc-keyboard>>
   4
   5 <\body>
   6   <hide-preamble|<assign|LyX|<macro|L<space|-0.1667em><move|Y|0fn|-0.25em><space|-0.125em>X>><assign|par-first|0fn><assign|par-par-sep|0.5fn>>
   7
   8   <doc-data|<doc-title|fangle>|<doc-author-data|<author-name|Sam
   9   Liddicott>|<\author-address>
  10     sam@liddicott.com
  11   </author-address>>|<doc-date|August 2009>>
  12
  13   <section*|Introduction>
  14
  15   <name|Fangle> is a tool for fangled literate programming. Newfangled is
  16   defined as <em|New and often needlessly novel> by
  17   <name|TheFreeDictionary.com>.
  18
  19   In this case, fangled means yet another not-so-new<footnote|but improved.>
  20   method for literate programming.
  21
  22   <name|Literate Programming> has a long history starting with the great
  23   <name|Donald Knuth> himself, whose literate programming tools seem to make
  24   use of as many escape sequences for semantic markup as <TeX> (also by
  25   <name|Donald Knuth>).
  26
  27   <name|Norman Ramsey> wrote the <name|Noweb> set of tools
  28   (<verbatim|notangle>, <verbatim|noweave> and <verbatim|noroots>) and
  29   helpfully reduced the amount of magic character sequences to pretty much
  30   just <verbatim|\<less\>\<less\>>, <verbatim|\<gtr\>\<gtr\>> and
  31   <verbatim|@>, and in doing so brought the wonders of literate programming
  32   within my reach.
  33
  34   While using the <LyX> editor for <LaTeX> editing I had various troubles
  35   with the noweb tools, some of which were my fault, some of which were
  36   noweb's fault and some of which were <LyX>'s fault.
  37
  38   <name|Noweb> generally brought literate programming to the masses through
  39   removing some of the complexity of the original literate programming, but
  40   this would be of no advantage to me if the <LyX> / <LaTeX> combination
  41   brought more complications in their place.
  42
  43   <name|Fangle> was thus born (originally called <name|Newfangle>) as an awk
  44   replacement for notangle, adding some important features, like better
  45   integration with <LyX> and <LaTeX> (and later <TeXmacs>), multiple output
  46   format conversions, and fixing notangle bugs like indentation when using -L
  47   for line numbers.
  48
  49   Significantly, fangle is just one program which replaces various programs
  50   in <name|Noweb>. Noweave is done away with and implemented directly as
  51   <LaTeX> macros, and noroots is implemented as a function of the untangler
  52   fangle.
  53
  54   Fangle is written in awk for portability reasons, awk being available for
  55   most platforms. A Python version<\footnote>
  56     hasn't anyone implemented awk in python yet?
  57   </footnote> was considered for the benefit of <LyX> but a scheme version
  58   for <TeXmacs> will probably materialise first, as <TeXmacs> macro
  59   capabilities help make edit-time and format-time rendering of fangle chunks
  60   simple enough for my weak brain.
  61
  62   As an extension to many literate-programming styles, Fangle permits code
  63   chunks to take parameters and thus operate somewhat like C pre-processor
  64   macros, or like C++ templates. Named parameters (or even local
  65   <em|variables> in the caller's scope) are anticipated, as parameterized
  66   chunks <emdash> useful though they are <emdash> are hard to comprehend in
  67   the literate document.
  68
  69   <section*|License><new-page*><label|License>
  70
  71   Fangle is licensed under the GPL 3 (or later).
  72
  73   This doesn't mean that sources generated by fangle must be licensed under
  74   the GPL 3.
  75
  76   This doesn't mean that you can't use or distribute fangle with sources of
  77   an incompatible license, but it means you must make the source of fangle
  78   available too.
  79
  80   As fangle is currently written in awk, an interpreted language, this should
  81   not be too hard.
  82
  83   <\nf-chunk|gpl3-copyright>
  84     <item># fangle - fully featured notangle replacement in awk
  85
  86     <item>#
  87
  88     <item># Copyright (C) 2009-2010 Sam Liddicott
  89     \<less\>sam@liddicott.com\<gtr\>
  90
  91     <item>#
  92
  93     <item># This program is free software: you can redistribute it and/or
  94     modify
  95
  96     <item># it under the terms of the GNU General Public License as published
  97     by
  98
  99     <item># the Free Software Foundation, either version 3 of the License, or
 100
 101     <item># (at your option) any later version.
 102
 103     <item>#
 104
 105     <item># This program is distributed in the hope that it will be useful,
 106
 107     <item># but WITHOUT ANY WARRANTY; without even the implied warranty of
 108
 109     <item># MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. \ See the
 110
 111     <item># GNU General Public License for more details.
 112
 113     <item>#
 114
 115     <item># You should have received a copy of the GNU General Public License
 116
 117     <item># along with this program. \ If not, see
 118     \<less\>http://www.gnu.org/licenses/\<gtr\>.
 119   </nf-chunk|text|>
 120
 121   <\table-of-contents|toc>
 122   </table-of-contents>
 123
 124   <part|Using Fangle>
 125
 126   <chapter|Introduction to Literate Programming>
 127
 128   Todo: Should really follow on from a part-0 explanation of what literate
 129   programming is.
 130
 131   <chapter|Running Fangle>
 132
 133   Fangle is a replacement for <name|noweb>, which consists of
 134   <verbatim|notangle>, <verbatim|noroots> and <verbatim|noweave>.
 135
 136   Like <verbatim|notangle> and <verbatim|noroots>, <verbatim|fangle> can read
 137   multiple named files, or from stdin.
 138
 139   <section|Listing roots>
 140
 141   The -r option causes fangle to behave like noroots.
 142
 143   <code*|fangle -r filename.tex>
 144
 145   will print out the fangle roots of a tex file.\
 146
 147   Unlike the <verbatim|noroots> command, the printed roots are not enclosed
 148   in angle brackets e.g. <verbatim|\<less\>\<less\>name\<gtr\>\<gtr\>>,
 149   unless at least one of the roots is defined using the <verbatim|notangle>
 150   notation <verbatim|\<less\>\<less\>name\<gtr\>\<gtr\>=>.
 151
 152   Also, unlike noroots, it prints out all roots --- not just those that are
 153   not used elsewhere. I find that a root not being used doesn't make it
 154   particularly top level <emdash> and so-called top level roots could also be
 155   included in another root as well.\
 156
 157   My convention is that top level roots to be extracted begin with
 158   <verbatim|./> and have the form of a filename.
 159
 160   Makefile.inc, discussed in <reference|makefile.inc>, can automatically
 161   extract all such sources prefixed with <verbatim|./>
 162
 163   <section|Extracting roots>
 164
 165   notangle's <verbatim|-R> and <verbatim|-L> options are supported.
 166
 167   If you are using <LyX> or <LaTeX>, the standard way to extract a file would
 168   be:
 169
 170   <verbatim|fangle -R./Makefile.inc fangle.tex \<gtr\> ./Makefile.inc>
 171
 172   If you are using <TeXmacs>, the standard way to extract a file would
 173   similarly be:
 174
 175   <verbatim|fangle -R./Makefile.inc fangle.txt \<gtr\> ./Makefile.inc>
 176
 177   <TeXmacs> users would obtain the text file with a <em|verbatim> export from
 178   <TeXmacs> which can be done on the command line with <verbatim|texmacs -s
 179   -c fangle.tm fangle.txt -q>
 180
 181   Unlike the <verbatim|notangle> command, the <verbatim|-L> option
 182   to generate C pre-processor <verbatim|#line> style line-number
 183   directives does not break indenting of the generated file.
 184
 185   Also, thanks to mode tracking (described in <reference|modes>) the
 186   <verbatim|-L> option does not interrupt (and break) multi-line C macros
 187   either.
 188
 189   This does mean that sometimes the compiler might calculate the source line
 190   wrongly when generating error messages in such cases, but there is no way
 191   around this if multi-line macros include other chunks.
 192
 193   Future releases will include a mapping file so that line/character
 194   references from the C compiler can be converted to the correct part of the
 195   source document.
 196
 197   <section|Formatting the document>
 198
 199   The noweave replacement is built into the editing and formatting environment
 200   for <TeXmacs>, <LyX> (which uses <LaTeX>), and even for raw <LaTeX>.
 201
 202   Use of fangle with <TeXmacs>, <LyX> and <LaTeX> is explained in the next
 203   few chapters.
 204
 205   <chapter|Using Fangle with <LaTeX>>
 206
 207   Because the noweave replacement is implemented in <LaTeX>, there is no
 208   processing stage required before running the <LaTeX> command. Of course,
 209   <LaTeX> may need running two or more times, so that the code chunk
 210   references can be fully calculated.
 211
 212   The formatting is managed by a set of macros shown in
 213   <reference|latex-source>, and can be included with:
 214
 215   <verbatim|\\usepackage{fangle}>
 216
 217   Norman Ramsey's original <filename|noweb.sty> package is currently required
 218   as it is used for formatting the code chunk captions.
 219
 220   The <filename|listings.sty> package is required, and is used for formatting
 221   the code chunks and syntax highlighting.
 222
 223   The <filename|xargs.sty> package is also required, and makes writing
 224   <LaTeX> macros so much more pleasant.
 225
 226   <todo|Add examples of use of Macros>
 227
 228   <chapter|Using Fangle with <LyX>>
 229
 230   <LyX> uses the same <LaTeX> macros shown in <reference|latex-source> as
 231   part of a <LyX> module file <filename|fangle.module>, which automatically
 232   includes the macros in the document pre-amble provided that the fangle
 233   <LyX> module is used in the document.
 234
 235   <section|Installing the <LyX> module>
 236
 237   Copy <filename|fangle.module> to your <LyX> layouts directory, which for
 238   unix users will be <filename|~/.lyx/layouts>
 239
 240   In order to make the new literate styles available, you will need to
 241   reconfigure <LyX> by clicking Tools-\<gtr\>Reconfigure, and then re-start
 242   <LyX>.
 243
 244   <section|Obtaining a decent mono font>
 245
 246   The syntax high-lighting features of <name|lstlistings> make use of bold;
 247   however a mono-space tt font is used to typeset the listings. Obtaining a
 248   <with|font-family|tt|<strong|bold> tt font> can be impossibly difficult and
 249   amazingly easy. I spent many hours at it, following complicated
 250   instructions from those who had spent many hours over it, and was finally
 251   delivered the simple solution on the lyx mailing list.
 252
 253   <subsection|txfonts>
 254
 255   The simple way was to add this to my preamble:
 256
 257   <\verbatim>
 258     \\usepackage{txfonts}
 259
 260     \\renewcommand{\\ttdefault}{txtt}
 261   </verbatim>
 262
 263   \;
 264
 265   <subsection|ams pmb>
 266
 267   The next simplest way was to use ams poor-mans-bold, by adding this to the
 268   pre-amble:
 269
 270   <\verbatim>
 271     \\usepackage{amsbsy}
 272
 273     %\\renewcommand{\\ttdefault}{txtt}
 274
 275     %somehow make \\pmb be the command for bold, forgot how, sorry, above
 276     line not work
 277   </verbatim>
 278
 279   It works, but looks wretched on the dvi viewer.
 280
 281   <subsection|Luximono>
 282
 283   The lstlistings documentation suggests using Luximono.
 284
 285   Luximono was installed according to the instructions in Ubuntu Forums
 286   thread 1159181<\footnote>
 287     http://ubuntuforums.org/showthread.php?t=1159181
 288   </footnote> with tips from miknight<\footnote>
 289     http://miknight.blogspot.com/2005/11/how-to-install-luxi-mono-font-in.html
 290   </footnote> stating that <verbatim|sudo updmap --enable MixedMap ul9.map>
 291   is required. It looks fine in PDF and PS view but still looks rotten in dvi
 292   view.
 293
 294   <section|Formatting your <LyX> document>
 295
 296   It is not necessary to base your literate document on any of the original
 297   <LyX> literate classes; so select a regular class for your document type.
 298
 299   Add the new module <em|Fangle Literate Listings> and also <em|Logical
 300   Markup> which is very useful.
 301
 302   In the drop-down style listbox you should notice a new style defined,
 303   called <em|Chunk>.
 304
 305   When you wish to insert a literate chunk, you enter its plain name in the
 306   Chunk style, instead of the old <name|noweb> method that uses
 307   <verbatim|\<less\>\<less\>name\<gtr\>\<gtr\>=> type tags. In the line (or
 308   paragraph) following the chunk name, you insert a listing with:
 309   Insert-\<gtr\>Program Listing.
 310
 311   Inside the white listing box you can type (or paste using
 312   <kbd|shift+ctrl+V>) your listing. There is no need to use <kbd|ctrl+enter>
 313   at the end of lines as with some older <LyX> literate techniques --- just
 314   press enter as normal.
 315
 316   <subsection|Customising the listing appearance>
 317
 318   The code is formatted using the <name|lstlistings> package. The chunk style
 319   doesn't just define the chunk name, but can also define any other chunk
 320   options supported by the lstlistings package <verbatim|\\lstset> command.
 321   In fact, what you type in the chunk style is raw latex. If you want to set
 322   the chunk language without having to right-click the listing, just add
 323   <verbatim|,language=C> after the chunk name. (Currently the language will
 324   affect all subsequent listings, so you may need to specify
 325   <verbatim|,language=> quite a lot).
 326
 327   <todo|so fix the bug>
 328
 329   Of course you can do this by editing the listings box advanced properties
 330   by right-clicking on the listings box, but that takes longer, and you can't
 331   see at-a-glance what the advanced settings are while editing the document;
 332   also advanced settings apply only to that box --- the chunk settings apply
 333   through the rest of the document<\footnote>
 334     It ought to apply only to subsequent chunks of the same name. I'll fix
 335     that later
 336   </footnote>.
 337
 338   <todo|So make sure they only apply to chunks of that name>
 339
 340   <subsection|Global customisations>
 341
 342   As lstlistings is used to set the code chunks, its <verbatim|\\lstset>
 343   command can be used in the pre-amble to set some document wide settings.
 344
 345   If your source has many words with long sequences of capital letters, then
 346   <verbatim|columns=fullflexible> may be a good idea, or the capital letters
 347   will get crowded. (I think lstlistings ought to use a slightly smaller font
 348   for capital letters so that they still fit).
 349
 350   The font family <verbatim|\\ttfamily> looks more normal for code, but has
 351   no bold (an alternate typewriter font is used).\
 352
 353   With <verbatim|\\ttfamily>, I must also specify
 354   <verbatim|columns=fullflexible> or the wrong letter spacing is used.
 355
 356   In my <LaTeX> pre-amble I usually specialise my code format with:
 357
 358   <\nf-chunk|document-preamble>
 359     <item>\\lstset{
 360
 361     <item>numbers=left, stepnumber=1, numbersep=5pt,
 362
 363     <item>breaklines=false,
 364
 365     <item>basicstyle=\\footnotesize\\ttfamily,
 366
 367     <item>numberstyle=\\tiny,
 368
 369     <item>language=C,
 370
 371     <item>columns=fullflexible,
 372
 373     <item>numberfirstline=true
 374
 375     <item>}
 376   </nf-chunk|tex|>
 377
 378   \;
 379
 380   <section|Configuring the build script>
 381
 382   You can invoke code extraction and building from the <LyX> menu option
 383   Document-\<gtr\>Build Program.
 384
 385   First, make sure you don't have a conversion defined for Lyx-\<gtr\>Program
 386
 387   From the menu Tools-\<gtr\>Preferences, add a conversion from
 388   Latex(Plain)-\<gtr\>Program as:
 389
 390   <\verbatim>
 391     set -x ; fangle -Rlyx-build $$i \|\
 392
 393     \ \ env LYX_b=$$b LYX_i=$$i LYX_o=$$o LYX_p=$$p LYX_r=$$r bash
 394   </verbatim>
 395
 396   (But don't cut-n-paste it from this document or you may be pasting a
 397   multi-line string which will break your lyx preferences file).\
 398
 399   I hope that one day, <LyX> will set these into the environment when calling
 400   the build script.
 401
 402   You may also want to consider adding options to this conversion...
 403
 404   <verbatim|parselog=/usr/share/lyx/scripts/listerrors>
 405
 406   ...but if you do you will lose your stderr<\footnote>
 407     There is some bash plumbing to get a copy of stderr but this footnote is
 408     too small
 409   </footnote>.
 410
 411   Now, a shell script chunk called <filename|lyx-build> will be extracted and
 412   run whenever you choose the Document-\<gtr\>Build Program menu item.
 413
 414   This document was originally managed using <LyX>, and the lyx-build script
 415   for this document is shown here for historical reference.\
 416
 417   <\verbatim>
 418     lyx -e latex fangle.lyx && \\
 419
 420     \ \ fangle fangle.lyx \<gtr\> ./autoboot
 421   </verbatim>
 422
 423   This looks simple enough, but as mentioned, fangle has to be had from
 424   somewhere before it can be extracted.
 425
 426   <subsection|...>
 427
 428   When the lyx-build chunk is executed, the current directory will be a
 429   temporary directory, and <verbatim|LYX_SOURCE> will refer to the tex file
 430   in this temporary directory. This is unfortunate as our makefile wants to
 431   run from the project directory where the Lyx file is kept.
 432
 433   We can extract the project directory from <verbatim|$$r>, and derive the
 434   probable Lyx filename from the noweb file that Lyx generated.
 435
 436   <\nf-chunk|lyx-build-helper>
 437     <item>PROJECT_DIR="$LYX_r"
 438
 439     <item>LYX_SRC="$PROJECT_DIR/${LYX_i%.tex}.lyx"
 440
 441     <item>TEX_DIR="$LYX_p"
 442
 443     <item>TEX_SRC="$TEX_DIR/$LYX_i"
 444   </nf-chunk|sh|>
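
       As a quick check of the helper above, the following sketch shows how the
       suffix-stripping expansion derives the probable <LyX> filename. The
       values given to LYX_r, LYX_p and LYX_i are made-up examples chosen for
       illustration; what <LyX> actually passes may differ.

```shell
#!/bin/sh
# Hypothetical values standing in for what the converter command passes.
LYX_r="/home/user/project"      # assumed: directory holding the .lyx file
LYX_p="/tmp/lyx_tmpdir.XXXX"    # assumed: temporary working directory
LYX_i="fangle.tex"              # assumed: name of the exported tex file

PROJECT_DIR="$LYX_r"
# ${LYX_i%.tex} strips a trailing ".tex"; ".lyx" is then appended.
LYX_SRC="$PROJECT_DIR/${LYX_i%.tex}.lyx"
TEX_DIR="$LYX_p"
TEX_SRC="$TEX_DIR/$LYX_i"

echo "$LYX_SRC"
echo "$TEX_SRC"
```

       With these example values the two lines printed are the path of
       fangle.lyx in the project directory and the path of fangle.tex in the
       temporary directory.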
 445
 446   And then we can define a lyx-build fragment similar to the autoboot
 447   fragment
 448
 449   <\nf-chunk|lyx-build>
 450     <item>#! /bin/sh
 451
 452     <item>=\<less\>\\chunkref{lyx-build-helper}\<gtr\>
 453
 454     <item>cd $PROJECT_DIR \|\| exit 1
 455
 456     <item>
 457
 458     <item>#/usr/bin/fangle -filter ./notanglefix-filter \\
 459
 460     <item># \ -R./Makefile.inc "../../noweb-lyx/noweb-lyx3.lyx" \\
 461
 462     <item># \ \| sed '/NOWEB_SOURCE=/s/=.*/=samba4-dfs.lyx/' \\
 463
 464     <item># \ \<gtr\> ./Makefile.inc
 465
 466     <item>#
 467
 468     <item>#make -f ./Makefile.inc fangle_sources
 469   </nf-chunk|sh|>
 470
 471   \;
 472
 473   <chapter|Using Fangle with <TeXmacs>>
 474
 475   <todo|Write this chapter>
 476
 477   <chapter|Fangle with Makefiles><label|makefile.inc>
 478
 479   Here we describe a <filename|Makefile.inc> that you can include in your own
 480   Makefiles, or glue as a recursive make to other projects.
 481
 482   <filename|Makefile.inc> will cope with extracting all the other source
 483   files from this or any specified literate document and keeping them up to
 484   date.\
 485
 486   It may also be included by a <verbatim|Makefile> or <verbatim|Makefile.am>
 487   defined in a literate document to automatically deal with the extraction of
 488   source files and documents during normal builds.
 489
 490   Thus, if <verbatim|Makefile.inc> is included into a main project makefile
 491   it adds rules for the source files, capable of extracting the source files
 492   from the literate document.
 493
 494   <section|A word about makefile formats>
 495
 496   Whitespace formatting is very important in a Makefile. The first character
 497   of each action line must be a TAB.\
 498
 499   <\verbatim>
 500     target: pre-requisite
 501
 502     <nf-tab>action
 503
 504     <nf-tab>action
 505   </verbatim>
 506
 507   This requires that the literate programming environment have the ability to
 508   represent a TAB character in a way that fangle will generate an actual TAB
 509   character.
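
       One way to confirm that an extracted makefile really contains TAB
       characters is to count the lines that start with one. The sketch below
       is only an illustration: it writes the small example above by hand
       (rather than extracting it with fangle) into a temporary directory and
       counts the TAB-led lines.

```shell
#!/bin/sh
# Write a tiny makefile whose action lines start with a real TAB (\t).
dir=$(mktemp -d) || exit 1
printf 'target: pre-requisite\n\taction\n\taction\n' > "$dir/Makefile"

# Count the lines that begin with a TAB character.
tab=$(printf '\t')
tab_lines=$(grep -c "^$tab" "$dir/Makefile")
echo "$tab_lines"

rm -rf -- "$dir"
```

       If the count is lower than the number of action lines, the TABs were
       probably turned into spaces somewhere along the way.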
 510
 511   We also adopt a convention that code chunks whose names begin with
 512   <verbatim|./> should always be automatically extracted from the document.
 513   Code chunks whose names do not begin with <verbatim|./> are for internal
 514   reference. Such chunks may be extracted directly, but will not be
 515   automatically extracted by this Makefile.
 516
 517   <section|Extracting Sources>
 518
 519   Our makefile has two parts; variables must be defined before the targets
 520   that use them.
 521
 522   As we progress through this chapter, explaining concepts, we will be adding
 523   lines to <nf-ref|Makefile.inc-vars|> and <nf-ref|Makefile.inc-targets|>
 524   which are included in <nf-ref|./Makefile.inc|> below.
 525
 526   <\nf-chunk|./Makefile.inc>
 527     <item><nf-ref|Makefile.inc-vars|>
 528
 529     <item><nf-ref|Makefile.inc-targets|>
 530   </nf-chunk|make|>
 531
 532   We first define a placeholder for <verbatim|LITERATE_SOURCE> to hold the
 533   name of this document. This will normally be passed on the command line.
 534
 535   <\nf-chunk|Makefile.inc-vars>
 536     <item>LITERATE_SOURCE=
 537   </nf-chunk||>
 538
 539   Fangle cannot process <LyX> or <TeXmacs> documents directly, so the first
 540   stage is to convert these to more suitable text based formats<\footnote>
 541     <LyX> and <TeXmacs> formats are text-based, but not suitable for fangle
 542   </footnote>.
 543
 544   <subsection|Converting from <LyX> to <LaTeX>><label|Converting-from-Lyx>
 545
 546   The first stage will always be to convert the <LyX> file to a <LaTeX> file.
 547   Fangle must run on a <TeX> file because the <LyX> command
 548   <verbatim|server-goto-file-line><\footnote>
 549     The Lyx command <verbatim|server-goto-file-line> is used to position the
 550     Lyx cursor at the compiler errors.
 551   </footnote> requires that the line number provided be a line of the <TeX>
 552   file and always maps this to the line in the <LyX> document. We use
 553   <verbatim|server-goto-file-line> when moving the cursor to error lines
 554   during compile failures.
 555
 556   The command <verbatim|lyx -e latex fangle.lyx> will produce
 557   <verbatim|fangle.tex>, a <TeX> file; so we define a make target to be the
 558   same as the <LyX> file but with the <verbatim|.tex> extension.
 559
 560   The <verbatim|EXTRA_DIST> is for automake support so that the <TeX> files
 561   will automatically be distributed with the source, to help those who don't
 562   have <LyX> installed.
 563
 564   <\nf-chunk|Makefile.inc-vars>
 565     <item>TEX_SOURCE=$(LYX_SOURCE:.lyx=.tex)
 566
 567     <item>EXTRA_DIST+=$(TEX_SOURCE)
 568   </nf-chunk||>
 569
 570   We then specify that the <TeX> source is to be generated from the <LyX>
 571   source.
 572
 573   <\nf-chunk|Makefile.inc-targets>
 574     <item>$(TEX_SOURCE): $(LYX_SOURCE)
 575
 576     <item><nf-tab>lyx -e latex $\<less\>
 577
 578     <item>clean_tex:
 579
 580     <item><nf-tab>rm -f -- $(TEX_SOURCE)
 581
 582     <item>clean: clean_tex
 583   </nf-chunk||>
 584
 585   <subsection|Converting from <TeXmacs>><label|Converting-from-TeXmacs>
 586
 587   Fangle cannot process <TeXmacs> files directly<\footnote>
 588     but this is planned when <TeXmacs> uses xml as its native format
 589   </footnote>, but must first convert them to text files.
 590
 591   The command <verbatim|texmacs -c fangle.tm fangle.txt -q> will produce
 592   <verbatim|fangle.txt>, a text file; so we define a make target to be the
 593   same as the <TeXmacs> file but with the <verbatim|.txt> extension.
 594
 595   The <verbatim|EXTRA_DIST> is for automake support so that the text files
 596   will automatically be distributed with the source, to help those who don't
 597   have <TeXmacs> installed.
 598
 599   <\nf-chunk|Makefile.inc-vars>
 600     <item>TXT_SOURCE=$(LITERATE_SOURCE:.tm=.txt)
 601
 602     <item>EXTRA_DIST+=$(TXT_SOURCE)
 603   </nf-chunk||>
 604
 605   <todo|Add loop around each $\<less\> so multiple targets can be specified>
 606
 607   <\nf-chunk|Makefile.inc-targets>
 608     <item>$(TXT_SOURCE): $(LITERATE_SOURCE)
 609
 610     <item><nf-tab>texmacs -c $\<less\> $(TXT_SOURCE) -q
 611
 612     <item>clean_txt:
 613
 614     <item><nf-tab>rm -f -- $(TXT_SOURCE)
 615
 616     <item>clean: clean_txt
 617   </nf-chunk||>
 618
 619   <section|Extracting Program Source>
 620
 621   The program source is extracted using fangle, which is designed to operate
 622   on text or <LaTeX> documents<\footnote>
 623     <LaTeX> documents are just slightly special text documents
 624   </footnote>.
 625
 626   <\nf-chunk|Makefile.inc-vars>
 627     <item>FANGLE_SOURCE=$(TEX_SOURCE) $(TXT_SOURCE)
 628   </nf-chunk||>
 629
 630   The literate document can result in any number of source files, but not all
 631   of these will be changed each time the document is updated. We certainly
 632   don't want to update the timestamps of these files and cause the whole
 633   source tree to be recompiled just because the literate explanation was
 634   revised. We use <verbatim|cpif> from the <em|Noweb> tools to avoid updating
 635   the file if the content has not changed, but should probably write our own.
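
       As a rough idea of what such a home-grown replacement might look like,
       here is a minimal sketch of a shell function that copies its standard
       input to a file only when the content differs. The name
       update_if_changed is invented for this illustration; this is not
       noweb's cpif and it is not part of fangle.

```shell
#!/bin/sh
# update_if_changed FILE: copy stdin to FILE only when the content differs.
# A sketch of cpif-like behaviour; the function name is hypothetical.
update_if_changed() {
    target=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"                      # capture stdin so it can be compared
    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        rm -f -- "$tmp"               # unchanged: leave file and timestamp alone
    else
        cat "$tmp" > "$target" && rm -f -- "$tmp"   # new or changed: rewrite
    fi
}
```

       It would be used at the end of a pipeline, for example
       <verbatim|fangle -R./Makefile.inc fangle.tex \| update_if_changed
       ./Makefile.inc>, so that make only sees a new timestamp when the
       extracted text really changed.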
 636
 637   However, if a source file is not updated, then the fangle file will always
 638   have a newer time-stamp and the makefile would always re-attempt to extract
 639   a newer source file which would be a waste of time.
 640
 641   Because of this, we use a stamp file which is always updated each time the
 642   sources are fully extracted from the <LaTeX> document. If the stamp file is
 643   newer than the document, then we can avoid an attempt to re-extract any of
 644   the sources. Because this stamp file is only updated when extraction is
 645   complete, it is safe for the user to interrupt the build-process
 646   mid-extraction.
 647
 648   We use <verbatim|echo> rather than <verbatim|touch> to update the stamp
 649   file because the <verbatim|touch> command does not work very well over an
 650   <verbatim|sshfs> mount that I was using.
 651
 652   <\nf-chunk|Makefile.inc-vars>
 653     <item>FANGLE_SOURCE_STAMP=$(FANGLE_SOURCE).stamp
 654   </nf-chunk||>
 655
 656   <\nf-chunk|Makefile.inc-targets>
 657     <item>$(FANGLE_SOURCE_STAMP): $(FANGLE_SOURCE) \\
 658
 659     <item><nf-tab> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ $(FANGLE_SOURCES) ; \\
 660
 661     <item><nf-tab>echo -n \<gtr\> $(FANGLE_SOURCE_STAMP)
 662
 663     <item>clean_stamp:
 664
 665     <item><nf-tab>rm -f $(FANGLE_SOURCE_STAMP)
 666
 667     <item>clean: clean_stamp
 668   </nf-chunk||>
 669
 670   <section|Extracting Source Files>
 671
 672   We compute <verbatim|FANGLE_SOURCES> to hold the names of all the source
 673   files defined in the document. We compute this only once, by means of
 674   <verbatim|:=> in the assignment. The sed command deletes any
 675   <verbatim|\<less\>\<less\>> and <verbatim|\<gtr\>\<gtr\>> which may
 676   surround the root names (for compatibility with Noweb's noroots command).
 677
 678   As we use chunk names beginning with <filename|./> to denote top level
 679   fragments that should be extracted, we filter out all fragments that do not
 680   begin with <filename|./>
 681
 682   <\note>
 683     <verbatim|FANGLE_PREFIX> is set to <verbatim|./> by default, but whatever
 684     it may be overridden to, the prefix is replaced by a literal
 685     <verbatim|./> before extraction so that files will be extracted in the
 686     current directory whatever the prefix. This helps namespace or
 687     sub-project prefixes like <verbatim|documents:> for chunks like
 688     <verbatim|documents:docbook/intro.xml>
 689   </note>
 690
 691   <todo|This doesn't work though, because it loses the full name and doesn't
 692   know what to extract!>
 693
 694   <\nf-chunk|Makefile.inc-vars>
 695     <item>FANGLE_PREFIX:=\\.\\/
 696
 697     <item>FANGLE_SOURCES:=$(shell \\
 698
 699     <item> \ fangle -r $(FANGLE_SOURCE) \|\\
 700
 701     <item> \ sed -e 's/^[\<less\>][\<less\>]//;s/[\<gtr\>][\<gtr\>]$$//;/^$(FANGLE_PREFIX)/!d'
 702     \\
 703
 704     <item> \ \ \ \ \ -e 's/^$(FANGLE_PREFIX)/\\.\\//' )
 705   </nf-chunk||>
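
       The effect of the sed filter can be tried outside of make. The sketch
       below writes the filter as the shell would see it with the prefix left
       at its default (make's doubled dollar sign becomes a single one), and
       feeds it a few made-up root names instead of real output from the
       fangle roots listing. The function name filter_roots is invented for
       this illustration.

```shell
#!/bin/sh
# filter_roots: the sed filter from FANGLE_SOURCES as the shell receives it,
# with FANGLE_PREFIX at its default and make's $$ written as $.
filter_roots() {
    sed -e 's/^[<][<]//;s/[>][>]$//;/^\.\//!d' \
        -e 's/^\.\//.\//'
}

# Made-up root names, in both plain and noweb-style bracketed form.
printf '%s\n' './Makefile.inc' '<<./fangle>>' 'lyx-build' '<<gpl3-copyright>>' |
    filter_roots
# prints:
#   ./Makefile.inc
#   ./fangle
```

       Only the names beginning with the prefix survive, and any surrounding
       angle brackets are removed before the prefix test is applied.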
 706
 707   The target below, <verbatim|echo_fangle_sources> is a helpful debugging
 708   target and shows the names of the files that would be extracted.
 709
 710   <\nf-chunk|Makefile.inc-targets>
 711     <item>.PHONY: echo_fangle_sources
 712
 713     <item>echo_fangle_sources: ; @echo $(FANGLE_SOURCES)
 714   </nf-chunk||>
 715
 716   We define a convenient target called <verbatim|fangle_sources> so that
 717   <verbatim|make fangle_sources> will re-extract the source if the
 718   literate document has been updated.\
 719
 720   <\nf-chunk|Makefile.inc-targets>
 721     <item>.PHONY: fangle_sources
 722
 723     <item>fangle_sources: $(FANGLE_SOURCE_STAMP)
 724   </nf-chunk||>
 725
 726   And also a convenient target to remove extracted sources.
 727
 728   <\nf-chunk|Makefile.inc-targets>
 729     <item>.PHONY: clean_fangle_sources
 730
 731     <item>clean_fangle_sources: ; \\
 732
 733     <item> \ \ \ \ \ \ \ rm -f -- $(FANGLE_SOURCE_STAMP) $(FANGLE_SOURCES)
 734   </nf-chunk||>
 735
 736   We now look at the extraction of the source files.
 737
 738   This makefile macro <verbatim|if_extension> takes 4 arguments: the filename
 739   <verbatim|$(1)>, some extensions to match <verbatim|$(2)>, a shell
 740   command to return if the filename does match the extensions <verbatim|$(3)>,
 741   and a shell command to return if it does not match the extensions
 742   <verbatim|$(4)>.
 743
 744   <\nf-chunk|Makefile.inc-vars>
 745     <item>if_extension=$(if $(findstring $(suffix $(1)),$(2)),$(3),$(4))
 746   </nf-chunk||>
 747
 748   For some source files like C files, we want to output the line number and
 749   filename of the original <LaTeX> document from which the source
 750   came<\footnote>
 751     I plan to replace this option with a separate mapping file so as not to
 752     pollute the generated source, and also to allow a code pretty-printing
 753     reformatter like <verbatim|indent> be able to re-format the file and
 754     adjust for changes through comparing the character streams.
 755   </footnote>.
 756
 757   To make this easier we define the file extensions for which we want to do
 758   this.
 759
 760   <\nf-chunk|Makefile.inc-vars>
 761     <item>C_EXTENSIONS=.c .h
 762   </nf-chunk||>
 763
 764   We can then use the <verbatim|if_extension> macro to define a macro which
 765   expands out to the <verbatim|-L> option if fangle is being invoked on a C
 766   source file, so that C compile errors will refer to the line number in the
 767   <TeX> document.\
 768
 769   <\nf-chunk|Makefile.inc-vars>
 770     <item>TABS=8
 771
 772     <item>nf_line=-L -T$(TABS)
 773
 774     <item>fangle=fangle $(call if_extension,$(2),$(C_EXTENSIONS),$(nf_line))
 775     -R"$(2)" $(1)
 776   </nf-chunk||>
 777
 778   We can use a similar trick to define an indent macro which takes just the
 779   filename as an argument and can return a pipeline stage calling the indent
 780   command. Indent can be turned off with <verbatim|make fangle_sources
 781   indent=>
 782
 783   <\nf-chunk|Makefile.inc-vars>
 784     <item>indent_options=-npro -kr -i8 -ts8 -sob -l80 -ss -ncs
 785
 786     <item>indent=$(call if_extension,$(1),$(C_EXTENSIONS), \| indent
 787     $(indent_options))
 788   </nf-chunk||>
 789
 790   We now define the pattern for extracting a file. The files are written
 791   using noweb's <verbatim|cpif> so that the file timestamp will not be
 792   touched if the contents haven't changed. This avoids the need to rebuild
 793   the entire project because of a typographical change in the documentation,
 794   or when only a few of the C source files (or none) have changed.
 795
 796   <\nf-chunk|Makefile.inc-vars>
 797     <item>fangle_extract=@mkdir -p $(dir $(1)) && \\
 798
 799     <item> \ $(call fangle,$(2),$(1)) \<gtr\> "$(1).tmp" && \\
 800
 801     <item> \ cat "$(1).tmp" $(indent) \| cpif "$(1)" \\
 802
 803     <item> \ && rm -- "$(1).tmp" \|\| \\
 804
 805     <item> \ (echo error newfangling $(1) from $(2) ; exit 1)
 806   </nf-chunk||>
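
The effect that <verbatim|cpif> has on timestamps can be sketched in a few lines of Python. This only illustrates the idea of writing a file when its contents differ; it is not noweb's implementation, and the name <verbatim|write_if_changed> is made up for the example.

```python
import os


def write_if_changed(path, data):
    """Replace the file at path with data only when the contents differ.

    Returns True if the file was written and False if it was left alone.
    """
    try:
        with open(path, "rb") as existing:
            if existing.read() == data:
                return False  # identical contents: leave the timestamp alone
    except FileNotFoundError:
        pass  # no previous version, so fall through and write it
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)  # plays the role of mkdir -p
    with open(path, "wb") as out:
        out.write(data)
    return True
```

Because an unchanged file keeps its old modification time, make sees nothing newer than the objects built from it and skips the rebuild.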
 807
 808   We define a target which will extract or update all sources. To do this we
 809   first define a makefile template that can do this for any source file in
 810   the <LaTeX> document.
 811
 812   <\nf-chunk|Makefile.inc-vars>
 813     <item>define FANGLE_template
 814
 815     <item> \ $(1): $(2)
 816
 817     <item><nf-tab>$$(call fangle_extract,$(1),$(2))
 818
 819     <item> \ FANGLE_TARGETS+=$(1)
 820
 821     <item>endef
 822   </nf-chunk||>
 823
 824   We then enumerate the discovered <verbatim|FANGLE_SOURCES> to generate a
 825   makefile rule for each one using the makefile template we defined above.
 826
 827   <\nf-chunk|Makefile.inc-targets>
 828     <item>$(foreach source,$(FANGLE_SOURCES),\\
 829
 830     <item> \ $(eval $(call FANGLE_template,$(source),$(FANGLE_SOURCE))) \\
 831
 832     <item>)
 833   </nf-chunk||>
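
As a rough illustration of what the <verbatim|foreach> and <verbatim|eval> loop hands to make, this Python sketch builds the rule text that the template expands to for each target. It is not part of fangle; the function name <verbatim|fangle_rules> is hypothetical. Note that the doubled <verbatim|$$> in the template becomes a single <verbatim|$> after the first expansion, so the recipe still contains a deferred <verbatim|$(call ...)>.

```python
def fangle_rules(targets, literate_source):
    """Return the makefile text generated by the template for each target."""
    rules = []
    for target in targets:
        rules.append(
            f"{target}: {literate_source}\n"
            f"\t$(call fangle_extract,{target},{literate_source})\n"
            f"FANGLE_TARGETS+={target}\n"
        )
    return "".join(rules)
```

Each target therefore gets its own rule depending on the literate source, and its name is accumulated in <verbatim|FANGLE_TARGETS>.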
 834
 835   These will all be built with <verbatim|FANGLE_SOURCE_STAMP>.
 836
 837   We also remove the generated sources on a make distclean.
 838
 839   <\nf-chunk|Makefile.inc-targets>
 840     <item>_distclean: clean_fangle_sources
 841   </nf-chunk||>
 842
 843   <section|Extracting Documentation>
 844
 845   We then identify the intermediate stages of the documentation and their
 846   build and clean targets.
 847
 848   <subsection|Formatting <TeX>>
 849
 850   <subsubsection|Running pdflatex>
 851
 852   We produce a pdf file from the tex file.
 853
 854   <\nf-chunk|Makefile.inc-vars>
 855     <item>FANGLE_PDF=$(TEX_SOURCE:.tex=.pdf)
 856   </nf-chunk||>
 857
 858   We run pdflatex twice to be sure that the contents and aux files are up to
 859   date. We certainly are <em|required> to run pdflatex at least twice if
 860   these files do not exist.
 861
 862   <\nf-chunk|Makefile.inc-targets>
 863     <item>$(FANGLE_PDF): $(TEX_SOURCE)
 864
 865     <item><nf-tab>pdflatex $\<less\> && pdflatex $\<less\>
 866
 867     <item>
 868
 869     <item>clean_pdf:
 870
 871     <item><nf-tab>rm -f -- $(FANGLE_PDF) $(TEX_SOURCE:.tex=.toc) \\
 872
 873     <item><nf-tab> \ $(TEX_SOURCE:.tex=.log) $(TEX_SOURCE:.tex=.aux)
 874   </nf-chunk||>
 875
 876   <subsection|Formatting <TeXmacs>>
 877
 878   <TeXmacs> can produce a PDF file directly.
 879
 880   <\nf-chunk|Makefile.inc-vars>
 881     <item>FANGLE_PDF=$(TEXMACS_SOURCE:.tm=.pdf)
 882   </nf-chunk||>
 883
 884   <\todo>
 885     Outputting the PDF may not be enough to update the links and page
 886     references. I think we need to update twice, generate a PDF, update
 887     twice more and generate a new PDF.
 890
 891     Basically the PDF export of <TeXmacs> is pretty rotten and doesn't work
 892     properly from the CLI.
 893   </todo>
 894
 895   <\nf-chunk|Makefile.inc-targets>
 896     <item>$(FANGLE_PDF): $(TEXMACS_SOURCE)
 897
 898     <item><nf-tab>texmacs -c $(TEXMACS_SOURCE) $\<less\> -q
 899
 900     <item>
 901
 902     <item>clean_pdf:
 903
 904     <item><nf-tab>rm -f -- $(FANGLE_PDF)
 905   </nf-chunk||>
 906
 907   <subsection|Building the Documentation as a Whole>
 908
 909   Currently we only build pdf as a final format, but <verbatim|FANGLE_DOCS>
 910   may later hold other output formats.
 911
 912   <\nf-chunk|Makefile.inc-vars>
 913     <item>FANGLE_DOCS=$(FANGLE_PDF)
 914   </nf-chunk||>
 915
 916   We also define <verbatim|fangle_docs> as a convenient phony target.
 917
 918   <\nf-chunk|Makefile.inc-targets>
 919     <item>.PHONY: fangle_docs
 920
 921     <item>fangle_docs: $(FANGLE_DOCS)
 922
 923     <item>docs: fangle_docs
 924   </nf-chunk||>
 925
 926   We also define a convenient <verbatim|clean_fangle_docs> which we add to
 927   the regular clean target.
 928
 929   <\nf-chunk|Makefile.inc-targets>
 930     <item>.PHONY: clean_fangle_docs
 931
 932     <item>clean_fangle_docs: clean_tex clean_pdf
 933
 934     <item>clean: clean_fangle_docs
 935
 936     <item>
 937
 938     <item>distclean_fangle_docs: clean_tex clean_fangle_docs
 939
 940     <item>distclean: clean distclean_fangle_docs
 941   </nf-chunk||>
 942
 943   <section|Other helpers>
 944
 945   If <filename|Makefile.inc> is included into <filename|Makefile>, then
 946   extracted files can be updated with this command:
 947
 948   <verbatim|make fangle_sources>
 949
 950   otherwise, with:
 951
 952   <verbatim|make -f Makefile.inc fangle_sources>
 953
 954   <section|Boot-strapping the extraction>
 955
 956   As well as having the makefile extract or update the source files as part
 957   of its operation, it also seems convenient to have the makefile
 958   re-extract itself from <em|this> document.
 959
 960   It would also be convenient for the code that extracts the makefile from
 961   this document to be part of this document too; however, we have to
 962   start somewhere and this unfortunately requires us to type at least a few
 963   words by hand to start things off.
 964
 965   Therefore we will have a minimal root fragment, which, when extracted, can
 966   cope with extracting the rest of the source. This shell script fragment can
 967   do that. Its name is <verbatim|*>, out of regard for <name|Noweb>,
 968   but when extracted might better be called <verbatim|autoupdate>.
 969
 970   <todo|De-lyxify>
 971
 972   <\nf-chunk|*>
 973     <item>#! /bin/sh
 974
 975     <item>
 976
 977     <item>MAKE_SRC="${1:-${NW_LYX:-../../noweb-lyx/noweb-lyx3.lyx}}"
 978
 979     <item>MAKE_SRC=`dirname "$MAKE_SRC"`/`basename "$MAKE_SRC" .lyx`
 980
 981     <item>NOWEB_SRC="${2:-${NOWEB_SRC:-$MAKE_SRC.lyx}}"
 982
 983     <item>lyx -e latex $MAKE_SRC
 984
 985     <item>
 986
 987     <item>fangle -R./Makefile.inc ${MAKE_SRC}.tex \\
 988
 989     <item> \ \| sed "/FANGLE_SOURCE=/s/^/#/;T;aNOWEB_SOURCE=$FANGLE_SRC" \\
 990
 991     <item> \ \| cpif ./Makefile.inc
 992
 993     <item>
 994
 995     <item>make -f ./Makefile.inc fangle_sources
 996   </nf-chunk|sh|>
 997
 998   The general Makefile can be invoked with <filename|./autoboot> and can also
 999   be included into any automake file to automatically re-generate the source
1000   files.
1001
1002   The <em|autoboot> can be extracted with this command:
1003
1004   <\verbatim>
1005     lyx -e latex fangle.lyx && \\
1006
1007     \ \ fangle fangle.tex \<gtr\> ./autoboot
1008   </verbatim>
1009
1010   This looks simple enough, but as mentioned, fangle has to be had from
1011   somewhere before it can be extracted.
1012
1013   On a unix system this will extract <filename|fangle.module> and the
1014   <filename|fangle> awk script, and run some basic tests.\
1015
1016   <todo|cross-ref to test chapter when it is a chapter all on its own>
1017
1018   <section|Incorporating Makefile.inc into existing projects>
1019
1020   If you are writing a literate module of an existing non-literate program
1021   you may find it easier to use a small recursive make instead of directly
1022   including <verbatim|Makefile.inc> in the project's makefile.
1023
1024   This way there is less chance of definitions in <verbatim|Makefile.inc>
1025   interfering with definitions in the main makefile, or with definitions in
1026   other <verbatim|Makefile.inc> from other literate modules of the same
1027   project.
1028
1029   To do this we add some <em|glue> to the project makefile that invokes
1030   Makefile.inc in the right way. The glue works by adding a <verbatim|.PHONY>
1031   target to call the recursive make, and adding this target as an additional
1032   pre-requisite to the existing targets.
1033
1034   <paragraph|Example>Sub-module of existing system
1035
1036   In this example, we are building <verbatim|module.so> as a literate module
1037   of a larger project.
1038
1039   We will show the sort of glue that can be inserted into the project's
1040   Makefile or, more likely, into a regular Makefile included in or invoked
1041   by the project's Makefile.
1042
1043   <\nf-chunk|makefile-glue>
1044     <item>module_srcdir=modules/module
1045
1046     <item>MODULE_SOURCE=module.tm
1047
1048     <item>MODULE_STAMP=$(MODULE_SOURCE).stamp
1049   </nf-chunk||>
1050
1051   The existing build system may already have a build target for
1052   <filename|module.o>, but we just add another pre-requisite to that. In this
1053   case we use <filename|module.tm.stamp> as a pre-requisite, the stamp file's
1054   modified time indicating when all sources were extracted<\footnote>
1055     If the project's build system does not know how to build the module from
1056     the extracted sources, then just add build actions here as normal.
1057   </footnote>.
1058
1059   <\nf-chunk|makefile-glue>
1060     <item>$(module_srcdir)/module.o: $(module_srcdir)/$(MODULE_STAMP)
1061   </nf-chunk|make|>
1062
1063   The target for this new pre-requisite will be generated by a recursive make
1064   using <filename|Makefile.inc> which will make sure that the source is up to
1065   date, before it is built by the main project's makefile.
1066
1067   <\nf-chunk|makefile-glue>
1068     <item>$(module_srcdir)/$(MODULE_STAMP): $(module_srcdir)/$(MODULE_SOURCE)
1069
1070     <item><nf-tab>$(MAKE) -C $(module_srcdir) -f Makefile.inc fangle_sources
1071     LITERATE_SOURCE=$(MODULE_SOURCE)
1072   </nf-chunk||>
1073
1074   We can do similar glue for the docs, clean and distclean targets. In this
1075   example the main project was using a double colon for these targets, so we
1076   must use the same in our glue.
1077
1078   <\nf-chunk|makefile-glue>
1079     <item>docs:: docs_module
1080
1081     <item>.PHONY: docs_module
1082
1083     <item>docs_module:
1084
1085     <item><nf-tab>$(MAKE) -C $(module_srcdir) -f Makefile.inc docs
1086     LITERATE_SOURCE=$(MODULE_SOURCE)
1087
1088     <item>
1089
1090     <item>clean:: clean_module
1091
1092     <item>.PHONY: clean_module
1093
1094     <item>clean_module:
1095
1096     <item><nf-tab>$(MAKE) -C $(module_srcdir) -f Makefile.inc clean
1097     LITERATE_SOURCE=$(MODULE_SOURCE)
1098
1099     <item>
1100
1101     <item>distclean:: distclean_module
1102
1103     <item>.PHONY: distclean_module
1104
1105     <item>distclean_module:
1106
1107     <item><nf-tab>$(MAKE) -C $(module_srcdir) -f Makefile.inc distclean
1108     LITERATE_SOURCE=$(MODULE_SOURCE)
1109   </nf-chunk||>
1110
1111   We could do similarly for install targets to install the generated docs.
1112
1113   <part|Source Code>
1114
1115   <chapter|Fangle awk source code>
1116
1117   We use the copyright notice from chapter <reference|License>.
1118
1119   <\nf-chunk|./fangle>
1120     <item>#! /usr/bin/awk -f
1121
1122     <item># <nf-ref|gpl3-copyright|>
1123   </nf-chunk|awk|>
1124
1125   We also use code from <person|Arnold Robbins>' public domain getopt (1993
1126   revision) defined in <reference|getopt>, and naturally want to attribute
1127   this appropriately.
1128
1129   <\nf-chunk|./fangle>
1130     <item># NOTE: Arnold Robbins public domain getopt for awk is also used:
1131
1132     <item><nf-ref|getopt.awk-header|>
1133
1134     <item><nf-ref|getopt.awk-getopt()|>
1135
1136     <item>
1137   </nf-chunk||>
1138
1139   And include the following chunks (which are explained further on) to make
1140   up the program:
1141
1142   <\nf-chunk|./fangle>
1143     <item><nf-ref|helper-functions|>
1144
1145     <item><nf-ref|mode-tracker|>
1146
1147     <item><nf-ref|parse_chunk_args|>
1148
1149     <item><nf-ref|chunk-storage-functions|>
1150
1151     <item><nf-ref|output_chunk_names()|>
1152
1153     <item><nf-ref|output_chunks()|>
1154
1155     <item><nf-ref|write_chunk()|>
1156
1157     <item><nf-ref|expand_chunk_args()|>
1158
1159     <item>
1160
1161     <item><nf-ref|begin|>
1162
1163     <item><nf-ref|recognize-chunk|>
1164
1165     <item><nf-ref|end|>
1166   </nf-chunk||>
1167
1168   <section|AWK tricks>
1169
1170   The portable way to erase an array in awk is to split the empty string, so
1171   we define a fangle macro that can erase an array, like this:
1172
1173   <\nf-chunk|awk-delete-array>
1174     <item>split("", <nf-arg|ARRAY>);
1175   </nf-chunk|awk|<tuple|ARRAY>>
1176
1177   For debugging it is sometimes convenient to be able to dump the contents of
1178   an array to <verbatim|stderr>, and so this macro is also useful.
1179
1180   <\nf-chunk|dump-array>
1181     <item>print "\\nDump: <nf-arg|ARRAY>\\n--------\\n" \<gtr\>
1182     "/dev/stderr";
1183
1184     <item>for (_x in <nf-arg|ARRAY>) {
1185
1186     <item> \ print _x "=" <nf-arg|ARRAY>[_x] "\\n" \<gtr\> "/dev/stderr";
1187
1188     <item>}
1189
1190     <item>print "========\\n" \<gtr\> "/dev/stderr";
1191   </nf-chunk|awk|<tuple|ARRAY>>
1192
1193   <section|Catching errors>
1194
1195   Fatal errors are issued with the error function:
1196
1197   <\nf-chunk|error()>
1198     <item>function error(message)
1199
1200     <item>{
1201
1202     <item> \ print "ERROR: " FILENAME ":" FNR " " message \<gtr\>
1203     "/dev/stderr";
1204
1205     <item> \ exit 1;
1206
1207     <item>}
1208   </nf-chunk|awk|>
1209
1210   and likewise for non-fatal warnings:
1211
1212   <\nf-chunk|error()>
1213     <item>function warning(message)
1214
1215     <item>{
1216
1217     <item> \ print "WARNING: " FILENAME ":" FNR " " message \<gtr\>
1218     "/dev/stderr";
1219
1220     <item> \ warnings++;
1221
1222     <item>}
1223   </nf-chunk|awk|>
1224
1225   and debug output too:
1226
1227   <\nf-chunk|error()>
1228     <item>function debug_log(message)
1229
1230     <item>{
1231
1232     <item> \ print "DEBUG: " FILENAME ":" FNR " " message \<gtr\>
1233     "/dev/stderr";
1234
1235     <item>}
1236   </nf-chunk|awk|>
1237
1238   <todo|append=helper-functions>
1239
1240   <\nf-chunk|helper-functions>
1241     <item><nf-ref|error()|>
1242   </nf-chunk||>
1243
1244   <chapter|<LaTeX> and lstlistings>
1245
1246   <todo|Split LyX and TeXmacs parts>
1247
1248   For <LyX> and <LaTeX>, the <verbatim|lstlistings> package is used to format
1249   the lines of code chunks. You may recall from chapter XXX that arguments to
1250   a chunk definition are pure <LaTeX> code. This means that fangle needs to
1251   be able to parse <LaTeX> a little.
1252
1253   <LaTeX> arguments to <verbatim|lstlistings> macros are a comma-separated
1254   list of key-value pairs, and values containing commas are enclosed in
1255   <verbatim|{> braces <verbatim|}> (which is to be expected for <LaTeX>).
1256
1257   A sample expression is:
1258
1259   <verbatim|name=thomas, params={a, b}, something, something-else>
1260
1261   but we see that this is just a simpler form of this expression:
1262
1263   <verbatim|name=freddie, foo={bar=baz, quux={quirk, a=fleeg}}, etc>
1264
1265   We may consider that we need a function that can parse such <LaTeX>
1266   expressions and assign the values to an AWK associative array, perhaps using
1267   a recursive parser into a multi-dimensional hash<\footnote>
1268     as AWK doesn't have nested-hash support
1269   </footnote>, resulting in:
1270
1271   <tabular|<tformat|<cwith|2|6|1|2|cell-lborder|0.5pt>|<cwith|2|6|1|2|cell-rborder|0.5pt>|<cwith|2|6|1|2|cell-bborder|0.5pt>|<cwith|2|6|1|2|cell-tborder|0.5pt>|<cwith|1|1|1|2|cell-lborder|0.5pt>|<cwith|1|1|1|2|cell-rborder|0.5pt>|<cwith|1|1|1|2|cell-bborder|0.5pt>|<cwith|1|1|1|2|cell-tborder|0.5pt>|<table|<row|<cell|key>|<cell|value>>|<row|<cell|a[name]>|<cell|freddie>>|<row|<cell|a[foo,
1272   bar]>|<cell|baz>>|<row|<cell|a[foo, quux,
1273   quirk]>|<cell|>>|<row|<cell|a[foo, quux,
1274   a]>|<cell|fleeg>>|<row|<cell|a[etc]>|<cell|>>>>>
1275
1276   On reflection, however, such nesting is sometimes not desirable, as the
1277   braces are also used to delimit values that contain commas. We may
1278   consider that
1279
1280   <verbatim|name={williamson, freddie}>
1281
1282   should assign <verbatim|williamson, freddie> to <verbatim|name>.
1283
1284   In fact we are not interested enough in the detail to be bothered by
1285   this, which turns out to be a good thing for two reasons. Firstly, <TeX>
1286   has a malleable parser with no strict syntax, and secondly, whether or not
1287   <verbatim|williamson> and <verbatim|freddie> should count as two items will
1288   be context dependent anyway.
1289
1290   We need to parse this <LaTeX> for only one reason, which is that we are
1291   extending lstlistings to add some additional arguments which will be used
1292   to express chunk parameters and other chunk options.
1293
1294   <section|Additional lstlistings parameters>
1295
1296   Further on we define a <verbatim|\\Chunk> <LaTeX> macro whose arguments
1297   will consist of the chunk name, optionally followed by a comma and then a
1298   comma-separated list of arguments. In fact we will just need to prefix
1299   <verbatim|name=> to the arguments in order to create valid lstlistings
1300   arguments.
1301
1302   Other arguments will be supported too:
1303
1304   <\description-long>
1305     <item*|params>As an extension to many literate-programming styles, fangle
1306     permits code chunks to take parameters and thus operate somewhat like C
1307     pre-processor macros, or like C++ templates. Chunk parameters are
1308     declared with a chunk argument called params, which holds a semi-colon
1309     separated list of parameters, like this:
1310
1311     <verbatim|achunk,language=C,params=name;address>
1312
1313     <item*|addto>a named chunk that this chunk is to be included into. This
1314     saves the effort of having to declare another listing of the named chunk
1315     merely to include this one.
1316   </description-long>
1317
1318   Function get_chunk_args() will accept two parameters, text being the text to
1319   parse, and values being an array to receive the parsed values as described
1320   above. The optional parameter path is used during recursion to build up the
1321   multi-dimensional array path.
1322
1323   <\nf-chunk|./fangle>
1324     <item>=\<less\>\\chunkref{get_chunk_args()}\<gtr\>
1325   </nf-chunk||>
1326
1327   <\nf-chunk|get_chunk_args()>
1328     <item>function get_chunk_args(text, values,
1329
1330     <item> \ # optional parameters
1331
1332     <item> \ path, # hierarchical precursors
1333
1334     <item> \ # local vars
1335
1336     <item> \ a, name)
1337   </nf-chunk||>
1338
1339   The strategy is to parse the name, and then look for a value. If the value
1340   begins with a brace <verbatim|{>, then we recurse and consume as much of
1341   the text as necessary, returning the remaining text when we encounter a
1342   leading close-brace <verbatim|}>. Because this strategy is executed in a
1343   loop, we realise that we must first look for the closing brace (perhaps
1344   preceded by white space) in order to terminate the recursion and return
1345   the remaining text.
1346
1347   <\nf-chunk|get_chunk_args()>
1348     <item>{
1349
1350     <item> \ split("", next_chunk_args);
1351
1352     <item> \ while(length(text)) {
1353
1354     <item> \ \ \ if (match(text, "^ *}(.*)", a)) {
1355
1356     <item> \ \ \ \ \ return a[1];
1357
1358     <item> \ \ \ }
1359
1360     <item> \ \ \ =\<less\>\\chunkref{parse-chunk-args}\<gtr\>
1361
1362     <item> \ }
1363
1364     <item> \ return text;
1365
1366     <item>}
1367   </nf-chunk||>
1368
1369   We can see that the text could be inspected with this regex:
1370
1371   <\nf-chunk|parse-chunk-args>
1372     <item>if (! match(text, " *([^,=]*[^,= ]) *(([,=]) *(([^,}]*) *,*
1373     *(.*))\|)$", a)) {
1374
1375     <item> \ return text;
1376
1377     <item>}
1378   </nf-chunk||>
1379
1380   and that <verbatim|a> will have the following values:
1381
1382   <tabular|<tformat|<cwith|2|7|1|2|cell-lborder|0.5pt>|<cwith|2|7|1|2|cell-rborder|0.5pt>|<cwith|2|7|1|2|cell-bborder|0.5pt>|<cwith|2|7|1|2|cell-tborder|0.5pt>|<cwith|1|1|1|2|cell-lborder|0.5pt>|<cwith|1|1|1|2|cell-rborder|0.5pt>|<cwith|1|1|1|2|cell-bborder|0.5pt>|<cwith|1|1|1|2|cell-tborder|0.5pt>|<table|<row|<cell|a[n]>|<cell|assigned
1383   text>>|<row|<cell|1>|<cell|name>>|<row|<cell|2>|<cell|=freddie,
1384   foo={bar=baz, quux={quirk, a=fleeg}}, etc>>|<row|<cell|3>|<cell|=>>|<row|<cell|4>|<cell|freddie,
1385   foo={bar=baz, quux={quirk, a=fleeg}}, etc>>|<row|<cell|5>|<cell|freddie>>|<row|<cell|6>|<cell|foo={bar=baz,
1386   quux={quirk, a=fleeg}}, etc>>>>>
1387
1388   <verbatim|a[3]> will be either <verbatim|=> or <verbatim|,> and signify
1389   whether the option named in <verbatim|a[1]> has a value or not
1390   (respectively).
1391
1392   If the option does have a value, then if the expression
1393   <verbatim|substr(a[4],1,1)> returns a brace <verbatim|{> it will signify
1394   that we need to recurse:
1395
1396   <\nf-chunk|parse-chunk-args>
1397     <item>name=a[1];
1398
1399     <item>if (a[3] == "=") {
1400
1401     <item> \ if (substr(a[4],1,1) == "{") {
1402
1403     <item> \ \ \ text = get_chunk_args(substr(a[4],2), values, path name
1404     SUBSEP);
1405
1406     <item> \ } else {
1407
1408     <item> \ \ \ values[path name]=a[5];
1409
1410     <item> \ \ \ text = a[6];
1411
1412     <item> \ }
1413
1414     <item>} else {
1415
1416     <item> \ values[path name]="";
1417
1418     <item> \ text = a[2];
1419
1420     <item>}
1421   </nf-chunk||>
1422
1423   We can test this function like this:
1424
1425   <\nf-chunk|gca-test.awk>
1426     <item>=\<less\>\\chunkref{get_chunk_args()}\<gtr\>
1427
1428     <item>BEGIN {
1429
1430     <item> \ SUBSEP=".";
1431
1432     <item>
1433
1434     <item> \ print get_chunk_args("name=freddie, foo={bar=baz, quux={quirk,
1435     a=fleeg}}, etc", a);
1436
1437     <item> \ for (b in a) {
1438
1439     <item> \ \ \ print "a[" b "] =\<gtr\> " a[b];
1440
1441     <item> \ }
1442
1443     <item>}
1444   </nf-chunk||>
1445
1446   which should give this output:
1447
1448   <\nf-chunk|gca-test.awk-results>
1449     <item>a[foo.quux.quirk] =\<gtr\>\
1450
1451     <item>a[foo.quux.a] =\<gtr\> fleeg
1452
1453     <item>a[foo.bar] =\<gtr\> baz
1454
1455     <item>a[etc] =\<gtr\>\
1456
1457     <item>a[name] =\<gtr\> freddie
1458   </nf-chunk||>
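
For readers who want to experiment with the recursion outside awk, here is a minimal Python sketch of the same idea: names are read up to a comma, an equals sign or a close-brace, a brace after the equals sign starts a nested level, and nested names are joined with a separator in the way that <verbatim|SUBSEP> is used above. It is a character scanner written for illustration, not a translation of the regular expressions in the awk chunks, and the parameter names <verbatim|path> and <verbatim|sep> are choices made for this example.

```python
def get_chunk_args(text, values, path="", sep="."):
    """Parse 'k=v, k={...}, flag' into values; return the unparsed remainder."""
    i = 0
    while True:
        # Skip separators left over from the previous item.
        while i < len(text) and text[i] in ", ":
            i += 1
        if i == len(text):
            return ""
        if text[i] == "}":
            return text[i + 1:]  # end of this nesting level
        # Read the option name.
        j = i
        while j < len(text) and text[j] not in ",=}":
            j += 1
        name = text[i:j].strip()
        if j < len(text) and text[j] == "=":
            j += 1
            while j < len(text) and text[j] == " ":
                j += 1
            if j < len(text) and text[j] == "{":
                # Nested value: recurse, then carry on with what is left.
                text = get_chunk_args(text[j + 1:], values, path + name + sep, sep)
                i = 0
                continue
            k = j
            while k < len(text) and text[k] not in ",}":
                k += 1
            values[path + name] = text[j:k].strip()
            i = k
        else:
            values[path + name] = ""  # option without a value
            i = j
```

Like the awk version, this sketch treats every braced value as a nested level, so <verbatim|name={williamson, freddie}> yields the two keys <verbatim|name.williamson> and <verbatim|name.freddie> and no single value for <verbatim|name>.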
1459
1460   <section|Parsing chunk arguments><label|Chunk Arguments>
1461
1462   Arguments to parameterized chunks are expressed in round brackets as a comma
1463   separated list of optional arguments. For example, a chunk that is defined
1464   with:
1465
1466   <verbatim|\\Chunk{achunk, params=name ; address}>
1467
1468   could be invoked as:
1469
1470   <verbatim|\\chunkref{achunk}(John Jones, jones@example.com)>
1471
1472   An argument list may be as simple as in <verbatim|\\chunkref{pull}(thing,
1473   otherthing)> or as complex as:
1474
1475   <verbatim|\\chunkref{pull}(things[x, y], get_other_things(a, "(all)"))>
1476
1477   which, for all its commas, quotes and parentheses, represents only
1478   two parameters: <verbatim|things[x, y]> and <verbatim|get_other_things(a,
1479   "(all)")>.
1480
1481   If we simply split the parameter list on commas, then the comma in
1482   <verbatim|things[x, y]> would split it into two separate arguments,
1483   <verbatim|things[x> and <verbatim|y]>, neither of which makes sense on
1484   its own.
1485
1486   One way to prevent this would be by refusing to split text between matching
1487   delimiters, such as <verbatim|[>, <verbatim|]>, <verbatim|(>, <verbatim|)>,
1488   <verbatim|{>, <verbatim|}> and most likely also <verbatim|">, <verbatim|">
1489   and <verbatim|'>, <verbatim|'>. Of course this also makes it impossible to
1490   pass mis-matched code fragments as parameters, but I think that it would
1491   be hard for readers to cope with authors who pass such unbalanced code
1492   fragments as chunk parameters<\footnote>
1493     I know that I couldn't cope with users doing such things, and although
1494     the GPL3 license prevents me from actually forbidding anyone from trying,
1495     if they want it to work they'll have to write the code themselves and not
1496     expect any support from me.
1497   </footnote>.
1498
1499   Unfortunately, the full set of matching delimiters may vary from language
1500   to language. In certain C++ template contexts, <verbatim|\<less\>> and
1501   <verbatim|\<gtr\>> would count as delimiters, and yet in other contexts
1502   they would not.
1503
1504   This puts me in the unfortunate position of having to somewhat parse all
1505   programming languages without knowing what they are!
1506
1507   However, if this universal mode-tracking is possible, then parsing the
1508   arguments would be trivial. Such a mode tracker is described in chapter
1509   <reference|modes> and used here with simplicity.
1510
1511   <\nf-chunk|parse_chunk_args>
1512     <item>function parse_chunk_args(language, text, values, mode,
1513
1514     <item> \ # local vars
1515
1516     <item> \ c, context, rest)
1517
1518     <item>{
1519
1520     <item> \ =\<less\>\\chunkref{new-mode-tracker}(context, language,
1521     mode)\<gtr\>
1522
1523     <item> \ rest = mode_tracker(context, text, values);
1524
1525     <item> \ # extract values
1526
1527     <item> \ for(c=1; c \<less\>= context[0, "values"]; c++) {
1528
1529     <item> \ \ \ values[c] = context[0, "values", c];
1530
1531     <item> \ }
1532
1533     <item> \ return rest;
1534
1535     <item>}
1536   </nf-chunk||>
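
To make the idea of refusing to split between matching delimiters concrete, the following Python sketch splits an argument list on top-level commas while tracking a stack of open brackets and quotes. It is a deliberately simplified stand-in for the mode tracker of chapter <reference|modes>, not fangle's implementation: the delimiter set is fixed rather than chosen per language, and the name <verbatim|split_args> is hypothetical. The input is assumed to start just after the opening parenthesis.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
QUOTES = "\"'"


def split_args(text):
    """Split on top-level commas; return (arguments, text after the ')')."""
    args, current, stack = [], [], []
    i = 0
    while i < len(text):
        c = text[i]
        if stack and stack[-1] in QUOTES:
            # Inside a string: only a backslash or the same quote is special.
            current.append(c)
            if c == "\\" and i + 1 < len(text):
                current.append(text[i + 1])
                i += 1
            elif c == stack[-1]:
                stack.pop()
        elif c in QUOTES or c in PAIRS:
            stack.append(c if c in QUOTES else PAIRS[c])
            current.append(c)
        elif stack and c == stack[-1]:
            stack.pop()
            current.append(c)
        elif not stack and c == ",":
            args.append("".join(current).strip())
            current = []
        elif not stack and c == ")":
            args.append("".join(current).strip())
            return args, text[i + 1:]
        else:
            current.append(c)
        i += 1
    args.append("".join(current).strip())
    return args, ""
```

The stack is what keeps the close parenthesis inside <verbatim|"(all)"> from ending the argument list: while a quote is on top of the stack, brackets are copied through without being interpreted.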
1537
1538   <section|Expanding parameters in the text>
1539
1540   Within the body of the chunk, the parameters are referred to with:
1541   <verbatim|${name}> and <verbatim|${address}>. There is a strong case that a
1542   <LaTeX> style notation should be used, like <verbatim|\\param{name}> which
1543   would be expressed in the listing as <verbatim|=\<less\>\\param{name}\<gtr\>>
1544   and be rendered as <verbatim|<nf-arg|name>>. Such notation would make me go
1545   blind, but I do intend to adopt it.
1546
1547   We therefore need a function <verbatim|expand_chunk_args> which will take a
1548   block of text, a list of permitted parameters, and the arguments which must
1549   substitute for the parameters.\
1550
1551   Here we split the text on <verbatim|${> which means that all parts except
1552   the first will begin with a parameter name which will be terminated by
1553   <verbatim|}>. The split function will consume the literal <verbatim|${> in
1554   each case.
1555
1556   <\nf-chunk|expand_chunk_args()>
1557     <item>function expand_chunk_args(text, params, args, \
1558
1559     <item> \ p, text_array, next_text, v, t, l)
1560
1561     <item>{
1562
1563     <item> \ if (split(text, text_array, "\\\\${")) {
1564
1565     <item> \ \ \ <nf-ref|substitute-chunk-args|>
1566
1567     <item> \ }
1568
1569     <item>
1570
1571     <item> \ return text;
1572
1573     <item>}
1574   </nf-chunk||>
1575
1576   First, we produce an associative array of substitution values indexed by
1577   parameter names. This will serve as a cache, allowing us to look up the
1578   replacement values as we extract each name.
1579
1580   <\nf-chunk|substitute-chunk-args>
1581     <item>for(p in params) {
1582
1583     <item> \ v[params[p]]=args[p];
1584
1585     <item>}
1586   </nf-chunk||>
1587
1588   We accumulate substituted text in the variable <verbatim|text>. The first
1589   element produced by the split function is the part before the first
1590   delimiter (<verbatim|${> in our case), so it will never contain a parameter
1591   reference, and we assign it directly to the result kept in
1592   <verbatim|text>.
1593
1594   <\nf-chunk|substitute-chunk-args>
1595     <item>text=text_array[1];
1596   </nf-chunk||>
1597
1598   We then iterate over the remaining values in the array<\footnote>
1599     I don't know why I think that it will enumerate the array in order, but
1600     it seems to work
1601   </footnote><todo|fix or prove it>, and replace each reference with its
1602   argument.
1603
1604   <\nf-chunk|substitute-chunk-args>
1605     <item>for(t=2; t in text_array; t++) {
1606
1607     <item> \ =\<less\>\\chunkref{substitute-chunk-arg}\<gtr\>
1608
1609     <item>}
1610   </nf-chunk||>
1611
1612   After the split on <verbatim|${> a valid parameter reference will consist
1613   of a valid parameter name terminated by a close-brace <verbatim|}>. A valid
1614   parameter name begins with an underscore or a letter, and may contain
1615   letters, digits or underscores.
1616
1617   A valid-looking reference that is not actually the name of a parameter will
1618   be left as it is and not substituted. This is good because there is nothing
1619   to substitute anyway, and it avoids clashes when writing code for languages
1620   where <verbatim|${...}> is a valid construct; such constructs will not be
1621   interfered with unless the parameter name also matches.
1622
1623   <\nf-chunk|substitute-chunk-arg>
1624     <item>if (match(text_array[t], "^([a-zA-Z_][a-zA-Z0-9_]*)}", l) &&
1625
1626     <item> \ \ \ l[1] in v)\
1627
1628     <item>{
1629
1630     <item> \ text = text v[l[1]] substr(text_array[t], length(l[1])+2);
1631
1632     <item>} else {
1633
1634     <item> \ text = text "${" text_array[t];
1635
1636     <item>}
1637   </nf-chunk||>
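
The whole substitution can be summarised by the Python sketch below, which follows the same steps as the awk chunks: build a lookup table from parameter names to arguments, split the text on <verbatim|${>, and for each later part either substitute a recognised name or put the <verbatim|${> back. It is offered as an illustration for experimenting with the behaviour, not as a replacement for the awk code.

```python
import re

# A parameter name followed by the closing brace, anchored at the start.
NAME = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_chunk_args(text, params, args):
    """Replace ${name} with the argument that corresponds to each parameter."""
    values = dict(zip(params, args))
    parts = text.split("${")
    out = [parts[0]]  # text before the first ${ never holds a reference
    for part in parts[1:]:
        m = NAME.match(part)
        if m and m.group(1) in values:
            out.append(values[m.group(1)] + part[m.end():])
        else:
            out.append("${" + part)  # not one of ours: restore it untouched
    return "".join(out)
```

For example, with parameters <verbatim|name> and <verbatim|address>, a shell-style <verbatim|${HOME}> in the chunk body passes through unchanged because <verbatim|HOME> is not a declared parameter.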
1638
1639   <chapter|Language Modes & Quoting><label|modes>
1640
1641   <section|Modes>
1642
1643   <verbatim|lstlistings> and <verbatim|fangle> both recognize source
1644   languages, and perform some basic parsing. <verbatim|lstlistings> can
1645   detect strings and comments within a language definition and perform
1646   suitable rendering, such as italics for comments, and visible-spaces within
1647   strings.
1648
1649   Fangle similarly can recognize strings, and comments, etc, within a
1650   language, so that any chunks included with <verbatim|\\chunkref> can be
1651   suitably escaped or quoted.
1652
1653   <subsection|Modes to keep code together>
1654
1655   As an example, in the C language there are a few parse modes, affecting the
1656   interpretation of characters.
1657
1658   One parse mode is the string mode. The string mode is commenced by an
1659   un-escaped quotation mark <verbatim|"> and terminated by the same. Within
1660   the string mode, only one additional mode can be commenced: the backslash
1661   mode <verbatim|\\>, which is always terminated after the following
1662   character.
1663
1664   Another mode is <verbatim|[> which is terminated by a <verbatim|]> (unless
1665   it occurs in a string).
1666
1667   Consider this fragment of C code:
1668
1669   \;
1670
1671   <math|things<wide|<around|[|x, y|]>|\<wide-overbrace\>><rsup|1. [ mode>,
1672   get_other_things<wide|<around|(|a, <wide*|<text|"><around|(|all|)><text|">|\<wide-underbrace\>><rsub|3.
1673   " mode>|)>|\<wide-overbrace\>><rsup|2. ( mode>>
1674
1675   \;
1676
1677   Mode nesting prevents the close parenthesis in the quoted string (part 3)
1678   from terminating the parenthesis mode (part 2).
1679
1680   Each language has a set of modes, the default mode being the null mode.
1681   Each mode can lead to other modes.
1682
1683   <subsection|Modes affect included chunks>
1684
1685   For instance, consider this chunk with language=perl:
1686
1687   <nf-chunk|example-perl|print "hello world $0\\n";|perl|>
1688
1689   If it were included in a chunk with <verbatim|language=sh>, like this:
1690
1691   <nf-chunk|example-sh|perl -e "=\<less\>\\chunkref{example-perl}\<gtr\>"|sh|>
1692
1693   fangle would <em|want> to generate output like this:
1694
1695   <verbatim|perl -e "print \\"hello world \\$0\\\\n\\";" >
1696
1697   See that the double quote <verbatim|">, back-slash <verbatim|\\> and
1698   <verbatim|$> have been quoted with a back-slash to protect them from shell
1699   interpretation.
1700
1701   If that were then included in a chunk with language=make, like this:
1702
1703   <\nf-chunk|example-makefile>
1704     <item>target: pre-req
1705
1706     <item><htab|5mm>=\<less\>\\chunkref{example-sh}\<gtr\>
1707   </nf-chunk|make|>
1708
1709   We would need the output to look like this --- note the <verbatim|$$>:
1710
1711   <\verbatim>
1712     target: pre-req
1713
1714     \ \ \ \ \ \ \ \ perl -e "print \\"hello world \\$$0\\\\n\\";"
1715   </verbatim>
1716
1717   In order to make this work, we need to define a mode-tracker supporting
1718   each language, that can detect the various quoting modes, and provide a
1719   transformation that must be applied to any included text so that included
1720   text will be interpreted correctly after any interpolation that it may be
1721   subject to at run-time.
1722
1723   For example, the sed transformation for text to be inserted into shell
1724   double-quoted strings would be something like:
1725
1726   <verbatim|s/\\\\/\\\\\\\\/g;s/\\$/\\\\$/g;s/"/\\\\"/g;>
1727
1728   which protects <verbatim|\\ $ ">.
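
The same three substitutions can be sketched in Python (an illustration only; <verbatim|sh_dquote_escape> is a hypothetical name, and like the sed expression it makes no attempt to protect back-ticks). The backslash is handled first so that the backslashes added by the later steps are not doubled again.

```python
def sh_dquote_escape(text):
    """Protect backslash, dollar and double quote for use inside sh "..."."""
    text = text.replace("\\", "\\\\")   # backslash first
    text = text.replace("$", "\\$")
    text = text.replace('"', '\\"')
    return text

perl = 'print "hello world $0\\n";'     # the example-perl chunk text
print('perl -e "' + sh_dquote_escape(perl) + '"')
# prints: perl -e "print \"hello world \$0\\n\";"
```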
1729
1730   <todo|I don't think this example is true>The mode tracker must also track
1731   nested mode-changes, as in this sh example.
1732
1733   <verbatim|echo "hello `id ...`">
1734
1735   <phantom|<verbatim|echo "hello `id >><math|\<uparrow\>>
1736
1737   Any characters inserted at the point marked <math|\<uparrow\>> would need
1738   to be escaped, including <verbatim|`> <verbatim|\|> <verbatim|*> among
1739   others. First it would need escaping for the back-ticks <verbatim|`>, and
1740   then for the double-quotes <verbatim|">.
1741
1742   <todo|MAYBE>Escaping need not occur if the format and mode of the included
1743   chunk matches that of the including chunk.
1744
1745   As each chunk is output, a new mode tracker for that language is
1746   initialized in its normal state. As text is output for that chunk, the
1747   output mode is tracked. When a new chunk is included, a transformation
1748   appropriate to that mode is selected and pushed onto a stack of
1749   transformations. Any text to be output is first passed through this stack of transformations.
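
A minimal Python sketch of such a stack follows, assuming each transformation is a plain function from text to text and that the most recently pushed transformation is applied first. The names <verbatim|sh_dquote>, <verbatim|make_recipe> and <verbatim|emit> are hypothetical. It reproduces the makefile example given earlier.

```python
def sh_dquote(text):
    for ch in ("\\", "$", '"'):        # backslash must be handled first
        text = text.replace(ch, "\\" + ch)
    return text

def make_recipe(text):
    return text.replace("$", "$$")     # make turns $$ back into a single $

def emit(text, transforms):
    """Pass text through the stack, most recently pushed transform first."""
    for transform in reversed(transforms):
        text = transform(text)
    return text

stack = [make_recipe]                  # the sh chunk is included from the makefile chunk
head = emit('perl -e "', stack)
stack.append(sh_dquote)                # the perl chunk is included inside sh "..." mode
body = emit('print "hello world $0\\n";', stack)
stack.pop()                            # the perl chunk has ended
tail = emit('"', stack)
print(head + body + tail)
# prints: perl -e "print \"hello world \$$0\\n\";"
```

The perl text passes through both transformations, while the text belonging to the sh chunk itself only passes through the make transformation.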
1750
1751   It remains to consider if the chunk-include function should return its
1752   generated text so that the caller can apply any transformations (and
1753   formatting), or if it should apply the stack of transformations itself.
1754
1755   Note that the transformed text should have the property of not being able
1756   to change the mode in the current chunk.
1757
1758   <todo|Note chunk parameters should probably also be transformed>
1759
1760   <section|Language Mode Definitions>
1761
1762   All modes are stored in a single multi-dimensional hash. The first index is
1763   the language, and the second index is the mode-identifier. The third index
1764   names the property: terminators and, optionally, submodes and delimiters.
1765
1766   A useful set of mode definitions for a nameless general C-type language is
1767   shown here. (Don't be confused by the double backslash escaping needed in
1768   awk. One set of escaping is for the string, and the second set of escaping
1769   is for the regex).
1770
1771   <\todo>
1772     TODO: Add =\<less\>\\mode{}\<gtr\> command which will allow us to signify
1773     that a string is
1774
1775     \ regex and thus fangle will quote it for us.
1776   </todo>
1777
1778   Submodes are entered by the characters \ <verbatim|"> <verbatim|'>
1779   <verbatim|{> <verbatim|(> <verbatim|[> <verbatim|/*>
1780
1781   <\nf-chunk|common-mode-definitions>
1782     <item>modes[${language}, "", \ "submodes"]="\\\\\\\\\|\\"\|'\|{\|\\\\(\|\\\\[";
1783   </nf-chunk||<tuple|language>>
1784
1785   In the default mode, a comma surrounded by un-important white space is a
1786   delimiter of language items<\footnote>
1787     whatever a <em|language item> might be
1788   </footnote>.
1789
1790   <\nf-chunk|common-mode-definitions>
1791     <item>modes[${language}, "", \ "delimiters"]=" *, *";
1792   </nf-chunk||language>
1793
1794   and should pass this test:<todo|Why do the tests run in "(" mode and not ""
1795   mode>
1796
1797   <\nf-chunk|test:mode-definitions>
1798     <item>parse_chunk_args("c-like", "1,2,3", a, "");
1799
1800     <item>if (a[1] != "1") e++;
1801
1802     <item>if (a[2] != "2") e++;
1803
1804     <item>if (a[3] != "3") e++;
1805
1806     <item>if (length(a) != 3) e++;
1807
1808     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
1809
1810     <item>
1811
1812     <item>parse_chunk_args("c-like", "joe, red", a, "");
1813
1814     <item>if (a[1] != "joe") e++;
1815
1816     <item>if (a[2] != "red") e++;
1817
1818     <item>if (length(a) != 2) e++;
1819
1820     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
1821
1822     <item>
1823
1824     <item>parse_chunk_args("c-like", "${colour}", a, "");
1825
1826     <item>if (a[1] != "${colour}") e++;
1827
1828     <item>if (length(a) != 1) e++;
1829
1830     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
1831   </nf-chunk||>
1832
1833   Nested modes are identified by a backslash, a double or single quote,
1834   various bracket styles or a <verbatim|/*> comment.
1835
1836   For each of these sub-modes we must also identify a mode terminator, and
1837   any sub-modes or delimiters that may be entered<\footnote>
1838     Because we are using the sub-mode characters as the mode identifier, we
1839     can't currently have a mode character dependent on its context;
1840     i.e. <verbatim|{> can't behave differently when it is inside
1841     <verbatim|[>.
1842   </footnote>.
1843
1844   <subsection|Backslash>
1845
1846   The backslash mode has no submodes or delimiters, and is terminated by any
1847   character. Note that we are not so much interested in evaluating or
1848   interpolating content as we are in delineating content. It is no matter
1849   that a double backslash (<verbatim|\\\\>) may represent a single backslash
1850   while a backslash-newline may represent white space, but it does matter
1851   that the newline in a backslash newline should not be able to terminate a C
1852   pre-processor statement; and so the newline will be consumed by the
1853   backslash however it is to be interpreted.
1854
1855   <\nf-chunk|common-mode-definitions>
1856     <item>modes[${language}, "\\\\", "terminators"]=".";
1857   </nf-chunk||>
1858
1859   <subsection|Strings>
1860
1861   Common languages support two kinds of string quoting: double quotes and
1862   single quotes.
1863
1864   In a string we have one special mode, which is the backslash. This may
1865   escape an embedded quote and prevent us thinking that it should terminate
1866   the string.
1867
1868   <\nf-chunk|mode:common-string>
1869     <item>modes[${language}, ${quote}, "submodes"]="\\\\\\\\";
1870   </nf-chunk||<tuple|language|quote>>
1871
1872   Otherwise, the string will be terminated by the same character that
1873   commenced it.
1874
1875   <\nf-chunk|mode:common-string>
1876     <item>modes[${language}, ${quote}, "terminators"]=${quote};
1877   </nf-chunk||language>
1878
1879   In C type languages, certain escape sequences exist in strings. We need to
1880   define a mechanism to encode any chunks included in this mode using those
1881   escape sequences. These are expressed in two parts, s meaning search, and r
1882   meaning replace.
1883
1884   The first substitution is to replace a backslash with a double backslash.
1885   We do this first as other substitutions may introduce a backslash which we
1886   would not then want to escape again here.
1887
1888   Note: Backslashes need double-escaping in the search pattern but not in the
1889   replacement string, hence we are replacing a literal <verbatim|\\> with a
1890   literal <verbatim|\\\\>.
1891
1892   <\nf-chunk|mode:common-string>
1893     <item>escapes[${language}, ${quote}, ++escapes[${language}, ${quote}],
1894     "s"]="\\\\\\\\";
1895
1896     <item>escapes[${language}, ${quote}, \ \ escapes[${language}, ${quote}],
1897     "r"]="\\\\\\\\";
1898   </nf-chunk||language>
1899
1900   If the quote character occurs in the text, it should be preceded by a
1901   backslash, otherwise it would terminate the string unexpectedly.
1902
1903   <\nf-chunk|mode:common-string>
1904     <item>escapes[${language}, ${quote}, ++escapes[${language}, ${quote}],
1905     "s"]=${quote};
1906
1907     <item>escapes[${language}, ${quote}, \ \ escapes[${language}, ${quote}],
1908     "r"]="\\\\" ${quote};
1909   </nf-chunk||language>
1910
1911   Any newlines in the string must be replaced by <verbatim|\\n>.
1912
1913   <\nf-chunk|mode:common-string>
1914     <item>escapes[${language}, ${quote}, ++escapes[${language}, ${quote}],
1915     "s"]="\\n";
1916
1917     <item>escapes[${language}, ${quote}, \ \ escapes[${language}, ${quote}],
1918     "r"]="\\\\n";
1919   </nf-chunk||language>
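
To show why the order of these search and replace pairs matters, here is a Python sketch of the same table for a double-quoted string. It is only an approximation of the awk <verbatim|escapes> entries; <verbatim|ESCAPES> and <verbatim|escape_for_string> are hypothetical names, and Python regex rules are assumed rather than awk's.

```python
import re

# Ordered (search regex, replacement text) pairs for a double-quoted string.
ESCAPES = [
    (r"\\", "\\\\"),   # 1. backslash -> double backslash (must come first)
    (r'"', '\\"'),     # 2. quote     -> backslash quote
    (r"\n", "\\n"),    # 3. newline   -> backslash n
]

def escape_for_string(text):
    for search, replace in ESCAPES:
        # a function replacement stops re.sub interpreting the backslashes
        text = re.sub(search, lambda m, r=replace: r, text)
    return text

print(escape_for_string('say "hi"\n'))
```

If the backslash rule ran last instead of first, it would double the backslashes that the quote and newline rules had just introduced.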
1920
1921   For the common modes, we define this string handling for double and single
1922   quotes.
1923
1924   <\nf-chunk|common-mode-definitions>
1925     <item>=\<less\>\\chunkref{mode:common-string}(${language},
1926     "\\textbackslash{}"")\<gtr\>
1927
1928     <item>=\<less\>\\chunkref{mode:common-string}(${language}, "'")\<gtr\>
1929   </nf-chunk||>
1930
1931   Working strings should pass this test:
1932
1933   <\nf-chunk|test:mode-definitions>
1934     <item>parse_chunk_args("c-like", "say \\"I said, \\\\\\"Hello, how are
1935     you\\\\\\".\\", for me", a, "");
1936
1937     <item>if (a[1] != "say \\"I said, \\\\\\"Hello, how are you\\\\\\".\\"")
1938     e++;
1939
1940     <item>if (a[2] != "for me") e++;
1941
1942     <item>if (length(a) != 2) e++;
1943
1944     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
1945   </nf-chunk||>
1946
1947   <subsection|Parentheses, Braces and Brackets>
1948
1949   Whereas quotes are closed by the same character that opened them,
1950   parentheses, brackets and braces are closed by a different character.
1951
1952   <\nf-chunk|mode:common-brackets>
1953     <item>modes[<nf-arg|language>, <nf-arg|open>, \ "submodes"
1954     ]="\\\\\\\\\|\\"\|{\|\\\\(\|\\\\[\|'\|/\\\\*";
1955
1956     <item>modes[<nf-arg|language>, <nf-arg|open>, \ "delimiters"]=" *, *";
1957
1958     <item>modes[<nf-arg|language>, <nf-arg|open>,
1959     \ "terminators"]=<nf-arg|close>;
1960   </nf-chunk||<tuple|language|open|close>>
1961
1962   Note that the open is NOT a regex but the close token IS. <todo|When we can
1963   quote regex we won't have to put the slashes in here>
1964
1965   <\nf-chunk|common-mode-definitions>
1966     <item>=\<less\>\\chunkref{mode:common-brackets}(${language}, "{",
1967     "}")\<gtr\>
1968
1969     <item>=\<less\>\\chunkref{mode:common-brackets}(${language}, "[",
1970     "\\textbackslash{}\\textbackslash{}]")\<gtr\>
1971
1972     <item>=\<less\>\\chunkref{mode:common-brackets}(${language}, "(",
1973     "\\textbackslash{}\\textbackslash{})")\<gtr\>
1974   </nf-chunk||>
1975
1976   <subsection|Customizing Standard Modes>
1977
1978   <\nf-chunk|mode:add-submode>
1979     <item>modes[${language}, ${mode}, "submodes"] = modes[${language},
1980     ${mode}, "submodes"] "\|" ${submode};
1981   </nf-chunk||<tuple|language|mode|submode>>
1982
1983   <\nf-chunk|mode:add-escapes>
1984     <item>escapes[${language}, ${mode}, ++escapes[${language}, ${mode}],
1985     "s"]=${search};
1986
1987     <item>escapes[${language}, ${mode}, \ \ escapes[${language}, ${mode}],
1988     "r"]=${replace};
1989   </nf-chunk||<tuple|language|mode|search|replace>>
1990
1991   \;
1992
1993   <subsection|Comments>
1994
1995   We can define <verbatim|/* comment */> style comments and
1996   <verbatim|//comment> style comments to be added to any language:
1997
1998   <\nf-chunk|mode:multi-line-comments>
1999     <item>=\<less\>\\chunkref{mode:add-submode}(${language}, "",
2000     "/\\textbackslash{}\\textbackslash{}*")\<gtr\>
2001
2002     <item>modes[${language}, "/*", "terminators"]="\\\\*/";
2003   </nf-chunk||<tuple|language>>
2004
2005   <\nf-chunk|mode:single-line-slash-comments>
2006     <item>=\<less\>\\chunkref{mode:add-submode}(${language}, "", "//")\<gtr\>
2007
2008     <item>modes[${language}, "//", "terminators"]="\\n";
2009
2010     <item>=\<less\>\\chunkref{mode:add-escapes}(${language}, "//",
2011     "\\textbackslash{}n", "\\textbackslash{}n//")\<gtr\>
2012   </nf-chunk||language>
2013
2014   We can also define <verbatim|# comment> style comments (as used in awk and
2015   shell scripts) in a similar manner.
2016
2017   <todo|I'm having to use # for hash and \textbackslash{} for \ and have
2018   hacky work-arounds in the parser for now>
2019
2020   <\nf-chunk|mode:add-hash-comments>
2021     <item>=\<less\>\\chunkref{mode:add-submode}(${language}, "",
2022     "\\#")\<gtr\>
2023
2024     <item>modes[${language}, "#", "terminators"]="\\n";
2025
2026     <item>=\<less\>\\chunkref{mode:add-escapes}(${language}, "\\#",
2027     "\\textbackslash{}n", "\\textbackslash{}n\\#")\<gtr\>
2028   </nf-chunk||<tuple|language>>
2029
2030   In C, the <verbatim|#> denotes pre-processor directives, which can span
2031   multiple lines:
2032
2033   <\nf-chunk|mode:add-hash-defines>
2034     <item>=\<less\>\\chunkref{mode:add-submode}(${language}, "",
2035     "\\#")\<gtr\>
2036
2037     <item>modes[${language}, "#", "submodes" ]="\\\\\\\\";
2038
2039     <item>modes[${language}, "#", "terminators"]="\\n";
2040
2041     <item>=\<less\>\\chunkref{mode:add-escapes}(${language}, "\\#",
2042     "\\textbackslash{}n", "\\textbackslash{}\\textbackslash{}\\textbackslash{}\\textbackslash{}\\textbackslash{}n")\<gtr\>
2043   </nf-chunk||<tuple|language>>
2044
2045   <\nf-chunk|mode:quote-dollar-escape>
2046     <item>escapes[${language}, ${quote}, ++escapes[${language}, ${quote}],
2047     "s"]="\\\\$";
2048
2049     <item>escapes[${language}, ${quote}, \ \ escapes[${language}, ${quote}],
2050     "r"]="\\\\$";
2051   </nf-chunk||<tuple|language|quote>>
2052
2053   We can add these definitions to various languages
2054
2055   <\nf-chunk|mode-definitions>
2056     <item><nf-ref|common-mode-definitions|<tuple|"c-like">>
2057
2058     <item>
2059
2060     <item><nf-ref|common-mode-definitions|<tuple|"c">>
2061
2062     <item>=\<less\>\\chunkref{mode:multi-line-comments}("c")\<gtr\>
2063
2064     <item>=\<less\>\\chunkref{mode:single-line-slash-comments}("c")\<gtr\>
2065
2066     <item>=\<less\>\\chunkref{mode:add-hash-defines}("c")\<gtr\>
2067
2068     <item>
2069
2070     <item>=\<less\>\\chunkref{common-mode-definitions}("awk")\<gtr\>
2071
2072     <item>=\<less\>\\chunkref{mode:add-hash-comments}("awk")\<gtr\>
2073
2074     <item>=\<less\>\\chunkref{mode:add-naked-regex}("awk")\<gtr\>
2075   </nf-chunk||>
2076
2077   The awk definitions should allow a comment block like this:
2078
2079   <nf-chunk|test:comment-quote|<item># Comment:
2080   =\<less\>\\chunkref{test:comment-text}\<gtr\>|awk|>
2081
2082   <\nf-chunk|test:comment-text>
2083     <item>Now is the time for
2084
2085     <item>the quick brown fox to bring lemonade
2086
2087     <item>to the party
2088   </nf-chunk||>
2089
2090   to come out like this:
2091
2092   <\nf-chunk|test:comment-quote:result>
2093     <item># Comment: Now is the time for
2094
2095     <item>#the quick brown fox to bring lemonade
2096
2097     <item>#to the party
2098   </nf-chunk||>
2099
2100   The C definition for such a block should have it come out like this:
2101
2102   <\nf-chunk|test:comment-quote:C-result>
2103     <item># Comment: Now is the time for\\
2104
2105     <item>the quick brown fox to bring lemonade\\
2106
2107     <item>to the party
2108   </nf-chunk||>
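
Both results come from a single newline substitution applied to the included text. A Python sketch of the two escapes, under the assumption that the included chunk text contains no trailing newline (the function names are hypothetical):

```python
def in_hash_comment(text):
    """awk/sh style: restart the comment after every newline."""
    return text.replace("\n", "\n#")

def in_c_directive(text):
    """C style: continue the directive with a backslash before each newline."""
    return text.replace("\n", "\\\n")

text = "Now is the time for\nthe quick brown fox to bring lemonade\nto the party"
print("# Comment: " + in_hash_comment(text))
print("# Comment: " + in_c_directive(text))
```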
2109
2110   <subsection|Regex>
2111
2112   This pattern is incomplete. It is meant to detect naked regular expressions
2113   in awk and perl, e.g. <verbatim|/.*$/>, but the required capabilities are
2114   not yet present.
2115
2116   Currently it only detects regexes anchored with ^ as used in fangle.
2117
2118   For full regex support, modes need to be named not after their starting
2119   character, but some other more fully qualified name.
2120
2121   <\nf-chunk|mode:add-naked-regex>
2122     <item>=\<less\>\\chunkref{mode:add-submode}(${language}, "",
2123     "/\\textbackslash{}\\textbackslash{}\\^")\<gtr\>
2124
2125     <item>modes[${language}, "/^", "terminators"]="/";
2126   </nf-chunk||<tuple|language>>
2127
2128   <subsection|Perl>
2129
2130   <\nf-chunk|mode-definitions>
2131     <item>=\<less\>\\chunkref{common-mode-definitions}("perl")\<gtr\>
2132
2133     <item>=\<less\>\\chunkref{mode:multi-line-comments}("perl")\<gtr\>
2134
2135     <item>=\<less\>\\chunkref{mode:add-hash-comments}("perl")\<gtr\>
2136   </nf-chunk||>
2137
2138   Still need to add <verbatim|s/>, submode <verbatim|/>, terminate both
2139   with <verbatim|//>. This is likely to be impossible as perl regexes can
2140   contain perl.
2141
2142   <subsection|sh>
2143
2144   <\nf-chunk|mode-definitions>
2145     <item>=\<less\>\\chunkref{common-mode-definitions}("sh")\<gtr\>
2146
2147     <item>#\<less\>\\chunkref{mode:common-string}("sh",
2148     "\\textbackslash{}"")\<gtr\>
2149
2150     <item>#\<less\>\\chunkref{mode:common-string}("sh", "'")\<gtr\>
2151
2152     <item>=\<less\>\\chunkref{mode:add-hash-comments}("sh")\<gtr\>
2153
2154     <item>=\<less\>\\chunkref{mode:quote-dollar-escape}("sh", "\\"")\<gtr\>
2155   </nf-chunk||>
2156
2157   <section|Some tests>
2158
2159   Also, the parser must return any spare text at the end that has not been
2160   processed due to a mode terminator being found.
2161
2162   <\nf-chunk|test:mode-definitions>
2163     <item>rest = parse_chunk_args("c-like", "1, 2, 3) spare", a, "(");
2164
2165     <item>if (a[1] != 1) e++;
2166
2167     <item>if (a[2] != 2) e++;
2168
2169     <item>if (a[3] != 3) e++;
2170
2171     <item>if (length(a) != 3) e++;
2172
2173     <item>if (rest != " spare") e++;
2174
2175     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
2176   </nf-chunk||>
2177
2178   We must also be able to parse the example given earlier.
2179
2180   <\nf-chunk|test:mode-definitions>
2181     <item>parse_chunk_args("c-like", "things[x, y], get_other_things(a,
2182     \\"(all)\\"), 99", a, "(");
2183
2184     <item>if (a[1] != "things[x, y]") e++;
2185
2186     <item>if (a[2] != "get_other_things(a, \\"(all)\\")") e++;
2187
2188     <item>if (a[3] != "99") e++;
2189
2190     <item>if (length(a) != 3) e++;
2191
2192     <item>=\<less\>\\chunkref{pca-test.awk:summary}\<gtr\>
2193   </nf-chunk||>
2194
2195   <section|A non-recursive mode tracker>
2196
2197   <subsection|Constructor>
2198
2199   The mode tracker holds its state in a stack based on a numerically indexed
2200   hash. This function, when passed an empty hash, will initialize it.
2201
2202   <\nf-chunk|new_mode_tracker()>
2203     <item>function new_mode_tracker(context, language, mode) {
2204
2205     <item> \ context[""] = 0;
2206
2207     <item> \ context[0, "language"] = language;
2208
2209     <item> \ context[0, "mode"] = mode;
2210
2211     <item>}
2212   </nf-chunk||>
2213
2214   Because awk functions cannot return an array, we must create the array
2215   first and pass it in, so we have a fangle macro to do this:
2216
2217   <\nf-chunk|new-mode-tracker>
2218     <item><nf-ref|awk-delete-array|<tuple|context>>
2219
2220     <item>new_mode_tracker(${context}, ${language}, ${mode});
2221   </nf-chunk|awk|<tuple|context|language|mode>>
2222
2223   <subsection|Management>
2224
2225   And for tracking modes, we dispatch to a mode-tracker action based on the
2226   current language.
2227
2228   <\nf-chunk|mode_tracker>
2229     <item>function push_mode_tracker(context, language, mode,
2230
2231     <item> \ # local vars
2232
2233     <item> \ top)
2234
2235     <item>{
2236
2237     <item> \ if (! ("" in context)) {
2238
2239     <item> \ \ \ <nf-ref|new-mode-tracker|<tuple|context|language|mode>>
2240
2241     <item> \ } else {
2242
2243     <item> \ \ \ top = context[""];
2244
2245     <item> \ \ \ if (context[top, "language"] == language && mode=="") mode =
2246     context[top, "mode"];
2247
2248     <item> \ \ \ top++;
2249
2250     <item> \ \ \ context[top, "language"] = language;
2251
2252     <item> \ \ \ context[top, "mode"] = mode;
2253
2254     <item> \ \ \ context[""] = top;
2255
2256     <item> \ }
2257
2258     <item>}
2259   </nf-chunk|awk|>
2260
2261   <\nf-chunk|mode_tracker>
2262     <item>function dump_mode_tracker(context, \
2263
2264     <item> \ c, d)
2265
2266     <item>{
2267
2268     <item> \ for(c=0; c \<less\>= context[""]; c++) {
2269
2270     <item> \ \ \ printf(" %2d \ \ %s:%s\\n", c, context[c, "language"],
2271     context[c, "mode"]) \<gtr\> "/dev/stderr";
2272
2273     <item> \ \ \ for(d=1; ( (c, "values", d) in context); d++) {
2274
2275     <item> \ \ \ \ \ printf(" \ \ %2d %s\\n", d, context[c, "values", d])
2276     \<gtr\> "/dev/stderr";
2277
2278     <item> \ \ \ }
2279
2280     <item> \ }
2281
2282     <item>}
2283   </nf-chunk||>
2284
2285   <\nf-chunk|mode_tracker>
2286     <item>function finalize_mode_tracker(context)
2287
2288     <item>{
2289
2290     <item> \ if ( ("" in context) && context[""] != 0) return 0;
2291
2292     <item> \ return 1;
2293
2294     <item>}
2295   </nf-chunk||>
2296
2297   This implies that any chunk must be syntactically whole; for instance, this
2298   is fine:
2299
2300   <\nf-chunk|test:whole-chunk>
2301     <item>if (1) {
2302
2303     <item> \ =\<less\>\\chunkref{test:say-hello}\<gtr\>
2304
2305     <item>}
2306   </nf-chunk||>
2307
2308   <\nf-chunk|test:say-hello>
2309     <item>print "hello";
2310   </nf-chunk||>
2311
2312   But this is not fine; the chunk <nf-ref|test:hidden-else|> is not properly
2313   cromulent.
2314
2315   <\nf-chunk|test:partial-chunk>
2316     <item>if (1) {
2317
2318     <item> \ =\<less\>\\chunkref{test:hidden-else}\<gtr\>
2319
2320     <item>}
2321   </nf-chunk||>
2322
2323   <\nf-chunk|test:hidden-else>
2324     <item> \ print "I'm fine";
2325
2326     <item>} else {
2327
2328     <item> \ print "I'm not";
2329   </nf-chunk||>
2330
2331   These tests will check for correct behaviour:
2332
2333   <\nf-chunk|test:cromulence>
2334     <item>echo Cromulence test
2335
2336     <item>passtest $FANGLE -Rtest:whole-chunk $TEX_SRC &\<gtr\>/dev/null \|\|
2337     ( echo "Whole chunk failed" && exit 1 )
2338
2339     <item>failtest $FANGLE -Rtest:partial-chunk $TEX_SRC &\<gtr\>/dev/null
2340     \|\| ( echo "Partial chunk failed" && exit 1 )
2341   </nf-chunk||>
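
The stack layout and the wholeness check can be sketched in Python, using a dictionary in place of the awk hash (an illustration under that assumption, not the fangle source). As in the awk code, the key <verbatim|""> holds the index of the top of the stack.

```python
def push_mode_tracker(context, language, mode):
    """Push a (language, mode) frame; context[""] holds the top index."""
    if "" not in context:
        top = 0                          # first use: initialise the stack
    else:
        top = context[""]
        if context[top, "language"] == language and mode == "":
            mode = context[top, "mode"]  # same language: inherit the current mode
        top += 1
    context[top, "language"] = language
    context[top, "mode"] = mode
    context[""] = top

def finalize_mode_tracker(context):
    """True when the stack is back at its base, i.e. the chunk was whole."""
    return not ("" in context and context[""] != 0)

whole = {}
push_mode_tracker(whole, "awk", "")
print(finalize_mode_tracker(whole))

partial = {}
push_mode_tracker(partial, "awk", "")
push_mode_tracker(partial, "awk", "{")   # a brace mode that is never terminated
print(finalize_mode_tracker(partial))
```

A chunk that leaves a mode open, or closes one it did not open, leaves the top index away from zero, which is what makes the partial chunk fail.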
2342
2343   <subsection|Tracker>
2344
2345   We must avoid recursion as a language construct because we intend to employ
2346   mode-tracking to track the language mode of emitted code, and that code is
2347   emitted from a function which is itself recursive. Instead we implement
2348   pseudo-recursion using our own stack based on a hash.
2349
2350   <\nf-chunk|mode_tracker()>
2351     <item>function mode_tracker(context, text, values,\
2352
2353     <item> \ # optional parameters
2354
2355     <item> \ # local vars
2356
2357     <item> \ mode, submodes, language,
2358
2359     <item> \ cindex, c, a, part, item, name, result, new_values, new_mode,\
2360
2361     <item> \ delimiters, terminators)
2362
2363     <item>{
2364   </nf-chunk|awk|>
2365
2366   We could be re-commencing with a valid context, so we need to set up the
2367   state according to the last context.
2368
2369   <\nf-chunk|mode_tracker()>
2370     <item> \ cindex = context[""] + 0;
2371
2372     <item> \ mode = context[cindex, "mode"];
2373
2374     <item> \ language = context[cindex, "language" ];
2375   </nf-chunk||>
2376
2377   First we construct a single large regex combining the possible sub-modes
2378   for the current mode along with the terminators for the current mode.
2379
2380   <\nf-chunk|parse_chunk_args-reset-modes>
2381     <item> \ submodes=modes[language, mode, "submodes"];
2382
2383     <item>
2384
2385     <item> \ if ((language, mode, "delimiters") in modes) {
2386
2387     <item> \ \ \ delimiters = modes[language, mode, "delimiters"];
2388
2389     <item> \ \ \ if (length(submodes)\<gtr\>0) submodes = submodes "\|";
2390
2391     <item> \ \ \ submodes=submodes delimiters;
2392
2393     <item> \ } else delimiters="";
2394
2395     <item> \ if ((language, mode, "terminators") in modes) {
2396
2397     <item> \ \ \ terminators = modes[language, mode, "terminators"];
2398
2399     <item> \ \ \ if (length(submodes)\<gtr\>0) submodes = submodes "\|";
2400
2401     <item> \ \ \ submodes=submodes terminators;
2402
2403     <item> \ } else terminators="";
2404   </nf-chunk||>
2405
2406   If we don't find anything to match on --- probably because the language is
2407   not supported --- then we return the entire text without matching anything.
2408
2409   <\nf-chunk|parse_chunk_args-reset-modes>
2410     <item> if (! length(submodes)) return text;
2411   </nf-chunk||>
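
In Python terms, the regex construction amounts to the sketch below. The dictionary <verbatim|modes> is a hypothetical cut-down copy of the c-like definitions, and <verbatim|cobol> stands for any language with no definitions. Note that Python picks the first alternative that matches at the leftmost position, whereas awk picks the longest, which makes no difference for this input.

```python
import re

# Hypothetical mirror of modes[language, mode, property]; values are regex text.
SUBMODES = "|".join([r"\\", '"', "'", "{", r"\(", r"\["])
modes = {
    ("c-like", "", "submodes"): SUBMODES,
    ("c-like", "", "delimiters"): " *, *",
    ("c-like", "(", "submodes"): SUBMODES,
    ("c-like", "(", "delimiters"): " *, *",
    ("c-like", "(", "terminators"): r"\)",
}

def combined_regex(language, mode):
    """Join submodes, delimiters and terminators into one alternation."""
    parts = []
    for prop in ("submodes", "delimiters", "terminators"):
        pattern = modes.get((language, mode, prop), "")
        if pattern:
            parts.append(pattern)
    return "|".join(parts)

regex = combined_regex("c-like", "(")
print(repr(re.search("(" + regex + ")", "a, b) rest").group(1)))
print(combined_regex("cobol", "") == "")   # nothing to match on
```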
2412
2413   <\nf-chunk|mode_tracker()>
2414     <item>=\<less\>\\chunkref{parse_chunk_args-reset-modes}\<gtr\>
2415   </nf-chunk||>
2416
2417   We then iterate over the text (until there is none left) looking for
2418   sub-modes or terminators in the regex.
2419
2420   <\nf-chunk|mode_tracker()>
2421     <item> \ while((cindex \<gtr\>= 0) && length(text)) {
2422
2423     <item> \ \ \ if (match(text, "(" submodes ")", a)) {
2424   </nf-chunk||>
2425
2426   A bug that creeps in regularly during development is bad regexes of zero
2427   length which result in an infinite loop (as no text is consumed), so I
2428   catch that right away with this test.
2429
2430   <\nf-chunk|mode_tracker()>
2431     <item> \ \ \ \ \ if (RLENGTH\<less\>1) {
2432
2433     <item> \ \ \ \ \ \ \ error(sprintf("Internal error, matched zero length
2434     submode, should be impossible - likely regex computation error\\n" \\
2435
2436     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ "Language=%s\\nmode=%s\\nmatch=%s\\n",
2437     language, mode, submodes));
2438
2439     <item> \ \ \ \ \ }
2440   </nf-chunk||>
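
A typical way to get such a regex is a stray alternation bar, which leaves an empty alternative that matches nothing at all. The Python sketch below shows the same guard; <verbatim|first_match> is a hypothetical helper, not part of fangle.

```python
import re

def first_match(pattern, text):
    """Search for pattern, refusing matches that consume no text."""
    m = re.search("(" + pattern + ")", text)
    if m and m.end() == m.start():
        raise ValueError("zero-length match; check regex: " + pattern)
    return m

print(repr(first_match('"| *, *', "a, b").group(1)))

try:
    first_match('"|| *, *', "a, b")       # the doubled bar adds an empty alternative
except ValueError as err:
    print(err)
```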
2441
2442   part is defined as the text up to the sub-mode or terminator, and this is
2443   appended to item --- which is the current text being gathered. If a mode
2444   has a delimiter, then item is reset each time a delimiter is found.
2445
2446   <math|<wide|<with|mode|prog|"><wide*|hello|\<wide-underbrace\>><rsub|item>,
2447   <wide*|there|\<wide-underbrace\>><rsub|item><with|mode|prog|">|\<wide-overbrace\>><rsup|item>,
2448   \ <wide|he said.|\<wide-overbrace\>><rsup|item>>
2449
2450   <\nf-chunk|mode_tracker()>
2451     <item> \ \ \ \ \ part = substr(text, 1, RSTART -1);
2452
2453     <item> \ \ \ \ \ item = item part;
2454   </nf-chunk||>
2455
2456   We must now determine what was matched. If it was a terminator, then we
2457   must restore the previous mode.
2458
2459   <\nf-chunk|mode_tracker()>
2460     <item> \ \ \ \ \ if (match(a[1], "^" terminators "$")) {
2461
2462     <item>#printf("%2d EXIT \ MODE [%s] by [%s] [%s]\\n", cindex, mode, a[1],
2463     text) \<gtr\> "/dev/stderr"
2464
2465     <item> \ \ \ \ \ \ \ context[cindex, "values", ++context[cindex,
2466     "values"]] = item;
2467
2468     <item> \ \ \ \ \ \ \ delete context[cindex];
2469
2470     <item> \ \ \ \ \ \ \ context[""] = --cindex;
2471
2472     <item> \ \ \ \ \ \ \ if (cindex\<gtr\>=0) {
2473
2474     <item> \ \ \ \ \ \ \ \ \ mode = context[cindex, "mode"];
2475
2476     <item> \ \ \ \ \ \ \ \ \ language = context[cindex, "language"];
2477
2478     <item> \ \ \ \ \ \ \ \ \ =\<less\>\\chunkref{parse_chunk_args-reset-modes}\<gtr\>
2479
2480     <item> \ \ \ \ \ \ \ }
2481
2482     <item> \ \ \ \ \ \ \ item = item a[1];
2483
2484     <item> \ \ \ \ \ \ \ text = substr(text, 1 + length(part) +
2485     length(a[1]));
2486
2487     <item> \ \ \ \ \ }
2488   </nf-chunk||>
2489
2490   If a delimiter was matched, then we must store the current item in the
2491   parsed values array, and reset the item.
2492
2493   <\nf-chunk|mode_tracker()>
2494     <item> \ \ \ \ \ else if (match(a[1], "^" delimiters "$")) {
2495
2496     <item> \ \ \ \ \ \ \ if (cindex==0) {
2497
2498     <item> \ \ \ \ \ \ \ \ \ context[cindex, "values", ++context[cindex,
2499     "values"]] = item;
2500
2501     <item> \ \ \ \ \ \ \ \ \ item = "";
2502
2503     <item> \ \ \ \ \ \ \ } else {
2504
2505     <item> \ \ \ \ \ \ \ \ \ item = item a[1];
2506
2507     <item> \ \ \ \ \ \ \ }
2508
2509     <item> \ \ \ \ \ \ \ text = substr(text, 1 + length(part) +
2510     length(a[1]));
2511
2512     <item> \ \ \ \ \ }
2513   </nf-chunk||>
2514
2515   Otherwise, if a new submode is detected (all submodes have terminators), we
2516   must create a nested parse context until we find the terminator for this
2517   mode.
2518
2519   <\nf-chunk|mode_tracker()>
2520     <item> else if ((language, a[1], "terminators") in modes) {
2521
2522     <item> \ \ \ \ \ \ \ #check if new_mode is defined
2523
2524     <item> \ \ \ \ \ \ \ item = item a[1];
2525
2526     <item>#printf("%2d ENTER MODE [%s] in [%s]\\n", cindex, a[1], text)
2527     \<gtr\> "/dev/stderr"
2528
2529     <item> \ \ \ \ \ \ \ text = substr(text, 1 + length(part) +
2530     length(a[1]));
2531
2532     <item> \ \ \ \ \ \ \ context[""] = ++cindex;
2533
2534     <item> \ \ \ \ \ \ \ context[cindex, "mode"] = a[1];
2535
2536     <item> \ \ \ \ \ \ \ context[cindex, "language"] = language;
2537
2538     <item> \ \ \ \ \ \ \ mode = a[1];
2539
2540     <item> \ \ \ \ \ \ \ =\<less\>\\chunkref{parse_chunk_args-reset-modes}\<gtr\>
2541
2542     <item> \ \ \ \ \ } else {
2543
2544     <item> \ \ \ \ \ \ \ error(sprintf("Submode '%s' set unknown mode in
2545     text: %s\\nLanguage %s Mode %s\\n", a[1], text, language, mode));
2546
2547     <item> \ \ \ \ \ \ \ text = substr(text, 1 + length(part) +
2548     length(a[1]));
2549
2550     <item> \ \ \ \ \ }
2551
2552     <item> \ \ \ }
2553   </nf-chunk||>
2554
2555   In the final case, we parsed to the end of the string. If the string was
2556   entire, then we should have no nested mode context, but if the string was
2557   just a fragment we may have a mode context which must be preserved for the
2558   next fragment. Todo: Consideration ought to be given to sub-mode strings
2559   that are split over two fragments.
2560
2561   <\nf-chunk|mode_tracker()>
2562     <item>else {
2563
2564     <item> \ \ \ \ \ context[cindex, "values", ++context[cindex, "values"]] =
2565     item text;
2566
2567     <item> \ \ \ \ \ text = "";
2568
2569     <item> \ \ \ \ \ item = "";
2570
2571     <item> \ \ \ }
2572
2573     <item> \ }
2574
2575     <item>
2576
2577     <item> \ context["item"] = item;
2578
2579     <item>
2580
2581     <item> \ if (length(item)) context[cindex, "values", ++context[cindex,
2582     "values"]] = item;
2583
2584     <item> \ return text;
2585
2586     <item>}
2587   </nf-chunk||>
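
The whole loop can be condensed into a Python sketch that behaves like the tests shown earlier. This is a simplification under stated assumptions: it keeps the stack in a list rather than a hash, hard-codes a c-like set of modes, and returns the values instead of storing them in the context. The names <verbatim|CLOSE>, <verbatim|NESTED>, <verbatim|DELIM> and <verbatim|parse_args> are hypothetical.

```python
import re

# Terminator regex for each opener.
CLOSE = {"(": r"\)", "[": r"\]", "{": "}", '"': '"', "'": "'"}
NESTED = "|".join([r"\\.", '"', "'", "{", r"\(", r"\["])
DELIM = " *, *"

def parse_args(text, mode=""):
    """Split text on top-level commas; return (values, unparsed rest)."""
    stack, values, item = [mode], [], ""
    while stack and text:
        mode = stack[-1]
        # inside quotes only a backslash pair or the closing quote matters
        parts = [r"\\."] if mode in ('"', "'") else [NESTED, DELIM]
        if mode:
            parts.append(CLOSE[mode])
        m = re.search("|".join(parts), text, re.S)
        if not m:                          # no more mode changes in this text
            item, text = item + text, ""
            break
        token = m.group(0)
        item += text[:m.start()]
        text = text[m.end():]
        if mode and re.fullmatch(CLOSE[mode], token):
            stack.pop()                    # terminator: back to the outer mode
            if stack:
                item += token
        elif re.fullmatch(DELIM, token):
            if len(stack) == 1:            # top-level delimiter ends the item
                values.append(item)
                item = ""
            else:
                item += token
        else:
            item += token
            if not token.startswith("\\"): # backslash pairs do not nest
                stack.append(token)
    if item:
        values.append(item)
    return values, text

print(parse_args("1, 2, 3) spare", "("))
print(parse_args('things[x, y], get_other_things(a, "(all)"), 99'))
```

When the starting mode is terminated, the loop stops and the text after the terminator is handed back untouched, which is the spare text that the first test checks for.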
2588
2589   <subsubsection|One happy chunk>
2590
2591   All the mode tracker chunks are referred to here:
2592
2593   <\nf-chunk|mode-tracker>
2594     <item><nf-ref|new_mode_tracker()|>
2595
2596     <item><nf-ref|mode_tracker()|>
2597   </nf-chunk||>
2598
2599   <subsubsection|Tests>
2600
2601   We can test this function like this:
2602
2603   <\nf-chunk|pca-test.awk>
2604     <item>=\<less\>\\chunkref{error()}\<gtr\>
2605
2606     <item>=\<less\>\\chunkref{mode-tracker}\<gtr\>
2607
2608     <item>=\<less\>\\chunkref{parse_chunk_args()}\<gtr\>
2609
2610     <item>BEGIN {
2611
2612     <item> \ SUBSEP=".";
2613
2614     <item> \ =\<less\>\\chunkref{mode-definitions}\<gtr\>
2615
2616     <item>
2617
2618     <item> \ =\<less\>\\chunkref{test:mode-definitions}\<gtr\>
2619
2620     <item>}
2621   </nf-chunk|awk|>
2622
2623   <\nf-chunk|pca-test.awk:summary>
2624     <item>if (e) {
2625
2626     <item> \ printf "Failed " e
2627
2628     <item> \ for (b in a) {
2629
2630     <item> \ \ \ print "a[" b "] =\<gtr\> " a[b];
2631
2632     <item> \ }
2633
2634     <item>} else {
2635
2636     <item> \ print "Passed"
2637
2638     <item>}
2639
2640     <item>split("", a);
2641
2642     <item>e=0;
2643   </nf-chunk|awk|>
2644
2645   which should give this output:
2646
2647   <\nf-chunk|pca-test.awk-results>
2648     <item>a[foo.quux.quirk] =\<gtr\>\
2649
2650     <item>a[foo.quux.a] =\<gtr\> fleeg
2651
2652     <item>a[foo.bar] =\<gtr\> baz
2653
2654     <item>a[etc] =\<gtr\>\
2655
2656     <item>a[name] =\<gtr\> freddie
2657   </nf-chunk||>
2658
2659   <section|Escaping and Quoting>
2660
2661   For the time being, and to get around <TeXmacs>'s inability to export a
2662   <kbd|TAB> character, the right arrow whose UTF-8 sequence is ...
2663
2664   <todo|complete>
2665
2666   Another special character, the left arrow with UTF-8 sequence 0xE2 0x86
2667   0xA4, is used to strip any preceding white space as a way of un-tabbing
2668   and removing indent that has been applied; this is important for bash
2669   here documents, and the like. It's a filthy hack.
2670
2671   <todo|remove the hack>
2672
2673   <\nf-chunk|mode_tracker>
2674     \;
2675
2676     <item>function untab(text) {
2677
2678     <item> \ gsub("[[:space:]]*\\xE2\\x86\\xA4","", text);
2679
2680     <item> \ return text;
2681
2682     <item>}
2683   </nf-chunk||>
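
  As a rough illustration of what <verbatim|untab()> is meant to do, here is
  a minimal sketch that can be run outside fangle. It assumes a POSIX awk on
  the path; the marker is written as a literal left-arrow character rather
  than with hex escapes, and a blank-or-tab bracket expression stands in for
  the space class so that older awks accept it. The wrapper name
  <verbatim|untab_demo> is hypothetical and exists only for this example.

```shell
# untab_demo TEXT: hypothetical wrapper, not part of fangle.
# Removes the U+21A4 marker together with any blanks or tabs before it.
untab_demo() {
  printf '%s\n' "$1" | awk '
    function untab(text) {
      # literal marker character; [ \t]* approximates [[:space:]]*
      gsub(/[ \t]*↤/, "", text);
      return text;
    }
    { print untab($0) }'
}

# An indented here-document terminator is pulled back to column 0.
untab_demo '        ↤EOF'
```

  Text without the marker passes through unchanged, so the function is safe
  to apply to every output line.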
2684
2685   Each nested mode can optionally define a set of transforms to be applied to
2686   any text that is included from another language.
2687
2688   This function applies up to <verbatim|max> such transforms, in order:
2689
2690   <\nf-chunk|mode_tracker>
2691     <item>function transform_escape(s, r, text,
2692
2693     <item> \ \ \ # optional
2694
2695     <item> \ \ \ max,\
2696
2697     <item> \ \ \ \ \ \ \ # local vars
2698
2699     <item> \ \ \ \ \ \ \ c)
2700
2701     <item>{
2702
2703     <item> \ for(c=1; c \<less\>= max && (c in s); c++) {
2704
2705     <item> \ \ \ gsub(s[c], r[c], text);
2706
2707     <item> \ }
2708
2709     <item> \ return text;
2710
2711     <item>}
2712   </nf-chunk|awk|>
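
  To show why the transforms are applied in index order, here is a small
  sketch that reuses <verbatim|transform_escape()> with two made-up
  HTML-style transforms. The transform table and the wrapper name
  <verbatim|transform_demo> are assumptions for illustration only; fangle's
  real tables come from the mode definitions.

```shell
# transform_demo TEXT MAX: hypothetical wrapper, not part of fangle.
transform_demo() {
  printf '%s\n' "$1" | awk -v max="$2" '
    function transform_escape(s, r, text, max,    c) {
      for (c = 1; c <= max && (c in s); c++) {
        gsub(s[c], r[c], text);
      }
      return text;
    }
    BEGIN {
      # Illustrative transforms only. "\\&" yields a literal ampersand in
      # the replacement text.
      s[1] = "&"; r[1] = "\\&amp;";
      s[2] = "<"; r[2] = "\\&lt;";
    }
    { print transform_escape(s, r, $0, max) }'
}

transform_demo 'a < b & c' 2
```

  If the two transforms were swapped, the ampersand introduced by the second
  would be escaped again by the first, which is why the order of the arrays
  matters.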
2713
2714   This function must append the escape transforms from the supplied context
2715   after index <verbatim|src>, and return <verbatim|src> plus the number added.
2716
2717   <\nf-chunk|mode_tracker>
2718     <item>function mode_escaper(context, s, r, src,
2719
2720     <item> \ c, cp, cpl)
2721
2722     <item>{
2723
2724     <item> \ for(c = context[""]; c \<gtr\>= 0; c--) {
2725
2726     <item> \ \ \ if ( (context[c, "language"], context[c, "mode"]) in
2727     escapes) {
2728
2729     <item> \ \ \ \ \ cpl = escapes[context[c, "language"], context[c,
2730     "mode"]];
2731
2732     <item> \ \ \ \ \ for (cp = 1; cp \<less\>= cpl; cp ++) {
2733
2734     <item> \ \ \ \ \ \ \ ++src;
2735
2736     <item> \ \ \ \ \ \ \ s[src] = escapes[context[c, "language"], context[c,
2737     "mode"], cp, "s"];
2738
2739     <item> \ \ \ \ \ \ \ r[src] = escapes[context[c, "language"], context[c,
2740     "mode"], cp, "r"];
2741
2742     <item> \ \ \ \ \ }
2743
2744     <item> \ \ \ }
2745
2746     <item> \ }
2747
2748     <item> \ return src;
2749
2750     <item>}
2751
2752     <item>function dump_escaper(c, s, r, cc) {
2753
2754     <item> \ for(cc=1; cc\<less\>=c; cc++) {
2755
2756     <item> \ \ \ printf("%2d s[%s] r[%s]\\n", cc, s[cc], r[cc]) \<gtr\>
2757     "/dev/stderr"
2758
2759     <item> \ }
2760
2761     <item>}
2762   </nf-chunk|awk|>
2763
2764   <\nf-chunk|test:escapes>
2765     <item>echo escapes test
2766
2767     <item>passtest $FANGLE -Rtest:comment-quote $TEX_SRC &\<gtr\>/dev/null
2768     \|\| ( echo "Comment-quote failed" && exit 1 )
2769   </nf-chunk|sh|>
2770
2771   <chapter|Recognizing Chunks>
2772
2773   Fangle recognizes noweb chunks, but as we also want better <LaTeX>
2774   integration we will recognize any of these:
2775
2776   <\itemize>
2777     <item>notangle chunks matching the pattern
2778     <verbatim|^\<less\>\<less\>.*?\<gtr\>\<gtr\>=>
2779
2780     <item>chunks beginning with <verbatim|\\begin{lstlisting}>, possibly
2781     with <verbatim|\\Chunk{...}> on the previous line
2782
2783     <item>an older form I have used, beginning with
2784     <verbatim|\\begin{Chunk}[options]> --- also more suitable for plain
2785     <LaTeX> users<\footnote>
2786       Is there such a thing as plain <LaTeX>?
2787     </footnote>.
2788   </itemize>
2789
2790   <section|Chunk start>
2791
2792   The variable <verbatim|chunking> signifies that we are processing a code
2793   chunk and not document text. In such a state, input lines are assigned to
2794   the current chunk; otherwise they are ignored.
2795
2796   <subsection|<TeXmacs> hackery>
2797
2798   We don't handle <TeXmacs> files natively but instead emit Unicode character
2799   sequences to mark up the text-export file which we work on.
2800
2801   These hacks detect such sequences and retrofit them into the old <TeX> parsing.
2802
2803   <\nf-chunk|recognize-chunk>
2804     \;
2805
2806     <item>#/\\n/ {
2807
2808     <item># \ gsub("\\n*$","");
2809
2810     <item># \ gsub("\\n", " ");
2811
2812     <item>#}
2813
2814     <item>#===
2815
2816     <item>/\\xE2\\x86\\xA6/ {
2817
2818     <item> \ gsub("\\\\xE2\\\\x86\\\\xA6", "\\x09");
2819
2820     <item>}
2821
2822     <item>
2823
2824     <item>/\\xE2\\x80\\x98/ {
2825
2826     <item> \ gsub("\\\\xE2\\\\x80\\\\x98", "`");
2827
2828     <item>}
2829
2830     <item>
2831
2832     <item>/\\xE2\\x89\\xA1/ {
2833
2834     <item> \ if (match($0, "^ *([^[ ]* \|)\<less\>([^[
2835     ]*)\\\\[[0-9]*\\\\][(](.*)[)].*, lang=([^ ]*)", line)) {
2836
2837     <item> \ \ \ next_chunk_name=line[2];
2838
2839     <item> \ \ \ gsub(",",";",line[3]);
2840
2841     <item> \ \ \ params="params=" line[3];
2842
2843     <item> \ \ \ if ((line[4])) {
2844
2845     <item> \ \ \ \ \ params = params ",language=" line[4]
2846
2847     <item> \ \ \ }
2848
2849     <item> \ \ \ get_chunk_args(params, next_chunk_args);
2850
2851     <item> \ \ \ new_chunk(next_chunk_name, next_chunk_args);
2852
2853     <item> \ \ \ texmacs_chunking = 1;
2854
2855     <item> \ } else {
2856
2857     <item>#print "Unexpected\
2858
2859     <item>#print\
2860
2861     <item>#exit 1
2862
2863     <item> \ }
2864
2865     <item> \ next;
2866
2867     <item>}
2868
2869     <item>#===
2870   </nf-chunk||>
2871
2872   <subsection|lstlistings>
2873
2874   Our current scheme is to recognize the new lstlisting chunks, but these may
2875   be preceded by a <verbatim|\\Chunk> command which in <LyX> is a more
2876   convenient way to pass the chunk name to the
2877   <verbatim|\\begin{lstlisting}> environment, and a more visible way to specify
2878   other <verbatim|lstset> settings.
2879
2880   The arguments to the <verbatim|\\Chunk> command are a name, and then a
2881   comma-separated list of key-value pairs after the manner of
2882   <verbatim|\\lstset>. In fact, within the <LaTeX> <verbatim|\\Chunk> macro
2883   (section <reference|sub:The-chunk-command>) the text <verbatim|name=> is
2884   prefixed to the argument which is then literally passed to
2885   <verbatim|\\lstset>.
2886
2887   <\nf-chunk|recognize-chunk>
2888     <item>/^\\\\Chunk{/ {
2889
2890     <item> \ if (match($0, "^\\\\\\\\Chunk{ *([^ ,}]*),?(.*)}", line)) {
2891
2892     <item> \ \ \ next_chunk_name = line[1];
2893
2894     <item> \ \ \ get_chunk_args(line[2], next_chunk_args);
2895
2896     <item> \ }
2897
2898     <item> \ next;
2899
2900     <item>}
2901   </nf-chunk|awk|>
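
  The chunk above relies on the gawk extension that lets
  <verbatim|match()> fill an array with sub-matches. For readers without
  gawk, the same split into a name and a key-value list can be sketched with
  plain <verbatim|sub()> calls. This is only an approximation of the regular
  expression above, and <verbatim|chunk_cmd_demo> is a hypothetical name.

```shell
# chunk_cmd_demo LINE: hypothetical wrapper, not part of fangle.
# Prints "name|args" for a \Chunk{name,key=value,...} line.
chunk_cmd_demo() {
  printf '%s\n' "$1" | awk '
    /^\\Chunk[{]/ {
      args = $0;
      sub(/^\\Chunk[{] */, "", args);  # drop the command and opening brace
      sub(/[}][^}]*$/, "", args);      # drop the last closing brace onwards
      name = args;
      sub(/[ ,}].*$/, "", name);       # name ends at a space, comma or brace
      rest = substr(args, length(name) + 1);
      sub(/^,/, "", rest);             # what remains is the key=value list
      print name "|" rest;
    }'
}

chunk_cmd_demo '\Chunk{hello-world,language=C,params=x;y}'
```

  The second field is what the real code hands to
  <verbatim|get_chunk_args()>; it is empty when only a name is given.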
2902
2903   We also make a basic attempt to parse the name out of the
2904   <verbatim|\\begin{lstlisting}[name=chunk-name]> text, otherwise we fall back
2905   to the name found in the previous <verbatim|\\Chunk> command. This attempt is
2906   very basic and doesn't support commas, spaces or square brackets as part of
2907   the chunk name. We also recognize <verbatim|\\begin{Chunk}> which is convenient
2908   for some users<\footnote>
2909     but not yet supported in the <LaTeX> macros
2910   </footnote>.
2911
2912   <\nf-chunk|recognize-chunk>
2913     <item>/^\\\\begin{lstlisting}\|^\\\\begin{Chunk}/ {
2914
2915     <item> \ if (match($0, "}.*[[,] *name= *{? *([^], }]*)", line)) {
2916
2917     <item> \ \ \ new_chunk(line[1]);
2918
2919     <item> \ } else {
2920
2921     <item> \ \ \ new_chunk(next_chunk_name, next_chunk_args);
2922
2923     <item> \ }
2924
2925     <item> \ chunking=1;
2926
2927     <item> \ next;
2928
2929     <item>}
2930   </nf-chunk||>
2931
2932   <subsection|<TeXmacs>>
2933
2934   \;
2935
2936   <\nf-chunk|recognize-chunk>
2937     <item>#===
2938
2939     <item>/^ *\\\|____________*/ && texmacs_chunking {
2940
2941     <item> \ active_chunk="";
2942
2943     <item> \ texmacs_chunking=0;
2944
2945     <item> \ chunking=0;
2946
2947     <item>}
2948
2949     <item>/^ *\\\|\\/\\\\/ && texmacs_chunking {
2950
2951     <item> \ texmacs_chunking=0;
2952
2953     <item> \ chunking=0;
2954
2955     <item> \ active_chunk="";
2956
2957     <item>}
2958
2959     <item>texmacs_chunk=0;
2960
2961     <item>/^ *[1-9][0-9]* *\\\| / {
2962
2963     <item> \ if (texmacs_chunking) {
2964
2965     <item> \ \ \ chunking=1;
2966
2967     <item> \ \ \ texmacs_chunk=1;
2968
2969     <item> \ \ \ gsub("^ *[1-9][0-9]* *\\\\\| ", "")
2970
2971     <item> \ }
2972
2973     <item>}
2974
2975     <item>/^ *\\.\\/\\\\/ && texmacs_chunking {
2976
2977     <item> \ next;
2978
2979     <item>}
2980
2981     <item>/^ *__*$/ && texmacs_chunking {
2982
2983     <item> \ next;
2984
2985     <item>}
2986
2987     <item>
2988
2989     <item>texmacs_chunking {
2990
2991     <item> \ if (! texmacs_chunk) {
2992
2993     <item> \ \ \ # must be a texmacs continued line
2994
2995     <item> \ \ \ chunking=1;
2996
2997     <item> \ \ \ texmacs_chunk=1;
2998
2999     <item> \ }
3000
3001     <item>}
3002
3003     <item>! texmacs_chunk {
3004
3005     <item># \ texmacs_chunking=0;
3006
3007     <item> \ chunking=0;
3008
3009     <item>}
3010
3011     <item>
3012
3013     <item>#===
3014   </nf-chunk||>
3015
3016   <subsection|Noweb>
3017
3018   We recognize notangle style chunks too:
3019
3020   <\nf-chunk|recognize-chunk>
3021     <item>/^[\<less\>]\<less\>.*[\<gtr\>]\<gtr\>=/ {
3022
3023     <item> \ if (match($0, "^[\<less\>]\<less\>(.*)[\<gtr\>]\<gtr\>= *$",
3024     line)) {
3025
3026     <item> \ \ \ chunking=1;
3027
3028     <item> \ \ \ notangle_mode=1;
3029
3030     <item> \ \ \ new_chunk(line[1]);
3031
3032     <item> \ \ \ next;
3033
3034     <item> \ }
3035
3036     <item>}
3037   </nf-chunk|awk|>
3038
3039   <section|Chunk end>
3040
3041   Likewise, we need to recognize when a chunk ends.
3042
3043   <subsection|lstlistings>
3044
3045   The <verbatim|e> in <verbatim|[e]nd{lstlisting}> is surrounded by square
3046   brackets so that when this document is processed, this chunk doesn't
3047   terminate early when the listings package recognizes its own
3048   end-string!<\footnote>
3049     This doesn't make sense as the regex is anchored with ^, which this line
3050     does not begin with!
3051   </footnote>
3052
3053   <\nf-chunk|recognize-chunk>
3054     <item>/^\\\\[e]nd{lstlisting}\|^\\\\[e]nd{Chunk}/ {
3055
3056     <item> \ chunking=0;
3057
3058     <item> \ active_chunk="";
3059
3060     <item> \ next;
3061
3062     <item>}
3063   </nf-chunk||>
3064
3065   <subsection|noweb>
3066
3067   <\nf-chunk|recognize-chunk>
3068     <item>/^@ *$/ {
3069
3070     <item> \ chunking=0;
3071
3072     <item> \ active_chunk="";
3073
3074     <item>}
3075   </nf-chunk||>
3076
3077   All other recognizers only take effect if we are chunking; there's no
3078   point in looking at lines if they aren't part of a chunk, so we just ignore
3079   them as efficiently as we can.
3080
3081   <\nf-chunk|recognize-chunk>
3082     <item>! chunking { next; }
3083   </nf-chunk||>
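
  Taken together, a start recognizer, an end recognizer and the guard on
  <verbatim|chunking> behave roughly like the following stand-alone sketch
  for the noweb syntax. It is a simplification: it prints each chunk line
  prefixed by the chunk name instead of storing it, and the names
  <verbatim|noweb_demo> and <verbatim|active> are assumptions made for this
  example.

```shell
# noweb_demo: hypothetical filter, not part of fangle. Reads a document on
# stdin and prints "chunk-name: line" for every line inside a noweb chunk.
noweb_demo() {
  awk '
    /^<<.*>>= *$/ {                 # chunk start: <<name>>=
      active = $0;
      sub(/^<</, "", active);
      sub(/>>= *$/, "", active);
      chunking = 1;
      next;
    }
    /^@ *$/ { chunking = 0; next }  # chunk end: a lone @
    ! chunking { next }             # documentation lines are ignored
    { print active ": " $0 }'
}

printf '%s\n' 'Some prose.' '<<hello.c>>=' \
  'int main(void) { return 0; }' '@' 'More prose.' | noweb_demo
```

  Only the single code line survives; both prose lines are dropped by the
  guard before any later rule sees them.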
3084
3085   <section|Chunk contents>
3086
3087   Chunk contents are any lines read while <verbatim|chunking> is true. Some
3088   chunk contents are special in that they refer to other chunks, and will be
3089   replaced by the contents of these chunks when the file is generated.
3090
3091   <label|sub:ORS-chunk-text>We add the output record separator <verbatim|ORS>
3092   to the line now, because we will set <verbatim|ORS> to the empty string
3093   when we generate the output<\footnote>
3094     So that we can partially print lines using <verbatim|print> instead of
3095     <verbatim|printf>. <todo|This doesn't make sense>
3096   </footnote>.
3097
3098   <\nf-chunk|recognize-chunk>
3099     <item>length(active_chunk) {
3100
3101     <item> \ =\<less\>\\chunkref{process-chunk-tabs}\<gtr\>
3102
3103     <item> \ =\<less\>\\chunkref{process-chunk}\<gtr\>
3104
3105     <item>}
3106   </nf-chunk||>
3107
3108   If a chunk just consisted of plain text, we could handle the chunk like
3109   this:
3110
3111   <\nf-chunk|process-chunk-simple>
3112     <item>chunk_line(active_chunk, $0 ORS);
3113   </nf-chunk||>
3114
3115   but in fact a chunk can include references to other chunks. Chunk includes
3116   are traditionally written as <verbatim|\<less\>\<less\>chunk-name\<gtr\>\<gtr\>>
3117   but we support other variations, some of which are more suitable for
3118   particular editing systems.
3119
3120   We also process tabs at this point: a tab in the input can be replaced by a
3121   number of spaces defined by the <verbatim|tabs> variable, set by the
3122   <verbatim|-T> option. Of course this is poor tab behaviour; we should
3123   probably have the option to use proper counted tab-stops and process this
3124   on output.
3125
3126   <\nf-chunk|process-chunk-tabs>
3127     <item>if (length(tabs)) {
3128
3129     <item> \ gsub("\\t", tabs);
3130
3131     <item>}
3132   </nf-chunk||>
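
  The effect of the <verbatim|-T> option can be sketched in isolation as
  follows. The sketch assumes that <verbatim|indent_string(n)> (defined
  elsewhere in fangle) yields a string of <verbatim|n> spaces, and builds
  that string with <verbatim|sprintf> instead; <verbatim|tabs_demo> is a
  hypothetical name.

```shell
# tabs_demo WIDTH: hypothetical filter, not part of fangle.
# Replaces every tab on stdin with WIDTH spaces; WIDTH of 0 leaves tabs alone.
tabs_demo() {
  awk -v width="$1" '
    BEGIN {
      # assumption: stands in for indent_string(width)
      tabs = (width > 0) ? sprintf("%" width "s", "") : "";
    }
    {
      if (length(tabs)) {
        gsub(/\t/, tabs);
      }
      print;
    }'
}

printf 'a\tb\n' | tabs_demo 4
```

  Every tab becomes the same number of spaces regardless of column, which is
  exactly the poor behaviour noted above; counted tab-stops would need the
  current column as well.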
3133
3134   <subsection|lstlistings><label|sub:lst-listings-includes>
3135
3136   If <verbatim|\\lstset{escapeinside={=\<less\>}{\<gtr\>}}> is set, then we
3137   can use <verbatim|=\<less\>\\chunkref{chunk-name}\<gtr\>> in listings. The
3138   sequence <verbatim|=\<less\>> was chosen because:
3139
3140   <\enumerate>
3141     <item>it is a better mnemonic than <verbatim|\<less\>\<less\>chunk-name\<gtr\>\<gtr\>>
3142     in that the <verbatim|=> sign signifies equivalence or substitutability.
3143
3144     <item>and because <verbatim|=\<less\>> is not valid in C or any language
3145     I can think of.
3146
3147     <item>and also because lstlistings doesn't like <verbatim|\<gtr\>\<gtr\>>
3148     as an end delimiter for the <em|texcl> escape, so we must make do with a
3149     single <verbatim|\<gtr\>> which is better complemented by
3150     <verbatim|=\<less\>> than by <verbatim|\<less\>\<less\>>.
3151   </enumerate>
3152
3153   Unfortunately the <verbatim|=\<less\>...\<gtr\>> that we use re-enters a
3154   <LaTeX> parsing mode in which some characters are special, e.g. <verbatim|#
3155   \\> and so these cause trouble if used in arguments to
3156   <verbatim|\\chunkref>. At some point I must fix the <LaTeX> command
3157   <verbatim|\\chunkref> so that it can accept these literally, but until
3158   then, when writing chunkref arguments that need these characters, I must
3159   use the forms <verbatim|\\textbackslash{}> and <verbatim|\\#>; so I also
3160   define a hacky chunk <verbatim|delatex> to be used further on whose purpose
3161   it is to remove these from any arguments parsed by fangle.
3162
3163   <\nf-chunk|delatex>
3164     <item># FILTHY HACK
3165
3166     <item>gsub("\\\\\\\\#", "#", ${text});
3167
3168     <item>gsub("\\\\\\\\textbackslash{}", "\\\\", ${text});
3169
3170     <item>gsub("\\\\\\\\\\\\^", "^", ${text});
3171   </nf-chunk||<tuple|text>>
3172
3173   As each chunk line may contain more than one chunk include, we will split
3174   out chunk includes in an iterative fashion<\footnote>
3175     Contrary to our use of split when substituting parameters in chapter
3176     <reference|Here-we-split>
3177   </footnote>.
3178
3179   First, as long as the chunk contains a <verbatim|\\chunkref> command we
3180   take as much as we can up to the first <verbatim|\\chunkref> command.
3181
3182   <\nf-chunk|process-chunk>
3183     <item>chunk = $0;
3184
3185     <item>indent = 0;
3186
3187     <item>while(match(chunk,"(\\xC2\\xAB)([^\\xC2]*) [^\\xC2]*\\xC2\\xBB",
3188     line) \|\|
3189
3190     <item> \ \ \ \ \ match(chunk,\
3191
3192     <item> \ \ \ \ \ \ \ \ \ \ \ "([=]\<less\>\\\\\\\\chunkref{([^}\<gtr\>]*)}(\\\\(.*\\\\)\|)\<gtr\>\|\<less\>\<less\>([a-zA-Z_][-a-zA-Z0-9_]*)\<gtr\>\<gtr\>)",\
3193
3194     <item> \ \ \ \ \ \ \ \ \ \ \ line)\\
3195
3196     <item>) {
3197
3198     <item> \ chunklet = substr(chunk, 1, RSTART - 1);
3199   </nf-chunk||>
3200
3201   We keep track of the indent count, by counting the number of literal
3202   characters found. We can then preserve this indent on each output line when
3203   multi-line chunks are expanded.
3204
3205   We then process this first part as literal text, and set the chunk which is
3206   still to be processed to be the text after the <verbatim|\\chunkref>
3207   command, which we will process next as we continue around the loop.
3208
3209   <\nf-chunk|process-chunk>
3210     <item> \ indent += length(chunklet);
3211
3212     <item> \ chunk_line(active_chunk, chunklet);
3213
3214     <item> \ chunk = substr(chunk, RSTART + RLENGTH);
3215   </nf-chunk||>
3216
3217   We then consider the type of chunk command we have found, whether it is the
3218   fangle style command beginning with <verbatim|=\<less\>> or the older
3219   notangle style beginning with <verbatim|\<less\>\<less\>>.
3220
3221   Fangle chunks may have arguments contained within parentheses. These
3222   will be matched in <verbatim|line[3]> and are considered at this stage of
3223   processing to be part of the name of the chunk to be included.
3224
3225   <\nf-chunk|process-chunk>
3226     <item> \ if (substr(line[1], 1, 1) == "=") {
3227
3228     <item> \ \ \ # chunk name up to }
3229
3230     <item> \ \ \ \ \ \ \ =\<less\>\\chunkref{delatex}(line[3])\<gtr\>
3231
3232     <item> \ \ \ chunk_include(active_chunk, line[2] line[3], indent);
3233
3234     <item> \ } else if (substr(line[1], 1, 1) == "\<less\>") {
3235
3236     <item> \ \ \ chunk_include(active_chunk, line[4], indent);
3237
3238     <item> \ } else if (line[1] == "\\xC2\\xAB") {
3239
3240     <item> \ \ \ chunk_include(active_chunk, line[2], indent);
3241
3242     <item> \ } else {
3243
3244     <item> \ \ \ error("Unknown chunk fragment: " line[1]);
3245
3246     <item> \ }
3247   <|nf-chunk>
3248     \;
3249   </nf-chunk|>
3250
3251   The loop will continue until there are no more chunkref statements in the
3252   text, at which point we process the final part of the chunk.
3253
3254   <\nf-chunk|process-chunk>
3255     <item>}
3256
3257     <item>chunk_line(active_chunk, chunk);
3258   </nf-chunk||>
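
  The shape of this loop is easier to see on a cut-down version that only
  knows the notangle-style include and prints what it finds instead of
  storing it. The sketch below uses only POSIX <verbatim|match()> with
  <verbatim|RSTART> and <verbatim|RLENGTH>; the output format and the name
  <verbatim|split_includes_demo> are assumptions for illustration.

```shell
# split_includes_demo LINE: hypothetical wrapper, not part of fangle.
# Splits one chunk line into literal text and notangle-style includes.
split_includes_demo() {
  printf '%s\n' "$1" | awk '
    {
      chunk = $0;
      indent = 0;
      while (match(chunk, /<<[a-zA-Z_][-a-zA-Z0-9_]*>>/)) {
        chunklet = substr(chunk, 1, RSTART - 1);    # literal text before it
        name = substr(chunk, RSTART + 2, RLENGTH - 4);
        indent += length(chunklet);                 # literal characters so far
        if (length(chunklet)) print "text [" chunklet "]";
        print "include [" name "] indent=" indent;
        chunk = substr(chunk, RSTART + RLENGTH);    # continue with the rest
      }
      if (length(chunk)) print "text [" chunk "]";  # final literal part
    }'
}

split_includes_demo '  printf(<<fmt>>, <<args>>);'
```

  Note that <verbatim|indent> counts only the literal characters seen so
  far on the line, as in the real loop, so the second include is recorded at
  11 rather than at its visual column.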
3259
3260   <label|lone-newline>We add the newline character as a chunklet on its own,
3261   to make it easier to detect new lines and thus manage indentation when
3262   processing the output.
3263
3264   <\nf-chunk|process-chunk>
3265     <item>chunk_line(active_chunk, "\\n");
3266   <|nf-chunk>
3267     \;
3268   </nf-chunk|>
3269
3270   We will also permit a chunk-part number to follow in square brackets, so
3271   that <verbatim|=\<less\>\\chunkref{chunk-name[1]}\<gtr\>> will refer to the
3272   first part only. This can make it easy to include a C function prototype in
3273   a header file, if the first part of the chunk is just the function
3274   prototype without the trailing semi-colon. The header file would include
3275   the prototype with the trailing semi-colon, like this:
3276
3277   <verbatim|=\<less\>\\chunkref{chunk-name[1]}\<gtr\>;>
3278
3279   This is handled in section <reference|sub:Chunk-parts>.
3280
3281   We should perhaps introduce a notion of language specific chunk options; so
3282   that perhaps we could specify:
3283
3284   <verbatim|=\<less\>\\chunkref{chunk-name[function-declaration]}\<gtr\>>
3285
3286   which applies a transform <verbatim|function-declaration> to the chunk ---
3287   which in this case would extract a function prototype from a function.
3288   <todo|Do it>
3289
3290   <chapter|Processing Options>
3291
3292   At the start, we first set the default options.
3293
3294   <\nf-chunk|default-options>
3295     <item>debug=0;
3296
3297     <item>linenos=0;
3298
3299     <item>notangle_mode=0;
3300
3301     <item>root="*";
3302
3303     <item>tabs = "";
3304   </nf-chunk||>
3305
3306   Then we use getopt the standard way, and null out ARGV afterwards in the
3307   normal AWK fashion.
3308
3309   <\nf-chunk|read-options>
3310     <item>Optind = 1 \ \ \ # skip ARGV[0]
3311
3312     <item>while(getopt(ARGC, ARGV, "R:LdT:hr")!=-1) {
3313
3314     <item> \ =\<less\>\\chunkref{handle-options}\<gtr\>
3315
3316     <item>}
3317
3318     <item>for (i=1; i\<less\>Optind; i++) { ARGV[i]=""; }
3319   </nf-chunk||>
3320
3321   This is how we handle our options:
3322
3323   <\nf-chunk|handle-options>
3324     <item>if (Optopt == "R") root = Optarg;
3325
3326     <item>else if (Optopt == "r") root="";
3327
3328     <item>else if (Optopt == "L") linenos = 1;
3329
3330     <item>else if (Optopt == "d") debug = 1;
3331
3332     <item>else if (Optopt == "T") tabs = indent_string(Optarg+0);
3333
3334     <item>else if (Optopt == "h") help();
3335
3336     <item>else if (Optopt == "?") help();
3337   </nf-chunk||>
3338
3339   We do all of this at the beginning of the program:
3340
3341   <\nf-chunk|begin>
3342     <item>BEGIN {
3343
3344     <item> \ =\<less\>\\chunkref{constants}\<gtr\>
3345
3346     <item> \ =\<less\>\\chunkref{mode-definitions}\<gtr\>
3347
3348     <item> \ =\<less\>\\chunkref{default-options}\<gtr\>
3349
3350     <item>
3351
3352     <item> \ =\<less\>\\chunkref{read-options}\<gtr\>
3353
3354     <item>}
3355   </nf-chunk||>
3356
3357   We also have a simple help function:
3358
3359   <\nf-chunk|help()>
3360     <item>function help() {
3361
3362     <item> \ print "Usage:"
3363
3364     <item> \ print " \ fangle [-L] -R\<less\>rootname\<gtr\> [source.tex
3365     ...]"
3366
3367     <item> \ print " \ fangle -r [source.tex ...]"
3368
3369     <item> \ print " \ If the filename, source.tex is not specified then
3370     stdin is used"
3371
3372     <item> \ print
3373
3374     <item> \ print "-L causes the C statement: #line \<less\>lineno\<gtr\>
3375     \\"filename\\"" to be issued"
3376
3377     <item> \ print "-R causes the named root to be written to stdout"
3378
3379     <item> \ print "-r lists all roots in the file (even those used
3380     elsewhere)"
3381
3382     <item> \ exit 1;
3383
3384     <item>}
3385   </nf-chunk||>
3386
3387   <chapter|Generating the Output>
3388
3389   We generate output by calling output_chunk, or listing the chunk names.
3390
3391   <\nf-chunk|generate-output>
3392     <item>if (length(root)) output_chunk(root);
3393
3394     <item>else output_chunk_names();
3395   </nf-chunk||>
3396
3397   We also have some other output debugging:
3398
3399   <\nf-chunk|debug-output>
3400     <item>if (debug) {
3401
3402     <item> \ print "------ chunk names "
3403
3404     <item> \ output_chunk_names();
3405
3406     <item> \ print "====== chunks"
3407
3408     <item> \ output_chunks();
3409
3410     <item> \ print "++++++ debug"
3411
3412     <item> \ for (a in chunks) {
3413
3414     <item> \ \ \ print a "=" chunks[a];
3415
3416     <item> \ }
3417
3418     <item>}
3419   </nf-chunk||>
3420
3421   We do both of these at the end. We also set <verbatim|ORS=""> because each
3422   chunklet is not necessarily a complete line, and we already added
3423   <verbatim|ORS> to each input line in section
3424   <reference|sub:ORS-chunk-text>.
3425
3426   <\nf-chunk|end>
3427     <item>END {
3428
3429     <item> \ =\<less\>\\chunkref{debug-output}\<gtr\>
3430
3431     <item> \ ORS="";
3432
3433     <item> \ =\<less\>\\chunkref{generate-output}\<gtr\>
3434
3435     <item>}
3436   </nf-chunk||>
3437
3438   We write chunk names like this. If we seem to be running in notangle
3439   compatibility mode, then we write each name as
3440   <verbatim|\<less\>\<less\>name\<gtr\>\<gtr\>>, the same way notangle does:
3441
3442   <\nf-chunk|output_chunk_names()>
3443     <item>function output_chunk_names( \ \ c, prefix, suffix)\
3444
3445     <item>{
3446
3447     <item> \ if (notangle_mode) {
3448
3449     <item> \ \ \ prefix="\<less\>\<less\>";
3450
3451     <item> \ \ \ suffix="\<gtr\>\<gtr\>";
3452
3453     <item> \ }
3454
3455     <item> \ for (c in chunk_names) {
3456
3457     <item> \ \ \ print prefix c suffix "\\n";
3458
3459     <item> \ }
3460
3461     <item>}
3462   </nf-chunk||>
3463
3464   This function would write out all chunks
3465
3466   <\nf-chunk|output_chunks()>
3467     <item>function output_chunks( \ a)\
3468
3469     <item>{
3470
3471     <item> \ for (a in chunk_names) {
3472
3473     <item> \ \ \ output_chunk(a);
3474
3475     <item> \ }
3476
3477     <item>}
3478
3479     <item>
3480
3481     <item>function output_chunk(chunk) {
3482
3483     <item> \ newline = 1;
3484
3485     <item> \ lineno_needed = linenos;
3486
3487     <item>
3488
3489     <item> \ write_chunk(chunk);
3490
3491     <item>}
3492
3493     <item>
3494   </nf-chunk||>
3495
3496   <section|Assembling the Chunks>
3497
3498   <verbatim|chunk_path> holds a string consisting of the names of all the
3499   chunks that resulted in this chunk being output. It should probably also
3500   contain the source line numbers at which each inclusion occurred.
3501
3502   We first initialize the mode tracker for this chunk.
3503
3504   <\nf-chunk|write_chunk()>
3505     <item>function write_chunk(chunk_name) {
3506
3507     <item> \ =\<less\>\\chunkref{awk-delete-array}(context)\<gtr\>
3508
3509     <item> \ return write_chunk_r(chunk_name, context);
3510
3511     <item>}
3512
3513     <item>
3514
3515     <item>function write_chunk_r(chunk_name, context, indent, tail,
3516
3517     <item> \ # optional vars
3518
3519     <item> \ chunk_path, chunk_args,\
3520
3521     <item> \ s, r, src, new_src,\
3522
3523     <item> \ # local vars
3524
3525     <item> \ chunk_params, part, max_part, part_line, frag, max_frag, text,\
3526
3527     <item> \ chunklet, only_part, call_chunk_args, new_context)
3528
3529     <item>{
3530
3531     <item> \ if (debug) debug_log("write_chunk_r(", chunk_name, ")");
3532   </nf-chunk||>
3533
3534   <subsection|Chunk Parts><label|sub:Chunk-parts>
3535
3536   As mentioned in section <reference|sub:lst-listings-includes>, a chunk name
3537   may contain a part specifier in square brackets, limiting the parts that
3538   should be emitted.
3539
3540   <\nf-chunk|write_chunk()>
3541     <item> \ if (match(chunk_name, "^(.*)\\\\[([0-9]*)\\\\]$",
3542     chunk_name_parts)) {
3543
3544     <item> \ \ \ chunk_name = chunk_name_parts[1];
3545
3546     <item> \ \ \ only_part = chunk_name_parts[2];
3547
3548     <item> \ }
3549   </nf-chunk||>
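
  The part specifier can be peeled off without gawk's array form of
  <verbatim|match()>. The following sketch does the same split with
  <verbatim|RSTART> and <verbatim|RLENGTH>; it is a portable approximation
  rather than the code fangle uses, and <verbatim|part_demo> is a
  hypothetical name.

```shell
# part_demo NAME: hypothetical wrapper, not part of fangle.
# Prints "chunk-name|part"; the part is empty when no [n] suffix is present.
part_demo() {
  printf '%s\n' "$1" | awk '
    {
      chunk_name = $0;
      only_part = "";
      if (match(chunk_name, /\[[0-9]*\]$/)) {
        only_part  = substr(chunk_name, RSTART + 1, RLENGTH - 2);
        chunk_name = substr(chunk_name, 1, RSTART - 1);
      }
      print chunk_name "|" only_part;
    }'
}

part_demo 'main.c[2]'
```

  An empty <verbatim|only_part> is false in awk, which is what lets the
  later test on <verbatim|only_part> emit every part when no specifier was
  given.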
3550
3551   We then create a mode tracker
3552
3553   <\nf-chunk|write_chunk()>
3554     <item> =\<less\>\\chunkref{new-mode-tracker}(context, chunks[chunk_name,
3555     "language"], "")\<gtr\>
3556   </nf-chunk||>
3557
3558   We extract into <verbatim|chunk_params> the names of the parameters that
3559   this chunk accepts, whose values were (optionally) passed in
3560   <verbatim|chunk_args>.
3561
3562   <\nf-chunk|write_chunk()>
3563     <item> split(chunks[chunk_name, "params"], chunk_params, " *; *");
3564   </nf-chunk||>
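
  The separator passed to <verbatim|split()> is a regular expression, so
  blanks around each semicolon are discarded along with it. A small sketch,
  with a made-up parameter list and the hypothetical name
  <verbatim|params_demo>:

```shell
# params_demo LIST: hypothetical wrapper, not part of fangle.
# Splits a semicolon-separated parameter list, trimming blanks around ";".
params_demo() {
  printf '%s\n' "$1" | awk '
    {
      n = split($0, chunk_params, / *; */);
      for (i = 1; i <= n; i++) {
        print i "=" chunk_params[i];
      }
    }'
}

params_demo 'width ; height;depth'
```

  The parameters end up numbered from 1, which is the order in which the
  arguments of an include are later matched against them.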
3565
3566   To assemble a chunk, we write out each part.
3567
3568   <\nf-chunk|write_chunk()>
3569     <item> \ if (! (chunk_name in chunk_names)) {
3570
3571     <item> \ \ \ error(sprintf(_"The root module
3572     \<less\>\<less\>%s\<gtr\>\<gtr\> was not defined.\\nUsed by: %s",\\
3573
3574     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ chunk_name, chunk_path));
3575
3576     <item> \ }
3577
3578     <item>
3579
3580     <item> \ max_part = chunks[chunk_name, "part"];
3581
3582     <item> \ for(part = 1; part \<less\>= max_part; part++) {
3583
3584     <item> \ \ \ if (! only_part \|\| part == only_part) {
3585
3586     <item> \ \ \ \ \ =\<less\>\\chunkref{write-part}\<gtr\>
3587
3588     <item> \ \ \ }
3589
3590     <item> \ }
3591
3592     <item> \ if (! finalize_mode_tracker(context)) {
3593
3594     <item> \ \ \ dump_mode_tracker(context);
3595
3596     <item> \ \ \ error(sprintf(_"Module %s did not close context
3597     properly.\\nUsed by: %s\\n", chunk_name, chunk_path));
3598
3599     <item> \ }
3600
3601     <item>}
3602   </nf-chunk||>
3603
3604   A part can either be a chunklet of lines, or an include of another chunk.
3605
3606   Chunks may also have parameters, specified in LaTeX style with braces after
3607   the chunk name --- looking like this in the document: chunkname{param1,
3608   param2}. Arguments are passed in square brackets:
3609   <verbatim|\\chunkref{chunkname}[arg1, arg2]>.
3610
3611   Before we process each part, we check that the source position hasn't
3612   changed unexpectedly, so that we can know if we need to output a new
3613   file-line directive.
3614
3615   <\nf-chunk|write-part>
3616     <item>=\<less\>\\chunkref{check-source-jump}\<gtr\>
3617
3618     <item>
3619
3620     <item>chunklet = chunks[chunk_name, "part", part];
3621
3622     <item>if (chunks[chunk_name, "part", part, "type"] == part_type_chunk) {
3623
3624     <item> \ =\<less\>\\chunkref{write-included-chunk}\<gtr\>
3625
3626     <item>} else if (chunklet SUBSEP "line" in chunks) {
3627
3628     <item> \ =\<less\>\\chunkref{write-chunklets}\<gtr\>
3629
3630     <item>} else {
3631
3632     <item> \ # empty last chunklet
3633
3634     <item>}
3635   </nf-chunk||>
3636
3637   To write an included chunk, we must detect any optional chunk arguments in
3638   parentheses. Then we recurse, calling <verbatim|write_chunk_r()>.
3639
3640   <\nf-chunk|write-included-chunk>
3641     <item>if (match(chunklet, "^([^\\\\[\\\\(]*)\\\\((.*)\\\\)$",
3642     chunklet_parts)) {
3643
3644     <item> \ chunklet = chunklet_parts[1];
3645
3646     <item> \ parse_chunk_args("c-like", chunklet_parts[2], call_chunk_args,
3647     "(");
3648
3649     <item> \ for (c in call_chunk_args) {
3650
3651     <item> \ \ \ call_chunk_args[c] = expand_chunk_args(call_chunk_args[c],
3652     chunk_params, chunk_args);
3653
3654     <item> \ }
3655
3656     <item>} else {
3657
3658     <item> \ split("", call_chunk_args);
3659
3660     <item>}
3661
3662     <item># update the transforms arrays
3663
3664     <item>new_src = mode_escaper(context, s, r, src);
3665
3666     <item>=\<less\>\\chunkref{awk-delete-array}(new_context)\<gtr\>
3667
3668     <item>write_chunk_r(chunklet, new_context,
3669
3670     <item> \ \ \ \ \ \ \ \ \ \ \ chunks[chunk_name, "part", part, "indent"]
3671     indent,
3672
3673     <item> \ \ \ \ \ \ \ \ \ \ \ chunks[chunk_name, "part", part, "tail"],
3674
3675     <item> \ \ \ \ \ \ \ \ \ \ \ chunk_path "\\n \ \ \ \ \ \ \ \ "
3676     chunk_name,
3677
3678     <item> \ \ \ \ \ \ \ \ \ \ \ call_chunk_args,
3679
3680     <item> \ \ \ \ \ \ \ \ \ \ \ s, r, new_src);
3681   </nf-chunk||>
3682
3683   Before we output a chunklet of lines, we first emit the file and line
3684   number if we have one, and if it is safe to do so.
3685
3686   Chunklets are generally broken up by includes, so the start of a chunklet
3687   is a good place to do this. Then we output each line of the chunklet.
3688
3689   When it is not safe, such as in the middle of a multi-line macro
3690   definition, <verbatim|lineno_suppressed> is set to true, and in such a case
3691   we note that we want to emit the line statement when it is next safe.
3692
3693   <\nf-chunk|write-chunklets>
3694     <item>max_frag = chunks[chunklet, "line"];
3695
3696     <item>for(frag = 1; frag \<less\>= max_frag; frag++) {
3697
3698     <item> \ =\<less\>\\chunkref{write-file-line}\<gtr\>
3699   </nf-chunk||>
3700
3701   We then extract the chunklet text and expand any arguments.
3702
3703   <\nf-chunk|write-chunklets>
3704     <item>
3705
3706     <item> \ text = chunks[chunklet, frag];
3707
3708     <item>\
3709
3710     <item> \ # check params
3711
3712     <item> \ text = expand_chunk_args(text, chunk_params, chunk_args);
3713   </nf-chunk||>
3714
3715   If the text is a single newline (which we keep separate - see
3716   <reference|lone-newline>) then we increment the line number. In the case
3717   where this is the last line of a chunk and it is not a top-level chunk we
3718   replace the newline with an empty string --- because the chunk that
3719   included this chunk will have the newline at the end of the line that
3720   included this chunk.
3721
3722   We also note by <verbatim|newline = 1> that we have started a new line, so
3723   that indentation can be managed with the following piece of text.
3724
3725   <\nf-chunk|write-chunklets>
3726     <item>
3727
3728     <item> if (text == "\\n") {
3729
3730     <item> \ \ \ lineno++;
3731
3732     <item> \ \ \ if (part == max_part && frag == max_frag &&
3733     length(chunk_path)) {
3734
3735     <item> \ \ \ \ \ text = "";
3736
3737     <item> \ \ \ \ \ break;
3738
3739     <item> \ \ \ } else {
3740
3741     <item> \ \ \ \ \ newline = 1;
3742
3743     <item> \ \ \ }
3744   </nf-chunk||>
3745
3746   If this text does not represent a newline, but we see that we are the first
3747   piece of text on a new line, then we prefix our text with the current
3748   indent.\
3749
3750   <\note>
3751     <verbatim|newline> is a global output-state variable, but the
3752     <verbatim|indent> is not.
3753   </note>
3754
3755   <\nf-chunk|write-chunklets>
3756     <item> \ } else if (length(text) \|\| length(tail)) {
3757
3758     <item> \ \ \ if (newline) text = indent text;
3759
3760     <item> \ \ \ newline = 0;
3761
3762     <item> \ }
3763
3764     <item>
3765   </nf-chunk||>
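  The interplay of <verbatim|newline> and <verbatim|indent> can be tried in isolation. The following is only a minimal sketch: <verbatim|emit_piece> is a hypothetical helper that does not exist in fangle, and it ignores <verbatim|tail> and the end-of-chunk case.

```awk
# Hypothetical helper, not part of fangle.
# 'newline' is global output state; 'indent' is passed in by the caller.
function emit_piece(text, indent) {
  if (text == "\n") {
    newline = 1                 # the next non-empty piece starts a line
    return text
  }
  if (length(text)) {
    if (newline) text = indent text
    newline = 0
  }
  return text
}

BEGIN {
  newline = 1
  ind = "    "
  # Build the output one piece at a time so the calls run in a fixed order.
  out = emit_piece("if (x)", ind)
  out = out emit_piece(" y();", ind)
  out = out emit_piece("\n", ind)
  out = out emit_piece("z();", ind)
  printf "%s\n", out
}
```

  Only the first piece on each line receives the indent; the second piece on the first line is appended unchanged.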
3766
3767   Tail will soon no longer be relevant once mode-detection is in place.
3768
3769   <\nf-chunk|write-chunklets>
3770     <item> \ text = text tail;
3771
3772     <item> \ mode_tracker(context, text);
3773
3774     <item> \ print untab(transform_escape(s, r, text, src));
3775   </nf-chunk||>
3776
3777   If a line ends in a backslash --- suggesting continuation --- then we
3778   suppress outputting file-line as it would probably break the continued
3779   lines.
3780
3781   <\nf-chunk|write-chunklets>
3782     <item> \ if (linenos) {
3783
3784     <item> \ \ \ lineno_suppressed = substr(lastline, length(lastline)) ==
3785     "\\\\";
3786
3787     <item> \ }
3788
3789     <item>}
3790   </nf-chunk||>
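  The continuation test is a one-character comparison against the end of the line. As a small sketch (the helper name <verbatim|ends_in_backslash> is ours and is not part of fangle):

```awk
# Hypothetical helper name; the comparison is the same one used above.
function ends_in_backslash(line) {
  return substr(line, length(line)) == "\\"
}

BEGIN {
  print ends_in_backslash("#define MAX(a,b) \\")   # 1
  print ends_in_backslash("int x = 1;")            # 0
  print ends_in_backslash("")                      # 0
}
```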
3791
3792   Of course there is no point in actually outputting the source filename and
3793   line number (file-line) if they don't say anything new! We only need to
3794   emit them if they aren't what is expected, or if we were not able to emit one
3795   when they changed.
3796
3797   <\nf-chunk|write-file-line>
3798     <item>if (newline && lineno_needed && ! lineno_suppressed) {
3799
3800     <item> \ filename = a_filename;
3801
3802     <item> \ lineno = a_lineno;
3803
3804     <item> \ print "#line " lineno " \\"" filename "\\"\\n"
3805
3806     <item> \ lineno_needed = 0;
3807
3808     <item>}
3809   </nf-chunk||>
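  The effect of the three flags is that a pending file-line is held back until it is safe to emit. A minimal sketch of that deferral, assuming a hypothetical <verbatim|maybe_file_line> that returns the directive instead of printing it (the file name is made up):

```awk
# Hypothetical sketch: mirrors the test in write-file-line,
# but returns the directive so that it can be inspected.
function maybe_file_line(file, line) {
  if (newline && lineno_needed && !lineno_suppressed) {
    lineno_needed = 0
    return "#line " line " \"" file "\"\n"
  }
  return ""
}

BEGIN {
  newline = 1
  lineno_needed = 1
  lineno_suppressed = 1                          # inside a continued macro
  printf "%s", maybe_file_line("demo.tm", 10)    # nothing yet, still pending
  lineno_suppressed = 0                          # the continuation has ended
  printf "%s", maybe_file_line("demo.tm", 12)    # #line 12 "demo.tm"
}
```

  Because <verbatim|lineno_needed> stays set while output is suppressed, the directive is not lost; it appears at the next safe opportunity.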
3810
3811   We check if a new file-line is needed by checking if the source line
3812   matches what we (or a compiler) would expect.
3813
3814   <\nf-chunk|check-source-jump>
3815     <item>if (linenos && (chunk_name SUBSEP "part" SUBSEP part SUBSEP
3816     "FILENAME" in chunks)) {
3817
3818     <item> \ a_filename = chunks[chunk_name, "part", part, "FILENAME"];
3819
3820     <item> \ a_lineno = chunks[chunk_name, "part", part, "LINENO"];
3821
3822     <item> \ if (a_filename != filename \|\| a_lineno != lineno) {
3823
3824     <item> \ \ \ lineno_needed++;
3825
3826     <item> \ }
3827
3828     <item>}
3829   </nf-chunk||>
3830
3831   <chapter|Storing Chunks>
3832
3833   Awk has pretty limited data structures, so we will use two main hashes.
3834   Uninterrupted sequences of a chunk will be stored in chunklets and the
3835   chunklets used in a chunk will be stored in <verbatim|chunks>.
3836
3837   <\nf-chunk|constants>
3838     <item>part_type_chunk=1;
3839
3840     <item>SUBSEP=",";
3841   </nf-chunk||>
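  Awk simulates multi-dimensional arrays by joining the subscripts with <verbatim|SUBSEP> into a single string key. The sketch below, which uses a made-up chunk name and file name, shows that the subscript-list form and the explicitly concatenated form (as used in check-source-jump) address the same element.

```awk
BEGIN {
  SUBSEP = ","
  chunks["hello", "part", 1, "FILENAME"] = "demo.tm"

  # The comma-separated subscript list is really one string key.
  key = "hello" SUBSEP "part" SUBSEP 1 SUBSEP "FILENAME"
  if (key in chunks) print "found " key " = " chunks[key]

  # Equivalent membership test using the multi-subscript form.
  if (("hello", "part", 1, "FILENAME") in chunks) print "found again"
}
```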
3842
3843   The params mentioned are not chunk parameters for parameterized chunks, as
3844   mentioned in <reference|Chunk Arguments>, but the lstlistings style
3845   parameters used in the <verbatim|\\Chunk> command<\footnote>
3846     The <verbatim|params> parameter is used to hold the parameters for
3847     parameterized chunks
3848   </footnote>.
3849
3850   <\nf-chunk|chunk-storage-functions>
3851     <item>function new_chunk(chunk_name, params,
3852
3853     <item> \ # local vars
3854
3855     <item> \ p, append )
3856
3857     <item>{
3858
3859     <item> \ # HACK WHILE WE CHANGE TO ( ) for PARAM CHUNKS
3860
3861     <item> \ gsub("\\\\(\\\\)$", "", chunk_name);
3862
3863     <item> \ if (! (chunk_name in chunk_names)) {
3864
3865     <item> \ \ \ if (debug) print "New chunk " chunk_name;
3866
3867     <item> \ \ \ chunk_names[chunk_name];
3868
3869     <item> \ \ \ for (p in params) {
3870
3871     <item> \ \ \ \ \ chunks[chunk_name, p] = params[p];
3872
3873     <item> \ \ \ \ \ if (debug) print "chunks[" chunk_name "," p "] = "
3874     params[p];
3875
3876     <item> \ \ \ }
3877
3878     <item> \ \ \ if ("append" in params) {
3879
3880     <item> \ \ \ \ \ append=params["append"];
3881
3882     <item> \ \ \ \ \ if (! (append in chunk_names)) {
3883
3884     <item> \ \ \ \ \ \ \ warning("Chunk " chunk_name " is appended to chunk "
3885     append " which is not defined yet");
3886
3887     <item> \ \ \ \ \ \ \ new_chunk(append);
3888
3889     <item> \ \ \ \ \ }
3890
3891     <item> \ \ \ \ \ chunk_include(append, chunk_name);
3892
3893     <item> \ \ \ \ \ chunk_line(append, ORS);
3894
3895     <item> \ \ \ }
3896
3897     <item> \ }
3898
3899     <item> \ active_chunk = chunk_name;
3900
3901     <item> \ prime_chunk(chunk_name);
3902
3903     <item>}
3904   </nf-chunk||>
3905
3906   <\nf-chunk|chunk-storage-functions>
3907     <item>
3908
3909     <item>function prime_chunk(chunk_name)
3910
3911     <item>{
3912
3913     <item> \ chunks[chunk_name, "part", ++chunks[chunk_name, "part"] ] = \\
3914
3915     <item> \ \ \ \ \ \ \ \ chunk_name SUBSEP "chunklet" SUBSEP ""
3916     ++chunks[chunk_name, "chunklet"];
3917
3918     <item> \ chunks[chunk_name, "part", chunks[chunk_name, "part"],
3919     "FILENAME"] = FILENAME;
3920
3921     <item> \ chunks[chunk_name, "part", chunks[chunk_name, "part"], "LINENO"]
3922     = FNR + 1;
3923
3924     <item>}
3925
3926     <item>
3927
3928     <item>function chunk_line(chunk_name, line){
3929
3930     <item> \ chunks[chunk_name, "chunklet", chunks[chunk_name, "chunklet"],
3931
3932     <item> \ \ \ \ \ \ \ \ ++chunks[chunk_name, "chunklet",
3933     chunks[chunk_name, "chunklet"], "line"] \ ] = line;
3934
3935     <item>}
3936
3937     <item>
3938   </nf-chunk||>
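  To see how these two functions lay data out in <verbatim|chunks>, here is a simplified sketch. It drops the FILENAME and LINENO bookkeeping and splits the increments into separate statements for readability, so it is an illustration and not the code fangle runs; the chunk name <verbatim|hello> is made up.

```awk
# Simplified stand-ins for prime_chunk and chunk_line (assumption:
# no FILENAME/LINENO tracking, increments written out step by step).
function prime_chunk(name,    part, chunklet) {
  part = ++chunks[name, "part"]
  chunklet = ++chunks[name, "chunklet"]
  # Each part of a chunk points at the chunklet that holds its lines.
  chunks[name, "part", part] = name SUBSEP "chunklet" SUBSEP chunklet
}

function chunk_line(name, line,    chunklet, n) {
  chunklet = chunks[name, "chunklet"]
  n = ++chunks[name, "chunklet", chunklet, "line"]
  chunks[name, "chunklet", chunklet, n] = line
}

BEGIN {
  SUBSEP = ","
  prime_chunk("hello")
  chunk_line("hello", "int main(void) {")
  chunk_line("hello", "\n")

  chunklet = chunks["hello", "part", 1]        # "hello,chunklet,1"
  # Read the lines back the way write-chunklets does.
  for (i = 1; i <= chunks[chunklet, "line"]; i++)
    printf "%s", chunks[chunklet, i]
}
```

  The value stored for a part is itself a valid subscript prefix, which is why <verbatim|chunks[chunklet, "line"]> and <verbatim|chunks[chunklet, frag]> work in write-chunklets.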
3939
3940   Chunk include represents a <em|chunkref> statement, and stores the
3941   requirement to include another chunk. The parameter indent represents the
3942   quantity of literal text characters that preceded this <em|chunkref>
3943   statement, and therefore by how much the additional lines of the included
3944   chunk should be indented.
3945
3946   <\nf-chunk|chunk-storage-functions>
3947     <item>function chunk_include(chunk_name, chunk_ref, indent, tail)
3948
3949     <item>{
3950
3951     <item> \ chunks[chunk_name, "part", ++chunks[chunk_name, "part"] ] =
3952     chunk_ref;
3953
3954     <item> \ chunks[chunk_name, "part", chunks[chunk_name, "part"], "type" ]
3955     = part_type_chunk;
3956
3957     <item> \ chunks[chunk_name, "part", chunks[chunk_name, "part"], "indent"
3958     ] = indent_string(indent);
3959
3960     <item> \ chunks[chunk_name, "part", chunks[chunk_name, "part"], "tail" ]
3961     = tail;
3962
3963     <item> \ prime_chunk(chunk_name);
3964
3965     <item>}
3966
3967     <item>
3968   </nf-chunk||>
3969
3970   The indent is calculated by indent_string, which may in future convert some
3971   spaces into tab characters. This function works by generating a printf
3972   padded format string, like <verbatim|%22s> for an indent of 22, and then
3973   printing an empty string using that format.
3974
3975   <\nf-chunk|chunk-storage-functions>
3976     <item>function indent_string(indent) {
3977
3978     <item> \ return sprintf("%" indent "s", "");
3979
3980     <item>}
3981   </nf-chunk||>
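  A quick way to convince ourselves of the format trick is to run the function on its own. The definition is repeated so that the block is self-contained; the brackets are only there to make the padding visible.

```awk
function indent_string(indent) {
  return sprintf("%" indent "s", "")
}

BEGIN {
  printf "[%s]\n", indent_string(4)     # prints "[    ]"
  # An empty indent yields the plain format "%s", hence no padding.
  printf "[%s]\n", indent_string("")    # prints "[]"
}
```

  The second call corresponds to the case where <verbatim|chunk_include> is called without an indent argument, as <verbatim|new_chunk> does for the append option.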
3982
3983   <chapter|getopt><label|cha:getopt>
3984
3985   I use Arnold Robbins' public domain getopt (1993 revision). This is probably
3986   the same one that is covered in chapter 12 of “Edition 3 of GAWK:
3987   Effective AWK Programming: A User's Guide for GNU Awk”, but as that is
3988   licensed under the GNU Free Documentation License, Version 1.3, which
3989   conflicts with the GPL3, I can't use it from there (or its accompanying
3990   explanations), so I do my best to explain how it works here.
3991
3992   The getopt.awk header is:
3993
3994   <\nf-chunk|getopt.awk-header>
3995     <item># getopt.awk --- do C library getopt(3) function in awk
3996
3997     <item>#
3998
3999     <item># Arnold Robbins, arnold@skeeve.com, Public Domain
4000
4001     <item>#
4002
4003     <item># Initial version: March, 1991
4004
4005     <item># Revised: May, 1993
4006
4007     <item>
4008   </nf-chunk||>
4009
4010   The provided explanation is:
4011
4012   <\nf-chunk|getopt.awk-notes>
4013     <item># External variables:
4014
4015     <item># \ \ \ Optind -- index in ARGV of first nonoption argument
4016
4017     <item># \ \ \ Optarg -- string value of argument to current option
4018
4019     <item># \ \ \ Opterr -- if nonzero, print our own diagnostic
4020
4021     <item># \ \ \ Optopt -- current option letter
4022
4023     <item>
4024
4025     <item># Returns:
4026
4027     <item># \ \ \ -1 \ \ \ \ at end of options
4028
4029     <item># \ \ \ ? \ \ \ \ \ for unrecognized option
4030
4031     <item># \ \ \ \<less\>c\<gtr\> \ \ \ a character representing the current
4032     option
4033
4034     <item>
4035
4036     <item># Private Data:
4037
4038     <item># \ \ \ _opti \ -- index in multi-flag option, e.g., -abc
4039
4040     <item>
4041   </nf-chunk||>
4042
4043   The function follows. The final two parameters, <verbatim|thisopt> and
4044   <verbatim|i>, are never passed by callers; they serve as local variables, as
4045   indicated by the multiple spaces preceding them. Awk doesn't care about the
4046   spacing; it is a convention to help us humans.
4047
4048   <\nf-chunk|getopt.awk-getopt()>
4049     <item>function getopt(argc, argv, options, \ \ \ thisopt, i)
4050
4051     <item>{
4052
4053     <item> \ \ \ if (length(options) == 0) \ \ \ # no options given
4054
4055     <item> \ \ \ \ \ \ \ return -1
4056
4057     <item> \ \ \ if (argv[Optind] == "--") { \ # all done
4058
4059     <item> \ \ \ \ \ \ \ Optind++
4060
4061     <item> \ \ \ \ \ \ \ _opti = 0
4062
4063     <item> \ \ \ \ \ \ \ return -1
4064
4065     <item> \ \ \ } else if (argv[Optind] !~ /^-[^: \\t\\n\\f\\r\\v\\b]/) {
4066
4067     <item> \ \ \ \ \ \ \ _opti = 0
4068
4069     <item> \ \ \ \ \ \ \ return -1
4070
4071     <item> \ \ \ }
4072
4073     <item> \ \ \ if (_opti == 0)
4074
4075     <item> \ \ \ \ \ \ \ _opti = 2
4076
4077     <item> \ \ \ thisopt = substr(argv[Optind], _opti, 1)
4078
4079     <item> \ \ \ Optopt = thisopt
4080
4081     <item> \ \ \ i = index(options, thisopt)
4082
4083     <item> \ \ \ if (i == 0) {
4084
4085     <item> \ \ \ \ \ \ \ if (Opterr)
4086
4087     <item> \ \ \ \ \ \ \ \ \ \ \ printf("%c -- invalid option\\n",
4088
4089     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ thisopt)
4090     \<gtr\> "/dev/stderr"
4091
4092     <item> \ \ \ \ \ \ \ if (_opti \<gtr\>= length(argv[Optind])) {
4093
4094     <item> \ \ \ \ \ \ \ \ \ \ \ Optind++
4095
4096     <item> \ \ \ \ \ \ \ \ \ \ \ _opti = 0
4097
4098     <item> \ \ \ \ \ \ \ } else
4099
4100     <item> \ \ \ \ \ \ \ \ \ \ \ _opti++
4101
4102     <item> \ \ \ \ \ \ \ return "?"
4103
4104     <item> \ \ \ }
4105   </nf-chunk||>
4106
4107   At this point, the option has been found and we need to know if it takes
4108   any arguments.
4109
4110   <\nf-chunk|getopt.awk-getopt()>
4111     <item> \ \ \ if (substr(options, i + 1, 1) == ":") {
4112
4113     <item> \ \ \ \ \ \ \ # get option argument
4114
4115     <item> \ \ \ \ \ \ \ if (length(substr(argv[Optind], _opti + 1)) \<gtr\>
4116     0)
4117
4118     <item> \ \ \ \ \ \ \ \ \ \ \ Optarg = substr(argv[Optind], _opti + 1)
4119
4120     <item> \ \ \ \ \ \ \ else
4121
4122     <item> \ \ \ \ \ \ \ \ \ \ \ Optarg = argv[++Optind]
4123
4124     <item> \ \ \ \ \ \ \ _opti = 0
4125
4126     <item> \ \ \ } else
4127
4128     <item> \ \ \ \ \ \ \ Optarg = ""
4129
4130     <item> \ \ \ if (_opti == 0 \|\| _opti \<gtr\>= length(argv[Optind])) {
4131
4132     <item> \ \ \ \ \ \ \ Optind++
4133
4134     <item> \ \ \ \ \ \ \ _opti = 0
4135
4136     <item> \ \ \ } else
4137
4138     <item> \ \ \ \ \ \ \ _opti++
4139
4140     <item> \ \ \ return thisopt
4141
4142     <item>}
4143   </nf-chunk||>
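  The local-variable convention used in the signature of <verbatim|getopt> can be seen in isolation in the following sketch; <verbatim|count_up> is a hypothetical function, unrelated to getopt.

```awk
# Extra parameters after the wide gap are never passed by callers,
# so they start out empty and act as locals.
function count_up(n,    i, total) {
  for (i = 1; i <= n; i++) total += i
  return total
}

BEGIN {
  i = 99                 # a global that happens to share the name
  print count_up(4)      # 10
  print i                # 99, untouched by the function's own i
}
```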
4144
4145   A test program is built in, too:
4146
4147   <\nf-chunk|getopt.awk-begin>
4148     <item>BEGIN {
4149
4150     <item> \ \ \ Opterr = 1 \ \ \ # default is to diagnose
4151
4152     <item> \ \ \ Optind = 1 \ \ \ # skip ARGV[0]
4153
4154     <item> \ \ \ # test program
4155
4156     <item> \ \ \ if (_getopt_test) {
4157
4158     <item> \ \ \ \ \ \ \ while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
4159
4160     <item> \ \ \ \ \ \ \ \ \ \ \ printf("c = \<less\>%c\<gtr\>, optarg =
4161     \<less\>%s\<gtr\>\\n",
4162
4163     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ _go_c,
4164     Optarg)
4165
4166     <item> \ \ \ \ \ \ \ printf("non-option arguments:\\n")
4167
4168     <item> \ \ \ \ \ \ \ for (; Optind \<less\> ARGC; Optind++)
4169
4170     <item> \ \ \ \ \ \ \ \ \ \ \ printf("\\tARGV[%d] = \<less\>%s\<gtr\>\\n",
4171
4172     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Optind,
4173     ARGV[Optind])
4174
4175     <item> \ \ \ }
4176
4177     <item>}
4178   </nf-chunk||>
4179
4180   The entire getopt.awk is made out of these chunks, in order:
4181
4182   <\nf-chunk|getopt.awk>
4183     <item>=\<less\>\\chunkref{getopt.awk-header}\<gtr\>
4184
4185     <item>
4186
4187     <item>=\<less\>\\chunkref{getopt.awk-notes}\<gtr\>
4188
4189     <item>=\<less\>\\chunkref{getopt.awk-getopt()}\<gtr\>
4190
4191     <item>=\<less\>\\chunkref{getopt.awk-begin}\<gtr\>
4192   </nf-chunk||>
4193
4194   Although we only want the header and function:
4195
4196   <\nf-chunk|getopt>
4197     <item># try: locate getopt.awk for the full original file
4198
4199     <item># as part of your standard awk installation
4200
4201     <item>=\<less\>\\chunkref{getopt.awk-header}\<gtr\>
4202
4203     <item>
4204
4205     <item>=\<less\>\\chunkref{getopt.awk-getopt()}\<gtr\>
4206   </nf-chunk||>
4207
4208   <chapter|Fangle LaTeX source code><label|latex-source>
4209
4210   <section|fangle module>
4211
4212   Here we define a <LyX> <verbatim|.module> file that makes it convenient to
4213   use <LyX> for writing such literate programs.
4214
4215   This file <verbatim|./fangle.module> can be installed in your personal
4216   <verbatim|.lyx/layouts> folder. Then choose Tools-\<gtr\>Reconfigure so that
4217   <LyX> notices it. It adds a new format Chunk, which should precede every
4218   listing and contain the chunk name.
4219
4220   <\nf-chunk|./fangle.module>
4221     <item>#\\DeclareLyXModule{Fangle Literate Listings}
4222
4223     <item>#DescriptionBegin
4224
4225     <item># \ Fangle literate listings allow one to write
4226
4227     <item># \ \ literate programs after the fashion of noweb, but without
4228     having
4229
4230     <item># \ \ to use noweave to generate the documentation. Instead the
4231     listings
4232
4233     <item># \ \ package is extended in conjunction with the noweb package to
4234     implement
4235
4236     <item># \ \ the code formatting directly as latex.
4237
4238     <item># \ The fangle awk script
4239
4240     <item>#DescriptionEnd
4241
4242     <item>
4243
4244     <item>=\<less\>\\chunkref{gpl3-copyright.hashed}\<gtr\>
4245
4246     <item>
4247
4248     <item>Format 11
4249
4250     <item>
4251
4252     <item>AddToPreamble
4253
4254     <item>=\<less\>\\chunkref{./fangle.sty}\<gtr\>
4255
4256     <item>EndPreamble
4257
4258     <item>
4259
4260     <item>=\<less\>\\chunkref{chunkstyle}\<gtr\>
4261
4262     <item>
4263
4264     <item>=\<less\>\\chunkref{chunkref}\<gtr\>
4265   </nf-chunk|lyx-module|>
4266
4267   Because <LyX> modules are not yet a language supported by fangle or
4268   lstlistings, we resort to this fake awk chunk below in order to have each
4269   line of the GPL3 license commence with a #.
4270
4271   <\nf-chunk|gpl3-copyright.hashed>
4272     <item>#=\<less\>\\chunkref{gpl3-copyright}\<gtr\>
4273
4274     <item>
4275   </nf-chunk|awk|>
4276
4277   <subsection|The Chunk style>
4278
4279   The purpose of the <name|chunk> style is to make it easier for <LyX> users
4280   to provide the name to <verbatim|lstlistings>. Normally this requires
4281   right-clicking on the listing, choosing settings, advanced, and then typing
4282   <verbatim|name=chunk-name>. This has the further disadvantage that the name
4283   (and other options) are not generally visible during document editing.
4284
4285   The chunk style is defined as a <LaTeX> command, so that all text on the
4286   same line is passed to the <verbatim|LaTeX> command <verbatim|Chunk>. This
4287   makes it easy to parse using <verbatim|fangle>, and easy to pass these
4288   options on to the listings package. The first word in a chunk section
4289   should be the chunk name, and will have <verbatim|name=> prepended to it.
4290   Any other words are accepted arguments to <verbatim|lstset>.
4291
4292   We set PassThru to 1 because the user is actually entering raw latex.
4293
4294   <\nf-chunk|chunkstyle>
4295     <item>Style Chunk
4296
4297     <item> \ LatexType \ \ \ \ \ \ \ \ \ \ \ \ Command
4298
4299     <item> \ LatexName \ \ \ \ \ \ \ \ \ \ \ \ Chunk
4300
4301     <item> \ Margin \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ First_Dynamic
4302
4303     <item> \ LeftMargin \ \ \ \ \ \ \ \ \ \ \ Chunk:xxx
4304
4305     <item> \ LabelSep \ \ \ \ \ \ \ \ \ \ \ \ \ xx
4306
4307     <item> \ LabelType \ \ \ \ \ \ \ \ \ \ \ \ Static
4308
4309     <item> \ LabelString \ \ \ \ \ \ \ \ \ \ "Chunk:"
4310
4311     <item> \ Align \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Left
4312
4313     <item> \ PassThru \ \ \ \ \ \ \ \ \ \ \ \ \ 1
4314
4315     <item>
4316   </nf-chunk||>
4317
4318   To make the label very visible we choose a larger font coloured red.
4319
4320   <\nf-chunk|chunkstyle>
4321     <item> \ LabelFont
4322
4323     <item> \ \ \ Family \ \ \ \ \ \ \ \ \ \ \ \ \ Sans
4324
4325     <item> \ \ \ Size \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Large
4326
4327     <item> \ \ \ Series \ \ \ \ \ \ \ \ \ \ \ \ \ Bold
4328
4329     <item> \ \ \ Shape \ \ \ \ \ \ \ \ \ \ \ \ \ \ Italic
4330
4331     <item> \ \ \ Color \ \ \ \ \ \ \ \ \ \ \ \ \ \ red
4332
4333     <item> \ EndFont
4334
4335     <item>End
4336   </nf-chunk||>
4337
4338   <subsection|The chunkref style>
4339
4340   We also define the Chunkref style which can be used to express cross
4341   references to chunks.
4342
4343   <\nf-chunk|chunkref>
4344     <item>InsetLayout Chunkref
4345
4346     <item> \ LyxType \ \ \ \ \ \ \ \ \ \ \ \ \ \ charstyle
4347
4348     <item> \ LatexType \ \ \ \ \ \ \ \ \ \ \ \ Command
4349
4350     <item> \ LatexName \ \ \ \ \ \ \ \ \ \ \ \ chunkref
4351
4352     <item> \ PassThru \ \ \ \ \ \ \ \ \ \ \ \ \ 1
4353
4354     <item> \ LabelFont \ \ \ \ \ \ \ \ \ \ \ \
4355
4356     <item> \ \ \ Shape \ \ \ \ \ \ \ \ \ \ \ \ \ \ Italic
4357
4358     <item> \ \ \ Color \ \ \ \ \ \ \ \ \ \ \ \ \ \ red
4359
4360     <item> \ EndFont
4361
4362     <item>End
4363   </nf-chunk||>
4364
4365   <section|Latex Macros><label|sec:Latex-Macros>
4366
4367   We require the listings, noweb and xargs packages. As noweb defines its
4368   own <verbatim|\\code> environment, we re-define the one that the <LyX>
4369   logical markup module expects here.
4370
4371   <\nf-chunk|./fangle.sty>
4372     <item>\\usepackage{listings}%
4373
4374     <item>\\usepackage{noweb}%
4375
4376     <item>\\usepackage{xargs}%
4377
4378     <item>\\renewcommand{\\code}[1]{\\texttt{#1}}%
4379   </nf-chunk|tex|>
4380
4381   We also define a <verbatim|CChunk> macro, for use as:
4382   <verbatim|\\begin{CChunk}> which will need renaming to
4383   <verbatim|\\begin{Chunk}> when I can do this without clashing with
4384   <verbatim|\\Chunk>.
4385
4386   <\nf-chunk|./fangle.sty>
4387     <item>\\lstnewenvironment{CChunk}{\\relax}{\\relax}%
4388   </nf-chunk||>
4389
4390   We also define a suitable <verbatim|\\lstset> of parameters that suit the
4391   literate programming style after the fashion of <name|noweave>.
4392
4393   <\nf-chunk|./fangle.sty>
4394     <item>\\lstset{numbers=left, stepnumber=5, numbersep=5pt,
4395
4396     <item> \ \ \ \ \ \ \ breaklines=false,basicstyle=\\ttfamily,
4397
4398     <item> \ \ \ \ \ \ \ numberstyle=\\tiny, language=C}%
4399   </nf-chunk||>
4400
4401   We also define a notangle-like mechanism for escaping to <LaTeX> from the
4402   listing, by which we can refer to other listings. We declare the
4403   <verbatim|=\<less\>...\<gtr\>> sequence to contain <LaTeX> code, and
4404   include another chunk like this: <verbatim|=\<less\>\\chunkref{chunkname}\<gtr\>>.
4405   However, because <verbatim|=\<less\>...\<gtr\>> is already defined to
4406   contain <LaTeX> code for this document (this is a fangle document after
4407   all), the code fragment below effectively contains the <LaTeX> code
4408   <verbatim|}{>. To avoid problems with document generation, I had to declare
4409   the lstlistings property <verbatim|escapeinside={}> for this listing only,
4410   which in <LyX> was done by right-clicking the listings inset and choosing
4411   settings-\<gtr\>advanced. Without that override, <verbatim|=\<less\>> would
4412   not be taken literally in a listing where the escape sequence is already
4413   defined as shown, so the representation itself has to be escaped somehow.
4414
4415   <\nf-chunk|./fangle.sty>
4416     <item>\\lstset{escapeinside={=\<less\>}{\<gtr\>}}%
4417   </nf-chunk||>
4418
4419   Although our macros will contain the <verbatim|@> symbol, they will be
4420   included in a <verbatim|\\makeatletter> section by <LyX>; however we keep
4421   the commented out <verbatim|\\makeatletter> as a reminder. The listings
4422   package likes to centre the titles, but noweb titles are specially
4423   formatted and must be left aligned. The simplest way to do this turned out
4424   to be by removing the definition of <verbatim|\\lst@maketitle>. This may
4425   interact badly if other listings want a regular title or caption. We
4426   remember the old maketitle in case we need it.
4427
4428   <\nf-chunk|./fangle.sty>
4429     <item>%\\makeatletter
4430
4431     <item>%somehow re-defining maketitle gives us a left-aligned title
4432
4433     <item>%which is exactly what our specially formatted title needs!
4434
4435     <item>\\global\\let\\fangle@lst@maketitle\\lst@maketitle%
4436
4437     <item>\\global\\def\\lst@maketitle{}%
4438   </nf-chunk||>
4439
4440   <subsection|The chunk command><label|sub:The-chunk-command>
4441
4442   Our chunk command accepts one argument, and calls <verbatim|\\lstset>.
4443   Although <verbatim|\\lstset> will note the name, this is erased when the
4444   next <verbatim|\\lstlisting> starts, so we make a note of this in
4445   <verbatim|\\lst@chunkname> and restore it in the lstlistings PreSet hook.
4446
4447   <\nf-chunk|./fangle.sty>
4448     <item>\\def\\Chunk#1{%
4449
4450     <item> \ \\lstset{title={\\fanglecaption},name=#1}%
4451
4452     <item> \ \\global\\edef\\lst@chunkname{\\lst@intname}%
4453
4454     <item>}%
4455
4456     <item>\\def\\lst@chunkname{\\empty}%
4457   </nf-chunk||>
4458
4459   <subsubsection|Chunk parameters>
4460
4461   Fangle permits parameterized chunks, and requires the parameters to be
4462   specified as listings options. The fangle script uses this, and although we
4463   don't do anything with these in the <LaTeX> code right now, we need to stop
4464   the listings package complaining.
4465
4466   <\nf-chunk|./fangle.sty>
4467     <item>\\lst@Key{params}\\relax{\\def\\fangle@chunk@params{#1}}%
4468   </nf-chunk||>
4469
4470   As it is common to define a chunk which then needs appending to another
4471   chunk, and annoying to have to declare a single line chunk to manage the
4472   include, we support an append= option.
4473
4474   <\nf-chunk|./fangle.sty>
4475     <item>\\lst@Key{append}\\relax{\\def\\fangle@chunk@append{#1}}%
4476   </nf-chunk||>
4477
4478   <subsection|The noweb styled caption>
4479
4480   We define a public macro <verbatim|\\fanglecaption> which can be set as a
4481   regular title. By means of <verbatim|\\protect>, it expands to
4482   <verbatim|\\fangle@caption> at the appropriate time when the caption is
4483   emitted.
4484
4485   <nf-chunk|./fangle.sty|\\def\\fanglecaption{\\protect\\fangle@caption}%||>
4486
4487   <\big-figure>
4488     22c <math|\<langle\>>some-chunk 19b<math|\<rangle\>><math|\<equiv\>>+
4489     \ \ <math|\<vartriangleleft\>>22b 24d<math|\<vartriangleright\>>
4490
4491     \;
4492
4493     In this example, the current chunk is 22c, and therefore the third chunk
4494     on page 22.
4495
4496     Its name is some-chunk.\
4497
4498     The first chunk with this name (19b) occurs as the second chunk on page
4499     19.
4500
4501     The previous chunk (22b) with the same name is the second chunk on page
4502     22.
4503
4504     The next chunk (24d) is the fourth chunk on page 24.
4505   </big-figure|Noweb Heading<label|noweb heading>>
4506
4507   The general noweb output format compactly identifies the current chunk and
4508   references the first chunk, as well as the previous and next chunks that
4509   have the same name.
4510
4511   This means that we need to keep a counter for each chunk-name, that we use
4512   to count chunks of the same name.
4513
4514   <subsection|The chunk counter>
4515
4516   It would be natural to have a counter for each chunk name, but TeX would
4517   soon run out of counters<\footnote>
4518     ...soon did run out of counters and so I had to re-write the LaTeX macros
4519     to share a counter as described here.
4520   </footnote>, so we have one counter which we save at the end of a chunk and
4521   restore at the beginning of a chunk.
4522
4523   <\nf-chunk|./fangle.sty>
4524     <item>\\newcounter{fangle@chunkcounter}%
4525   </nf-chunk||>
4526
4527   We construct the name of this variable to store the counter to be the text
4528   <verbatim|lst-chunk-> prefixed onto the chunk's own name, and store it in
4529   <verbatim|\\chunkcount>.\
4530
4531   We save the counter like this:
4532
4533   <nf-chunk|save-counter|\\global\\expandafter\\edef\\csname
4534   \\chunkcount\\endcsname{\\arabic{fangle@chunkcounter}}%||>
4535
4536   and restore the counter like this:
4537
4538   <nf-chunk|restore-counter|\\setcounter{fangle@chunkcounter}{\\csname
4539   \\chunkcount\\endcsname}%||>
4540
4541   If there does not already exist a variable whose name is stored in
4542   <verbatim|\\chunkcount>, then we know we are the first chunk with this
4543   name, and then define a counter.\
4544
4545   Although chunks of the same name share a common counter, they must still be
4546   distinguished. We use the internal name of the listing, suffixed by the
4547   counter value. So the first chunk might be <verbatim|something-1> and the
4548   second chunk be <verbatim|something-2>, etc.
4549
4550   We also calculate the name of the previous chunk if we can (before we
4551   increment the chunk counter). If this is the first chunk of that name, then
4552   <verbatim|\\prevchunkname> is set to <verbatim|\\relax> which the noweb
4553   package will interpret as not existing.
4554
4555   <\nf-chunk|./fangle.sty>
4556     <item>\\def\\fangle@caption{%
4557
4558     <item> \ \\edef\\chunkcount{lst-chunk-\\lst@intname}%
4559
4560     <item> \ \\@ifundefined{\\chunkcount}{%
4561
4562     <item> \ \ \ \\expandafter\\gdef\\csname \\chunkcount\\endcsname{0}%
4563
4564     <item> \ \ \ \\setcounter{fangle@chunkcounter}{\\csname
4565     \\chunkcount\\endcsname}%
4566
4567     <item> \ \ \ \\let\\prevchunkname\\relax%
4568
4569     <item> \ }{%
4570
4571     <item> \ \ \ \\setcounter{fangle@chunkcounter}{\\csname
4572     \\chunkcount\\endcsname}%
4573
4574     <item> \ \ \ \\edef\\prevchunkname{\\lst@intname-\\arabic{fangle@chunkcounter}}%
4575
4576     <item> \ }%
4577   </nf-chunk||>
4578
4579   After incrementing the chunk counter, we then define the name of this
4580   chunk, as well as the name of the first chunk.
4581
4582   <\nf-chunk|./fangle.sty>
4583     <item> \ \\addtocounter{fangle@chunkcounter}{1}%
4584
4585     <item> \ \\global\\expandafter\\edef\\csname
4586     \\chunkcount\\endcsname{\\arabic{fangle@chunkcounter}}%
4587
4588     <item> \ \\edef\\chunkname{\\lst@intname-\\arabic{fangle@chunkcounter}}%
4589
4590     <item> \ \\edef\\firstchunkname{\\lst@intname-1}%
4591   </nf-chunk||>
4592
4593   We now need to calculate the name of the next chunk. We do this by
4594   temporarily skipping the counter on by one; however there may not actually
4595   be another chunk with this name! We detect this by also defining a label
4596   for each chunk based on the chunkname. If there is a next chunkname then it
4597   will define a label with that name. As labels are persistent, we can at
4598   least tell the second time <LaTeX> is run. If we don't find such a defined
4599   label then we define <verbatim|\\nextchunkname> to <verbatim|\\relax>.
4600
4601   <\nf-chunk|./fangle.sty>
4602     <item> \ \\addtocounter{fangle@chunkcounter}{1}%
4603
4604     <item> \ \\edef\\nextchunkname{\\lst@intname-\\arabic{fangle@chunkcounter}}%
4605
4606     <item> \ \\@ifundefined{r@label-\\nextchunkname}{\\let\\nextchunkname\\relax}{}%
4607   </nf-chunk||>
4608
4609   The noweb package requires that we define a <verbatim|\\sublabel> for every
4610   chunk, with a unique name, which is then used to print out its navigation
4611   hints.
4612
4613   We also define a regular label for this chunk, as was mentioned above when
4614   we calculated <verbatim|\\nextchunkname>. This requires <LaTeX> to be run
4615   at least twice after new chunk sections are added, but noweb required
4616   that anyway.
4617
4618   <\nf-chunk|./fangle.sty>
4619     <item> \ \\sublabel{\\chunkname}%
4620
4621     <item>% define this label for every chunk instance, so we
4622
4623     <item>% can tell when we are the last chunk of this name
4624
4625     <item> \ \\label{label-\\chunkname}%
4626   </nf-chunk||>
4627
4628   We also try to add the chunk to the list of listings, but I'm afraid we
4629   don't do very well. We want each chunk name listed once, with all of its
4630   references.
4631
4632   <\nf-chunk|./fangle.sty>
4633     <item> \ \\addcontentsline{lol}{lstlisting}{\\lst@name~[\\protect\\subpageref{\\chunkname}]}%
4634   </nf-chunk||>
4635
4636   We then call the noweb output macros in the same way that noweave generates
4637   them, except that we don't need to call <verbatim|\\nwstartdeflinemarkup>
4638   or <verbatim|\\nwenddeflinemarkup> <emdash> and if we do, it messes up the
4639   output somewhat.
4640
4641   <\nf-chunk|./fangle.sty>
4642     <item> \ \\nwmargintag{%
4643
4644     <item> \ \ \ {%
4645
4646     <item> \ \ \ \ \ \\nwtagstyle{}%
4647
4648     <item> \ \ \ \ \ \\subpageref{\\chunkname}%
4649
4650     <item> \ \ \ }%
4651
4652     <item> \ }%
4653
4654     <item>%
4655
4656     <item> \ \\moddef{%
4657
4658     <item> \ \ \ {\\lst@name}%
4659
4660     <item> \ \ \ {%
4661
4662     <item> \ \ \ \ \ \\nwtagstyle{}\\/%
4663
4664     <item> \ \ \ \ \ \\@ifundefined{fangle@chunk@params}{}{%
4665
4666     <item> \ \ \ \ \ \ \ (\\fangle@chunk@params)%
4667
4668     <item> \ \ \ \ \ }%
4669
4670     <item> \ \ \ \ \ [\\csname \\chunkcount\\endcsname]~%
4671
4672     <item> \ \ \ \ \ \\subpageref{\\firstchunkname}%
4673
4674     <item> \ \ \ }%
4675
4676     <item> \ \ \ \\@ifundefined{fangle@chunk@append}{}{%
4677
4678     <item> \ \ \ \\ifx{}\\fangle@chunk@append{x}\\else%
4679
4680     <item> \ \ \ \ \ \ \ ,~add~to~\\fangle@chunk@append%
4681
4682     <item> \ \ \ \\fi%
4683
4684     <item> \ \ \ }%
4685
4686     <item>\\global\\def\\fangle@chunk@append{}%
4687
4688     <item>\\lstset{append=x}%
4689
4690     <item> \ }%
4691
4692     <item>%
4693
4694     <item> \ \\ifx\\relax\\prevchunkname\\endmoddef\\else\\plusendmoddef\\fi%
4695
4696     <item>% \ \\nwstartdeflinemarkup%
4697
4698     <item> \ \\nwprevnextdefs{\\prevchunkname}{\\nextchunkname}%
4699
4700     <item>% \ \\nwenddeflinemarkup%
4701
4702     <item>}%
4703   </nf-chunk||>
4704
4705   Originally this was developed as a <verbatim|listings> aspect, in the Init
4706   hook, but it was found easier to affect the title without using a hook
4707   <emdash> <verbatim|\\lst@AddToHookExe{PreSet}> is still required to set the
4708   listings name to the name passed to the <verbatim|\\Chunk> command, though.
4709
4710   <\nf-chunk|./fangle.sty>
4711     <item>%\\lst@BeginAspect{fangle}
4712
4713     <item>%\\lst@Key{fangle}{true}[t]{\\lstKV@SetIf{#1}{true}}
4714
4715     <item>\\lst@AddToHookExe{PreSet}{\\global\\let\\lst@intname\\lst@chunkname}
4716
4717     <item>\\lst@AddToHook{Init}{}%\\fangle@caption}
4718
4719     <item>%\\lst@EndAspect
4720   </nf-chunk||>
4721
4722   <subsection|Cross references>
4723
4724   We define the <verbatim|\\chunkref> command which makes it easy to generate
4725   visual references to different code chunks, e.g.
4726
4727   <block|<tformat|<table|<row|<cell|Macro>|<cell|Appearance>>|<row|<cell|<verbatim|\\chunkref{preamble}>>|<cell|>>|<row|<cell|<verbatim|\\chunkref[3]{preamble}>>|<cell|>>|<row|<cell|<verbatim|\\chunkref{preamble}(arg1,
4728   arg2)>>|<cell|>>>>>
4729
4730   Chunkref can also be used within a code chunk to include another code
4731   chunk. The optional third parameter to chunkref is a comma-separated list
4732   of arguments, which replace the parameters defined by the included chunk.
4733
4734   <\note>
4735     Darn it, if I have: <verbatim|=\<less\>\\chunkref{new-mode-tracker}[{chunks[chunk_name,
4736     "language"]},{mode}]\<gtr\>> the inner braces (inside [ ]) cause _ to
4737     signify subscript even though we have <verbatim|lst@ReplaceIn>
4738   </note>
4739
4740   <\nf-chunk|./fangle.sty>
4741     <item>\\def\\chunkref@args#1,{%
4742
4743     <item> \ \\def\\arg{#1}%
4744
4745     <item> \ \\lst@ReplaceIn\\arg\\lst@filenamerpl%
4746
4747     <item> \ \\arg%
4748
4749     <item> \ \\@ifnextchar){\\relax}{, \\chunkref@args}%
4750
4751     <item>}%
4752
4753     <item>\\newcommand\\chunkref[2][0]{%
4754
4755     <item> \ \\@ifnextchar({\\chunkref@i{#1}{#2}}{\\chunkref@i{#1}{#2}()}%
4756
4757     <item>}%
4758
4759     <item>\\def\\chunkref@i#1#2(#3){%
4760
4761     <item> \ \\def\\zero{0}%
4762
4763     <item> \ \\def\\chunk{#2}%
4764
4765     <item> \ \\def\\chunkno{#1}%
4766
4767     <item> \ \\def\\chunkargs{#3}%
4768
4769     <item> \ \\ifx\\chunkno\\zero%
4770
4771     <item> \ \ \ \\def\\chunkname{#2-1}%
4772
4773     <item> \ \\else%
4774
4775     <item> \ \ \ \\def\\chunkname{#2-\\chunkno}%
4776
4777     <item> \ \\fi%
4778
4779     <item> \ \\let\\lst@arg\\chunk%
4780
4781     <item> \ \\lst@ReplaceIn\\chunk\\lst@filenamerpl%
4782
4783     <item> \ \\LA{%\\moddef{%
4784
4785     <item> \ \ \ {\\chunk}%
4786
4787     <item> \ \ \ {%
4788
4789     <item> \ \ \ \ \ \\nwtagstyle{}\\/%
4790
4791     <item> \ \ \ \ \ \\ifx\\chunkno\\zero%
4792
4793     <item> \ \ \ \ \ \\else%
4794
4795     <item> \ \ \ \ \ [\\chunkno]%
4796
4797     <item> \ \ \ \ \ \\fi%
4798
4799     <item> \ \ \ \ \ \\ifx\\chunkargs\\empty%
4800
4801     <item> \ \ \ \ \ \\else%
4802
4803     <item> \ \ \ \ \ \ \ (\\chunkref@args #3,)%
4804
4805     <item> \ \ \ \ \ \\fi%
4806
4807     <item> \ \ \ \ \ ~\\subpageref{\\chunkname}%
4808
4809     <item> \ \ \ }%
4810
4811     <item> \ }%
4812
4813     <item> \ \\RA%\\endmoddef%
4814
4815     <item>}%
4816   </nf-chunk||>
4817
4818   <subsection|The end>
4819
4820   <\nf-chunk|./fangle.sty>
4821     <item>%
4822
4823     <item>%\\makeatother
4824   </nf-chunk||>
4825
4826   <chapter|Extracting fangle>
4827
4828   <section|Extracting from <LyX>>
4829
4830   To extract from <LyX>, you will need to configure <LyX> as explained in
4831   section <reference|Configuring-the-build>.
4832
4833   <label|lyx-build-script>And this lyx-build scrap will extract fangle for
4834   me.
4835
4836   <\nf-chunk|lyx-build>
4837     <item>#! /bin/sh
4838
4839     <item>set -x
4840
4841     <item>
4842
4843     <item>=\<less\>\\chunkref{lyx-build-helper}\<gtr\>
4844
4845     <item>cd $PROJECT_DIR \|\| exit 1
4846
4847     <item>
4848
4849     <item>/usr/local/bin/fangle -R./fangle $TEX_SRC \<gtr\> ./fangle
4850
4851     <item>/usr/local/bin/fangle -R./fangle.module $TEX_SRC \<gtr\>
4852     ./fangle.module
4853
4854     <item>
4855
4856     <item>=\<less\>\\chunkref{test:helpers}\<gtr\>
4857
4858     <item>export FANGLE=./fangle
4859
4860     <item>export TMP=${TMP:-/tmp}
4861
4862     <item>=\<less\>\\chunkref{test:run-tests}\<gtr\>
4863
4864     <item># Now check that we can extract a fangle that also passes the
4865     tests!
4866
4867     <item>$FANGLE -R./fangle $TEX_SRC \<gtr\> ./new-fangle
4868
4869     <item>export FANGLE=./new-fangle
4870
4871     <item>=\<less\>\\chunkref{test:run-tests}\<gtr\>
4872   </nf-chunk|sh|>
4873
4874   <\nf-chunk|test:run-tests>
4875     <item># run tests
4876
4877     <item>$FANGLE -Rpca-test.awk $TEX_SRC \| awk -f - \|\| exit 1
4878
4879     <item>=\<less\>\\chunkref{test:cromulence}\<gtr\>
4880
4881     <item>=\<less\>\\chunkref{test:escapes}\<gtr\>
4882
4883     <item>=\<less\>\\chunkref{test:chunk-params}\<gtr\>
4884   </nf-chunk|sh|>
4885
4886   The lyx-build-helper chunk included above sets the project and source paths:
4887
4888   <\nf-chunk|lyx-build-helper>
4889     <item>PROJECT_DIR="$LYX_r"
4890
4891     <item>LYX_SRC="$PROJECT_DIR/${LYX_i%.tex}.lyx"
4892
4893     <item>TEX_DIR="$LYX_p"
4894
4895     <item>TEX_SRC="$TEX_DIR/$LYX_i"
4896   </nf-chunk|sh|>
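
  As a rough illustration of what the helper computes, the sketch below runs
  the same assignments with made-up values for <verbatim|LYX_r>,
  <verbatim|LYX_i> and <verbatim|LYX_p>. The three sample values are
  assumptions chosen for the example only; a real build would take them from
  its environment.

```shell
#!/bin/sh
# Hypothetical sample values, for illustration only.
LYX_r="/home/user/project"   # assumed project directory
LYX_i="fangle.tex"           # assumed name of the exported TeX file
LYX_p="/tmp/lyx_tmpdir"      # assumed directory holding the exported file

# Same assignments as the lyx-build-helper chunk.
PROJECT_DIR="$LYX_r"
LYX_SRC="$PROJECT_DIR/${LYX_i%.tex}.lyx"   # %.tex strips the .tex suffix
TEX_DIR="$LYX_p"
TEX_SRC="$TEX_DIR/$LYX_i"

echo "LYX_SRC=$LYX_SRC"
echo "TEX_SRC=$TEX_SRC"
```

  With these values the original document path is rebuilt from the name of
  the exported file, while the file handed to fangle stays in the export
  directory.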
4897
4898   <section|Extracting documentation>
4899
4900   <\nf-chunk|./gen-www>
4901     <item>#python -m elyxer --css lyx.css $LYX_SRC \| \\
4902
4903     <item># \ iconv -c -f utf-8 -t ISO-8859-1//TRANSLIT \| \\
4904
4905     <item># \ sed 's/UTF-8"\\(.\\)\<gtr\>/ISO-8859-1"\\1\<gtr\>/' \<gtr\>
4906     www/docs/fangle.html
4907
4908     <item>
4909
4910     <item>python -m elyxer --css lyx.css --iso885915 --html --destdirectory
4911     www/docs/fangle.e \\
4912
4913     <item> \ \ \ \ \ \ fangle.lyx \<gtr\> www/docs/fangle.e/fangle.html
4914
4915     <item>
4916
4917     <item>( mkdir -p www/docs/fangle && cd www/docs/fangle && \\
4918
4919     <item> \ lyx -e latex ../../../fangle.lyx && \\
4920
4921     <item> \ htlatex ../../../fangle.tex "xhtml,fn-in" && \\
4922
4923     <item> \ sed -i -e 's/\<less\>!--l\\. [0-9][0-9]* *--\<gtr\>//g'
4924     fangle.html
4925
4926     <item>)
4927
4928     <item>
4929
4930     <item>( mkdir -p www/docs/literate && cd www/docs/literate && \\
4931
4932     <item> \ lyx -e latex ../../../literate.lyx && \\
4933
4934     <item> \ htlatex ../../../literate.tex "xhtml,fn-in" && \\
4935
4936     <item> \ sed -i -e 's/\<less\>!--l\\. [0-9][0-9]* *--\<gtr\>$//g'
4937     literate.html
4938
4939     <item>)
4940   </nf-chunk||>
4941
4942   <section|Extracting from the command line>
4943
4944   First you will need the tex output, then you can extract:
4945
4946   <\nf-chunk|lyx-build-manual>
4947     <item>lyx -e latex fangle.lyx
4948
4949     <item>fangle -R./fangle fangle.tex \<gtr\> ./fangle
4950
4951     <item>fangle -R./fangle.module fangle.tex \<gtr\> ./fangle.module
4952   </nf-chunk|sh|>
4953
4954   <section|Testing>
4955
4956   <\nf-chunk|test:helpers>
4957     <item>passtest() {
4958
4959     <item> \ if "$@"
4960
4961     <item> \ then echo "Passed"
4962
4963     <item> \ else echo "Failed"
4964
4965     <item> \ \ \ \ \ \ return 1
4966
4967     <item> \ fi
4968
4969     <item>}
4970
4971     <item>
4972
4973     <item>failtest() {
4974
4975     <item> \ if ! "$@"
4976
4977     <item> \ then echo "Passed"
4978
4979     <item> \ else echo "Failed"
4980
4981     <item> \ \ \ \ \ \ return 1
4982
4983     <item> \ fi
4984
4985     <item>}
4986   </nf-chunk||>
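
  A short usage sketch may help here. It repeats the two helpers and calls
  them with <verbatim|true> and <verbatim|false>, which merely stand in for
  real test commands.

```shell
#!/bin/sh
# Copies of the helpers from the test:helpers chunk.
passtest() {
  if "$@"
  then echo "Passed"
  else echo "Failed"
       return 1
  fi
}

failtest() {
  if ! "$@"
  then echo "Passed"
  else echo "Failed"
       return 1
  fi
}

# true and false are placeholders for real test commands.
passtest true                        # command succeeds, so the test passes
failtest false                       # command fails, which failtest expects
passtest false || echo "status $?"   # reports the failure and its status
```

  Because both helpers return a non-zero status on failure, a caller can
  chain them with <verbatim|\|\|> to stop a test run early.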
4987
4988   <part|Tests>
4989
4990   <chapter|Chunk Parameters>
4991
4992   <\nf-chunk|test:chunk-params:sub>
4993     <item>I see a ${THING},
4994
4995     <item>a ${THING} of colour ${colour},\
4996
4997     <item>and looking closer =\<less\>\\chunkref{test:chunk-params:sub:sub}(${colour})\<gtr\>
4998   </nf-chunk||<tuple|THING|colour>>
4999
5000   <\nf-chunk|test:chunk-params:sub:sub>
5001     <item>a funny shade of ${colour}
5002   </nf-chunk||<tuple|colour>>
5003
5004   <\nf-chunk|test:chunk-params:text>
5005     <item>What do you see? "=\<less\>\\chunkref{test:chunk-params:sub}(joe,
5006     red)\<gtr\>"
5007
5008     <item>Well, fancy!
5009   </nf-chunk||>
5010
5011   This should generate the output:
5012
5013   <\nf-chunk|test:chunk-params:result>
5014     <item>What do you see? "I see a joe,
5015
5016     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ a joe of colour red,\
5017
5018     <item> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ and looking closer a funny shade
5019     of red"
5020
5021     <item>Well, fancy!
5022   </nf-chunk||>
5023
5024   And this chunk will perform the test:
5025
5026   <\nf-chunk|test:chunk-params>
5027     <item>$FANGLE -Rtest:chunk-params:result $TEX_SRC \<gtr\> $TMP/answer
5028     \|\| exit 1
5029
5030     <item>$FANGLE -Rtest:chunk-params:text $TEX_SRC \<gtr\> $TMP/result \|\|
5031     exit 1
5032
5033     <item>passtest diff $TMP/answer $TMP/result \|\| { echo
5034     test:chunk-params:text failed ; exit 1 ; }
5035   </nf-chunk||>
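
  The expected result suggests two things: parameters are replaced by the
  arguments given, and the continuation lines of an included chunk are
  indented to the column of the reference. The sketch below imitates that by
  hand. The functions <verbatim|sub> and <verbatim|sub_sub> are hypothetical
  stand-ins for the two parameterised chunks; this is not how fangle itself
  is implemented.

```shell
#!/bin/sh
# Hand-written stand-ins for the two parameterised chunks (hypothetical names).
sub_sub() {            # $1 = colour
  printf 'a funny shade of %s\n' "$1"
}

sub() {                # $1 = THING, $2 = colour
  printf 'I see a %s,\n' "$1"
  printf 'a %s of colour %s,\n' "$1" "$2"
  printf 'and looking closer %s\n' "$(sub_sub "$2")"
}

# The reference sits after this prefix, so every continuation line is padded
# with as many spaces as the prefix has characters.
prefix='What do you see? "'
pad=$(printf '%s' "$prefix" | sed 's/./ /g')

# Indent every line except the first, then splice the text into the sentence.
expanded=$(sub joe red | sed "2,\$s/^/$pad/")
printf '%s%s"\n' "$prefix" "$expanded"
echo 'Well, fancy!'
```

  The padding is derived from the prefix rather than hard-coded, so the
  sketch keeps working if the surrounding sentence changes.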
5036
5037   <chapter|Compile-log-lyx><label|Compile-log-lyx>
5038
5039   <\nf-chunk|Chunk:./compile-log-lyx>
5040     <item>#! /bin/sh
5041
5042     <item># can't use gtkdialog -i, cos it uses the "source" command which
5043     ubuntu sh doesn't have
5044
5045     <item>
5046
5047     <item>main() {
5048
5049     <item> \ errors="/tmp/compile.log.$$"
5050
5051     <item># \ if grep '^[^ ]*:\\( In \\\|[0-9][0-9]*: [^ ]*:\\)' \<gtr\>
5052     $errors
5053
5054     <item>if grep '^[^ ]*(\\([0-9][0-9]*\\)) *: *\\(error\\\|warning\\)'
5055     \<gtr\> $errors
5056
5057     <item> \ then
5058
5059     <item> \ \ \ sed -i -e 's/^[^ ]*[/\\\\]\\([^/\\\\]*\\)(\\([ 0-9][
5060     0-9]*\\)) *: */\\1:\\2\|\\2\|/' $errors
5061
5062     <item> \ \ \ COMPILE_DIALOG='
5063
5064     <item> \<less\>vbox\<gtr\>
5065
5066     <item> \ \<less\>text\<gtr\>
5067
5068     <item> \ \ \ \<less\>label\<gtr\>Compiler errors:\<less\>/label\<gtr\>
5069
5070     <item> \ \<less\>/text\<gtr\>
5071
5072     <item> \ \<less\>tree exported_column="0"\<gtr\>
5073
5074     <item> \ \ \ \<less\>variable\<gtr\>LINE\<less\>/variable\<gtr\>
5075
5076     <item> \ \ \ \<less\>height\<gtr\>400\<less\>/height\<gtr\>\<less\>width\<gtr\>800\<less\>/width\<gtr\>
5077
5078     <item> \ \ \ \<less\>label\<gtr\>File \| Line \|
5079     Message\<less\>/label\<gtr\>
5080
5081     <item> \ \ \ \<less\>action\<gtr\>'". $SELF ; "'lyxgoto
5082     $LINE\<less\>/action\<gtr\>
5083
5084     <item> \ \ \ \<less\>input\<gtr\>'"cat $errors"'\<less\>/input\<gtr\>
5085
5086     <item> \ \<less\>/tree\<gtr\>
5087
5088     <item> \ \<less\>hbox\<gtr\>
5089
5090     <item> \ \ \<less\>button\<gtr\>\<less\>label\<gtr\>Build\<less\>/label\<gtr\>
5091
5092     <item> \ \ \ \ \<less\>action\<gtr\>lyxclient -c "LYXCMD:build-program"
5093     &\<less\>/action\<gtr\>
5094
5095     <item> \ \ \<less\>/button\<gtr\>
5096
5097     <item> \ \ \<less\>button ok\<gtr\>\<less\>/button\<gtr\>
5098
5099     <item> \ \<less\>/hbox\<gtr\>
5100
5101     <item> \<less\>/vbox\<gtr\>
5102
5103     <item>'
5104
5105     <item> \ \ \ export COMPILE_DIALOG
5106
5107     <item> \ \ \ ( gtkdialog --program=COMPILE_DIALOG ; rm $errors ) &
5108
5109     <item> \ else
5110
5111     <item> \ \ \ rm $errors
5112
5113     <item> \ fi
5114
5115     <item>}
5116
5117     <item>
5118
5119     <item>lyxgoto() {
5120
5121     <item> \ file="${LINE%:*}"
5122
5123     <item> \ line="${LINE##*:}"
5124
5125     <item> \ extraline=`cat $file \| head -n $line \| tac \| sed
5126     '/^\\\\\\\\begin{lstlisting}/q' \| wc -l`
5127
5128     <item> \ extraline=`expr $extraline - 1`
5129
5130     <item> \ lyxclient -c "LYXCMD:command-sequence server-goto-file-row $file
5131     $line ; char-forward ; repeat $extraline paragraph-down ;
5132     paragraph-up-select"
5133
5134     <item>}
5135
5136     <item>
5137
5138     <item>SELF="$0"
5139
5140     <item>if test -z "$COMPILE_DIALOG"
5141
5142     <item>then main "$@"\
5143
5144     <item>fi
5145   </nf-chunk|sh|>
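
  To make the data flow easier to follow, the sketch below pushes one
  compiler message through the same <verbatim|sed> substitution and then
  splits the first column the way <verbatim|lyxgoto> does. The message text
  is a made-up example, and the assumption that the dialog hands column 0 to
  the action as <verbatim|LINE> is read off the dialog definition rather than
  tested against gtkdialog.

```shell
#!/bin/sh
# Hypothetical compiler message in the "file(line) : error" style that the
# grep in main() selects.
msg='src/main.c(42) : error C2065: x undeclared'

# Same substitution as in the script, written with # as the delimiter.
# It drops the directory part and produces "file:line|line|message".
row=$(printf '%s\n' "$msg" |
  sed -e 's#^[^ ]*[/\\]\([^/\\]*\)(\([ 0-9][ 0-9]*\)) *: *#\1:\2|\2|#')
echo "$row"

# Assumed: the first column ("file:line") arrives in LINE.
LINE=${row%%|*}

# Same splitting as in lyxgoto.
file="${LINE%:*}"     # everything before the last colon
line="${LINE##*:}"    # everything after the last colon
echo "file=$file line=$line"
```

  Keeping the file name and line number together in the first column means a
  single exported value is enough for <verbatim|lyxgoto> to locate the row.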
5146
5147   \;
5148 </body>
5149
5150 <\initial>
5151   <\collection>
5152     <associate|info-flag|short>
5153     <associate|page-medium|paper>
5154     <associate|page-screen-height|982016tmpt>
5155     <associate|page-screen-margin|false>
5156     <associate|page-screen-width|1686528tmpt>
5157     <associate|preamble|false>
5158     <associate|sfactor|5>
5159   </collection>
5160 </initial>
5161
5162