doc/m4.texi

   1 \input texinfo @c -*- texinfo -*-
   2 @comment ========================================================
   3 @comment %**start of header
   4 @setfilename m4.info
   5 @include version.texi
   6 @settitle GNU M4 @value{VERSION} macro processor
   7 @setchapternewpage odd
   8 @ifnothtml
   9 @setcontentsaftertitlepage
  10 @end ifnothtml
  11 @finalout
  12
  13 @set beta
  14
  15 @c @tabchar{}
  16 @c ----------
  17 @c The testsuite expects literal tab output in some examples, but
  18 @c literal tabs in texinfo leads to formatting issues.
  19 @macro tabchar
  20 @       @c
  21 @end macro
  22
  23 @c @ovar{ARG}
  24 @c -------------------
  25 @c The ARG is an optional argument.  To be used for macro arguments in
  26 @c their documentation (@defmac).
  27 @macro ovar{varname}
  28 @r{[}@var{\varname\}@r{]}@c
  29 @end macro
  30
  31 @c @dvar{ARG, DEFAULT}
  32 @c -------------------
  33 @c The ARG is an optional argument, defaulting to DEFAULT.  To be used
  34 @c for macro arguments in their documentation (@defmac).
  35 @macro dvar{varname, default}
  36 @r{[}@var{\varname\} = @samp{\default\}@r{]}@c
  37 @end macro
  38
  39 @comment %**end of header
  40 @comment ========================================================
  41
  42 @copying
  43
  44 This manual (@value{UPDATED}) is for GNU M4 (version
  45 @value{VERSION}), a package containing an implementation of the m4 macro
  46 language.
  47
  48 Copyright @copyright{} 1989-1994, 2004-2011, 2013 Free Software Foundation, Inc.
  49
  50 @quotation
  51 Permission is granted to copy, distribute and/or modify this document
  52 under the terms of the GNU Free Documentation License,
  53 Version 1.3 or any later version published by the Free Software
  54 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
  55 Back-Cover Texts.  A copy of the license is included in the section
  56 entitled ``GNU Free Documentation License.''
  57 @end quotation
  58 @end copying
  59
  60 @dircategory Text creation and manipulation
  61 @direntry
  62 * M4: (m4).                     A powerful macro processor.
  63 @end direntry
  64
  65 @titlepage
  66 @title GNU M4, version @value{VERSION}
  67 @subtitle A powerful macro processor
  68 @subtitle Edition @value{EDITION}, @value{UPDATED}
  69 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
  70 @author Gary V. Vaughan, and Eric Blake
  71 @author (@email{bug-m4@@gnu.org})
  72
  73 @page
  74 @vskip 0pt plus 1filll
  75 @insertcopying
  76 @end titlepage
  77
  78 @contents
  79
  80 @ifnottex
  81 @node Top
  82 @top GNU M4
  83 @insertcopying
  84 @end ifnottex
  85
  86 GNU @code{m4} is an implementation of the traditional UNIX macro
  87 processor.  It is mostly SVR4 compatible, although it has some
  88 extensions (for example, handling more than 9 positional parameters
  89 to macros).  @code{m4} also has builtin functions for including
  90 files, running shell commands, doing arithmetic, etc.  Autoconf needs
  91 GNU @code{m4} for generating @file{configure} scripts, but not for
  92 running them.
  93
  94 GNU @code{m4} was originally written by Ren@'e Seindal, with
  95 subsequent changes by Fran@,{c}ois Pinard and other volunteers
  96 on the Internet.  All names and email addresses can be found in the
  97 files @file{m4-@value{VERSION}/@/AUTHORS} and
  98 @file{m4-@value{VERSION}/@/THANKS} from the GNU M4
  99 distribution.
 100
 101 @ifclear beta
 102 This is release @value{VERSION}.  It is now considered stable:  future
 103 releases on this branch are only meant to fix bugs, increase speed, or
 104 improve documentation.
 105 @end ifclear
 106
 107 @ifset beta
 108 This is BETA release @value{VERSION}.  This is a development release,
 109 and as such, is prone to bugs, crashes, unforeseen features, incomplete
 110 documentation@dots{}, therefore, use at your own peril.  In case of
 111 problems, please do not hesitate to report them (see the
 112 @file{m4-@value{VERSION}/@/README} file in the distribution).
 113 @xref{Experiments}.
 114 @end ifset
 115
 116 @menu
 117 * Preliminaries::               Introduction and preliminaries
 118 * Invoking m4::                 Invoking @code{m4}
 119 * Syntax::                      Lexical and syntactic conventions
 120
 121 * Macros::                      How to invoke macros
 122 * Definitions::                 How to define new macros
 123 * Conditionals::                Conditionals, loops, and recursion
 124
 125 * Debugging::                   How to debug macros and input
 126
 127 * Input Control::               Input control
 128 * File Inclusion::              File inclusion
 129 * Diversions::                  Diverting and undiverting output
 130
 131 * Modules::                     Extending M4 with dynamic runtime modules
 132
 133 * Text handling::               Macros for text handling
 134 * Arithmetic::                  Macros for doing arithmetic
 135 * Shell commands::              Macros for running shell commands
 136 * Miscellaneous::               Miscellaneous builtin macros
 137 * Frozen files::                Fast loading of frozen state
 138
 139 * Compatibility::               Compatibility with other versions of @code{m4}
 140 * Answers::                     Correct version of some examples
 141
 142 * Copying This Package::        How to make copies of the overall M4 package
 143 * Copying This Manual::         How to make copies of this manual
 144 * Indices::                     Indices of concepts and macros
 145
 146 @detailmenu
 147  --- The Detailed Node Listing ---
 148
 149 Introduction and preliminaries
 150
 151 * Intro::                       Introduction to @code{m4}
 152 * History::                     Historical references
 153 * Bugs::                        Problems and bugs
 154 * Manual::                      Using this manual
 155
 156 Invoking @code{m4}
 157
 158 * Operation modes::             Command line options for operation modes
 159 * Preprocessor features::       Command line options for preprocessor features
 160 * Limits control::              Command line options for limits control
 161 * Frozen state::                Command line options for frozen state
 162 * Debugging options::           Command line options for debugging
 163 * Command line files::          Specifying input files on the command line
 164
 165 Lexical and syntactic conventions
 166
 167 * Names::                       Macro names
 168 * Quoted strings::              Quoting input to @code{m4}
 169 * Comments::                    Comments in @code{m4} input
 170 * Other tokens::                Other kinds of input tokens
 171 * Input processing::            How @code{m4} copies input to output
 172 * Regular expression syntax::   How @code{m4} interprets regular expressions
 173
 174 How to invoke macros
 175
 176 * Invocation::                  Macro invocation
 177 * Inhibiting Invocation::       Preventing macro invocation
 178 * Macro Arguments::             Macro arguments
 179 * Quoting Arguments::           On Quoting Arguments to macros
 180 * Macro expansion::             Expanding macros
 181
 182 How to define new macros
 183
 184 * Define::                      Defining a new macro
 185 * Arguments::                   Arguments to macros
 186 * Pseudo Arguments::            Special arguments to macros
 187 * Undefine::                    Deleting a macro
 188 * Defn::                        Renaming macros
 189 * Pushdef::                     Temporarily redefining macros
 190 * Renamesyms::                  Renaming macros with regular expressions
 191
 192 * Indir::                       Indirect call of macros
 193 * Builtin::                     Indirect call of builtins
 194 * M4symbols::                   Getting the defined macro names
 195
 196 Conditionals, loops, and recursion
 197
 198 * Ifdef::                       Testing if a macro is defined
 199 * Ifelse::                      If-else construct, or multibranch
 200 * Shift::                       Recursion in @code{m4}
 201 * Forloop::                     Iteration by counting
 202 * Foreach::                     Iteration by list contents
 203 * Stacks::                      Working with definition stacks
 204 * Composition::                 Building macros with macros
 205
 206 How to debug macros and input
 207
 208 * Dumpdef::                     Displaying macro definitions
 209 * Trace::                       Tracing macro calls
 210 * Debugmode::                   Controlling debugging options
 211 * Debuglen::                    Limiting debug output
 212 * Debugfile::                   Saving debugging output
 213
 214 Input control
 215
 216 * Dnl::                         Deleting whitespace in input
 217 * Changequote::                 Changing the quote characters
 218 * Changecom::                   Changing the comment delimiters
 219 * Changeresyntax::              Changing the regular expression syntax
 220 * Changesyntax::                Changing the lexical structure of the input
 221 * M4wrap::                      Saving text until end of input
 222
 223 File inclusion
 224
 225 * Include::                     Including named files
 226 * Search Path::                 Searching for include files
 227
 228 Diverting and undiverting output
 229
 230 * Divert::                      Diverting output
 231 * Undivert::                    Undiverting output
 232 * Divnum::                      Diversion numbers
 233 * Cleardivert::                 Discarding diverted text
 234
 235 Extending M4 with dynamic runtime modules
 236
 237 * M4modules::                   Listing loaded modules
 238 * Standard Modules::            Standard bundled modules
 239
 240 Macros for text handling
 241
 242 * Len::                         Calculating length of strings
 243 * Index macro::                 Searching for substrings
 244 * Regexp::                      Searching for regular expressions
 245 * Substr::                      Extracting substrings
 246 * Translit::                    Translating characters
 247 * Patsubst::                    Substituting text by regular expression
 248 * Format::                      Formatting strings (printf-like)
 249
 250 Macros for doing arithmetic
 251
 252 * Incr::                        Decrement and increment operators
 253 * Eval::                        Evaluating integer expressions
 254 * Mpeval::                      Multiple precision arithmetic
 255
 256 Macros for running shell commands
 257
 258 * Platform macros::             Determining the platform
 259 * Syscmd::                      Executing simple commands
 260 * Esyscmd::                     Reading the output of commands
 261 * Sysval::                      Exit status
 262 * Mkstemp::                     Making temporary files
 263 * Mkdtemp::                     Making temporary directories
 264
 265 Miscellaneous builtin macros
 266
 267 * Errprint::                    Printing error messages
 268 * Location::                    Printing current location
 269 * M4exit::                      Exiting from @code{m4}
 270 * Syncoutput::                  Turning on and off sync lines
 271
 272 Fast loading of frozen state
 273
 274 * Using frozen files::          Using frozen files
 275 * Frozen file format 1::        Frozen file format 1
 276 * Frozen file format 2::        Frozen file format 2
 277
 278 Compatibility with other versions of @code{m4}
 279
 280 * Extensions::                  Extensions in GNU M4
 281 * Incompatibilities::           Other incompatibilities
 282 * Experiments::                 Experimental features in GNU M4
 283
 284 Correct version of some examples
 285
 286 * Improved exch::               Solution for @code{exch}
 287 * Improved forloop::            Solution for @code{forloop}
 288 * Improved foreach::            Solution for @code{foreach}
 289 * Improved copy::               Solution for @code{copy}
 290 * Improved m4wrap::             Solution for @code{m4wrap}
 291 * Improved cleardivert::        Solution for @code{cleardivert}
 292 * Improved capitalize::         Solution for @code{capitalize}
 293 * Improved fatal_error::        Solution for @code{fatal_error}
 294
 295 How to make copies of the overall M4 package
 296
 297 * GNU General Public License::  License for copying the M4 package
 298
 299 How to make copies of this manual
 300
 301 * GNU Free Documentation License::  License for copying this manual
 302
 303 Indices of concepts and macros
 304
 305 * Macro index::                 Index for all @code{m4} macros
 306 * Concept index::               Index for many concepts
 307
 308 @end detailmenu
 309 @end menu
 310
 311 @node Preliminaries
 312 @chapter Introduction and preliminaries
 313
 314 This first chapter explains what GNU @code{m4} is, where @code{m4}
 315 comes from, how to read and use this documentation, how to call the
 316 @code{m4} program, and how to report bugs about it.  It concludes by
 317 giving tips for reading the remainder of the manual.
 318
 319 The following chapters then detail all the features of the @code{m4}
 320 language, as shipped in the GNU M4 package.
 321
 322 @menu
 323 * Intro::                       Introduction to @code{m4}
 324 * History::                     Historical references
 325 * Bugs::                        Problems and bugs
 326 * Manual::                      Using this manual
 327 @end menu
 328
 329 @node Intro
 330 @section Introduction to @code{m4}
 331
 332 @cindex overview of @code{m4}
 333 @code{m4} is a macro processor, in the sense that it copies its
 334 input to the output, expanding macros as it goes.  Macros are either
 335 builtin or user-defined, and can take any number of arguments.
 336 Besides just doing macro expansion, @code{m4} has builtin functions
 337 for including named files, running shell commands, doing integer
 338 arithmetic, manipulating text in various ways, performing recursion,
 339 etc.@dots{}  @code{m4} can be used either as a front-end to a compiler,
 340 or as a macro processor in its own right.
 341
 342 The @code{m4} macro processor is widely available on all UNIXes, and has
 343 been standardized by POSIX.
 344 Usually, only a small percentage of users are aware of its existence.
 345 However, those who find it often become committed users.  The
 346 popularity of GNU Autoconf, which requires GNU
 347 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
 348 for many to install it, while these people will not themselves
 349 program in @code{m4}.  GNU @code{m4} is mostly compatible with the
 350 System V, Release 4 version, except for some minor differences.
 351 @xref{Compatibility}, for more details.
 352
 353 Some people find @code{m4} to be fairly addictive.  They first use
 354 @code{m4} for simple problems, then take bigger and bigger challenges,
 355 learning how to write complex sets of @code{m4} macros along the way.
 356 Once really addicted, users pursue writing of sophisticated @code{m4}
  357 applications even to solve simple problems, devoting more time to
 358 debugging their @code{m4} scripts than doing real work.  Beware that
 359 @code{m4} may be dangerous for the health of compulsive programmers.
 360
 361 @node History
 362 @section Historical references
 363
 364 @cindex history of @code{m4}
 365 @cindex GNU M4, history of
 366 Macro languages were invented early in the history of computing.  In the
 367 1950s Alan Perlis suggested that the macro language be independent of the
 368 language being processed.  Techniques such as conditional and recursive
 369 macros, and using macros to define other macros, were described by Doug
 370 McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler
 371 Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20,
 372 @url{http://dx.doi.org/10.1145/367177.367223}.
 373
 374 An important precursor of @code{m4} was GPM; see C. Strachey,
 375 @c The title uses lower case and has no space between "macro" and "generator".
 376 ``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3
 377 (1965), 225--41, @url{http://dx.doi.org/10.1093/comjnl/8.3.225}.  GPM is
 378 also succinctly described in David Gries's book @emph{Compiler
 379 Construction for Digital Computers}, Wiley (1971).  Strachey was a
 380 brilliant programmer: GPM fit into 250 machine instructions!
 381
 382 Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a
  383 model preprocessor that fit into a page of Snobol 3 code, and McIlroy
 384 and Robert Morris developed a series of further models at Bell Labs.
 385 Andrew D. Hall followed up with M6, a general purpose macro processor
 386 used to port the Fortran source code of the Altran computer algebra
 387 system; see Hall's ``The M6 Macro Processor'', Computing Science
 388 Technical Report #2, Bell Labs (1972),
 389 @url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}.  M6's source code
 390 consisted of about 600 Fortran statements.  Its name was the first of
 391 the @code{m4} line.
 392
 393 The Brian Kernighan and P.J. Plauger book @emph{Software Tools},
 394 Addison-Wesley (1976), describes and implements a Unix
 395 macro-processor language, which inspired Dennis Ritchie to write
 396 @code{m3}, a macro processor for the AP-3 minicomputer.
 397
 398 Kernighan and Ritchie then joined forces to develop the original
 399 @code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories
 400 (1977), @url{http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}.
 401 It had only 21 builtin macros.
 402
 403 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
 404 the true intricacies of real life: macros can be recognized without
 405 being pre-announced, skipping whitespace or end-of-lines is easier,
 406 more constructs are builtin instead of derived, etc.
 407
 408 Originally, the Kernighan and Plauger macro-processor, and then
 409 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
 410 that is, the @code{Ratfor} equivalent of @code{cpp}.  Later, @code{m4}
 411 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
 412
 413 Ren@'e Seindal released his implementation of @code{m4}, GNU
 414 @code{m4},
 415 in 1990, with the aim of removing the artificial limitations in many
 416 of the traditional @code{m4} implementations, such as maximum line
 417 length, macro size, or number of macros.
 418
 419 The late Professor A. Dain Samples described and implemented a further
 420 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
 421 Language: 2nd edition'', Electronic Announcement on comp.compilers
 422 newsgroup (1992).
 423
 424 Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
 425 1992, until 1994 when he released GNU @code{m4} 1.4, which was
 426 the stable release for 10 years.  It was at this time that GNU
 427 Autoconf decided to require GNU @code{m4} as its underlying
 428 engine, since all other implementations of @code{m4} had too many
 429 limitations.
 430
 431 More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
 432 addressed some long standing bugs in the venerable 1.4 release.  Then in
 433 2005, Gary V. Vaughan collected together the many patches to
 434 GNU @code{m4} 1.4 that were floating around the net and
 435 released 1.4.3 and 1.4.4.  And in 2006, Eric Blake joined the team and
 436 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
 437 More bug fixes were incorporated in 2007, with releases 1.4.9 and
 438 1.4.10.  Eric continued with some portability fixes for 1.4.11 and
  439 1.4.12 in 2008, 1.4.13 in 2009, 1.4.14 and 1.4.15 in 2010, and 1.4.16
 440 in 2011.  Following a long hiatus, Gary released 1.4.17 after upgrading
 441 to the latest autotools (and gnulib) along with all the small fixes they
 442 had accumulated.
 443
 444 Additionally, in 2008, Eric rewrote the scanning engine to reduce
 445 recursive evaluation from quadratic to linear complexity.  This was
 446 released as M4 1.6 in 2009.  The 1.x branch series remains open for bug
 447 fixes.
 448
 449 Meanwhile, development was underway for new features for @code{m4},
 450 such as dynamic module loading and additional builtins, practically
 451 rewriting the entire code base.  This development has spurred
 452 improvements to other GNU software, such as GNU
 453 Libtool.  GNU M4 2.0 is the result of this effort.
 454
 455 @node Bugs
 456 @section Problems and bugs
 457
 458 @cindex reporting bugs
 459 @cindex bug reports
 460 @cindex suggestions, reporting
 461 If you have problems with GNU M4 or think you've found a bug,
 462 please report it.  Before reporting a bug, make sure you've actually
 463 found a real bug.  Carefully reread the documentation and see if it
 464 really says you can do what you're trying to do.  If it's not clear
 465 whether you should be able to do something or not, report that too; it's
 466 a bug in the documentation!
 467
 468 Before reporting a bug or trying to fix it yourself, try to isolate it
 469 to the smallest possible input file that reproduces the problem.  Then
 470 send us the input file and the exact results @code{m4} gave you.  Also
 471 say what you expected to occur; this will help us decide whether the
 472 problem was really in the documentation.
 473
 474 Once you've got a precise problem, send e-mail to
 475 @email{bug-m4@@gnu.org}.  Please include the version number of @code{m4}
 476 you are using.  You can get this information with the command
 477 @kbd{m4 --version}.  You can also run @kbd{make check} to generate the
 478 file @file{tests/@/testsuite.log}, useful for including in your report.
 479
 480 Non-bug suggestions are always welcome as well.  If you have questions
 481 about things that are unclear in the documentation or are just obscure
 482 features, please report them too.
 483
 484 @node Manual
 485 @section Using this manual
 486
 487 @cindex examples, understanding
 488 This manual contains a number of examples of @code{m4} input and output,
 489 and a simple notation is used to distinguish input, output and error
 490 messages from @code{m4}.  Examples are set out from the normal text, and
  491 shown in a fixed width font, like this:
 492
 493 @comment ignore
 494 @example
 495 This is an example of an example!
 496 @end example
 497
 498 To distinguish input from output, all output from @code{m4} is prefixed
 499 by the string @samp{@result{}}, and all error messages by the string
 500 @samp{@error{}}.  When showing how command line options affect matters,
  501 the command line is shown with a prompt @samp{$ @kbd{like this}};
 502 otherwise, you can assume that a simple @kbd{m4} invocation will work.
 503 Thus:
 504
 505 @comment ignore
 506 @example
 507 $ @kbd{command line to invoke m4}
 508 Example of input line
 509 @result{}Output line from m4
 510 @error{}and an error message
 511 @end example
 512
 513 The sequence @samp{^D} in an example indicates the end of the input
 514 file.  The sequence @samp{@key{NL}} refers to the newline character.
 515 The majority of these examples are self-contained, and you can run them
 516 with similar results.  In fact, the testsuite that is bundled in the
 517 GNU M4 package consists in part of the examples
 518 in this document!  Some of the examples assume that your current
 519 directory is located where you unpacked the installation, so if you plan
 520 on following along, you may find it helpful to do this now:
 521
 522 @comment ignore
 523 @example
 524 $ @kbd{cd m4-@value{VERSION}}
 525 @end example
 526
 527 As each of the predefined macros in @code{m4} is described, a prototype
 528 call of the macro will be shown, giving descriptive names to the
 529 arguments, e.g.,
 530
 531 @deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @
 532   @ovar{argument}@dots{})
 533 This is a sample prototype.  There is not really a macro named
 534 @code{example}, but this documents that if there were, it would be a
 535 Composite macro, rather than a Builtin, and would be provided by the
 536 module @code{none}.
 537
 538 It requires at least one argument, @var{string}.  Remember that in
 539 @code{m4}, there must not be a space between the macro name and the
 540 opening parenthesis, unless it was intended to call the macro without
 541 any arguments.  The brackets around @var{count} and @var{argument} show
 542 that these arguments are optional.  If @var{count} is omitted, the macro
  543 behaves as if @var{count} were @samp{1}, whereas if @var{argument} is omitted,
 544 the macro behaves as if it were the empty string.  A blank argument is
 545 not the same as an omitted argument.  For example, @samp{example(`a')},
 546 @samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave
 547 identically with @var{count} set to @samp{1}; while @samp{example(`a',)}
 548 and @samp{example(`a',`')} would explicitly pass the empty string for
 549 @var{count}.  The ellipses (@samp{@dots{}}) show that the macro
 550 processes additional arguments after @var{argument}, rather than
 551 ignoring them.
 552 @end deffn
 553
 554 Each builtin definition will list, in parentheses, the module that must
 555 be loaded to use that macro.  The standard modules include
 556 @samp{m4} (which is always available), @samp{gnu} (for GNU specific
 557 m4 extensions), and @samp{traditional} (for compatibility with System V
 558 m4).  @xref{Modules}.
 559
 560 @cindex numbers
 561 All macro arguments in @code{m4} are strings, but some are given
 562 special interpretation, e.g., as numbers, file names, regular
 563 expressions, etc.  The documentation for each macro will state how the
 564 parameters are interpreted, and what happens if the argument cannot be
 565 parsed according to the desired interpretation.  Unless specified
 566 otherwise, a parameter specified to be a number is parsed as a decimal,
 567 even if the argument has leading zeros; and parsing the empty string as
 568 a number results in 0 rather than an error, although a warning will be
 569 issued.
 570
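As a small illustration of these rules, here is a sketch that assumes
the @code{incr} builtin (@pxref{Incr}) follows the default numeric
parsing described above.  The leading zero does not select octal, and
the empty argument is treated as 0.  In the second case a warning is
also printed on standard error; it is omitted here because its exact
wording may vary between versions.

@comment ignore
@example
incr(`010')
@result{}11
incr()
@result{}1
@end example
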
 571 This document consistently writes and uses @dfn{builtin}, without a
 572 hyphen, as if it were an English word.  This is how the @code{builtin}
 573 primitive is spelled within @code{m4}.
 574
 575 @node Invoking m4
 576 @chapter Invoking @code{m4}
 577
 578 @cindex command line
 579 @cindex invoking @code{m4}
 580 The format of the @code{m4} command is:
 581
 582 @comment ignore
 583 @example
 584 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
 585 @end example
 586
 587 @cindex command line, options
 588 @cindex options, command line
 589 @cindex @env{POSIXLY_CORRECT}
 590 All options begin with @samp{-}, or if long option names are used, with
  591 @samp{--}.  A long option name need not be written completely; any
 592 unambiguous prefix is sufficient.  POSIX requires @code{m4} to
 593 recognize arguments intermixed with files, even when
 594 @env{POSIXLY_CORRECT} is set in the environment.  Most options take
 595 effect at startup regardless of their position, but some are documented
 596 below as taking effect after any files that occurred earlier in the
 597 command line.  The argument @option{--} is a marker to denote the end of
 598 options.
 599
 600 With short options, options that do not take arguments may be combined
  601 into a single command line argument with subsequent options; options
  602 with mandatory arguments may be provided either as a single command line
  603 argument or as two arguments; and options with optional arguments must
 604 be provided as a single argument.  In other words,
 605 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
 606 @kbd{m4 -Q -P -D foo -d ./a -d+f}, although the latter form is
 607 considered canonical.
 608
 609 With long options, options with mandatory arguments may be provided with
 610 an equal sign (@samp{=}) in a single argument, or as two arguments, and
 611 options with optional arguments must be provided as a single argument.
 612 In other words, @kbd{m4 --def foo --debug a} is equivalent to
 613 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
 614 considered canonical (not to mention more robust, in case a future
 615 version of @code{m4} introduces an option named @option{--default}).
 616
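For instance, the @option{-D} (@option{--define}) option, described
later in this chapter, takes a mandatory argument, so each of the
following spellings should behave the same way.  This is only a sketch;
the macro name @samp{foo} and its value @samp{bar} are arbitrary choices
made for illustration.

@comment ignore
@example
$ @kbd{echo foo | m4 -Dfoo=bar}
@result{}bar
$ @kbd{echo foo | m4 -D foo=bar}
@result{}bar
$ @kbd{echo foo | m4 --define=foo=bar}
@result{}bar
$ @kbd{echo foo | m4 --define foo=bar}
@result{}bar
@end example
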
 617 @code{m4} understands the following options, grouped by functionality.
 618
 619 @menu
 620 * Operation modes::             Command line options for operation modes
 621 * Preprocessor features::       Command line options for preprocessor features
 622 * Limits control::              Command line options for limits control
 623 * Frozen state::                Command line options for frozen state
 624 * Debugging options::           Command line options for debugging
 625 * Command line files::          Specifying input files on the command line
 626 @end menu
 627
 628 @node Operation modes
 629 @section Command line options for operation modes
 630
 631 Several options control the overall operation of @code{m4}:
 632
 633 @table @code
 634 @item --help
 635 Print a help summary on standard output, then immediately exit
 636 @code{m4} without reading any input files or performing any other
 637 actions.
 638
 639 @item --version
 640 Print the version number of the program on standard output, then
 641 immediately exit @code{m4} without reading any input files or
 642 performing any other actions.
 643
 644 @item -b
 645 @itemx --batch
 646 Makes this invocation of @code{m4} non-interactive.  This means that
 647 output will be buffered, and an interrupt or pipe write error will halt
 648 execution.  If neither
  649 @option{-b} nor @option{-i} is specified, this is activated by default
 650 when any input files are specified, or when either standard input or
 651 standard error is not a terminal.  Note that this means that @kbd{m4}
 652 alone might be interactive, but @kbd{m4 -} is not, even though both
 653 commands process only standard input.  If both @option{-b} and
 654 @option{-i} are specified, only the last one takes effect.
 655
 656 @item -c
 657 @itemx --discard-comments
 658 Discard all comments instead of copying them to the output.
 659
 660 @item -E
 661 @itemx --fatal-warnings
 662 @cindex errors, fatal
 663 @cindex fatal errors
 664 Controls the effect of warnings.  If unspecified, then execution
 665 continues and exit status is unaffected when a warning is printed.  If
 666 specified exactly once, warnings become fatal; when one is issued,
 667 execution continues, but the exit status will be non-zero.  If specified
 668 multiple times, then execution halts with non-zero status the first time
 669 a warning is issued.  The introduction of behavior levels is new to M4
 670 1.4.9; for behavior consistent with earlier versions, you should specify
 671 @option{-E} twice.
 672
 673
 674 For backwards compatibility reasons, using @option{-E} behaves as if an
 675 implicit @option{--debug=-d} option is also present.  This is so that
 676 scripts written for older M4 versions will not fail if they used
 677 constructs that were previously silently allowed, but would now trigger
 678 a warning.
 679
 680 @example
 681 $ @kbd{m4}
 682 defn(`oops')
 683 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
 684 @result{}
 685 ^D
 686 @end example
 687
 688 @comment ignore
 689 @example
 690 $ @kbd{echo $?}
 691 @result{}0
 692 @end example
 693
 694 @comment options: -E
 695 @example
 696 $ @kbd{m4 -E}
 697 defn(`oops')
 698 @result{}
 699 ^D
 700 @end example
 701
 702 @comment ignore
 703 @example
 704 $ @kbd{echo $?}
 705 @result{}0
 706 @end example
 707
 708 @comment options: -E -d
 709 @comment status: 1
 710 @example
 711 $ @kbd{m4 -E -d}
 712 defn(`oops')
 713 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
 714 @result{}
 715 ^D
 716 @end example
 717
 718 @comment ignore
 719 @example
 720 $ @kbd{echo $?}
 721 @result{}1
 722 @end example
 723
 724 @item -i
 725 @itemx --interactive
 726 @itemx -e
 727 Makes this invocation of @code{m4} interactive.  This means that all
 728 output will be unbuffered, interrupts will be ignored, and behavior on
 729 pipe write errors is inherited from the parent process.  If neither
 730 @option{-b} nor @option{-i} is specified, this is activated by default
 731 when no input files are specified, and when both standard input and
 732 standard error are terminals (similar to the way that /bin/sh determines
 733 when to be interactive).  If both @option{-b} and @option{-i} are
 734 specified, only the last one takes effect.  The spelling @option{-e}
 735 exists for compatibility with other @code{m4} implementations, and
 736 issues a warning because it may be withdrawn in a future version of
 737 GNU M4.
 738
 739 @item -P
 740 @itemx --prefix-builtins
 741 Internally modify @emph{all} builtin macro names so they all start with
 742 the prefix @samp{m4_}.  For example, using this option, one should write
 743 @samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}}
 744 instead of @samp{@w{__file__}}.  This option has no effect if @option{-R}
 745 is also specified.
 746
 747 @item -Q
 748 @itemx --quiet
 749 @itemx --silent
 750 Suppress warnings, such as missing or superfluous arguments in macro
 751 calls, or treating the empty string as zero.  Error messages are still
 752 printed.  The distinction between error and warning is fuzzy, and if
 753 you encounter a situation where the message output did not match your
 754 expectations, please report that as a bug.  This option is implied if
 755 @env{POSIXLY_CORRECT} is set in the environment.
 756
 757 @item -r@r{[}@var{resyntax-spec}@r{]}
 758 @itemx --regexp-syntax@r{[}=@var{resyntax-spec}@r{]}
 759 Set the regular expression syntax according to @var{resyntax-spec}.
 760 When this option is not given, or @var{resyntax-spec} is omitted,
 761 GNU M4 uses the flavor @code{GNU_M4}, which provides
 762 emacs-compatible regular expressions.  @xref{Changeresyntax}, for more
 763 details on the format and meaning of @var{resyntax-spec}.  This option
 764 may be given more than once, and order with respect to file names is
 765 significant.
 766
 767 @item --safer
 768 Cripple the following builtins, since each can perform potentially
 769 unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}),
 770 @code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}),
 771 @code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}).
 772 An attempt to use any of these macros will result in an error.  This
 773 option is intended to make it safer to preprocess an input file of
 774 unknown origin.
 775
 776 @item -W
 777 @itemx --warnings
 778 Enable warnings.  Warnings are on by default unless
 779 @env{POSIXLY_CORRECT} was set in the environment; this option exists to
 780 allow overriding @option{--silent}.
 781 @comment FIXME should we accept -Wall, -Wnone, -Wcategory,
 782 @comment -Wno-category...?
 783 @end table
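
As a brief illustration of @option{-P}, the following sketch assumes an
interactive session with no other options; the macro name @samp{x} is
chosen only for this example.  The unprefixed @samp{define} is no longer
a macro, so it is copied through as ordinary text with one level of
quotes stripped, while @samp{m4_define} does what @code{define} would
normally do.

@comment ignore
@example
$ @kbd{m4 -P}
define(`x', `1')
@result{}define(x, 1)
m4_define(`x', `1')
@result{}
x
@result{}1
@end example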
 784
 785 @node Preprocessor features
 786 @section Command line options for preprocessor features
 787
 788 @cindex macro definitions, on the command line
 789 @cindex command line, macro definitions on the
 790 @cindex preprocessor features
 791 Several options allow @code{m4} to behave more like a preprocessor.
 792 Macro definitions and deletions can be made on the command line, the
 793 search path can be altered, and the output file can track where the
 794 input came from.  These features occur with the following options:
 795
 796 @table @code
 797 @item -B @var{directory}
 798 @itemx --prepend-include=@var{directory}
 799 Make @code{m4} search @var{directory} for included files, prior to
 800 searching the current working directory.  @xref{Search Path}, for more
 801 details.  This option may be given more than once.  Some other
 802 implementations of @code{m4} use @option{-B @var{number}} to change their
 803 hard-coded limits, but that is unnecessary in GNU where the
 804 only limit is your hardware capability.  So although it is unlikely that
 805 you will want to include a relative directory whose name is purely
 806 numeric, GNU @code{m4} will warn you about this potential
 807 compatibility issue; you can avoid the warning by using the long
 808 spelling, or by using @samp{./@var{number}} if you really meant it.
 809
 810 @item -D @var{name}@r{[}=@var{value}@r{]}
 811 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
 812 This enters @var{name} into the symbol table.  If @samp{=@var{value}} is
 813 missing, the value is taken to be the empty string.  The @var{value} can
 814 be any string, and the macro can be defined to take arguments, just as
 815 if it were defined from within the input.  This option may be given more
 816 than once; order with respect to file names is significant, and
 817 redefining the same @var{name} loses the previous value.
 818
 819 @item --import-environment
 820 Imports every variable in the environment as a macro.  This is done
 821 before @option{-D} and @option{-U}, so they can override the
 822 environment.
 823
 824 @item -I @var{directory}
 825 @itemx --include=@var{directory}
 826 Make @code{m4} search @var{directory} for included files that are not
 827 found in the current working directory.  @xref{Search Path}, for more
 828 details.  This option may be given more than once.
 829
 830 @item --popdef=@var{name}
 831 This deletes the top-most meaning @var{name} might have.  Obviously,
 832 only predefined macros can be deleted in this way.  This option may be
 833 given more than once; popping a @var{name} that does not have a
 834 definition is silently ignored.  Order is significant with respect to
 835 file names.
 836
 837 @item -p @var{name}@r{[}=@var{value}@r{]}
 838 @itemx --pushdef=@var{name}@r{[}=@var{value}@r{]}
 839 This enters @var{name} into the symbol table.  If @samp{=@var{value}} is
 840 missing, the value is taken to be the empty string.  The @var{value} can
 841 be any string, and the macro can be defined to take arguments, just as
 842 if it were defined from within the input.  This option may be given more
 843 than once; order with respect to file names is significant, and
 844 redefining the same @var{name} adds another definition to its stack.
 845
 846 @item -s
 847 @itemx --synclines
 848 Short for @option{--syncoutput=1}, turning on synchronization lines
 849 (sometimes called @dfn{synclines}).
 850
 851 @item --syncoutput@r{[}=@var{state}@r{]}
 852 @cindex synchronization lines
 853 @cindex location, input
 854 @cindex input location
 855 Control the generation of synchronization lines from the command line.
 856 Synchronization lines are for use by the C preprocessor or other
 857 similar tools.  Order is significant with respect to file names.  This
 858 option is useful, for example, when @code{m4} is used as a
 859 front end to a compiler.  Source file name and line number information
 860 is conveyed by directives of the form @samp{#line @var{linenum}
 861 "@var{file}"}, which are inserted as needed into the middle of the
 862 output.  Such directives mean that the following line originated or was
 863 expanded from the contents of input file @var{file} at line
 864 @var{linenum}.  The @samp{"@var{file}"} part is often omitted when
 865 the file name did not change from the previous directive.
 866
 867 Synchronization directives are always given on complete lines by
 868 themselves.  When a synchronization discrepancy occurs in the middle of
 869 an output line, the associated synchronization directive is delayed
 870 until the next newline that does not occur in the middle of a quoted
 871 string or comment.  @xref{Syncoutput}, for runtime control.  @var{state}
 872 is interpreted the same as the argument to @code{syncoutput}; if
 873 @var{state} is omitted, or @option{--syncoutput} is not used,
 874 synchronization lines are disabled.
 875
 876 @item -U @var{name}
 877 @itemx --undefine=@var{name}
 878 This deletes any predefined meaning @var{name} might have.  Obviously,
 879 only predefined macros can be deleted in this way.  This option may be
 880 given more than once; undefining a @var{name} that does not have a
 881 definition is silently ignored.  Order is significant with respect to
 882 file names.
 883 @end table
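
The following sketch shows @option{-D} and @option{-U} used together in
an interactive session.  The macro name @samp{greeting} is hypothetical
and serves only as an illustration.  Because @code{len} has been
undefined on the command line, it is no longer recognized as a macro and
is copied to the output as plain text.

@comment ignore
@example
$ @kbd{m4 -Dgreeting=hello -Ulen}
greeting len(`abc')
@result{}hello len(abc)
@end example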
 884
 885 @node Limits control
 886 @section Command line options for limits control
 887
 888 There are some limits within @code{m4} that can be tuned.  For
 889 compatibility, @code{m4} also accepts some options that control limits
 890 in other implementations, but which are automatically unbounded (limited
 891 only by your hardware and operating system constraints) in GNU
 892 @code{m4}.
 893
 894 @table @code
 895 @item -g
 896 @itemx --gnu
 897 Enable all the extensions in this implementation.  This is on by
 898 default unless @env{POSIXLY_CORRECT} is set in the environment; it
 899 exists to allow overriding @option{--traditional}.
 900
 901 @item -G
 902 @itemx --posix
 903 @itemx --traditional
 904 Suppress all the extensions made in this implementation, compared to the
 905 System V version.  @xref{Compatibility}, for a list of these.  This
 906 loads the @samp{traditional} module in place of the @samp{gnu} module.
 907 It is implied if @env{POSIXLY_CORRECT} is set in the environment.
 908
 909 @item -L @var{num}
 910 @itemx --nesting-limit=@var{num}
 911 @cindex nesting limit
 912 @cindex limit, nesting
 913 Artificially limit the nesting of macro calls to @var{num} levels,
 914 stopping program execution if this limit is ever exceeded.  When not
 915 specified, nesting is limited to 1024 levels.  A value of zero means
 916 unlimited; but then heavily nested code could potentially cause a stack
 917 overflow.  @var{num} can have an optional scaling suffix.
 918 @comment FIXME - need a node on what scaling suffixes are supported (see
 919 @comment [info coreutils 'block size'] for ideas), and need to consider
 920 @comment whether builtins should also understand scaling suffixes:
 921 @comment eval, mpeval, perhaps format
 922
 923 The precise effect of this option might be more correctly associated
 924 with textual nesting than dynamic recursion.  It has been useful
 925 when some complex @code{m4} input was generated by mechanical means.
 926 Most users would never need this option.  If shown to be obtrusive,
 927 this option (which is still experimental) might well disappear.
 928
 929 @cindex rescanning
 930 This option does @emph{not} have the ability to break endless
 931 rescanning loops, since these do not necessarily consume much memory
 932 or stack space.  Through clever usage of rescanning loops, one can
 933 request complex, time-consuming computations from @code{m4} with useful
 934 results.  Putting limitations in this area would cripple the power of @code{m4}.
 935 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
 936 only the simplest example (but @pxref{Compatibility}).  Expecting GNU
 937 @code{m4} to detect these would be a little like expecting a compiler
 938 system to detect and diagnose endless loops: it is quite a @emph{hard}
 939 problem in general, if not undecidable!
 940
 941 @item -H @var{num}
 942 @itemx --hashsize=@var{num}
 943 @itemx --word-regexp=@var{regexp}
 944 These options are present only for compatibility with previous versions
 945 of GNU @code{m4}.  They do nothing except issue a warning, because the
 946 symbol table size is not fixed anymore, and because the new
 947 @code{changesyntax} feature is more efficient than the withdrawn
 948 experimental @code{changeword}.  These options will eventually disappear
 949 in future releases.
 950
 951 @item -S @var{num}
 952 @itemx -T @var{num}
 953 These options are present for compatibility with System V @code{m4}, but
 954 do nothing in this implementation.  They may disappear in future
 955 releases, and issue a warning to that effect.
 956 @end table
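
To see what kind of input @option{-L} is concerned with, consider the
following sketch.  The macro @samp{deep} is hypothetical; each recursive
call takes place while the argument of an enclosing @code{incr} is
still being collected, so the nesting depth grows with the value of the
argument.

@comment ignore
@example
$ @kbd{m4}
define(`deep', `ifelse(`$1', `0', `0', `incr(deep(decr(`$1')))')')
@result{}
deep(`3')
@result{}3
@end example

With a small limit such as @option{-L 3}, the same call would be
expected to exceed the limit and stop @code{m4}.  The diagnostic is not
shown here, since its exact wording is an implementation detail.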
 957
 958 @node Frozen state
 959 @section Command line options for frozen state
 960
 961 GNU @code{m4} comes with a feature of freezing internal state
 962 (@pxref{Frozen files}).  This can be used to speed up @code{m4}
 963 execution when reusing a common initialization script.
 964
 965 @table @code
 966 @item -F @var{file}
 967 @itemx --freeze-state=@var{file}
 968 Once execution is finished, write out the frozen state on the specified
 969 @var{file}.  It is conventional, but not required, for @var{file} to end
 970 in @samp{.m4f}.
 971
 972 @item -R @var{file}
 973 @itemx --reload-state=@var{file}
 974 Before execution starts, recover the internal state from the specified
 975 frozen @var{file}.  The options @option{-D}, @option{-U}, @option{-t},
 976 @option{-m}, @option{-r}, and @option{--import-environment} take effect
 977 after state is reloaded, but before the input files are read.
 978 @end table
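
A typical pairing of these two options might look like the following
sketch, in which @file{common.m4}, @file{common.m4f}, and
@file{input.m4} are hypothetical file names.  The first command
processes the initialization script and saves the resulting state; the
second starts from that saved state instead of reading
@file{common.m4} again.

@comment ignore
@example
$ @kbd{m4 -F common.m4f common.m4}
$ @kbd{m4 -R common.m4f input.m4}
@end example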
 979
 980 @node Debugging options
 981 @section Command line options for debugging
 982
 983 Finally, there are several options for aiding in debugging @code{m4}
 984 scripts.
 985
 986 @table @code
 987 @item -d@r{[}@r{[}-@r{|}+@r{]}@var{flags}@r{]}
 988 @itemx --debug@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
 989 @itemx --debugmode@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
 990 Set the debug-level according to the flags @var{flags}.  The debug-level
 991 controls the format and amount of information presented by the debugging
 992 functions.  @xref{Debugmode}, for more details on the format and
 993 meaning of @var{flags}.  If omitted, @var{flags} defaults to
 994 @samp{+adeq}.  If the option occurs multiple times, @var{flags} starting
 995 with @samp{-} or @samp{+} are cumulative, while @var{flags} starting
 996 with a letter override all earlier settings.  The debug-level starts
 997 with @samp{d} enabled and all other flags disabled.  To disable all
 998 previously set flags, specify an explicit @var{flags} of @samp{-V}.  For
 999 backward compatibility reasons, the option @option{--fatal-warnings}
1000 implies @samp{--debug=-d} as part of its effects.  The spelling
1001 @option{--debug} is recognized as an unambiguous option for
1002 compatibility with earlier versions of GNU M4, but for
1003 consistency with the builtin name, you can also use the spelling
1004 @option{--debugmode}.  Order is significant with respect to file names.
1005
1006 The cumulative effect of the various options in this example is
1007 equivalent to a single invocation of @code{debugmode(`adlqx')}:
1008
1009 @comment options: -d-V -d+lx --debug --debugmode=-e
1010 @example
1011 $ @kbd{m4 -d+lx --debug --debugmode=-e}
1012 traceon(`len')
1013 @result{}
1014 len(`123')
1015 @error{}m4trace:2: -1- id 2: len(`123')
1016 @result{}3
1017 @end example
1018
1019 @item --debugfile@r{[}=@var{file}@r{]}
1020 @itemx -o @var{file}
1021 @itemx --error-output=@var{file}
1022 Redirect debug messages and trace output to the
1023 named @var{file}.  Warnings, error messages, and @code{errprint} output
1024 are still printed to standard error.  Output from @code{dumpdef} goes to
1025 this file when the debug level @code{o} is not set (@pxref{Debugmode}).
1026 If these options are not used, or
1027 if @var{file} is unspecified (only possible for @option{--debugfile}),
1028 debug output goes to standard error; if @var{file} is the empty string,
1029 debug output is discarded.  @xref{Debugfile}, for more details.  The
1030 option @option{--debugfile} may be given more than once, and order is
1031 significant with respect to file names.  The spellings @option{-o} and
1032 @option{--error-output} are misleading and
1033 inconsistent with other GNU tools; using those spellings will
1034 evoke a warning, and they may be withdrawn or change semantics in a
1035 future release.
1036
1037 @item -l @var{num}
1038 @itemx --debuglen=@var{num}
1039 @itemx --arglength=@var{num}
1040 Restrict the size of the output generated by macro tracing or by
1041 @code{dumpdef} to @var{num} characters per string.  If unspecified or
1042 zero, output is unlimited.  @xref{Debuglen}, for more details.
1043 @var{num} can have an optional scaling suffix.  The spelling
1044 @option{--arglength} is deprecated, since it does not match the
1045 @code{debuglen} macro; using it will evoke a warning, and it may be
1046 withdrawn in a future release.
1047 @comment FIXME - Should we add an option that controls whether output
1048 @comment strings are sanitized with escape sequences, so that dumpdef is
1049 @comment truly one line per macro?
1050 @comment FIXME - see comment on --nesting-limit about NUM.
1051
1052 @item -t @var{name}
1053 @itemx --trace=@var{name}
1054 @itemx --traceon=@var{name}
1055 This enables tracing for the macro @var{name}, at any point where it is
1056 defined.  @var{name} need not be defined when this option is given.
1057 This option may be given more than once, and order is significant with
1058 respect to file names.  @xref{Trace}, for more details.
1059
1060 @item --traceoff=@var{name}
1061 This disables tracing for the macro @var{name}, at any point where it is
1062 defined.  @var{name} need not be defined when this option is given.
1063 This option may be given more than once, and order is significant with
1064 respect to file names.  @xref{Trace}, for more details.
1065 @end table
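
As a sketch of how these options combine, the following session traces
@code{len} and sends the trace to a hypothetical file named
@file{trace.log}, so that only the normal expansion appears on standard
output.  The contents of @file{trace.log} depend on the debug flags in
effect and are not shown.

@comment ignore
@example
$ @kbd{m4 --debugfile=trace.log -t len}
len(`123')
@result{}3
@end example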
1066
1067 @node Command line files
1068 @section Specifying input files on the command line
1069
1070 @cindex command line, file names on the
1071 @cindex file names, on the command line
1072 The remaining arguments on the command line are taken to be input file
1073 names or module names (@pxref{Modules}).  Whether or not any modules
1074 are loaded from command line arguments, when no actual input file names
1075 are given, then standard input is read.  A file name of @file{-} can be
1076 used to denote standard input.  It is conventional, but not required,
1077 for input file names to end in @samp{.m4} and for module names to end
1078 in @samp{.la}.  The input files and modules are attended to in the
1079 sequence given.
1080
1081 Standard input can be read more than once, so the file name @file{-}
1082 may appear multiple times on the command line; this makes a difference
1083 when input is from a terminal or other special file type.  It is an
1084 error if an input file ends in the middle of argument collection, a
1085 comment, or a quoted string.
1086 @comment FIXME - it would be nicer if we let these three things
1087 @comment continue across file boundaries, provided that we warn in
1088 @comment interactive use when switching to stdin in a non-default parse
1089 @comment state.
1090
1091 Various options, such as @option{--define} (@option{-D}), @option{--undefine}
1092 (@option{-U}), @option{--synclines} (@option{-s}), @option{--trace}
1093 (@option{-t}), and @option{--regexp-syntax} (@option{-r}), only take
1094 effect after processing input from any file names that occur earlier
1095 on the command line.  For example, assume the file @file{foo} contains:
1096
1097 @comment file: foo
1098 @example
1099 $ @kbd{cat foo}
1100 bar
1101 @end example
1102
1103 The text @samp{bar} can then be redefined over multiple uses of
1104 @file{foo}:
1105
1106 @comment options: -Dbar=hello foo -Dbar=world foo
1107 @example
1108 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
1109 @result{}hello
1110 @result{}world
1111 @end example
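
Other order-sensitive options behave the same way.  As a sketch that
reuses the file @file{foo} shown above, undefining @samp{bar} between
the two uses is expected to leave the second copy unexpanded:

@comment ignore
@example
$ @kbd{m4 -Dbar=hello foo -Ubar foo}
@result{}hello
@result{}bar
@end example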
1112
1113 @cindex command line, module names on the
1114 @cindex module names, on the command line
1115 The use of loadable runtime modules in any sense is a GNU M4
1116 extension, so if @option{-G} is also passed or if the @env{POSIXLY_CORRECT}
1117 environment variable is set, even otherwise valid module names will be
1118 treated as though they were input file names (and no doubt cause havoc as
1119 M4 tries to scan and expand the contents as if it were written in @code{m4}).
1120
1121 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
1122 exit status of @code{m4} will be 0 for success, 1 for general failure
1123 (such as problems with reading an input file), and 63 for version
1124 mismatch (@pxref{Using frozen files}).
1125
1126 If you need to read a file whose name starts with a @file{-}, you can
1127 specify it as @samp{./-file}, or use @option{--} to mark the end of
1128 options.
1129
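As a sketch of both approaches, assume a hypothetical input file named
@file{-file} in the current directory.  Either of these invocations
treats it as a file name rather than as an option:

@comment ignore
@example
$ @kbd{m4 ./-file}
$ @kbd{m4 -- -file}
@end example
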
1130 @ignore
1131 @comment Test that 'm4 file/' detects that file is not a directory; we
1132 @comment can assume that the current directory contains a Makefile.
1133 @comment mingw fails with EINVAL rather than ENOTDIR.
1134
1135 @comment status: 1
1136 @comment xerr: ignore
1137 @comment options: Makefile/
1138 @example
1139 @error{}m4: cannot open file 'Makefile/': No such file or directory
1140 @end example
1141
1142 @comment Test that closed stderr does not cause a crash.  Not all
1143 @comment systems have the same message for EBADF.
1144
1145 @comment xerr: ignore
1146 @example
1147 ifdef(`__unix__', ,
1148       `errprint(` skipping: syscmd does not have unix semantics
1149 ')m4exit(`77')')dnl
1150 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1151        `errprint(` skipping: system does not allow closing stdout
1152 ')m4exit(`77')')dnl
1153 changequote(`[', `]')dnl
1154 syscmd([echo | ']__program__[' >&-])dnl
1155 @error{}m4: write error: Bad file descriptor
1156 sysval
1157 @result{}1
1158 @end example
1159
1160 @example
1161 ifdef(`__unix__', ,
1162       `errprint(` skipping: syscmd does not have unix semantics
1163 ')m4exit(`77')')dnl
1164 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1165        `errprint(` skipping: system does not allow closing stdout
1166 ')m4exit(`77')')dnl
1167 changequote(`[', `]')dnl
1168 syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
1169 )d"nl)dnl' > tmp.m4 \
1170   && ']__program__[' tmp.m4 <&- >&- \
1171   && rm tmp.m4])sysval
1172 @error{}hi
1173 @error{}bye
1174 @result{}0
1175 @end example
1176
1177 @comment Test that we obey POSIX semantics with -D interspersed with
1178 @comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
1179
1180 $ @kbd{m4 }
1181 @example
1182 ifdef(`__unix__', ,
1183       `errprint(` skipping: syscmd does not have unix semantics
1184 ')m4exit(`77')')dnl
1185 changequote(`[', `]')dnl
1186 syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
1187 @result{}hello
1188 @result{}world
1189 sysval
1190 @result{}0
1191 @end example
1192 @end ignore
1193
1194 @node Syntax
1195 @chapter Lexical and syntactic conventions
1196
1197 @cindex input tokens
1198 @cindex tokens
1199 As @code{m4} reads its input, it separates it into @dfn{tokens}.  A
1200 token is either a name, a quoted string, or any single character that
1201 is not part of either a name or a string.  Input to @code{m4} can also
1202 contain comments.  GNU @code{m4} does not yet understand
1203 multibyte locales; all operations are byte-oriented rather than
1204 character-oriented (although if your locale uses a single-byte
1205 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1206 However, @code{m4} is eight-bit clean, so you can
1207 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1208 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1209 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1210
1211 @comment FIXME - each builtin needs to document how it handles NUL, then
1212 @comment update the above paragraph to mention that NUL is now handled
1213 @comment transparently.
1214
1215 @menu
1216 * Names::                       Macro names
1217 * Quoted strings::              Quoting input to @code{m4}
1218 * Comments::                    Comments in @code{m4} input
1219 * Other tokens::                Other kinds of input tokens
1220 * Input processing::            How @code{m4} copies input to output
1221 * Regular expression syntax::   How @code{m4} interprets regular expressions
1222 @end menu
1223
1224 @node Names
1225 @section Macro names
1226
1227 @cindex names
1228 @cindex words
1229 A name is any sequence of letters, digits, and the character @samp{_}
1230 (underscore), where the first character is not a digit.  @code{m4} will
1231 use the longest such sequence found in the input.  If a name has a
1232 macro definition, it will be subject to macro expansion
1233 (@pxref{Macros}).  Names are case-sensitive.
1234
1235 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
1236
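The following sketch, which uses the hypothetical macro name
@samp{foo}, shows the effect of the longest-sequence rule and of case
sensitivity.  Only the exact name @samp{foo} is expanded;
@samp{Foo}, @samp{foo1}, and @samp{_foo} are different names with no
definition, so they are copied through unchanged.

@comment ignore
@example
define(`foo', `bar')
@result{}
foo Foo foo1 _foo
@result{}bar Foo foo1 _foo
@end example
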
1237 The definitions of letters, digits and other input characters can be
1238 changed at any time, using the builtin macro @code{changesyntax}.
1239 @xref{Changesyntax}, for more information.
1240
1241 @node Quoted strings
1242 @section Quoting input to @code{m4}
1243
1244 @cindex quoted string
1245 @cindex string, quoted
1246 A quoted string is a sequence of characters surrounded by quote
1247 strings, defaulting to
1248 @samp{`} (grave-accent, also known as back-tick, with UCS value U+0060)
1249 and @samp{'} (apostrophe, also known as single-quote, with UCS value
1250 U+0027), where the nested begin and end quotes within the
1251 string are balanced.  The value of a string token is the text, with one
1252 level of quotes stripped off.  Thus
1253
1254 @comment ignore
1255 @example
1256 `'
1257 @result{}
1258 @end example
1259
1260 @noindent
1261 is the empty string, and double-quoting turns into single-quoting.
1262
1263 @comment ignore
1264 @example
1265 ``quoted''
1266 @result{}`quoted'
1267 @end example
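
Two further consequences can be seen in the following sketch, where the
macro name @samp{x} is hypothetical.  A quoted name is not expanded,
since only the text inside the quotes is passed on, and nested quotes
survive intact as long as they are balanced.

@comment ignore
@example
define(`x', `y')
@result{}
x `x' `outer `inner' text'
@result{}y x outer `inner' text
@end example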
1268
1269 The quote characters can be changed at any time, using the builtin macros
1270 @code{changequote} (@pxref{Changequote}) or @code{changesyntax}
1271 (@pxref{Changesyntax}).
1272
1273 @node Comments
1274 @section Comments in @code{m4} input
1275
1276 @cindex comments
1277 Comments in @code{m4} are normally delimited by the characters @samp{#}
1278 and newline.  All characters between the comment delimiters are ignored,
1279 but the entire comment (including the delimiters) is passed through to
1280 the output, unless you supply the @option{--discard-comments} or
1281 @option{-c} option at the command line (@pxref{Operation modes, ,
1282 Invoking m4}).  When discarding comments, the comment delimiters are
1283 discarded, even if the close-comment string is a newline.
1284
1285 Comments cannot be nested, so the first newline after a @samp{#} ends
1286 the comment.  The commenting effect of the begin-comment string
1287 can be inhibited by quoting it.
1288
1289 @example
1290 $ @kbd{m4}
1291 `quoted text' # `commented text'
1292 @result{}quoted text # `commented text'
1293 `quoting inhibits' `#' `comments'
1294 @result{}quoting inhibits # comments
1295 @end example
1296
1297 @comment options: -c
1298 @example
1299 $ @kbd{m4 -c}
1300 `quoted text' # `commented text'
1301 `quoting inhibits' `#' `comments'
1302 @result{}quoted text quoting inhibits # comments
1303 @end example
1304
1305 The comment delimiters can be changed to any string at any time, using
1306 the builtin macros @code{changecom} (@pxref{Changecom}) or
1307 @code{changesyntax} (@pxref{Changesyntax}).
1308
1309 @node Other tokens
1310 @section Other kinds of input tokens
1311
1312 @cindex tokens, special
1313 Any character that is not part of a name, a quoted string, or a
1314 comment is a token by itself.  When not in the context of macro
1315 expansion, all of these tokens are just copied to output.  However,
1316 during macro expansion, whitespace characters (space, tab, newline,
1317 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1318 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1319 roles, explained later.  Which characters actually perform these roles
1320 can be adjusted with @code{changesyntax} (@pxref{Changesyntax}).
1321
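As a small sketch of tokens being copied when they are not part of a
macro call, consider the hypothetical macro @samp{foo} below.  Here
@samp{foo} is followed by a space rather than directly by an open
parenthesis, so it is expanded without arguments, and the remaining
parentheses, comma, and dollar sign are ordinary tokens that pass
through to the output.

@comment ignore
@example
define(`foo', `[$1]')
@result{}
foo (a) , $1
@result{}[] (a) , $1
@end example
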
1322 @node Input processing
1323 @section How @code{m4} copies input to output
1324
1325 As @code{m4} reads the input token by token, it copies each token
1326 directly to the output.
1327
1328 The exception is when it finds a word with a macro definition.  In that
1329 case @code{m4} will calculate the macro's expansion, possibly reading
1330 more input to get the arguments.  It then inserts the expansion in front
1331 of the remaining input.  In other words, the resulting text from a macro
1332 call will be read and parsed into tokens again.
1333
1334 @code{m4} expands a macro as soon as possible.  If it finds a macro call
1335 when collecting the arguments to another, it will expand the second call
1336 first.  This process continues until there are no more macro calls to
1337 expand and all the input has been consumed.
1338
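The rescanning step can be seen in isolation in the following sketch,
where the macro names @samp{a} and @samp{b} are hypothetical.  The
expansion of @samp{a} is the text @samp{b}, which is pushed back onto
the input, read again as a name, and expanded in turn.

@comment ignore
@example
define(`a', `b')
@result{}
define(`b', `c')
@result{}
a
@result{}c
@end example
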
1339 For a running example, examine how @code{m4} handles this input:
1340
1341 @comment ignore
1342 @example
1343 format(`Result is %d', eval(`2**15'))
1344 @end example
1345
1346 @noindent
1347 First, @code{m4} sees that the token @samp{format} is a macro name, so
1348 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1349 and @samp{@w{ }}, before encountering another potential macro.  Sure
1350 enough, @samp{eval} is a macro name, so the nested argument collection
1351 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1352 with the lone argument of @samp{2**15}.  The expansion of
1353 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1354 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1355 combined with the next @samp{)}, the format macro now has all its
1356 arguments, as if the user had typed:
1357
1358 @comment ignore
1359 @example
1360 format(`Result is %d', 32768)
1361 @end example
1362
1363 @noindent
1364 The format macro expands to @samp{Result is 32768}, and we have another
1365 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1366 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1367 @samp{8}.  None of these are macros, so the final output is
1368
1369 @comment ignore
1370 @example
1371 @result{}Result is 32768
1372 @end example
1373
1374 As a more complicated example, we will contrast an actual code example
1375 from the Gnulib project@footnote{Derived from a patch in
1376 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1377 and a followup patch in
1378 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1379 showing both a buggy approach and the desired results.  The user desires
1380 to output a shell assignment statement that takes its argument and turns
1381 it into a shell variable by converting it to uppercase and prepending a
1382 prefix.  The original attempt looks like this:
1383
1384 @example
1385 changequote([,])dnl
1386 define([gl_STRING_MODULE_INDICATOR],
1387   [
1388     dnl comment
1389     GNULIB_]translit([$1],[a-z],[A-Z])[=1
1390   ])dnl
1391   gl_STRING_MODULE_INDICATOR([strcase])
1392 @result{} @w{ }
1393 @result{}        GNULIB_strcase=1
1394 @result{} @w{ }
1395 @end example
1396
1397 Oops -- the argument did not get capitalized.  And although the manual
1398 is not able to easily show it, both lines that appear empty actually
1399 contain two trailing spaces.  By stepping through the parse, it is easy
1400 to see what happened.  First, @code{m4} sees the token
1401 @samp{changequote}, which it recognizes as a macro, followed by
1402 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1403 argument list.  The macro expands to the empty string, but changes the
1404 quoting characters to something more useful for generating shell code
1405 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1406 but unbalanced @samp{[]} tend to be rare).  Also in the first line,
1407 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1408 macro that consumes the rest of the line, resulting in no output for
1409 that line.
1410
1411 The second line starts a macro definition.  @code{m4} sees the token
1412 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1413 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}.  Because an unquoted
1414 comma was encountered, the first argument is known to be the expansion
1415 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1416 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1417 whitespace is discarded as part of argument collection.  Then comes a
1418 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1419 comment@key{NL}@ @ @ @ GNULIB_]}.  This is followed by the token
1420 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1421 macro expansion has started.
1422
1423 The arguments to @code{translit} are found by the tokens @samp{(},
1424 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1425 @samp{)}.  All three string arguments are expanded (or in other words,
1426 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1427 capitalization, the result of the macro is @samp{$1}.  This expansion is
1428 rescanned, resulting in the two literal characters @samp{$} and
1429 @samp{1}.
1430
1431 Scanning of the outer macro resumes, and picks up with
1432 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}.  The collected pieces of
1433 expanded text are concatenated, with the end result that the macro
1434 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1435 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1436 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1437
1438 The final line is then parsed, beginning with @samp{ } and @samp{ }
1439 that are output literally.  Then @samp{gl_STRING_MODULE_INDICATOR} is
1440 recognized as a macro name, with an argument list of @samp{(},
1441 @samp{[strcase]}, and @samp{)}.  Since the definition of the macro
1442 contains the sequence @samp{$1}, that sequence is replaced with the
1443 argument @samp{strcase} prior to starting the rescan.  The rescan sees
1444 @samp{@key{NL}} and four spaces, which are output literally, then
1445 @samp{dnl}, which discards the text @samp{ comment@key{NL}}.  Next
1446 comes four more spaces, also output literally, and the token
1447 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1448 substitution.  Since that is not a macro name, it is output literally,
1449 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1450 two more spaces.  Finally, the original @samp{@key{NL}} seen after the
1451 macro invocation is scanned and output literally.
1452
1453 Now for a corrected approach.  This rearranges the use of newlines and
1454 whitespace so that less whitespace is output (which, although harmless
1455 to shell scripts, can be visually unappealing), and fixes the quoting
1456 issues so that the capitalization occurs when the macro
1457 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1458 defined.  It also adds another layer of quoting to the first argument of
1459 @code{translit}, to ensure that the output will be rescanned as a string
1460 rather than a potential uppercase macro name needing further expansion.
1461
1462 @example
1463 changequote([,])dnl
1464 define([gl_STRING_MODULE_INDICATOR],
1465   [dnl comment
1466   GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1467 ])dnl
1468   gl_STRING_MODULE_INDICATOR([strcase])
1469 @result{}    GNULIB_STRCASE=1
1470 @end example
1471
1472 The parsing of the first line is unchanged.  The second line sees the
1473 name of the macro to define, then sees the discarded @samp{@key{NL}}
1474 and two spaces, as before.  But this time, the next token is
1475 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1476 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1477 @samp{)} to end the macro definition and @samp{dnl} to skip the
1478 newline.  No early expansion of @code{translit} occurs, so the entire
1479 string becomes the definition of the macro.
1480
1481 The final line is then parsed, beginning with two spaces that are
1482 output literally, and an invocation of
1483 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1484 Again, the @samp{$1} in the macro definition is substituted prior to
1485 rescanning.  Rescanning first encounters @samp{dnl}, and discards
1486 @samp{ comment@key{NL}}.  Then two spaces are output literally.  Next
1487 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1488 output literally.  The token @samp{[]} is an empty string, so it does
1489 not affect output.  Then the token @samp{translit} is encountered.
1490
1491 This time, the arguments to @code{translit} are parsed as @samp{(},
1492 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1493 @samp{[A-Z]}, and @samp{)}.  The two spaces are discarded, and the
1494 @code{translit} call produces the desired result @samp{[STRCASE]}.  This is
1495 rescanned, but since it is a string, the quotes are stripped and the
1496 only output is a literal @samp{STRCASE}.
1497 Then the scanner sees @samp{=} and @samp{1}, which are output
1498 literally, followed by @samp{dnl} which discards the rest of the
1499 definition of @code{gl_STRING_MODULE_INDICATOR}.  The newline at the
1500 end of output is the literal @samp{@key{NL}} that appeared after the
1501 invocation of the macro.
1502
1503 The order in which @code{m4} expands the macros can be further explored
1504 using the trace facilities of GNU @code{m4} (@pxref{Trace}).
1505
1506 @node Regular expression syntax
1507 @section How @code{m4} interprets regular expressions
1508
1509 There are several contexts where @code{m4} parses an argument as a
1510 regular expression.  This section describes the various flavors of
1511 regular expressions.  @xref{Changeresyntax}.
1512
1513 @include regexprops-generic.texi
1514
1515 @node Macros
1516 @chapter How to invoke macros
1517
1518 This chapter covers macro invocation, macro arguments and how macro
1519 expansion is treated.
1520
1521 @menu
1522 * Invocation::                  Macro invocation
1523 * Inhibiting Invocation::       Preventing macro invocation
1524 * Macro Arguments::             Macro arguments
1525 * Quoting Arguments::           On Quoting Arguments to macros
1526 * Macro expansion::             Expanding macros
1527 @end menu
1528
1529 @node Invocation
1530 @section Macro invocation
1531
1532 @cindex macro invocation
1533 @cindex invoking macros
1534 A macro invocation has one of the forms
1535
1536 @comment ignore
1537 @example
1538 name
1539 @end example
1540
1541 @noindent
1542 which is a macro invocation without any arguments, or
1543
1544 @comment ignore
1545 @example
1546 name(arg1, arg2, @dots{}, arg@var{n})
1547 @end example
1548
1549 @noindent
1550 which is a macro invocation with @var{n} arguments.  Macros can have any
1551 number of arguments.  All arguments are strings, but different macros
1552 might interpret the arguments in different ways.
1553
1554 The opening parenthesis @emph{must} follow the @var{name} directly, with
1555 no spaces in between.  If it does not, the macro is called with no
1556 arguments at all.
1557
1558 For a macro call to have no arguments, the parentheses @emph{must} be
1559 left out.  The macro call
1560
1561 @comment ignore
1562 @example
1563 name()
1564 @end example
1565
1566 @noindent
1567 is a macro call with one argument, which is the empty string, not a call
1568 with no arguments.
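
As a brief illustration of these rules, the following sketch defines a
hypothetical macro @code{show} (it is not part of @code{m4}) that
reports its argument count with @code{$#} (@pxref{Pseudo Arguments}),
followed by its first argument in brackets.  Note how the space before
the parenthesis in the last call leaves the parenthesized text as plain
output:

@example
define(`show', `$#:[$1]')
@result{}
show
@result{}0:[]
show()
@result{}1:[]
show(`a')
@result{}1:[a]
show (`a')
@result{}0:[] (a)
@end example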
1569
1570 @node Inhibiting Invocation
1571 @section Preventing macro invocation
1572
1573 An innovation of the @code{m4} language, compared to some of its
1574 predecessors (like Strachey's @code{GPM}, for example), is the ability
1575 to recognize macro calls without resorting to any special, prefixed
1576 invocation character.  While generally useful, this feature might
1577 sometimes be the source of spurious, unwanted macro calls.  So, GNU
1578 @code{m4} offers several mechanisms or techniques for inhibiting the
1579 recognition of names as macro calls.
1580
1581 @cindex GNU extensions
1582 @cindex blind macro
1583 @cindex macro, blind
1584 First of all, many builtin macros cannot meaningfully be called without
1585 arguments.  As a GNU extension, for any of these macros,
1586 whenever an opening parenthesis does not immediately follow their name,
1587 the builtin macro call is not triggered.  This solves the most usual
1588 cases, like for @samp{include} or @samp{eval}.  Later in this document,
1589 the sentence ``This macro is recognized only with parameters'' refers to
1590 this specific provision of GNU M4, also known as a blind
1591 builtin macro.  For the builtins defined by POSIX that bear
1592 this disclaimer, POSIX specifically states that invoking those
1593 builtins without arguments is unspecified, because many other
1594 implementations simply invoke the builtin as though it were given one
1595 empty argument instead.
1596
1597 @example
1598 $ @kbd{m4}
1599 eval
1600 @result{}eval
1601 eval(`1')
1602 @result{}1
1603 @end example
1604
1605 There is also a command line option (@option{--prefix-builtins}, or
1606 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1607 builtin macros with a prefix of @samp{m4_} at startup.  The option has
1608 no effect whatsoever on user defined macros.  For example, with this option,
1609 one has to write @code{m4_dnl} and even @code{m4_m4exit}.  It also has
1610 no effect on whether a macro requires parameters.
1611
1612 @comment options: -P
1613 @example
1614 $ @kbd{m4 -P}
1615 eval
1616 @result{}eval
1617 eval(`1')
1618 @result{}eval(1)
1619 m4_eval
1620 @result{}m4_eval
1621 m4_eval(`1')
1622 @result{}1
1623 @end example
1624
1625 Another alternative is to redefine problematic macros to a name less
1626 likely to cause conflicts, using @ref{Definitions}.  Or the parsing
1627 engine can be changed to redefine what constitutes a valid macro name,
1628 using @ref{Changesyntax}.
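
For instance, a single builtin can be moved out of the way with
@code{defn} (@pxref{Defn}) and @code{undefine} (@pxref{Undefine}).  In
this sketch, the replacement name @code{m4_index} is only an
illustration, chosen to resemble what @option{-P} would produce:

@example
define(`m4_index', defn(`index'))undefine(`index')
@result{}
index(`abc', `b')
@result{}index(abc, b)
m4_index(`abc', `b')
@result{}1
@end example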
1629
1630 Of course, the simplest way to prevent a name from being interpreted
1631 as a call to an existing macro is to quote it.  The remainder of
1632 this section studies a little more deeply how quoting affects macro
1633 invocation, and how quoting can be used to inhibit macro invocation.
1634
1635 Although quoting is usually done over the whole macro name, it can also
1636 be done over only a few characters of this name (provided, of course,
1637 that the unquoted portions are not also a macro).  It is also possible
1638 to quote the empty string, but this works only @emph{inside} the name.
1639 For example:
1640
1641 @example
1642 `divert'
1643 @result{}divert
1644 `d'ivert
1645 @result{}divert
1646 di`ver't
1647 @result{}divert
1648 div`'ert
1649 @result{}divert
1650 @end example
1651
1652 @noindent
1653 all yield the string @samp{divert}.  While in both:
1654
1655 @example
1656 `'divert
1657 @result{}
1658 divert`'
1659 @result{}
1660 @end example
1661
1662 @noindent
1663 the @code{divert} builtin macro will be called, which expands to the
1664 empty string.
1665
1666 @cindex rescanning
1667 The output of macro evaluations is always rescanned.  In the following
1668 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1669 if @code{m4}
1670 had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1671
1672 @example
1673 define(`cde', `CDE')
1674 @result{}
1675 define(`x', `substr(ab')
1676 @result{}
1677 define(`y', `cde, `1', `3')')
1678 @result{}
1679 x`'y
1680 @result{}bCD
1681 @end example
1682
1683 Unquoted strings on either side of a quoted string are subject to
1684 being recognized as macro names.  In the following example, quoting the
1685 empty string allows for the second @code{macro} to be recognized as such:
1686
1687 @example
1688 define(`macro', `m')
1689 @result{}
1690 macro(`m')macro
1691 @result{}mmacro
1692 macro(`m')`'macro
1693 @result{}mm
1694 @end example
1695
1696 Quoting may prevent recognizing as a macro name the concatenation of a
1697 macro expansion with the surrounding characters.  In this example:
1698
1699 @example
1700 define(`macro', `di$1')
1701 @result{}
1702 macro(`v')`ert'
1703 @result{}divert
1704 macro(`v')ert
1705 @result{}
1706 @end example
1707
1708 @noindent
1709 the first input produces the string @samp{divert}.  When the quotes are
1710 removed, the @code{divert} builtin is called instead.
1711
1712 @node Macro Arguments
1713 @section Macro arguments
1714
1715 @cindex macros, arguments to
1716 @cindex arguments to macros
1717 When a name is seen, and it has a macro definition, it will be expanded
1718 as a macro.
1719
1720 If the name is followed by an opening parenthesis, the arguments will be
1721 collected before the macro is called.  If too few arguments are
1722 supplied, the missing arguments are taken to be the empty string.
1723 However, some builtins are documented to behave differently for a
1724 missing optional argument than for an explicit empty string.  If there
1725 are too many arguments, the excess arguments are ignored.  Unquoted
1726 leading whitespace is stripped off all arguments, but whitespace
1727 generated by a macro expansion or occurring after a macro that expanded
1728 to an empty string remains intact.  Whitespace includes space, tab,
1729 newline, carriage return, vertical tab, and formfeed.
1730
1731 @example
1732 define(`macro', `$1')
1733 @result{}
1734 macro( unquoted leading space lost)
1735 @result{}unquoted leading space lost
1736 macro(` quoted leading space kept')
1737 @result{} quoted leading space kept
1738 macro(
1739  divert `unquoted space kept after expansion')
1740 @result{} unquoted space kept after expansion
1741 macro(macro(`
1742 ')`whitespace from expansion kept')
1743 @result{}
1744 @result{}whitespace from expansion kept
1745 macro(`unquoted trailing whitespace kept'
1746 )
1747 @result{}unquoted trailing whitespace kept
1748 @result{}
1749 @end example
1750
1751 @cindex warnings, suppressing
1752 @cindex suppressing warnings
1753 Normally @code{m4} will issue warnings if a builtin macro is called
1754 with an inappropriate number of arguments, but they can be suppressed with
1755 the @option{--quiet} command line option (or @option{--silent}, or
1756 @option{-Q}, @pxref{Operation modes, , Invoking m4}).  For user
1757 defined macros, there is no check of the number of arguments given.
1758
1759 @example
1760 $ @kbd{m4}
1761 index(`abc')
1762 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
1763 @result{}0
1764 index(`abc',)
1765 @result{}0
1766 index(`abc', `b', `0', `ignored')
1767 @error{}m4:stdin:3: warning: index: extra arguments ignored: 4 > 3
1768 @result{}1
1769 @end example
1770
1771 @comment options: -Q
1772 @example
1773 $ @kbd{m4 -Q}
1774 index(`abc')
1775 @result{}0
1776 index(`abc',)
1777 @result{}0
1778 index(`abc', `b', `', `ignored')
1779 @result{}1
1780 @end example
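
A user-defined macro, by contrast, silently accepts any number of
arguments.  In this sketch, @code{pair} is a hypothetical macro that
uses only its first two arguments; a missing argument is treated as
empty and an extra one is dropped, with no diagnostic in either case:

@example
define(`pair', `<$1|$2>')
@result{}
pair(`a')
@result{}<a|>
pair(`a', `b', `c')
@result{}<a|b>
@end example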
1781
1782 Macros are expanded normally during argument collection, and any
1783 commas, quotes and parentheses that show up in the resulting
1784 expanded text will serve to define the arguments as well.  Thus, if
1785 @var{foo} expands to @samp{, b, c}, the macro call
1786
1787 @comment ignore
1788 @example
1789 bar(a foo, d)
1790 @end example
1791
1792 @noindent
1793 is a macro call with four arguments, which are @samp{a }, @samp{b},
1794 @samp{c} and @samp{d}.  To understand why the first argument contains
1795 whitespace, remember that unquoted leading whitespace is never part
1796 of an argument, but trailing whitespace always is.
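
The argument splitting can be observed directly.  The following sketch
supplies hypothetical definitions for @code{foo} and @code{bar}, where
@code{bar} prints its argument count (@code{$#}, @pxref{Pseudo Arguments})
and then each of its first four arguments in brackets:

@example
define(`foo', `, b, c')
@result{}
define(`bar', `$#: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example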
1797
1798 It is possible for a macro's definition to change during argument
1799 collection, in which case the expansion uses the definition that was in
1800 effect at the time the opening @samp{(} was seen.
1801
1802 @example
1803 define(`f', `1')
1804 @result{}
1805 f(define(`f', `2'))
1806 @result{}1
1807 f
1808 @result{}2
1809 @end example
1810
1811 It is an error if the end of file occurs while collecting arguments.
1812
1813 @comment status: 1
1814 @example
1815 hello world
1816 @result{}hello world
1817 define(
1818 ^D
1819 @error{}m4:stdin:2: define: end of file in argument list
1820 @end example
1821
1822 @node Quoting Arguments
1823 @section On Quoting Arguments to macros
1824
1825 @cindex quoted macro arguments
1826 @cindex macros, quoted arguments to
1827 @cindex arguments, quoted macro
1828 Each argument has unquoted leading whitespace removed.  Within each
1829 argument, all unquoted parentheses must match.  For example, if
1830 @var{foo} is a macro,
1831
1832 @comment ignore
1833 @example
1834 foo(() (`(') `(')
1835 @end example
1836
1837 @noindent
1838 is a macro call, with one argument, whose value is @samp{() (() (}.
1839 Commas separate arguments, except when they occur inside quotes,
1840 comments, or unquoted parentheses.  @xref{Pseudo Arguments}, for
1841 examples.
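
To see the collected argument, @code{foo} can be given a definition.
The one used in this sketch is hypothetical and merely echoes its
argument count (@code{$#}, @pxref{Pseudo Arguments}) and its first
argument:

@example
define(`foo', `$#: [$1]')
@result{}
foo(() (`(') `(')
@result{}1: [() (() (]
@end example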
1842
1843 It is common practice to quote all arguments to macros, unless you are
1844 sure you want the arguments expanded.  Thus, in the above
1845 example with the parentheses, the `right' way to do it is like this:
1846
1847 @comment ignore
1848 @example
1849 foo(`() (() (')
1850 @end example
1851
1852 @cindex quoting rule of thumb
1853 @cindex rule of thumb, quoting
1854 It is, however, in certain cases necessary (because nested expansion
1855 must occur to create the arguments for the outer macro) or convenient
1856 (because it uses fewer characters) to leave out quotes for some
1857 arguments, and there is nothing wrong in doing it.  It just makes life a
1858 bit harder, if you are not careful to follow a consistent quoting style.
1859 For consistency, this manual follows the rule of thumb that each layer
1860 of parentheses introduces another layer of single quoting, except when
1861 showing the consequences of quoting rules.  This is done even when the
1862 quoted string cannot be a macro, such as with integers when you have not
1863 changed the syntax via @code{changesyntax} (@pxref{Changesyntax}).
1864
1865 The quoting rule of thumb of one level of quoting per level of parentheses has a
1866 nice property: when a macro name appears inside parentheses, you can
1867 determine when it will be expanded.  If it is not quoted, it will be
1868 expanded prior to the outer macro, so that its expansion becomes the
1869 argument.  If it is single-quoted, it will be expanded after the outer
1870 macro.  And if it is double-quoted, it will be used as literal text
1871 instead of a macro name.
1872
1873 @example
1874 define(`active', `ACT, IVE')
1875 @result{}
1876 define(`show', `$1 $1')
1877 @result{}
1878 show(active)
1879 @result{}ACT ACT
1880 show(`active')
1881 @result{}ACT, IVE ACT, IVE
1882 show(``active'')
1883 @result{}active active
1884 @end example
1885
1886 @node Macro expansion
1887 @section Macro expansion
1888
1889 @cindex macros, expansion of
1890 @cindex expansion of macros
1891 When the arguments, if any, to a macro call have been collected, the
1892 macro is expanded, and the expansion text is pushed back onto the input
1893 (unquoted), and reread.  The expansion text from one macro call might
1894 therefore result in more macros being called, if the calls are included,
1895 completely or partially, in the first macro call's expansion.
1896
1897 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1898 @var{bar} expands to @samp{Hello world}, the input
1899
1900 @comment options: -Dbar='Hello world' -Dfoo=bar
1901 @example
1902 $ @kbd{m4 -Dbar="Hello world" -Dfoo=bar}
1903 foo
1904 @result{}Hello world
1905 @end example
1906
1907 @noindent
1908 will expand first to @samp{bar}, and when this is reread and
1909 expanded, into @samp{Hello world}.
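
A call can also be only partially contained in an expansion.  In this
sketch, the hypothetical macro @code{start} expands to the beginning of
a call to @code{len}, and the call is completed by the input text that
follows the invocation of @code{start}:

@example
define(`start', `len(')
@result{}
start`abc')
@result{}3
@end example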
1910
1911 @node Definitions
1912 @chapter How to define new macros
1913
1914 @cindex macros, how to define new
1915 @cindex defining new macros
1916 Macros can be defined, redefined and deleted in several different ways.
1917 Also, it is possible to redefine a macro without losing a previous
1918 value, and bring back the original value at a later time.
1919
1920 @menu
1921 * Define::                      Defining a new macro
1922 * Arguments::                   Arguments to macros
1923 * Pseudo Arguments::            Special arguments to macros
1924 * Undefine::                    Deleting a macro
1925 * Defn::                        Renaming macros
1926 * Pushdef::                     Temporarily redefining macros
1927 * Renamesyms::                  Renaming macros with regular expressions
1928
1929 * Indir::                       Indirect call of macros
1930 * Builtin::                     Indirect call of builtins
1931 * M4symbols::                   Getting the defined macro names
1932 @end menu
1933
1934 @node Define
1935 @section Defining a macro
1936
1937 The normal way to define or redefine macros is to use the builtin
1938 @code{define}:
1939
1940 @deffn {Builtin (m4)} define (@var{name}, @ovar{expansion})
1941 Defines @var{name} to expand to @var{expansion}.  If
1942 @var{expansion} is not given, it is taken to be empty.
1943
1944 The expansion of @code{define} is void.
1945 The macro @code{define} is recognized only with parameters.
1946 @end deffn
1947 @comment Other implementations, such as Solaris, can define a macro
1948 @comment with a builtin token attached to text:
1949 @comment  define(foo, a`'defn(`divnum')b)
1950 @comment  defn(`foo') => ab
1951 @comment  dumpdef(`foo') => foo: a<divnum>b
1952 @comment  len(defn(`foo')) => 3
1953 @comment  index(defn(`foo'), defn(`divnum')) => 1
1954 @comment  foo => a0b
1955 @comment It may be worth making some changes to support this behavior,
1956 @comment or something similar to it.
1957 @comment
1958 @comment But be sure it has sane semantics, with potentially deferred
1959 @comment expansion of builtins.  For example, this should not warn
1960 @comment about trying to access the definition of an undefined macro:
1961 @comment  define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops')
1962 @comment Also, think how to handle conflicting argument counts:
1963 @comment  define(`bar', defn(`dnl', `len'))
1964
1965 The following example defines the macro @var{foo} to expand to the text
1966 @samp{Hello world.}.
1967
1968 @example
1969 define(`foo', `Hello world.')
1970 @result{}
1971 foo
1972 @result{}Hello world.
1973 @end example
1974
1975 The empty line in the output is there because the newline is not
1976 a part of the macro definition, and it is consequently copied to
1977 the output.  This can be avoided by use of the macro @code{dnl}.
1978 @xref{Dnl}, for details.
1979
1980 The first argument to @code{define} should be quoted; otherwise, if the
1981 macro is already defined, you will be defining a different macro.  This
1982 example shows the problems with underquoting, since we did not want to
1983 redefine @code{one}:
1984
1985 @example
1986 define(foo, one)
1987 @result{}
1988 define(foo, two)
1989 @result{}
1990 one
1991 @result{}two
1992 @end example
1993
1994 @cindex GNU extensions
1995 GNU @code{m4} normally replaces only the @emph{topmost}
1996 definition of a macro if it has several definitions from @code{pushdef}
1997 (@pxref{Pushdef}).  Some other implementations of @code{m4} replace all
1998 definitions of a macro with @code{define}.  @xref{Incompatibilities},
1999 for more details.
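
The following sketch shows the GNU behavior.  After the @code{pushdef},
@code{define} replaces only the newest definition, so the original
definition reappears after @code{popdef}:

@example
define(`foo', `first')
@result{}
pushdef(`foo', `second')
@result{}
define(`foo', `third')
@result{}
foo
@result{}third
popdef(`foo')
@result{}
foo
@result{}first
@end example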
2000
2001 As a GNU extension, the first argument to @code{define} does
2002 not have to be a simple word.
2003 It can be any text string, even the empty string.  A macro with a
2004 non-standard name cannot be invoked in the normal way, as the name is
2005 not recognized.  It can only be referenced by the builtins @code{indir}
2006 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
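
As a small sketch, the name @samp{my macro} below is hypothetical and
contains a space, so it can never be recognized as a single word in the
input:

@example
define(`my macro', `text')
@result{}
my macro
@result{}my macro
indir(`my macro')
@result{}text
defn(`my macro')
@result{}text
@end example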
2007
2008 @cindex arrays
2009 Arrays and associative arrays can be simulated by using non-standard
2010 macro names.
2011
2012 @deffn Composite array (@var{index})
2013 @deffnx Composite array_set (@var{index}, @ovar{value})
2014 Provide access to entries within an array.  @code{array} reads the entry
2015 at location @var{index}, and @code{array_set} assigns @var{value} to
2016 location @var{index}.
2017 @end deffn
2018
2019 @example
2020 define(`array', `defn(format(``array[%d]'', `$1'))')
2021 @result{}
2022 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
2023 @result{}
2024 array_set(`4', `array element no. 4')
2025 @result{}
2026 array_set(`17', `array element no. 17')
2027 @result{}
2028 array(`4')
2029 @result{}array element no. 4
2030 array(eval(`10 + 7'))
2031 @result{}array element no. 17
2032 @end example
2033
2034 Change the @samp{%d} to @samp{%s} and it is an associative array.
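
A sketch of that variation follows.  The names @code{dict} and
@code{dict_set} are hypothetical, and the keys are assumed not to be
macro names themselves:

@example
define(`dict', `defn(format(``dict[%s]'', `$1'))')
@result{}
define(`dict_set', `define(format(``dict[%s]'', `$1'), `$2')')
@result{}
dict_set(`apple', `red')
@result{}
dict(`apple')
@result{}red
@end example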
2035
2036 @node Arguments
2037 @section Arguments to macros
2038
2039 @cindex macros, arguments to
2040 @cindex arguments to macros
2041 Macros can have arguments.  The @var{n}th argument is denoted by
2042 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
2043 argument, when the macro is expanded.  Replacement of arguments happens
2044 before rescanning, regardless of how many nesting levels of quoting
2045 appear in the expansion.  Here is an example of a macro with
2046 two arguments.
2047
2048 @deffn Composite exch (@var{arg1}, @var{arg2})
2049 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
2050 their order.
2051 @end deffn
2052
2053 @example
2054 define(`exch', `$2, $1')
2055 @result{}
2056 exch(`arg1', `arg2')
2057 @result{}arg2, arg1
2058 @end example
2059
2060 This can be used, for example, if you like the arguments to
2061 @code{define} to be reversed.
2062
2063 @example
2064 define(`exch', `$2, $1')
2065 @result{}
2066 define(exch(``expansion text'', ``macro''))
2067 @result{}
2068 macro
2069 @result{}expansion text
2070 @end example
2071
2072 @xref{Quoting Arguments}, for an explanation of the double quotes.
2073 (You should try to improve this example so that clients of @code{exch}
2074 do not have to double quote; or @pxref{Improved exch, , Answers}).
2075
2076 @cindex GNU extensions
2077 GNU @code{m4} allows the number following the @samp{$} to
2078 consist of one
2079 or more digits, allowing macros to have any number of arguments.  This
2080 is not so in UNIX implementations of @code{m4}, which only recognize
2081 one digit.
2082 @comment FIXME - See Austin group XCU ERN 111.  POSIX says that $11 must
2083 @comment be the first argument concatenated with 1, and instead reserves
2084 @comment ${11} for implementation use.  Once this is implemented, the
2085 @comment documentation needs to reflect how these extended arguments
2086 @comment are handled, as well as backwards compatibility issues with
2087 @comment 1.4.x.  Also, consider adding further extensions such as
2088 @comment ${1-default}, which expands to `default' if $1 is empty.
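
For example, under the GNU behavior described above, a hypothetical
macro @code{tenth} can select its tenth argument.  An implementation
that recognizes only one digit would instead see @samp{$1} followed by a
literal @samp{0}:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example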
2089
2090 As a special case, the zeroth argument, @code{$0}, is always the name
2091 of the macro being expanded.
2092
2093 @example
2094 define(`test', ``Macro name: $0'')
2095 @result{}
2096 test
2097 @result{}Macro name: test
2098 @end example
2099
2100 If you want quoted text to appear as part of the expansion text,
2101 remember that quotes can be nested in quoted strings.  Thus, in
2102
2103 @example
2104 define(`foo', `This is macro `foo'.')
2105 @result{}
2106 foo
2107 @result{}This is macro foo.
2108 @end example
2109
2110 @noindent
2111 the @samp{foo} in the expansion text is @emph{not} expanded, since it is
2112 a quoted string, and not a name.
2113
2114 @node Pseudo Arguments
2115 @section Special arguments to macros
2116
2117 @cindex special arguments to macros
2118 @cindex macros, special arguments to
2119 @cindex arguments to macros, special
2120 There is a special notation for the number of actual arguments supplied,
2121 and for all the actual arguments.
2122
2123 The number of actual arguments in a macro call is denoted by @code{$#}
2124 in the expansion text.
2125
2126 @deffn Composite nargs (@dots{})
2127 Expands to a count of the number of arguments supplied.
2128 @end deffn
2129
2130 @example
2131 define(`nargs', `$#')
2132 @result{}
2133 nargs
2134 @result{}0
2135 nargs()
2136 @result{}1
2137 nargs(`arg1', `arg2', `arg3')
2138 @result{}3
2139 nargs(`commas can be quoted, like this')
2140 @result{}1
2141 nargs(arg1#inside comments, commas do not separate arguments
2142 still arg1)
2143 @result{}1
2144 nargs((unquoted parentheses, like this, group arguments))
2145 @result{}1
2146 @end example
2147
2148 Remember that @samp{#} defaults to the comment character; if you forget
2149 quotes to inhibit the comment behavior, your macro definition may not
2150 end where you expected.
2151
2152 @example
2153 dnl Attempt to define a macro to just `$#'
2154 define(underquoted, $#)
2155 oops)
2156 @result{}
2157 underquoted
2158 @result{}0)
2159 @result{}oops
2160 @end example
2161
2162 The notation @code{$*} can be used in the expansion text to denote all
2163 the actual arguments, unquoted, with commas in between.  For example:
2164
2165 @example
2166 define(`echo', `$*')
2167 @result{}
2168 echo(arg1,    arg2, arg3 , arg4)
2169 @result{}arg1,arg2,arg3 ,arg4
2170 @end example
2171
2172 Often each argument should be quoted, and the notation @code{$@@} handles
2173 that.  It is just like @code{$*}, except that it quotes each argument.
2174 A simple example of that is:
2175
2176 @example
2177 define(`echo', `$@@')
2178 @result{}
2179 echo(arg1,    arg2, arg3 , arg4)
2180 @result{}arg1,arg2,arg3 ,arg4
2181 @end example
2182
2183 Where did the quotes go?  Of course, they were eaten, when the expanded
2184 text was reread by @code{m4}.  To show the difference, try
2185
2186 @example
2187 define(`echo1', `$*')
2188 @result{}
2189 define(`echo2', `$@@')
2190 @result{}
2191 define(`foo', `This is macro `foo'.')
2192 @result{}
2193 echo1(foo)
2194 @result{}This is macro This is macro foo..
2195 echo1(`foo')
2196 @result{}This is macro foo.
2197 echo2(foo)
2198 @result{}This is macro foo.
2199 echo2(`foo')
2200 @result{}foo
2201 @end example
2202
2203 @noindent
2204 @xref{Trace}, if you do not understand this.  As another example of the
2205 difference, remember that comments encountered in arguments are passed
2206 untouched to the macro, and that quoting disables comments.
2207
2208 @example
2209 define(`echo1', `$*')
2210 @result{}
2211 define(`echo2', `$@@')
2212 @result{}
2213 define(`foo', `bar')
2214 @result{}
2215 echo1(#foo'foo
2216 foo)
2217 @result{}#foo'foo
2218 @result{}bar
2219 echo2(#foo'foo
2220 foo)
2221 @result{}#foobar
2222 @result{}bar'
2223 @end example
2224
2225 A @samp{$} sign in the expansion text, that is not followed by anything
2226 @code{m4} understands, is simply copied to the macro expansion, as any
2227 other text is.
2228
2229 @example
2230 define(`foo', `$$$ hello $$$')
2231 @result{}
2232 foo
2233 @result{}$$$ hello $$$
2234 @end example
2235
2236 @cindex rescanning
2237 @cindex literal output
2238 @cindex output, literal
2239 If you want a macro to expand to something like @samp{$12}, the
2240 judicious use of nested quoting can put a safe character between the
2241 @code{$} and the next character, relying on the rescanning to remove the
2242 nested quote.  This will prevent @code{m4} from interpreting the
2243 @code{$} sign as a reference to an argument.
2244
2245 @example
2246 define(`foo', `no nested quote: $1')
2247 @result{}
2248 foo(`arg')
2249 @result{}no nested quote: arg
2250 define(`foo', `nested quote around $: `$'1')
2251 @result{}
2252 foo(`arg')
2253 @result{}nested quote around $: $1
2254 define(`foo', `nested empty quote after $: $`'1')
2255 @result{}
2256 foo(`arg')
2257 @result{}nested empty quote after $: $1
2258 define(`foo', `nested quote around next character: $`1'')
2259 @result{}
2260 foo(`arg')
2261 @result{}nested quote around next character: $1
2262 define(`foo', `nested quote around both: `$1'')
2263 @result{}
2264 foo(`arg')
2265 @result{}nested quote around both: arg
2266 @end example
2267
2268 @node Undefine
2269 @section Deleting a macro
2270
2271 @cindex macros, how to delete
2272 @cindex deleting macros
2273 @cindex undefining macros
2274 A macro definition can be removed with @code{undefine}:
2275
2276 @deffn {Builtin (m4)} undefine (@var{name}@dots{})
2277 For each argument, remove the macro @var{name}.  The macro names must
2278 necessarily be quoted, since they will be expanded otherwise.  If an
2279 argument is not a defined macro, then the @samp{d} debug level controls
2280 whether a warning is issued (@pxref{Debugmode}).
2281
2282 The expansion of @code{undefine} is void.
2283 The macro @code{undefine} is recognized only with parameters.
2284 @end deffn
2285
2286 @example
2287 foo bar blah
2288 @result{}foo bar blah
2289 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2290 @result{}
2291 foo bar blah
2292 @result{}some other text
2293 undefine(`foo')
2294 @result{}
2295 foo bar blah
2296 @result{}foo other text
2297 undefine(`bar', `blah')
2298 @result{}
2299 foo bar blah
2300 @result{}foo bar blah
2301 @end example
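
As a small sketch of why the quoting matters, consider what happens when
the argument is left unquoted.  The macro names @code{x} and @code{y}
are chosen only for illustration.  Because @code{x} is expanded during
argument collection, it is @code{y}, not @code{x}, that gets removed:

@example
define(`x', ``y'')define(`y', `Y')
@result{}
x y
@result{}y Y
undefine(x)
@result{}
x y
@result{}y y
@end example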
2302
2303 Undefining a macro inside that macro's expansion is safe; the macro
2304 still expands to the definition that was in effect at the @samp{(}.
2305
2306 @example
2307 define(`f', ``$0':$1')
2308 @result{}
2309 f(f(f(undefine(`f')`hello world')))
2310 @result{}f:f:f:hello world
2311 f(`bye')
2312 @result{}f(bye)
2313 @end example
2314
2315 As of M4 1.6, @code{undefine} can warn if @var{name} is not a macro, by
2316 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2317 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2318 m4}).
2319
2320 @example
2321 $ @kbd{m4}
2322 undefine(`a')
2323 @error{}m4:stdin:1: warning: undefine: undefined macro 'a'
2324 @result{}
2325 debugmode(`-d')
2326 @result{}
2327 undefine(`a')
2328 @result{}
2329 @end example
2330
2331 @node Defn
2332 @section Renaming macros
2333
2334 @cindex macros, how to rename
2335 @cindex renaming macros
2336 @cindex macros, displaying definitions
2337 @cindex definitions, displaying macro
2338 It is possible to rename an already defined macro.  To do this, you need
2339 the builtin @code{defn}:
2340
2341 @deffn {Builtin (m4)} defn (@var{name}@dots{})
2342 Expands to the @emph{quoted definition} of each @var{name}.  If an
2343 argument is not a defined macro, the expansion for that argument is
2344 empty, and the @samp{d} debug level controls whether a warning is issued
2345 (@pxref{Debugmode}).
2346
2347 If @var{name} is a user-defined macro, the quoted definition is simply
2348 the quoted expansion text.  If, instead, @var{name} is a builtin, the
2349 expansion is a special token, which points to the builtin's internal
2350 definition.  This token is meaningful primarily as the second argument to
2351 @code{define} (and @code{pushdef}), and is silently converted to an
2352 empty string in many other contexts.
2353
2354 The macro @code{defn} is recognized only with parameters.
2355 @end deffn
2356
2357 Its normal use is best understood through an example, which shows how to
2358 rename @code{undefine} to @code{zap}:
2359
2360 @example
2361 define(`zap', defn(`undefine'))
2362 @result{}
2363 zap(`undefine')
2364 @result{}
2365 undefine(`zap')
2366 @result{}undefine(zap)
2367 @end example
2368
2369 In this way, @code{defn} can be used to copy macro definitions, and also
2370 definitions of builtin macros.  Even if the original macro is removed,
2371 the other name can still be used to access the definition.
2372
2373 The fact that macro definitions can be transferred also explains why you
2374 should use @code{$0}, rather than retyping a macro's name in its
2375 definition:
2376
2377 @example
2378 define(`foo', `This is `$0'')
2379 @result{}
2380 define(`bar', defn(`foo'))
2381 @result{}
2382 bar
2383 @result{}This is bar
2384 @end example
2385
2386 Macros used as string variables should be referred to through @code{defn},
2387 to avoid unwanted expansion of the text:
2388
2389 @example
2390 define(`string', `The macro dnl is very useful
2391 ')
2392 @result{}
2393 string
2394 @result{}The macro@w{ }
2395 defn(`string')
2396 @result{}The macro dnl is very useful
2397 @result{}
2398 @end example
2399
2400 @cindex rescanning
2401 However, it is important to remember that @code{m4} rescanning is purely
2402 textual.  If an unbalanced end-quote string occurs in a macro
2403 definition, the rescan will see that embedded quote as the termination
2404 of the quoted string, and the remainder of the macro's definition will
2405 be rescanned unquoted.  Thus it is a good idea to avoid unbalanced
2406 end-quotes in macro definitions or arguments to macros.
2407
2408 @example
2409 define(`foo', a'a)
2410 @result{}
2411 define(`a', `A')
2412 @result{}
2413 define(`echo', `$@@')
2414 @result{}
2415 foo
2416 @result{}A'A
2417 defn(`foo')
2418 @result{}aA'
2419 echo(foo)
2420 @result{}AA'
2421 @end example
2422
2423 On the other hand, it is possible to exploit the fact that @code{defn}
2424 can concatenate multiple macros prior to the rescanning phase, in order
2425 to join the definitions of macros that, in isolation, have unbalanced
2426 quotes.  This is particularly useful when one has used several macros to
2427 accumulate text that M4 should rescan as a whole.  In the example below,
2428 note how the use of @code{defn} on @code{l} in isolation opens a string,
2429 which is not closed until the next line; but used on @code{l} and
2430 @code{r} together results in nested quoting.
2431
2432 @example
2433 define(`l', `<[>')define(`r', `<]>')
2434 @result{}
2435 changequote(`[', `]')
2436 @result{}
2437 defn([l])defn([r])
2438 ])
2439 @result{}<[>]defn([r])
2440 @result{})
2441 defn([l], [r])
2442 @result{}<[>][<]>
2443 @end example
2444
2445 @cindex builtins, special tokens
2446 @cindex tokens, builtin macro
2447 Using @code{defn} to generate special tokens for builtin macros will
2448 generate a warning in contexts where a macro name is expected.  But in
2449 contexts that operate on text, the builtin token is just silently
2450 converted to an empty string.  As of M4 1.6, expansion of user macros
2451 will also preserve builtin tokens.  However, any use of builtin tokens
2452 outside of the second argument to @code{define} and @code{pushdef} is
2453 generally not portable, since earlier GNU M4 versions, as well
2454 as other @code{m4} implementations, vary on how such tokens are treated.
2455
2456 @example
2457 $ @kbd{m4 -d}
2458 defn(`defn')
2459 @result{}
2460 define(defn(`divnum'), `cannot redefine a builtin token')
2461 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2462 @result{}
2463 divnum
2464 @result{}0
2465 len(defn(`divnum'))
2466 @result{}0
2467 define(`echo', `$@@')
2468 @result{}
2469 define(`mydivnum', shift(echo(`', defn(`divnum'))))
2470 @result{}
2471 mydivnum
2472 @result{}0
2473 define(`', `empty-$1')
2474 @result{}
2475 defn(defn(`divnum'))
2476 @error{}m4:stdin:9: warning: defn: invalid macro name ignored
2477 @result{}
2478 pushdef(defn(`divnum'), `oops')
2479 @error{}m4:stdin:10: warning: pushdef: invalid macro name ignored
2480 @result{}
2481 traceon(defn(`divnum'))
2482 @error{}m4:stdin:11: warning: traceon: invalid macro name ignored
2483 @result{}
2484 indir(defn(`divnum'), `string')
2485 @error{}m4:stdin:12: warning: indir: invalid macro name ignored
2486 @result{}
2487 indir(`', `string')
2488 @result{}empty-string
2489 traceoff(defn(`divnum'))
2490 @error{}m4:stdin:14: warning: traceoff: invalid macro name ignored
2491 @result{}
2492 popdef(defn(`divnum'))
2493 @error{}m4:stdin:15: warning: popdef: invalid macro name ignored
2494 @result{}
2495 dumpdef(defn(`divnum'))
2496 @error{}m4:stdin:16: warning: dumpdef: invalid macro name ignored
2497 @result{}
2498 undefine(defn(`divnum'))
2499 @error{}m4:stdin:17: warning: undefine: invalid macro name ignored
2500 @result{}
2501 dumpdef(`')
2502 @error{}:@tabchar{}`empty-$1'
2503 @result{}
2504 m4symbols(defn(`divnum'))
2505 @error{}m4:stdin:19: warning: m4symbols: invalid macro name ignored
2506 @result{}
2507 define(`foo', `define(`$1', $2)')dnl
2508 foo(`bar', defn(`divnum'))
2509 @result{}
2510 bar
2511 @result{}0
2512 @end example
2513
2514 As of M4 1.6, @code{defn} can warn if @var{name} is not a macro, by
2515 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2516 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2517 m4}).  Also, @code{defn} with multiple arguments can join text with
2518 builtin tokens.  However, when defining a macro via @code{define} or
2519 @code{pushdef}, a warning is issued and the builtin token ignored if the
2520 builtin token does not occur in isolation.  A future version of
2521 GNU M4 may lift this restriction.
2522
2523 @example
2524 $ @kbd{m4 -d}
2525 defn(`foo')
2526 @error{}m4:stdin:1: warning: defn: undefined macro 'foo'
2527 @result{}
2528 debugmode(`-d')
2529 @result{}
2530 defn(`foo')
2531 @result{}
2532 define(`a', `A')define(`AA', `b')
2533 @result{}
2534 traceon(`defn', `define')
2535 @result{}
2536 defn(`a', `divnum', `a')
2537 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A''
2538 @result{}AA
2539 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2540 @error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
2541 @error{}m4:stdin:7: warning: define: cannot concatenate builtins
2542 @error{}m4trace: -1- define(`mydivnum', `<divnum><divnum>') -> `'
2543 @result{}
2544 traceoff(`defn', `define')dumpdef(`mydivnum')
2545 @error{}mydivnum:@tabchar{}`'
2546 @result{}
2547 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
2548 @error{}m4:stdin:9: warning: define: cannot concatenate builtins
2549 @result{}
2550 define(`mydivnum', defn(`divnum')`a')mydivnum
2551 @error{}m4:stdin:10: warning: define: cannot concatenate builtins
2552 @result{}A
2553 define(`mydivnum', `a'defn(`divnum'))mydivnum
2554 @error{}m4:stdin:11: warning: define: cannot concatenate builtins
2555 @result{}A
2556 define(`q', ``$@@'')
2557 @result{}
2558 define(`foo', q(`a', defn(`divnum')))foo
2559 @error{}m4:stdin:13: warning: define: cannot concatenate builtins
2560 @result{}a,
2561 ifdef(`foo', `yes', `no')
2562 @result{}yes
2563 @end example
2564
2565 @node Pushdef
2566 @section Temporarily redefining macros
2567
2568 @cindex macros, temporary redefinition of
2569 @cindex temporary redefinition of macros
2570 @cindex redefinition of macros, temporary
2571 @cindex definition stack
2572 @cindex pushdef stack
2573 @cindex stack, macro definition
2574 It is possible to redefine a macro temporarily, reverting to the
2575 previous definition at a later time.  This is done with the builtins
2576 @code{pushdef} and @code{popdef}:
2577
2578 @deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion})
2579 @deffnx {Builtin (m4)} popdef (@var{name}@dots{})
2580 Analogous to @code{define} and @code{undefine}.
2581
2582 These macros work in a stack-like fashion.  A macro is temporarily
2583 redefined with @code{pushdef}, which replaces an existing definition of
2584 @var{name}, while saving the previous definition, before the new one is
2585 installed.  If there is no previous definition, @code{pushdef} behaves
2586 exactly like @code{define}.
2587
2588 If a macro has several definitions (of which only one is accessible),
2589 the topmost definition can be removed with @code{popdef}.  If there is
2590 no previous definition, @code{popdef} behaves like @code{undefine}, and
2591 if there is no definition at all, the @samp{d} debug level controls
2592 whether a warning is issued (@pxref{Debugmode}).
2593
2594 The expansion of both @code{pushdef} and @code{popdef} is void.
2595 The macros @code{pushdef} and @code{popdef} are recognized only with
2596 parameters.
2597 @end deffn
2598
2599 @example
2600 define(`foo', `Expansion one.')
2601 @result{}
2602 foo
2603 @result{}Expansion one.
2604 pushdef(`foo', `Expansion two.')
2605 @result{}
2606 foo
2607 @result{}Expansion two.
2608 pushdef(`foo', `Expansion three.')
2609 @result{}
2610 pushdef(`foo', `Expansion four.')
2611 @result{}
2612 popdef(`foo')
2613 @result{}
2614 foo
2615 @result{}Expansion three.
2616 popdef(`foo', `foo')
2617 @result{}
2618 foo
2619 @result{}Expansion one.
2620 popdef(`foo')
2621 @result{}
2622 foo
2623 @result{}foo
2624 @end example
2625
2626 If a macro with several definitions is redefined with @code{define}, the
2627 topmost definition is @emph{replaced} with the new definition.  If it is
2628 removed with @code{undefine}, @emph{all} the definitions are removed,
2629 and not only the topmost one.  However, POSIX allows other
2630 implementations that treat @code{define} as replacing an entire stack
2631 of definitions with a single new definition, so to be portable to other
2632 implementations, it may be worth explicitly using @code{popdef} and
2633 @code{pushdef} rather than relying on the GNU behavior of
2634 @code{define}.
2635
2636 @example
2637 define(`foo', `Expansion one.')
2638 @result{}
2639 foo
2640 @result{}Expansion one.
2641 pushdef(`foo', `Expansion two.')
2642 @result{}
2643 foo
2644 @result{}Expansion two.
2645 define(`foo', `Second expansion two.')
2646 @result{}
2647 foo
2648 @result{}Second expansion two.
2649 undefine(`foo')
2650 @result{}
2651 foo
2652 @result{}foo
2653 @end example
2654
2655 @cindex local variables
2656 @cindex variables, local
2657 Local variables within macros are made with @code{pushdef} and
2658 @code{popdef}.  At the start of the macro a new definition is pushed;
2659 within the macro it is manipulated; and at the end it is popped,
2660 revealing the former definition.
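
For instance, the following sketch uses a local variable inside a
macro.  The names @code{greet} and @code{name} are hypothetical, chosen
only for illustration.  The outer definition of @code{name} is visible
again once the macro finishes:

@example
define(`name', `outer')
@result{}
define(`greet', `pushdef(`name', `$1')Hello, name!popdef(`name')')
@result{}
greet(`world')
@result{}Hello, world!
name
@result{}outer
@end example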
2661
2662 It is possible to temporarily redefine a builtin with @code{pushdef}
2663 and @code{defn}.
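
A minimal sketch of this technique follows.  The name @code{orig_len}
is a hypothetical helper, introduced here only to keep the original
builtin reachable while @code{len} is shadowed:

@example
define(`orig_len', defn(`len'))
@result{}
pushdef(`len', `<orig_len(`$1')>')
@result{}
len(`abc')
@result{}<3>
popdef(`len')
@result{}
len(`abc')
@result{}3
@end example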
2664
2665 As of M4 1.6, @code{popdef} can warn if @var{name} is not a macro, by
2666 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2667 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2668 m4}).
2669
2670 @example
2671 define(`a', `1')
2672 @result{}
2673 popdef
2674 @result{}popdef
2675 popdef(`a', `a')
2676 @error{}m4:stdin:3: warning: popdef: undefined macro 'a'
2677 @result{}
2678 debugmode(`-d')
2679 @result{}
2680 popdef(`a')
2681 @result{}
2682 @end example
2683
2684 @node Renamesyms
2685 @section Renaming macros with regular expressions
2686
2687 @cindex regular expressions
2688 @cindex macros, how to rename
2689 @cindex renaming macros
2690 @cindex GNU extensions
2691 Sometimes it is desirable to rename multiple symbols without having to
2692 use a long sequence of calls to @code{define}.  The @code{renamesyms}
2693 builtin allows this:
2694
2695 @deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @
2696   @ovar{resyntax})
2697 Global renaming of macros is done by @code{renamesyms}, which selects
2698 all macros with names that match @var{regexp}, and renames each match
2699 according to @var{replacement}.  It is unspecified what happens if the
2700 rename causes multiple macros to map to the same name.
2701 @comment FIXME - right now, collisions cause a core dump on some platforms:
2702 @comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa')
2703
2704 If @var{resyntax} is given, it selects the flavor of regular
2705 expression used to interpret @var{regexp} for this call, in place of
2706 the current default.  @xref{Changeresyntax}, for details of the values
2707 that can be given for this argument.
2708
2709 A macro that does not have a name that matches @var{regexp} is left
2710 with its original name.  If only part of the name matches, any part of
2711 the name that is not covered by @var{regexp} is copied to the
2712 replacement name.  Whenever a match is found in the name, the search
2713 proceeds from the end of the match, so no character in the original
2714 name can be substituted twice.  If @var{regexp} matches a string of
2715 zero length, the start position for the continued search is
2716 incremented to avoid infinite loops.
2717
2718 Where a replacement is to be made, @var{replacement} replaces the
2719 matched text in the original name, with @samp{\@var{n}} substituted by
2720 the text matched by the @var{n}th parenthesized sub-expression of
2721 @var{regexp}, and @samp{\&} being the text matched by the entire
2722 regular expression.
2723
2724 The expansion of @code{renamesyms} is void.
2725 The macro @code{renamesyms} is recognized only with parameters.
2726 This macro was added in M4 2.0.
2727 @end deffn
2728
2729 The following example starts with a rename similar to the
2730 @option{--prefix-builtins} option (or @option{-P}), prefixing every
2731 macro with @code{m4_}.  However, note that @option{-P} only renames M4
2732 builtin macros, even if other macros were defined previously, while
2733 @code{renamesyms} will rename any macros that match when it runs,
2734 including text macros.  The rest of the example demonstrates the
2735 behavior of unanchored regular expressions in symbol renaming.
2736
2737 @comment options: -Dfoo=bar -P
2738 @example
2739 $ @kbd{m4 -Dfoo=bar -P}
2740 foo
2741 @result{}bar
2742 m4_foo
2743 @result{}m4_foo
2744 m4_defn(`foo')
2745 @result{}bar
2746 @end example
2747
2748 @example
2749 $ @kbd{m4}
2750 define(`foo', `bar')
2751 @result{}
2752 renamesyms(`^.*$', `m4_\&')
2753 @result{}
2754 foo
2755 @result{}foo
2756 m4_foo
2757 @result{}bar
2758 m4_defn(`m4_foo')
2759 @result{}bar
2760 m4_renamesyms(`f', `g')
2761 @result{}
2762 m4_igdeg(`m4_goo', `m4_goo')
2763 @result{}bar
2764 @end example
2765
2766 If @var{resyntax} is given, @var{regexp} must be given according to
2767 the syntax chosen, though the default regular expression syntax
2768 remains unchanged for other invocations.  Here is a more realistic
2769 example that performs a similar renaming on macros, except that it
2770 ignores macros with names that begin with @samp{_}, and avoids creating
2771 macros with names that begin with @samp{m4_m4}.
2772
2773 @example
2774 renamesyms(`^[^_]\w*$', `m4_\&')
2775 @result{}
2776 m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED')
2777 @result{}
2778 m4_wrap(__line__
2779 )
2780 @result{}
2781 ^D
2782 @result{}3
2783 @end example
2784
2785 When a symbol has multiple definitions, thanks to @code{pushdef}, the
2786 entire stack is renamed.
2787
2788 @example
2789 pushdef(`foo', `1')pushdef(`foo', `2')
2790 @result{}
2791 renamesyms(`^foo$', `bar')
2792 @result{}
2793 bar
2794 @result{}2
2795 popdef(`bar')bar
2796 @result{}1
2797 popdef(`bar')bar
2798 @result{}bar
2799 @end example
2800
2801 @node Indir
2802 @section Indirect call of macros
2803
2804 @cindex indirect call of macros
2805 @cindex call of macros, indirect
2806 @cindex macros, indirect call of
2807 @cindex GNU extensions
2808 Any macro can be called indirectly with @code{indir}:
2809
2810 @deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}})
2811 Results in a call to the macro @var{name}, which is passed the rest of
2812 the arguments @var{args}.  If @var{name} is not defined, the expansion
2813 is void, and the @samp{d} debug level controls whether a warning is
2814 issued (@pxref{Debugmode}).
2815
2816 The macro @code{indir} is recognized only with parameters.
2817 @end deffn
2818
2819 This can be used to call macros with computed or ``invalid''
2820 names (@code{define} allows such names to be defined):
2821
2822 @example
2823 define(`$$internal$macro', `Internal macro (name `$0')')
2824 @result{}
2825 $$internal$macro
2826 @result{}$$internal$macro
2827 indir(`$$internal$macro')
2828 @result{}Internal macro (name $$internal$macro)
2829 @end example
2830
2831 The point here is that larger macro packages can define private macros
2832 that will not be called by accident.  They can @emph{only} be
2833 called through the builtin @code{indir}.
2834
2835 One other point to observe is that argument collection occurs before
2836 @code{indir} invokes @var{name}, so if argument collection changes the
2837 value of @var{name}, that will be reflected in the final expansion.
2838 This is different from the behavior when invoking macros directly,
2839 where the definition that was in effect before argument collection is
2840 used.
2841
2842 @example
2843 $ @kbd{m4 -d}
2844 define(`f', `1')
2845 @result{}
2846 f(define(`f', `2'))
2847 @result{}1
2848 indir(`f', define(`f', `3'))
2849 @result{}3
2850 indir(`f', undefine(`f'))
2851 @error{}m4:stdin:4: warning: indir: undefined macro 'f'
2852 @result{}
2853 debugmode(`-d')
2854 @result{}
2855 indir(`f')
2856 @result{}
2857 @end example
2858
2859 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2860 arguments, @code{indir} defers to the invoked @var{name} for whether a
2861 token representing a builtin is recognized or flattened to the empty
2862 string.
2863
2864 @example
2865 $ @kbd{m4 -d}
2866 indir(defn(`defn'), `divnum')
2867 @error{}m4:stdin:1: warning: indir: invalid macro name ignored
2868 @result{}
2869 indir(`define', defn(`defn'), `divnum')
2870 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2871 @result{}
2872 indir(`define', `foo', defn(`divnum'))
2873 @result{}
2874 foo
2875 @result{}0
2876 indir(`divert', defn(`foo'))
2877 @error{}m4:stdin:5: warning: divert: empty string treated as 0
2878 @result{}
2879 @end example
2880
2881 Warning messages issued on behalf of an indirect macro use an
2882 unambiguous representation of the macro name, using escape sequences
2883 similar to C strings, and with colons also quoted.
2884
2885 @example
2886 define(`%%:\
2887 odd', defn(`divnum'))
2888 @result{}
2889 indir(`%%:\
2890 odd', `extra')
2891 @error{}m4:stdin:3: warning: %%\:\\\nodd: extra arguments ignored: 1 > 0
2892 @result{}0
2893 @end example
2894
2895 @node Builtin
2896 @section Indirect call of builtins
2897
2898 @cindex indirect call of builtins
2899 @cindex call of builtins, indirect
2900 @cindex builtins, indirect call of
2901 @cindex GNU extensions
2902 Builtin macros can be called indirectly with @code{builtin}:
2903
2904 @deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}})
2905 @deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1})
2906 Results in a call to the builtin @var{name}, which is passed the
2907 rest of the arguments @var{args}.  If @var{name} does not name a
2908 builtin, the expansion is void, and the @samp{d} debug level controls
2909 whether a warning is issued (@pxref{Debugmode}).
2910
2911 As a special case, if @var{name} is exactly the special token
2912 representing the @code{builtin} macro, as obtained by @code{defn}
2913 (@pxref{Defn}), then @var{args} must consist of a single @var{name1},
2914 and the expansion is the special token representing the builtin macro
2915 named by @var{name1}.
2916
2917 The macro @code{builtin} is recognized only with parameters.
2918 @end deffn
2919
2920 This can be used even if @var{name} has been given another definition
2921 that has covered the original, or been undefined so that no macro
2922 maps to the builtin.
2923
2924 @example
2925 pushdef(`define', `hidden')
2926 @result{}
2927 undefine(`undefine')
2928 @result{}
2929 define(`foo', `bar')
2930 @result{}hidden
2931 foo
2932 @result{}foo
2933 builtin(`define', `foo', defn(`divnum'))
2934 @result{}
2935 foo
2936 @result{}0
2937 builtin(`define', `foo', `BAR')
2938 @result{}
2939 foo
2940 @result{}BAR
2941 undefine(`foo')
2942 @result{}undefine(foo)
2943 foo
2944 @result{}BAR
2945 builtin(`undefine', `foo')
2946 @result{}
2947 foo
2948 @result{}foo
2949 @end example
2950
2951 The @var{name} argument only matches the original name of the builtin,
2952 even when the @option{--prefix-builtins} option (or @option{-P},
2953 @pxref{Operation modes, , Invoking m4}) is in effect.  This is different
2954 from @code{indir}, which only tracks current macro names.
2955
2956 @comment options: -P
2957 @example
2958 $ @kbd{m4 -P}
2959 m4_builtin(`divnum')
2960 @result{}0
2961 m4_builtin(`m4_divnum')
2962 @error{}m4:stdin:2: warning: m4_builtin: undefined builtin 'm4_divnum'
2963 @result{}
2964 m4_indir(`divnum')
2965 @error{}m4:stdin:3: warning: m4_indir: undefined macro 'divnum'
2966 @result{}
2967 m4_indir(`m4_divnum')
2968 @result{}0
2969 m4_debugmode(`-d')
2970 @result{}
2971 m4_builtin(`m4_divnum')
2972 @result{}
2973 @end example
2974
2975 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2976 without arguments, even when they normally require parameters to be
2977 recognized; but it will provoke a warning, and the expansion will behave
2978 as though empty strings had been passed as the required arguments.
2979
2980 @example
2981 builtin
2982 @result{}builtin
2983 builtin()
2984 @error{}m4:stdin:2: warning: builtin: undefined builtin ''
2985 @result{}
2986 builtin(`builtin')
2987 @error{}m4:stdin:3: warning: builtin: too few arguments: 0 < 1
2988 @result{}
2989 builtin(`builtin',)
2990 @error{}m4:stdin:4: warning: builtin: undefined builtin ''
2991 @result{}
2992 builtin(`builtin', ``'
2993 ')
2994 @error{}m4:stdin:5: warning: builtin: undefined builtin '`\'\n'
2995 @result{}
2996 indir(`index')
2997 @error{}m4:stdin:7: warning: index: too few arguments: 0 < 2
2998 @result{}0
2999 @end example
3000
3001 Normally, once a builtin macro is undefined, the only way to retrieve
3002 its functionality is by defining a new macro that expands to
3003 @code{builtin} under the hood.  But this extra layer of expansion is
3004 slightly inefficient, not to mention the fact that it is not robust to
3005 changes in the current quoting scheme due to @code{changequote}
3006 (@pxref{Changequote}).  On the other hand, defining a macro to the
3007 special token produced by @code{defn} (@pxref{Defn}) is very efficient,
3008 and avoids the need for quoting within the macro definition; but
3009 @code{defn} only works if the desired macro is already defined by some
3010 other name.  So @code{builtin} provides a special case where it is
3011 possible to retrieve the same special token representing a builtin as
3012 what @code{defn} would provide, were the desired macro still defined.
3013 This feature is activated by passing @code{defn(`builtin')} as the first
3014 argument to @code{builtin}.  Normally, passing a special token representing a
3015 macro as @var{name} results in a warning and an empty expansion, but in
3016 this case, if the second argument @var{name1} names a valid builtin,
3017 there is no warning and the expansion is the appropriate special
3018 token.  In fact, with just the @code{builtin} macro accessible, it is
3019 possible to reconstitute the entire startup state of @code{m4}.
3020
3021 In the example below, compare the number of macro invocations performed
3022 by @code{defn1} and @code{defn2}, and the differences once quoting is
3023 changed.
3024
3025 @example
3026 $ @kbd{m4 -d}
3027 undefine(`defn')
3028 @result{}
3029 define(`foo', `bar')
3030 @result{}
3031 define(`defn1', `builtin(`defn', $@@)')
3032 @result{}
3033 define(`defn2', builtin(builtin(`defn', `builtin'), `defn'))
3034 @result{}
3035 dumpdef(`defn1', `defn2')
3036 @error{}defn1:@tabchar{}`builtin(`defn', $@@)'
3037 @error{}defn2:@tabchar{}<defn>
3038 @result{}
3039 traceon
3040 @result{}
3041 defn1(`foo')
3042 @error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')'
3043 @error{}m4trace: -1- builtin(`defn', `foo') -> ``bar''
3044 @result{}bar
3045 defn2(`foo')
3046 @error{}m4trace: -1- defn2(`foo') -> ``bar''
3047 @result{}bar
3048 traceoff
3049 @error{}m4trace: -1- traceoff -> `'
3050 @result{}
3051 changequote(`[', `]')
3052 @result{}
3053 defn1([foo])
3054 @error{}m4:stdin:11: warning: builtin: undefined builtin '`defn\''
3055 @result{}
3056 defn2([foo])
3057 @result{}bar
3058 define([defn1], [builtin([defn], $@@)])
3059 @result{}
3060 defn1([foo])
3061 @result{}bar
3062 changequote
3063 @result{}
3064 defn1(`foo')
3065 @error{}m4:stdin:16: warning: builtin: undefined builtin '[defn]'
3066 @result{}
3067 @end example
3068
3069 @node M4symbols
3070 @section Getting the defined macro names
3071
3072 @cindex macro names, listing
3073 @cindex listing macro names
3074 @cindex currently defined macros
3075 @cindex GNU extensions
3076 The names of the currently defined macros can be accessed by
3077 @code{m4symbols}:
3078
3079 @deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}})
3080 Without arguments, @code{m4symbols} expands to a sorted list of quoted
3081 strings, separated by commas.  This contrasts with @code{dumpdef}
3082 (@pxref{Dumpdef}), whose output cannot be accessed by @code{m4}
3083 programs.
3084
3085 When given arguments, @code{m4symbols} returns the sorted subset of the
3086 @var{names} currently defined, and silently ignores the rest.
3087 This macro was added in M4 2.0.
3088 @end deffn
3089
3090 @example
3091 m4symbols(`ifndef', `ifdef', `define', `undef')
3092 @result{}define,ifdef
3093 @end example
3094
3095 @node Conditionals
3096 @chapter Conditionals, loops, and recursion
3097
3098 Macros, expanding to plain text, perhaps with arguments, are not quite
3099 enough.  We would like to have macros expand to different things, based
3100 on decisions taken at run-time.  For that, we need some kind of conditionals.
3101 Also, we would like to have some kind of loop construct, so we could do
3102 something a number of times, or while some condition is true.
3103
3104 @menu
3105 * Ifdef::                       Testing if a macro is defined
3106 * Ifelse::                      If-else construct, or multibranch
3107 * Shift::                       Recursion in @code{m4}
3108 * Forloop::                     Iteration by counting
3109 * Foreach::                     Iteration by list contents
3110 * Stacks::                      Working with definition stacks
3111 * Composition::                 Building macros with macros
3112 @end menu
3113
3114 @node Ifdef
3115 @section Testing if a macro is defined
3116
3117 @cindex conditionals
3118 There are two different builtin conditionals in @code{m4}.  The first is
3119 @code{ifdef}:
3120
3121 @deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2})
3122 If @var{name} is defined as a macro, @code{ifdef} expands to
3123 @var{string-1}, otherwise to @var{string-2}.  If @var{string-2} is
3124 omitted, it is taken to be the empty string (according to the normal
3125 rules).
3126
3127 The macro @code{ifdef} is recognized only with parameters.
3128 @end deffn
3129
3130 @example
3131 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3132 @result{}foo is not defined
3133 define(`foo', `')
3134 @result{}
3135 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3136 @result{}foo is defined
3137 ifdef(`no_such_macro', `yes', `no', `extra argument')
3138 @error{}m4:stdin:4: warning: ifdef: extra arguments ignored: 4 > 3
3139 @result{}no
3140 @end example
3141
3142 As of M4 1.6, @code{ifdef} transparently handles builtin tokens
3143 generated by @code{defn} (@pxref{Defn}) that occur in either
3144 @var{string}, although a warning is issued for invalid macro names.
3145
3146 @example
3147 define(`', `empty')
3148 @result{}
3149 ifdef(defn(`defn'), `yes', `no')
3150 @error{}m4:stdin:2: warning: ifdef: invalid macro name ignored
3151 @result{}no
3152 define(`foo', ifdef(`divnum', defn(`divnum'), `undefined'))
3153 @result{}
3154 foo
3155 @result{}0
3156 @end example
3157
3158 @node Ifelse
3159 @section If-else construct, or multibranch
3160
3161 @cindex comparing strings
3162 @cindex discarding input
3163 @cindex input, discarding
3164 The other conditional, @code{ifelse}, is much more powerful.  It can be
3165 used as a way to introduce a long comment, as an if-else construct, or
3166 as a multibranch, depending on the number of arguments supplied:
3167
3168 @deffn {Builtin (m4)} ifelse (@var{comment})
3169 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
3170   @ovar{not-equal})
3171 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
3172   @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
3173 Used with only one argument, @code{ifelse} simply discards it and
3174 produces no output.
3175
3176 If called with three or four arguments, @code{ifelse} expands into
3177 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
3178 for character), otherwise it expands to @var{not-equal}.  A final fifth
3179 argument is ignored, after triggering a warning.
3180
3181 If called with six or more arguments, and @var{string-1} and
3182 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
3183 otherwise the first three arguments are discarded and the processing
3184 starts again.
3185
3186 The macro @code{ifelse} is recognized only with parameters.
3187 @end deffn
3188
3189 Using only one argument is a common @code{m4} idiom for introducing a
3190 block comment, as an alternative to repeatedly using @code{dnl}.  This
3191 special usage is recognized by GNU @code{m4}, so that in this
3192 case, the warning about missing arguments is never triggered.
3193
3194 @example
3195 ifelse(`some comments')
3196 @result{}
3197 ifelse(`foo', `bar')
3198 @error{}m4:stdin:2: warning: ifelse: too few arguments: 2 < 3
3199 @result{}
3200 @end example
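
Because the single argument is an ordinary quoted string, such a comment
can span several lines, provided that any quote characters inside it are
balanced.  A minimal sketch, where the trailing @code{dnl} merely
discards the newline after the comment:

@example
ifelse(`
This text spans several lines,
and is discarded entirely.
')dnl
visible
@result{}visible
@end example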
3201
3202 Using three or four arguments provides decision points.
3203
3204 @example
3205 ifelse(`foo', `bar', `true')
3206 @result{}
3207 ifelse(`foo', `foo', `true')
3208 @result{}true
3209 define(`foo', `bar')
3210 @result{}
3211 ifelse(foo, `bar', `true', `false')
3212 @result{}true
3213 ifelse(foo, `foo', `true', `false')
3214 @result{}false
3215 @end example
3216
3217 @cindex macro, blind
3218 @cindex blind macro
3219 Notice how the first argument was used unquoted; it is common to compare
3220 the expansion of a macro with a string.  With this macro, you can now
3221 reproduce the behavior of blind builtins, where the macro is recognized
3222 only with arguments.
3223
3224 @example
3225 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
3226 @result{}
3227 foo
3228 @result{}foo
3229 foo()
3230 @result{}arguments:1
3231 foo(`a', `b', `c')
3232 @result{}arguments:3
3233 @end example
3234
3235 For an example of a way to make defining blind macros easier, see
3236 @ref{Composition}.
3237
3238 @cindex multibranches
3239 @cindex switch statement
3240 @cindex case statement
The macro @code{ifelse} can take more than four arguments.  A fifth
argument on its own is ignored with a warning, but with six or more
arguments, @code{ifelse} works like a @code{case} or @code{switch}
statement in traditional programming languages.  If @var{string-1} and
@var{string-2} are equal, @code{ifelse} expands into @var{equal-1};
otherwise the procedure is repeated with the first three arguments
discarded.  This calls for an example:
3247
3248 @example
3249 ifelse(`foo', `bar', `third', `gnu', `gnats')
3250 @error{}m4:stdin:1: warning: ifelse: extra arguments ignored: 5 > 4
3251 @result{}gnu
3252 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
3253 @result{}
3254 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
3255 @result{}seventh
3256 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
3257 @error{}m4:stdin:4: warning: ifelse: extra arguments ignored: 8 > 7
3258 @result{}7
3259 @end example
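
In practice, the multibranch form is often wrapped in a macro so that
one value is compared against several candidates in turn, with a final
default.  In the following sketch, the macro name @code{sound} and all
of its strings are illustrative assumptions, not part of GNU M4.

@example
define(`sound',
`ifelse(`$1', `cat', `meow',
        `$1', `dog', `woof',
        `silence')')
@result{}
sound(`cat')
@result{}meow
sound(`dog')
@result{}woof
sound(`fish')
@result{}silence
@end example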
3260
3261 As of M4 1.6, @code{ifelse} transparently handles builtin tokens
3262 generated by @code{defn} (@pxref{Defn}).  Because of this, it is always
3263 safe to compare two macro definitions, without worrying whether the
3264 macro might be a builtin.
3265
3266 @example
3267 ifelse(defn(`defn'), `', `yes', `no')
3268 @result{}no
3269 ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
3270 @result{}no
3271 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
3272 @result{}yes
3273 define(`foo', ifelse(`', `', defn(`divnum')))
3274 @result{}
3275 foo
3276 @result{}0
3277 @end example
3278
3279 Naturally, the normal case will be slightly more advanced than these
3280 examples.  A common use of @code{ifelse} is in macros implementing loops
3281 of various kinds.
3282
3283 @node Shift
3284 @section Recursion in @code{m4}
3285
3286 @cindex recursive macros
3287 @cindex macros, recursive
3288 There is no direct support for loops in @code{m4}, but macros can be
3289 recursive.  There is no limit on the number of recursion levels, other
3290 than those enforced by your hardware and operating system.
3291
3292 @cindex loops
3293 Loops can be programmed using recursion and the conditionals described
3294 previously.
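
As a minimal sketch of that combination, the hypothetical macro
@code{countdown} below outputs its argument, then uses @code{ifelse} to
decide whether to call itself again with the argument decremented by
@code{decr} (@pxref{Incr}).  It assumes that the argument is a
non-negative decimal integer; any other argument never reaches the
terminating comparison against @samp{0}.

@example
define(`countdown',
`$1`'ifelse(`$1', `0', `', ` countdown(decr(`$1'))')')
@result{}
countdown(`3')
@result{}3 2 1 0
@end example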
3295
3296 There is a builtin macro, @code{shift}, which can, among other things,
3297 be used for iterating through the actual arguments to a macro:
3298
3299 @deffn {Builtin (m4)} shift (@var{arg1}, @dots{})
3300 Takes any number of arguments, and expands to all its arguments except
3301 @var{arg1}, separated by commas, with each argument quoted.
3302
3303 The macro @code{shift} is recognized only with parameters.
3304 @end deffn
3305
3306 @example
3307 shift
3308 @result{}shift
3309 shift(`bar')
3310 @result{}
3311 shift(`foo', `bar', `baz')
3312 @result{}bar,baz
3313 @end example
3314
3315 An example of the use of @code{shift} is this macro:
3316
3317 @cindex reversing arguments
3318 @cindex arguments, reversing
3319 @deffn Composite reverse (@dots{})
3320 Takes any number of arguments, and reverses their order.
3321 @end deffn
3322
3323 It is implemented as:
3324
3325 @example
3326 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3327                           `reverse(shift($@@)), `$1'')')
3328 @result{}
3329 reverse
3330 @result{}
3331 reverse(`foo')
3332 @result{}foo
3333 reverse(`foo', `bar', `gnats', `and gnus')
3334 @result{}and gnus, gnats, bar, foo
3335 @end example
3336
3337 While not a very interesting macro, it does show how simple loops can be
3338 made with @code{shift}, @code{ifelse} and recursion.  It also shows
3339 that @code{shift} is usually used with @samp{$@@}.  Another example of
3340 this is an implementation of a short-circuiting conditional operator.
3341
3342 @cindex short-circuiting conditional
3343 @cindex conditional, short-circuiting
3344 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
3345   @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
3346 Similar to @code{ifelse}, where an equal comparison between the first
3347 two strings results in the third, otherwise the first three arguments
3348 are discarded and the process repeats.  The difference is that each
3349 @var{test-<n>} is expanded only when it is encountered.  This means that
3350 every third argument to @code{cond} is normally given one more level of
3351 quoting than the corresponding argument to @code{ifelse}.
3352 @end deffn
3353
3354 Here is the implementation of @code{cond}, along with a demonstration of
3355 how it can short-circuit the side effects in @code{side}.  Notice how
3356 all the unquoted side effects happen regardless of how many comparisons
3357 are made with @code{ifelse}, compared with only the relevant effects
3358 with @code{cond}.
3359
3360 @example
3361 define(`cond',
3362 `ifelse(`$#', `1', `$1',
3363         `ifelse($1, `$2', `$3',
3364                 `$0(shift(shift(shift($@@))))')')')dnl
3365 define(`side', `define(`counter', incr(counter))$1')dnl
3366 define(`example1',
3367 `define(`counter', `0')dnl
3368 ifelse(side(`$1'), `yes', `one comparison: ',
3369        side(`$1'), `no', `two comparisons: ',
3370        side(`$1'), `maybe', `three comparisons: ',
3371        `side(`default answer: ')')counter')dnl
3372 define(`example2',
3373 `define(`counter', `0')dnl
3374 cond(`side(`$1')', `yes', `one comparison: ',
3375      `side(`$1')', `no', `two comparisons: ',
3376      `side(`$1')', `maybe', `three comparisons: ',
3377      `side(`default answer: ')')counter')dnl
3378 example1(`yes')
3379 @result{}one comparison: 3
3380 example1(`no')
3381 @result{}two comparisons: 3
3382 example1(`maybe')
3383 @result{}three comparisons: 3
3384 example1(`feeling rather indecisive today')
3385 @result{}default answer: 4
3386 example2(`yes')
3387 @result{}one comparison: 1
3388 example2(`no')
3389 @result{}two comparisons: 2
3390 example2(`maybe')
3391 @result{}three comparisons: 3
3392 example2(`feeling rather indecisive today')
3393 @result{}default answer: 4
3394 @end example
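
Short-circuiting matters most when a later test is only meaningful if an
earlier one failed.  In the following sketch, the hypothetical macro
@code{divides} repeats the definition of @code{cond} from above so that
the example is self-contained.  The quoted @code{eval} test is expanded
only after the comparison of the divisor against @samp{0} has failed, so
no division by zero is ever attempted.

@example
define(`cond',
`ifelse(`$#', `1', `$1',
        `ifelse($1, `$2', `$3',
                `$0(shift(shift(shift($@@))))')')')dnl
define(`divides',
`cond(`$2', `0', `undefined',
      `eval(`$1 % $2')', `0', `exact',
      `inexact')')dnl
divides(`6', `3')
@result{}exact
divides(`7', `2')
@result{}inexact
divides(`6', `0')
@result{}undefined
@end example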
3395
3396 @cindex joining arguments
3397 @cindex arguments, joining
3398 @cindex concatenating arguments
3399 Another common task that requires iteration is joining a list of
3400 arguments into a single string.
3401
3402 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3403 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3404 Generate a single-quoted string, consisting of each @var{arg} separated
3405 by @var{separator}.  While @code{joinall} always outputs a
3406 @var{separator} between arguments, @code{join} avoids the
3407 @var{separator} for an empty @var{arg}.
3408 @end deffn
3409
3410 Here are some examples of its usage, based on the implementation
3411 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3412 package:
3413
3414 @comment examples
3415 @example
3416 $ @kbd{m4 -I examples}
3417 include(`join.m4')
3418 @result{}
3419 join,join(`-'),join(`-', `'),join(`-', `', `')
3420 @result{},,,
3421 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3422 @result{},,,-
3423 join(`-', `1')
3424 @result{}1
3425 join(`-', `1', `2', `3')
3426 @result{}1-2-3
3427 join(`', `1', `2', `3')
3428 @result{}123
3429 join(`-', `', `1', `', `', `2', `')
3430 @result{}1-2
3431 joinall(`-', `', `1', `', `', `2', `')
3432 @result{}-1---2-
3433 join(`,', `1', `2', `3')
3434 @result{}1,2,3
3435 define(`nargs', `$#')dnl
3436 nargs(join(`,', `1', `2', `3'))
3437 @result{}1
3438 @end example
3439
3440 Examining the implementation shows some interesting points about several
3441 m4 programming idioms.
3442
3443 @comment examples
3444 @example
3445 $ @kbd{m4 -I examples}
3446 undivert(`join.m4')dnl
3447 @result{}divert(`-1')
3448 @result{}# join(sep, args) - join each non-empty ARG into a single
3449 @result{}# string, with each element separated by SEP
3450 @result{}define(`join',
3451 @result{}`ifelse(`$#', `2', ``$2'',
3452 @result{}  `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3453 @result{}define(`_join',
3454 @result{}`ifelse(`$#$2', `2', `',
3455 @result{}  `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3456 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3457 @result{}# into a single string, with each element separated by SEP
3458 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3459 @result{}define(`_joinall',
3460 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3461 @result{}divert`'dnl
3462 @end example
3463
3464 First, notice that this implementation creates helper macros
3465 @code{_join} and @code{_joinall}.  This division of labor makes it
3466 easier to output the correct number of @var{separator} instances:
3467 @code{join} and @code{joinall} are responsible for the first argument,
3468 without a separator, while @code{_join} and @code{_joinall} are
3469 responsible for all remaining arguments, always outputting a separator
3470 when outputting an argument.
3471
Next, observe how @code{join} decides whether to iterate to itself,
because the first @var{arg} was empty, or to output the argument and
swap over to @code{_join}.  If the argument is non-empty, then the nested
3475 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3476 with the @samp{$0} to form the next macro name to invoke.  The
3477 @code{joinall} implementation is simpler since it does not have to
3478 suppress empty @var{arg}; it always executes once then defers to
3479 @code{_joinall}.
3480
3481 Another important idiom is the idea that @var{separator} is reused for
3482 each iteration.  Each iteration has one less argument, but rather than
3483 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3484 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
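
The difference between the two ways of dropping an argument can be seen
in isolation.  In this sketch, @code{drop_first} and @code{drop_second}
are hypothetical names chosen for illustration; the second form keeps
the separator in @samp{$1} available for the next iteration.

@example
define(`drop_first', `shift($@@)')
@result{}
define(`drop_second', ``$1',shift(shift($@@))')
@result{}
drop_first(`-', `a', `b')
@result{}a,b
drop_second(`-', `a', `b')
@result{}-,b
@end example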
3485
3486 Next, notice that it is possible to compare more than one condition in a
3487 single @code{ifelse} test.  The test of @samp{$#$2} against @samp{2}
3488 allows @code{_join} to iterate for two separate reasons---either there
3489 are still more than two arguments, or there are exactly two arguments
3490 but the last argument is not empty.
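
The combined test can be tried on its own.  In this sketch, the
hypothetical macro @code{finished} performs the same comparison as
@code{_join}, and reports @samp{yes} only when there are exactly two
arguments and the second one is empty.

@example
define(`finished', `ifelse(`$#$2', `2', `yes', `no')')
@result{}
finished(`-', `')
@result{}yes
finished(`-', `x')
@result{}no
finished(`-', `', `x')
@result{}no
@end example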
3491
3492 Finally, notice that these macros require exactly two arguments to
3493 terminate recursion, but that they still correctly result in empty
3494 output when given no @var{args} (i.e., zero or one macro argument).  On
3495 the first pass when there are too few arguments, the @code{shift}
3496 results in no output, but leaves an empty string to serve as the
3497 required second argument for the second pass.  Put another way,
3498 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3499 former guarantees at least two arguments.
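
Counting the arguments that result from each form makes the distinction
visible.  The macros @code{nargs}, @code{plain} and @code{padded} in
this sketch are illustrative helpers only.

@example
define(`nargs', `$#')
@result{}
define(`plain', `nargs($@@)')
@result{}
define(`padded', `nargs(`$1', shift($@@))')
@result{}
plain(`a')
@result{}1
padded(`a')
@result{}2
padded(`a', `b', `c')
@result{}3
@end example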
3500
3501 @cindex quote manipulation
3502 @cindex manipulating quotes
3503 Sometimes, a recursive algorithm requires adding quotes to each element,
3504 or treating multiple arguments as a single element:
3505
3506 @deffn Composite quote (@dots{})
3507 @deffnx Composite dquote (@dots{})
3508 @deffnx Composite dquote_elt (@dots{})
3509 Takes any number of arguments, and adds quoting.  With @code{quote},
3510 only one level of quoting is added, effectively removing whitespace
3511 after commas and turning multiple arguments into a single string.  With
3512 @code{dquote}, two levels of quoting are added, one around each element,
3513 and one around the list.  And with @code{dquote_elt}, two levels of
3514 quoting are added around each element.
3515 @end deffn
3516
3517 An actual implementation of these three macros is distributed as
3518 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package.  First,
3519 let's examine their usage:
3520
3521 @comment examples
3522 @example
3523 $ @kbd{m4 -I examples}
3524 include(`quote.m4')
3525 @result{}
3526 -quote-dquote-dquote_elt-
3527 @result{}----
3528 -quote()-dquote()-dquote_elt()-
3529 @result{}--`'-`'-
3530 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3531 @result{}-1-`1'-`1'-
3532 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3533 @result{}-1,2-`1',`2'-`1',`2'-
3534 define(`n', `$#')dnl
3535 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3536 @result{}-1-1-2-
3537 dquote(dquote_elt(`1', `2'))
3538 @result{}``1'',``2''
3539 dquote_elt(dquote(`1', `2'))
3540 @result{}``1',`2''
3541 @end example
3542
3543 The last two lines show that when given two arguments, @code{dquote}
3544 results in one string, while @code{dquote_elt} results in two.  Now,
3545 examine the implementation.  Note that @code{quote} and
3546 @code{dquote_elt} make decisions based on their number of arguments, so
3547 that when called without arguments, they result in nothing instead of a
3548 quoted empty string; this is so that it is possible to distinguish
3549 between no arguments and an empty first argument.  @code{dquote}, on the
3550 other hand, results in a string no matter what, since it is still
3551 possible to tell whether it was invoked without arguments based on the
3552 resulting string.
3553
3554 @comment examples
3555 @example
3556 $ @kbd{m4 -I examples}
3557 undivert(`quote.m4')dnl
3558 @result{}divert(`-1')
3559 @result{}# quote(args) - convert args to single-quoted string
3560 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3561 @result{}# dquote(args) - convert args to quoted list of quoted strings
3562 @result{}define(`dquote', ``$@@'')
3563 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3564 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3565 @result{}                             ```$1'',$0(shift($@@))')')
3566 @result{}divert`'dnl
3567 @end example
3568
3569 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3570 than @samp{joinall(`,', @var{args})} for producing the same output.
3571
3572 @cindex nine arguments, more than
3573 @cindex more than nine arguments
3574 @cindex arguments, more than nine
3575 One more useful macro based on @code{shift} allows portably selecting
3576 an arbitrary argument (usually greater than the ninth argument), without
3577 relying on the GNU extension of multi-digit arguments
3578 (@pxref{Arguments}).
3579
3580 @deffn Composite argn (@var{n}, @dots{})
3581 Expands to argument @var{n} out of the remaining arguments.  @var{n}
3582 must be a positive number.  Usually invoked as
3583 @samp{argn(`@var{n}',$@@)}.
3584 @end deffn
3585
3586 It is implemented as:
3587
3588 @example
3589 define(`argn', `ifelse(`$1', 1, ``$2'',
3590   `argn(decr(`$1'), shift(shift($@@)))')')
3591 @result{}
3592 argn(`1', `a')
3593 @result{}a
3594 define(`foo', `argn(`11', $@@)')
3595 @result{}
3596 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3597 @result{}k
3598 @end example
3599
3600 @node Forloop
3601 @section Iteration by counting
3602
3603 @cindex for loops
3604 @cindex loops, counting
3605 @cindex counting loops
3606 Here is an example of a loop macro that implements a simple for loop.
3607
3608 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3609 Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive.  For each assignment to @var{iterator}, appends @var{text} to
3612 the expansion of the @code{forloop}.  @var{text} may refer to
3613 @var{iterator}.  Any definition of @var{iterator} prior to this
3614 invocation is restored.
3615 @end deffn
3616
3617 It can, for example, be used for simple counting:
3618
3619 @comment examples
3620 @example
3621 $ @kbd{m4 -I examples}
3622 include(`forloop.m4')
3623 @result{}
3624 forloop(`i', `1', `8', `i ')
3625 @result{}1 2 3 4 5 6 7 8@w{ }
3626 @end example
3627
For-loops can be nested, for example:
3629
3630 @comment examples
3631 @example
3632 $ @kbd{m4 -I examples}
3633 include(`forloop.m4')
3634 @result{}
3635 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3636 ')
3637 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3638 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3639 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3640 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3641 @result{}
3642 @end example
3643
3644 The implementation of the @code{forloop} macro is fairly
3645 straightforward.  The @code{forloop} macro itself is simply a wrapper,
3646 which saves the previous definition of the first argument, calls the
3647 internal macro @code{@w{_forloop}}, and re-establishes the saved
3648 definition of the first argument.
3649
3650 The macro @code{@w{_forloop}} expands the fourth argument once, and
3651 tests to see if the iterator has reached the final value.  If it has
3652 not finished, it increments the iterator (using the predefined macro
3653 @code{incr}, @pxref{Incr}), and recurses.
3654
3655 Here is an actual implementation of @code{forloop}, distributed as
3656 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3657
3658 @comment examples
3659 @example
3660 $ @kbd{m4 -I examples}
3661 undivert(`forloop.m4')dnl
3662 @result{}divert(`-1')
3663 @result{}# forloop(var, from, to, stmt) - simple version
3664 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3665 @result{}define(`_forloop',
3666 @result{}       `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3667 @result{}divert`'dnl
3668 @end example
3669
3670 Notice the careful use of quotes.  Certain macro arguments are left
3671 unquoted, each for its own reason.  Try to find out @emph{why} these
3672 arguments are left unquoted, and see what happens if they are quoted.
3673 (As presented, these two macros are useful but not very robust for
general use.  They lack even basic error handling for cases like
@var{start} greater than @var{end}, @var{end} not numeric, or
3676 @var{iterator} not being a macro name.  See if you can improve these
3677 macros; or @pxref{Improved forloop, , Answers}).
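
As one possible starting point, the following sketch wraps
@code{forloop} in a guard that produces no output unless @var{start} is
less than or equal to @var{end}.  The name @code{safe_forloop} is
hypothetical.  The guard assumes that both bounds are valid arithmetic
expressions for @code{eval}, and it does nothing about the other
weaknesses mentioned above.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`safe_forloop',
`ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
safe_forloop(`i', `1', `3', `<i>')
@result{}<1><2><3>
safe_forloop(`i', `3', `1', `<i>')
@result{}
@end example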
3678
3679 @node Foreach
3680 @section Iteration by list contents
3681
3682 @cindex for each loops
3683 @cindex loops, list iteration
3684 @cindex iterating over lists
3685 Here is an example of a loop macro that implements list iteration.
3686
3687 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3688 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3689 Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}.  In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses.  In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string.  For each assignment to @var{iterator},
appends @var{text} to the overall expansion.  @var{text} may refer to
3696 @var{iterator}.  Any definition of @var{iterator} prior to this
3697 invocation is restored.
3698 @end deffn
3699
3700 As an example, this displays each word in a list inside of a sentence,
3701 using an implementation of @code{foreach} distributed as
3702 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3703 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3704
3705 @comment examples
3706 @example
3707 $ @kbd{m4 -I examples}
3708 include(`foreach.m4')
3709 @result{}
3710 foreach(`x', (foo, bar, foobar), `Word was: x
3711 ')dnl
3712 @result{}Word was: foo
3713 @result{}Word was: bar
3714 @result{}Word was: foobar
3715 include(`foreachq.m4')
3716 @result{}
3717 foreachq(`x', `foo, bar, foobar', `Word was: x
3718 ')dnl
3719 @result{}Word was: foo
3720 @result{}Word was: bar
3721 @result{}Word was: foobar
3722 @end example
3723
3724 It is possible to be more complex; each element of the @var{paren-list}
3725 or @var{quote-list} can itself be a list, to pass as further arguments
3726 to a helper macro.  This example generates a shell case statement:
3727
3728 @comment examples
3729 @example
3730 $ @kbd{m4 -I examples}
3731 include(`foreach.m4')
3732 @result{}
3733 define(`_case', `  $1)
3734     $2=" $1";;
3735 ')dnl
3736 define(`_cat', `$1$2')dnl
3737 case $`'1 in
3738 @result{}case $1 in
3739 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3740         `_cat(`_case', x)')dnl
3741 @result{}  a)
3742 @result{}    vara=" a";;
3743 @result{}  b)
3744 @result{}    varb=" b";;
3745 @result{}  c)
3746 @result{}    varc=" c";;
3747 esac
3748 @result{}esac
3749 @end example
3750
3751 The implementation of the @code{foreach} macro is a bit more involved;
3752 it is a wrapper around two helper macros.  First, @code{@w{_arg1}} is
3753 needed to grab the first element of a list.  Second,
3754 @code{@w{_foreach}} implements the recursion, successively walking
3755 through the original list.  Here is a simple implementation of
3756 @code{foreach}:
3757
3758 @comment examples
3759 @example
3760 $ @kbd{m4 -I examples}
3761 undivert(`foreach.m4')dnl
3762 @result{}divert(`-1')
3763 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3764 @result{}#   parenthesized list, simple version
3765 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3766 @result{}define(`_arg1', `$1')
3767 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3768 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3769 @result{}divert`'dnl
3770 @end example
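
The two list operations inside @code{@w{_foreach}} rely on placing a
parenthesized list directly after a macro name, so that the list becomes
that macro's argument list.  They can be tried in isolation; in this
sketch, @code{first} and @code{rest} are hypothetical wrappers around
the same @samp{_arg1$2} and @samp{(shift$2)} constructs.  Note that the
rest of a one-element list is @samp{()}, which is exactly the string
that ends the recursion.

@example
define(`_arg1', `$1')
@result{}
define(`first', `_arg1$1')
@result{}
define(`rest', `(shift$1)')
@result{}
first((a, b, c))
@result{}a
rest((a, b, c))
@result{}(b,c)
rest((c))
@result{}()
@end example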
3771
3772 Unfortunately, that implementation is not robust to macro names as list
elements.  Each iteration of @code{@w{_foreach}} strips another
3774 layer of quotes, leading to erratic results if list elements are not
3775 already fully expanded.  The first cut at implementing @code{foreachq}
3776 takes this into account.  Also, when using quoted elements in a
3777 @var{paren-list}, the overall list must be quoted.  A @var{quote-list}
3778 has the nice property of requiring fewer characters to create a list
3779 containing the same quoted elements.  To see the difference between the
3780 two macros, we attempt to pass double-quoted macro names in a list,
3781 expecting the macro name on output after one layer of quotes is removed
3782 during list iteration and the final layer removed during the final
3783 rescan:
3784
3785 @comment examples
3786 @example
3787 $ @kbd{m4 -I examples}
3788 define(`a', `1')define(`b', `2')define(`c', `3')
3789 @result{}
3790 include(`foreach.m4')
3791 @result{}
3792 include(`foreachq.m4')
3793 @result{}
3794 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3795 ')
3796 @result{}1
3797 @result{}(2)1
3798 @result{}
3799 @result{}, x
3800 @result{})
3801 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3802 ')dnl
3803 @result{}a
3804 @result{}(b
3805 @result{}c)
3806 @end example
3807
3808 Obviously, @code{foreachq} did a better job; here is its implementation:
3809
3810 @comment examples
3811 @example
3812 $ @kbd{m4 -I examples}
3813 undivert(`foreachq.m4')dnl
3814 @result{}include(`quote.m4')dnl
3815 @result{}divert(`-1')
3816 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3817 @result{}#   quoted list, simple version
3818 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3819 @result{}define(`_arg1', `$1')
3820 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3821 @result{}  `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3822 @result{}divert`'dnl
3823 @end example
3824
3825 Notice that @code{@w{_foreachq}} had to use the helper macro
3826 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3827 embedded @code{ifelse} call does not go haywire if a list element
3828 contains a comma.  Unfortunately, this implementation of @code{foreachq}
3829 has its own severe flaw.  Whereas the @code{foreach} implementation was
3830 linear, this macro is quadratic in the number of list elements, and is
3831 much more likely to trip up the limit set by the command line option
3832 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3833 Invoking m4}).  Additionally, this implementation does not expand
3834 @samp{defn(`@var{iterator}')} very well, when compared with
3835 @code{foreach}.
3836
3837 @comment examples
3838 @example
3839 $ @kbd{m4 -I examples}
3840 include(`foreach.m4')include(`foreachq.m4')
3841 @result{}
3842 foreach(`name', `(`a', `b')', ` defn(`name')')
3843 @result{} a b
3844 foreachq(`name', ``a', `b'', ` defn(`name')')
3845 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3846 @end example
3847
3848 It is possible to have robust iteration with linear behavior and sane
3849 @var{iterator} contents for either list style.  See if you can learn
3850 from the best elements of both of these implementations to create robust
3851 macros (or @pxref{Improved foreach, , Answers}).
3852
3853 @node Stacks
3854 @section Working with definition stacks
3855
3856 @cindex definition stack
3857 @cindex pushdef stack
3858 @cindex stack, macro definition
3859 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3860 operation in @code{m4}.  Normally, only the topmost definition in a
3861 stack is important, but sometimes, it is desirable to manipulate the
3862 entire definition stack.
3863
3864 @deffn Composite stack_foreach (@var{macro}, @var{action})
3865 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3866 For each of the @code{pushdef} definitions associated with @var{macro},
3867 invoke the macro @var{action} with a single argument of that definition.
3868 @code{stack_foreach} visits the oldest definition first, while
3869 @code{stack_foreach_lifo} visits the current definition first.
3870 @var{action} should not modify or dereference @var{macro}.  There are a
3871 few special macros, such as @code{defn}, which cannot be used as the
3872 @var{macro} parameter.
3873 @end deffn
3874
3875 A sample implementation of these macros is distributed in the file
3876 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3877
3878 @comment examples
3879 @example
3880 $ @kbd{m4 -I examples}
3881 include(`stack.m4')
3882 @result{}
3883 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3884 @result{}
3885 define(`show', ``$1'
3886 ')
3887 @result{}
3888 stack_foreach(`a', `show')dnl
3889 @result{}1
3890 @result{}2
3891 @result{}3
3892 stack_foreach_lifo(`a', `show')dnl
3893 @result{}3
3894 @result{}2
3895 @result{}1
3896 @end example
3897
3898 Now for the implementation.  Note the definition of a helper macro,
3899 @code{_stack_reverse}, which destructively swaps the contents of one
3900 stack of definitions into the reverse order in the temporary macro
3901 @samp{tmp-$1}.  By calling the helper twice, the original order is
3902 restored back into the macro @samp{$1}; since the operation is
3903 destructive, this explains why @samp{$1} must not be modified or
3904 dereferenced during the traversal.  The caller can then inject
3905 additional code to pass the definition currently being visited to
3906 @samp{$2}.  The choice of helper names is intentional; since @samp{-} is
3907 not valid as part of a macro name, there is no risk of conflict with a
3908 valid macro name, and the code is guaranteed to use @code{defn} where
3909 necessary.  Finally, note that any macro used in the traversal of a
3910 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3911 handled by @code{stack_foreach}, since the macro would temporarily be
3912 undefined during the algorithm.
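
The effect of a single reversal pass can be reproduced by hand with
@code{pushdef}, @code{defn} and @code{popdef}.  In this sketch, the
macro names @code{s} and @code{t} are arbitrary; after both definitions
have been moved, the source stack is empty and the target stack holds
the definitions in the opposite order.

@example
pushdef(`s', `1')pushdef(`s', `2')
@result{}
pushdef(`t', defn(`s'))popdef(`s')pushdef(`t', defn(`s'))popdef(`s')
@result{}
ifdef(`s', `present', `gone')
@result{}gone
t
@result{}1
popdef(`t')t
@result{}2
@end example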
3913
3914 @comment examples
3915 @example
3916 $ @kbd{m4 -I examples}
3917 undivert(`stack.m4')dnl
3918 @result{}divert(`-1')
3919 @result{}# stack_foreach(macro, action)
3920 @result{}# Invoke ACTION with a single argument of each definition
3921 @result{}# from the definition stack of MACRO, starting with the oldest.
3922 @result{}define(`stack_foreach',
3923 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3924 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3925 @result{}# stack_foreach_lifo(macro, action)
3926 @result{}# Invoke ACTION with a single argument of each definition
3927 @result{}# from the definition stack of MACRO, starting with the newest.
3928 @result{}define(`stack_foreach_lifo',
3929 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3930 @result{}`_stack_reverse(`tmp-$1', `$1')')
3931 @result{}define(`_stack_reverse',
3932 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3933 @result{}divert`'dnl
3934 @end example
3935
3936 @node Composition
3937 @section Building macros with macros
3938
3939 @cindex macro composition
3940 @cindex composing macros
3941 Since m4 is a macro language, it is possible to write macros that
3942 can build other macros.  First on the list is a way to automate the
3943 creation of blind macros.
3944
3945 @cindex macro, blind
3946 @cindex blind macro
3947 @deffn Composite define_blind (@var{name}, @ovar{value})
3948 Defines @var{name} as a blind macro, such that @var{name} will expand to
3949 @var{value} only when given explicit arguments.  @var{value} should not
3950 be the result of @code{defn} (@pxref{Defn}).  This macro is only
3951 recognized with parameters, and results in an empty string.
3952 @end deffn
3953
3954 Defining a macro to define another macro can be a bit tricky.  We want
3955 to use a literal @samp{$#} in the argument to the nested @code{define}.
3956 However, if @samp{$} and @samp{#} are adjacent in the definition of
3957 @code{define_blind}, then it would be expanded as the number of
3958 arguments to @code{define_blind} rather than the intended number of
3959 arguments to @var{name}.  The solution is to pass the difficult
3960 characters through extra arguments to a helper macro
3961 @code{_define_blind}.  When composing macros, it is a common idiom to
3962 need a helper macro to concatenate text that forms parameters in the
3963 composed macro, rather than interpreting the text as a parameter of the
3964 composing macro.
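
The problem and one way around it can be seen in a smaller setting.  In
this sketch, the hypothetical macros @code{naive} and @code{careful}
both try to define a macro that expands to its own argument count.
Splitting @samp{$} and @samp{#} into separately quoted strings, the
same device used for the extra arguments of @code{_define_blind}, keeps
the composing macro from substituting its own count.

@example
define(`naive', `define(`$1', `$#')')
@result{}
define(`careful', `define(`$1', `$'`#')')
@result{}
naive(`n1')careful(`n2')
@result{}
n1(`a', `b', `c')
@result{}1
n2(`a', `b', `c')
@result{}3
@end example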
3965
3966 As for the limitation against using @code{defn}, there are two reasons.
3967 If a macro was previously defined with @code{define_blind}, then it can
3968 safely be renamed to a new blind macro using plain @code{define}; using
3969 @code{define_blind} to rename it just adds another layer of
3970 @code{ifelse}, occupying memory and slowing down execution.  And if a
3971 macro is a builtin, then it would result in an attempt to define a macro
3972 consisting of both text and a builtin token; this is not supported, and
3973 the builtin token is flattened to an empty string.
3974
3975 With that explanation, here's the definition, and some sample usage.
3976 Notice that @code{define_blind} is itself a blind macro.
3977
3978 @example
3979 $ @kbd{m4 -d}
3980 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3981 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3982 @result{}
3983 define(`_define_blind', `define(`$1',
3984 `ifelse(`$3', `0', ``$4'', `$2')')')
3985 @result{}
3986 define_blind
3987 @result{}define_blind
3988 define_blind(`foo', `arguments were $*')
3989 @result{}
3990 foo
3991 @result{}foo
3992 foo(`bar')
3993 @result{}arguments were bar
3994 define(`blah', defn(`foo'))
3995 @result{}
3996 blah
3997 @result{}blah
3998 blah(`a', `b')
3999 @result{}arguments were a,b
4000 defn(`blah')
4001 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
4002 @end example
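
The first reason for avoiding @code{defn} can be observed directly.
The following sketch repeats the definitions from the example above and
then passes the result of @code{defn} to @code{define_blind}; the name
@code{bar} is chosen only for illustration.  The new macro still
behaves as a blind macro, but its definition carries a redundant nested
@code{ifelse}.

@example
$ @kbd{m4 -d}
define(`define_blind', `ifelse(`$#', `0', ``$0'',
`_$0(`$1', `$2', `$'`#', `$'`0')')')
@result{}
define(`_define_blind', `define(`$1',
`ifelse(`$3', `0', ``$4'', `$2')')')
@result{}
define_blind(`foo', `arguments were $*')
@result{}
define_blind(`bar', defn(`foo'))
@result{}
defn(`bar')
@result{}ifelse(`$#', `0', ``$0'', `ifelse(`$#', `0', ``$0'', `arguments were $*')')
bar(`a', `b')
@result{}arguments were a,b
@end example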
4003
4004 @cindex currying arguments
4005 @cindex argument currying
4006 Another interesting composition tactic is argument @dfn{currying}, or
4007 factoring a macro that takes multiple arguments for use in a context
4008 that provides exactly one argument.
4009
4010 @deffn Composite curry (@var{macro}, @dots{})
4011 Expand to a macro call that takes exactly one argument, then appends
4012 that argument to the original arguments and invokes @var{macro} with the
4013 resulting list of arguments.
4014 @end deffn
4015
4016 A demonstration of currying makes the intent of this macro a little more
4017 obvious.  The macro @code{stack_foreach} mentioned earlier is an example
4018 of a context that provides exactly one argument to a macro name.  But
4019 coupled with currying, we can invoke @code{reverse} with two arguments
4020 for each definition of a macro stack.  This example uses the file
4021 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
4022 distribution.
4023
4024 @comment examples
4025 @example
4026 $ @kbd{m4 -I examples}
4027 include(`curry.m4')include(`stack.m4')
4028 @result{}
4029 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
4030                           `reverse(shift($@@)), `$1'')')
4031 @result{}
4032 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
4033 @result{}
4034 stack_foreach(`a', `:curry(`reverse', `4')')
4035 @result{}:1, 4:2, 4:3, 4
4036 curry(`curry', `reverse', `1')(`2')(`3')
4037 @result{}3, 2, 1
4038 @end example
4039
4040 Now for the implementation.  Notice how @code{curry} leaves off with a
4041 macro name but no open parenthesis, while still in the middle of
4042 collecting arguments for @samp{$1}.  The macro @code{_curry} is the
4043 helper macro that takes one argument, then adds it to the list and
4044 finally supplies the closing parenthesis.  The use of a comma inside the
4045 @code{shift} call allows currying to also work for a macro that takes
4046 one argument, although it often makes more sense to invoke that macro
4047 directly rather than going through @code{curry}.
4048
4049 @comment examples
4050 @example
4051 $ @kbd{m4 -I examples}
4052 undivert(`curry.m4')dnl
4053 @result{}divert(`-1')
4054 @result{}# curry(macro, args)
4055 @result{}# Expand to a macro call that takes one argument, then invoke
4056 @result{}# macro(args, extra).
4057 @result{}define(`curry', `$1(shift($@@,)_$0')
4058 @result{}define(`_curry', ``$1')')
4059 @result{}divert`'dnl
4060 @end example
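
As a small illustration of the single-argument case, the following
sketch curries @code{len}, which takes only one argument.  The
@code{shift} call then sees the macro name plus the empty argument
supplied by the trailing comma, and expands to an empty string, so
@code{len} is invoked with just the argument collected by
@code{_curry}.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`curry.m4')
@result{}
curry(`len')(`abc')
@result{}3
@end example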
4061
4062 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
4063 tokens, which are silently flattened to the empty string when passed
4064 through another text macro.  The following example demonstrates a usage
4065 of @code{curry} that works in M4 1.6, but is not portable to earlier
4066 versions:
4067
4068 @comment examples
4069 @example
4070 $ @kbd{m4 -I examples}
4071 include(`curry.m4')
4072 @result{}
4073 curry(`define', `mylen')(defn(`len'))
4074 @result{}
4075 mylen(`abc')
4076 @result{}3
4077 @end example
4078
4079 @cindex renaming macros
4080 @cindex copying macros
4081 @cindex macros, copying
4082 Putting the last few concepts together, it is possible to copy or rename
4083 an entire stack of macro definitions.
4084
4085 @deffn Composite copy (@var{source}, @var{dest})
4086 @deffnx Composite rename (@var{source}, @var{dest})
4087 Ensure that @var{dest} is undefined, then define it to the same stack of
4088 definitions currently in @var{source}.  @code{copy} leaves @var{source}
4089 unchanged, while @code{rename} undefines @var{source}.  There are only a
4090 few macros, such as @code{copy} or @code{defn}, which cannot be copied
4091 via this macro.
4092 @end deffn
4093
The implementation is relatively straightforward, although since it uses
@code{curry}, it is unable to copy builtin macros when used with M4
1.4.x.  See if you can design a portable version that works across all
M4 versions (@pxref{Improved copy, , Answers}).
4098
4099 @comment examples
4100 @example
4101 $ @kbd{m4 -I examples}
4102 include(`curry.m4')include(`stack.m4')
4103 @result{}
4104 define(`rename', `copy($@@)undefine(`$1')')dnl
4105 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
4106 ')m4exit(`1')',
4107    `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
4108 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
4109 @result{}
4110 copy(`a', `b')
4111 @result{}
4112 rename(`b', `c')
4113 @result{}
4114 a b c
4115 @result{}2 b 2
4116 popdef(`a', `c')a c
4117 @result{}0 0
4118 popdef(`a', `c')a c
4119 @result{}1 1
4120 @end example
4121
4122 @node Debugging
4123 @chapter How to debug macros and input
4124
4125 @cindex debugging macros
4126 @cindex macros, debugging
Macros written for @code{m4} often do not work as intended on the first
try (as is the case with most programming languages).  Fortunately,
@code{m4} provides support for debugging macros.
4130
4131 @menu
4132 * Dumpdef::                     Displaying macro definitions
4133 * Trace::                       Tracing macro calls
4134 * Debugmode::                   Controlling debugging options
4135 * Debuglen::                    Limiting debug output
4136 * Debugfile::                   Saving debugging output
4137 @end menu
4138
4139 @node Dumpdef
4140 @section Displaying macro definitions
4141
4142 @cindex displaying macro definitions
4143 @cindex macros, displaying definitions
4144 @cindex definitions, displaying macro
4145 @cindex standard error, output to
4146 If you want to see what a name expands into, you can use the builtin
4147 @code{dumpdef}:
4148
4149 @deffn {Builtin (m4)} dumpdef (@ovar{name@dots{}})
4150 Accepts any number of arguments.  If called without any arguments, it
4151 displays the definitions of all known names, otherwise it displays the
4152 definitions of each @var{name} given, sorted by name.  If a @var{name}
4153 is undefined, the @samp{d} debug level controls whether a warning is
4154 issued (@pxref{Debugmode}).  Likewise, the @samp{o} debug level controls
4155 whether the output is issued to standard error or the current debug
4156 file (@pxref{Debugfile}).
4157
4158 The expansion of @code{dumpdef} is void.
4159 @end deffn
4160
4161 @example
4162 $ @kbd{m4 -d}
4163 define(`foo', `Hello world.')
4164 @result{}
4165 dumpdef(`foo')
4166 @error{}foo:@tabchar{}`Hello world.'
4167 @result{}
4168 dumpdef(`define')
4169 @error{}define:@tabchar{}<define>
4170 @result{}
4171 @end example
4172
The last example shows how builtin macro definitions are displayed.
4174 The definition that is dumped corresponds to what would occur if the
4175 macro were to be called at that point, even if other definitions are
4176 still live due to redefining a macro during argument collection.
4177
4178 @example
4179 $ @kbd{m4 -d}
4180 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
4181 @result{}
4182 f(popdef(`f')dumpdef(`f'))
4183 @error{}f:@tabchar{}``$0'1'
4184 @result{}f2
4185 f(popdef(`f')dumpdef(`f'))
4186 @error{}m4:stdin:3: warning: dumpdef: undefined macro 'f'
4187 @result{}f1
4188 debugmode(`-d')
4189 @result{}
4190 dumpdef(`f')
4191 @result{}
4192 @end example
4193
4194 @xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and
4195 @samp{s} flags affect the details of the display.  Remember, the
@samp{q} flag is implied when the @option{--debug} option (@option{-d},
@pxref{Debugging options, , Invoking m4}) is used on the command line
without arguments.  Also, @option{--debuglen} (@pxref{Debuglen}) can affect
output, by truncating longer strings (but not builtin and module names).
4200
4201 @comment options: -ds -l3
4202 @example
4203 $ @kbd{m4 -ds -l 3}
4204 pushdef(`foo', `1 long string')
4205 @result{}
4206 pushdef(`foo', defn(`divnum'))
4207 @result{}
4208 pushdef(`foo', `3')
4209 @result{}
4210 debugmode(`+m')
4211 @result{}
4212 dumpdef(`foo', `dnl', `indir', `__gnu__')
4213 @error{}__gnu__:@tabchar{}@{gnu@}
4214 @error{}dnl:@tabchar{}<dnl>@{m4@}
4215 @error{}foo:@tabchar{}3, <divnum>@{m4@}, 1 l...
4216 @error{}indir:@tabchar{}<indir>@{gnu@}
4217 @result{}
4218 debugmode(`-ms')debugmode(`+q')
4219 @result{}
4220 dumpdef(`foo')
4221 @error{}foo:@tabchar{}`3'
4222 @result{}
4223 @end example
4224
4225 @node Trace
4226 @section Tracing macro calls
4227
4228 @cindex tracing macro expansion
4229 @cindex macro expansion, tracing
4230 @cindex expansion, tracing macro
4231 @cindex standard error, output to
4232 It is possible to trace macro calls and expansions through the builtins
4233 @code{traceon} and @code{traceoff}:
4234
4235 @deffn {Builtin (m4)} traceon (@ovar{names@dots{}})
4236 @deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}})
4237 When called without any arguments, @code{traceon} and @code{traceoff}
4238 will turn tracing on and off, respectively, for all macros, identical to
4239 using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}).
4240
4241 When called with arguments, only the macros listed in @var{names} are
4242 affected, whether or not they are currently defined.  A macro's
4243 expansion will be traced if global tracing is on, or if the individual
4244 macro tracing flag is set; to avoid tracing a macro, both the global
4245 flag and the macro must have tracing off.
4246
4247 The expansion of @code{traceon} and @code{traceoff} is void.
4248 @end deffn
4249
4250 Whenever a traced macro is called and the arguments have been collected,
4251 the call is displayed.  If the expansion of the macro call is not void,
4252 the expansion can be displayed after the call.  The output is printed
4253 to the current debug file (defaulting to standard error,
4254 @pxref{Debugfile}).
4255
4256 @example
4257 $ @kbd{m4 -d}
4258 define(`foo', `Hello World.')
4259 @result{}
4260 define(`echo', `$@@')
4261 @result{}
4262 traceon(`foo', `echo')
4263 @result{}
4264 foo
4265 @error{}m4trace: -1- foo -> `Hello World.'
4266 @result{}Hello World.
4267 echo(`gnus', `and gnats')
4268 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
4269 @result{}gnus,and gnats
4270 @end example
4271
4272 The number between dashes is the depth of the expansion.  It is one most
4273 of the time, signifying an expansion at the outermost level, but it
4274 increases when macro arguments contain unquoted macro calls.  The
4275 maximum number that will appear between dashes is controlled by the
4276 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
4277 , Invoking m4}).  Additionally, the option @option{--trace} (or
4278 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
4279 parsing input.
4280
4281 @comment options: -d-V -L3 -tifelse
4282 @comment status: 1
4283 @example
4284 $ @kbd{m4 -L 3 -t ifelse}
4285 ifelse(`one level')
4286 @error{}m4trace: -1- ifelse
4287 @result{}
4288 ifelse(ifelse(ifelse(`three levels')))
4289 @error{}m4trace: -3- ifelse
4290 @error{}m4trace: -2- ifelse
4291 @error{}m4trace: -1- ifelse
4292 @result{}
4293 ifelse(ifelse(ifelse(ifelse(`four levels'))))
4294 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
4295 @end example
4296
4297 Tracing by name is an attribute that is preserved whether the macro is
4298 defined or not.  This allows the selection of macros to trace before
4299 those macros are defined.
4300
4301 @example
4302 $ @kbd{m4 -d}
4303 traceoff(`foo')
4304 @result{}
4305 traceon(`foo')
4306 @result{}
4307 foo
4308 @result{}foo
4309 defn(`foo')
4310 @error{}m4:stdin:4: warning: defn: undefined macro 'foo'
4311 @result{}
4312 undefine(`foo')
4313 @error{}m4:stdin:5: warning: undefine: undefined macro 'foo'
4314 @result{}
4315 pushdef(`foo')
4316 @result{}
4317 popdef(`foo')
4318 @result{}
4319 popdef(`foo')
4320 @error{}m4:stdin:8: warning: popdef: undefined macro 'foo'
4321 @result{}
4322 define(`foo', `bar')
4323 @result{}
4324 foo
4325 @error{}m4trace: -1- foo -> `bar'
4326 @result{}bar
4327 undefine(`foo')
4328 @result{}
4329 ifdef(`foo', `yes', `no')
4330 @result{}no
4331 indir(`foo')
4332 @error{}m4:stdin:13: warning: indir: undefined macro 'foo'
4333 @result{}
4334 define(`foo', `blah')
4335 @result{}
4336 foo
4337 @error{}m4trace: -1- foo -> `blah'
4338 @result{}blah
4339 @end example
4340
4341 Tracing even works on builtins.  However, @code{defn} (@pxref{Defn})
4342 does not transfer tracing status.
4343
4344 @example
4345 $ @kbd{m4 -d}
4346 traceon(`traceon')
4347 @result{}
4348 traceon(`traceoff')
4349 @error{}m4trace: -1- traceon(`traceoff') -> `'
4350 @result{}
4351 traceoff(`traceoff')
4352 @error{}m4trace: -1- traceoff(`traceoff') -> `'
4353 @result{}
4354 traceoff(`traceon')
4355 @result{}
4356 traceon(`eval', `m4_divnum')
4357 @result{}
4358 define(`m4_eval', defn(`eval'))
4359 @result{}
4360 define(`m4_divnum', defn(`divnum'))
4361 @result{}
4362 eval(divnum)
4363 @error{}m4trace: -1- eval(`0') -> `0'
4364 @result{}0
4365 m4_eval(m4_divnum)
4366 @error{}m4trace: -2- m4_divnum -> `0'
4367 @result{}0
4368 @end example
4369
4370 As of GNU M4 2.0, named macro tracing is independent of global
4371 tracing status; calling @code{traceoff} without arguments turns off the
4372 global trace flag, but does not turn off tracing for macros where
4373 tracing was requested by name.  Likewise, calling @code{traceon} without
4374 arguments will affect tracing of macros that are not defined yet.  This
4375 behavior matches traditional implementations of @code{m4}.
4376
4377 @example
4378 $ @kbd{m4 -d}
4379 traceon
4380 @result{}
4381 define(`foo', `bar')
4382 @error{}m4trace: -1- define(`foo', `bar') -> `'
4383 @result{}
4384 foo # traced, even though foo was not defined at traceon
4385 @error{}m4trace: -1- foo -> `bar'
4386 @result{}bar # traced, even though foo was not defined at traceon
4387 traceoff(`foo')
4388 @error{}m4trace: -1- traceoff(`foo') -> `'
4389 @result{}
4390 foo # traced, since global tracing is still on
4391 @error{}m4trace: -1- foo -> `bar'
4392 @result{}bar # traced, since global tracing is still on
4393 traceon(`foo')
4394 @error{}m4trace: -1- traceon(`foo') -> `'
4395 @result{}
4396 traceoff
4397 @error{}m4trace: -1- traceoff -> `'
4398 @result{}
4399 foo # traced, since foo is now traced by name
4400 @error{}m4trace: -1- foo -> `bar'
4401 @result{}bar # traced, since foo is now traced by name
4402 traceoff(`foo')
4403 @result{}
4404 foo # untraced
4405 @result{}bar # untraced
4406 @end example
4407
4408 However, GNU M4 prior to 2.0 had slightly different
4409 semantics, where @code{traceon} without arguments only affected symbols
4410 that were defined at that moment, and @code{traceoff} without arguments
4411 stopped all tracing, even when tracing was requested by macro name.  The
4412 addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it
4413 possible to write a file that approximates the older semantics
4414 regardless of which version of GNU M4 is in use.
4415
4416 @comment options: -d-V
4417 @example
4418 $ @kbd{m4}
4419 ifdef(`m4symbols',
4420   `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)',
4421     `builtin(`traceon', $@@)')')dnl
4422 define(`traceoff', `ifelse(`$#', `0',
4423     `builtin(`traceoff')builtin(`traceoff', m4symbols)',
4424     `builtin(`traceoff', $@@)')')')dnl
4425 define(`a', `1')
4426 @result{}
4427 traceon # called before b is defined, so b is not traced
4428 @result{} # called before b is defined, so b is not traced
4429 define(`b', `2')
4430 @error{}m4trace: -1- define
4431 @result{}
4432 a b
4433 @error{}m4trace: -1- a
4434 @result{}1 2
4435 traceon(`b')
4436 @error{}m4trace: -1- traceon
4437 @error{}m4trace: -1- ifelse
4438 @error{}m4trace: -1- builtin
4439 @result{}
4440 a b
4441 @error{}m4trace: -1- a
4442 @error{}m4trace: -1- b
4443 @result{}1 2
4444 traceoff # stops tracing b, even though it was traced by name
4445 @error{}m4trace: -1- traceoff
4446 @error{}m4trace: -1- ifelse
4447 @error{}m4trace: -1- builtin
4448 @error{}m4trace: -2- m4symbols
4449 @error{}m4trace: -1- builtin
4450 @result{} # stops tracing b, even though it was traced by name
4451 a b
4452 @result{}1 2
4453 @end example
4454
4455 @xref{Debugmode}, for information on controlling the details of the
4456 display.  The format of the trace output is not specified by
4457 POSIX, and varies between implementations of @code{m4}.
4458
4459 Starting with M4 1.6, tracing also works via @code{indir}
4460 (@pxref{Indir}).  However, since tracing is an attribute tracked by
4461 macro names, and @code{builtin} bypasses macro names (@pxref{Builtin}),
4462 it is not possible for @code{builtin} to trace which subsidiary builtin
4463 it invokes.  If you are worried about tracking all invocations of a
4464 given builtin, you should also trace @code{builtin}, or enable global
4465 tracing (the @samp{t} debug level, @pxref{Debugmode}).
4466
4467 @example
4468 $ @kbd{m4 -d}
4469 define(`my_defn', defn(`defn'))undefine(`defn')
4470 @result{}
4471 define(`foo', `bar')traceon(`foo', `defn', `my_defn')
4472 @result{}
4473 foo
4474 @error{}m4trace: -1- foo -> `bar'
4475 @result{}bar
4476 indir(`foo')
4477 @error{}m4trace: -1- foo -> `bar'
4478 @result{}bar
4479 my_defn(`foo')
4480 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4481 @result{}bar
4482 indir(`my_defn', `foo')
4483 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4484 @result{}bar
4485 builtin(`defn', `foo')
4486 @result{}bar
4487 debugmode(`+cxt')
4488 @result{}
4489 builtin(`defn', builtin(`shift', `', `foo'))
4490 @error{}m4trace: -1- id 12: builtin ... = <builtin>
4491 @error{}m4trace: -2- id 13: builtin ... = <builtin>
4492 @error{}m4trace: -2- id 13: builtin(`shift', `', `foo') -> ``foo''
4493 @error{}m4trace: -1- id 12: builtin(`defn', `foo') -> ``bar''
4494 @result{}bar
4495 indir(`my_defn', indir(`shift', `', `foo'))
4496 @error{}m4trace: -1- id 14: indir ... = <indir>
4497 @error{}m4trace: -2- id 15: indir ... = <indir>
4498 @error{}m4trace: -2- id 15: shift ... = <shift>
4499 @error{}m4trace: -2- id 15: shift(`', `foo') -> ``foo''
4500 @error{}m4trace: -2- id 15: indir(`shift', `', `foo') -> ``foo''
4501 @error{}m4trace: -1- id 14: my_defn ... = <defn>
4502 @error{}m4trace: -1- id 14: my_defn(`foo') -> ``bar''
4503 @error{}m4trace: -1- id 14: indir(`my_defn', `foo') -> ``bar''
4504 @result{}bar
4505 @end example
4506
4507 @node Debugmode
4508 @section Controlling debugging options
4509
4510 @cindex controlling debugging output
4511 @cindex debugging output, controlling
The @option{--debug} option to @code{m4} (also spelled
@option{--debugmode} or @option{-d}, @pxref{Debugging options, ,
Invoking m4}) controls the amount of detail presented in three
4515 categories of output.  Trace output is requested by @code{traceon}
4516 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
4517 relation to a macro invocation.  Debug output tracks useful events not
4518 associated with a macro invocation, and each line is prefixed by
4519 @samp{m4debug:}.  Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
4520 affected, with no prefix added to the output lines.
4521
4522 The @var{flags} following the option can be one or more of the
4523 following:
4524
4525 @table @code
4526 @item a
4527 In trace output, show the actual arguments that were collected before
4528 invoking the macro.  Arguments are subject to length truncation
4529 specified by @code{debuglen} (@pxref{Debuglen}).
4530
4531 @item c
4532 In trace output, show an additional line for each macro call, when the
4533 macro is seen, but before the arguments are collected, and show the
4534 definition of the macro that will be used for the expansion.  By
4535 default, only one line is printed, after all arguments are collected and
4536 the expansion determined.  The definition is subject to length
4537 truncation specified by @code{debuglen} (@pxref{Debuglen}).  This is
4538 often used with the @samp{x} flag.
4539
4540 @item d
4541 Output a warning on any attempt to dereference an undefined macro via
4542 @code{builtin}, @code{defn}, @code{dumpdef}, @code{indir},
@code{popdef}, or @code{undefine}.  Note that @code{ifdef},
@code{m4symbols},
@code{traceon}, and @code{traceoff} do not dereference undefined macros.
4546 Like any other warning, the warnings enabled by this flag go to standard
4547 error regardless of the current @code{debugfile} setting, and will
4548 change exit status if the command line option @option{--fatal-warnings}
4549 was specified.  This flag is useful in diagnosing spelling mistakes in
macro names.  It is enabled by default when neither @option{--debug} nor
@option{--fatal-warnings} is specified on the command line.
4552
4553 @item e
4554 In trace output, show the expansion of each macro call.  The expansion
4555 is subject to length truncation specified by @code{debuglen}
4556 (@pxref{Debuglen}).
4557
4558 @item f
4559 In debug and trace output, include the name of the current input file in
4560 the output line.
4561
4562 @item i
4563 In debug output, print a message each time the current input file is
4564 changed.
4565
4566 @item l
4567 In debug and trace output, include the current input line number in the
4568 output line.
4569
4570 @item m
4571 In debug output, print a message each time a module is manipulated
4572 (@pxref{Modules}).  In trace output when the @samp{c} flag is in effect,
4573 and in dumpdef output, follow builtin macros with their module name,
4574 surrounded by braces (@samp{@{@}}).
4575
4576 @item o
4577 Output @code{dumpdef} data to standard error instead of the current
4578 debug file.  This can be useful when post-processing trace output, where
4579 interleaving dumpdef and trace output can cause ambiguities.
4580
4581 @item p
4582 In debug output, print a message when a named file is found through the
4583 path search mechanism (@pxref{Search Path}), giving the actual file name
4584 used.
4585
4586 @item q
4587 In trace and dumpdef output, quote actual arguments and macro expansions
4588 in the display with the current quotes.  This is useful in connection
4589 with the @samp{a} and @samp{e} flags above.
4590
4591 @item s
4592 In dumpdef output, show the entire stack of definitions associated with
4593 a symbol via @code{pushdef}.
4594
4595 @item t
4596 In trace output, trace all macro calls made in this invocation of
4597 @code{m4}.  This is equivalent to using @code{traceon} without
4598 arguments.
4599
4600 @item x
4601 In trace output, add a unique `macro call id' to each line of the trace
4602 output.  This is useful in connection with the @samp{c} flag above, to
4603 match where a macro is first recognized with where it is finally
4604 expanded, in spite of intermediate expansions that occur while
4605 collecting arguments.  It can also be used in isolation to determine how
4606 many macros have been expanded.
4607
4608 @item V
4609 A shorthand for all of the above flags.
4610 @end table
4611
4612 As special cases, if @var{flags} starts with a @samp{+}, the named flags
4613 are enabled without impacting other flags, and if it starts with a
4614 @samp{-}, the named flags are disabled without impacting other flags.
4615 Without either of these starting characters, @var{flags} simply replaces
4616 the previous setting.
4617 @comment FIXME - should we accept usage like debugmode(+fl-q)?  Also,
4618 @comment should we add debugmode(?) which expands to the current
4619 @comment enabled flags, and debugmode(e?) which expands to e if e is
4620 @comment currently enabled?
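
The three forms can be combined in one session.  The following sketch
uses a hypothetical macro @code{foo}; it first replaces the flag set
outright, then removes one flag, then adds another, leaving the
remaining flags untouched each time.  The trace lines shown are what
the flag descriptions above predict for the @samp{a}, @samp{e},
@samp{q}, and @samp{l} flags.

@comment options: -d-V
@example
$ @kbd{m4}
define(`foo', `bar')traceon(`foo')
@result{}
debugmode(`aeq')
@result{}
foo(`x')
@error{}m4trace: -1- foo(`x') -> `bar'
@result{}bar
debugmode(`-e')
@result{}
foo(`x')
@error{}m4trace: -1- foo(`x')
@result{}bar
debugmode(`+l')
@result{}
foo(`x')
@error{}m4trace:7: -1- foo(`x')
@result{}bar
@end example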
4621
4622 If no flags are specified with the @option{--debug} option, the default is
4623 @samp{+adeq}.  Many examples in this manual show their output using
4624 default flags.
4625
4626 @cindex GNU extensions
4627 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4628 the debugging output format:
4629
4630 @deffn {Builtin (gnu)} debugmode (@ovar{flags})
4631 The argument @var{flags} should be a subset of the letters listed above.
4632 If no argument is present, all debugging flags are cleared (as if
4633 @var{flags} were an explicit @samp{-V}).  With an empty argument, the
4634 most common flags are enabled (as if @var{flags} were an explicit
4635 @samp{+adeq}).  If an unknown flag is encountered, an error is issued.
4636
4637 The expansion of @code{debugmode} is void.
4638 @end deffn
4639
4640 @comment options: -d-V
4641 @example
4642 $ @kbd{m4}
4643 define(`foo', `FOO$1')
4644 @result{}
4645 traceon(`foo', `divnum')
4646 @result{}
4647 debugmode()dnl same as debugmode(`+adeq')
4648 foo
4649 @error{}m4trace: -1- foo -> `FOO'
4650 @result{}FOO
4651 debugmode(`V')debugmode(`-q')
4652 @error{}m4trace:stdin:5: -1- id 7: debugmode ... = <debugmode>@{gnu@}
4653 @error{}m4trace:stdin:5: -1- id 7: debugmode(`-q') -> `'
4654 @result{}
4655 foo(
4656 `BAR')
4657 @error{}m4trace:stdin:6: -1- id 8: foo ... = FOO$1
4658 @error{}m4trace:stdin:6: -1- id 8: foo(BAR) -> FOOBAR
4659 @result{}FOOBAR
4660 debugmode`'dnl same as debugmode(`-V')
4661 @error{}m4trace:stdin:8: -1- id 9: debugmode ... = <debugmode>@{gnu@}
4662 @error{}m4trace:stdin:8: -1- id 9: debugmode ->@w{ }
4663 foo
4664 @error{}m4trace: -1- foo
4665 @result{}FOO
4666 debugmode(`+clmx')
4667 @result{}
4668 foo(divnum)
4669 @error{}m4trace:11: -1- id 13: foo ... = FOO$1
4670 @error{}m4trace:11: -2- id 14: divnum ... = <divnum>@{m4@}
4671 @error{}m4trace:11: -2- id 14: divnum
4672 @error{}m4trace:11: -1- id 13: foo
4673 @result{}FOO0
4674 debugmode(`-m')
4675 @result{}
4676 @end example
4677
4678 This example shows the effects of the debug flags that are not related
4679 to macro tracing.
4680
4681 @comment examples
4682 @comment options: -dip
4683 @example
4684 $ @kbd{m4 -dip -I examples}
4685 @error{}m4debug: input read from 'stdin'
4686 define(`foo', `m4wrap(`wrapped text
4687 ')dnl')
4688 @result{}
4689 include(`incl.m4')dnl
4690 @error{}m4debug: path search for 'incl.m4' found 'examples/incl.m4'
4691 @error{}m4debug: input read from 'examples/incl.m4'
4692 @result{}Include file start
4693 @result{}Include file end
4694 @error{}m4debug: input reverted to stdin, line 3
4695 ^D
4696 @error{}m4debug: input exhausted
4697 @error{}m4debug: input from m4wrap recursion level 1
4698 @result{}wrapped text
4699 @error{}m4debug: input from m4wrap exhausted
4700 @end example
4701
4702 @node Debuglen
4703 @section Limiting debug output
4704
4705 @cindex GNU extensions
4706 @cindex arglength
4707 @cindex debuglen
4708 @cindex limiting trace output length
4709 @cindex trace output, limiting length
4710 @cindex dumpdef output, limiting length
4711 When debugging, sometimes it is desirable to reduce the clutter of
4712 arbitrary-length strings, because the prefix carries enough information
4713 to understand the issues.  The builtin macro @code{debuglen}, along with
4714 the command line option counterpart @option{--debuglen} (or @option{-l},
@pxref{Debugging options, , Invoking m4}), allows on-the-fly control of
4716 debugging string lengths:
4717
4718 @deffn {Builtin (gnu)} debuglen (@var{len})
4719 The argument @var{len} is an integer that controls how much of
4720 arbitrary-length strings should be output during trace and dumpdef
output.  If set to a non-zero value, then strings longer than that
4722 length are truncated, and @samp{...} included in the output to show that
4723 truncation took place.  A warning is issued if @var{len} cannot be
4724 parsed as an integer.
4725 @comment FIXME - make this understand an optional suffix, similar to how
4726 @comment --debuglen does.  Also, we need a section documenting scaling
4727 @comment suffixes.
4728 @comment FIXME - should we allow len to be `?', meaning expand to the
4729 @comment current value?
4730
4731 The macro @code{debuglen} is recognized only with parameters.
4732 @end deffn
4733
4734 The following example demonstrates the behavior of length truncation.
4735 Note that each argument and the final result are individually truncated.
4736 Also, the special tokens for builtin functions are not truncated.
4737
4738 @comment options: -l6 -techo -tdefn
4739 @example
4740 $ @kbd{m4 -d -l 6 -t echo -t defn}
4741 debuglen(`oops')
4742 @error{}m4:stdin:1: warning: debuglen: non-numeric argument 'oops'
4743 @result{}
4744 define(`echo', `$@@')
4745 @result{}
4746 echo(`1', `long string')
4747 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4748 @result{}1,long string
4749 indir(`echo', defn(`changequote'))
4750 @error{}m4trace: -2- defn(`change...') -> `<changequote>'
4751 @error{}m4trace: -1- echo(<changequote>) -> ``<changequote>''
4752 @result{}
4753 debuglen
4754 @result{}debuglen
4755 debuglen(`0')
4756 @result{}
4757 echo(`long string')
4758 @error{}m4trace: -1- echo(`long string') -> ``long string''
4759 @result{}long string
4760 debuglen(`12')
4761 @result{}
4762 echo(`long string')
4763 @error{}m4trace: -1- echo(`long string') -> ``long string...'
4764 @result{}long string
4765 @end example
4766
4767 @node Debugfile
4768 @section Saving debugging output
4769
4770 @cindex saving debugging output
4771 @cindex debugging output, saving
4772 @cindex output, saving debugging
4773 @cindex GNU extensions
4774 Debug and tracing output can be redirected to files using either the
4775 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4776 Invoking m4}), or with the builtin macro @code{debugfile}:
4777
4778 @deffn {Builtin (gnu)} debugfile (@ovar{file})
4779 Send all further debug and trace output to @var{file}, opened in append
4780 mode.  If @var{file} is the empty string, debug and trace output are
4781 discarded.  If @code{debugfile} is called without any arguments, debug
4782 and trace output are sent to standard error.  Output from @code{dumpdef}
4783 is sent to this file if the debug level @code{o} is not set
4784 (@pxref{Debugmode}).  This does not affect
4785 warnings, error messages, or @code{errprint} output, which are
4786 always sent to standard error.  If @var{file} cannot be opened, the
4787 current debug file is unchanged, and an error is issued.
4788
4789 When the @option{--safer} option (@pxref{Operation modes, , Invoking
4790 m4}) is in effect, @var{file} must be empty or omitted, since otherwise
4791 an input file could cause the modification of arbitrary files.
4792
4793 The expansion of @code{debugfile} is void.
4794 @end deffn
4795
4796 @example
4797 $ @kbd{m4 -d}
4798 traceon(`divnum')
4799 @result{}
4800 divnum(`extra')
4801 @error{}m4:stdin:2: warning: divnum: extra arguments ignored: 1 > 0
4802 @error{}m4trace: -1- divnum(`extra') -> `0'
4803 @result{}0
4804 debugfile()
4805 @result{}
4806 divnum(`extra')
4807 @error{}m4:stdin:4: warning: divnum: extra arguments ignored: 1 > 0
4808 @result{}0
4809 debugfile
4810 @result{}
4811 divnum
4812 @error{}m4trace: -1- divnum -> `0'
4813 @result{}0
4814 @end example
4815
4816 Although the @option{--safer} option cripples @code{debugfile} to a
4817 limited subset of capabilities, you may still use the @option{--debugfile}
4818 option from the command line with no restrictions.
4819
4820 @comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l
4821 @comment status: 1
4822 @example
4823 $ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq}
4824 foo # traced to `trace'
4825 @result{}bar # traced to `trace'
4826 debugfile(`file')
4827 @error{}m4:stdin:2: debugfile: disabled by --safer
4828 @result{}
4829 foo # traced to `trace'
4830 @result{}bar # traced to `trace'
4831 debugfile()
4832 @result{}
4833 foo # trace discarded
4834 @result{}bar # trace discarded
4835 debugfile
4836 @result{}
4837 foo # traced to stderr
4838 @error{}m4trace:7: -1- foo -> `bar'
4839 @result{}bar # traced to stderr
4840 undivert(`trace')dnl
4841 @result{}m4trace:1: -1- foo -> `bar'
4842 @result{}m4trace:3: -1- foo -> `bar'
4843 @end example
4844
4845 Sometimes it is useful to post-process trace output, even though there
4846 is no standardized format for trace output.  In this situation, forcing
4847 @code{dumpdef} to output to standard error instead of the default of the
4848 current debug file will avoid any ambiguities between the two types of
4849 output; it also allows debugging via @code{dumpdef} when debug output is
4850 discarded.
4851
4852 @example
4853 $ @kbd{m4 -d}
4854 traceon(`divnum')
4855 @result{}
4856 divnum
4857 @error{}m4trace: -1- divnum -> `0'
4858 @result{}0
4859 dumpdef(`divnum')
4860 @error{}divnum:@tabchar{}<divnum>
4861 @result{}
4862 debugfile(`')
4863 @result{}
4864 divnum
4865 @result{}0
4866 dumpdef(`divnum')
4867 @result{}
4868 debugmode(`+o')
4869 @result{}
4870 divnum
4871 @result{}0
4872 dumpdef(`divnum')
4873 @error{}divnum:@tabchar{}<divnum>
4874 @result{}
4875 @end example
4876
4877 @node Input Control
4878 @chapter Input control
4879
4880 This chapter describes various builtin macros for controlling the input
4881 to @code{m4}.
4882
4883 @menu
4884 * Dnl::                         Deleting whitespace in input
4885 * Changequote::                 Changing the quote characters
4886 * Changecom::                   Changing the comment delimiters
4887 * Changeresyntax::              Changing the regular expression syntax
4888 * Changesyntax::                Changing the lexical structure of the input
4889 * M4wrap::                      Saving text until end of input
4890 @end menu
4891
4892 @node Dnl
4893 @section Deleting whitespace in input
4894
4895 @cindex deleting whitespace in input
4896 @cindex discarding input
4897 @cindex input, discarding
4898 The builtin @code{dnl} stands for ``Discard to Next Line'':
4899
4900 @deffn {Builtin (m4)} dnl
4901 All characters, up to and including the next newline, are discarded
4902 without performing any macro expansion.  A warning is issued if the end
4903 of the file is encountered without a newline.
4904
4905 The expansion of @code{dnl} is void.
4906 @end deffn
4907
4908 It is often used in connection with @code{define}, to remove the
4909 newline that follows the call to @code{define}.  Thus
4910
4911 @example
4912 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4913 foo
4914 @result{}Macro foo.
4915 @end example
4916
4917 The input up to and including the next newline is discarded, as opposed
4918 to the way comments are treated (@pxref{Comments}), when the command
4919 line option @option{--discard-comments} is not in effect
4920 (@pxref{Operation modes, , Invoking m4}).
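
As a further sketch of this difference (the macro names @code{foo} and
@code{baz} are arbitrary and chosen only for illustration), the
following contrasts a comment, which is copied to the output verbatim,
with @code{dnl}, which removes both the remaining text and the newline:

@example
define(`foo', `bar')
@result{}
define(`baz', `qux')dnl no blank line is produced here
# foo is not expanded inside a comment
@result{}# foo is not expanded inside a comment
foo dnl baz and the newline vanish
and baz
@result{}bar and qux
@end example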
4921
4922 Usually, @code{dnl} is immediately followed by an end of line or some
4923 other whitespace.  GNU @code{m4} will produce a warning diagnostic if
4924 @code{dnl} is followed by an open parenthesis.  In this case, @code{dnl}
4925 will collect and process all arguments, looking for a matching close
4926 parenthesis.  All predictable side effects resulting from this
4927 collection will take place.  @code{dnl} will return no output.  The
4928 input following the matching close parenthesis, up to and including the
4929 next newline, is still discarded, on whichever line that parenthesis appears.
4930
4931 @example
4932 dnl(`args are ignored, but side effects occur',
4933 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4934 @error{}m4:stdin:1: warning: dnl: extra arguments ignored: 2 > 0
4935 See how `foo' was defined, foo?
4936 @result{}See how foo was defined, like this?
4937 @end example
4938
4939 If the end of file is encountered without a newline character, a
4940 warning is issued and @code{dnl} stops consuming input.
4941
4942 @example
4943 m4wrap(`m4wrap(`2 hi
4944 ')0 hi dnl 1 hi')
4945 @result{}
4946 define(`hi', `HI')
4947 @result{}
4948 ^D
4949 @error{}m4:stdin:1: warning: dnl: end of file treated as newline
4950 @result{}0 HI 2 HI
4951 @end example
4952
4953 @node Changequote
4954 @section Changing the quote characters
4955
4956 @cindex changing quote delimiters
4957 @cindex quote delimiters, changing
4958 @cindex delimiters, changing
4959 The default quote delimiters can be changed with the builtin
4960 @code{changequote}:
4961
4962 @deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '})
4963 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4964 the new end-quote delimiter.  If both arguments are missing, the default
4965 quotes (@code{`} and @code{'}) are used.  If @var{start} is void, then
4966 quoting is disabled.  Otherwise, if @var{end} is missing or void, the
4967 default end-quote delimiter (@code{'}) is used.  The quote delimiters
4968 can be of any length.
4969
4970 The expansion of @code{changequote} is void.
4971 @end deffn
4972
4973 @example
4974 changequote(`[', `]')
4975 @result{}
4976 define([foo], [Macro [foo].])
4977 @result{}
4978 foo
4979 @result{}Macro foo.
4980 @end example
4981
4982 The quotation strings can safely contain eight-bit characters.
4983 If no single character is appropriate, @var{start} and @var{end} can be
4984 of any length.  Other implementations cap the delimiter length to five
4985 characters, but GNU has no inherent limit.
4986
4987 @example
4988 changequote(`[[[', `]]]')
4989 @result{}
4990 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4991 @result{}
4992 foo
4993 @result{}Macro [[foo]].
4994 @end example
4995
4996 Calling @code{changequote} with @var{start} as the empty string will
4997 effectively disable the quoting mechanism, leaving no way to quote text.
4998 However, using an empty string is not portable, as some other
4999 implementations of @code{m4} revert to the default quoting, while others
5000 preserve the prior non-empty delimiter.  If @var{start} is not empty,
5001 then an empty @var{end} will use the default end-quote delimiter of
5002 @samp{'}, as otherwise, it would be impossible to end a quoted string.
5003 Again, this is not portable, as some other @code{m4} implementations
5004 reuse @var{start} as the end-quote delimiter, while others preserve the
5005 previous non-empty value.  Omitting both arguments restores the default
5006 begin-quote and end-quote delimiters; fortunately this behavior is
5007 portable to all implementations of @code{m4}.
5008
5009 @example
5010 define(`foo', `Macro `FOO'.')
5011 @result{}
5012 changequote(`', `')
5013 @result{}
5014 foo
5015 @result{}Macro `FOO'.
5016 `foo'
5017 @result{}`Macro `FOO'.'
5018 changequote(`,)
5019 @result{}
5020 foo
5021 @result{}Macro FOO.
5022 @end example
5023
5024 There is no way in @code{m4} to quote a string containing an unmatched
5025 begin-quote, except using @code{changequote} to change the current
5026 quotes.
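
For instance, a lone begin-quote character can be emitted by
temporarily switching to a different pair of delimiters.  This is only
a sketch; the choice of @samp{[} and @samp{]} as temporary quotes is
arbitrary.

@example
changequote(`[', `]')
@result{}
An unmatched [`] is harmless here.
@result{}An unmatched ` is harmless here.
changequote([`], ['])
@result{}
@end example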
5027
5028 If the quotes should be changed from, say, @samp{[} to @samp{[[},
5029 temporary quote characters have to be defined.  To achieve this, two
5030 calls of @code{changequote} must be made, one for the temporary quotes
5031 and one for the new quotes.
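
A minimal sketch of such a two-step change follows; the temporary
delimiters @samp{<<} and @samp{>>} are arbitrary, and any strings that
do not clash with the old or new quotes would do.

@example
changequote(`[', `]')
@result{}
changequote([<<], [>>])
@result{}
changequote(<<[[>>, <<]]>>)
@result{}
define([[foo]], [[Macro [[foo]].]])
@result{}
foo
@result{}Macro foo.
@end example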
5032
5033 Macros are recognized in preference to the begin-quote string, so if a
5034 prefix of @var{start} can be recognized as part of a potential macro
5035 name, the quoting mechanism is effectively disabled.  Unless you use
5036 @code{changesyntax} (@pxref{Changesyntax}), this means that @var{start}
5037 should not begin with a letter, digit, or @samp{_} (underscore).
5038 However, even though quoted strings are not recognized, the quote
5039 characters can still be discerned in macro expansion and in trace
5040 output.
5041
5042 @example
5043 define(`echo', `$@@')
5044 @result{}
5045 define(`hi', `HI')
5046 @result{}
5047 changequote(`q', `Q')
5048 @result{}
5049 q hi Q hi
5050 @result{}q HI Q HI
5051 echo(hi)
5052 @result{}qHIQ
5053 changequote
5054 @result{}
5055 changequote(`-', `EOF')
5056 @result{}
5057 - hi EOF hi
5058 @result{} hi  HI
5059 changequote
5060 @result{}
5061 changequote(`1', `2')
5062 @result{}
5063 hi1hi2
5064 @result{}hi1hi2
5065 hi 1hi2
5066 @result{}HI hi
5067 @end example
5068
5069 Quotes are recognized in preference to argument collection.  In
5070 particular, if @var{start} is a single @samp{(}, then argument
5071 collection is effectively disabled.  For portability with other
5072 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5073 @samp{)} as the first character in @var{start}.
5074
5075 @example
5076 define(`echo', `$#:$@@:')
5077 @result{}
5078 define(`hi', `HI')
5079 @result{}
5080 changequote(`(',`)')
5081 @result{}
5082 echo(hi)
5083 @result{}0::hi
5084 changequote
5085 @result{}
5086 changequote(`((', `))')
5087 @result{}
5088 echo(hi)
5089 @result{}1:HI:
5090 echo((hi))
5091 @result{}0::hi
5092 changequote
5093 @result{}
5094 changequote(`,', `)')
5095 @result{}
5096 echo(hi,hi)bye)
5097 @result{}1:HIhibye:
5098 @end example
5099
5100 However, if you are not worried about portability, using @samp{(} and
5101 @samp{)} as quoting characters has an interesting property---you can use
5102 it to compute a quoted string containing the expansion of any quoted
5103 text, as long as the expansion results in both balanced quotes and
5104 balanced parentheses.  The trick is realizing @code{expand} uses
5105 @samp{$1} unquoted, to trigger its expansion using the normal quoting
5106 characters, but uses extra parentheses to group unquoted commas that
5107 occur in the expansion without consuming whitespace following those
5108 commas.  Then @code{_expand} uses @code{changequote} to convert the
5109 extra parentheses back into quoting characters.  Note that it takes two
5110 more @code{changequote} invocations to restore the original quotes.
5111 Contrast the behavior on whitespace when using @samp{$*}, via
5112 @code{quote}, to attempt the same task.
5113
5114 @example
5115 changequote(`[', `]')dnl
5116 define([a], [1, (b)])dnl
5117 define([b], [2])dnl
5118 define([quote], [[$*]])dnl
5119 define([expand], [_$0(($1))])dnl
5120 define([_expand],
5121   [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
5122 expand([a, a, [a, a], [[a, a]]])
5123 @result{}1, (2), 1, (2), a, a, [a, a]
5124 quote(a, a, [a, a], [[a, a]])
5125 @result{}1,(2),1,(2),a, a,[a, a]
5126 @end example
5127
5128 If @var{end} is a prefix of @var{start}, the end-quote will be
5129 recognized in preference to a nested begin-quote.  In particular,
5130 changing the quotes to have the same string for @var{start} and
5131 @var{end} disables nesting of quotes.  When quote nesting is disabled,
5132 it is impossible to double-quote strings across macro expansions, so
5133 using the same string is not done very often.
5134
5135 @example
5136 define(`hi', `HI')
5137 @result{}
5138 changequote(`""', `"')
5139 @result{}
5140 ""hi"""hi"
5141 @result{}hihi
5142 ""hi" ""hi"
5143 @result{}hi hi
5144 ""hi"" "hi"
5145 @result{}hi" "HI"
5146 changequote
5147 @result{}
5148 `hi`hi'hi'
5149 @result{}hi`hi'hi
5150 changequote(`"', `"')
5151 @result{}
5152 "hi"hi"hi"
5153 @result{}hiHIhi
5154 @end example
5155
5156 It is an error if the end of file occurs within a quoted string.
5157
5158 @comment status: 1
5159 @example
5160 `hello world'
5161 @result{}hello world
5162 `dangling quote
5163 ^D
5164 @error{}m4:stdin:2: end of file in string
5165 @end example
5166
5167 @comment status: 1
5168 @example
5169 ifelse(`dangling quote
5170 ^D
5171 @error{}m4:stdin:1: ifelse: end of file in string
5172 @end example
5173
5174 @node Changecom
5175 @section Changing the comment delimiters
5176
5177 @cindex changing comment delimiters
5178 @cindex comment delimiters, changing
5179 @cindex delimiters, changing
5180 The default comment delimiters can be changed with the builtin
5181 macro @code{changecom}:
5182
5183 @deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}})
5184 This sets @var{start} as the new begin-comment delimiter and @var{end}
5185 as the new end-comment delimiter.  If both arguments are missing, or
5186 @var{start} is void, then comments are disabled.  Otherwise, if
5187 @var{end} is missing or void, the default end-comment delimiter of
5188 newline is used.  The comment delimiters can be of any length.
5189
5190 The expansion of @code{changecom} is void.
5191 @end deffn
5192
5193 @example
5194 define(`comment', `COMMENT')
5195 @result{}
5196 # A normal comment
5197 @result{}# A normal comment
5198 changecom(`/*', `*/')
5199 @result{}
5200 # Not a comment anymore
5201 @result{}# Not a COMMENT anymore
5202 But: /* this is a comment now */ while this is not a comment
5203 @result{}But: /* this is a comment now */ while this is not a COMMENT
5204 @end example
5205
5206 @cindex comments, copied to output
5207 Note how comments are copied to the output, much as if they were quoted
5208 strings.  If you want the text inside a comment expanded, quote the
5209 begin-comment delimiter.
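
A brief sketch of that quoting technique, reusing the @code{comment}
macro from the example above:

@example
define(`comment', `COMMENT')
@result{}
# a comment
@result{}# a comment
`#' a comment
@result{}# a COMMENT
@end example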
5210
5211 Calling @code{changecom} without any arguments, or with @var{start} as
5212 the empty string, will effectively disable the commenting mechanism.  To
5213 restore the original comment start of @samp{#}, you must explicitly ask
5214 for it.  If @var{start} is not empty, then an empty @var{end} will use
5215 the default end-comment delimiter of newline, as otherwise, it would be
5216 impossible to end a comment.  However, this is not portable, as some
5217 other @code{m4} implementations preserve the previous non-empty
5218 delimiters instead.
5219
5220 @example
5221 define(`comment', `COMMENT')
5222 @result{}
5223 changecom
5224 @result{}
5225 # Not a comment anymore
5226 @result{}# Not a COMMENT anymore
5227 changecom(`#', `')
5228 @result{}
5229 # comment again
5230 @result{}# comment again
5231 @end example
5232
5233 The comment strings can safely contain eight-bit characters.
5234 If no single character is appropriate, @var{start} and @var{end} can be
5235 of any length.  Other implementations cap the delimiter length to five
5236 characters, but GNU has no inherent limit.
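
For example, multi-character delimiters in the style of HTML comments
can be selected.  This is only a sketch; the macro @code{hi} is defined
purely for illustration.

@example
define(`hi', `HI')
@result{}
changecom(`<!--', `-->')
@result{}
hi <!-- hi --> hi
@result{}HI <!-- hi --> HI
@end example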
5237
5238 As of M4 1.6, macros and quotes are recognized in preference to
5239 comments, so if a prefix of @var{start} can be recognized as part of a
5240 potential macro name, or confused with a quoted string, the comment
5241 mechanism is effectively disabled (earlier versions of GNU M4
5242 favored comments, but this was inconsistent with other implementations).
5243 Unless you use @code{changesyntax} (@pxref{Changesyntax}), this means
5244 that @var{start} should not begin with a letter, digit, or @samp{_}
5245 (underscore), and that neither the start-quote nor the start-comment
5246 string should be a prefix of the other.
5247
5248 @example
5249 define(`hi', `HI')
5250 @result{}
5251 define(`hi1hi2', `hello')
5252 @result{}
5253 changecom(`q', `Q')
5254 @result{}
5255 q hi Q hi
5256 @result{}q HI Q HI
5257 changecom(`1', `2')
5258 @result{}
5259 hi1hi2
5260 @result{}hello
5261 hi 1hi2
5262 @result{}HI 1hi2
5263 changecom(`[[', `]]')
5264 @result{}
5265 changequote(`[[[', `]]]')
5266 @result{}
5267 [hi]
5268 @result{}[HI]
5269 [[hi]]
5270 @result{}[[hi]]
5271 [[[hi]]]
5272 @result{}hi
5273 changequote
5274 @result{}
5275 changecom(`[[[', `]]]')
5276 @result{}
5277 changequote(`[[', `]]')
5278 @result{}
5279 [[hi]]
5280 @result{}hi
5281 [[[hi]]]
5282 @result{}[hi]
5283 @end example
5284
5285 Comments are recognized in preference to argument collection.  In
5286 particular, if @var{start} is a single @samp{(}, then argument
5287 collection is effectively disabled.  For portability with other
5288 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5289 @samp{)} as the first character in @var{start}.
5290
5291 @example
5292 define(`echo', `$#:$*:$@@:')
5293 @result{}
5294 define(`hi', `HI')
5295 @result{}
5296 changecom(`(',`)')
5297 @result{}
5298 echo(hi)
5299 @result{}0:::(hi)
5300 changecom
5301 @result{}
5302 changecom(`((', `))')
5303 @result{}
5304 echo(hi)
5305 @result{}1:HI:HI:
5306 echo((hi))
5307 @result{}0:::((hi))
5308 changecom(`,', `)')
5309 @result{}
5310 echo(hi,hi)bye)
5311 @result{}1:HI,hi)bye:HI,hi)bye:
5312 changecom
5313 @result{}
5314 echo(hi,`,`'hi',hi)
5315 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
5316 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
5317 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
5318 @end example
5319
5320 It is an error if the end of file occurs within a comment.
5321
5322 @comment status: 1
5323 @example
5324 changecom(`/*', `*/')
5325 @result{}
5326 /*dangling comment
5327 ^D
5328 @error{}m4:stdin:2: end of file in comment
5329 @end example
5330
5331 @comment status: 1
5332 @example
5333 changecom(`/*', `*/')
5334 @result{}
5335 len(/*dangling comment
5336 ^D
5337 @error{}m4:stdin:2: len: end of file in comment
5338 @end example
5339
5340 @node Changeresyntax
5341 @section Changing the regular expression syntax
5342
5343 @cindex regular expression syntax, changing
5344 @cindex basic regular expressions
5345 @cindex extended regular expressions
5346 @cindex regular expressions
5347 @cindex expressions, regular
5348 @cindex syntax, changing regular expression
5349 @cindex flavors of regular expressions
5350 @cindex GNU extensions
5351 The GNU extensions @code{patsubst}, @code{regexp}, and, more
5352 recently, @code{renamesyms} each deal with regular expressions.  There
5353 are multiple flavors of regular expressions, so the
5354 @code{changeresyntax} builtin exists to allow choosing the default
5355 flavor:
5356
5357 @deffn {Builtin (gnu)} changeresyntax (@var{resyntax})
5358 Changes the default regular expression syntax used by M4 according to
5359 the value of @var{resyntax}, equivalent to passing @var{resyntax} as the
5360 argument to the command line option @option{--regexp-syntax}
5361 (@pxref{Operation modes, , Invoking m4}).  If @var{resyntax} is empty,
5362 the default flavor is reverted to the @code{GNU_M4} style, compatible
5363 with Emacs.
5364
5365 @var{resyntax} can be any one of the values in the table below.  Case is
5366 not important, and @samp{-} or @samp{ } can be substituted for @samp{_} in
5367 the given names.  If @var{resyntax} is unrecognized, a warning is
5368 issued and the default flavor is not changed.
5369
5370 @table @dfn
5371 @item AWK
5372 @xref{awk regular expression syntax}, for details.
5373
5374 @item BASIC
5375 @itemx ED
5376 @itemx POSIX_BASIC
5377 @itemx SED
5378 @xref{posix-basic regular expression syntax}, for details.
5379
5380 @item BSD_M4
5381 @itemx EXTENDED
5382 @itemx POSIX_EXTENDED
5383 @xref{posix-extended regular expression syntax}, for details.
5384
5385 @item GNU_AWK
5386 @itemx GAWK
5387 @xref{gnu-awk regular expression syntax}, for details.
5388
5389 @item GNU_EGREP
5390 @itemx EGREP
5391 @xref{egrep regular expression syntax}, for details.
5392
5393 @item GNU_M4
5394 @itemx EMACS
5395 @itemx GNU_EMACS
5396 @xref{emacs regular expression syntax}, for details.  This is the
5397 default regular expression flavor.
5398
5399 @item GREP
5400 @xref{grep regular expression syntax}, for details.
5401
5402 @item MINIMAL
5403 @itemx POSIX_MINIMAL
5404 @itemx POSIX_MINIMAL_BASIC
5405 @xref{posix-minimal-basic regular expression syntax}, for details.
5406
5407 @item POSIX_AWK
5408 @xref{posix-awk regular expression syntax}, for details.
5409
5410 @item POSIX_EGREP
5411 @xref{posix-egrep regular expression syntax}, for details.
5412 @end table
5413
5414 The expansion of @code{changeresyntax} is void.
5415 The macro @code{changeresyntax} is recognized only with parameters.
5416 This macro was added in M4 2.0.
5417 @end deffn
5418
5419 For an example of how @var{resyntax} is recognized, the first three
5420 usages select the @samp{GNU_M4} regular expression flavor:
5421
5422 @example
5423 changeresyntax(`gnu m4')
5424 @result{}
5425 changeresyntax(`GNU-m4')
5426 @result{}
5427 changeresyntax(`Gnu_M4')
5428 @result{}
5429 changeresyntax(`unknown')
5430 @error{}m4:stdin:4: warning: changeresyntax: bad syntax-spec: 'unknown'
5431 @result{}
5432 @end example
5433
5434 Using @code{changeresyntax} makes it possible to omit the optional
5435 @var{resyntax} parameter to other macros, while still using a different
5436 regular expression flavor.
5437
5438 @example
5439 patsubst(`ab', `a|b', `c')
5440 @result{}ab
5441 patsubst(`ab', `a\|b', `c')
5442 @result{}cc
5443 patsubst(`ab', `a|b', `c', `EXTENDED')
5444 @result{}cc
5445 changeresyntax(`EXTENDED')
5446 @result{}
5447 patsubst(`ab', `a|b', `c')
5448 @result{}cc
5449 patsubst(`ab', `a\|b', `c')
5450 @result{}ab
5451 @end example
5452
5453 @node Changesyntax
5454 @section Changing the lexical structure of the input
5455
5456 @cindex lexical structure of the input
5457 @cindex input, lexical structure of the
5458 @cindex syntax table
5459 @cindex changing syntax
5460 @cindex GNU extensions
5461 @quotation
5462 The macro @code{changesyntax} and all associated functionality is
5463 experimental (@pxref{Experiments}).  The functionality might change in
5464 the future.  Please direct your comments about it the same way you would
5465 do for bugs.
5466 @end quotation
5467
5468 The input to @code{m4} is read character by character, and these
5469 characters are grouped together to form input tokens (such as macro
5470 names, strings, comments, etc.).
5471
5472 Each token is parsed according to certain rules.  For example, a macro
5473 name starts with a letter or @samp{_} and consists of the longest
5474 possible string of letters, @samp{_} and digits.  But who is to decide
5475 what characters are letters, digits, quotes, or white space?  Earlier, the
5476 operating system decided; now you do.  The builtin macro
5477 @code{changesyntax} is used to change the way @code{m4} parses the input
5478 stream into tokens.
5479
5480 @deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
5481 Each @var{syntax-spec} is a two-part string.  The first part is a
5482 command, consisting of a single character describing a syntax category,
5483 and an optional one-character action.  The action can be @samp{-} to
5484 remove the listed characters from that category, @samp{=} to set the
5485 category to the listed characters
5486 and reassign all other characters previously in that category to
5487 `Other', or @samp{+} to add the listed characters to the category
5488 without affecting other characters.  If an action is not specified, but
5489 additional characters are present, then @samp{=} is assumed.
5490
5491 The remaining characters of each @var{syntax-spec} form the set of
5492 characters to perform the action on for that syntax category.  Character
5493 ranges are expanded as for @code{translit} (@pxref{Translit}).  To start
5494 the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
5495 be specified.
5496
5497 If @var{syntax-spec} is just a category, and no action or characters
5498 were specified, then all characters in that category are reset to their
5499 default state.  A warning is issued if the category character is not
5500 valid.  If @var{syntax-spec} is the empty string, then all categories
5501 are reset to their default state.
5502
5503 Syntax categories are divided into basic and context.  Every input
5504 byte belongs to exactly one basic syntax category.  Additionally, any
5505 byte can be assigned to a context category regardless of its current
5506 basic category.  Context categories exist because a character can
5507 behave differently when parsed in isolation than when it occurs in
5508 context to close out a token started by another basic category (for
5509 example, @kbd{newline} defaults to the basic category `White space' as
5510 well as the context category `End comment').
5511
5512 The following table describes the case-insensitive designation for each
5513 syntax category (the first byte in @var{syntax-spec}), and a description
5514 of what each category controls.
5515
5516 @multitable @columnfractions .06 .20 .13 .55
5517 @headitem Code @tab Category @tab Type @tab Description
5518
5519 @item @kbd{W} @tab @dfn{Words} @tab Basic
5520 @tab Characters that can start a macro name.  Defaults to the letters as
5521 defined by the locale, and the character @samp{_}.
5522
5523 @item @kbd{D} @tab @dfn{Digits} @tab Basic
5524 @tab Characters that, together with the letters, form the remainder of a
5525 macro name.  Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
5526 other digits defined by the locale.
5527
5528 @item @kbd{S} @tab @dfn{White space} @tab Basic
5529 @tab Characters that should be trimmed from the beginning of each argument to
5530 a macro call.  The defaults are space, tab, newline, carriage return,
5531 form feed, and vertical tab, and any others as defined by the locale.
5532
5533 @item @kbd{(} @tab @dfn{Open parenthesis} @tab Basic
5534 @tab Characters that open the argument list of a macro call.  The default is
5535 the single character @samp{(}.
5536
5537 @item @kbd{)} @tab @dfn{Close parenthesis} @tab Basic
5538 @tab Characters that close the argument list of a macro call.  The default
5539 is the single character @samp{)}.
5540
5541 @item @kbd{,} @tab @dfn{Argument separator} @tab Basic
5542 @tab Characters that separate the arguments of a macro call.  The default is
5543 the single character @samp{,}.
5544
5545 @item @kbd{L} @tab @dfn{Left quote} @tab Basic
5546 @tab The set of characters that can start a single-character quoted string.
5547 The default is the single character @samp{`}.  For multiple-character
5548 quote delimiters, use @code{changequote} (@pxref{Changequote}).
5549
5550 @item @kbd{R} @tab @dfn{Right quote} @tab Context
5551 @tab The set of characters that can end a single-character quoted string.
5552 The default is the single character @samp{'}.  For multiple-character
5553 quote delimiters, use @code{changequote} (@pxref{Changequote}).  Note
5554 that @samp{'} also defaults to the syntax category `Other', when it
5555 appears in isolation.
5556
5557 @item @kbd{B} @tab @dfn{Begin comment} @tab Basic
5558 @tab The set of characters that can start a single-character comment.  The
5559 default is the single character @samp{#}.  For multiple-character
5560 comment delimiters, use @code{changecom} (@pxref{Changecom}).
5561
5562 @item @kbd{E} @tab @dfn{End comment} @tab Context
5563 @tab The set of characters that can end a single-character comment.  The
5564 default is the single character @kbd{newline}.  For multiple-character
5565 comment delimiters, use @code{changecom} (@pxref{Changecom}).  Note that
5566 newline also defaults to the syntax category `White space', when it
5567 appears in isolation.
5568
5569 @item @kbd{$} @tab @dfn{Dollar} @tab Context
5570 @tab Characters that can introduce an argument reference in the body of a
5571 macro.  The default is the single character @samp{$}.
5572
5573 @comment FIXME - implement ${10} argument parsing.
5574 @item @kbd{@{} @tab @dfn{Left brace} @tab Context
5575 @tab Characters that introduce an extended argument reference in the body of
5576 a macro immediately after a character in the Dollar category.  The
5577 default is the single character @samp{@{}.
5578
5579 @item @kbd{@}} @tab @dfn{Right brace} @tab Context
5580 @tab Characters that conclude an extended argument reference in the body of a
5581 macro.  The default is the single character @samp{@}}.
5582
5583 @item @kbd{O} @tab @dfn{Other} @tab Basic
5584 @tab Characters that have no special syntactical meaning to @code{m4}.
5585 Defaults to all characters except those in the categories above.
5586
5587 @item @kbd{A} @tab @dfn{Active} @tab Basic
5588 @tab Characters that themselves, alone, form macro names.  This is a
5589 GNU extension, and active characters have lower precedence
5590 than comments.  By default, no characters are active.
5591
5592 @item @kbd{@@} @tab @dfn{Escape} @tab Basic
5593 @tab Characters that must precede macro names for them to be recognized.
5594 This is a GNU extension.  When an escape character is defined,
5595 then macros are not recognized unless the escape character is present;
5596 however, the macro name, visible by @samp{$0} in macro definitions, does
5597 not include the escape character.  By default, no characters are
5598 escapes.
5599
5600 @comment FIXME - we should also consider supporting:
5601 @comment @item @kbd{I} @tab @dfn{Ignore} @tab Basic
5602 @comment @tab Characters that are ignored if they appear in
5603 @comment the input; perhaps defaulting to '\0'.
5604 @end multitable
5605
5606 The expansion of @code{changesyntax} is void.
5607 The macro @code{changesyntax} is recognized only with parameters.  Use
5608 this macro with caution, as it is possible to change the syntax in such
5609 a way that no further macros can be recognized by @code{m4}.
5610 This macro was added in M4 2.0.
5611 @end deffn
5612
5613 With @code{changesyntax} we can modify what characters form a word.  For
5614 example, we can make @samp{.} a valid character in a macro name, or even
5615 start a macro name with a number.
5616
5617 @example
5618 define(`test.1', `TEST ONE')
5619 @result{}
5620 define(`1', `one')
5621 @result{}
5622 __file__
5623 @result{}stdin
5624 test.1
5625 @result{}test.1
5626 dnl Add `.' and remove `_'.
5627 changesyntax(`W+.', `W-_')
5628 @result{}
5629 __file__
5630 @result{}__file__
5631 test.1
5632 @result{}TEST ONE
5633 dnl Set words to include numbers.
5634 changesyntax(`W=a-zA-Z0-9_')
5635 @result{}
5636 __file__
5637 @result{}stdin
5638 test.1
5639 @result{}test.one
5640 dnl Reset words to default (a-zA-Z_).
5641 changesyntax(`W')
5642 @result{}
5643 __file__
5644 @result{}stdin
5645 test.1
5646 @result{}test.1
5647 @end example
5648
5649 Another possibility is to change the syntax of a macro call.
5650
5651 @example
5652 define(`test', `$#')
5653 @result{}
5654 test(a, b, c)
5655 @result{}3
5656 dnl Change macro syntax.
5657 changesyntax(`(<', `,|', `)>')
5658 @result{}
5659 test(a, b, c)
5660 @result{}0(a, b, c)
5661 test<a|b|c>
5662 @result{}3
5663 @end example
5664
5665 Leading whitespace is normally stripped from macro arguments in @code{m4}, but
5666 by changing the syntax categories we can avoid it.  The use of
5667 @code{format} is an alternative to using a literal tab character.
5668
5669 @example
5670 define(`test', `$1$2$3')
5671 @result{}
5672 test(`a', `b', `c')
5673 @result{}abc
5674 dnl Don't ignore whitespace.
5675 changesyntax(`O 'format(``%c'', `9')`
5676 ')
5677 @result{}
5678 test(a, b,
5679 c)
5680 @result{}a b
5681 @result{}c
5682 @end example
5683
5684 It is possible to redefine the @samp{$} used to indicate macro arguments
5685 in user defined macros.  Dollar class syntax elements are copied to the
5686 output if there is no valid expansion.
5687
5688 @example
5689 define(`argref', `Dollar: $#, Question: ?#')
5690 @result{}
5691 argref(1, 2, 3)
5692 @result{}Dollar: 3, Question: ?#
5693 dnl Change argument identifier.
5694 changesyntax(`$?')
5695 @result{}
5696 argref(1,2,3)
5697 @result{}Dollar: $#, Question: 3
5698 define(`escape', `$?`'1$?1?')
5699 @result{}
5700 escape(foo)
5701 @result{}$?1$foo?
5702 dnl Multiple argument identifiers.
5703 changesyntax(`$+$')
5704 @result{}
5705 argref(1, 2, 3)
5706 @result{}Dollar: 3, Question: 3
5707 @end example
5708
5709 Macro calls can be given a @TeX{}- or Texinfo-like syntax using an
5710 escape.  If one or more characters are defined as escapes, macro names
5711 are only recognized if preceded by an escape character.
5712
5713 If the escape is not followed by what is normally a word (a letter
5714 optionally followed by letters and/or numerals), that single character
5715 is returned as a macro name.
5716
5717 As always, words without a macro definition cause no error message.
5718 They and the escape character are simply output.
5719
5720 @example
5721 define(`foo', `bar')
5722 @result{}
5723 dnl Require @@ escape before any macro.
5724 changesyntax(`@@@@')
5725 @result{}
5726 foo
5727 @result{}foo
5728 @@foo
5729 @result{}bar
5730 @@bar
5731 @result{}@@bar
5732 @@dnl Change escape character.
5733 @@changesyntax(`@@\', `O@@')
5734 @result{}
5735 foo
5736 @result{}foo
5737 @@foo
5738 @result{}@@foo
5739 \foo
5740 @result{}bar
5741 define(`#', `No comment')
5742 @result{}define(#, No comment)
5743 \define(`#', `No comment')
5744 @result{}
5745 \# \foo # Comment \foo
5746 @result{}No comment bar # Comment \foo
5747 @end example
5748
5749 Active characters are known from @TeX{}.  In @code{m4} an active
5750 character is always seen as a one-letter word, and so, if it has a macro
5751 definition, the macro will be called.
5752
5753 @example
5754 define(`@@', `TEST')
5755 @result{}
5756 define(`a@@a', `hello')
5757 @result{}
5758 define(`a', `A')
5759 @result{}
5760 @@
5761 @result{}@@
5762 a@@a
5763 @result{}A@@A
5764 dnl Make @@ active.
5765 changesyntax(`A@@')
5766 @result{}
5767 @@
5768 @result{}TEST
5769 a@@a
5770 @result{}ATESTa
5771 @end example
5772
5773 There is obviously an overlap between @code{changesyntax} and
5774 @code{changequote}, since there are now two ways to modify quote
5775 delimiters.  To avoid incompatibilities, if the quotes are modified by
5776 @code{changequote}, any characters previously set to either quote
5777 delimiter by @code{changesyntax} are first demoted to the other category
5778 (@samp{O}), so the result is only a single set of quotes.  In the other
5779 direction, if quotes were already disabled, or if both the start and end
5780 delimiter set by @code{changequote} are single bytes, then
5781 @code{changesyntax} preserves those settings.  But if either delimiter
5782 occupies multiple bytes, @code{changesyntax} first disables both
5783 delimiters.  Quotes can be disabled via @code{changesyntax} by emptying
5784 the left quote basic category (@samp{L}).  Meanwhile, the right quote
5785 context category (@samp{R}) will never be empty; if a
5786 @code{changesyntax} action would otherwise leave that category empty,
5787 then the default end delimiter from @code{changequote} (@samp{'}) is
5788 used; thus, it is never possible to get @code{m4} in a state where a
5789 quoted string cannot be terminated.  These interactions apply to comment
5790 delimiters as well, @i{mutatis mutandis} with @code{changecom}.
5791
5792 @example
5793 define(`test', `TEST')
5794 @result{}
5795 dnl Add additional single-byte delimiters.
5796 changesyntax(`L+<', `R+>')
5797 @result{}
5798 <test> `test' [test] <<test>>
5799 @result{}test test [TEST] <test>
5800 dnl Use standard interface, overriding changesyntax settings.
5801 changequote(<[>, `]')
5802 @result{}
5803 <test> `test' [test] <<test>>
5804 @result{}<TEST> `TEST' test <<TEST>>
5805 dnl Introduce multi-byte delimiters.
5806 changequote([<<], [>>])
5807 @result{}
5808 <test> `test' [test] <<test>>
5809 @result{}<TEST> `TEST' [TEST] test
5810 dnl Change end quote, effectively disabling quotes.
5811 changesyntax(<<R]>>)
5812 @result{}
5813 <test> `test' [test] <<test>>
5814 @result{}<TEST> `TEST' [TEST] <<TEST>>
5815 dnl Change beginning quote, make ] normal, thus making ' end quote.
5816 changesyntax(L`, R-])
5817 @result{}
5818 <test> `test' [test] <<test>>
5819 @result{}<TEST> test [TEST] <<TEST>>
5820 dnl Set multi-byte quote; unrelated changes don't impact it.
5821 changequote(`<<', `>>')changesyntax(<<@@\>>)
5822 @result{}
5823 <\test> `\test' [\test] <<\test>>
5824 @result{}<TEST> `TEST' [TEST] \test
5825 @end example
5826
5827 If several characters are assigned to a category that forms single
5828 character tokens, all such characters are treated as equal.  Any open
5829 parenthesis will match any close parenthesis, etc.
5830
5831 @example
5832 dnl Go crazy with symbols.
5833 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
5834 @result{}
5835 eval@{2**4-1; 2: 8>
5836 @result{}00001111
5837 @end example
5838
5839 The syntax table is initialized to be backwards compatible, so if you
5840 never call @code{changesyntax}, nothing will have changed.
5841
5842 For now, debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)}
5843 to show macro calls; and macro expansions that result in a list of
5844 arguments (such as @samp{$@@} or @code{shift}) use @samp{,}, regardless
5845 of the current syntax settings.  However, this is likely to change in a
5846 future release, so it should not be relied on, particularly since it is
5847 next to impossible to write recursive macros if the argument separator
5848 doesn't match between expansion and rescanning.
5849
5850 @c FIXME - changing syntax of , should not break iterative macros.
5851 @example
5852 $ @kbd{m4 -d}
5853 changesyntax(`,=|')traceon(`foo')define(`foo'|`$#:$@@')
5854 @result{}
5855 foo(foo(1|2|3))
5856 @error{}m4trace: -2- foo(`1', `2', `3') -> `3:`1',`2',`3''
5857 @error{}m4trace: -1- foo(`3:1,2,3') -> `1:`3:1,2,3''
5858 @result{}1:3:1,2,3
5859 @end example
5860
5861 @node M4wrap
5862 @section Saving text until end of input
5863
5864 @cindex saving input
5865 @cindex input, saving
5866 @cindex deferring expansion
5867 @cindex expansion, deferring
5868 It is possible to `save' some text until the end of the normal input has
5869 been seen.  The saved text is then read by @code{m4} once the
5870 normal input has been exhausted.  This feature is normally used to
5871 initiate cleanup actions before normal exit, e.g., deleting temporary
5872 files.
5873
5874 To save input text, use the builtin @code{m4wrap}:
5875
5876 @deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
5877 Stores @var{string} in a safe place, to be reread when end of input is
5878 reached.  As a GNU extension, additional arguments are
5879 concatenated with a space to the @var{string}.
5880
5881 Successive invocations of @code{m4wrap} accumulate saved text in
5882 first-in, first-out order, as required by POSIX.
5883
5884 The expansion of @code{m4wrap} is void.
5885 The macro @code{m4wrap} is recognized only with parameters.
5886 @end deffn
5887
5888 @example
5889 define(`cleanup', `This is the `cleanup' action.
5890 ')
5891 @result{}
5892 m4wrap(`cleanup')
5893 @result{}
5894 This is the first and last normal input line.
5895 @result{}This is the first and last normal input line.
5896 ^D
5897 @result{}This is the cleanup action.
5898 @end example
5899
5900 The saved input is only reread when the end of normal input is seen, and
5901 not if @code{m4exit} is used to exit @code{m4}.
5902
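As a minimal sketch of this difference, the following input saves some
text and then exits explicitly; the wrapped text is never reread, so
nothing further is output.

@example
m4wrap(`This wrapped text is discarded.
')
@result{}
m4exit(`0')
@end example
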
5903 It is safe to call @code{m4wrap} from wrapped text, where all the
5904 recursively wrapped text is deferred until the current wrapped text is
5905 exhausted.  As of M4 1.6, when @code{m4wrap} is not used recursively,
5906 the saved pieces of text are reread in the same order in which they were
5907 saved (FIFO---first in, first out), as required by POSIX.
5908
5909 @example
5910 m4wrap(`1
5911 ')
5912 @result{}
5913 m4wrap(`2', `3
5914 ')
5915 @result{}
5916 ^D
5917 @result{}1
5918 @result{}2 3
5919 @end example
5920
5921 However, earlier versions had reverse ordering (LIFO---last in, first
5922 out), as this behavior is more like the semantics of the C function
5923 @code{atexit}.  It is possible to emulate POSIX behavior even
5924 with older versions of GNU M4 by including the file
5925 @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
5926 distribution:
5927
5928 @comment examples
5929 @example
5930 $ @kbd{m4 -I examples}
5931 undivert(`wrapfifo.m4')dnl
5932 @result{}dnl Redefine m4wrap to have FIFO semantics.
5933 @result{}define(`_m4wrap_level', `0')dnl
5934 @result{}define(`m4wrap',
5935 @result{}`ifdef(`m4wrap'_m4wrap_level,
5936 @result{}       `define(`m4wrap'_m4wrap_level,
5937 @result{}               defn(`m4wrap'_m4wrap_level)`$1')',
5938 @result{}       `builtin(`m4wrap', `define(`_m4wrap_level',
5939 @result{}                                  incr(_m4wrap_level))dnl
5940 @result{}m4wrap'_m4wrap_level)dnl
5941 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5942 include(`wrapfifo.m4')
5943 @result{}
5944 m4wrap(`a`'m4wrap(`c
5945 ', `d')')m4wrap(`b')
5946 @result{}
5947 ^D
5948 @result{}abc
5949 @end example
5950
5951 It is likewise possible to emulate LIFO behavior without resorting to
5952 the GNU M4 extension of @code{builtin}, by including the file
5953 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
5954 distribution.  (Unfortunately, both examples shown here share some
5955 subtle bugs.  See if you can find and correct them; or @pxref{Improved
5956 m4wrap, , Answers}).
5957
5958 @comment examples
5959 @example
5960 $ @kbd{m4 -I examples}
5961 undivert(`wraplifo.m4')dnl
5962 @result{}dnl Redefine m4wrap to have LIFO semantics.
5963 @result{}define(`_m4wrap_level', `0')dnl
5964 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
5965 @result{}define(`m4wrap',
5966 @result{}`ifdef(`m4wrap'_m4wrap_level,
5967 @result{}       `define(`m4wrap'_m4wrap_level,
5968 @result{}               `$1'defn(`m4wrap'_m4wrap_level))',
5969 @result{}       `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
5970 @result{}m4wrap'_m4wrap_level)dnl
5971 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5972 include(`wraplifo.m4')
5973 @result{}
5974 m4wrap(`a`'m4wrap(`c
5975 ', `d')')m4wrap(`b')
5976 @result{}
5977 ^D
5978 @result{}bac
5979 @end example
5980
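The deferral of recursively wrapped text can be seen in a minimal
sketch that uses only the builtin @code{m4wrap}: the inner text is
saved while the outer text is being reread, and is itself reread only
after the outer text is exhausted.

@example
m4wrap(`outer m4wrap(`inner
')')
@result{}
^D
@result{}outer inner
@end example
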
5981 Here is an example of implementing a factorial function using
5982 @code{m4wrap}:
5983
5984 @example
5985 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
5986 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
5987 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
5988 @result{}
5989 f(`10')
5990 @result{}
5991 ^D
5992 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
5993 @end example
5994
5995 Invocations of @code{m4wrap} at the same recursion level are
5996 concatenated and rescanned as usual:
5997
5998 @example
5999 define(`ab', `AB
6000 ')
6001 @result{}
6002 m4wrap(`a')m4wrap(`b')
6003 @result{}
6004 ^D
6005 @result{}AB
6006 @end example
6007
6008 @noindent
6009 however, the transition between recursion levels behaves like an end of
6010 file condition between two input files.
6011
6012 @comment status: 1
6013 @example
6014 m4wrap(`m4wrap(`)')len(abc')
6015 @result{}
6016 ^D
6017 @error{}m4:stdin:1: len: end of file in argument list
6018 @end example
6019
6020 As of M4 1.6, @code{m4wrap} transparently handles builtin tokens
6021 generated by @code{defn} (@pxref{Defn}).  However, for portability, it
6022 is better to defer the evaluation of @code{defn} along with the rest of
6023 the wrapped text, as is done for @code{foo} in the example below, rather
6024 than computing the builtin token up front, as is done for @code{bar}.
6025
6026 @example
6027 m4wrap(`define(`foo', defn(`divnum'))foo
6028 ')
6029 @result{}
6030 m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar
6031 ')
6032 @result{}
6033 ^D
6034 @result{}0
6035 @result{}0
6036 @end example
6037
6038 @node File Inclusion
6039 @chapter File inclusion
6040
6041 @cindex file inclusion
6042 @cindex inclusion, of files
6043 @code{m4} allows you to include named files at any point in the input.
6044
6045 @menu
6046 * Include::                     Including named files and modules
6047 * Search Path::                 Searching for include files
6048 @end menu
6049
6050 @node Include
6051 @section Including named files and modules
6052
6053 There are two builtin macros in @code{m4} for including files:
6054
6055 @deffn {Builtin (m4)} include (@var{file})
6056 @deffnx {Builtin (m4)} sinclude (@var{file})
6057 Both macros cause the file named @var{file} to be read by
6058 @code{m4}.  When the end of the file is reached, input is resumed from
6059 the previous input file.
6060
6061 The expansion of @code{include} and @code{sinclude} is therefore the
6062 contents of @var{file}.
6063
6064 If @var{file} does not exist, is a directory, or cannot otherwise be
6065 read, the expansion is void,
6066 and @code{include} will fail with an error while @code{sinclude} is
6067 silent.  The empty string counts as a file that does not exist.
6068
6069 The macros @code{include} and @code{sinclude} are recognized only with
6070 parameters.
6071 @end deffn
6072
6073 @comment status: 1
6074 @example
6075 include(`n')
6076 @error{}m4:stdin:1: include: cannot open file 'n': No such file or directory
6077 @result{}
6078 include()
6079 @error{}m4:stdin:2: include: cannot open file '': No such file or directory
6080 @result{}
6081 sinclude(`n')
6082 @result{}
6083 sinclude()
6084 @result{}
6085 @end example
6086
6087 This section uses the @option{--include} command-line option (or
6088 @option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab
6089 files from the @file{m4-@value{VERSION}/@/examples}
6090 directory shipped as part of the GNU @code{m4} package.  The
6091 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
6092 contains the lines:
6093
6094 @comment ignore
6095 @example
6096 $ @kbd{cat examples/incl.m4}
6097 @result{}Include file start
6098 @result{}foo
6099 @result{}Include file end
6100 @end example
6101
6102 Normally file inclusion is used to insert the contents of a file
6103 into the input stream.  The contents of the file will be read by
6104 @code{m4} and macro calls in the file will be expanded:
6105
6106 @comment examples
6107 @example
6108 $ @kbd{m4 -I examples}
6109 define(`foo', `FOO')
6110 @result{}
6111 include(`incl.m4')
6112 @result{}Include file start
6113 @result{}FOO
6114 @result{}Include file end
6115 @result{}
6116 @end example
6117
6118 The fact that @code{include} and @code{sinclude} expand to the contents
6119 of the file can be used to define macros that operate on entire files.
6120 Here is an example, which defines @samp{bar} to expand to the contents
6121 of @file{incl.m4}:
6122
6123 @comment examples
6124 @example
6125 $ @kbd{m4 -I examples}
6126 define(`bar', include(`incl.m4'))
6127 @result{}
6128 This is `bar':  >>bar<<
6129 @result{}This is bar:  >>Include file start
6130 @result{}foo
6131 @result{}Include file end
6132 @result{}<<
6133 @end example
6134
6135 This use of @code{include} is not trivial, though, as files can contain
6136 quotes, commas, and parentheses, which can interfere with the way the
6137 @code{m4} parser works.  GNU M4 seamlessly concatenates
6138 the file contents with the next character, even if the included file
6139 ended in the middle of a comment, string, or macro call.  These
6140 conditions are only treated as end of file errors if specified as input
6141 files on the command line.
6142
6143 In GNU M4, an alternative method of reading files is
6144 using @code{undivert} (@pxref{Undivert}) on a named file.
6145
6146 In addition, as a GNU M4 extension, if the included file cannot
6147 be found exactly as given, various standard suffixes are appended.
6148 If the included file name is absolute (a full path from the root directory
6149 is given) then additional search directories are not examined, although
6150 suffixes will be tried if the file is not found exactly as given.
6151 For each directory that is searched (according to the absolute directory
6152 given in the file name, or else by directories listed in @env{M4PATH} and
6153 given with the @option{-I} and @option{-B} options), first the unchanged
6154 file name is tried, and then again with the suffixes @samp{.m4f} and
6155 @samp{.m4}.
6156
6157 Furthermore, if no matching file has yet been found, before moving on to
6158 the next directory, @samp{.la} and the usual binary module suffix for
6159 the host platform (usually @samp{.so}) are also tried.  Matching with one
6160 of those suffixes will attempt to load the matched file as a dynamic
6161 module. @xref{Modules}, for more details.
6162
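To make the search order concrete, here is a sketch of the candidates
that would be tried, in order, for @samp{include(`foo')} when
@kbd{m4 -I dir} is run.  The names @file{foo} and @file{dir} are
hypothetical, @env{M4PATH} is assumed to be unset, and @samp{.so} is
assumed to be the module suffix of the host platform; the two module
suffixes are listed in the order in which the text above names them.

@comment ignore
@example
./foo
./foo.m4f
./foo.m4
./foo.la
./foo.so
dir/foo
dir/foo.m4f
dir/foo.m4
dir/foo.la
dir/foo.so
@end example

@noindent
The search ends at the first candidate that is found.
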
6163 @node Search Path
6164 @section Searching for include files
6165
6166 @cindex search path for included files
6167 @cindex included files, search path for
6168 @cindex GNU extensions
6169 GNU @code{m4} allows included files to be found in directories other
6170 than the current working directory.
6171
6172 @cindex @env{M4PATH}
6173 If the @option{--prepend-include} or @option{-B} command-line option was
6174 provided (@pxref{Preprocessor features, , Invoking m4}), those
6175 directories are searched first, in the reverse of the order in which those
6176 options were listed on the command line.  Then @code{m4} looks in the
6177 current working directory.  Next come the directories specified with the
6178 @option{--include} or @option{-I} option, in the order found on the
6179 command line.  Finally, if the @env{M4PATH} environment variable is set,
6180 it is expected to contain a colon-separated list of directories, which
6181 will be searched in order.
6182
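As an illustration, assuming the hypothetical directories @file{a},
@file{b}, @file{c} and @file{d}, and a hypothetical input file
@file{input.m4}, the invocation below searches @file{b}, @file{a}, the
current working directory, @file{c}, @file{d}, and finally
@file{/usr/local/share/m4}, in that order.

@comment ignore
@example
$ @kbd{M4PATH=/usr/local/share/m4 m4 -B a -B b -I c -I d input.m4}
@end example
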
6183 If the automatic search for include-files causes trouble, the @samp{p}
6184 debug flag (@pxref{Debugmode}) can help isolate the problem.
6185
6186 @node Diversions
6187 @chapter Diverting and undiverting output
6188
6189 @cindex deferring output
6190 Diversions are a way of temporarily saving output.  The output of
6191 @code{m4} can at any time be diverted to a temporary file, and be
6192 reinserted into the output stream, @dfn{undiverted}, again at a later
6193 time.
6194
6195 @cindex @env{TMPDIR}
6196 Numbered diversions are counted from 0 upwards, diversion number 0
6197 being the normal output stream.  GNU
6198 @code{m4} tries to keep diversions in memory.  However, there is a
6199 limit to the overall memory usable by all diversions taken together
6200 (512K, currently).  When this maximum is about to be exceeded,
6201 a temporary file is opened to receive the contents of the biggest
6202 diversion still in memory, freeing this memory for other diversions.
6203 When creating the temporary file, @code{m4} honors the value of the
6204 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
6205 Thus, the amount of available disk space provides the only real limit on
6206 the number and aggregate size of diversions.
6207
6208 Diversions make it possible to generate output in a different order than
6209 the input was read.  This makes it possible to sort output topologically
6210 by dependency.  For example, GNU Autoconf makes use of
6211 diversions under the hood to ensure that the expansion of a prerequisite
6212 macro appears in the output prior to the expansion of a dependent macro,
6213 regardless of which order the two macros were invoked in the user's
6214 input file.
6215
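A minimal sketch of such reordering, using nothing beyond
@code{divert} and @code{dnl}: the body is written first, but lands
after the header because diversion 1 is undiverted before diversion 2
at the end of input.

@example
divert(`2')dnl
Body, written first.
divert(`1')dnl
Header, written second.
divert`'dnl
^D
@result{}Header, written second.
@result{}Body, written first.
@end example
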
6216 @menu
6217 * Divert::                      Diverting output
6218 * Undivert::                    Undiverting output
6219 * Divnum::                      Diversion numbers
6220 * Cleardivert::                 Discarding diverted text
6221 @end menu
6222
6223 @node Divert
6224 @section Diverting output
6225
6226 @cindex diverting output to files
6227 @cindex output, diverting to files
6228 @cindex files, diverting output to
6229 Output is diverted using @code{divert}:
6230
6231 @deffn {Builtin (m4)} divert (@dvar{number, 0}, @ovar{text})
6232 The current diversion is changed to @var{number}.  If @var{number} is left
6233 out or empty, it is assumed to be zero.  If @var{number} cannot be
6234 parsed, the diversion is unchanged.
6235
6236 @cindex GNU extensions
6237 As a GNU extension, if optional @var{text} is supplied and
6238 @var{number} was valid, then @var{text} is immediately output to the
6239 new diversion, regardless of whether the expansion of @code{divert}
6240 occurred while collecting arguments for another macro.
6241
6242 The expansion of @code{divert} is void.
6243 @end deffn
6244
6245 When all the @code{m4} input has been processed, all existing
6246 diversions are automatically undiverted, in numerical order.
6247
6248 @example
6249 divert(`1')
6250 This text is diverted.
6251 divert
6252 @result{}
6253 This text is not diverted.
6254 @result{}This text is not diverted.
6255 ^D
6256 @result{}
6257 @result{}This text is diverted.
6258 @end example
6259
6260 Several calls of @code{divert} with the same argument do not overwrite
6261 the previous diverted text, but append to it.  Diversions are printed
6262 after any wrapped text is expanded.
6263
6264 @example
6265 define(`text', `TEXT')
6266 @result{}
6267 divert(`1')`diverted text.'
6268 divert
6269 @result{}
6270 m4wrap(`Wrapped text precedes ')
6271 @result{}
6272 ^D
6273 @result{}Wrapped TEXT precedes diverted text.
6274 @end example
6275
6276 @cindex discarding input
6277 @cindex input, discarding
6278 If output is diverted to a negative diversion, it is simply discarded.
6279 This can be used to suppress unwanted output.  A common example of
6280 unwanted output is the trailing newlines after macro definitions.  Here
6281 is a common programming idiom in @code{m4} for avoiding them.
6282
6283 @example
6284 divert(`-1')
6285 define(`foo', `Macro `foo'.')
6286 define(`bar', `Macro `bar'.')
6287 divert
6288 @result{}
6289 @end example
6290
6291 @cindex GNU extensions
6292 Traditional implementations only supported ten diversions.  But as a
6293 GNU extension, diversion numbers can be as large as positive
6294 integers will allow, rather than treating a multi-digit diversion number
6295 as a request to discard text.
6296
6297 @example
6298 divert(eval(`1<<28'))world
6299 divert(`2')hello
6300 ^D
6301 @result{}hello
6302 @result{}world
6303 @end example
6304
6305 The ability to immediately output extra text is a GNU
6306 extension, but it can prove useful for ensuring that text goes to a
6307 particular diversion no matter how many pending macro expansions are in
6308 progress.  For a demonstration of why this is useful, it is important to
6309 understand in the example below why @samp{one} is output in diversion 2,
6310 not diversion 1, while @samp{three} and @samp{five} both end up in the
6311 correctly numbered diversion.  The key point is that when @code{divert}
6312 is executed unquoted as part of the argument collection of another
6313 macro, the side effect takes place immediately, but the text @samp{one}
6314 is not passed to any diversion until after the @samp{divert(`2')} and
6315 the enclosing @code{echo} have also taken place.  The example with
6316 @samp{three} shows how following the quoting rule of thumb delays the
6317 invocation of @code{divert} until it is not nested in any argument
6318 collection context, while the example with @samp{five} shows the use of
6319 the optional argument to speed up the output process.
6320
6321 @example
6322 define(`echo', `$1')
6323 @result{}
6324 echo(divert(`1')`one'divert(`2'))`'dnl
6325 echo(`divert(`3')three`'divert(`4')')`'dnl
6326 echo(divert(`5', `five')divert(`6'))`'dnl
6327 divert
6328 @result{}
6329 undivert(`1')
6330 @result{}
6331 undivert(`2')
6332 @result{}one
6333 undivert(`3')
6334 @result{}three
6335 undivert(`4')
6336 @result{}
6337 undivert(`5')
6338 @result{}five
6339 undivert(`6')
6340 @result{}
6341 @end example
6342
6343 Note that @code{divert} is an English word, but also an active macro
6344 without arguments.  When processing plain text, the word might appear in
6345 normal text and be unintentionally swallowed as a macro invocation.  One
6346 way to avoid this is to use the @option{-P} option to rename all
6347 builtins (@pxref{Operation modes, , Invoking m4}).  Another is to write
6348 a wrapper that requires a parameter to be recognized.
6349
6350 @example
6351 We decided to divert the stream for irrigation.
6352 @result{}We decided to  the stream for irrigation.
6353 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
6354 @result{}
6355 divert(`-1')
6356 Ignored text.
6357 divert(`0')
6358 @result{}
6359 We decided to divert the stream for irrigation.
6360 @result{}We decided to divert the stream for irrigation.
6361 @end example
6362
6363 @node Undivert
6364 @section Undiverting output
6365
6366 Diverted text can be undiverted explicitly using the builtin
6367 @code{undivert}:
6368
6369 @deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}})
6370 Undiverts the numeric @var{diversions} given by the arguments, in the
6371 order given.  If no arguments are supplied, all diversions are
6372 undiverted, in numerical order.
6373
6374 @cindex file inclusion
6375 @cindex inclusion, of files
6376 @cindex GNU extensions
6377 As a GNU extension, @var{diversions} may contain non-numeric
6378 strings, which are treated as the names of files to copy into the output
6379 without expansion.  A warning is issued if a file could not be opened.
6380
6381 The expansion of @code{undivert} is void.
6382 @end deffn
6383
6384 @example
6385 divert(`1')
6386 This text is diverted.
6387 divert
6388 @result{}
6389 This text is not diverted.
6390 @result{}This text is not diverted.
6391 undivert(`1')
6392 @result{}
6393 @result{}This text is diverted.
6394 @result{}
6395 @end example
6396
6397 Notice the last two blank lines.  One of them comes from the newline
6398 following @code{undivert}, the other from the newline that followed the
6399 @code{divert}!  A diversion often starts with a blank line like this.
6400
6401 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
6402 but rather copied directly to the current output, and it is therefore
6403 not an error to undivert into a diversion.  Undiverting the empty string
6404 is the same as specifying diversion 0; in either case nothing happens
6405 since the output has already been flushed.
6406
6407 @example
6408 divert(`1')diverted text
6409 divert
6410 @result{}
6411 undivert()
6412 @result{}
6413 undivert(`0')
6414 @result{}
6415 undivert
6416 @result{}diverted text
6417 @result{}
6418 divert(`1')more
6419 divert(`2')undivert(`1')diverted text`'divert
6420 @result{}
6421 undivert(`1')
6422 @result{}
6423 undivert(`2')
6424 @result{}more
6425 @result{}diverted text
6426 @end example
6427
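The fact that undiverted text is not rescanned can be seen in the
following sketch, where the quoted word @samp{foo} is stored in a
diversion and comes back unexpanded even though @code{foo} is a
defined macro.

@example
define(`foo', `FOO')
@result{}
divert(`1')`foo'
divert`'dnl
undivert(`1')dnl
@result{}foo
foo
@result{}FOO
@end example
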
6428 When a diversion has been undiverted, the diverted text is discarded,
6429 and it is not possible to bring back diverted text more than once.
6430
6431 @example
6432 divert(`1')
6433 This text is diverted first.
6434 divert(`0')undivert(`1')dnl
6435 @result{}
6436 @result{}This text is diverted first.
6437 undivert(`1')
6438 @result{}
6439 divert(`1')
6440 This text is also diverted but not appended.
6441 divert(`0')undivert(`1')dnl
6442 @result{}
6443 @result{}This text is also diverted but not appended.
6444 @end example
6445
6446 Attempts to undivert the current diversion are silently ignored.  Thus,
6447 when the current diversion is not 0, the current diversion does not get
6448 rearranged among the other diversions.
6449
6450 @example
6451 divert(`1')one
6452 divert(`2')two
6453 divert(`3')three
6454 divert(`4')four
6455 divert(`5')five
6456 divert(`2')undivert(`5', `2', `4')dnl
6457 undivert`'dnl effectively undivert(`1', `2', `3', `4', `5')
6458 divert`'undivert`'dnl
6459 @result{}two
6460 @result{}five
6461 @result{}four
6462 @result{}one
6463 @result{}three
6464 @end example
6465
6466 @cindex GNU extensions
6467 @cindex file inclusion
6468 @cindex inclusion, of files
6469 GNU @code{m4} allows named files to be undiverted.  Given a
6470 non-numeric argument, the contents of the file named will be copied,
6471 uninterpreted, to the current output.  This complements the builtin
6472 @code{include} (@pxref{Include}).  To illustrate the difference, assume
6473 the file @file{foo} contains:
6474
6475 @comment file: foo
6476 @example
6477 $ @kbd{cat foo}
6478 bar
6479 @end example
6480
6481 @noindent
6482 then
6483
6484 @example
6485 define(`bar', `BAR')
6486 @result{}
6487 undivert(`foo')
6488 @result{}bar
6489 @result{}
6490 include(`foo')
6491 @result{}BAR
6492 @result{}
6493 @end example
6494
6495 If the file is not found (or cannot be read), an error message is
6496 issued, and the expansion is void.  It is possible to intermix files
6497 and diversion numbers.
6498
6499 @example
6500 divert(`1')diversion one
6501 divert(`2')undivert(`foo')dnl
6502 divert(`3')diversion three
6503 divert`'dnl
6504 undivert(`1', `2', `foo', `3')dnl
6505 @result{}diversion one
6506 @result{}bar
6507 @result{}bar
6508 @result{}diversion three
6509 @end example
6510
6511 @node Divnum
6512 @section Diversion numbers
6513
6514 @cindex diversion numbers
6515 The current diversion is tracked by the builtin @code{divnum}:
6516
6517 @deffn {Builtin (m4)} divnum
6518 Expands to the number of the current diversion.
6519 @end deffn
6520
6521 @example
6522 Initial divnum
6523 @result{}Initial 0
6524 divert(`1')
6525 Diversion one: divnum
6526 divert(`2')
6527 Diversion two: divnum
6528 ^D
6529 @result{}
6530 @result{}Diversion one: 1
6531 @result{}
6532 @result{}Diversion two: 2
6533 @end example
6534
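One use of @code{divnum} is to save the current diversion so that it
can be restored later.  The sketch below defines a hypothetical macro
@code{silently} (not part of @code{m4}) that expands its argument with
output discarded, and then returns to whichever diversion was active
when it was called.

@example
define(`silently',
`pushdef(`_saved', divnum)divert(`-1')$1`'divert(_saved)popdef(`_saved')')
@result{}
divert(`1')dnl
silently(`This text is discarded.')divnum
divert`'dnl
undivert(`1')dnl
@result{}1
@end example
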
6535 @node Cleardivert
6536 @section Discarding diverted text
6537
6538 @cindex discarding diverted text
6539 @cindex diverted text, discarding
6540 Often it is not known, when output is diverted, whether the diverted
6541 text is actually needed.  Since all non-empty diversions are brought back
6542 to the main output stream when the end of input is seen, a method of
6543 discarding a diversion is needed.  If all diversions should be
6544 discarded, the easiest way is to end the input to @code{m4} with
6545 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
6546
6547 @example
6548 divert(`1')
6549 Diversion one: divnum
6550 divert(`2')
6551 Diversion two: divnum
6552 divert(`-1')
6553 undivert
6554 ^D
6555 @end example
6556
6557 @noindent
6558 No output is produced at all.
6559
6560 Clearing selected diversions can be done with the following macro:
6561
6562 @deffn Composite cleardivert (@ovar{diversions@dots{}})
6563 Discard the contents of each of the listed numeric @var{diversions}.
6564 @end deffn
6565
6566 @example
6567 define(`cleardivert',
6568 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
6569 @result{}
6570 @end example
6571
6572 It is called just like @code{undivert}, but the effect is to clear the
6573 diversions given by the arguments.  (This macro has a nasty bug!  You
6574 should try to see if you can find it and correct it; or @pxref{Improved
6575 cleardivert, , Answers}).
6576
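A usage sketch with an explicit argument, repeating the definition so
that the example is self-contained: diversion 1 is cleared, so only
diversion 2 remains to be undiverted.

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert(`2')two
divert`'dnl
cleardivert(`1')dnl
undivert`'dnl
@result{}two
@end example
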
6577 @node Modules
6578 @chapter Extending M4 with dynamic runtime modules
6579
6580 @cindex modules
6581 @cindex dynamic modules
6582 @cindex loadable modules
6583 GNU M4 1.4.x had a monolithic architecture.  All of its
6584 functionality was contained in a single binary, and additional macros
6585 could be added only by writing more code in the M4 language, or at the
6586 extreme by hacking the sources and recompiling the whole thing to make
6587 a custom M4 installation.
6588
6589 Starting with release 2.0, M4 uses Libtool's @code{libltdl} facilities
6590 (@pxref{Using libltdl, , libltdl, libtool, The GNU Libtool Manual})
6591 to move all of M4's builtins out to pluggable modules.  Unless compile
6592 time options are set to change the default build, the installed M4 2.0
6593 binary is virtually identical to 1.4.x, supporting the same builtins.
6594 However, additional modules can be loaded into the running M4 interpreter
6595 as it is started up at the command line, or during normal expansion of
6596 macros.  This facilitates runtime extension of the M4 builtin macro
6597 list using compiled C code linked against a new shared library,
6598 typically named @file{libm4.so}.
6599
6600 For example, you might want to add a @code{setenv} builtin to M4, to
6601 use before invoking @code{esyscmd}.  You might write a @file{setenv.c}
6602 something like this:
6603
6604 @comment ignore
6605 @example
6606 #include "m4module.h"
6607
6608 M4BUILTIN(setenv);
6609
6610 m4_builtin m4_builtin_table[] =
6611 @{
6612   /* name      handler         flags             minargs maxargs */
6613   @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2,      3 @},
6614
6615   @{ NULL,     NULL,           0,                0,      0 @}
6616 @};
6617
6618 /**
6619  * setenv(NAME, VALUE, [OVERWRITE])
6620  **/
6621 M4BUILTIN_HANDLER (setenv)
6622 @{
6623   int overwrite = 1;
6624
6625   if (argc >= 4)
6626     if (!m4_numeric_arg (context, argc, argv, 3, &overwrite))
6627       return;
6628
6629   setenv (M4ARG (1), M4ARG (2), overwrite);
6630 @}
6631 @end example
6632
6633 Then, having compiled and linked the module, in (somewhat contrived)
6634 M4 code:
6635
6636 @comment ignore
6637 @example
6638 $ @kbd{m4 setenv}
6639 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6640 @result{}
6641 esyscmd(`ifconfig -a')dnl
6642 @result{}@dots{}
6643 @end example
6644
6645 Or instead of loading the module from the M4 invocation, you can use
6646 the @code{include} builtin:
6647
6648 @comment ignore
6649 @example
6650 $ @kbd{m4}
6651 include(`setenv')
6652 @result{}
6653 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6654 @result{}
6655 @end example
6656
6657 Also, at build time, you can choose which modules to build into
6658 the core (so that they will be available without dynamic loading).
6659 SUSv3 M4 functionality is contained in the module @samp{m4}, GNU
6660 extensions in the module @samp{gnu}, additional module builtins in the
6661 module @samp{load} and so on.
6662
6663 We hinted earlier that the @code{m4} and @code{gnu} modules are
6664 preloaded into the installed M4 binary, but it is possible to install
6665 a @emph{thinner} binary; for example, omitting the GNU
6666 extensions by configuring the distribution with @kbd{./configure
6667 --with-modules=m4}.  For a binary built with that option to understand
6668 code that uses GNU extensions, you must then run @kbd{m4 gnu}.
6669 It is also possible to build a @emph{fatter} binary with additional
6670 modules preloaded: adding, say, the @code{load} module using
6671 @kbd{./configure --with-modules="m4 gnu load"}.
6672
6673 GNU M4 now has a facility for defining additional builtins without
6674 recompiling the sources.  In actual fact, all of the builtins provided
6675 by GNU M4 are loaded from such modules.  All of the builtin
6676 descriptions in this manual are annotated with the module from which
6677 they are loaded -- mostly from the module @samp{m4}.
6678
6679 When you start GNU M4, the modules @samp{m4} and @samp{gnu} are
6680 loaded by default.  If you supply the @option{-G} option at startup, the
6681 module @samp{traditional} is loaded instead of @samp{gnu}.
6682 @xref{Compatibility}, for more details on the differences between these
6683 two modes of startup.
6684
6685 @menu
6686 * M4modules::                   Listing loaded modules
6687 * Standard Modules::            Standard bundled modules
6688 @end menu
6689
6690 @node M4modules
6691 @section Listing loaded modules
6692
6693 @deffn {Builtin (gnu)} m4modules
6694 Expands to a quoted ordered list of currently loaded modules,
6695 with the most recently loaded module at the front of the list.  Loading
6696 a module multiple times will not affect the order of this list; the
6697 position depends on when the module was @emph{first} loaded.
6698 @end deffn
6699
6700 For example, if GNU @code{m4} is started without loading any
6701 additional modules, @code{m4modules} will yield the following:
6702
6703 @example
6704 $ @kbd{m4}
6705 m4modules
6706 @result{}gnu,m4
6707 @end example
6708
6709 @node Standard Modules
6710 @section Standard bundled modules
6711
6712 GNU @code{m4} ships with several bundled modules as standard.
6713 By convention, these modules define a text macro that can be tested
6714 with @code{ifdef} when they are loaded; only the @code{m4} module lacks
6715 this feature test macro, since it is not permitted by POSIX.
6716 Each of the feature test macros is intended to be used without
6717 arguments.
6718
6719 @table @code
6720 @item m4
6721 Provides all of the builtins defined by POSIX.  This module
6722 is always loaded --- GNU @code{m4} would only be a very slow
6723 version of @command{cat} without the builtins supplied by this module.
6724
6725 @item gnu
6726 Provides all of the GNU extensions, as defined by
6727 GNU M4 through the 1.4.x release series.  It also provides a
6728 couple of feature test macros:
6729
6730 @deffn {Macro (gnu)} __gnu__
6731 Expands to the empty string, as an indication that the @samp{gnu}
6732 module is loaded.
6733 @end deffn
6734
6735 @deffn {Macro (gnu)} __m4_version__
6736 Expands to an unquoted string containing the release version number of
6737 the running GNU @code{m4} executable.
6738 @end deffn
6739
6740 This module is always loaded, unless the @option{-G} command line
6741 option is supplied at startup (@pxref{Limits control, , Invoking m4}).
6742
6743 @item traditional
6744 This module provides compatibility with System V @code{m4}, for anything
6745 not specified by POSIX, and is loaded instead of the
6746 @samp{gnu} module if the @option{-G} command line option is specified.
6747
6748 @deffn {Macro (traditional)} __traditional__
6749 Expands to the empty string, as an indication that the
6750 @samp{traditional} module is loaded.
6751 @end deffn
6752
6753 @item load
6754 This module supplies the builtins for advanced use of modules from within a
6755 GNU @code{m4} program.  @xref{Modules}, for more details.  The
6756 module also defines the following macro:
6757
6758 @deffn {Macro (load)} __load__
6759 Expands to the empty string, as an indication that the @samp{load}
6760 module is loaded.
6761 @end deffn
6762
6763 @item mpeval
6764 This module provides the implementation for the experimental
6765 @code{mpeval} feature.  If the host machine does not have the
6766 GNU gmp library, the builtin will generate an error if called.
6767 @xref{Mpeval}, for more details.  The module also defines the following
6768 macro:
6769
6770 @deffn {Macro (mpeval)} __mpeval__
6771 Expands to the empty string, as an indication that the @samp{mpeval}
6772 module is loaded.
6773 @end deffn
6774 @end table
6775
6776 Here is an example of using the feature test macros.
6777
6778 @example
6779 $ @kbd{m4}
6780 __gnu__-__traditional__
6781 @result{}-__traditional__
6782 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6783 @result{}Extensions are active
6784 __gnu__(`ignored')
6785 @error{}m4:stdin:3: warning: __gnu__: extra arguments ignored: 1 > 0
6786 @result{}
6787 @end example
6788
6789 @comment options: -G
6790 @example
6791 $ @kbd{m4 --traditional}
6792 __gnu__-__traditional__
6793 @result{}__gnu__-
6794 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6795 @result{}Minimal features
6796 @end example
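
A feature test can also guard a definition that relies on an extension.
The following is only a sketch: the macro name @code{squeeze} is
hypothetical and not part of GNU M4, and the fallback branch simply
returns its argument unchanged when @code{patsubst} is unavailable.

@example
$ @kbd{m4}
ifdef(`__gnu__',
  `define(`squeeze', `patsubst(`$1', `  +', ` ')')',
  `define(`squeeze', `$1')')
@result{}
squeeze(`a   b')
@result{}a b
@end example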
6797
6798 Since the version string is unquoted and can potentially contain macro
6799 names (for example, a beta release could be numbered @samp{1.9b}), or be
6800 impacted by the use of @code{changesyntax}, the
6801 @code{__m4_version__} macro should generally be used via @code{defn}
6802 rather than directly invoked (@pxref{Defn}).  In general, feature tests
6803 are more reliable than version number checks, so exercise caution when
6804 using this macro.
6805
6806 @comment This test is excluded from the testsuite since it depends on a
6807 @comment texinfo macro; but builtins.at covers the same thing.
6808 @comment ignore
6809 @example
6810 defn(`__m4_version__')
6811 @result{}@value{VERSION}
6812 @end example
6813
6814 @node Text handling
6815 @chapter Macros for text handling
6816
6817 There are a number of builtins in @code{m4} for manipulating text in
6818 various ways, extracting substrings, searching, substituting, and so on.
6819
6820 @menu
6821 * Len::                         Calculating length of strings
6822 * Index macro::                 Searching for substrings
6823 * Regexp::                      Searching for regular expressions
6824 * Substr::                      Extracting substrings
6825 * Translit::                    Translating characters
6826 * Patsubst::                    Substituting text by regular expression
6827 * Format::                      Formatting strings (printf-like)
6828 @end menu
6829
6830 @node Len
6831 @section Calculating length of strings
6832
6833 @cindex length of strings
6834 @cindex strings, length of
6835 The length of a string can be calculated by @code{len}:
6836
6837 @deffn {Builtin (m4)} len (@var{string})
6838 Expands to the length of @var{string}, as a decimal number.
6839
6840 The macro @code{len} is recognized only with parameters.
6841 @end deffn
6842
6843 @example
6844 len()
6845 @result{}0
6846 len(`abcdef')
6847 @result{}6
6848 @end example
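
Because arguments are expanded before @code{len} measures them, quoting
determines whether the length of a macro's name or of its expansion is
reported.  A small illustration, assuming a hypothetical macro
@code{name} defined only for this example:

@example
define(`name', `armadillo')
@result{}
len(`name')
@result{}4
len(name)
@result{}9
@end example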
6849
6850 @node Index macro
6851 @section Searching for substrings
6852
6853 @cindex substrings, locating
6854 Searching for substrings is done with @code{index}:
6855
6856 @deffn {Builtin (m4)} index (@var{string}, @var{substring}, @ovar{offset})
6857 Expands to the index of the first occurrence of @var{substring} in
6858 @var{string}.  The first character in @var{string} has index 0.  If
6859 @var{substring} does not occur in @var{string}, @code{index} expands to
6860 @samp{-1}.  If @var{offset} is provided, it determines the index at
6861 which the search starts; a negative @var{offset} specifies the offset
6862 relative to the end of @var{string}.
6863
6864 The macro @code{index} is recognized only with parameters.
6865 @end deffn
6866
6867 @example
6868 index(`gnus, gnats, and armadillos', `nat')
6869 @result{}7
6870 index(`gnus, gnats, and armadillos', `dag')
6871 @result{}-1
6872 @end example
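
A common idiom compares the result against @samp{-1} to test whether
one string contains another.  The helper @code{contains} below is a
hypothetical name chosen for this sketch, not a builtin:

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')
@result{}
contains(`gnus, gnats, and armadillos', `gnat')
@result{}yes
contains(`gnus, gnats, and armadillos', `dag')
@result{}no
@end example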
6873
6874 Omitting @var{substring} evokes a warning, but still produces output;
6875 contrast this with an empty @var{substring}.
6876
6877 @example
6878 index(`abc')
6879 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
6880 @result{}0
6881 index(`abc', `')
6882 @result{}0
6883 index(`abc', `b')
6884 @result{}1
6885 @end example
6886
6887 @cindex GNU extensions
6888 As an extension, an @var{offset} can be provided to limit the search to
6889 the tail of the @var{string}.  A negative offset is interpreted relative
6890 to the end of @var{string}, and it is not an error if @var{offset}
6891 exceeds the bounds of @var{string}.
6892
6893 @example
6894 index(`aba', `a', `1')
6895 @result{}2
6896 index(`ababa', `ba', `-3')
6897 @result{}3
6898 index(`abc', `ab', `4')
6899 @result{}-1
6900 index(`abc', `bc', `-4')
6901 @result{}1
6902 @end example
6903
6904 @ignore
6905 @comment Expose a bug in the strstr() algorithm present in glibc
6906 @comment 2.9 through 2.12 and in gnulib up to Sep 2010.
6907
6908 @example
6909 index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-:',
6910 `:12-:12-:12-:12-:12-:12-:12-:12-')
6911 @result{}-1
6912 @end example
6913
6914 @comment Expose a bug in the gnulib replacement strstr() algorithm
6915 @comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
6916
6917 @example
6918 index(`..wi.d.', `.d.')
6919 @result{}4
6920 @end example
6921 @end ignore
6922
6923 @node Regexp
6924 @section Searching for regular expressions
6925
6926 @cindex regular expressions
6927 @cindex expressions, regular
6928 @cindex GNU extensions
6929 Searching for regular expressions is done with the builtin
6930 @code{regexp}:
6931
6932 @deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax})
6933 @deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @
6934   @ovar{replacement}, @ovar{resyntax})
6935 Searches for @var{regexp} in @var{string}.
6936
6937 If @var{resyntax} is given, the particular flavor of regular expression
6938 understood with respect to @var{regexp} can be changed from the current
6939 default.  @xref{Changeresyntax}, for details of the values that can be
6940 given for this argument.  If exactly three arguments are given, then the
6941 third argument is treated as @var{resyntax} only if it matches a known
6942 syntax name, otherwise it is treated as @var{replacement}.
6943
6944 If @var{replacement} is omitted, @code{regexp} expands to the index of
6945 the first match of @var{regexp} in @var{string}.  If @var{regexp} does
6946 not match anywhere in @var{string}, it expands to -1.
6947
6948 If @var{replacement} is supplied, and there was a match, @code{regexp}
6949 changes the expansion to this argument, with @samp{\@var{n}} substituted
6950 by the text matched by the @var{n}th parenthesized sub-expression of
6951 @var{regexp}, up to nine sub-expressions.  The escape @samp{\&} is
6952 replaced by the text of the entire regular expression matched.  For
6953 all other characters, @samp{\} treats the next character literally.  A
6954 warning is issued if there were fewer sub-expressions than the
6955 @samp{\@var{n}} requested, or if there is a trailing @samp{\}.  If there
6956 was no match, @code{regexp} expands to the empty string.
6957
6958 The macro @code{regexp} is recognized only with parameters.
6959 @end deffn
6960
6961 @example
6962 regexp(`GNUs not Unix', `\<[a-z]\w+')
6963 @result{}5
6964 regexp(`GNUs not Unix', `\<Q\w*')
6965 @result{}-1
6966 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
6967 @result{}*** Unix *** nix ***
6968 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
6969 @result{}
6970 @end example
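
The sub-expression escapes make it possible to pick fields out of a
string.  As a sketch (the input string and the labels @samp{major} and
@samp{minor} are made up for illustration):

@example
regexp(`release 1.4 of m4', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example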
6971
6972 Here are some more examples on the handling of backslash:
6973
6974 @example
6975 regexp(`abc', `\(b\)', `\\\10\a')
6976 @result{}\b0a
6977 regexp(`abc', `b', `\1\')
6978 @error{}m4:stdin:2: warning: regexp: sub-expression 1 not present
6979 @error{}m4:stdin:2: warning: regexp: trailing \ ignored in replacement
6980 @result{}
6981 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
6982 @error{}m4:stdin:3: warning: regexp: sub-expression 4 not present
6983 @error{}m4:stdin:3: warning: regexp: sub-expression 5 not present
6984 @error{}m4:stdin:3: warning: regexp: sub-expression 6 not present
6985 @result{}c
6986 @end example
6987
6988 Omitting @var{regexp} evokes a warning, but still produces output;
6989 contrast this with an empty @var{regexp} argument.
6990
6991 @example
6992 regexp(`abc')
6993 @error{}m4:stdin:1: warning: regexp: too few arguments: 1 < 2
6994 @result{}0
6995 regexp(`abc', `')
6996 @result{}0
6997 regexp(`abc', `', `\\def')
6998 @result{}\def
6999 @end example
7000
7001 If @var{resyntax} is given, @var{regexp} must be given according to
7002 the syntax chosen, though the default regular expression syntax
7003 remains unchanged for other invocations:
7004
7005 @example
7006 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***',
7007        `POSIX_EXTENDED')
7008 @result{}*** Unix *** nix ***
7009 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***')
7010 @result{}
7011 @end example
7012
7013 Occasionally, you might want to pass a @var{resyntax} argument without
7014 wishing to give @var{replacement}.  If there are exactly three
7015 arguments, and the last argument is a valid @var{resyntax}, it is used
7016 as such, rather than as a replacement.
7017
7018 @example
7019 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED')
7020 @result{}9
7021 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED')
7022 @result{}POSIX_EXTENDED
7023 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `')
7024 @result{}
7025 regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `')
7026 @result{}POSIX_EXTENDED
7027 @end example
7028
7029 @node Substr
7030 @section Extracting substrings
7031
7032 @cindex extracting substrings
7033 @cindex substrings, extracting
7034 Substrings are extracted with @code{substr}:
7035
7036 @deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length}, @
7037   @ovar{replace})
7038 Performs a substring operation on @var{string}.  If @var{from} is
7039 positive, it represents the 0-based index where the substring begins.
7040 If @var{length} is omitted, the substring ends at the end of
7041 @var{string}; if it is positive, @var{length} is added to the starting
7042 index to determine the ending index.
7043
7044 @cindex GNU extensions
7045 As a GNU extension, if @var{from} is negative, it is added to
7046 the length of @var{string} to determine the starting index; if it is
7047 empty, the start of the string is used.  Likewise, if @var{length} is
7048 negative, it is added to the length of @var{string} to determine the
7049 ending index, and an empty @var{length} behaves like an omitted
7050 @var{length}.  It is not an error if either of the resulting indices lies
7051 outside the string, but the selected substring only contains the bytes
7052 of @var{string} that overlap the selected indices.  If the end point
7053 lies before the beginning point, the substring chosen is the empty
7054 string located at the starting index.
7055
7056 If @var{replace} is omitted, then the expansion is only the selected
7057 substring, which may be empty.  As a GNU extension, if
7058 @var{replace} is provided, then the expansion is the original
7059 @var{string} with the selected substring replaced by @var{replace}.  The
7060 expansion is empty and a warning issued if @var{from} or @var{length}
7061 cannot be parsed, or if @var{replace} is provided but the selected
7062 indices do not overlap with @var{string}.
7063
7064 The macro @code{substr} is recognized only with parameters.
7065 @end deffn
7066
7067 @example
7068 substr(`gnus, gnats, and armadillos', `6')
7069 @result{}gnats, and armadillos
7070 substr(`gnus, gnats, and armadillos', `6', `5')
7071 @result{}gnats
7072 @end example
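
With only non-negative arguments, @code{substr} is enough to split a
string into its first character and the remainder.  The macro names
@code{first_char} and @code{rest_chars} are hypothetical helpers for
this sketch:

@example
define(`first_char', `substr(`$1', `0', `1')')
@result{}
define(`rest_chars', `substr(`$1', `1')')
@result{}
first_char(`abcde')
@result{}a
rest_chars(`abcde')
@result{}bcde
@end example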
7073
7074 Omitting @var{from} evokes a warning, but still produces output.  On the
7075 other hand, selecting a @var{from} or @var{length} that lies beyond
7076 @var{string} is not a problem.
7077
7078 @example
7079 substr(`abc')
7080 @error{}m4:stdin:1: warning: substr: too few arguments: 1 < 2
7081 @result{}abc
7082 substr(`abc', `')
7083 @result{}abc
7084 substr(`abc', `4')
7085 @result{}
7086 substr(`abc', `1', `4')
7087 @result{}bc
7088 @end example
7089
7090 Using negative values for @var{from} or @var{length} is a GNU
7091 extension, useful for accessing a fixed size tail of an
7092 arbitrary-length string.  Prior to M4 1.6, using these values would
7093 silently result in the empty string.  Some other implementations crash
7094 on negative values, and many treat an explicitly empty @var{length} as
7095 0, which is different from the omitted @var{length} implying the rest of
7096 the original @var{string}.
7097
7098 @example
7099 substr(`abcde', `2', `')
7100 @result{}cde
7101 substr(`abcde', `-3')
7102 @result{}cde
7103 substr(`abcde', `', `-3')
7104 @result{}ab
7105 substr(`abcde', `-6')
7106 @result{}abcde
7107 substr(`abcde', `-6', `5')
7108 @result{}abcd
7109 substr(`abcde', `-7', `1')
7110 @result{}
7111 substr(`abcde', `1', `-2')
7112 @result{}bc
7113 substr(`abcde', `-4', `-1')
7114 @result{}bcd
7115 substr(`abcde', `4', `-3')
7116 @result{}
7117 substr(`abcdefghij', `-09', `08')
7118 @result{}bcdefghi
7119 @end example
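
These extensions make it easy to take or drop a fixed-size tail.  The
following sketch relies on the negative-value behavior described above;
@code{suffix} and @code{chop} are hypothetical names, and their second
argument is assumed to be a positive decimal count:

@example
define(`suffix', `substr(`$1', `-$2')')
@result{}
define(`chop', `substr(`$1', `', `-$2')')
@result{}
suffix(`manual.texi', `5')
@result{}.texi
chop(`manual.texi', `5')
@result{}manual
@end example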
7120
7121 Another useful GNU extension, also added in M4 1.6, is the
7122 ability to replace a substring within the original @var{string}.  An
7123 empty length substring at the beginning or end of @var{string} is valid,
7124 but selecting a substring that does not overlap @var{string} causes a
7125 warning.
7126
7127 @example
7128 substr(`abcde', `1', `3', `t')
7129 @result{}ate
7130 substr(`abcde', `5', `', `f')
7131 @result{}abcdef
7132 substr(`abcde', `-3', `-4', `f')
7133 @result{}abfcde
7134 substr(`abcde', `-6', `1', `f')
7135 @result{}fabcde
7136 substr(`abcde', `-7', `1', `f')
7137 @error{}m4:stdin:5: warning: substr: substring out of range
7138 @result{}
7139 substr(`abcde', `6', `', `f')
7140 @error{}m4:stdin:6: warning: substr: substring out of range
7141 @result{}
7142 @end example
7143
7144 If backwards compatibility with M4 1.4.x behavior is necessary, the
7145 following macro is sufficient to do the job (mimicking warnings about
7146 empty @var{from} or @var{length} or an ignored fourth argument is left
7147 as an exercise to the reader).
7148
7149 @example
7150 define(`substr', `ifelse(`$#', `0', ``$0'',
7151   eval(`2 < $#')`$3', `1', `',
7152   index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')')
7153 @result{}
7154 substr(`abcde', `3')
7155 @result{}de
7156 substr(`abcde', `3', `')
7157 @result{}
7158 substr(`abcde', `-1')
7159 @result{}
7160 substr(`abcde', `1', `-1')
7161 @result{}
7162 substr(`abcde', `2', `1', `C')
7163 @result{}c
7164 @end example
7165
7166 On the other hand, it is possible to portably emulate the GNU
7167 extension of negative @var{from} and @var{length} arguments across all
7168 @code{m4} implementations, albeit with a lot more overhead.  This
7169 example uses @code{incr} and @code{decr} to normalize @samp{-08} to
7170 something that a later @code{eval} will treat as a decimal value, rather
7171 than looking like an invalid octal number, while avoiding using these
7172 macros on an empty string.  The helper macro @code{_substr_normalize} is
7173 recursive, since it is easier to fix @var{length} after @var{from} has
7174 been normalized, with the final iteration supplying two non-negative
7175 arguments to the original builtin, now named @code{_substr}.
7176
7177 @comment options: -daq -t_substr
7178 @example
7179 $ @kbd{m4 -daq -t _substr}
7180 define(`_substr', defn(`substr'))dnl
7181 define(`substr', `ifelse(`$#', `0', ``$0'',
7182   `_$0(`$1', _$0_normalize(len(`$1'),
7183     ifelse(`$2', `', `0', `incr(decr(`$2'))'),
7184     ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl
7185 define(`_substr_normalize', `ifelse(
7186   eval(`$2 < 0 && $1 + $2 >= 0'), `1',
7187     `$0(`$1', eval(`$1 + $2'), `$3')',
7188   eval(`$2 < 0')`$3', `1', ``0', `$1'',
7189   eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1',
7190     `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))',
7191   eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'',
7192   eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')',
7193   `$3', `', ``$2', `$1'',
7194   eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1',
7195     ``$2', eval(`$1 - $2 + $3')',
7196   eval(`$3 - 0 < 0'), `1', ``$2', `0'',
7197   ``$2', `$3'')')dnl
7198 substr(`abcde', `2', `')
7199 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7200 @result{}cde
7201 substr(`abcde', `-3')
7202 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7203 @result{}cde
7204 substr(`abcde', `', `-3')
7205 @error{}m4trace: -1- _substr(`abcde', `0', `2')
7206 @result{}ab
7207 substr(`abcde', `-6')
7208 @error{}m4trace: -1- _substr(`abcde', `0', `5')
7209 @result{}abcde
7210 substr(`abcde', `-6', `5')
7211 @error{}m4trace: -1- _substr(`abcde', `0', `4')
7212 @result{}abcd
7213 substr(`abcde', `-7', `1')
7214 @error{}m4trace: -1- _substr(`abcde', `0', `0')
7215 @result{}
7216 substr(`abcde', `1', `-2')
7217 @error{}m4trace: -1- _substr(`abcde', `1', `2')
7218 @result{}bc
7219 substr(`abcde', `-4', `-1')
7220 @error{}m4trace: -1- _substr(`abcde', `1', `3')
7221 @result{}bcd
7222 substr(`abcde', `4', `-3')
7223 @error{}m4trace: -1- _substr(`abcde', `4', `0')
7224 @result{}
7225 substr(`abcdefghij', `-09', `08')
7226 @error{}m4trace: -1- _substr(`abcdefghij', `1', `8')
7227 @result{}bcdefghi
7228 @end example
7229
7230 @node Translit
7231 @section Translating characters
7232
7233 @cindex translating characters
7234 @cindex characters, translating
7235 Character translation is done with @code{translit}:
7236
7237 @deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement})
7238 Expands to @var{string}, with each character that occurs in
7239 @var{chars} translated into the character from @var{replacement} with
7240 the same index.
7241
7242 If @var{replacement} is shorter than @var{chars}, the excess characters
7243 of @var{chars} are deleted from the expansion; if @var{chars} is
7244 shorter, the excess characters in @var{replacement} are silently
7245 ignored.  If @var{replacement} is omitted, all characters in
7246 @var{string} that are present in @var{chars} are deleted from the
7247 expansion.  If a character appears more than once in @var{chars}, only
7248 the first instance is used in making the translation.  Only a single
7249 translation pass is made, even if characters in @var{replacement} also
7250 appear in @var{chars}.
7251
7252 As a GNU extension, both @var{chars} and @var{replacement} can
7253 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
7254 letters) or @samp{0-9} (meaning all digits).  To include a dash @samp{-}
7255 in @var{chars} or @var{replacement}, place it first or last in the
7256 entire string, or as the last character of a range.  Back-to-back ranges
7257 can share a common endpoint.  It is not an error for the last character
7258 in the range to be `smaller' than the first.  In that case, the range
7259 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
7260 The expansion of a range is dependent on the underlying encoding of
7261 characters, so using ranges is not always portable between machines.
7262
7263 The macro @code{translit} is recognized only with parameters.
7264 @end deffn
7265
7266 @example
7267 translit(`GNUs not Unix', `A-Z')
7268 @result{}s not nix
7269 translit(`GNUs not Unix', `a-z', `A-Z')
7270 @result{}GNUS NOT UNIX
7271 translit(`GNUs not Unix', `A-Z', `z-a')
7272 @result{}tmfs not fnix
7273 translit(`+,-12345', `+--1-5', `<;>a-c-a')
7274 @result{}<;>abcba
7275 translit(`abcdef', `aabdef', `bcged')
7276 @result{}bgced
7277 @end example
7278
7279 In the @sc{ascii} encoding, the first example deletes all uppercase
7280 letters, the second converts lowercase to uppercase, and the third
7281 `mirrors' all uppercase letters, while converting them to lowercase.
7282 The first two cases are by far the most common, even though they are not
7283 portable to @sc{ebcdic} or other encodings.  The fourth example shows a
7284 range ending in @samp{-}, as well as back-to-back ranges.  The final
7285 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
7286 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
7287 @samp{e} are swapped, and the @samp{f} is discarded.
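
Ranges also make simple ciphers short to write.  As a sketch that
assumes the @sc{ascii} encoding, the hypothetical macro @code{rot13}
rotates each letter by 13 places, so applying it to its own output
restores the original text:

@example
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello, World')
@result{}Uryyb, Jbeyq
rot13(`Uryyb, Jbeyq')
@result{}Hello, World
@end example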
7288
7289 Omitting @var{chars} evokes a warning, but still produces output.
7290
7291 @example
7292 translit(`abc')
7293 @error{}m4:stdin:1: warning: translit: too few arguments: 1 < 2
7294 @result{}abc
7295 @end example
7296
7297 @node Patsubst
7298 @section Substituting text by regular expression
7299
7300 @cindex regular expressions
7301 @cindex expressions, regular
7302 @cindex pattern substitution
7303 @cindex substitution by regular expression
7304 @cindex GNU extensions
7305 Global substitution in a string is done by @code{patsubst}:
7306
7307 @deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @
7308   @ovar{replacement}, @ovar{resyntax})
7309 Searches @var{string} for matches of @var{regexp}, and substitutes
7310 @var{replacement} for each match.
7311
7312 If @var{resyntax} is given, the particular flavor of regular expression
7313 understood with respect to @var{regexp} can be changed from the current
7314 default.  @xref{Changeresyntax}, for details of the values that can be
7315 given for this argument.  Unlike @code{regexp}, if exactly three
7316 arguments are given, the third argument is always treated as
7317 @var{replacement}, even if it matches a known syntax name.
7318
7319 The parts of @var{string} that are not covered by any match of
7320 @var{regexp} are copied to the expansion.  Whenever a match is found, the
7321 search proceeds from the end of the match, so a character from
7322 @var{string} will never be substituted twice.  If @var{regexp} matches a
7323 string of zero length, the start position for the search is incremented,
7324 to avoid infinite loops.
7325
7326 When a replacement is to be made, @var{replacement} is inserted into
7327 the expansion, with @samp{\@var{n}} substituted by the text matched by
7328 the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
7329 nine sub-expressions.  The escape @samp{\&} is replaced by the text of
7330 the entire regular expression matched.  For all other characters,
7331 @samp{\} treats the next character literally.  A warning is issued if
7332 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
7333 if there is a trailing @samp{\}.
7334
7335 The @var{replacement} argument can be omitted, in which case the text
7336 matched by @var{regexp} is deleted.
7337
7338 The macro @code{patsubst} is recognized only with parameters.
7339 @end deffn
7340
7341 When used with two arguments, @code{regexp} returns the position of the
7342 match, but @code{patsubst} deletes the match:
7343
7344 @example
7345 patsubst(`GNUs not Unix', `^', `OBS: ')
7346 @result{}OBS: GNUs not Unix
7347 patsubst(`GNUs not Unix', `\<', `OBS: ')
7348 @result{}OBS: GNUs OBS: not OBS: Unix
7349 patsubst(`GNUs not Unix', `\w*', `(\&)')
7350 @result{}(GNUs)() (not)() (Unix)()
7351 patsubst(`GNUs not Unix', `\w+', `(\&)')
7352 @result{}(GNUs) (not) (Unix)
7353 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
7354 @result{}GN not@w{ }
7355 patsubst(`GNUs not Unix', `not', `NOT\')
7356 @error{}m4:stdin:6: warning: patsubst: trailing \ ignored in replacement
7357 @result{}GNUs NOT Unix
7358 @end example
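
Two further sketches, using made-up input strings: the first replaces
each run of blanks with a single underscore, and the second uses
sub-expressions to rewrite every @samp{name=value} pair as
@samp{value:name}.

@example
patsubst(`one  two   three', ` +', `_')
@result{}one_two_three
patsubst(`alpha=1 beta=2', `\(\w+\)=\(\w+\)', `\2:\1')
@result{}1:alpha 2:beta
@end example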
7359
7360 Here is a slightly more realistic example, which capitalizes individual
7361 words or whole sentences, by substituting calls of the macros
7362 @code{upcase} and @code{downcase} into the strings.
7363
7364 @deffn Composite upcase (@var{text})
7365 @deffnx Composite downcase (@var{text})
7366 @deffnx Composite capitalize (@var{text})
7367 Expand to @var{text}, but with capitalization changed: @code{upcase}
7368 changes all letters to upper case, @code{downcase} changes all letters
7369 to lower case, and @code{capitalize} changes the first character of each
7370 word to upper case and the remaining characters to lower case.
7371 @end deffn
7372
7373 First, an example of their usage, using implementations distributed in
7374 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
7375
7376 @comment examples
7377 @example
7378 $ @kbd{m4 -I examples}
7379 include(`capitalize.m4')
7380 @result{}
7381 upcase(`GNUs not Unix')
7382 @result{}GNUS NOT UNIX
7383 downcase(`GNUs not Unix')
7384 @result{}gnus not unix
7385 capitalize(`GNUs not Unix')
7386 @result{}Gnus Not Unix
7387 @end example
7388
7389 Now for the implementation.  There is a helper macro @code{_capitalize}
7390 which puts only its first word in mixed case.  Then @code{capitalize}
7391 merely parses out the words, and replaces them with an invocation of
7392 @code{_capitalize}.  (As presented here, the @code{capitalize} macro has
7393 some subtle flaws.  You should try to see if you can find and correct
7394 them; or @pxref{Improved capitalize, , Answers}).
7395
7396 @comment examples
7397 @example
7398 $ @kbd{m4 -I examples}
7399 undivert(`capitalize.m4')dnl
7400 @result{}divert(`-1')
7401 @result{}# upcase(text)
7402 @result{}# downcase(text)
7403 @result{}# capitalize(text)
7404 @result{}#   change case of text, simple version
7405 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
7406 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
7407 @result{}define(`_capitalize',
7408 @result{}       `regexp(`$1', `^\(\w\)\(\w*\)',
7409 @result{}               `upcase(`\1')`'downcase(`\2')')')
7410 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
7411 @result{}divert`'dnl
7412 @end example
7413
7414 If @var{resyntax} is given, @var{regexp} must be given according to
7415 the syntax chosen, though the default regular expression syntax
7416 remains unchanged for other invocations:
7417
7418 @example
7419 define(`epatsubst',
7420        `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl
7421 epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_')
7422 @result{}_bar_ _baz_
7423 patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_')
7424 @result{}_bar_ _baz_
7425 @end example
7426
7427 While @code{regexp} replaces the whole input with the replacement as
7428 soon as there is a match, @code{patsubst} replaces each
7429 @emph{occurrence} of a match and preserves non-matching pieces:
7430
7431 @example
7432 define(`patreg',
7433 `patsubst($@@)
7434 regexp($@@)')dnl
7435 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
7436 @result{}bar FOO baz FOO
7437 @result{}FOO
7438 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
7439 @result{}bab abb 212
7440 @result{}bab
7441 @end example
7442
7443 Omitting @var{regexp} evokes a warning, but still produces output;
7444 contrast this with an empty @var{regexp} argument.
7445
7446 @example
7447 patsubst(`abc')
7448 @error{}m4:stdin:1: warning: patsubst: too few arguments: 1 < 2
7449 @result{}abc
7450 patsubst(`abc', `')
7451 @result{}abc
7452 patsubst(`abc', `', `\\-')
7453 @result{}\-a\-b\-c\-
7454 @end example
7455
7456 @node Format
7457 @section Formatting strings (printf-like)
7458
7459 @cindex formatted output
7460 @cindex output, formatted
7461 @cindex GNU extensions
7462 Formatted output can be made with @code{format}:
7463
7464 @deffn {Builtin (gnu)} format (@var{format-string}, @dots{})
7465 Works much like the C function @code{printf}.  The first argument
7466 @var{format-string} can contain @samp{%} specifications which are
7467 satisfied by additional arguments, and the expansion of @code{format} is
7468 the formatted string.
7469
7470 The macro @code{format} is recognized only with parameters.
7471 @end deffn
7472
7473 Its use is best described by a few examples:
7474
7475 @comment This test is a bit fragile, if someone tries to port to a
7476 @comment platform without infinity.
7477 @example
7478 define(`foo', `The brown fox jumped over the lazy dog')
7479 @result{}
7480 format(`The string "%s" uses %d characters', foo, len(foo))
7481 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
7482 format(`%*.*d', `-1', `-1', `1')
7483 @result{}1
7484 format(`%.0f', `56789.9876')
7485 @result{}56790
7486 len(format(`%-*X', `5000', `1'))
7487 @result{}5000
7488 ifelse(format(`%010F', `infinity'), `       INF', `success',
7489        format(`%010F', `infinity'), `  INFINITY', `success',
7490        format(`%010F', `infinity'))
7491 @result{}success
7492 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
7493        format(`%.1A', `1.999'), `0X2.0P+0', `success',
7494        format(`%.1A', `1.999'))
7495 @result{}success
7496 format(`%g', `0xa.P+1')
7497 @result{}20
7498 @end example
7499
7500 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
7501 example shows how @code{format} can be used to produce tabular output.
7502
7503 @comment examples
7504 @example
7505 $ @kbd{m4 -I examples}
7506 include(`forloop.m4')
7507 @result{}
7508 forloop(`i', `1', `10', `format(`%6d squared is %10d
7509 ', i, eval(i**2))')
7510 @result{}     1 squared is          1
7511 @result{}     2 squared is          4
7512 @result{}     3 squared is          9
7513 @result{}     4 squared is         16
7514 @result{}     5 squared is         25
7515 @result{}     6 squared is         36
7516 @result{}     7 squared is         49
7517 @result{}     8 squared is         64
7518 @result{}     9 squared is         81
7519 @result{}    10 squared is        100
7520 @result{}
7521 @end example
7522
7523 The builtin @code{format} is modeled after the ANSI C @samp{printf}
7524 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
7525 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
7526 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
7527 @samp{%}; it supports field widths and precisions, and the flags
7528 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}.  For
integer specifiers, the length modifiers @samp{hh}, @samp{h}, and
@samp{l} are recognized, and for floating point specifiers, the length
modifier @samp{l} is recognized.  Items not yet supported include
positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
modifiers, and any platform extensions available in the native
@code{printf}.  For more details on the functioning of @code{printf},
see the C Library Manual, or the POSIX specification.  Note that
@samp{%a} is supported even on platforms that haven't yet implemented
C99 hexadecimal floating point output natively.
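
As a brief illustration of the flags, field widths, and precisions
listed above, the following sketch uses only specifiers from that list.
The argument values are arbitrary, chosen just for the demonstration,
and the @samp{%c} line assumes an ASCII-compatible character set.

@example
format(`%05d', `42')
@result{}00042
format(`%-6s|', `ab')
@result{}ab    |
format(`%x %X %o', `255', `255', `8')
@result{}ff FF 10
format(`%+d', `7')
@result{}+7
format(`%.3s', `abcdef')
@result{}abc
format(`%c%c', `72', `105')
@result{}Hi
format(`%.1f%%', `12.345')
@result{}12.3%
@end example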
7539
7540 @c FIXME - format still needs some improvements.
7541 Warnings are issued for unrecognized specifiers, an improper number of
7542 arguments, or difficulty parsing an argument according to the format
7543 string (such as overflow or extra characters).  It is anticipated that a
7544 future release of GNU @code{m4} will support more specifiers.
7545 Likewise, escape sequences are not yet recognized.
7546
7547 @example
7548 format(`%p', `0')
7549 @error{}m4:stdin:1: warning: format: unrecognized specifier in '%p'
7550 @result{}
7551 format(`%*d', `')
7552 @error{}m4:stdin:2: warning: format: empty string treated as 0
7553 @error{}m4:stdin:2: warning: format: too few arguments: 2 < 3
7554 @result{}0
7555 format(`%.1f', `2a')
7556 @error{}m4:stdin:3: warning: format: non-numeric argument '2a'
7557 @result{}2.0
7558 @end example
7559
7560 @ignore
7561 @comment Expose a crash with a bad format string fixed in 1.4.15.
7562 @comment Unfortunately, 8-bit bytes are hard to check for; but the
7563 @comment exit status is enough to sniff the crash in broken versions.
7564
7565 @example
7566 format(`%'format(`%c', `128'))
7567 @result{}
7568 @error{}ignore
7569 @end example
7570 @end ignore
7571
7572 @node Arithmetic
7573 @chapter Macros for doing arithmetic
7574
7575 @cindex arithmetic
7576 @cindex integer arithmetic
7577 Integer arithmetic is included in @code{m4}, with a C-like syntax.  As
7578 convenient shorthands, there are builtins for simple increment and
7579 decrement operations.
7580
7581 @menu
7582 * Incr::                        Decrement and increment operators
7583 * Eval::                        Evaluating integer expressions
7584 * Mpeval::                      Multiple precision arithmetic
7585 @end menu
7586
7587 @node Incr
7588 @section Decrement and increment operators
7589
7590 @cindex decrement operator
7591 @cindex increment operator
7592 Increment and decrement of integers are supported using the builtins
7593 @code{incr} and @code{decr}:
7594
7595 @deffn {Builtin (m4)} incr (@var{number})
7596 @deffnx {Builtin (m4)} decr (@var{number})
7597 Expand to the numerical value of @var{number}, incremented
7598 or decremented, respectively, by one.  Except for the empty string, the
7599 expansion is empty if @var{number} could not be parsed.
7600
7601 The macros @code{incr} and @code{decr} are recognized only with
7602 parameters.
7603 @end deffn
7604
7605 @example
7606 incr(`4')
7607 @result{}5
7608 decr(`7')
7609 @result{}6
7610 incr()
7611 @error{}m4:stdin:3: warning: incr: empty string treated as 0
7612 @result{}1
7613 decr()
7614 @error{}m4:stdin:4: warning: decr: empty string treated as 0
7615 @result{}-1
7616 @end example
7617
7618 The builtin macros @code{incr} and @code{decr} are recognized only when
7619 given arguments.
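
A common use of @code{incr} is to maintain a counter across macro
invocations.  The following is a minimal sketch; the macro names
@code{counter} and @code{next} are hypothetical, chosen only for this
illustration.

@example
define(`counter', `0')
@result{}
define(`next', `define(`counter', incr(counter))counter')
@result{}
next
@result{}1
next, next
@result{}2, 3
@end example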
7620
7621 @node Eval
7622 @section Evaluating integer expressions
7623
7624 @cindex integer expression evaluation
7625 @cindex evaluation, of integer expressions
7626 @cindex expressions, evaluation of integer
7627 Integer expressions are evaluated with @code{eval}:
7628
7629 @deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
7630 Expands to the value of @var{expression}.  The expansion is empty
7631 if a problem is encountered while parsing the arguments.  If specified,
7632 @var{radix} and @var{width} control the format of the output.
7633
7634 Calculations are done with signed numbers, using at least 31-bit
7635 precision, but as a GNU extension, @code{m4} will use wider
7636 integers if available.  Precision is finite, based on the platform's
7637 notion of @code{intmax_t}, and overflow silently results in wraparound.
7638 A warning is issued if division by zero is attempted, or if
7639 @var{expression} could not be parsed.
7640
7641 Expressions can contain the following operators, listed in order of
7642 decreasing precedence.
7643
7644 @table @samp
7645 @item ()
7646 Parentheses
7647 @item +  -  ~  !
7648 Unary plus and minus, and bitwise and logical negation
7649 @item **
7650 Exponentiation
7651 @item *  /  %  \
7652 Multiplication, division, modulo, and ratio
7653 @item +  -
7654 Addition and subtraction
7655 @item <<  >>  >>>
7656 Shift left, shift right, unsigned shift right
7657 @item >  >=  <  <=
7658 Relational operators
7659 @item ==  !=
7660 Equality operators
7661 @item &
7662 Bitwise and
7663 @item ^
7664 Bitwise exclusive-or
7665 @item |
7666 Bitwise or
7667 @item &&
7668 Logical and
7669 @item ||
7670 Logical or
7671 @item ?:
7672 Conditional ternary
7673 @item ,
7674 Sequential evaluation
7675 @end table
7676
7677 The macro @code{eval} is recognized only with parameters.
7678 @end deffn
7679
7680 All binary operators, except exponentiation, are left associative.  C
7681 operators that perform variable assignment, such as @samp{+=} or
7682 @samp{--}, are not implemented, since @code{eval} only operates on
7683 constants, not variables.  Attempting to use them results in an error.
7684 @comment FIXME - since XCU ERN 137 is approved, we could provide an
7685 @comment extension that supported assignment operators.
7686
7687 Note that some older @code{m4} implementations use @samp{^} as an
7688 alternate operator for the exponentiation, although POSIX
7689 requires the C behavior of bitwise exclusive-or.  The precedence of the
7690 negation operators, @samp{~} and @samp{!}, was traditionally lower than
7691 equality.  The unary operators could not be used reliably more than once
7692 on the same term without intervening parentheses.  The traditional
7693 precedence of the equality operators @samp{==} and @samp{!=} was
7694 identical instead of lower than the relational operators such as
7695 @samp{<}, even through GNU M4 1.4.8.  Starting with version
7696 1.4.9, GNU M4 correctly follows POSIX precedence
7697 rules.  M4 scripts designed to be portable between releases must be
7698 aware that parentheses may be required to enforce C precedence rules.
7699 Likewise, division by zero, even in the unused branch of a
7700 short-circuiting operator, is not always well-defined in other
7701 implementations.
7702
7703 Following are some examples where the current version of M4 follows C
7704 precedence rules, but where older versions and some other
7705 implementations of @code{m4} require explicit parentheses to get the
7706 correct result:
7707
7708 @example
7709 eval(`1 == 2 > 0')
7710 @result{}1
7711 eval(`(1 == 2) > 0')
7712 @result{}0
7713 eval(`! 0 * 2')
7714 @result{}2
7715 eval(`! (0 * 2)')
7716 @result{}1
7717 eval(`1 | 1 ^ 1')
7718 @result{}1
7719 eval(`(1 | 1) ^ 1')
7720 @result{}0
7721 eval(`+ + - ~ ! ~ 0')
7722 @result{}1
7723 eval(`++0')
7724 @error{}m4:stdin:8: warning: eval: invalid operator: '++0'
7725 @result{}
7726 eval(`1 = 1')
7727 @error{}m4:stdin:9: warning: eval: invalid operator: '1 = 1'
7728 @result{}
7729 eval(`0 |= 1')
7730 @error{}m4:stdin:10: warning: eval: invalid operator: '0 |= 1'
7731 @result{}
7732 eval(`2 || 1 / 0')
7733 @result{}1
7734 eval(`0 || 1 / 0')
7735 @error{}m4:stdin:12: warning: eval: divide by zero: '0 || 1 / 0'
7736 @result{}
7737 eval(`0 && 1 % 0')
7738 @result{}0
7739 eval(`2 && 1 % 0')
7740 @error{}m4:stdin:14: warning: eval: modulo by zero: '2 && 1 % 0'
7741 @result{}
7742 @end example
7743
7744 @cindex GNU extensions
7745 As a GNU extension, @code{eval} supports several operators
7746 that do not appear in C@.  A right-associative exponentiation operator
7747 @samp{**} computes the value of the left argument raised to the right,
7748 modulo the numeric precision width.  If evaluated, the exponent must be
7749 non-negative, and at least one of the arguments must be non-zero, or a
7750 warning is issued.  An unsigned shift operator @samp{>>>} allows
7751 shifting a negative number as though it were an unsigned bit pattern,
7752 which shifts in 0 bits rather than twos-complement sign-extension.  A
7753 ratio operator @samp{\} behaves like normal division @samp{/} on
7754 integers, but is provided for symmetry with @code{mpeval}.
7755 Additionally, the C operators @samp{,} and @samp{?:} are supported.
7756
7757 @example
7758 eval(`2 ** 3 ** 2')
7759 @result{}512
7760 eval(`(2 ** 3) ** 2')
7761 @result{}64
7762 eval(`0 ** 1')
7763 @result{}0
7764 eval(`2 ** 0')
7765 @result{}1
7766 eval(`0 ** 0')
7767 @result{}
7768 @error{}m4:stdin:5: warning: eval: divide by zero: '0 ** 0'
7769 eval(`4 ** -2')
7770 @error{}m4:stdin:6: warning: eval: negative exponent: '4 ** -2'
7771 @result{}
7772 eval(`2 || 4 ** -2')
7773 @result{}1
7774 eval(`(-1 >> 1) == -1')
7775 @result{}1
7776 eval(`(-1 >>> 1) > (1 << 30)')
7777 @result{}1
7778 eval(`6 \ 3')
7779 @result{}2
7780 eval(`1 ? 2 : 3')
7781 @result{}2
7782 eval(`0 ? 2 : 3')
7783 @result{}3
7784 eval(`1 ? 2 : 1/0')
7785 @result{}2
7786 eval(`0 ? 1/0 : 3')
7787 @result{}3
7788 eval(`4, 5')
7789 @result{}5
7790 @end example
7791
Within @var{expression} (but not @var{radix} or @var{width}), numbers
7793 without a special prefix are decimal.  A simple @samp{0} prefix
7794 introduces an octal number.  @samp{0x} introduces a hexadecimal number.
As GNU extensions, @samp{0b} introduces a binary number, and
@samp{0r} introduces a number expressed in any radix between 1 and 36:
7797 the prefix should be immediately followed by the decimal expression of
7798 the radix, a colon, then the digits making the number.  For radix 1,
7799 leading zeros are ignored, and all remaining digits must be @samp{1};
7800 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
7801 @dots{}.  Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
to @samp{z}.  Lower and upper case letters can be used interchangeably
in number prefixes and as digits.
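
The following sketch illustrates the prefixes just described, including
the interchangeable use of upper and lower case; each result is printed
in the default radix of 10.

@example
eval(`010')
@result{}8
eval(`0x1F')
@result{}31
eval(`0b101')
@result{}5
eval(`0r36:z')
@result{}35
eval(`0R16:Ff + 0X10')
@result{}271
@end example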
7804
7805 Parentheses may be used to group subexpressions whenever needed.  For the
7806 relational operators, a true relation returns @code{1}, and a false
relation returns @code{0}.
7808
7809 Here are a few examples of use of @code{eval}.
7810
7811 @example
7812 eval(`-3 * 5')
7813 @result{}-15
7814 eval(`-99 / 10')
7815 @result{}-9
7816 eval(`-99 % 10')
7817 @result{}-9
7818 eval(`99 % -10')
7819 @result{}9
7820 eval(index(`Hello world', `llo') >= 0)
7821 @result{}1
7822 eval(`0r1:0111 + 0b100 + 0r3:12')
7823 @result{}12
7824 define(`square', `eval(`($1) ** 2')')
7825 @result{}
7826 square(`9')
7827 @result{}81
7828 square(square(`5')` + 1')
7829 @result{}676
7830 define(`foo', `666')
7831 @result{}
7832 eval(`foo / 6')
7833 @error{}m4:stdin:11: warning: eval: bad expression: 'foo / 6'
7834 @result{}
7835 eval(foo / 6)
7836 @result{}111
7837 @end example
7838
As the last two examples show, @code{eval} does not handle macro
7840 names, even if they expand to a valid expression (or part of a valid
7841 expression).  Therefore all macros must be expanded before they are
7842 passed to @code{eval}.
7843 @comment update this if we add support for variables.
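
One way to arrange for that, sketched here with the hypothetical macro
names @code{width}, @code{height}, and @code{area}, is to leave the
macro names outside the quotes so that they are expanded while the
argument is collected, or to pass the values through a wrapper macro
that parenthesizes its arguments.

@example
define(`width', `8')
@result{}
define(`height', `4')
@result{}
eval(width` * 'height)
@result{}32
define(`area', `eval(`($1) * ($2)')')
@result{}
area(width, height)
@result{}32
@end example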
7844
7845 Some calculations are not portable to other implementations, since they
7846 have undefined semantics in C, but GNU @code{m4} has
7847 well-defined behavior on overflow.  When shifting, an out-of-range shift
7848 amount is implicitly brought into the range of the precision using
7849 modulo arithmetic (for example, on 32-bit integers, this would be an
7850 implicit bit-wise and with 0x1f).  This example should work whether your
7851 platform uses 32-bit integers, 64-bit integers, or even some other
7852 atypical size.
7853
7854 @example
7855 define(`max_int', eval(`-1 >>> 1'))
7856 @result{}
7857 define(`min_int', eval(max_int` + 1'))
7858 @result{}
7859 eval(min_int` < 0')
7860 @result{}1
7861 eval(max_int` > 0')
7862 @result{}1
7863 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
7864 @result{}overflow occurred
7865 eval(`0x80000000 % -1')
7866 @result{}0
7867 eval(`-4 >> 1')
7868 @result{}-2
7869 eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2'))
7870 @result{}-2
7871 @end example
7872
7873 If @var{radix} is specified, it specifies the radix to be used in the
7874 expansion.  The default radix is 10; this is also the case if
7875 @var{radix} is the empty string.  A warning results if the radix is
7876 outside the range of 1 through 36, inclusive.  The result of @code{eval}
7877 is always taken to be signed.  No radix prefix is output, and for
7878 radices greater than 10, the digits are lower case (although some
7879 other implementations use upper case).  The output is unquoted, and
7880 subject to further macro expansion.  The @var{width}
7881 argument specifies the minimum output width, excluding any negative
7882 sign.  The result is zero-padded to extend the expansion to the
7883 requested width.  A warning results if the width is negative.  If
7884 @var{radix} or @var{width} is out of bounds, the expansion of
7885 @code{eval} is empty.
7886
7887 @example
7888 eval(`666', `10')
7889 @result{}666
7890 eval(`666', `11')
7891 @result{}556
7892 eval(`666', `6')
7893 @result{}3030
7894 eval(`666', `6', `10')
7895 @result{}0000003030
7896 eval(`-666', `6', `10')
7897 @result{}-0000003030
7898 eval(`10', `', `0')
7899 @result{}10
7900 `0r1:'eval(`10', `1', `11')
7901 @result{}0r1:01111111111
7902 eval(`10', `16')
7903 @result{}a
7904 eval(`1', `37')
7905 @error{}m4:stdin:9: warning: eval: radix out of range: 37
7906 @result{}
7907 eval(`1', , `-1')
7908 @error{}m4:stdin:10: warning: eval: negative width: -1
7909 @result{}
7910 eval()
7911 @error{}m4:stdin:11: warning: eval: empty string treated as 0
7912 @result{}0
7913 eval(` ')
7914 @error{}m4:stdin:12: warning: eval: empty string treated as 0
7915 @result{}0
7916 define(`a', `hi')eval(` 10 ', `16')
7917 @result{}hi
7918 @end example
7919
7920 @node Mpeval
7921 @section Multiple precision arithmetic
7922
7923 When @code{m4} is compiled with a multiple precision arithmetic library
7924 (@pxref{Experiments}), a builtin @code{mpeval} is defined.
7925
7926 @deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @
7927   @ovar{width})
7928 Behaves similarly to @code{eval}, except the calculations are done with
7929 infinite precision, and rational numbers are supported.  Numbers may be
7930 of any length.
7931
7932 The macro @code{mpeval} is recognized only with parameters.
7933 @end deffn
7934
7935 For the most part, using @code{mpeval} is similar to using @code{eval}:
7936
7937 @comment options: mpeval -
7938 @example
7939 $ @kbd{m4 mpeval -}
7940 mpeval(`(1 << 70) + 2 ** 68 * 3', `16')
7941 @result{}700000000000000000
7942 `0r24:'mpeval(`0r36:zYx', `24', `5')
7943 @result{}0r24:038m9
7944 @end example
7945
7946 The ratio operator, @samp{\}, is provided with the same precedence as
7947 division, and rationally divides two numbers and canonicalizes the
7948 result, whereas the division operator @samp{/} always returns the
7949 integer quotient of the division.  To convert a rational value to
7950 integral, divide (@samp{/}) by 1.  Some operators, such as @samp{%},
7951 @samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate
7952 only on integers and will truncate any rational remainder.  The unsigned
shift operator, @samp{>>>}, behaves identically to the regular right
shift, @samp{>>}, since with infinite precision, it is not possible to
convert a negative number to a positive one using shifts.  The
7956 exponentiation operator, @samp{**}, assumes that the exponent is
7957 integral, but allows negative exponents.  With the short-circuit logical
7958 operators, @samp{||} and @samp{&&}, a non-zero result preserves the
7959 value of the argument that ended evaluation, rather than collapsing to
7960 @samp{1}.  The operators @samp{?:} and @samp{,} are always available,
7961 even in POSIX mode, since @code{mpeval} does not have to
7962 conform to the POSIX rules for @code{eval}.
7963
7964 @comment options: mpeval -
7965 @example
7966 $ @kbd{m4 mpeval -}
7967 mpeval(`2 / 4')
7968 @result{}0
7969 mpeval(`2 \ 4')
7970 @result{}1\2
7971 mpeval(`2 || 3')
7972 @result{}2
7973 mpeval(`1 && 3')
7974 @result{}3
7975 mpeval(`-1 >> 1')
7976 @result{}-1
7977 mpeval(`-1 >>> 1')
7978 @result{}-1
7979 @end example
7980
7981 @node Shell commands
7982 @chapter Macros for running shell commands
7983
7984 @cindex UNIX commands, running
7985 @cindex executing shell commands
7986 @cindex running shell commands
7987 @cindex shell commands, running
7988 @cindex commands, running shell
7989 There are a few builtin macros in @code{m4} that allow you to run shell
7990 commands from within @code{m4}.
7991
7992 Note that the definition of a valid shell command is system dependent.
7993 On UNIX systems, this is the typical @command{/bin/sh}.  But on other
7994 systems, such as native Windows, the shell has a different syntax of
7995 commands that it understands.  Some examples in this chapter assume
7996 @command{/bin/sh}, and also demonstrate how to quit early with a known
7997 exit value if this is not the case.
7998
7999 @menu
8000 * Platform macros::             Determining the platform
8001 * Syscmd::                      Executing simple commands
8002 * Esyscmd::                     Reading the output of commands
8003 * Sysval::                      Exit status
8004 * Mkstemp::                     Making temporary files
8005 * Mkdtemp::                     Making temporary directories
8006 @end menu
8007
8008 @node Platform macros
8009 @section Determining the platform
8010
8011 @cindex platform macros
8012 Sometimes it is desirable for an input file to know which platform
8013 @code{m4} is running on.  GNU @code{m4} provides several
8014 macros that are predefined to expand to the empty string; checking for
8015 their existence will confirm platform details.
8016
8017 @deffn {Optional builtin (gnu)} __os2__
8018 @deffnx {Optional builtin (traditional)} os2
8019 @deffnx {Optional builtin (gnu)} __unix__
8020 @deffnx {Optional builtin (traditional)} unix
8021 @deffnx {Optional builtin (gnu)} __windows__
8022 @deffnx {Optional builtin (traditional)} windows
8023 Each of these macros is conditionally defined as needed to describe the
8024 environment of @code{m4}.  If defined, each macro expands to the empty
8025 string.
8026 @end deffn
8027
8028 On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
8029 in the @samp{gnu} module, and @code{unix} in the @samp{traditional}
8030 module.
8031
8032 On native Windows systems, GNU @code{m4} will define
8033 @code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in
8034 the @samp{traditional} module.
8035
8036 On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
8037 in the @samp{gnu} module, and @code{os2} in the @samp{traditional}
8038 module.
8039
8040 If GNU M4 does not provide a platform macro for your system,
8041 please report that as a bug.
8042
8043 @example
8044 define(`provided', `0')
8045 @result{}
8046 ifdef(`__unix__', `define(`provided', incr(provided))')
8047 @result{}
8048 ifdef(`__windows__', `define(`provided', incr(provided))')
8049 @result{}
8050 ifdef(`__os2__', `define(`provided', incr(provided))')
8051 @result{}
8052 provided
8053 @result{}1
8054 @end example
8055
8056 @node Syscmd
8057 @section Executing simple commands
8058
8059 Any shell command can be executed, using @code{syscmd}:
8060
8061 @deffn {Builtin (m4)} syscmd (@var{shell-command})
8062 Executes @var{shell-command} as a shell command.
8063
8064 The expansion of @code{syscmd} is void, @emph{not} the output from
8065 @var{shell-command}!  Output or error messages from @var{shell-command}
8066 are not read by @code{m4}.  @xref{Esyscmd}, if you need to process the
8067 command output.
8068
8069 Prior to executing the command, @code{m4} flushes its buffers.
8070 The default standard input, output and error of @var{shell-command} are
8071 the same as those of @code{m4}.
8072
8073 By default, the @var{shell-command} will be used as the argument to the
8074 @option{-c} option of the @command{/bin/sh} shell (or the version of
8075 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8076 supports that).  If you prefer a different shell, the
8077 @command{configure} script can be given the option
8078 @option{--with-syscmd-shell=@var{location}} to set the location of an
8079 alternative shell at GNU @code{m4} installation; the
8080 alternative shell must still support @option{-c}.
8081
8082 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8083 m4}) is in effect, @code{syscmd} results in an error, since otherwise an
8084 input file could execute arbitrary code.
8085
8086 The macro @code{syscmd} is recognized only with parameters.
8087 @end deffn
8088
8089 @example
8090 define(`foo', `FOO')
8091 @result{}
8092 syscmd(`echo foo')
8093 @result{}foo
8094 @result{}
8095 @end example
8096
Note how the output of the command keeps its trailing newline, and is
followed by the newline that appeared after the macro; the expansion of
@code{syscmd} itself is empty.
8099
8100 The following is an example of @var{shell-command} using the same
8101 standard input as @code{m4}:
8102
8103 @comment The testsuite does not know how to parse pipes from the
8104 @comment texinfo.  Fortunately, there are other tests in the testsuite
8105 @comment that test this same feature.
8106 @comment ignore
8107 @example
8108 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
8109 @result{}
8110 @end example
8111
This tells @code{m4} to read all of its input before executing the wrapped
8113 text, then hands a valid (albeit emptied) pipe as standard input for the
8114 @code{cat} subcommand.  Therefore, you should be careful when using
8115 standard input (either by specifying no files, or by passing @samp{-} as
8116 a file name on the command line, @pxref{Command line files, , Invoking
8117 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
8118 that consume data from standard input.  When standard input is a
8119 seekable file, the subprocess will pick up with the next character not
8120 yet processed by @code{m4}; when it is a pipe or other non-seekable
8121 file, there is no guarantee how much data will already be buffered by
8122 @code{m4} and thus unavailable to the child.
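
One way to sidestep the question, sketched here under the assumptions
that the shell is @command{/bin/sh} and that the subcommand does not
actually need any input, is to redirect the standard input of the
subcommand so that it never competes with @code{m4}:

@example
syscmd(`cat </dev/null')
@result{}
@end example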
8123
8124 Following is an example of how potentially unsafe actions can be
8125 suppressed.
8126
8127 @comment options: --safer
8128 @comment status: 1
8129 @example
8130 $ @kbd{m4 --safer}
8131 syscmd(`echo hi')
8132 @error{}m4:stdin:1: syscmd: disabled by --safer
8133 @result{}
8134 @end example
8135
8136 @node Esyscmd
8137 @section Reading the output of commands
8138
8139 @cindex GNU extensions
8140 If you want @code{m4} to read the output of a shell command, use
8141 @code{esyscmd}:
8142
8143 @deffn {Builtin (gnu)} esyscmd (@var{shell-command})
8144 Expands to the standard output of the shell command
8145 @var{shell-command}.
8146
8147 Prior to executing the command, @code{m4} flushes its buffers.
8148 The default standard input and standard error of @var{shell-command} are
8149 the same as those of @code{m4}.  The error output of @var{shell-command}
8150 is not a part of the expansion: it will appear along with the error
8151 output of @code{m4}.
8152
8153 By default, the @var{shell-command} will be used as the argument to the
8154 @option{-c} option of the @command{/bin/sh} shell (or the version of
8155 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8156 supports that).  If you prefer a different shell, the
8157 @command{configure} script can be given the option
8158 @option{--with-syscmd-shell=@var{location}} to set the location of an
8159 alternative shell at GNU @code{m4} installation; the
8160 alternative shell must still support @option{-c}.
8161
8162 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8163 m4}) is in effect, @code{esyscmd} results in an error, since otherwise
8164 an input file could execute arbitrary code.
8165
8166 The macro @code{esyscmd} is recognized only with parameters.
8167 @end deffn
8168
8169 @example
8170 define(`foo', `FOO')
8171 @result{}
8172 esyscmd(`echo foo')
8173 @result{}FOO
8174 @result{}
8175 @end example
8176
8177 Note how the expansion of @code{esyscmd} keeps the trailing newline of
8178 the command, as well as using the newline that appeared after the macro.
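
When the trailing newline is unwanted, it can be deleted before the
output is used further.  The following sketch assumes a
@command{/bin/sh}-compatible shell, and uses @code{translit} with a
quoted newline as the set of characters to delete; the square brackets
are ordinary text, included only to make the boundaries visible.

@example
[translit(esyscmd(`echo hello'), `
')]
@result{}[hello]
@end example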
8179
8180 Just as with @code{syscmd}, care must be exercised when sharing standard
8181 input between @code{m4} and the child process of @code{esyscmd}.
8182 Likewise, potentially unsafe actions can be suppressed.
8183
8184 @comment options: --safer
8185 @comment status: 1
8186 @example
8187 $ @kbd{m4 --safer}
8188 esyscmd(`echo hi')
8189 @error{}m4:stdin:1: esyscmd: disabled by --safer
8190 @result{}
8191 @end example
8192
8193 @node Sysval
8194 @section Exit status
8195
8196 @cindex UNIX commands, exit status from
8197 @cindex exit status from shell commands
8198 @cindex shell commands, exit status from
8199 @cindex commands, exit status from shell
8200 @cindex status of shell commands
8201 To see whether a shell command succeeded, use @code{sysval}:
8202
8203 @deffn {Builtin (m4)} sysval
8204 Expands to the exit status of the last shell command run with
8205 @code{syscmd} or @code{esyscmd}.  Expands to 0 if no command has been
8206 run yet.
8207 @end deffn
8208
8209 @example
8210 sysval
8211 @result{}0
8212 syscmd(`false')
8213 @result{}
8214 ifelse(sysval, `0', `zero', `non-zero')
8215 @result{}non-zero
8216 syscmd(`exit 2')
8217 @result{}
8218 sysval
8219 @result{}2
8220 syscmd(`true')
8221 @result{}
8222 sysval
8223 @result{}0
8224 esyscmd(`false')
8225 @result{}
8226 ifelse(sysval, `0', `zero', `non-zero')
8227 @result{}non-zero
8228 esyscmd(`echo dnl && exit 127')
8229 @result{}
8230 sysval
8231 @result{}127
8232 esyscmd(`true')
8233 @result{}
8234 sysval
8235 @result{}0
8236 @end example
8237
8238 @code{sysval} results in 127 if there was a problem executing the
8239 command, for example, if the system-imposed argument length is exceeded,
8240 or if there were not enough resources to fork.  It is not possible to
8241 distinguish between failed execution and successful execution that had
8242 an exit status of 127, unless there was output from the child process.
8243
8244 On UNIX platforms, where it is possible to detect when command execution
8245 is terminated by a signal, rather than a normal exit, the result is the
8246 signal number shifted left by eight bits.
8247
8248 @comment This test has difficulties being portable, even on platforms
8249 @comment where syscmd invokes /bin/sh.  Kill is not portable with signal
8250 @comment names.  According to autoconf, the only portable signal numbers
8251 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM).  But
8252 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
8253 @comment exits normally rather than letting the signal terminate it).
8254 @comment Also, TERM is flaky, as it can also kill the running m4 on
8255 @comment systems where /bin/sh does not create its own process group.
8256 @comment And PIPE is unreliable, since people tend to run with it
8257 @comment ignored, with m4 inheriting that choice.  That leaves KILL as
8258 @comment the only signal we can reliably test.
8259 @example
8260 dnl This test assumes kill is a shell builtin, and that signals are
8261 dnl recognizable.
8262 ifdef(`__unix__', ,
8263       `errprint(` skipping: syscmd does not have unix semantics
8264 ')m4exit(`77')')dnl
8265 syscmd(`kill -9 $$')
8266 @result{}
8267 sysval
8268 @result{}2304
8269 syscmd()
8270 @result{}
8271 sysval
8272 @result{}0
8273 esyscmd(`kill -9 $$')
8274 @result{}
8275 sysval
8276 @result{}2304
8277 @end example
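
Because the signal number is stored shifted left by eight bits, it can
be recovered with @code{eval}.  The following is only a sketch, assuming
a platform with UNIX semantics where the shell started by @code{syscmd}
can kill itself; the helper name @code{sysval_signal} is hypothetical
and not part of M4:

@comment ignore
@example
define(`sysval_signal', `eval(sysval >> 8)')dnl
syscmd(`kill -9 $$')
@result{}
sysval_signal
@result{}9
@end example

Since a normal exit status never exceeds 255, a @code{sysval} of 256 or
more indicates termination by a signal.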
8278
8279 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8280 m4}) is in effect, @code{sysval} will always remain at its default value
8281 of zero.
8282
8283 @comment options: --safer
8284 @comment status: 1
8285 @example
8286 $ @kbd{m4 --safer}
8287 sysval
8288 @result{}0
8289 syscmd(`false')
8290 @error{}m4:stdin:2: syscmd: disabled by --safer
8291 @result{}
8292 sysval
8293 @result{}0
8294 @end example
8295
8296 @node Mkstemp
8297 @section Making temporary files
8298
8299 @cindex temporary file names
8300 @cindex files, names of temporary
8301 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8302 temporary file, for output or for some other purpose.  There is a
8303 builtin macro, @code{mkstemp}, for making a temporary file:
8304
8305 @deffn {Builtin (m4)} mkstemp (@var{template})
8306 @deffnx {Builtin (m4)} maketemp (@var{template})
8307 Expands to the quoted name of a new, empty file, made from the string
8308 @var{template}, which should end with the string @samp{XXXXXX}.  The six
8309 @samp{X} characters are then replaced with random characters matching
8310 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
8311 name unique.  If fewer than six @samp{X} characters are found at the end
8312 of @var{template}, the result will be longer than the template.  The
8313 created file will have access permissions as if by @kbd{chmod =rw,go=},
8314 meaning that the current umask of the @code{m4} process is taken into
8315 account, and at most only the current user can read and write the file.
8316
8317 The traditional behavior, standardized by POSIX, is that
8318 @code{maketemp} merely replaces the trailing @samp{X} with the process
8319 id, without creating a file or quoting the expansion, and without
8320 ensuring that the resulting
8321 string is a unique file name.  In part, this means that using the same
8322 @var{template} twice in the same input file will result in the same
8323 expansion.  This behavior is a security hole, as it is very easy for
8324 another process to guess the name that will be generated, and thus
8325 interfere with a subsequent use of @code{syscmd} trying to manipulate
8326 that file name.  Hence, POSIX has recommended that all new
8327 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
8328 and that users of @code{m4} check for its existence.
8329
8330 The expansion is void and an error is issued if a temporary file could
8331 not be created.
8332
8333 When the @option{--safer} option (@pxref{Operation modes, Invoking m4})
8334 is in effect, @code{mkstemp} and GNU-mode @code{maketemp}
8335 result in an error, since otherwise an input file could perform a mild
8336 denial-of-service attack by filling up a disk with multiple empty files.
8337
8338 The macros @code{mkstemp} and @code{maketemp} are recognized only with
8339 parameters.
8340 @end deffn
8341
8342 If you try this next example, you will most likely get different output
8343 for the two file names, since the replacement characters are randomly
8344 chosen:
8345
8346 @comment ignore
8347 @example
8348 $ @kbd{m4}
8349 define(`tmp', `oops')
8350 @result{}
8351 maketemp(`/tmp/fooXXXXXX')
8352 @error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead
8353 @result{}/tmp/fooa07346
8354 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
8355       `define(`mkstemp', defn(`maketemp'))dnl
8356 errprint(`warning: potentially insecure maketemp implementation
8357 ')')
8358 @result{}
8359 mkstemp(`doc')
8360 @result{}docQv83Uw
8361 @end example
8362
8363 @comment options: --safer
8364 @comment status: 1
8365 @example
8366 $ @kbd{m4 --safer}
8367 maketemp(`/tmp/fooXXXXXX')
8368 @error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead
8369 @error{}m4:stdin:1: maketemp: disabled by --safer
8370 @result{}
8371 mkstemp(`/tmp/fooXXXXXX')
8372 @error{}m4:stdin:2: mkstemp: disabled by --safer
8373 @result{}
8374 @end example
8375
8376 @cindex GNU extensions
8377 Unless you use the @option{--traditional} command line option (or
8378 @option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
8379 version of @code{maketemp} is secure.  This means that passing the same
8380 template to multiple calls will generate multiple files.  However, we
8381 recommend that you use the new @code{mkstemp} macro, introduced in
8382 GNU M4 1.4.8, which is secure even in traditional mode.  Also,
8383 as of M4 1.4.11, the secure implementation quotes the resulting file
8384 name, so that you are guaranteed to know what file was created even if
8385 the random file name happens to match an existing macro.  Notice that
8386 this example is careful to use @code{defn} to avoid unintended expansion
8387 of @samp{foo}.
8388
8389 @example
8390 $ @kbd{m4}
8391 define(`foo', `errprint(`oops')')
8392 @result{}
8393 syscmd(`rm -f foo-??????')sysval
8394 @result{}0
8395 define(`file1', maketemp(`foo-XXXXXX'))dnl
8396 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8397 ifelse(esyscmd(`echo \` foo-?????? \''), `foo-??????',
8398        `no file', `created')
8399 @result{}created
8400 define(`file2', maketemp(`foo-XX'))dnl
8401 @error{}m4:stdin:6: warning: maketemp: recommend using mkstemp instead
8402 define(`file3', mkstemp(`foo-XXXXXX'))dnl
8403 ifelse(len(defn(`file1')), len(defn(`file2')),
8404        `same length', `different')
8405 @result{}same length
8406 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
8407 @result{}different file
8408 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
8409 @result{}different file
8410 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
8411 @result{}different file
8412 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
8413 @result{}
8414 sysval
8415 @result{}0
8416 @end example
8417
8418 @comment options: -G
8419 @example
8420 $ @kbd{m4 -G}
8421 syscmd(`rm -f foo-*')sysval
8422 @result{}0
8423 define(`file1', maketemp(`foo-XXXXXX'))dnl
8424 @error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead
8425 define(`file2', maketemp(`foo-XXXXXX'))dnl
8426 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8427 ifelse(file1, file2, `same', `different file')
8428 @result{}same
8429 len(maketemp(`foo-XXXXX'))
8430 @error{}m4:stdin:5: warning: maketemp: recommend using mkstemp instead
8431 @result{}9
8432 define(`abc', `def')
8433 @result{}
8434 maketemp(`foo-abc')
8435 @result{}foo-def
8436 @error{}m4:stdin:7: warning: maketemp: recommend using mkstemp instead
8437 syscmd(`test -f foo-*')sysval
8438 @result{}1
8439 @end example
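
A typical use is to let a shell command write into the temporary file
and then read the result back.  The following is a minimal sketch rather
than a tested example; the macro name @code{tmpfile} and the template
@samp{m4-XXXXXX} are chosen only for illustration, and it assumes that
@option{--safer} is not in effect:

@comment ignore
@example
define(`tmpfile', mkstemp(`m4-XXXXXX'))dnl
syscmd(`echo hello > 'defn(`tmpfile'))dnl
include(defn(`tmpfile'))dnl
@result{}hello
syscmd(`rm 'defn(`tmpfile'))sysval
@result{}0
@end example

Using @code{defn} everywhere keeps the file name from being rescanned as
macro text, for the same reason as in the examples above.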
8440
8441 @node Mkdtemp
8442 @section Making temporary directories
8443
8444 @cindex temporary directory
8445 @cindex directories, temporary
8446 @cindex GNU extensions
8447 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8448 temporary directory, for holding multiple temporary files; such a
8449 directory can be created with @code{mkdtemp}:
8450
8451 @deffn {Builtin (gnu)} mkdtemp (@var{template})
8452 Expands to the quoted name of a new, empty directory, made from the string
8453 @var{template}, which should end with the string @samp{XXXXXX}.  The six
8454 @samp{X} characters are then replaced with random characters matching
8455 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name
8456 unique.  If fewer than six @samp{X} characters are found at the end of
8457 @var{template}, the result will be longer than the template.  The
8458 created directory will have access permissions as if by @kbd{chmod
8459 =rwx,go=}, meaning that the current umask of the @code{m4} process is
8460 taken into account, and at most only the current user can read, write,
8461 and search the directory.
8462
8463 The expansion is void and an error is issued if a temporary directory could
8464 not be created.
8465
8466 When the @option{--safer} option (@pxref{Operation modes, Invoking m4})
8467 is in effect, @code{mkdtemp} results in an error, since otherwise an
8468 input file could perform a mild denial-of-service attack by filling up a
8469 disk with multiple directories.
8470
8471 The macro @code{mkdtemp} is recognized only with parameters.
8472 This macro was added in M4 2.0.
8473 @end deffn
8474
8475 If you try this next example, you will most likely get different output
8476 for the directory names, since the replacement characters are randomly
8477 chosen:
8478
8479 @comment ignore
8480 @example
8481 $ @kbd{m4}
8482 define(`tmp', `oops')
8483 @result{}
8484 mkdtemp(`/tmp/fooXXXXXX')
8485 @result{}/tmp/foo2h89Vo
8486 mkdtemp(`dir')
8487 @result{}dirrg079A
8488 @end example
8489
8490 @comment options: --safer
8491 @comment status: 1
8492 @example
8493 $ @kbd{m4 --safer}
8494 mkdtemp(`/tmp/fooXXXXXX')
8495 @error{}m4:stdin:1: mkdtemp: disabled by --safer
8496 @result{}
8497 @end example
8498
8499 Multiple calls with the same template will generate multiple
8500 directories.
8501
8502 @example
8503 $ @kbd{m4}
8504 syscmd(`echo foo??????')dnl
8505 @result{}foo??????
8506 define(`dir1', mkdtemp(`fooXXXXXX'))dnl
8507 ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created')
8508 @result{}created
8509 define(`dir2', mkdtemp(`fooXXXXXX'))dnl
8510 ifelse(dir1, dir2, `same', `different directories')
8511 @result{}different directories
8512 syscmd(`rmdir 'dir1 dir2)
8513 @result{}
8514 sysval
8515 @result{}0
8516 @end example
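
The directory can then hold several scratch files that are removed
together.  This is only a sketch under the same assumptions as before
(no @option{--safer}, a POSIX shell behind @code{syscmd}); the names
@code{tmpdir}, @file{a} and @file{b} are hypothetical:

@comment ignore
@example
define(`tmpdir', mkdtemp(`m4-XXXXXX'))dnl
syscmd(`echo one > 'defn(`tmpdir')`/a; echo two > 'defn(`tmpdir')`/b')dnl
include(defn(`tmpdir')`/a')include(defn(`tmpdir')`/b')dnl
@result{}one
@result{}two
syscmd(`rm -r 'defn(`tmpdir'))sysval
@result{}0
@end example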
8517
8518 @node Miscellaneous
8519 @chapter Miscellaneous builtin macros
8520
8521 This chapter describes various builtins that do not really belong in
8522 any of the previous chapters.
8523
8524 @menu
8525 * Errprint::                    Printing error messages
8526 * Location::                    Printing current location
8527 * M4exit::                      Exiting from @code{m4}
8528 * Syncoutput::                  Turning on and off sync lines
8529 @end menu
8530
8531 @node Errprint
8532 @section Printing error messages
8533
8534 @cindex printing error messages
8535 @cindex error messages, printing
8536 @cindex messages, printing error
8537 @cindex standard error, output to
8538 You can print error messages using @code{errprint}:
8539
8540 @deffn {Builtin (m4)} errprint (@var{message}, @dots{})
8541 Prints @var{message} and the rest of the arguments to standard error,
8542 separated by spaces.  Standard error is used, regardless of the
8543 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
8544
8545 The expansion of @code{errprint} is void.
8546 The macro @code{errprint} is recognized only with parameters.
8547 @end deffn
8548
8549 @example
8550 errprint(`Invalid arguments to forloop
8551 ')
8552 @error{}Invalid arguments to forloop
8553 @result{}
8554 errprint(`1')errprint(`2',`3
8555 ')
8556 @error{}12 3
8557 @result{}
8558 @end example
8559
8560 A trailing newline is @emph{not} printed automatically, so it should be
8561 supplied as part of the argument, as in the example.  Unfortunately, the
8562 exact output of @code{errprint} is not very portable to other @code{m4}
8563 implementations: POSIX requires that all arguments be printed,
8564 but some implementations of @code{m4} only print the first.
8565 Furthermore, some BSD implementations always append a newline
8566 for each @code{errprint} call, regardless of whether the last argument
8567 already had one, and POSIX is silent on whether this is
8568 acceptable.
8569
8570 @node Location
8571 @section Printing current location
8572
8573 @cindex location, input
8574 @cindex input location
8575 To make it possible to specify the location of an error, three
8576 utility builtins exist:
8577
8578 @deffn {Builtin (gnu)} __file__
8579 @deffnx {Builtin (gnu)} __line__
8580 @deffnx {Builtin (gnu)} __program__
8581 Expand to the quoted name of the current input file, the
8582 current input line number in that file, and the quoted name of the
8583 current invocation of @code{m4}.
8584 @end deffn
8585
8586 @example
8587 errprint(__program__:__file__:__line__: `input error
8588 ')
8589 @error{}m4:stdin:1: input error
8590 @result{}
8591 @end example
8592
8593 Line numbers start at 1 for each file.  If the file was found due to the
8594 @option{-I} option or @env{M4PATH} environment variable, that is
8595 reflected in the file name.  Synclines, via @code{syncoutput}
8596 (@pxref{Syncoutput}) or the command line option @option{--synclines}
8597 (or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the
8598 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}),
8599 also use this notion of current file and line.  Redefining the three
8600 location macros has no effect on syncline, debug, warning, or error
8601 message output.
8602
8603 This example reuses the file @file{incl.m4} mentioned earlier
8604 (@pxref{Include}):
8605
8606 @comment examples
8607 @example
8608 $ @kbd{m4 -I examples}
8609 define(`foo', ``$0' called at __file__:__line__')
8610 @result{}
8611 foo
8612 @result{}foo called at stdin:2
8613 include(`incl.m4')
8614 @result{}Include file start
8615 @result{}foo called at examples/incl.m4:2
8616 @result{}Include file end
8617 @result{}
8618 @end example
8619
8620 The location of macros invoked during the rescanning of macro expansion
8621 text corresponds to the location in the file where the expansion was
8622 triggered, regardless of how many newline characters the expansion text
8623 contains.  As of GNU M4 1.4.8, the location of text wrapped
8624 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
8625 @code{m4wrap} was invoked.  Previous versions, however, behaved as
8626 though wrapped text came from line 0 of the file ``''.
8627
8628 @example
8629 define(`echo', `$@@')
8630 @result{}
8631 define(`foo', `echo(__line__
8632 __line__)')
8633 @result{}
8634 echo(__line__
8635 __line__)
8636 @result{}4
8637 @result{}5
8638 m4wrap(`foo
8639 ')
8640 @result{}
8641 foo(errprint(__line__
8642 __line__
8643 ))
8644 @error{}8
8645 @error{}9
8646 @result{}8
8647 @result{}8
8648 __line__
8649 @result{}11
8650 m4wrap(`__line__
8651 ')
8652 @result{}
8653 ^D
8654 @result{}6
8655 @result{}6
8656 @result{}12
8657 @end example
8658
8659 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
8660 terminology.  If you invoke @code{m4} through an absolute path or a link
8661 with a different spelling, rather than by relying on a @env{PATH} search
8662 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
8663 The intent is that you can use it to produce error messages with the
8664 same formatting that @code{m4} produces internally.  It can also be used
8665 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
8666 @code{m4} that is currently running, rather than whatever version of
8667 @code{m4} happens to be first in @env{PATH}.  It was first introduced in
8668 GNU M4 1.4.6.
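
For instance, the running @code{m4} can be asked to report its own
version from within @code{syscmd}.  This is a sketch only: the shell
double quotes are an added precaution for program names that contain
spaces, and the output is omitted because it depends on the installed
version:

@comment ignore
@example
syscmd(`"'__program__`" --version')
@end example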
8669
8670 @node M4exit
8671 @section Exiting from @code{m4}
8672
8673 @cindex exiting from @code{m4}
8674 @cindex status, setting @code{m4} exit
8675 If you need to exit from @code{m4} before the entire input has been
8676 read, you can use @code{m4exit}:
8677
8678 @deffn {Builtin (m4)} m4exit (@ovar{code})
8679 Causes @code{m4} to exit, with exit status @var{code}.  If @var{code} is
8680 left out, the exit status is zero.  If @var{code} cannot be parsed, or
8681 is outside the range of 0 to 255, the exit status is one.  No further
8682 input is read, and all wrapped and diverted text is discarded.
8683 @end deffn
8684
8685 @example
8686 m4wrap(`This text is lost due to `m4exit'.')
8687 @result{}
8688 divert(`1') So is this.
8689 divert
8690 @result{}
8691 m4exit And this is never read.
8692 @end example
8693
8694 A common use of this is to abort processing:
8695
8696 @deffn Composite fatal_error (@var{message})
8697 Abort processing with an error message and non-zero status.  Prefix
8698 @var{message} with details about where the error occurred, and print the
8699 resulting string to standard error.
8700 @end deffn
8701
8702 @comment status: 1
8703 @example
8704 define(`fatal_error',
8705        `errprint(__program__:__file__:__line__`: fatal error: $*
8706 ')m4exit(`1')')
8707 @result{}
8708 fatal_error(`this is a BAD one, buster')
8709 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
8710 @end example
8711
8712 After this macro call, @code{m4} will exit with exit status 1.  This macro
8713 is only intended for error exits, since the normal exit procedures are
8714 not followed, i.e., diverted text is not undiverted, and saved text
8715 (@pxref{M4wrap}) is not reread.  (This macro could be made more robust
8716 to earlier versions of @code{m4}.  You should try to see if you can find
8717 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
8718
8719 Note that it is still possible for the exit status to be different from
8720 what was requested by @code{m4exit}.  If @code{m4} detects some other
8721 error, such as a write error on standard output, the exit status will be
8722 non-zero even if @code{m4exit} requested zero.
8723
8724 If standard input is seekable, then the file will be positioned at the
8725 next unread character.  If it is a pipe or other non-seekable file,
8726 then there are no guarantees how much data @code{m4} might have read
8727 into buffers, and thus discarded.
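
Another common use of @code{m4exit} is to stop early when a precondition
is not met.  The following sketch aborts with status 1 unless the GNU
extensions are enabled; the wording of the message is arbitrary, and
under GNU @code{m4} without @option{-G} the fragment produces no output:

@comment ignore
@example
ifdef(`__gnu__', ,
      `errprint(`GNU M4 extensions are required
')m4exit(`1')')dnl
@end example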
8728
8729 @node Syncoutput
8730 @section Turning on and off sync lines
8731
8732 @cindex toggling synchronization lines
8733 @cindex synchronization lines
8734 @cindex location, input
8735 @cindex input location
8736 It is possible to adjust whether synclines are printed to output:
8737
8738 @deffn {Builtin (gnu)} syncoutput (@var{truth})
8739 If @var{truth} matches the extended regular expression
8740 @samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the
8741 form: @samp{#line <number> ["<file>"]}.
8742
8743 If @var{truth} is empty, or matches the extended regular expression
8744 @samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off.
8745
8746 Any other argument is ignored, and a warning is issued.
8747
8748 The macro @code{syncoutput} is recognized only with parameters.
8749 This macro was added in M4 2.0.
8750 @end deffn
8751
8752 @example
8753 define(`twoline', `1
8754 2')
8755 @result{}
8756 changecom(`/*', `*/')
8757 @result{}
8758 define(`comment', `/*1
8759 2*/')
8760 @result{}
8761 twoline
8762 @result{}1
8763 @result{}2
8764 dnl no line
8765 syncoutput(`on')
8766 @result{}#line 8 "stdin"
8767 @result{}
8768 twoline
8769 @result{}1
8770 @result{}#line 9
8771 @result{}2
8772 dnl no line
8773 hello
8774 @result{}#line 11
8775 @result{}hello
8776 comment
8777 @result{}/*1
8778 @result{}2*/
8779 one comment `two
8780 three'
8781 @result{}#line 13
8782 @result{}one /*1
8783 @result{}2*/ two
8784 @result{}three
8785 goodbye
8786 @result{}#line 15
8787 @result{}goodbye
8788 syncoutput(`off')
8789 @result{}
8790 twoline
8791 @result{}1
8792 @result{}2
8793 syncoutput(`blah')
8794 @error{}m4:stdin:18: warning: syncoutput: unknown directive 'blah'
8795 @result{}
8796 @end example
8797
8798 Notice that a syncline is output any time a single source line expands
8799 to multiple output lines, or any time multiple source lines expand to a
8800 single output line.  When there is a one-for-one correspondence, no
8801 additional synclines are needed.
8802
8803 Synchronization lines can be used to track where input comes from; an
8804 optional file designation is printed when the syncline algorithm
8805 detects that consecutive output lines come from different files.  You
8806 can also use the @option{--synclines} command-line option (or
8807 @option{-s}, @pxref{Preprocessor features, , Invoking m4}) to start
8808 with synchronization on.  This example reuses the file @file{incl.m4}
8809 mentioned earlier (@pxref{Include}):
8810
8811 @comment examples
8812 @comment options: -s
8813 @example
8814 $ @kbd{m4 --synclines -I examples}
8815 include(`incl.m4')
8816 @result{}#line 1 "examples/incl.m4"
8817 @result{}Include file start
8818 @result{}foo
8819 @result{}Include file end
8820 @result{}#line 1 "stdin"
8821 @result{}
8822 @end example
8823
8824 @node Frozen files
8825 @chapter Fast loading of frozen state
8826
8827 Some bigger @code{m4} applications may be built over a common base
8828 containing hundreds of definitions and other costly initializations.
8829 Usually, the common base is kept in one or more declarative files,
8830 which are listed on each @code{m4} invocation prior to the
8831 user's input file, or else each input file uses @code{include}.
8832
8833 Reading the common base of a big application, over and over again, may
8834 be time consuming.  GNU @code{m4} offers some machinery to
8835 speed up the start of an application using lengthy common bases.
8836
8837 @menu
8838 * Using frozen files::          Using frozen files
8839 * Frozen file format 1::        Frozen file format 1
8840 * Frozen file format 2::        Frozen file format 2
8841 @end menu
8842
8843 @node Using frozen files
8844 @section Using frozen files
8845
8846 @cindex fast loading of frozen files
8847 @cindex frozen files for fast loading
8848 @cindex initialization, frozen state
8849 @cindex dumping into frozen file
8850 @cindex reloading a frozen file
8851 @cindex GNU extensions
8852 Suppose a user has a library of @code{m4} initializations in
8853 @file{base.m4}, which is then used with multiple input files:
8854
8855 @comment ignore
8856 @example
8857 $ @kbd{m4 base.m4 input1.m4}
8858 $ @kbd{m4 base.m4 input2.m4}
8859 $ @kbd{m4 base.m4 input3.m4}
8860 @end example
8861
8862 Rather than spending time parsing the fixed contents of @file{base.m4}
8863 every time, the user might instead execute:
8864
8865 @comment ignore
8866 @example
8867 $ @kbd{m4 -F base.m4f base.m4}
8868 @end example
8869
8870 @noindent
8871 once, and further execute, as often as needed:
8872
8873 @comment ignore
8874 @example
8875 $ @kbd{m4 -R base.m4f input1.m4}
8876 $ @kbd{m4 -R base.m4f input2.m4}
8877 $ @kbd{m4 -R base.m4f input3.m4}
8878 @end example
8879
8880 @noindent
8881 with the varying input.  The first call, containing the @option{-F}
8882 option, only reads and executes file @file{base.m4}, defining
8883 various application macros and computing other initializations.
8884 Once the input file @file{base.m4} has been completely processed, GNU
8885 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
8886 file which contains a kind of snapshot of the @code{m4} internal state.
8887
8888 Later calls, containing the @option{-R} option, are able to reload
8889 the internal state of @code{m4}, from @file{base.m4f},
8890 @emph{prior} to reading any other input files.  This means that,
8891 instead of starting with a pristine copy of @code{m4}, input is
8892 read after the state left by a prior run has been restored.
8893 In our example, the effect is the same as if file @file{base.m4} had
8894 been read anew, only much faster.
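
As a concrete sketch of this workflow, suppose (hypothetically) that
@file{base.m4} contains a single definition of a macro @code{greet} and
that @file{input1.m4} calls it; neither file's contents come from the
M4 distribution:

@comment ignore
@example
$ @kbd{cat base.m4}
@result{}define(`greet', `Hello, $1!')dnl
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{cat input1.m4}
@result{}greet(`world')
$ @kbd{m4 -R base.m4f input1.m4}
@result{}Hello, world!
@end example

The second @code{m4} run never reads @file{base.m4}; the definition of
@code{greet} comes from the frozen state.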
8895
8896 Only one frozen file may be created or read in any one @code{m4}
8897 invocation.  It is not possible to recover two frozen files at once.
8898 However, frozen files may be updated incrementally, through using
8899 @option{-R} and @option{-F} options simultaneously.  For example, if
8900 some care is taken, the command:
8901
8902 @comment ignore
8903 @example
8904 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
8905 @end example
8906
8907 @noindent
8908 could be broken down in the following sequence, accumulating the same
8909 output:
8910
8911 @comment ignore
8912 @example
8913 $ @kbd{m4 -F file1.m4f file1.m4}
8914 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
8915 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
8916 $ @kbd{m4 -R file3.m4f file4.m4}
8917 @end example
8918
8919 Some care is necessary because the frozen file does not save all state
8920 information.  Stacks of macro definitions via @code{pushdef} are
8921 accurately stored, along with all renamed or undefined builtins, as are
8922 the current syntax rules such as from @code{changequote}.  However, the
8923 value of @code{sysval} and text saved in @code{m4wrap} are not currently
8924 preserved.  Also, changing command line options between runs may cause
8925 unexpected behavior.  A future release of GNU M4 may improve
8926 on the quality of frozen files.
8927
8928 When an @code{m4} run is to be frozen, the automatic undiversion
8929 which takes place at end of execution is inhibited.  Instead, all
8930 positively numbered diversions are saved into the frozen file.
8931 The active diversion number is also transmitted.
8932
8933 A frozen file to be reloaded need not reside in the current directory.
8934 It is looked up the same way as an @code{include} file (@pxref{Search
8935 Path}).
8936
8937 If the frozen file was generated with a newer version of @code{m4}, and
8938 contains directives that an older @code{m4} cannot parse, attempting to
8939 load the frozen file with option @option{-R} will cause @code{m4} to
8940 exit with status 63 to indicate version mismatch.
8941
8942 @node Frozen file format 1
8943 @section Frozen file format 1
8944
8945 @cindex frozen file format 1
8946 @cindex file format, frozen file version 1
8947 Frozen files are sharable across architectures.  It is safe to write
8948 a frozen file on one machine and read it on another, given that the
8949 second machine uses the same or newer version of GNU @code{m4}.
8950 It is conventional, but not required, to give a frozen file the suffix
8951 of @code{.m4f}.
8952
8953 Older versions of GNU @code{m4} create frozen files with
8954 syntax version 1.  These files can be read by the current version, but
8955 are no longer produced.  Version 1 files are mostly text files, although
8956 any macros or diversions that contained nonprintable characters or long
8957 lines cause the resulting frozen file to do likewise, since there are no
8958 escape sequences.  The file can be edited to change the state that
8959 @code{m4} will start with.  It is composed of several directives, each
8960 starting with a single letter and ending with a newline (@key{NL}).
8961 Wherever a directive is expected, the character @samp{#} can be used
8962 instead to introduce a comment line; empty lines are also ignored if
8963 they are not part of an embedded string.
8964
8965 In the following descriptions, each @var{len} refers to the length of a
8966 corresponding subsequent @var{str}.  Numbers are always expressed in
8967 decimal, and an omitted number defaults to 0.  The valid directives in
8968 version 1 are:
8969
8970 @table @code
8971 @item V @var{number} @key{NL}
8972 Confirms the format of the file.  Version 1 is recognized when
8973 @var{number} is 1.  This directive must be the first non-comment in the
8974 file, and may not appear more than once.
8975
8976 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8977 Uses @var{str1} and @var{str2} as the begin-comment and
8978 end-comment strings.  If omitted, then @samp{#} and @key{NL} are the
8979 comment delimiters.
8980
8981 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
8982 Selects diversion @var{number}, making it current, then copies @var{str}
8983 into the current diversion.  @var{number} may be a negative number for a
8984 diversion that discards text.  To merely specify an active selection,
8985 use this command with an empty @var{str}.  With 0 as the diversion
8986 @var{number}, @var{str} will be issued on standard output at reload
8987 time.  GNU @code{m4} will not produce the @samp{D} directive
8988 with non-zero length for diversion 0, but this can be done with manual
8989 edits.  This directive may appear more than once for the same diversion,
8990 in which case the diversion is the concatenation of the various uses.
8991 If omitted, then diversion 0 is current.
8992
8993 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8994 Defines, through @code{pushdef}, a definition for @var{str1} expanding
8995 to the function whose builtin name is @var{str2}.  If the builtin does
8996 not exist (for example, if the frozen file was produced by a copy of
8997 @code{m4} compiled with the now-abandoned @code{changeword} support),
8998 the reload is silent, but any subsequent use of the definition of
8999 @var{str1} will result in a warning.  This directive may appear more
9000 than once for the same name, and its order, along with @samp{T}, is
9001 important.  If omitted, you will have no access to any builtins.
9002
9003 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
9004 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
9005 strings.  If omitted, then @samp{`} and @samp{'} are the quote
9006 delimiters.
9007
9008 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
9009 Defines, through @code{pushdef}, a definition for @var{str1}
9010 expanding to the text given by @var{str2}.  This directive may appear
9011 more than once for the same name, and its order, along with @samp{F}, is
9012 important.
9013 @end table
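
To make the grammar concrete, here is a small hand-written sketch of a
version 1 file, assembled from the directives above rather than produced
by @code{m4}; the macro name @code{greet} is hypothetical.  The @samp{F}
directive keeps the builtin @code{define} available under its usual
name, and the @samp{T} directive defines @code{greet} with the
nine-character text @samp{Hello, $1}.  The @samp{Q}, @samp{C} and
@samp{D} directives are omitted, so the default quotes, comment
delimiters and diversion 0 apply:

@comment ignore
@example
# Hand-written sketch of a version 1 frozen file
V1
F6,6
definedefine
T5,9
greetHello, $1
@end example

Each pair of lengths tells the reader where @var{str1} ends and
@var{str2} begins, since the two strings are not otherwise separated.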
9014
9015 When loading format 1, the syntax categories @samp{@{} and @samp{@}} are
9016 disabled (reverting braces to be treated like plain characters).  This
9017 is because frozen files created with M4 1.4.x did not understand
9018 @samp{$@{@dots{}@}} extended argument notation, and a frozen macro that
9019 contained this character sequence should not behave differently just
9020 because a newer version of M4 reloaded the file.
9021
9022 @node Frozen file format 2
9023 @section Frozen file format 2
9024
9025 @cindex frozen file format 2
9026 @cindex file format, frozen file version 2
9027 The syntax of version 1 has some drawbacks; if any macro or diversion
9028 contained non-printable characters or long lines, the resulting frozen
9029 file would not qualify as a text file, making it harder to edit with
9030 some vendor tools.  The concatenation of multiple strings on a single
9031 line, such as for the @samp{T} directive, makes distinguishing the two
9032 strings a bit more difficult.  Finally, the format lacks support for
9033 several items of @code{m4} state, such that a reloaded file did not
9034 always behave the same as the original file.
9035
9036 These shortcomings have been addressed in version 2 of the frozen file
9037 syntax.  New directives have been added, and existing directives have
9038 additional, and sometimes optional, parameters.  All @var{str} instances
9039 in the grammar are now followed by @key{NL}, which makes the split
9040 between consecutive strings easier to recognize.  Strings may now
9041 contain escape sequences modeled after C, such as @samp{\n} for newline
9042 or @samp{\0} for @sc{nul}, so that the frozen file can be pure
9043 @sc{ascii} (although when hand-editing a frozen file, it is still
9044 acceptable to use the original byte rather than an escape sequence for
9045 all bytes except @samp{\}).  Also in the context of a @var{str}, the
9046 escape sequence @samp{\@key{NL}} is discarded, allowing a user to split
9047 lines that are too long for some platform tools.
9048
9049 @table @code
9050 @item V @var{number} @key{NL}
9051 Confirms the format of the file.  @code{m4} @value{VERSION} only creates
9052 frozen files where @var{number} is 2.  This directive must be the first
9053 non-comment in the file, and may not appear more than once.
9054
9055 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9056 Uses @var{str1} and @var{str2} as the begin-comment and
9057 end-comment strings.  If omitted, then @samp{#} and @key{NL} are the
9058 comment delimiters.
9059
9060 @item d @var{len} @key{NL} @var{str} @key{NL}
9061 Sets the debug flags, using @var{str} as the argument to
9062 @code{debugmode}.  If omitted, then the debug flags start in their
9063 default disabled state.
9064
9065 @item D @var{number} , @var{len} @key{NL} @var{str} @key{NL}
9066 Selects diversion @var{number}, making it current, then copies @var{str}
9067 into the current diversion.  @var{number} may be a negative number for a
9068 diversion that discards text.  To merely specify an active selection,
9069 use this directive with an empty @var{str}.  With 0 as the diversion
9070 @var{number}, @var{str} will be issued on standard output at reload
9071 time.  GNU @code{m4} will not produce the @samp{D} directive
9072 with non-zero length for diversion 0, but this can be done with manual
9073 edits.  This directive may appear more than once for the same diversion,
9074 in which case the diversion is the concatenation of the various uses.
9075 If omitted, then diversion 0 is current.
9076
9077 @comment FIXME - the first usage, with only one string, is not supported
9078 @comment in the current code
9079 @c @item F @var{len1} @key{NL} @var{str1} @key{NL}
9080 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9081 @itemx F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9082 Defines, through @code{pushdef}, a definition for @var{str1} expanding
9083 to the function whose builtin name is given by @var{str2} (defaulting to
9084 @var{str1} if not present).  With two arguments, the builtin name is
9085 searched for among the intrinsic builtin functions only; with three
9086 arguments, the builtin name is searched for amongst the builtin
9087 functions defined by the module named by @var{str3}.
9088
9089 @item M @var{len} @key{NL} @var{str} @key{NL}
9090 Names a module which will be searched for according to the module search
9091 path and loaded.  Modules loaded from a frozen file don't add their
9092 builtin entries to the symbol table.  Modules must be loaded prior to
9093 specifying module-specific builtins via the three-argument @code{F} or
9094 @code{T}.
9095
9096 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9097 Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings.
9098 If omitted, then @samp{`} and @samp{'} are the quote delimiters.
9099
9100 @item R @var{len} @key{NL} @var{str} @key{NL}
9101 Sets the default regexp syntax, where @var{str} encodes one of the
9102 regular expression syntaxes supported by GNU M4.
9103 @xref{Changeresyntax}, for more details.
9104
9105 @item S @var{syntax-code} @var{len} @key{NL} @var{str} @key{NL}
9106 Defines, through @code{changesyntax}, a syntax category for each of the
9107 characters in @var{str}.  The @var{syntax-code} must be one of the
9108 characters described in @ref{Changesyntax}.
9109
9110 @item t @var{len} @key{NL} @var{str} @key{NL}
9111 Enables tracing for any macro named @var{str}, similar to using the
9112 @code{traceon} builtin.  This directive may occur more than once for
9113 multiple macros; if omitted, no macro starts out as traced.
9114
9115 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9116 @itemx T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9117 Defines, through @code{pushdef}, a definition for @var{str1} expanding to
9118 the text given by @var{str2}.  This directive may appear more than once
9119 for the same name, and its order, along with @samp{F}, is important.  If
9120 present, the optional third argument associates the macro with a module
9121 named by @var{str3}.
9122 @end table
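
As a rough illustration of the layout described above, a small
hand-written version 2 frozen file might look like the following sketch.
The macro name @samp{foo} and its text are invented for this example,
the closing comment line is only a marker, and the fragment has not
been checked against the reload code; it assumes that each length is
the byte count of the corresponding string, and that every string
starts on its own line.

@comment ignore
@example
# Hand-written sketch of a version 2 frozen file
V2
Q1,1
`
'
T3,11
foo
hello world
F6,6
define
define
D0,0

# End of sketch
@end example

@noindent
Here @samp{T} defines @samp{foo} as the text @samp{hello world},
@samp{F} maps the name @samp{define} to the builtin of the same name,
and the final @samp{D} directive carries an empty string, merely
selecting diversion 0 as current.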
9123
9124 @node Compatibility
9125 @chapter Compatibility with other versions of @code{m4}
9126
9127 @cindex compatibility
9128 This chapter describes many of the differences between this
9129 implementation of @code{m4} and other implementations found under
9130 UNIX, such as System V Release 4, Solaris, and BSD flavors.
9131 In particular, it lists the known differences and extensions to
9132 POSIX.  However, the list is not necessarily comprehensive.
9133
9134 At the time of this writing, POSIX 2001 (also known as IEEE
9135 Std 1003.1-2001) is the latest standard, although a new version of
9136 POSIX is under development and includes several proposals for
9137 modifying what @code{m4} is required to do.  The requirements for
9138 @code{m4} are shared between SUSv3 and POSIX, and
9139 can be viewed at
9140 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
9141
9142 @menu
9143 * Extensions::                  Extensions in GNU M4
9144 * Incompatibilities::           Other incompatibilities
9145 * Experiments::                 Experimental features in GNU M4
9146 @end menu
9147
9148 @node Extensions
9149 @section Extensions in GNU M4
9150
9151 @cindex GNU extensions
9152 @cindex POSIX
9153 @cindex @env{POSIXLY_CORRECT}
9154 This version of @code{m4} contains a few facilities that do not exist
9155 in System V @code{m4}.  These extra facilities are all suppressed by
9156 using the @option{-G} command line option, unless overridden by other
9157 command line options.
9158 Most of these extensions are compatible with
9159 @uref{http://www.unix.org/single_unix_specification/,
9160 POSIX}; the few exceptions are suppressed if the
9161 @env{POSIXLY_CORRECT} environment variable is set.
9162
9163 @itemize @bullet
9164 @item
9165 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
9166 several digits, while the System V @code{m4} only accepts one digit.
9167 This allows macros in GNU @code{m4} to take any number of
9168 arguments, and not only nine (@pxref{Arguments}).
9169 POSIX does not allow this extension, so it is disabled if
9170 @env{POSIXLY_CORRECT} is set.
9171 @c FIXME - update this bullet when ${11} is implemented.
9172
9173 @item
9174 The @code{divert} (@pxref{Divert}) macro can manage more than 9
9175 diversions.  GNU @code{m4} treats all positive numbers as valid
9176 diversions, rather than discarding diversions greater than 9.
9177
9178 @item
9179 Files included with @code{include} and @code{sinclude} are sought in a
9180 user-specified search path, if they are not found in the working
9181 directory.  The search path is specified by the @option{-I} option and the
9182 @env{M4PATH} environment variable (@pxref{Search Path}).
9183
9184 @item
9185 Arguments to @code{undivert} can be non-numeric, in which case the named
9186 file will be included uninterpreted in the output (@pxref{Undivert}).
9187
9188 @item
9189 Formatted output is supported through the @code{format} builtin, which
9190 is modeled after the C library function @code{printf} (@pxref{Format}).
9191
9192 @item
9193 Searches and text substitution through regular expressions are supported
9194 by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
9195 (@pxref{Patsubst}) builtins.
9196
9197 The syntax of regular expressions in M4 has never been clearly
9198 formalized.  While OpenBSD M4 uses extended regular
9199 expressions for @code{regexp} and @code{patsubst}, GNU M4
9200 defaults to basic regular expressions, but provides
9201 @code{changeresyntax} (@pxref{Changeresyntax}) to change the flavor of
9202 regular expression syntax in use.
9203
9204 @item
9205 The output of shell commands can be read into @code{m4} with
9206 @code{esyscmd} (@pxref{Esyscmd}).
9207
9208 @item
9209 There is indirect access to any builtin macro with @code{builtin}
9210 (@pxref{Builtin}).
9211
9212 @item
9213 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
9214
9215 @item
9216 The name of the program, the current input file, and the current input
9217 line number are accessible through the builtins @code{@w{__program__}},
9218 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
9219
9220 @item
9221 The generation of sync lines can be controlled through @code{syncoutput}
9222 (@pxref{Syncoutput}).
9223
9224 @item
9225 The format of the output from @code{dumpdef} and macro tracing can be
9226 controlled with @code{debugmode} (@pxref{Debugmode}).
9227
9228 @item
9229 The destination of trace and debug output can be controlled with
9230 @code{debugfile} (@pxref{Debugfile}).
9231
9232 @item
9233 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
9234 creating a new file with a unique name on every invocation, rather than
9235 following the insecure behavior of replacing the trailing @samp{X}
9236 characters with the @code{m4} process id.  POSIX does not
9237 allow this extension, so @code{maketemp} is insecure if
9238 @env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in
9239 the first place.
9240
9241 @item
9242 POSIX only requires support for the command line options
9243 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
9244 by GNU M4 are extensions.  @xref{Invoking m4}, for a
9245 description of these options.
9246
9247 @item
9248 The debugging and tracing facilities in GNU @code{m4} are much
9249 more extensive than in most other versions of @code{m4}.
9250
9251 @item
9252 Some traditional implementations only allow reading standard input
9253 once, but GNU @code{m4} correctly handles multiple instances
9254 of @samp{-} on the command line.
9255
9256 @item
9257 POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
9258 (first-in, first-out) order, and most other implementations obey this.
9259 However, versions of GNU @code{m4} earlier than 1.6 used
9260 LIFO order.  Furthermore, POSIX states that only the first
9261 argument to @code{m4wrap} is saved for later evaluation, but
9262 GNU @code{m4} saves and processes all arguments, with output
9263 separated by spaces.
9264
9265 @item
9266 POSIX states that builtins that require arguments, but are
9267 called without arguments, have undefined behavior.  Traditional
9268 implementations simply behave as though empty strings had been passed.
9269 For example, @code{a`'define`'b} would expand to @code{ab}.  But
9270 GNU @code{m4} ignores certain builtins if they have missing
9271 arguments, giving @code{adefineb} for the above example.
9272 @end itemize
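
A short session can make a few of the extensions listed above more
concrete.  This is only a sketch, assuming that @env{POSIXLY_CORRECT}
is not set and that @option{-G} is not in effect; the macro name
@code{tenth} is invented for the illustration.

@comment ignore
@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
format(`%05d', `42')
@result{}00042
regexp(`hello', `ll')
@result{}2
indir(`len', `abc')
@result{}3
builtin(`eval', `1 + 2')
@result{}3
@end example

@noindent
An implementation that only accepts one digit after @samp{$} would
instead treat @code{$10} as the first argument followed by a literal
@samp{0}.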
9273
9274 @node Incompatibilities
9275 @section Other incompatibilities
9276
9277 There are a few other incompatibilities between this implementation of
9278 @code{m4}, and what POSIX requires, or what the System V
9279 version implemented.
9280
9281 @itemize @bullet
9282 @item
9283 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
9284 by undefining the entire stack of previous definitions, as if doing
9285 @code{undefine(`f')} first.  GNU @code{m4} replaces just the top
9286 definition on the stack, as if doing @code{popdef(`f')} followed by
9287 @code{pushdef(`f',`1')}.  POSIX allows either behavior.
9288
9289 @item
9290 At one point, POSIX required @code{changequote(@var{arg})}
9291 (@pxref{Changequote}) to use newline as the close quote, but this was a
9292 bug, and the next version of POSIX is anticipated to state
9293 that using empty strings or just one argument is unspecified.
9294 Meanwhile, the GNU @code{m4} behavior of treating an empty
9295 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
9296 repeating the start-quote delimiter, and BSD treats it as leaving the
9297 previous end-quote delimiter unchanged.  For predictable results, never
9298 call @code{changequote} with just one argument, or with empty strings for
9299 arguments.
9300
9301 @item
9302 At one point, POSIX required @code{changecom(@var{arg},)}
9303 (@pxref{Changecom}) to make it impossible to end a comment, but this was
9304 a bug, and the next version of POSIX is anticipated to state
9305 that using empty strings is unspecified.  Meanwhile, the GNU
9306 @code{m4} behavior of treating an empty end-comment delimiter as newline
9307 is not portable, as BSD treats it as leaving the previous end-comment
9308 delimiter unchanged.  It is also impossible in BSD implementations to
9309 disable comments, even though that is required by POSIX.  For
9310 predictable results, never call @code{changecom} with empty strings for
9311 arguments.
9312
9313 @item
9314 Traditional implementations allow argument collection, but not string
9315 and comment processing, to span file boundaries.  Thus, if @file{a.m4}
9316 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
9317 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
9318 gives an error message that the end of file was encountered inside a
9319 macro with GNU @code{m4}.  On the other hand, traditional
9320 implementations do end of file processing for files included with
9321 @code{include} or @code{sinclude} (@pxref{Include}), while GNU
9322 @code{m4} seamlessly integrates the content of those files.  Thus
9323 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
9324 giving an error.
9325
9326 @item
9327 POSIX requires @code{eval} (@pxref{Eval}) to treat all
9328 operators with the same precedence as C@.  However, earlier versions of
9329 GNU @code{m4} followed the traditional behavior of other
9330 @code{m4} implementations, where bitwise and logical negation (@samp{~}
9331 and @samp{!}) had lower precedence than equality operators; and where
9332 equality operators (@samp{==} and @samp{!=}) had the same precedence as
9333 relational operators (such as @samp{<}).  Use explicit parentheses to
9334 ensure proper precedence.  As extensions to POSIX,
9335 GNU @code{m4} gives well-defined semantics to operations that
9336 C leaves undefined, such as when overflow occurs, when shifting negative
9337 numbers, or when performing division by zero.  POSIX also
9338 requires @samp{=} to cause an error, but many traditional
9339 implementations allowed it as an alias for @samp{==}.
9340
9341 @item
9342 POSIX 2001 requires @code{translit} (@pxref{Translit}) to
9343 treat each character of the second and third arguments literally.
9344 However, it is anticipated that the next version of POSIX will
9345 allow the GNU @code{m4} behavior of treating @samp{-} as a
9346 range operator.
9347
9348 @item
9349 POSIX requires @code{m4} to honor the locale environment
9350 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
9351 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
9352 implemented in GNU @code{m4}.
9353
9354 @item
9355 GNU @code{m4} implements sync lines differently from System V
9356 @code{m4}, when text is being diverted.  GNU @code{m4} outputs
9357 the sync lines when the text is being diverted, and System V @code{m4}
9358 when the diverted text is being brought back.
9359
9360 The problem is which lines and file names should be attached to text
9361 that is being, or has been, diverted.  System V @code{m4} regards all
9362 the diverted text as being generated by the source line containing the
9363 @code{undivert} call, whereas GNU @code{m4} regards the
9364 diverted text as being generated at the time it is diverted.
9365
9366 The sync line option is used mostly when using @code{m4} as
9367 a front end to a compiler.  If a diverted line causes a compiler error,
9368 the error messages should most probably refer to the place where the
9369 diversion was made, and not where it was inserted again.
9370
9371 @comment options: -s
9372 @example
9373 divert(2)2
9374 divert(1)1
9375 divert`'0
9376 @result{}#line 3 "stdin"
9377 @result{}0
9378 ^D
9379 @result{}#line 2 "stdin"
9380 @result{}1
9381 @result{}#line 1 "stdin"
9382 @result{}2
9383 @end example
9384
9385 @comment FIXME - this needs to be fixed before 2.0.
9386 The current @code{m4} implementation has a limitation that the syncline
9387 output at the start of each diversion occurs no matter what, even if the
9388 previous diversion did not end with a newline.  This goes contrary to
9389 the claim that synclines appear on a line by themselves, so this
9390 limitation may be corrected in a future version of @code{m4}.  In the
9391 meantime, when using @option{-s}, it is wisest to make sure all
9392 diversions end with newline.
9393
9394 @item
9395 GNU @code{m4} makes no attempt at prohibiting self-referential
9396 definitions like:
9397
9398 @comment ignore
9399 @example
9400 define(`x', `x')
9401 @result{}
9402 define(`x', `x ')
9403 @result{}
9404 @end example
9405
9406 @cindex rescanning
9407 There is nothing inherently wrong with defining @samp{x} to
9408 return @samp{x}.  The wrong thing is to expand @samp{x} unquoted,
9409 because that would cause an infinite rescan loop.
9410 In @code{m4}, one might use macros to hold strings, as we do for
9411 variables in other programming languages, further checking them with:
9412
9413 @comment ignore
9414 @example
9415 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
9416 @end example
9417
9418 @noindent
9419 In cases like this one, an interdiction for a macro to hold its own name
9420 would be a useless limitation.  Of course, this leaves more rope for the
9421 GNU @code{m4} user to hang himself!  Rescanning hangs may be
9422 avoided through careful programming, a little like for endless loops in
9423 traditional programming languages.
9424
9425 @item
9426 POSIX states that only unquoted leading newlines and blanks
9427 (that is, space and tab) are ignored when collecting macro arguments.
9428 However, this appears to be a bug in POSIX, since most
9429 traditional implementations also ignore all whitespace (formfeed,
9430 carriage return, and vertical tab).  GNU @code{m4} follows
9431 tradition and ignores all leading unquoted whitespace.
9432 @end itemize
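
Two of the items above lend themselves to a brief demonstration.  The
following sketch first shows the GNU @code{m4} treatment of
@code{define} on a macro with a stack of definitions, and then shows
how explicit parentheses in @code{eval} give the same answer regardless
of which precedence rules an implementation follows for @samp{!}.  The
macro name @code{f} and the values used are arbitrary.

@comment ignore
@example
pushdef(`f', `old')
@result{}
pushdef(`f', `newer')
@result{}
define(`f', `1')
@result{}
f
@result{}1
popdef(`f')f
@result{}old
eval(`(!2) == 1')
@result{}0
eval(`!(2 == 1)')
@result{}1
@end example

@noindent
With a traditional implementation that undefines the whole stack, the
@code{popdef} line would be expected to output @samp{f} rather than
@samp{old}.  Without the parentheses, @samp{!2 == 1} could yield either
of the two @code{eval} results, depending on the precedence in use.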
9433
9434 @node Experiments
9435 @section Experimental features in GNU M4
9436
9437 Certain features of GNU @code{m4} are experimental.
9438
9439 Some are only available if activated by an option given to
9440 @file{m4-@value{VERSION}/@/configure} at GNU @code{m4} installation
9441 time.  The functionality
9442 might change or even go away in the future.  @emph{Do not rely on it}.
9443 Please direct your comments about it the same way you would do for bugs.
9444
9445 @section Changesyntax
9446
9447 An experimental feature, which improves the flexibility of @code{m4},
9448 allows for changing the way the input is parsed (@pxref{Changesyntax}).
9449 No compile time option is needed for @code{changesyntax}.  The
9450 implementation is careful to not slow down @code{m4} parsing, unlike the
9451 withdrawn experiment of @code{changeword} that appeared earlier in M4
9452 1.4.x.
9453
9454 @section Multiple precision arithmetic
9455
9456 Another experimental feature, which would improve @code{m4} usefulness,
9457 allows for multiple precision rational arithmetic similar to
9458 @code{eval}.  You must have the GNU multi-precision (gmp)
9459 library installed, and should use @kbd{./configure --with-gmp} if you
9460 want this feature compiled in.  The current implementation is unproven
9461 and might go away.  Do not count on it yet.
9462
9463 @node Answers
9464 @chapter Correct version of some examples
9465
9466 Some of the examples in this manual are buggy or not very robust, for
9467 demonstration purposes.  Improved versions of these composite macros are
9468 presented here.
9469
9470 @menu
9471 * Improved exch::               Solution for @code{exch}
9472 * Improved forloop::            Solution for @code{forloop}
9473 * Improved foreach::            Solution for @code{foreach}
9474 * Improved copy::               Solution for @code{copy}
9475 * Improved m4wrap::             Solution for @code{m4wrap}
9476 * Improved cleardivert::        Solution for @code{cleardivert}
9477 * Improved capitalize::         Solution for @code{capitalize}
9478 * Improved fatal_error::        Solution for @code{fatal_error}
9479 @end menu
9480
9481 @node Improved exch
9482 @section Solution for @code{exch}
9483
9484 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
9485 to double quote their arguments.  A nicer definition, which lets
9486 clients follow the rule of thumb of one level of quoting per level of
9487 parentheses, involves adding quotes in the definition of @code{exch}, as
9488 follows:
9489
9490 @example
9491 define(`exch', ``$2', `$1'')
9492 @result{}
9493 define(exch(`expansion text', `macro'))
9494 @result{}
9495 macro
9496 @result{}expansion text
9497 @end example
9498
9499 @node Improved forloop
9500 @section Solution for @code{forloop}
9501
9502 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
9503 into an infinite loop if given an iterator that is not parsed as a macro
9504 name.  It does not do any sanity checking on its numeric bounds, and
9505 only permits decimal numbers for bounds.  Here is an improved version,
9506 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
9507 version also optimizes overhead by calling four macros instead of six
9508 per iteration (excluding those in @var{text}), by not dereferencing the
9509 @var{iterator} in the helper @code{@w{_forloop}}.
9510
9511 @comment examples
9512 @example
9513 $ @kbd{m4 -I examples}
9514 undivert(`forloop2.m4')dnl
9515 @result{}divert(`-1')
9516 @result{}# forloop(var, from, to, stmt) - improved version:
9517 @result{}#   works even if VAR is not a strict macro name
9518 @result{}#   performs sanity check that FROM is larger than TO
9519 @result{}#   allows complex numerical expressions in TO and FROM
9520 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9521 @result{}  `pushdef(`$1')_$0(`$1', eval(`$2'),
9522 @result{}    eval(`$3'), `$4')popdef(`$1')')')
9523 @result{}define(`_forloop',
9524 @result{}  `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
9525 @result{}    `$0(`$1', incr(`$2'), `$3', `$4')')')
9526 @result{}divert`'dnl
9527 include(`forloop2.m4')
9528 @result{}
9529 forloop(`i', `2', `1', `no iteration occurs')
9530 @result{}
9531 forloop(`', `1', `2', ` odd iterator name')
9532 @result{} odd iterator name odd iterator name
9533 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
9534 @result{} 0xa 0xb 0xc
9535 forloop(`i', `a', `b', `non-numeric bounds')
9536 @error{}m4:stdin:6: warning: eval: bad input: '(a) <= (b)'
9537 @result{}
9538 @end example
9539
9540 One other change to notice is that the improved version used @samp{_$0}
9541 rather than @samp{_forloop} to invoke the helper routine.  In general,
9542 this is a good practice to follow, because then the set of macros can be
9543 uniformly transformed.  The following example shows a transformation
9544 that doubles the current quoting and appends a suffix @samp{2} to each
9545 transformed macro.  If @code{forloop} refers to the literal
9546 @samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
9547 the intended @code{_forloop2}, and the mixing of quoting paradigms leads
9548 to an infinite recursion loop in this example.
9549
9550 @comment options: -L9
9551 @comment status: 1
9552 @comment examples
9553 @example
9554 $ @kbd{m4 -d -L 9 -I examples}
9555 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
9556 @result{}
9557 define(`double', `define(`$1'`2',
9558   arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
9559 @result{}
9560 double(`forloop')double(`_forloop')defn(`forloop2')
9561 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
9562 @result{}  ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
9563 @result{}    eval(``$3''), ``$4'')popdef(``$1'')'')
9564 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9565 @result{}
9566 changequote(`[', `]')changequote([``], [''])
9567 @result{}
9568 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9569 @result{}
9570 changequote`'include(`forloop.m4')
9571 @result{}
9572 double(`forloop')double(`_forloop')defn(`forloop2')
9573 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
9574 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9575 @result{}
9576 changequote(`[', `]')changequote([``], [''])
9577 @result{}
9578 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9579 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
9580 @end example
9581
9582 One more optimization is still possible.  Instead of repeatedly
9583 assigning a variable then invoking or dereferencing it, it is possible
9584 to pass the current iterator value as a single argument.  Coupled with
9585 @code{curry} if other arguments are needed (@pxref{Composition}), or
9586 with helper macros if the argument is needed in more than one place in
9587 the expansion, the output can be generated with three, rather than four,
9588 macros of overhead per iteration.  Notice how the file
9589 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
9590 arguments of the helper @code{_forloop} to take two arguments that are
9591 placed around the current value.  By splitting a balanced set of
9592 parentheses across multiple arguments, the helper macro can now be
9593 shared by @code{forloop} and the new @code{forloop_arg}.
9594
9595 @comment examples
9596 @example
9597 $ @kbd{m4 -I examples}
9598 include(`forloop3.m4')
9599 @result{}
9600 undivert(`forloop3.m4')dnl
9601 @result{}divert(`-1')
9602 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
9603 @result{}#   each value between FROM and TO, without define overhead
9604 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
9605 @result{}  `_forloop(`$1', eval(`$2'), `$3(', `)')')')
9606 @result{}# forloop(var, from, to, stmt) - refactored to share code
9607 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9608 @result{}  `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
9609 @result{}    `define(`$1',', `)$4')popdef(`$1')')')
9610 @result{}define(`_forloop',
9611 @result{}  `$3`$1'$4`'ifelse(`$1', `$2', `',
9612 @result{}    `$0(incr(`$1'), `$2', `$3', `$4')')')
9613 @result{}divert`'dnl
9614 forloop(`i', `1', `3', ` i')
9615 @result{} 1 2 3
9616 define(`echo', `$@@')
9617 @result{}
9618 forloop_arg(`1', `3', ` echo')
9619 @result{} 1 2 3
9620 include(`curry.m4')
9621 @result{}
9622 forloop_arg(`1', `3', `curry(`pushdef', `a')')
9623 @result{}
9624 a
9625 @result{}3
9626 popdef(`a')a
9627 @result{}2
9628 popdef(`a')a
9629 @result{}1
9630 popdef(`a')a
9631 @result{}a
9632 @end example
9633
9634 Of course, it is possible to make even more improvements, such as
9635 adding an optional step argument, or allowing iteration through
9636 descending sequences.  GNU Autoconf provides some of these
9637 additional bells and whistles in its @code{m4_for} macro.
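
As one possible illustration of such an improvement, the following
sketch adds a step argument in the style of @file{forloop2.m4}.  The
name @code{forloop_step} and its argument order are invented here and
are not shipped with GNU M4; the sketch only handles ascending
sequences with a positive step, and produces no iterations otherwise.

@comment ignore
@example
define(`forloop_step', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forloop_step(`i', `1', `3', `0', `no iteration occurs')
@result{}
@end example

@noindent
The helper recurses only while the next value stays within the upper
bound, so the final value need not land exactly on @var{to}.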
9638
9639 @node Improved foreach
9640 @section Solution for @code{foreach}
9641
9642 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
9643 presented earlier each have flaws.  First, we will examine and fix the
9644 quadratic behavior of @code{foreachq}:
9645
9646 @comment examples
9647 @example
9648 $ @kbd{m4 -I examples}
9649 include(`foreachq.m4')
9650 @result{}
9651 traceon(`shift')debugmode(`aq')
9652 @result{}
9653 foreachq(`x', ``1', `2', `3', `4'', `x
9654 ')dnl
9655 @result{}1
9656 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9657 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9658 @result{}2
9659 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9660 @error{}m4trace: -3- shift(`2', `3', `4')
9661 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9662 @error{}m4trace: -2- shift(`2', `3', `4')
9663 @result{}3
9664 @error{}m4trace: -5- shift(`1', `2', `3', `4')
9665 @error{}m4trace: -4- shift(`2', `3', `4')
9666 @error{}m4trace: -3- shift(`3', `4')
9667 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9668 @error{}m4trace: -3- shift(`2', `3', `4')
9669 @error{}m4trace: -2- shift(`3', `4')
9670 @result{}4
9671 @error{}m4trace: -6- shift(`1', `2', `3', `4')
9672 @error{}m4trace: -5- shift(`2', `3', `4')
9673 @error{}m4trace: -4- shift(`3', `4')
9674 @error{}m4trace: -3- shift(`4')
9675 @end example
9676
9677 @cindex quadratic behavior, avoiding
9678 @cindex avoiding quadratic behavior
9679 Each successive iteration was adding more quoted @code{shift}
9680 invocations, and the entire list contents were passing through every
9681 iteration.  In general, when recursing, it is a good idea to make the
9682 recursion use fewer arguments, rather than adding additional quoted
9683 uses of @code{shift}.  By doing so, @code{m4} uses less memory, invokes
9684 fewer macros, is less likely to run into machine limits, and most
9685 importantly, performs faster.  The fixed version of @code{foreachq} can
9686 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
9687
9688 @comment examples
9689 @example
9690 $ @kbd{m4 -I examples}
9691 include(`foreachq2.m4')
9692 @result{}
9693 undivert(`foreachq2.m4')dnl
9694 @result{}include(`quote.m4')dnl
9695 @result{}divert(`-1')
9696 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9697 @result{}#   quoted list, improved version
9698 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
9699 @result{}define(`_arg1q', ``$1'')
9700 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
9701 @result{}define(`_foreachq', `ifelse(`$2', `', `',
9702 @result{}  `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
9703 @result{}divert`'dnl
9704 traceon(`shift')debugmode(`aq')
9705 @result{}
9706 foreachq(`x', ``1', `2', `3', `4'', `x
9707 ')dnl
9708 @result{}1
9709 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9710 @result{}2
9711 @error{}m4trace: -3- shift(`2', `3', `4')
9712 @result{}3
9713 @error{}m4trace: -3- shift(`3', `4')
9714 @result{}4
9715 @end example
9716
9717 Note that the fixed version calls unquoted helper macros in
9718 @code{@w{_foreachq}} to trim elements immediately; those helper macros
9719 in turn must re-supply the layer of quotes lost in the macro invocation.
9720 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
9721 element, with @code{@w{_arg1}} of the earlier implementation that
9722 returned the first list element directly.  Additionally, by calling the
9723 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
9724 contains unexpanded macros.
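
The difference between the two helpers can be seen in isolation.  This
sketch assumes that the earlier @code{_arg1} was defined as plain
@samp{$1}; the macro @code{active} is invented for the demonstration.

@comment ignore
@example
define(`_arg1', `$1')
@result{}
define(`_arg1q', ``$1'')
@result{}
define(`active', `ACT')
@result{}
_arg1(`active', `b')
@result{}ACT
_arg1q(`active', `b')
@result{}active
@end example

@noindent
Because @code{_arg1q} re-supplies one layer of quotes, its result
survives rescanning as literal text instead of being expanded again.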
9725
9726 The astute m4 programmer might notice that the solution above still uses
9727 more macro invocations than strictly necessary.  Note that @samp{$2},
9728 which contains an arbitrarily long quoted list, is expanded and
9729 rescanned three times per iteration of @code{_foreachq}.  Furthermore,
9730 every iteration of the algorithm effectively unboxes then reboxes the
9731 list, which costs a couple of macro invocations.  It is possible to
9732 rewrite the algorithm by swapping the order of the arguments to
9733 @code{_foreachq} in order to operate on an unboxed list in the first
9734 place, and by using the fixed-length @samp{$#} instead of an arbitrary
9735 length list as the key to end recursion.  The result is an overhead of
9736 six macro invocations per loop (excluding any macros in @var{text}),
9737 instead of eight.  This alternative approach is available as
9738 @file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
9739
9740 @comment examples
9741 @example
9742 $ @kbd{m4 -I examples}
9743 include(`foreachq3.m4')
9744 @result{}
9745 undivert(`foreachq3.m4')dnl
9746 @result{}divert(`-1')
9747 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9748 @result{}#   quoted list, alternate improved version
9749 @result{}define(`foreachq', `ifelse(`$2', `', `',
9750 @result{}  `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
9751 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
9752 @result{}  `define(`$1', `$4')$2`'$0(`$1', `$2',
9753 @result{}    shift(shift(shift($@@))))')')
9754 @result{}divert`'dnl
9755 traceon(`shift')debugmode(`aq')
9756 @result{}
9757 foreachq(`x', ``1', `2', `3', `4'', `x
9758 ')dnl
9759 @result{}1
9760 @error{}m4trace: -4- shift(`x', `x
9761 @error{}', `', `1', `2', `3', `4')
9762 @error{}m4trace: -3- shift(`x
9763 @error{}', `', `1', `2', `3', `4')
9764 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
9765 @result{}2
9766 @error{}m4trace: -4- shift(`x', `x
9767 @error{}', `1', `2', `3', `4')
9768 @error{}m4trace: -3- shift(`x
9769 @error{}', `1', `2', `3', `4')
9770 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9771 @result{}3
9772 @error{}m4trace: -4- shift(`x', `x
9773 @error{}', `2', `3', `4')
9774 @error{}m4trace: -3- shift(`x
9775 @error{}', `2', `3', `4')
9776 @error{}m4trace: -2- shift(`2', `3', `4')
9777 @result{}4
9778 @error{}m4trace: -4- shift(`x', `x
9779 @error{}', `3', `4')
9780 @error{}m4trace: -3- shift(`x
9781 @error{}', `3', `4')
9782 @error{}m4trace: -2- shift(`3', `4')
9783 @end example
9784
9785 Prior to M4 1.6, every instance of @samp{$@@} was rescanned as it was
9786 encountered.  Thus, the @file{foreachq3.m4} alternative used much less
9787 memory than @file{foreachq2.m4}, and executed as much as 10% faster,
9788 since each iteration encountered fewer @samp{$@@}.  However, the
9789 implementation of rescanning every byte in @samp{$@@} was quadratic in
9790 the number of bytes scanned (for example, making the broken version in
9791 @file{foreachq.m4} cubic, rather than quadratic, in behavior).  Once the
9792 underlying M4 implementation was improved in 1.6 to reuse results of
previous scans, both styles of @code{foreachq} became linear in the
9794 number of bytes scanned, but the @file{foreachq3.m4} version remains
9795 noticeably faster because of fewer macro invocations.  Notice how the
9796 implementation injects an empty argument prior to expanding @samp{$2}
9797 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
9798 the third argument altogether, and ends recursion when there are three
9799 arguments left because there was nothing left to pass through
9800 @code{shift}.  Thus, each iteration only needs one @code{ifelse}, rather
9801 than the two conditionals used in the version from @file{foreachq2.m4}.
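
As a quick check of the termination rule, the following session (a
minimal sketch that assumes the shipped @file{foreachq3.m4} is on the
include path) runs @code{foreachq} on an empty list, a one-element
list, and a three-element list:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq3.m4')
@result{}
foreachq(`x', `', `<x>')
@result{}
foreachq(`x', ``a'', `<x>')
@result{}<a>
foreachq(`x', ``a', `b', `c'', `<x>')
@result{}<a><b><c>
@end example

In the one-element case, the first call to @code{_foreachq} sees four
arguments (the injected empty argument is the third), and the recursive
call sees exactly three, which ends the loop.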
9802
9803 @cindex nine arguments, more than
9804 @cindex more than nine arguments
9805 @cindex arguments, more than nine
9806 So far, all of the implementations of @code{foreachq} presented have
9807 been quadratic with M4 1.4.x.  But @code{forloop} is linear, because
each iteration parses a constant number of arguments.  So, it is
9809 possible to design a variant that uses @code{forloop} to do the
9810 iteration, then uses @samp{$@@} only once at the end, giving a linear
9811 result even with older M4 implementations.  This implementation relies
9812 on the GNU extension that @samp{$10} expands to the tenth
9813 argument rather than the first argument concatenated with @samp{0}.  The
9814 trick is to define an intermediate macro that repeats the text
@code{define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
9816 integers corresponding to each argument.  The helper macro
9817 @code{_foreachq_} is needed in order to generate the literal sequences
9818 such as @samp{$1} into the intermediate macro, rather than expanding
9819 them as the arguments of @code{_foreachq}.  With this approach, no
9820 @code{shift} calls are even needed!  However, when linear recursion is
available in new enough M4, the time and memory costs of using
@code{forloop} to build an intermediate macro outweigh the costs of any
9823 of the previous implementations (there are seven macros of overhead per
9824 iteration instead of six in @file{foreachq3.m4}, and the entire
9825 intermediate macro must be built in memory before any iteration is
9826 expanded).  Additionally, this approach will need adjustment when a
9827 future version of M4 follows POSIX by no longer treating
9828 @samp{$10} as the tenth argument; the anticipation is that
9829 @samp{$@{10@}} can be used instead, although that alternative syntax is
9830 not yet supported.
9831
9832 @comment examples
9833 @example
9834 $ @kbd{m4 -I examples}
9835 include(`foreachq4.m4')
9836 @result{}
9837 undivert(`foreachq4.m4')dnl
9838 @result{}include(`forloop2.m4')dnl
9839 @result{}divert(`-1')
9840 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9841 @result{}#   quoted list, version based on forloop
9842 @result{}define(`foreachq',
9843 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
9844 @result{}define(`_foreachq',
9845 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
9846 @result{}  `$0_(`1', `2', indir(`$1'))')`popdef(
9847 @result{}    `$1')')indir(`$1', $@@)')
9848 @result{}define(`_foreachq_',
9849 @result{}``define(`$$1', `$$3')$$2`''')
9850 @result{}divert`'dnl
9851 traceon(`shift')debugmode(`aq')
9852 @result{}
9853 foreachq(`x', ``1', `2', `3', `4'', `x
9854 ')dnl
9855 @result{}1
9856 @result{}2
9857 @result{}3
9858 @result{}4
9859 @end example
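
The GNU interpretation of @samp{$10} on which this implementation
depends can be checked in isolation.  The macro name @code{tenth} below
is purely illustrative, and the result shown assumes a GNU M4 release
that still treats @samp{$10} as the tenth argument:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example

An implementation that instead reads @samp{$10} as the first argument
followed by @samp{0} would be expected to produce @samp{a0} here.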
9860
9861 For yet another approach, the improved version of @code{foreach},
9862 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
9863 overquotes the arguments to @code{@w{_foreach}} to begin with, using
9864 @code{dquote_elt}.  Then @code{@w{_foreach}} can just use
9865 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
9866 front:
9867
9868 @comment examples
9869 @example
9870 $ @kbd{m4 -I examples}
9871 include(`foreach2.m4')
9872 @result{}
9873 undivert(`foreach2.m4')dnl
9874 @result{}include(`quote.m4')dnl
9875 @result{}divert(`-1')
9876 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
9877 @result{}#   parenthesized list, improved version
9878 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
9879 @result{}  (dquote(dquote_elt$2)), `$3')popdef(`$1')')
9880 @result{}define(`_arg1', `$1')
9881 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
9882 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
9883 @result{}divert`'dnl
9884 traceon(`shift')debugmode(`aq')
9885 @result{}
9886 foreach(`x', `(`1', `2', `3', `4')', `x
9887 ')dnl
9888 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9889 @error{}m4trace: -4- shift(`2', `3', `4')
9890 @error{}m4trace: -4- shift(`3', `4')
9891 @result{}1
9892 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
9893 @result{}2
9894 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
9895 @result{}3
9896 @error{}m4trace: -3- shift(``3'', ``4'')
9897 @result{}4
9898 @error{}m4trace: -3- shift(``4'')
9899 @end example
9900
9901 It is likewise possible to write a variant of @code{foreach} that
9902 performs in linear time on M4 1.4.x; the easiest method is probably
9903 writing a version of @code{foreach} that unboxes its list, then invokes
9904 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
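
A minimal sketch of that idea follows.  It assumes the shipped
@file{foreachq4.m4}; the helper name @code{_unbox} is hypothetical and
is not part of the examples directory.  @code{_unbox} removes the
parentheses by letting them serve as its own argument list, so that
@code{_foreachq} receives each list element as a separate argument:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq4.m4')
@result{}
define(`_unbox', `$@@')
@result{}
define(`foreach',
  `ifelse(`$2', `', `', `_foreachq(`$1', `$3', _unbox$2)')')
@result{}
foreach(`x', `(`1', `2', `3')', `<x>')
@result{}<1><2><3>
foreach(`x', `', `<x>')
@result{}
@end example

Because @code{_unbox} requotes each element with @samp{$@@}, unquoted
list elements are still expanded only once, while the argument list of
@code{_unbox} is being collected.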
9905
9906 @cindex filtering defined symbols
9907 @cindex subset of defined symbols
9908 @cindex defined symbols, filtering
9909 With a robust @code{foreachq} implementation, it is possible to create a
9910 filter on a list of defined symbols.  This next example will find all
9911 symbols that contain @samp{if} or @samp{def}, via two different
9912 approaches.  In the first approach, @code{dquote_elt} is used to
9913 overquote each list element, then @code{dquote} forms the list; that
9914 way, the iterator @code{macro} can be expanded in place because its
9915 contents are already quoted.  This approach also uses a self-modifying
9916 macro @code{sep} to provide the correct number of commas.  In the second
9917 approach, the iterator @code{macro} contains live text, so it must be
9918 used with @code{defn} to avoid unintentional expansion.  The correct
9919 number of commas is achieved by using @code{shift} to ignore the first
9920 one, although a leading space still remains.
9921
9922 @comment examples
9923 @example
9924 $ @kbd{m4 -I examples}
9925 include(`quote.m4')include(`foreachq2.m4')
9926 @result{}
9927 pushdef(`sep', `define(`sep', ``, '')')
9928 @result{}
9929 foreachq(`macro', dquote(dquote_elt(m4symbols)),
9930   `regexp(macro, `.*if.*', `sep`\&'')')
9931 @result{}ifdef, ifelse, shift
9932 popdef(`sep')
9933 @result{}
9934 shift(foreachq(`macro', dquote(m4symbols),
9935   `regexp(defn(`macro'), `def', `,` ''dquote(defn(`macro')))'))
9936 @result{} define, defn, dumpdef, ifdef, popdef, pushdef, undefine
9937 @end example
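
The self-modifying @code{sep} idiom from the first approach can also be
seen in isolation.  In this small sketch, the first use of @code{sep}
expands to nothing but redefines @code{sep} to the separator text, so
every later use produces @samp{, }:

@example
define(`sep', `define(`sep', ``, '')')
@result{}
sep`'a`'sep`'b`'sep`'c
@result{}a, b, c
@end example

The example above wraps the definition in @code{pushdef} and
@code{popdef} so that the one-shot behavior is restored for the next
use.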
9938
9939 In summary, recursion over list elements is trickier than it appeared at
9940 first glance, but provides a powerful idiom within @code{m4} processing.
9941 As a final demonstration, both list styles are now able to handle
9942 several scenarios that would wreak havoc on one or both of the original
9943 implementations.  This points out one other difference between the
list styles.  @code{foreach} evaluates unquoted list elements only once,
in preparation for calling @code{@w{_foreach}}; the same holds for
@code{foreachq} as provided by @file{foreachq3.m4} or
@file{foreachq4.m4}.  But
9948 @code{foreachq}, as provided by @file{foreachq2.m4},
9949 evaluates unquoted list elements twice while visiting the first list
9950 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}.  When
9951 deciding which list style to use, one must take into account whether
9952 repeating the side effects of unquoted list elements will have any
9953 detrimental effects.
9954
9955 @comment examples
9956 @example
9957 $ @kbd{m4 -d -I examples}
9958 include(`foreach2.m4')
9959 @result{}
9960 include(`foreachq2.m4')
9961 @result{}
9962 dnl 0-element list:
9963 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
9964 @result{} /@w{ }
9965 dnl 1-element list of empty element
9966 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
9967 @result{}<> / <>
9968 dnl 2-element list of empty elements
9969 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
9970 @result{}<><> / <><>
9971 dnl 1-element list of a comma
9972 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
9973 @result{}<,> / <,>
9974 dnl 2-element list of unbalanced parentheses
9975 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
9976 @result{}<(><)> / <(><)>
9977 define(`ab', `oops')dnl using defn(`iterator')
9978 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
9979  foreachq(`x', ``a', `b'', `defn(`x')')
9980 @result{}ab / ab
9981 define(`active', `ACT, IVE')
9982 @result{}
9983 traceon(`active')
9984 @result{}
9985 dnl list of unquoted macros; expansion occurs before recursion
9986 foreach(`x', `(active, active)', `<x>
9987 ')dnl
9988 @error{}m4trace: -4- active -> `ACT, IVE'
9989 @error{}m4trace: -4- active -> `ACT, IVE'
9990 @result{}<ACT>
9991 @result{}<IVE>
9992 @result{}<ACT>
9993 @result{}<IVE>
9994 foreachq(`x', `active, active', `<x>
9995 ')dnl
9996 @error{}m4trace: -3- active -> `ACT, IVE'
9997 @error{}m4trace: -3- active -> `ACT, IVE'
9998 @result{}<ACT>
9999 @error{}m4trace: -3- active -> `ACT, IVE'
10000 @error{}m4trace: -3- active -> `ACT, IVE'
10001 @result{}<IVE>
10002 @result{}<ACT>
10003 @result{}<IVE>
10004 dnl list of quoted macros; expansion occurs during recursion
10005 foreach(`x', `(`active', `active')', `<x>
10006 ')dnl
10007 @error{}m4trace: -1- active -> `ACT, IVE'
10008 @result{}<ACT, IVE>
10009 @error{}m4trace: -1- active -> `ACT, IVE'
10010 @result{}<ACT, IVE>
10011 foreachq(`x', ``active', `active'', `<x>
10012 ')dnl
10013 @error{}m4trace: -1- active -> `ACT, IVE'
10014 @result{}<ACT, IVE>
10015 @error{}m4trace: -1- active -> `ACT, IVE'
10016 @result{}<ACT, IVE>
10017 dnl list of double-quoted macro names; no expansion
10018 foreach(`x', `(``active'', ``active'')', `<x>
10019 ')dnl
10020 @result{}<active>
10021 @result{}<active>
10022 foreachq(`x', ```active'', ``active''', `<x>
10023 ')dnl
10024 @result{}<active>
10025 @result{}<active>
10026 @end example
10027
10028 @node Improved copy
10029 @section Solution for @code{copy}
10030
10031 The macro @code{copy} presented above works with M4 1.6 and newer, but
10032 is unable to handle builtin tokens with M4 1.4.x, because it tries to
10033 pass the builtin token through the macro @code{curry}, where it is
10034 silently flattened to an empty string (@pxref{Composition}).  Rather
10035 than using the problematic @code{curry} to work around the limitation
10036 that @code{stack_foreach} expects to invoke a macro that takes exactly
10037 one argument, we can write a new macro that lets us form the exact
10038 two-argument @code{pushdef} call sequence needed, so that we are no
10039 longer passing a builtin token through a text macro.
10040
10041 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
10042   @var{sep})
10043 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
10044   @var{post}, @var{sep})
10045 For each of the @code{pushdef} definitions associated with @var{macro},
10046 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
10047 Additionally, expand @var{sep} between definitions.
10048 @code{stack_foreach_sep} visits the oldest definition first, while
10049 @code{stack_foreach_sep_lifo} visits the current definition first.  The
10050 expansion may dereference @var{macro}, but should not modify it.  There
10051 are a few special macros, such as @code{defn}, which cannot be used as
10052 the @var{macro} parameter.
10053 @end deffn
10054
10055 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
10056 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
10057 `)')}.  By supplying explicit parentheses, split among the @var{pre} and
10058 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
10059 construct macro calls with more than one argument, without passing
10060 builtin tokens through a macro call.  It is likewise possible to
10061 directly reference the stack definitions without a macro call, by
10062 leaving @var{pre} and @var{post} empty.  Thus, in addition to fixing
10063 @code{copy} on builtin tokens, it also executes with fewer macro
10064 invocations.
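
The following short session illustrates the three styles of use.  It is
only a sketch, assuming the shipped @file{stack_sep.m4}; the helper
@code{show} is a hypothetical one-argument macro introduced for
illustration:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
@result{}
define(`show', `[$1]')
@result{}
stack_foreach_sep(`a', `show(', `)')
@result{}[1][2][3]
stack_foreach_sep(`a', `', `', `, ')
@result{}1, 2, 3
stack_foreach_sep_lifo(`a', `show(', `)', `:')
@result{}[3]:[2]:[1]
@end example

The first call matches what @code{stack_foreach(`a', `show')} would do,
the second references each definition directly with no macro call at
all, and the third combines a macro call with a separator.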
10065
10066 The new macro also adds a separator that is only output after the first
10067 iteration of the helper @code{_stack_reverse_sep}, implemented by
10068 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
10069 argument in subsequent iterations.  Note that the empty string that
10070 separates @var{sep} from @var{pre} is provided as part of the fourth
10071 argument when originally calling @code{_stack_reverse_sep}, and not by
10072 writing @code{$4`'$3} as the third argument in the recursive call; while
10073 the other approach would give the same output, it does so at the expense
10074 of increasing the argument size on each iteration of
10075 @code{_stack_reverse_sep}, which results in quadratic instead of linear
10076 execution time.  The improved stack walking macros are available in
10077 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
10078
10079 @comment examples
10080 @example
10081 $ @kbd{m4 -I examples}
10082 include(`stack_sep.m4')
10083 @result{}
10084 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
10085 ')m4exit(`1')',
10086    `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
10087 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
10088 @result{}
10089 copy(`a', `b')
10090 @result{}
10091 b
10092 @result{}0
10093 popdef(`b')
10094 @result{}
10095 b
10096 @result{}1
10097 pushdef(`c', `1')pushdef(`c', `2')
10098 @result{}
10099 stack_foreach_sep_lifo(`c', `', `', `, ')
10100 @result{}2, 1
10101 undivert(`stack_sep.m4')dnl
10102 @result{}divert(`-1')
10103 @result{}# stack_foreach_sep(macro, pre, post, sep)
10104 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
10105 @result{}# from the definition stack of MACRO, starting with the oldest, and
10106 @result{}# separated by SEP between definitions.
10107 @result{}define(`stack_foreach_sep',
10108 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
10109 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
10110 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
10111 @result{}# Like stack_foreach_sep, but starting with the newest definition.
10112 @result{}define(`stack_foreach_sep_lifo',
10113 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
10114 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
10115 @result{}define(`_stack_reverse_sep',
10116 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
10117 @result{}  `$1', `$2', `$4$3')')')
10118 @result{}divert`'dnl
10119 @end example
10120
10121 @node Improved m4wrap
10122 @section Solution for @code{m4wrap}
10123
10124 The replacement @code{m4wrap} versions presented above, designed to
10125 guarantee FIFO or LIFO order regardless of the underlying M4
10126 implementation, share a bug when dealing with wrapped text that looks
10127 like parameter expansion.  Note how the invocation of
10128 @code{m4wrap@var{n}} interprets these parameters, while using the
10129 builtin preserves them for their intended use.
10130
10131 @comment examples
10132 @example
10133 $ @kbd{m4 -I examples}
10134 include(`wraplifo.m4')
10135 @result{}
10136 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10137 ')
10138 @result{}
10139 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
10140 ')
10141 @result{}
10142 ^D
10143 @result{}m4wrap0:---0-
10144 @result{}bar:-a-a,b-2-
10145 @end example
10146
10147 Additionally, the computation of @code{_m4wrap_level} and creation of
10148 multiple @code{m4wrap@var{n}} placeholders in the original examples is
10149 more expensive in time and memory than strictly necessary.  Notice how
10150 the improved version grabs the wrapped text via @code{defn} to avoid
10151 parameter expansion, then undefines @code{_m4wrap_text}, before
10152 stripping a level of quotes with @code{_arg1} to expand the text.  That
10153 way, each level of wrapping reuses the single placeholder, which starts
10154 each nesting level in an undefined state.
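
The difference between invoking a macro and fetching its text with
@code{defn} can be seen in a smaller setting.  In this sketch, the
macro name @code{text} is illustrative only; note that passing the
quoted definition through @code{_arg1} strips one level of quotes
without reinterpreting @samp{$1} or @samp{$#}:

@example
define(`text', `-$1-$#-')
@result{}
text
@result{}--0-
defn(`text')
@result{}-$1-$#-
define(`_arg1', `$1')
@result{}
_arg1(defn(`text'))
@result{}-$1-$#-
@end example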
10155
10156 Finally, it is worth emulating the GNU M4 extension of saving
10157 all arguments to @code{m4wrap}, separated by a space, rather than saving
just the first argument.  This is done with the @code{joinall} macro
10159 documented previously (@pxref{Shift}).  The improved LIFO example is
10160 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
10161 easily be converted to a FIFO solution by swapping the adjacent
10162 invocations of @code{joinall} and @code{defn}.
10163
10164 @comment examples
10165 @example
10166 $ @kbd{m4 -I examples}
10167 include(`wraplifo2.m4')
10168 @result{}
10169 undivert(`wraplifo2.m4')dnl
10170 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
10171 @result{}include(`join.m4')dnl
10172 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
10173 @result{}define(`_arg1', `$1')dnl
10174 @result{}define(`m4wrap',
10175 @result{}`ifdef(`_$0_text',
10176 @result{}       `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
10177 @result{}       `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
10178 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
10179 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10180 ')
10181 @result{}
10182 m4wrap(`lifo text
10183 m4wrap(`nested', `', `$@@
10184 ')')
10185 @result{}
10186 ^D
10187 @result{}lifo text
10188 @result{}foo:-a-a,b-2-
10189 @result{}nested  $@@
10190 @end example
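
The choice of @code{joinall} rather than @code{join} is what preserves
empty arguments, as in the doubled space after @samp{nested} above.  A
small comparison, assuming the shipped @file{join.m4}, shows the
difference:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`join.m4')
@result{}
join(`-', `a', `', `b')
@result{}a-b
joinall(`-', `a', `', `b')
@result{}a--b
@end example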
10191
10192 @node Improved cleardivert
10193 @section Solution for @code{cleardivert}
10194
10195 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
10196 called without arguments to clear all pending diversions.  That is
because using @code{undivert} with an empty string for an argument is
different from using it with no arguments at all.  Compare the earlier definition
10199 with one that takes the number of arguments into account:
10200
10201 @example
10202 define(`cleardivert',
10203   `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
10204 @result{}
10205 divert(`1')one
10206 divert
10207 @result{}
10208 cleardivert
10209 @result{}
10210 undivert
10211 @result{}one
10212 @result{}
10213 define(`cleardivert',
10214   `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
10215     `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
10216 @result{}
10217 divert(`2')two
10218 divert
10219 @result{}
10220 cleardivert
10221 @result{}
10222 undivert
10223 @result{}
10224 @end example
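
The underlying issue is that forwarding @samp{$@@} from a macro called
without arguments still produces a call with one empty argument.  The
names @code{nargs} and @code{forward} in this sketch are illustrative
only:

@example
define(`nargs', `$#')
@result{}
define(`forward', `nargs($@@)')
@result{}
nargs
@result{}0
nargs()
@result{}1
forward
@result{}1
@end example

This is why the corrected @code{cleardivert} tests @samp{$#} before
deciding how to invoke @code{undivert}.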
10225
10226 @node Improved capitalize
10227 @section Solution for @code{capitalize}
10228
10229 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
10230 not allow clients to follow the quoting rule of thumb.  Consider the
10231 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
10232 difference between calling @code{capitalize} with the expansion of a
10233 macro, expanding the result of a case change, and changing the case of a
10234 double-quoted string:
10235
10236 @comment examples
10237 @example
10238 $ @kbd{m4 -I examples}
10239 include(`capitalize.m4')dnl
10240 define(`active', `act1, ive')dnl
10241 define(`Active', `Act2, Ive')dnl
10242 define(`ACTIVE', `ACT3, IVE')dnl
10243 upcase(active)
10244 @result{}ACT1,IVE
10245 upcase(`active')
10246 @result{}ACT3, IVE
10247 upcase(``active'')
10248 @result{}ACTIVE
10249 downcase(ACTIVE)
10250 @result{}act3,ive
10251 downcase(`ACTIVE')
10252 @result{}act1, ive
10253 downcase(``ACTIVE'')
10254 @result{}active
10255 capitalize(active)
10256 @result{}Act1
10257 capitalize(`active')
10258 @result{}Active
10259 capitalize(``active'')
10260 @result{}_capitalize(`active')
10261 define(`A', `OOPS')
10262 @result{}
10263 capitalize(active)
10264 @result{}OOPSct1
10265 capitalize(`active')
10266 @result{}OOPSctive
10267 @end example
10268
First, when @code{capitalize} is called with more than one argument, it
throws away the later arguments, whereas @code{upcase} and
@code{downcase} use @samp{$*} to collect them all.  The fix is simple:
use @samp{$*} consistently.
10273
10274 Next, with single-quoting, @code{capitalize} outputs a single character,
10275 a set of quotes, then the rest of the characters, making it impossible
10276 to invoke @code{Active} after the fact, and allowing the alternate macro
10277 @code{A} to interfere.  Here, the solution is to use additional quoting
10278 in the helper macros, then pass the final over-quoted output string
10279 through @code{_arg1} to remove the extra quoting and finally invoke the
10280 concatenated portions as a single string.
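
The effect of @code{_arg1} on adjacent quoted strings can be shown with
a small sketch.  At top level, the two quoted pieces are output
separately and never recognized as a macro name; as a single argument
to @code{_arg1}, they are concatenated first and then rescanned as one
word:

@example
define(`Active', `Act2, Ive')
@result{}
define(`_arg1', `$1')
@result{}
`A'`ctive'
@result{}Active
_arg1(`A'`ctive')
@result{}Act2, Ive
@end example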
10281
10282 Finally, when passed a double-quoted string, the nested macro
10283 @code{_capitalize} is never invoked because it ended up nested inside
10284 quotes.  This one is the toughest to fix.  In short, we have no idea how
10285 many levels of quotes are in effect on the substring being altered by
10286 @code{patsubst}.  If the replacement string cannot be expressed entirely
10287 in terms of literal text and backslash substitutions, then we need a
10288 mechanism to guarantee that the helper macros are invoked outside of
10289 quotes.  In other words, this sounds like a job for @code{changequote}
10290 (@pxref{Changequote}).  By changing the active quoting characters, we
10291 can guarantee that replacement text injected by @code{patsubst} always
10292 occurs in the middle of a string that has exactly one level of
10293 over-quoting using alternate quotes; so the replacement text closes the
10294 quoted string, invokes the helper macros, then reopens the quoted
10295 string.  In turn, that means the replacement text has unbalanced quotes,
10296 necessitating another round of @code{changequote}.
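
As a reminder of how alternate quotes behave, the following sketch
(with an illustrative macro name @code{foo}) switches to @samp{<<[} and
@samp{]>>}, under which the usual quote characters are ordinary text,
and then switches back:

@example
changequote(`<<[', `]>>')
@result{}
define(<<[foo]>>, <<[`quoted']>>)
@result{}
foo
@result{}`quoted'
changequote(<<[`]>>, <<[']>>)
@result{}
foo
@result{}quoted
@end example

The last two results differ because the same definition text is
rescanned under whichever quoting scheme is active at the time of
expansion.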
10297
In the fixed version below (also shipped as
@file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
10300 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
10301 strings are chosen so as to be less likely to appear in the text being
10302 converted).  The helpers @code{_to_alt} and @code{_from_alt} merely
10303 reduce the number of characters required to perform a
10304 @code{changequote}, since the definition changes twice.  The outermost
10305 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
10306 with alternate quoting; the innermost pair is used so that the third
10307 argument to @code{patsubst} can contain an unbalanced
10308 @samp{]>>}/@samp{<<[} pair.  Note that @code{upcase} and @code{downcase}
10309 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
10310 they contain nested quotes but are invoked with the alternate quoting
10311 scheme in effect.
10312
10313 @comment examples
10314 @example
10315 $ @kbd{m4 -I examples}
10316 include(`capitalize2.m4')dnl
10317 define(`active', `act1, ive')dnl
10318 define(`Active', `Act2, Ive')dnl
10319 define(`ACTIVE', `ACT3, IVE')dnl
10320 define(`A', `OOPS')dnl
10321 capitalize(active; `active'; ``active''; ```actIVE''')
10322 @result{}Act1,Ive; Act2, Ive; Active; `Active'
10323 undivert(`capitalize2.m4')dnl
10324 @result{}divert(`-1')
10325 @result{}# upcase(text)
10326 @result{}# downcase(text)
10327 @result{}# capitalize(text)
10328 @result{}#   change case of text, improved version
10329 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
10330 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
10331 @result{}define(`_arg1', `$1')
10332 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
10333 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
10334 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
10335 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
10336 @result{}define(`_capitalize_alt',
10337 @result{}  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
10338 @result{}    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
10339 @result{}define(`capitalize',
10340 @result{}  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
10341 @result{}    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
10342 @result{}divert`'dnl
10343 @end example
10344
10345 @node Improved fatal_error
10346 @section Solution for @code{fatal_error}
10347
10348 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
10349 of GNU M4 earlier than 1.4.8, where invoking @code{@w{__file__}}
10350 (@pxref{Location}) inside @code{m4wrap} would result in an empty string,
10351 and @code{@w{__line__}} resulted in @samp{0} even though all files start
10352 at line 1.  Furthermore, versions earlier than 1.4.6 did not support the
10353 @code{@w{__program__}} macro.  If you want @code{fatal_error} to work
10354 across the entire 1.4.x release series, a better implementation would
10355 be:
10356
10357 @comment status: 1
10358 @example
10359 define(`fatal_error',
10360   `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
10361 `:ifelse(__line__, `0', `',
10362     `__file__:__line__:')` fatal error: $*
10363 ')m4exit(`1')')
10364 @result{}
10365 m4wrap(`divnum(`demo of internal message')
10366 fatal_error(`inside wrapped text')')
10367 @result{}
10368 ^D
10369 @error{}m4:stdin:6: warning: divnum: extra arguments ignored: 1 > 0
10370 @result{}0
10371 @error{}m4:stdin:6: fatal error: inside wrapped text
10372 @end example
10373
10374 @c ========================================================== Appendices
10375
10376 @node Copying This Package
10377 @appendix How to make copies of the overall M4 package
10378 @cindex License, code
10379
10380 This appendix covers the license for copying the source code of the
10381 overall M4 package.  This manual is under a different set of
10382 restrictions, covered later (@pxref{Copying This Manual}).
10383
10384 @menu
10385 * GNU General Public License::  License for copying the M4 package
10386 @end menu
10387
10388 @node GNU General Public License
10389 @appendixsec License for copying the M4 package
10390 @cindex GPL, GNU General Public License
10391 @cindex GNU General Public License
10392 @cindex General Public License (GPL), GNU
10393 @include gpl-3.0.texi
10394
10395 @node Copying This Manual
10396 @appendix How to make copies of this manual
10397 @cindex License, manual
10398
10399 This appendix covers the license for copying this manual.  Note that
10400 some of the longer examples in this manual are also distributed in the
10401 directory @file{m4-@value{VERSION}/@/examples/}, where a more
10402 permissive license is in effect when copying just the examples.
10403
10404 @menu
10405 * GNU Free Documentation License::  License for copying this manual
10406 @end menu
10407
10408 @node GNU Free Documentation License
10409 @appendixsec License for copying this manual
10410 @cindex FDL, GNU Free Documentation License
10411 @cindex GNU Free Documentation License
10412 @cindex Free Documentation License (FDL), GNU
10413 @include fdl-1.3.texi
10414
10415 @node Indices
10416 @appendix Indices of concepts and macros
10417
10418 @menu
10419 * Macro index::                 Index for all @code{m4} macros
10420 * Concept index::               Index for many concepts
10421 @end menu
10422
10423 @node Macro index
10424 @appendixsec Index for all @code{m4} macros
10425
10426 This index covers all @code{m4} builtins, as well as several useful
10427 composite macros.  References are exclusively to the places where a
10428 macro is introduced the first time.
10429
10430 @printindex fn
10431
10432 @node Concept index
10433 @appendixsec Index for many concepts
10434
10435 @printindex cp
10436
10437 @bye
10438
10439 @c Local Variables:
10440 @c fill-column: 72
10441 @c ispell-local-dictionary: "american"
10442 @c indent-tabs-mode: nil
10443 @c whitespace-check-buffer-indent: nil
10444 @c End: