doc/m4.texinfo

   1 \input texinfo @c -*- texinfo -*-
   2 @comment ========================================================
   3 @comment %**start of header
   4 @setfilename m4.info
   5 @include version.texi
   6 @settitle GNU M4 @value{VERSION} macro processor
   7 @setchapternewpage odd
   8 @ifnothtml
   9 @setcontentsaftertitlepage
  10 @end ifnothtml
  11 @finalout
  12
  13 @c @tabchar{}
  14 @c ----------
  15 @c The testsuite expects literal tab output in some examples, but
  16 @c literal tabs in texinfo lead to formatting issues.
  17 @macro tabchar
  18 @       @c
  19 @end macro
  20
  21 @c @ovar{ARG}
  22 @c -------------------
  23 @c The ARG is an optional argument.  To be used for macro arguments in
  24 @c their documentation.
  25 @macro ovar{varname}
  26 @r{[}@var{\varname\}@r{]}
  27 @end macro
  28
  29 @c @dvar{ARG, DEFAULT}
  30 @c -------------------
  31 @c The ARG is an optional argument, defaulting to DEFAULT.  To be used
  32 @c for macro arguments in their documentation.
  33 @macro dvar{varname, default}
  34 @r{[}@var{\varname\} = @samp{\default\}@r{]}
  35 @end macro
  36
  37 @comment %**end of header
  38 @comment ========================================================
  39
  40 @copying
  41
  42 This manual is for @acronym{GNU} M4 (version @value{VERSION}, @value{UPDATED}),
  43 a package containing an implementation of the m4 macro language.
  44
  45 Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 2004, 2005,
  46 2006, 2007, 2008 Free Software Foundation, Inc.
  47
  48 @quotation
  49 Permission is granted to copy, distribute and/or modify this document
  50 under the terms of the @acronym{GNU} Free Documentation License,
  51 Version 1.2 or any later version published by the Free Software
  52 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
  53 Back-Cover Texts.  A copy of the license is included in the section
  54 entitled ``@acronym{GNU} Free Documentation License.''
  55 @end quotation
  56 @end copying
  57
  58 @dircategory Text creation and manipulation
  59 @direntry
  60 * M4: (m4).                     A powerful macro processor.
  61 @end direntry
  62
  63 @titlepage
  64 @title GNU M4, version @value{VERSION}
  65 @subtitle A powerful macro processor
  66 @subtitle Edition @value{EDITION}, @value{UPDATED}
  67 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
  68 @author Gary V. Vaughan, and Eric Blake
  69 @author (@email{bug-m4@@gnu.org})
  70
  71 @page
  72 @vskip 0pt plus 1filll
  73 @insertcopying
  74 @end titlepage
  75
  76 @contents
  77
  78 @ifnottex
  79 @node Top
  80 @top GNU M4
  81 @insertcopying
  82 @end ifnottex
  83
  84 @acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
  85 processor.  It is mostly SVR4 compatible, although it has some
  86 extensions (for example, handling more than 9 positional parameters
  87 to macros).  @code{m4} also has builtin functions for including
  88 files, running shell commands, doing arithmetic, etc.  Autoconf needs
  89 @acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
  90 running them.
  91
  92 @acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
  93 subsequent changes by Fran@,{c}ois Pinard and other volunteers
  94 on the Internet.  All names and email addresses can be found in the
  95 files @file{m4-@value{VERSION}/@/AUTHORS} and
  96 @file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
  97 distribution.
  98
  99 This is release @value{VERSION}.  It is now considered stable:  future
 100 releases in the 1.4.x series are only meant to fix bugs, increase speed,
 101 or improve documentation.  However@dots{}
 102
 103 An experimental feature, which would improve the usefulness of @code{m4},
 104 allows for changing the syntax for what is a @dfn{word} in @code{m4}.
 105 You should use:
 106 @comment ignore
 107 @example
 108 ./configure --enable-changeword
 109 @end example
 110 @noindent
 111 if you want this feature compiled in.  The current implementation
 112 slows down @code{m4} considerably and is hardly acceptable.  In the
 113 future, @code{m4} 2.0 will come with a different set of new features
 114 that provide similar capabilities, but without the inefficiencies, so
 115 changeword will go away and @emph{you should not count on it}.
 116
 117 @menu
 118 * Preliminaries::               Introduction and preliminaries
 119 * Invoking m4::                 Invoking @code{m4}
 120 * Syntax::                      Lexical and syntactic conventions
 121
 122 * Macros::                      How to invoke macros
 123 * Definitions::                 How to define new macros
 124 * Conditionals::                Conditionals, loops, and recursion
 125
 126 * Debugging::                   How to debug macros and input
 127
 128 * Input Control::               Input control
 129 * File Inclusion::              File inclusion
 130 * Diversions::                  Diverting and undiverting output
 131
 132 * Text handling::               Macros for text handling
 133 * Arithmetic::                  Macros for doing arithmetic
 134 * Shell commands::              Macros for running shell commands
 135 * Miscellaneous::               Miscellaneous builtin macros
 136 * Frozen files::                Fast loading of frozen state
 137
 138 * Compatibility::               Compatibility with other versions of @code{m4}
 139 * Answers::                     Correct version of some examples
 140
 141 * Copying This Package::        How to make copies of the overall M4 package
 142 * Copying This Manual::         How to make copies of this manual
 143 * Indices::                     Indices of concepts and macros
 144
 145 @detailmenu
 146  --- The Detailed Node Listing ---
 147
 148 Introduction and preliminaries
 149
 150 * Intro::                       Introduction to @code{m4}
 151 * History::                     Historical references
 152 * Bugs::                        Problems and bugs
 153 * Manual::                      Using this manual
 154
 155 Invoking @code{m4}
 156
 157 * Operation modes::             Command line options for operation modes
 158 * Preprocessor features::       Command line options for preprocessor features
 159 * Limits control::              Command line options for limits control
 160 * Frozen state::                Command line options for frozen state
 161 * Debugging options::           Command line options for debugging
 162 * Command line files::          Specifying input files on the command line
 163
 164 Lexical and syntactic conventions
 165
 166 * Names::                       Macro names
 167 * Quoted strings::              Quoting input to @code{m4}
 168 * Comments::                    Comments in @code{m4} input
 169 * Other tokens::                Other kinds of input tokens
 170 * Input processing::            How @code{m4} copies input to output
 171
 172 How to invoke macros
 173
 174 * Invocation::                  Macro invocation
 175 * Inhibiting Invocation::       Preventing macro invocation
 176 * Macro Arguments::             Macro arguments
 177 * Quoting Arguments::           On Quoting Arguments to macros
 178 * Macro expansion::             Expanding macros
 179
 180 How to define new macros
 181
 182 * Define::                      Defining a new macro
 183 * Arguments::                   Arguments to macros
 184 * Pseudo Arguments::            Special arguments to macros
 185 * Undefine::                    Deleting a macro
 186 * Defn::                        Renaming macros
 187 * Pushdef::                     Temporarily redefining macros
 188
 189 * Indir::                       Indirect call of macros
 190 * Builtin::                     Indirect call of builtins
 191
 192 Conditionals, loops, and recursion
 193
 194 * Ifdef::                       Testing if a macro is defined
 195 * Ifelse::                      If-else construct, or multibranch
 196 * Shift::                       Recursion in @code{m4}
 197 * Forloop::                     Iteration by counting
 198 * Foreach::                     Iteration by list contents
 199
 200 How to debug macros and input
 201
 202 * Dumpdef::                     Displaying macro definitions
 203 * Trace::                       Tracing macro calls
 204 * Debug Levels::                Controlling debugging output
 205 * Debug Output::                Saving debugging output
 206
 207 Input control
 208
 209 * Dnl::                         Deleting whitespace in input
 210 * Changequote::                 Changing the quote characters
 211 * Changecom::                   Changing the comment delimiters
 212 * Changeword::                  Changing the lexical structure of words
 213 * M4wrap::                      Saving text until end of input
 214
 215 File inclusion
 216
 217 * Include::                     Including named files
 218 * Search Path::                 Searching for include files
 219
 220 Diverting and undiverting output
 221
 222 * Divert::                      Diverting output
 223 * Undivert::                    Undiverting output
 224 * Divnum::                      Diversion numbers
 225 * Cleardivert::                 Discarding diverted text
 226
 227 Macros for text handling
 228
 229 * Len::                         Calculating length of strings
 230 * Index macro::                 Searching for substrings
 231 * Regexp::                      Searching for regular expressions
 232 * Substr::                      Extracting substrings
 233 * Translit::                    Translating characters
 234 * Patsubst::                    Substituting text by regular expression
 235 * Format::                      Formatting strings (printf-like)
 236
 237 Macros for doing arithmetic
 238
 239 * Incr::                        Decrement and increment operators
 240 * Eval::                        Evaluating integer expressions
 241
 242 Macros for running shell commands
 243
 244 * Platform macros::             Determining the platform
 245 * Syscmd::                      Executing simple commands
 246 * Esyscmd::                     Reading the output of commands
 247 * Sysval::                      Exit status
 248 * Mkstemp::                     Making temporary files
 249
 250 Miscellaneous builtin macros
 251
 252 * Errprint::                    Printing error messages
 253 * Location::                    Printing current location
 254 * M4exit::                      Exiting from @code{m4}
 255
 256 Fast loading of frozen state
 257
 258 * Using frozen files::          Using frozen files
 259 * Frozen file format::          Frozen file format
 260
 261 Compatibility with other versions of @code{m4}
 262
 263 * Extensions::                  Extensions in @acronym{GNU} M4
 264 * Incompatibilities::           Facilities in System V m4 not in GNU M4
 265 * Other Incompatibilities::     Other incompatibilities
 266
 267 Correct version of some examples
 268
 269 * Improved exch::               Solution for @code{exch}
 270 * Improved forloop::            Solution for @code{forloop}
 271 * Improved foreach::            Solution for @code{foreach}
 272 * Improved cleardivert::        Solution for @code{cleardivert}
 273 * Improved capitalize::         Solution for @code{capitalize}
 274 * Improved fatal_error::        Solution for @code{fatal_error}
 275
 276 How to make copies of the overall M4 package
 277
 278 * GNU General Public License::  License for copying the M4 package
 279
 280 How to make copies of this manual
 281
 282 * GNU Free Documentation License::  License for copying this manual
 283
 284 Indices of concepts and macros
 285
 286 * Macro index::                 Index for all @code{m4} macros
 287 * Concept index::               Index for many concepts
 288
 289 @end detailmenu
 290 @end menu
 291
 292 @node Preliminaries
 293 @chapter Introduction and preliminaries
 294
 295 This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
 296 comes from, how to read and use this documentation, how to call the
 297 @code{m4} program, and how to report bugs about it.  It concludes by
 298 giving tips for reading the remainder of the manual.
 299
 300 The following chapters then detail all the features of the @code{m4}
 301 language.
 302
 303 @menu
 304 * Intro::                       Introduction to @code{m4}
 305 * History::                     Historical references
 306 * Bugs::                        Problems and bugs
 307 * Manual::                      Using this manual
 308 @end menu
 309
 310 @node Intro
 311 @section Introduction to @code{m4}
 312
 313 @cindex overview of @code{m4}
 314 @code{m4} is a macro processor, in the sense that it copies its
 315 input to the output, expanding macros as it goes.  Macros are either
 316 builtin or user-defined, and can take any number of arguments.
 317 Besides just doing macro expansion, @code{m4} has builtin functions
 318 for including named files, running shell commands, doing integer
 319 arithmetic, manipulating text in various ways, performing recursion,
 320 etc.@dots{}  @code{m4} can be used either as a front-end to a compiler,
 321 or as a macro processor in its own right.
 322
 323 The @code{m4} macro processor is widely available on all UNIXes, and has
 324 been standardized by @acronym{POSIX}.
 325 Usually, only a small percentage of users are aware of its existence.
 326 However, those who find it often become committed users.  The
 327 popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
 328 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
 329 for many to install it, even though these people will not themselves
 330 program in @code{m4}.  @acronym{GNU} @code{m4} is mostly compatible with the
 331 System V, Release 4 version, except for some minor differences.
 332 @xref{Compatibility}, for more details.
 333
 334 Some people find @code{m4} to be fairly addictive.  They first use
 335 @code{m4} for simple problems, then take bigger and bigger challenges,
 336 learning how to write complex sets of @code{m4} macros along the way.
 337 Once really addicted, users pursue writing of sophisticated @code{m4}
 338 applications even to solve simple problems, devoting more time
 339 to debugging their @code{m4} scripts than doing real work.  Beware that
 340 @code{m4} may be dangerous for the health of compulsive programmers.
 341
 342 @node History
 343 @section Historical references
 344
 345 @cindex history of @code{m4}
 346 @cindex @acronym{GNU} M4, history of
 347 @code{GPM} was an important ancestor of @code{m4}.  See
 348 C. Strachey: ``A General Purpose Macro generator'', Computer Journal
 349 8,3 (1965), pp.@: 225 ff.  @code{GPM} is also succinctly described in
 350 David Gries's classic ``Compiler Construction for Digital Computers''.
 351
 352 The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
 353 Addison-Wesley, Inc.@: (1976) describes and implements a Unix
 354 macro-processor language, which inspired Dennis Ritchie to write
 355 @code{m3}, a macro processor for the AP-3 minicomputer.
 356
 357 Kernighan and Ritchie then joined forces to develop the original
 358 @code{m4}, as described in ``The M4 Macro Processor'', Bell
 359 Laboratories (1977).  It had only 21 builtin macros.
 360
 361 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
 362 the true intricacies of real life: macros can be recognized without
 363 being pre-announced, skipping whitespace or end-of-lines is easier,
 364 more constructs are builtin instead of derived, etc.
 365
 366 Originally, the Kernighan and Plauger macro-processor, and then
 367 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
 368 that is, the @code{Ratfor} equivalent of @code{cpp}.  Later, @code{m4}
 369 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
 370
 371 Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
 372 @code{m4},
 373 in 1990, with the aim of removing the artificial limitations in many
 374 of the traditional @code{m4} implementations, such as maximum line
 375 length, macro size, or number of macros.
 376
 377 The late Professor A. Dain Samples described and implemented a further
 378 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
 379 Language: 2nd edition'', Electronic Announcement on comp.compilers
 380 newsgroup (1992).
 381
 382 Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
 383 1992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
 384 the stable release for 10 years.  It was at this time that @acronym{GNU}
 385 Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
 386 engine, since all other implementations of @code{m4} had too many
 387 limitations.
 388
 389 More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
 390 addressed some long standing bugs in the venerable 1.4 release.  Then in
 391 2005, Gary V. Vaughan collected together the many patches to
 392 @acronym{GNU} @code{m4} 1.4 that were floating around the net and
 393 released 1.4.3 and 1.4.4.  And in 2006, Eric Blake joined the team and
 394 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
 395 More bug fixes were incorporated in 2007, with releases 1.4.9 and
 396 1.4.10.  In 2008, Eric additionally rewrote the scanning engine to
 397 reduce recursive evaluation from quadratic to linear complexity for
 398 1.4.11.  The 1.4.x branch remains open for bug fixes.
 399
 400 Meanwhile, development has continued on new features for @code{m4}, such
 401 as dynamic module loading and additional builtins.  When complete,
 402 @acronym{GNU} @code{m4} 2.0 will start a new series of releases.
 403
 404 @node Bugs
 405 @section Problems and bugs
 406
 407 @cindex reporting bugs
 408 @cindex bug reports
 409 @cindex suggestions, reporting
 410 If you have problems with @acronym{GNU} M4 or think you've found a bug,
 411 please report it.  Before reporting a bug, make sure you've actually
 412 found a real bug.  Carefully reread the documentation and see if it
 413 really says you can do what you're trying to do.  If it's not clear
 414 whether you should be able to do something or not, report that too; it's
 415 a bug in the documentation!
 416
 417 Before reporting a bug or trying to fix it yourself, try to isolate it
 418 to the smallest possible input file that reproduces the problem.  Then
 419 send us the input file and the exact results @code{m4} gave you.  Also
 420 say what you expected to occur; this will help us decide whether the
 421 problem was really in the documentation.
 422
 423 Once you've got a precise problem, send e-mail to
 424 @email{bug-m4@@gnu.org}.  Please include the version number of @code{m4}
 425 you are using.  You can get this information with the command
 426 @kbd{m4 --version}.  Also provide details about the platform you are
 427 executing on.
 428
 429 Non-bug suggestions are always welcome as well.  If you have questions
 430 about things that are unclear in the documentation or are just obscure
 431 features, please report them too.
 432
 433 @node Manual
 434 @section Using this manual
 435
 436 @cindex examples, understanding
 437 This manual contains a number of examples of @code{m4} input and output,
 438 and a simple notation is used to distinguish input, output and error
 439 messages from @code{m4}.  Examples are set out from the normal text, and
 440 shown in a fixed width font, like this:
 441
 442 @comment ignore
 443 @example
 444 This is an example of an example!
 445 @end example
 446
 447 To distinguish input from output, all output from @code{m4} is prefixed
 448 by the string @samp{@result{}}, and all error messages by the string
 449 @samp{@error{}}.  When showing how command line options affect matters,
 450 the command line is shown with a prompt @samp{$ @kbd{like this}};
 451 otherwise, you can assume that a simple @kbd{m4} invocation will work.
 452 Thus:
 453
 454 @comment ignore
 455 @example
 456 $ @kbd{command line to invoke m4}
 457 Example of input line
 458 @result{}Output line from m4
 459 @error{}and an error message
 460 @end example
 461
 462 The sequence @samp{^D} in an example indicates the end of the input
 463 file.  The sequence @samp{@key{NL}} refers to the newline character.
 464 The majority of these examples are self-contained, and you can run them
 465 with similar results by invoking @kbd{m4 -d}.  In fact, the testsuite
 466 that is bundled in the @acronym{GNU} M4 package consists of the examples
 467 in this document!  Some of the examples assume that your current
 468 directory is located where you unpacked the installation, so if you plan
 469 on following along, you may find it helpful to do this now:
 470
 471 @comment ignore
 472 @example
 473 $ @kbd{cd m4-@value{VERSION}}
 474 @end example
 475
 476 As each of the predefined macros in @code{m4} is described, a prototype
 477 call of the macro will be shown, giving descriptive names to the
 478 arguments, e.g.,
 479
 480 @deffn Composite example (@var{string}, @dvar{count, 1}, @
 481   @ovar{argument}@dots{})
 482 This is a sample prototype.  There is not really a macro named
 483 @code{example}, but this documents that if there were, it would be a
 484 Composite macro, rather than a Builtin.  It requires at least one
 485 argument, @var{string}.  Remember that in @code{m4}, there must not be a
 486 space between the macro name and the opening parenthesis, unless it was
 487 intended to call the macro without any arguments.  The brackets around
 488 @var{count} and @var{argument} show that these arguments are optional.
 489 If @var{count} is omitted, the macro behaves as if @var{count} were @samp{1},
 490 whereas if @var{argument} is omitted, the macro behaves as if it were
 491 the empty string.  A blank argument is not the same as an omitted
 492 argument.  For example, @samp{example(`a')}, @samp{example(`a',`1')},
 493 and @samp{example(`a',`1',)} would behave identically with @var{count}
 494 set to @samp{1}; while @samp{example(`a',)} and @samp{example(`a',`')}
 495 would explicitly pass the empty string for @var{count}.  The ellipses
 496 (@samp{@dots{}}) show that the macro processes additional arguments
 497 after @var{argument}, rather than ignoring them.
 498 @end deffn
 499
 500 @cindex numbers
 501 All macro arguments in @code{m4} are strings, but some are given
 502 special interpretation, e.g., as numbers, file names, regular
 503 expressions, etc.  The documentation for each macro will state how the
 504 parameters are interpreted, and what happens if the argument cannot be
 505 parsed according to the desired interpretation.  Unless specified
 506 otherwise, a parameter specified to be a number is parsed as a decimal,
 507 even if the argument has leading zeros; and parsing the empty string as
 508 a number results in 0 rather than an error, although a warning will be
 509 issued.
 510
 511 This document consistently writes and uses @dfn{builtin}, without a
 512 hyphen, as if it were an English word.  This is how the @code{builtin}
 513 primitive is spelled within @code{m4}.
 514
 515 @node Invoking m4
 516 @chapter Invoking @code{m4}
 517
 518 @cindex command line
 519 @cindex invoking @code{m4}
 520 The format of the @code{m4} command is:
 521
 522 @comment ignore
 523 @example
 524 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
 525 @end example
 526
 527 @cindex command line, options
 528 @cindex options, command line
 529 @cindex @env{POSIXLY_CORRECT}
 530 All options begin with @samp{-}, or if long option names are used, with
 531 @samp{--}.  A long option name need not be written completely; any
 532 unambiguous prefix is sufficient.  @acronym{POSIX} requires @code{m4} to
 533 recognize arguments intermixed with files, even when
 534 @env{POSIXLY_CORRECT} is set in the environment.  Most options take
 535 effect at startup regardless of their position, but some are documented
 536 below as taking effect after any files that occurred earlier in the
 537 command line.  The argument @option{--} is a marker to denote the end of
 538 options.
 539
 540 @comment FIXME option -d+f only works on head right now...
 541 With short options, options that do not take arguments may be combined
 542 into a single command line argument with subsequent options, options
 543 with mandatory arguments may be provided either as a single command line
 544 argument or as two arguments, and options with optional arguments must
 545 be provided as a single argument.  In other words,
 546 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
 547 @kbd{m4 -Q -P -D foo -d -d+f -- ./a}, although the latter form is
 548 considered canonical.
 549
 550 With long options, options with mandatory arguments may be provided with
 551 an equal sign (@samp{=}) in a single argument, or as two arguments, and
 552 options with optional arguments must be provided as a single argument.
 553 In other words, @kbd{m4 --def foo --debug a} is equivalent to
 554 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
 555 considered canonical (not to mention more robust, in case a future
 556 version of @code{m4} introduces an option named @option{--default}).
 557
 558 @code{m4} understands the following options, grouped by functionality.
 559
 560 @menu
 561 * Operation modes::             Command line options for operation modes
 562 * Preprocessor features::       Command line options for preprocessor features
 563 * Limits control::              Command line options for limits control
 564 * Frozen state::                Command line options for frozen state
 565 * Debugging options::           Command line options for debugging
 566 * Command line files::          Specifying input files on the command line
 567 @end menu
 568
 569 @node Operation modes
 570 @section Command line options for operation modes
 571
 572 Several options control the overall operation of @code{m4}:
 573
 574 @table @code
 575 @item --help
 576 Print a help summary on standard output, then immediately exit
 577 @code{m4} without reading any input files or performing any other
 578 actions.
 579
 580 @item --version
 581 Print the version number of the program on standard output, then
 582 immediately exit @code{m4} without reading any input files or
 583 performing any other actions.
 584
 585 @item -E
 586 @itemx --fatal-warnings
 587 @cindex errors, fatal
 588 @cindex fatal errors
 589 Controls the effect of warnings.  If unspecified, then execution
 590 continues and exit status is unaffected when a warning is printed.  If
 591 specified exactly once, warnings become fatal; when one is issued,
 592 execution continues, but the exit status will be non-zero.  If specified
 593 multiple times, then execution halts with non-zero status the first time
 594 a warning is issued.  The introduction of behavior levels is new to M4
 595 1.4.9; for behavior consistent with earlier versions, you should specify
 596 @option{-E} twice.
 597
 598 @item -i
 599 @itemx --interactive
 600 @itemx -e
 601 Makes this invocation of @code{m4} interactive.  This means that all
 602 output will be unbuffered, and interrupts will be ignored.  The
 603 spelling @option{-e} exists for compatibility with other @code{m4}
 604 implementations, and issues a warning because it may be withdrawn in a
 605 future version of @acronym{GNU} M4.
 606
 607 @item -P
 608 @itemx --prefix-builtins
 609 Internally modify @emph{all} builtin macro names so they all start with
 610 the prefix @samp{m4_}.  For example, using this option, one should write
 611 @samp{m4_define} instead of @samp{define}, and @samp{m4___file__}
 612 instead of @samp{__file__}.  This option has no effect if @option{-R}
 613 is also specified.
 614
 615 @item -Q
 616 @itemx --quiet
 617 @itemx --silent
 618 Suppress warnings, such as missing or superfluous arguments in macro
 619 calls, or treating the empty string as zero.
 620
 621 @item --warn-macro-sequence@r{[}=@var{REGEXP}@r{]}
 622 Issue a warning if the regular expression @var{REGEXP} has a non-empty
 623 match in any macro definition (either by @code{define} or
 624 @code{pushdef}).  Empty matches are ignored; therefore, supplying the
 625 empty string as @var{REGEXP} disables any warning.  If the optional
 626 @var{REGEXP} is not supplied, then the default regular expression is
 627 @samp{\$\(@{[^@}]*@}\|[0-9][0-9]+\)} (a literal @samp{$} followed by
 628 multiple digits or by an open brace), since these sequences will
 629 change semantics in the default operation of @acronym{GNU} M4 2.0 (due
 630 to a change in how more than 9 arguments in a macro definition will be
 631 handled, @pxref{Arguments}).  Providing an alternate regular
 632 expression can provide a useful reverse lookup feature of finding
 633 where a macro is defined to have a given definition.
 634
 635 @item -W @var{REGEXP}
 636 @itemx --word-regexp=@var{REGEXP}
 637 Use @var{REGEXP} as an alternative syntax for macro names.  This
 638 experimental option will not be present in all @acronym{GNU} @code{m4}
 639 implementations (@pxref{Changeword}).
 640 @end table
 641
 642 @node Preprocessor features
 643 @section Command line options for preprocessor features
 644
 645 @cindex macro definitions, on the command line
 646 @cindex command line, macro definitions on the
 647 @cindex preprocessor features
 648 Several options allow @code{m4} to behave more like a preprocessor.
 649 Macro definitions and deletions can be made on the command line, the
 650 search path can be altered, and the output file can track where the
 651 input came from.  These features occur with the following options:
 652
 653 @table @code
 654 @item -D @var{NAME}@r{[}=@var{VALUE}@r{]}
 655 @itemx --define=@var{NAME}@r{[}=@var{VALUE}@r{]}
 656 This enters @var{NAME} into the symbol table.  If @samp{=@var{VALUE}} is
 657 missing, the value is taken to be the empty string.  The @var{VALUE} can
 658 be any string, and the macro can be defined to take arguments, just as
 659 if it was defined from within the input.  This option may be given more
 660 than once; order with respect to file names is significant, and
 661 redefining the same @var{NAME} loses the previous value.
 662
 663 @item -I @var{DIRECTORY}
 664 @itemx --include=@var{DIRECTORY}
 665 Make @code{m4} search @var{DIRECTORY} for included files that are not
 666 found in the current working directory.  @xref{Search Path}, for more
 667 details.  This option may be given more than once.
 668
 669 @item -s
 670 @itemx --synclines
 671 @cindex synchronization lines
 672 @cindex location, input
 673 @cindex input location
 674 Generate synchronization lines, for use by the C preprocessor or other
 675 similar tools.  Order is significant with respect to file names.  This
 676 option is useful, for example, when @code{m4} is used as a
 677 front end to a compiler.  Source file name and line number information
 678 is conveyed by directives of the form @samp{#line @var{linenum}
 679 "@var{file}"}, which are inserted as needed into the middle of the
 680 output.  Such directives mean that the following line originated or was
 681 expanded from the contents of input file @var{file} at line
 682 @var{linenum}.  The @samp{"@var{file}"} part is often omitted when
 683 the file name did not change from the previous directive.
 684
 685 Synchronization directives are always given on complete lines by
 686 themselves.  When a synchronization discrepancy occurs in the middle of
 687 an output line, the associated synchronization directive is delayed
 688 until the next newline that does not occur in the middle of a quoted
 689 string or comment.
 690
 691 @comment options: -s
 692 @example
 693 define(`twoline', `1
 694 2')
 695 @result{}#line 2 "stdin"
 696 @result{}
 697 changecom(`/*', `*/')
 698 @result{}
 699 define(`comment', `/*1
 700 2*/')
 701 @result{}#line 5
 702 @result{}
 703 dnl no line
 704 hello
 705 @result{}#line 7
 706 @result{}hello
 707 twoline
 708 @result{}1
 709 @result{}#line 8
 710 @result{}2
 711 comment
 712 @result{}/*1
 713 @result{}2*/
 714 one comment `two
 715 three'
 716 @result{}#line 10
 717 @result{}one /*1
 718 @result{}2*/ two
 719 @result{}three
 720 goodbye
 721 @result{}#line 12
 722 @result{}goodbye
 723 @end example
 724
 725 @item -U @var{NAME}
 726 @itemx --undefine=@var{NAME}
 727 This deletes any predefined meaning @var{NAME} might have.  Obviously,
 728 only predefined macros can be deleted in this way.  This option may be
 729 given more than once; undefining a @var{NAME} that does not have a
 730 definition is silently ignored.  Order is significant with respect to
 731 file names.
 732 @end table
 733
 734 @node Limits control
 735 @section Command line options for limits control
 736
 737 There are some limits within @code{m4} that can be tuned.  For
 738 compatibility, @code{m4} also accepts some options that control limits
 739 in other implementations, but which are automatically unbounded (limited
 740 only by your hardware and operating system constraints) in @acronym{GNU}
 741 @code{m4}.
 742
 743 @table @code
 744 @item -G
 745 @itemx --traditional
 746 Suppress all the extensions made in this implementation, compared to the
 747 System V version.  @xref{Compatibility}, for a list of these.
 748
 749 @item -H @var{NUM}
 750 @itemx --hashsize=@var{NUM}
 751 Make the internal hash table for symbol lookup be @var{NUM} entries big.
 752 For better performance, the number should be prime, but this is not
 753 checked.  The default is 509 entries.  It should not be necessary to
 754 increase this value, unless you define an excessive number of macros.
 755
 756 @item -L @var{NUM}
 757 @itemx --nesting-limit=@var{NUM}
 758 @cindex nesting limit
 759 @cindex limit, nesting
 760 Artificially limit the nesting of macro calls to @var{NUM} levels,
 761 stopping program execution if this limit is ever exceeded.  When not
 762 specified, nesting is limited to 1024 levels.  A value of zero means
 763 unlimited; but then heavily nested code could potentially cause a stack
 764 overflow.
 765
 766 The precise effect of this option might be more correctly associated
 767 with textual nesting than dynamic recursion.  It has been useful
 768 when some complex @code{m4} input was generated by mechanical means.
 769 Most users would never need this option.  If shown to be obtrusive,
 770 this option (which is still experimental) might well disappear.
 771
 772 @cindex rescanning
 773 This option does @emph{not} have the ability to break endless
 774 rescanning loops, since these do not necessarily consume much memory
 775 or stack space.  Through clever usage of rescanning loops, one can
 776 request complex, time-consuming computations from @code{m4} with useful
results.  Imposing limits in this area would curtail the power of @code{m4}.
 778 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
 779 only the simplest example (but @pxref{Compatibility}).  Expecting @acronym{GNU}
 780 @code{m4} to detect these would be a little like expecting a compiler
system to detect and diagnose endless loops: it is quite a @emph{hard}
 782 problem in general, if not undecidable!
 783
 784 @item -B @var{NUM}
 785 @itemx -S @var{NUM}
 786 @itemx -T @var{NUM}
 787 These options are present for compatibility with System V @code{m4}, but
 788 do nothing in this implementation.  They may disappear in future
 789 releases, and issue a warning to that effect.
 790
 791 @item -N @var{NUM}
 792 @itemx --diversions=@var{NUM}
 793 These options are present only for compatibility with previous
versions of @acronym{GNU} @code{m4}, where they controlled the number of
possible diversions that could be used at the same time.  They do nothing,
 796 because there is no fixed limit anymore.  They may disappear in future
 797 releases, and issue a warning to that effect.
 798 @end table
 799
 800 @node Frozen state
 801 @section Command line options for frozen state
 802
 803 @acronym{GNU} @code{m4} comes with a feature of freezing internal state
 804 (@pxref{Frozen files}).  This can be used to speed up @code{m4}
 805 execution when reusing a common initialization script.
 806
 807 @table @code
 808 @item -F @var{FILE}
 809 @itemx --freeze-state=@var{FILE}
 810 Once execution is finished, write out the frozen state on the specified
 811 @var{FILE}.  It is conventional, but not required, for @var{FILE} to end
 812 in @samp{.m4f}.
 813
 814 @item -R @var{FILE}
 815 @itemx --reload-state=@var{FILE}
 816 Before execution starts, recover the internal state from the specified
 817 frozen @var{FILE}.  The options @option{-D}, @option{-U}, and
 818 @option{-t} take effect after state is reloaded, but before the input
 819 files are read.
 820 @end table
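
As a sketch of how the two options fit together, the following commands
first freeze the state left behind by an initialization script and then
reload it for a later run.  The file names @file{common.m4},
@file{common.m4f}, and @file{input.m4} are hypothetical, chosen only for
illustration.

@comment ignore
@example
$ @kbd{m4 -F common.m4f common.m4}
$ @kbd{m4 -R common.m4f input.m4}
@end example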
 821
 822 @node Debugging options
 823 @section Command line options for debugging
 824
 825 Finally, there are several options for aiding in debugging @code{m4}
 826 scripts.
 827
 828 @table @code
 829 @item -d@r{[}@var{FLAGS}@r{]}
 830 @itemx --debug@r{[}=@var{FLAGS}@r{]}
 831 Set the debug-level according to the flags @var{FLAGS}.  The debug-level
 832 controls the format and amount of information presented by the debugging
 833 functions.  @xref{Debug Levels}, for more details on the format and
 834 meaning of @var{FLAGS}.  If omitted, @var{FLAGS} defaults to @samp{aeq}.
 835
 836 @item --debugfile=@var{FILE}
 837 @itemx -o @var{FILE}
 838 @itemx --error-output=@var{FILE}
 839 Redirect @code{dumpdef} output, debug messages, and trace output to the
 840 named @var{FILE}.  Warnings, error messages, and @code{errprint} output
 841 are still printed to standard error.  If unspecified, debug output goes
 842 to standard error; if empty, debug output is discarded.  @xref{Debug
 843 Output}, for more details.  The spellings @option{-o} and
 844 @option{--error-output} are misleading and inconsistent with other
 845 @acronym{GNU} tools; for now they are silently accepted as synonyms of
 846 @option{--debugfile}, but in a future version of M4, using them will
 847 cause a warning to be issued.
 848
 849 @item -l @var{NUM}
 850 @itemx --arglength=@var{NUM}
 851 Restrict the size of the output generated by macro tracing to @var{NUM}
 852 characters per trace line.  If unspecified or zero, output is
 853 unlimited.  @xref{Debug Levels}, for more details.
 854
 855 @item -t @var{NAME}
 856 @itemx --trace=@var{NAME}
 857 This enables tracing for the macro @var{NAME}, at any point where it is
 858 defined.  @var{NAME} need not be defined when this option is given.
 859 This option may be given more than once, and order is significant with
 860 respect to file names.  @xref{Trace}, for more details.
 861 @end table
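
For instance, an invocation along the following lines would trace a
single macro and collect the debug output in a file rather than on
standard error.  The macro name @samp{foo} and the file names
@file{trace.log} and @file{input.m4} are placeholders chosen for
illustration.

@comment ignore
@example
$ @kbd{m4 --debug=aeq --trace=foo --debugfile=trace.log input.m4}
@end example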
 862
 863 @node Command line files
 864 @section Specifying input files on the command line
 865
 866 @cindex command line, file names on the
 867 @cindex file names, on the command line
 868 The remaining arguments on the command line are taken to be input file
 869 names.  If no names are present, standard input is read.  A file
 870 name of @file{-} is taken to mean standard input.  It is
 871 conventional, but not required, for input files to end in @samp{.m4}.
 872
 873 The input files are read in the sequence given.  Standard input can be
 874 read more than once, so the file name @file{-} may appear multiple times
 875 on the command line; this makes a difference when input is from a
 876 terminal or other special file type.  It is an error if an input file
 877 ends in the middle of argument collection, a comment, or a quoted
 878 string.
 879
 880 The options @option{--define} (@option{-D}), @option{--undefine}
 881 (@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
 882 (@option{-t}) only take effect after processing input from any file
 883 names that occur earlier on the command line.  For example, assume the
 884 file @file{foo} contains:
 885
 886 @comment ignore
 887 @example
 888 $ @kbd{cat foo}
 889 bar
 890 @end example
 891
 892 The text @samp{bar} can then be redefined over multiple uses of
 893 @file{foo}:
 894
 895 @comment options: -Dbar=hello foo -Dbar=world foo
 896 @example
 897 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
 898 @result{}hello
 899 @result{}world
 900 @end example
 901
 902 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
 903 exit status of @code{m4} will be 0 for success, 1 for general failure
 904 (such as problems with reading an input file), and 63 for version
 905 mismatch (@pxref{Using frozen files}).
 906
 907 If you need to read a file whose name starts with a @file{-}, you can
 908 specify it as @samp{./-file}, or use @option{--} to mark the end of
 909 options.
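
The following invocations sketch these conventions; all file names are
hypothetical.  The first reads @file{header.m4}, then standard input,
then @file{footer.m4}.  The last two are equivalent ways of reading a
file named @file{-file}.

@comment ignore
@example
$ @kbd{m4 header.m4 - footer.m4}
$ @kbd{m4 -- -file}
$ @kbd{m4 ./-file}
@end example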
 910
 911 @node Syntax
 912 @chapter Lexical and syntactic conventions
 913
 914 @cindex input tokens
 915 @cindex tokens
 916 As @code{m4} reads its input, it separates it into @dfn{tokens}.  A
token is either a name, a quoted string, or any single character that
is not a part of either a name or a string.  Input to @code{m4} can also
 919 contain comments.  @acronym{GNU} @code{m4} does not yet understand
 920 multibyte locales; all operations are byte-oriented rather than
 921 character-oriented (although if your locale uses a single byte
 922 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
 923 However, @code{m4} is eight-bit clean, so you can
 924 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
 925 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
 926 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
 927
 928 @ignore
 929 @comment FIXME - each builtin needs to document how it handles NUL, then
 930 @comment update the above paragraph to mention that NUL is now handled
 931 @comment transparently.  Meanwhile, test that we don't regress.
 932
 933 @comment xout: null.out
 934 @comment xerr: null.err
 935 @example
 936 define(`m4exit')include(`null.m4')dnl
 937 @end example
 938
 939 @comment status: 2
 940 @example
 941 include(`null.m4')
 942 @result{}# This file tests m4 behavior on NUL bytes.
 943 @end example
 944 @end ignore
 945
 946 @menu
 947 * Names::                       Macro names
 948 * Quoted strings::              Quoting input to @code{m4}
 949 * Comments::                    Comments in @code{m4} input
 950 * Other tokens::                Other kinds of input tokens
 951 * Input processing::            How @code{m4} copies input to output
 952 @end menu
 953
 954 @node Names
 955 @section Macro names
 956
 957 @cindex names
 958 @cindex words
 959 A name is any sequence of letters, digits, and the character @samp{_}
 960 (underscore), where the first character is not a digit.  @code{m4} will
 961 use the longest such sequence found in the input.  If a name has a
 962 macro definition, it will be subject to macro expansion
 963 (@pxref{Macros}).  Names are case-sensitive.
 964
 965 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
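
The following sketch illustrates the longest-match rule and case
sensitivity, assuming a user-defined macro named @code{foo} (a name
chosen only for this example).  Only the exact name @samp{foo} is
expanded.  In @samp{foo-bar}, the @samp{-} cannot be part of a name, so
@samp{foo} is recognized on its own.

@example
define(`foo', `X')
@result{}
foo foo1 _foo Foo foo-bar
@result{}X foo1 _foo Foo X-bar
@end example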
 966
 967 @node Quoted strings
 968 @section Quoting input to @code{m4}
 969
 970 @cindex quoted string
 971 @cindex string, quoted
 972 A quoted string is a sequence of characters surrounded by quote
 973 strings, defaulting to
 974 @samp{`} and @samp{'}, where the nested begin and end quotes within the
 975 string are balanced.  The value of a string token is the text, with one
 976 level of quotes stripped off.  Thus
 977
 978 @comment ignore
 979 @example
 980 `'
 981 @result{}
 982 @end example
 983
 984 @noindent
 985 is the empty string, and double-quoting turns into single-quoting.
 986
 987 @comment ignore
 988 @example
 989 ``quoted''
 990 @result{}`quoted'
 991 @end example
 992
 993 The quote characters can be changed at any time, using the builtin macro
 994 @code{changequote}.  @xref{Changequote}, for more information.
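
As a small illustration with the default quote characters, the sketch
below assumes a user-defined macro @code{foo} (a hypothetical name).
Each level of quoting is stripped exactly once, and a balanced pair of
nested quotes stays inside the enclosing string.

@example
define(`foo', `expanded')
@result{}
foo `foo' ``foo''
@result{}expanded foo `foo'
`outer `inner' text'
@result{}outer `inner' text
@end example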
 995
 996 @node Comments
 997 @section Comments in @code{m4} input
 998
 999 @cindex comments
1000 Comments in @code{m4} are normally delimited by the characters @samp{#}
and newline.  Text between the comment delimiters is not scanned for
macro names or quoted strings,
1002 but the entire comment (including the delimiters) is passed through to
1003 the output---comments are @emph{not} discarded by @code{m4}.
1004
1005 Comments cannot be nested, so the first newline after a @samp{#} ends
1006 the comment.  The commenting effect of the begin-comment string
1007 can be inhibited by quoting it.
1008
1009 @example
1010 $ @kbd{m4}
1011 `quoted text' # `commented text'
1012 @result{}quoted text # `commented text'
1013 `quoting inhibits' `#' `comments'
1014 @result{}quoting inhibits # comments
1015 @end example
1016
1017 The comment delimiters can be changed to any string at any time, using
1018 the builtin macro @code{changecom}.  @xref{Changecom}, for more
1019 information.
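
The sketch below shows that macro names inside a comment are left
alone, while the comment text itself still reaches the output.  The
macro @code{foo} is a hypothetical name defined only for this example.

@example
define(`foo', `expanded')
@result{}
foo # foo is not expanded here
@result{}expanded # foo is not expanded here
@end example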
1020
1021 @node Other tokens
1022 @section Other kinds of input tokens
1023
1024 @cindex tokens, special
Any character that is not part of a name, a quoted string, or a
comment is a token by itself.  When not in the context of macro
1027 expansion, all of these tokens are just copied to output.  However,
1028 during macro expansion, whitespace characters (space, tab, newline,
1029 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1030 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1031 roles, explained later.
1032
1033 @node Input processing
1034 @section How @code{m4} copies input to output
1035
1036 As @code{m4} reads the input token by token, it will copy each token
1037 directly to the output immediately.
1038
1039 The exception is when it finds a word with a macro definition.  In that
1040 case @code{m4} will calculate the macro's expansion, possibly reading
1041 more input to get the arguments.  It then inserts the expansion in front
1042 of the remaining input.  In other words, the resulting text from a macro
1043 call will be read and parsed into tokens again.
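
A minimal sketch of this rescanning, using the hypothetical macros
@code{a}, @code{b}, and @code{q}: the expansion of @code{a} is the text
@samp{b}, which is read again and expanded in turn.  The expansion of
@code{q} is the quoted string @samp{`b'}, so the rescan only strips the
quotes and does not expand @code{b}.

@example
define(`a', `b')
@result{}
define(`b', `c')
@result{}
a
@result{}c
define(`q', ``b'')
@result{}
q
@result{}b
@end example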
1044
1045 @code{m4} expands a macro as soon as possible.  If it finds a macro call
1046 when collecting the arguments to another, it will expand the second call
1047 first.  This process continues until there are no more macro calls to
1048 expand and all the input has been consumed.
1049
1050 For a running example, examine how @code{m4} handles this input:
1051
1052 @comment ignore
1053 @example
1054 format(`Result is %d', eval(`2**15'))
1055 @end example
1056
1057 @noindent
1058 First, @code{m4} sees that the token @samp{format} is a macro name, so
1059 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1060 and @samp{@w{ }}, before encountering another potential macro.  Sure
1061 enough, @samp{eval} is a macro name, so the nested argument collection
1062 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1063 with the lone argument of @samp{2**15}.  The expansion of
1064 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1065 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1066 combined with the next @samp{)}, the format macro now has all its
1067 arguments, as if the user had typed:
1068
1069 @comment ignore
1070 @example
1071 format(`Result is %d', 32768)
1072 @end example
1073
1074 @noindent
1075 The format macro expands to @samp{Result is 32768}, and we have another
1076 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1077 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1078 @samp{8}.  None of these are macros, so the final output is
1079
1080 @comment ignore
1081 @example
1082 @result{}Result is 32768
1083 @end example
1084
1085 As a more complicated example, we will contrast an actual code
1086 example from the Gnulib project@footnote{Derived from a patch in
1087 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1088 and a followup patch in
1089 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1090 showing both a buggy approach and the desired results.  The user desires
1091 to output a shell assignment statement that takes its argument and turns
1092 it into a shell variable by converting it to uppercase and prepending a
1093 prefix.  The original attempt looks like this:
1094
1095 @example
1096 changequote([,])dnl
1097 define([gl_STRING_MODULE_INDICATOR],
1098   [
1099     dnl comment
1100     GNULIB_]translit([$1],[a-z],[A-Z])[=1
1101   ])dnl
1102   gl_STRING_MODULE_INDICATOR([strcase])
1103 @result{} @w{ }
1104 @result{}        GNULIB_strcase=1
1105 @result{} @w{ }
1106 @end example
1107
1108 Oops -- the argument did not get capitalized.  And although the manual
1109 is not able to easily show it, both lines that appear empty actually
1110 contain two trailing spaces.  By stepping through the parse, it is easy
1111 to see what happened.  First, @code{m4} sees the token
1112 @samp{changequote}, which it recognizes as a macro, followed by
1113 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1114 argument list.  The macro expands to the empty string, but changes the
1115 quoting characters to something more useful for generating shell code
1116 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1117 but unbalanced @samp{[]} tend to be rare).  Also in the first line,
1118 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1119 macro that consumes the rest of the line, resulting in no output for
1120 that line.
1121
1122 The second line starts a macro definition.  @code{m4} sees the token
1123 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1124 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}.  Because an unquoted
1125 comma was encountered, the first argument is known to be the expansion
1126 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1127 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1128 whitespace is discarded as part of argument collection.  Then comes a
1129 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1130 comment@key{NL}@ @ @ @ GNULIB_]}.  This is followed by the token
1131 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1132 macro expansion has started.
1133
The arguments to @code{translit} are found in the tokens @samp{(},
@samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
@samp{)}.  All three string arguments are expanded (or in other words,
the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1138 capitalization, the result of the macro is @samp{$1}.  This expansion is
1139 rescanned, resulting in the two literal characters @samp{$} and
1140 @samp{1}.
1141
1142 Scanning of the outer macro resumes, and picks up with
1143 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}.  The collected pieces of
1144 expanded text are concatenated, with the end result that the macro
1145 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1146 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1147 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1148
1149 The final line is then parsed, beginning with @samp{ } and @samp{ }
1150 that are output literally.  Then @samp{gl_STRING_MODULE_INDICATOR} is
1151 recognized as a macro name, with an argument list of @samp{(},
1152 @samp{[strcase]}, and @samp{)}.  Since the definition of the macro
1153 contains the sequence @samp{$1}, that sequence is replaced with the
1154 argument @samp{strcase} prior to starting the rescan.  The rescan sees
1155 @samp{@key{NL}} and four spaces, which are output literally, then
1156 @samp{dnl}, which discards the text @samp{ comment@key{NL}}.  Next
1157 comes four more spaces, also output literally, and the token
1158 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1159 substitution.  Since that is not a macro name, it is output literally,
1160 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1161 two more spaces.  Finally, the original @samp{@key{NL}} seen after the
1162 macro invocation is scanned and output literally.
1163
1164 Now for a corrected approach.  This rearranges the use of newlines and
1165 whitespace so that less whitespace is output (which, although harmless
1166 to shell scripts, can be visually unappealing), and fixes the quoting
1167 issues so that the capitalization occurs when the macro
@samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1169 defined.
1170
1171 @example
1172 changequote([,])dnl
1173 define([gl_STRING_MODULE_INDICATOR],
1174   [dnl comment
1175   GNULIB_[]translit([$1], [a-z], [A-Z])=1dnl
1176 ])dnl
1177   gl_STRING_MODULE_INDICATOR([strcase])
1178 @result{}    GNULIB_STRCASE=1
1179 @end example
1180
1181 The parsing of the first line is unchanged.  The second line sees the
1182 name of the macro to define, then sees the discarded @samp{@key{NL}}
1183 and two spaces, as before.  But this time, the next token is
1184 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([$1], [a-z],
1185 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1186 @samp{)} to end the macro definition and @samp{dnl} to skip the
1187 newline.  No early expansion of @code{translit} occurs, so the entire
1188 string becomes the definition of the macro.
1189
1190 The final line is then parsed, beginning with two spaces that are
1191 output literally, and an invocation of
1192 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1193 Again, the @samp{$1} in the macro definition is substituted prior to
1194 rescanning.  Rescanning first encounters @samp{dnl}, and discards
1195 @samp{ comment@key{NL}}.  Then two spaces are output literally.  Next
1196 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1197 output literally.  The token @samp{[]} is an empty string, so it does
1198 not affect output.  Then the token @samp{translit} is encountered.
1199
1200 This time, the arguments to @code{translit} are parsed as @samp{(},
1201 @samp{[strcase]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
@samp{[A-Z]}, and @samp{)}.  The two spaces are discarded, and the
@code{translit} call produces the desired result @samp{STRCASE}.  This is
1204 rescanned, but since it is not a macro name, it is output literally.
1205 Then the scanner sees @samp{=} and @samp{1}, which are output
1206 literally, followed by @samp{dnl} which discards the rest of the
1207 definition of @code{gl_STRING_MODULE_INDICATOR}.  The newline at the
1208 end of output is the literal @samp{@key{NL}} that appeared after the
1209 invocation of the macro.
1210
1211 The order in which @code{m4} expands the macros can be further explored
1212 using the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
1213
1214 @node Macros
1215 @chapter How to invoke macros
1216
1217 This chapter covers macro invocation, macro arguments and how macro
1218 expansion is treated.
1219
1220 @menu
1221 * Invocation::                  Macro invocation
1222 * Inhibiting Invocation::       Preventing macro invocation
1223 * Macro Arguments::             Macro arguments
1224 * Quoting Arguments::           On Quoting Arguments to macros
1225 * Macro expansion::             Expanding macros
1226 @end menu
1227
1228 @node Invocation
1229 @section Macro invocation
1230
1231 @cindex macro invocation
1232 @cindex invoking macros
A macro invocation has one of the forms
1234
1235 @comment ignore
1236 @example
1237 name
1238 @end example
1239
1240 @noindent
1241 which is a macro invocation without any arguments, or
1242
1243 @comment ignore
1244 @example
1245 name(arg1, arg2, @dots{}, arg@var{n})
1246 @end example
1247
1248 @noindent
1249 which is a macro invocation with @var{n} arguments.  Macros can have any
1250 number of arguments.  All arguments are strings, but different macros
1251 might interpret the arguments in different ways.
1252
1253 The opening parenthesis @emph{must} follow the @var{name} directly, with
1254 no spaces in between.  If it does not, the macro is called with no
1255 arguments at all.
1256
1257 For a macro call to have no arguments, the parentheses @emph{must} be
1258 left out.  The macro call
1259
1260 @comment ignore
1261 @example
1262 name()
1263 @end example
1264
1265 @noindent
1266 is a macro call with one argument, which is the empty string, not a call
1267 with no arguments.
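
These rules can be observed with a small sketch.  It assumes a
hypothetical macro @code{count} whose definition is @samp{$#}, which
expands to the number of arguments in the call.  In the last line, the
space before the parenthesis means that @code{count} is called with no
arguments and the parenthesized text is simply copied to the output.

@example
define(`count', `$#')
@result{}
count
@result{}0
count()
@result{}1
count(`a', `b')
@result{}2
count (`a', `b')
@result{}0 (a, b)
@end example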
1268
1269 @node Inhibiting Invocation
1270 @section Preventing macro invocation
1271
1272 An innovation of the @code{m4} language, compared to some of its
predecessors (like Strachey's @code{GPM}, for example), is the ability
1274 to recognize macro calls without resorting to any special, prefixed
1275 invocation character.  While generally useful, this feature might
1276 sometimes be the source of spurious, unwanted macro calls.  So, @acronym{GNU}
1277 @code{m4} offers several mechanisms or techniques for inhibiting the
1278 recognition of names as macro calls.
1279
1280 @cindex @acronym{GNU} extensions
1281 @cindex blind macro
1282 @cindex macro, blind
1283 First of all, many builtin macros cannot meaningfully be called without
1284 arguments.  As a @acronym{GNU} extension, for any of these macros,
1285 whenever an opening parenthesis does not immediately follow their name,
1286 the builtin macro call is not triggered.  This solves the most usual
1287 cases, like for @samp{include} or @samp{eval}.  Later in this document,
1288 the sentence ``This macro is recognized only with parameters'' refers to
1289 this specific provision of @acronym{GNU} M4, also known as a blind
1290 builtin macro.  For the builtins defined by @acronym{POSIX} that bear
1291 this disclaimer, @acronym{POSIX} specifically states that invoking those
1292 builtins without arguments is unspecified, because many other
1293 implementations simply invoke the builtin as though it were given one
1294 empty argument instead.
1295
1296 @example
1297 $ @kbd{m4}
1298 eval
1299 @result{}eval
1300 eval(`1')
1301 @result{}1
1302 @end example
1303
1304 There is also a command line option (@option{--prefix-builtins}, or
1305 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1306 builtin macros with a prefix of @samp{m4_} at startup.  The option has
1307 no effect whatsoever on user defined macros.  For example, with this option,
1308 one has to write @code{m4_dnl} and even @code{m4_m4exit}.  It also has
1309 no effect on whether a macro requires parameters.
1310
1311 @comment options: -P
1312 @example
1313 $ @kbd{m4 -P}
1314 eval
1315 @result{}eval
1316 eval(`1')
1317 @result{}eval(1)
1318 m4_eval
1319 @result{}m4_eval
1320 m4_eval(`1')
1321 @result{}1
1322 @end example
1323
Another alternative is to redefine problematic macros under a name less
likely to cause conflicts (@pxref{Definitions}).
1326
1327 If your version of @acronym{GNU} @code{m4} has the @code{changeword} feature
1328 compiled in, it offers far more flexibility in specifying the
syntax of macro names, both builtin and user-defined.  @xref{Changeword},
1330 for more information on this experimental feature.
1331
1332 Of course, the simplest way to prevent a name from being interpreted
1333 as a call to an existing macro is to quote it.  The remainder of
1334 this section studies a little more deeply how quoting affects macro
1335 invocation, and how quoting can be used to inhibit macro invocation.
1336
1337 Even if quoting is usually done over the whole macro name, it can also
1338 be done over only a few characters of this name (provided, of course,
1339 that the unquoted portions are not also a macro).  It is also possible
1340 to quote the empty string, but this works only @emph{inside} the name.
1341 For example:
1342
1343 @example
1344 `divert'
1345 @result{}divert
1346 `d'ivert
1347 @result{}divert
1348 di`ver't
1349 @result{}divert
1350 div`'ert
1351 @result{}divert
1352 @end example
1353
1354 @noindent
all yield the string @samp{divert}, whereas in both of the following:
1356
1357 @example
1358 `'divert
1359 @result{}
1360 divert`'
1361 @result{}
1362 @end example
1363
1364 @noindent
1365 the @code{divert} builtin macro will be called, which expands to the
1366 empty string.
1367
1368 @cindex rescanning
1369 The output of macro evaluations is always rescanned.  In the following
1370 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
if @code{m4}
had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1373
1374 @example
1375 define(`cde', `CDE')
1376 @result{}
1377 define(`x', `substr(ab')
1378 @result{}
1379 define(`y', `cde, `1', `3')')
1380 @result{}
1381 x`'y
1382 @result{}bCD
1383 @end example
1384
1385 @ignore
1386 @comment Similar, but with argument references, to ensure good test
1387 @comment coverage.
1388 @example
1389 define(`x1', `len(`$1'')
1390 @result{}
1391 define(`y1', ``$1')')
1392 @result{}
1393 x1(`01234567890123456789')y1(`98765432109876543210')
1394 @result{}40
1395 @end example
1396 @end ignore
1397
1398 Unquoted strings on either side of a quoted string are subject to
1399 being recognized as macro names.  In the following example, quoting the
1400 empty string allows for the second @code{macro} to be recognized as such:
1401
1402 @example
1403 define(`macro', `m')
1404 @result{}
1405 macro(`m')macro
1406 @result{}mmacro
1407 macro(`m')`'macro
1408 @result{}mm
1409 @end example
1410
1411 Quoting may prevent recognizing as a macro name the concatenation of a
1412 macro expansion with the surrounding characters.  In this example:
1413
1414 @example
1415 define(`macro', `di$1')
1416 @result{}
1417 macro(`v')`ert'
1418 @result{}divert
1419 macro(`v')ert
1420 @result{}
1421 @end example
1422
1423 @noindent
the first input produces the string @samp{divert}.  Once the quotes are
removed, the @code{divert} builtin is called instead.
1426
1427 @node Macro Arguments
1428 @section Macro arguments
1429
1430 @cindex macros, arguments to
1431 @cindex arguments to macros
1432 When a name is seen, and it has a macro definition, it will be expanded
1433 as a macro.
1434
1435 If the name is followed by an opening parenthesis, the arguments will be
1436 collected before the macro is called.  If too few arguments are
1437 supplied, the missing arguments are taken to be the empty string.
1438 However, some builtins are documented to behave differently for a
1439 missing optional argument than for an explicit empty string.  If there
1440 are too many arguments, the excess arguments are ignored.  Unquoted
1441 leading whitespace is stripped off all arguments, but whitespace
1442 generated by a macro expansion or occurring after a macro that expanded
1443 to an empty string remains intact.  Whitespace includes space, tab,
1444 newline, carriage return, vertical tab, and formfeed.
1445
1446 @example
1447 define(`macro', `$1')
1448 @result{}
1449 macro( unquoted leading space lost)
1450 @result{}unquoted leading space lost
1451 macro(` quoted leading space kept')
1452 @result{} quoted leading space kept
1453 macro(
1454  divert `unquoted space kept after expansion')
1455 @result{} unquoted space kept after expansion
1456 macro(macro(`
1457 ')`whitespace from expansion kept')
1458 @result{}
1459 @result{}whitespace from expansion kept
1460 macro(`unquoted trailing whitespace kept'
1461 )
1462 @result{}unquoted trailing whitespace kept
1463 @result{}
1464 @end example
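
The handling of missing and excess arguments can be sketched with a
hypothetical two-argument macro, here called @code{pair}.  A missing
second argument behaves as the empty string, and a third argument is
silently dropped because the definition never refers to it.

@example
define(`pair', `<$1|$2>')
@result{}
pair(`a')
@result{}<a|>
pair(`a', `b', `c')
@result{}<a|b>
@end example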
1465
1466 @cindex warnings, suppressing
1467 @cindex suppressing warnings
1468 Normally @code{m4} will issue warnings if a builtin macro is called
with an inappropriate number of arguments, but these can be suppressed with
1470 the @option{--quiet} command line option (or @option{--silent}, or
1471 @option{-Q}, @pxref{Operation modes, , Invoking m4}).  For user
1472 defined macros, there is no check of the number of arguments given.
1473
1474 @example
1475 $ @kbd{m4}
1476 index(`abc')
1477 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
1478 @result{}0
1479 index(`abc',)
1480 @result{}0
1481 index(`abc', `b', `ignored')
1482 @error{}m4:stdin:3: Warning: index: extra arguments ignored: 3 > 2
1483 @result{}1
1484 @end example
1485
1486 @comment options: -Q
1487 @example
1488 $ @kbd{m4 -Q}
1489 index(`abc')
1490 @result{}0
1491 index(`abc',)
1492 @result{}0
1493 index(`abc', `b', `ignored')
1494 @result{}1
1495 @end example
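
To illustrate the absence of checking for user-defined macros, here is a
small sketch; the macro @code{pair} is invented for this illustration
and is not part of @code{m4}.  A missing argument is taken to be the
empty string, and an excess argument is dropped, in both cases without
any warning.

@example
define(`pair', `[$1|$2]')
@result{}
pair(`a')
@result{}[a|]
pair(`a', `b', `c')
@result{}[a|b]
@end example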
1496
1497 Macros are expanded normally during argument collection, and any
1498 commas, quotes and parentheses that show up in the resulting
1499 expanded text serve to delimit the arguments as well.  Thus, if
1500 @var{foo} expands to @samp{, b, c}, the macro call
1501
1502 @comment ignore
1503 @example
1504 bar(a foo, d)
1505 @end example
1506
1507 @noindent
1508 is a macro call with four arguments, which are @samp{a }, @samp{b},
1509 @samp{c} and @samp{d}.  To understand why the first argument contains
1510 whitespace, remember that unquoted leading whitespace is never part
1511 of an argument, but trailing whitespace always is.
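
The call above can be made concrete with a small sketch.  The
definitions of @code{foo} and @code{bar} below are assumptions chosen
only for illustration; @code{bar} prints its argument count and puts
brackets around each of its first four arguments, so that the trailing
space in the first one is visible.

@example
define(`foo', `, b, c')
@result{}
define(`bar', `$#: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example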
1512
1513 It is possible for a macro's definition to change during argument
1514 collection, in which case the expansion uses the definition that was in
1515 effect at the time the opening @samp{(} was seen.
1516
1517 @example
1518 define(`f', `1')
1519 @result{}
1520 f(define(`f', `2'))
1521 @result{}1
1522 f
1523 @result{}2
1524 @end example
1525
1526 It is an error if the end of file occurs while collecting arguments.
1527
1528 @comment status: 1
1529 @example
1530 hello world
1531 @result{}hello world
1532 define(
1533 ^D
1534 @error{}m4:stdin:2: define: end of file in argument list
1535 @end example
1536
1537 @node Quoting Arguments
1538 @section On quoting arguments to macros
1539
1540 @cindex quoted macro arguments
1541 @cindex macros, quoted arguments to
1542 @cindex arguments, quoted macro
1543 Each argument has unquoted leading whitespace removed.  Within each
1544 argument, all unquoted parentheses must match.  For example, if
1545 @var{foo} is a macro,
1546
1547 @comment ignore
1548 @example
1549 foo(() (`(') `(')
1550 @end example
1551
1552 @noindent
1553 is a macro call, with one argument, whose value is @samp{() (() (}.
1554 Commas separate arguments, except when they occur inside quotes,
1555 comments, or unquoted parentheses.  @xref{Pseudo Arguments}, for
1556 examples.
1557
1558 It is common practice to quote all arguments to macros, unless you are
1559 sure you want the arguments expanded.  Thus, in the above
1560 example with the parentheses, the `right' way to do it is like this:
1561
1562 @comment ignore
1563 @example
1564 foo(`() (() (')
1565 @end example
1566
1567 @cindex quoting rule of thumb
1568 @cindex rule of thumb, quoting
1569 In certain cases, however, it is necessary (because nested expansion
1570 must occur to create the arguments for the outer macro) or convenient
1571 (because it uses fewer characters) to leave out quotes for some
1572 arguments, and there is nothing wrong with doing so.  It just makes life
1573 a bit harder if you are not careful to follow a consistent quoting style.
1574 For consistency, this manual follows the rule of thumb that each layer
1575 of parentheses introduces another layer of single quoting, except when
1576 showing the consequences of quoting rules.  This is done even when the
1577 quoted string cannot be a macro, such as with integers when you have not
1578 changed the syntax via @code{changeword} (@pxref{Changeword}).
1579
1580 The rule of thumb of one level of quoting per level of parentheses has a
1581 nice property: when a macro name appears inside parentheses, you can
1582 determine when it will be expanded.  If it is not quoted, it will be
1583 expanded prior to the outer macro, so that its expansion becomes the
1584 argument.  If it is single-quoted, it will be expanded after the outer
1585 macro.  And if it is double-quoted, it will be used as literal text
1586 instead of a macro name.
1587
1588 @example
1589 define(`active', `ACT, IVE')
1590 @result{}
1591 define(`show', `$1 $1')
1592 @result{}
1593 show(active)
1594 @result{}ACT ACT
1595 show(`active')
1596 @result{}ACT, IVE ACT, IVE
1597 show(``active'')
1598 @result{}active active
1599 @end example
1600
1601 @node Macro expansion
1602 @section Macro expansion
1603
1604 @cindex macros, expansion of
1605 @cindex expansion of macros
1606 When the arguments, if any, to a macro call have been collected, the
1607 macro is expanded, and the expansion text is pushed back onto the input
1608 (unquoted), and reread.  The expansion text from one macro call might
1609 therefore result in more macros being called, if the calls are included,
1610 completely or partially, in the first macro call's expansion.
1611
1612 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1613 @var{bar} expands to @samp{Hello}, the input
1614
1615 @comment options: -Dbar=Hello -Dfoo=bar
1616 @example
1617 $ @kbd{m4 -Dbar=Hello -Dfoo=bar}
1618 foo
1619 @result{}Hello
1620 @end example
1621
1622 @noindent
1623 will expand first to @samp{bar}, and when this is reread and
1624 expanded, into @samp{Hello}.
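
An expansion can also supply only part of a call.  In the following
sketch (the names @code{greet} and @code{start} are invented for
illustration), @code{start} expands to the name and opening parenthesis
of a call to @code{greet}, and the rest of the argument list is then
collected from the text that follows the call to @code{start}.

@example
define(`greet', `Hello, $1!')
@result{}
define(`start', `greet(')
@result{}
start`world')
@result{}Hello, world!
@end example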
1625
1626 @ignore
1627 @comment not worth documenting, but test that the command line can
1628 @comment define macros that take parameters
1629
1630 @comment options: -Dfoo -Decho=$@
1631 @example
1632 $ @kbd{m4 -Dfoo -Decho='$@'}
1633 foo
1634 @result{}
1635 foo(`silently ignored')
1636 @result{}
1637 echo(`1', `2')
1638 @result{}1,2
1639 @end example
1640 @end ignore
1641
1642 @node Definitions
1643 @chapter How to define new macros
1644
1645 @cindex macros, how to define new
1646 @cindex defining new macros
1647 Macros can be defined, redefined and deleted in several different ways.
1648 Also, it is possible to redefine a macro without losing a previous
1649 value, and bring back the original value at a later time.
1650
1651 @menu
1652 * Define::                      Defining a new macro
1653 * Arguments::                   Arguments to macros
1654 * Pseudo Arguments::            Special arguments to macros
1655 * Undefine::                    Deleting a macro
1656 * Defn::                        Renaming macros
1657 * Pushdef::                     Temporarily redefining macros
1658
1659 * Indir::                       Indirect call of macros
1660 * Builtin::                     Indirect call of builtins
1661 @end menu
1662
1663 @node Define
1664 @section Defining a macro
1665
1666 The normal way to define or redefine macros is to use the builtin
1667 @code{define}:
1668
1669 @deffn Builtin define (@var{name}, @ovar{expansion})
1670 Defines @var{name} to expand to @var{expansion}.  If
1671 @var{expansion} is not given, it is taken to be empty.
1672
1673 The expansion of @code{define} is void.
1674 The macro @code{define} is recognized only with parameters.
1675 @end deffn
1676
1677 The following example defines the macro @var{foo} to expand to the text
1678 @samp{Hello world.}.
1679
1680 @example
1681 define(`foo', `Hello world.')
1682 @result{}
1683 foo
1684 @result{}Hello world.
1685 @end example
1686
1687 The empty line in the output is there because the newline is not
1688 a part of the macro definition, and it is consequently copied to
1689 the output.  This can be avoided by use of the macro @code{dnl}.
1690 @xref{Dnl}, for details.
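
For instance, the same definition with @code{dnl} appended to the line
leaves no empty line in the output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example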
1691
1692 The first argument to @code{define} should be quoted; otherwise, if the
1693 macro is already defined, you will be defining a different macro.  This
1694 example shows the problems with underquoting, since we did not want to
1695 redefine @code{one}:
1696
1697 @example
1698 define(foo, one)
1699 @result{}
1700 define(foo, two)
1701 @result{}
1702 one
1703 @result{}two
1704 @end example
1705
1706 @cindex @acronym{GNU} extensions
1707 @acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
1708 definition of a macro if it has several definitions from @code{pushdef}
1709 (@pxref{Pushdef}).  Some other implementations of @code{m4} replace all
1710 definitions of a macro with @code{define}.  @xref{Incompatibilities},
1711 for more details.
1712
1713 As a @acronym{GNU} extension, the first argument to @code{define} does
1714 not have to be a simple word.
1715 It can be any text string, even the empty string.  A macro with a
1716 non-standard name cannot be invoked in the normal way, as the name is
1717 not recognized.  It can only be referenced by the builtins @code{indir}
1718 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
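
As a brief sketch, assuming a made-up macro name that contains a space,
the name is copied through untouched when written directly, but its
definition is reachable through @code{indir} and @code{defn}:

@example
define(`my name', `Hello')
@result{}
my name
@result{}my name
indir(`my name')
@result{}Hello
defn(`my name')
@result{}Hello
@end example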
1719
1720 @cindex arrays
1721 Arrays and associative arrays can be simulated by using non-standard
1722 macro names.
1723
1724 @deffn Composite array (@var{index})
1725 @deffnx Composite array_set (@var{index}, @ovar{value})
1726 Provide access to entries within an array.  @code{array} reads the entry
1727 at location @var{index}, and @code{array_set} assigns @var{value} to
1728 location @var{index}.
1729 @end deffn
1730
1731 @example
1732 define(`array', `defn(format(``array[%d]'', `$1'))')
1733 @result{}
1734 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1735 @result{}
1736 array_set(`4', `array element no. 4')
1737 @result{}
1738 array_set(`17', `array element no. 17')
1739 @result{}
1740 array(`4')
1741 @result{}array element no. 4
1742 array(eval(`10 + 7'))
1743 @result{}array element no. 17
1744 @end example
1745
1746 Change the @samp{%d} to @samp{%s} and it is an associative array.
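
As a sketch of that variation, the following uses the invented names
@code{assoc} and @code{assoc_set}, and assumes that keys do not contain
quote characters:

@example
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')
@result{}
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')
@result{}
assoc_set(`color', `blue')
@result{}
assoc_set(`shape', `round')
@result{}
assoc(`color')
@result{}blue
assoc(`shape')
@result{}round
@end example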
1747
1748 @node Arguments
1749 @section Arguments to macros
1750
1751 @cindex macros, arguments to
1752 @cindex arguments to macros
1753 Macros can have arguments.  The @var{n}th argument is denoted by
1754 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
1755 argument, when the macro is expanded.  Replacement of arguments happens
1756 before rescanning, regardless of how many nesting levels of quoting
1757 appear in the expansion.  Here is an example of a macro with
1758 two arguments.
1759
1760 @deffn Composite exch (@var{arg1}, @var{arg2})
1761 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
1762 their order.
1763 @end deffn
1764
1765 @example
1766 define(`exch', `$2, $1')
1767 @result{}
1768 exch(`arg1', `arg2')
1769 @result{}arg2, arg1
1770 @end example
1771
1772 This can be used, for example, if you like the arguments to
1773 @code{define} to be reversed.
1774
1775 @example
1776 define(`exch', `$2, $1')
1777 @result{}
1778 define(exch(``expansion text'', ``macro''))
1779 @result{}
1780 macro
1781 @result{}expansion text
1782 @end example
1783
1784 @xref{Quoting Arguments}, for an explanation of the double quotes.
1785 (You should try to improve this example so that clients of @code{exch}
1786 do not have to double quote; or @pxref{Improved exch, , Answers}).
1787
1788 As a special case, the zeroth argument, @code{$0}, is always the name
1789 of the macro being expanded.
1790
1791 @example
1792 define(`test', ``Macro name: $0'')
1793 @result{}
1794 test
1795 @result{}Macro name: test
1796 @end example
1797
1798 If you want quoted text to appear as part of the expansion text,
1799 remember that quotes can be nested in quoted strings.  Thus, in
1800
1801 @example
1802 define(`foo', `This is macro `foo'.')
1803 @result{}
1804 foo
1805 @result{}This is macro foo.
1806 @end example
1807
1808 @noindent
1809 the @samp{foo} in the expansion text is @emph{not} expanded, since it is
1810 a quoted string, and not a name.
1811
1812 @cindex @acronym{GNU} extensions
1813 @cindex nine arguments, more than
1814 @cindex more than nine arguments
1815 @cindex arguments, more than nine
1816 @cindex positional parameters, more than nine
1817 @acronym{GNU} @code{m4} allows the number following the @samp{$} to
1818 consist of one or more digits, allowing macros to have any number of
1819 arguments.  The extension of accepting multiple digits is incompatible
1820 with @acronym{POSIX}, and differs from traditional implementations
1821 of @code{m4}, which recognize only one digit.  Therefore, future
1822 versions of @acronym{GNU} M4 will phase out this feature.  To portably
1823 access beyond the ninth argument, you can use the @code{argn} macro
1824 documented later (@pxref{Shift}).
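
For example, with the current @acronym{GNU} behavior described above, a
hypothetical macro @code{tenth} can pick out its tenth argument
directly.  An implementation that recognizes only one digit would
instead treat @samp{$10} as the first argument followed by a literal
@samp{0}, so this sketch should not be relied on in portable input.

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example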
1825
1826 @acronym{POSIX} also states that @samp{$} followed immediately by
1827 @samp{@{} in a macro definition is implementation-defined.  This version
1828 of M4 passes the literal characters @samp{$@{} through unchanged, but M4
1829 2.0 will implement an optional feature similar to @command{sh}, where
1830 @samp{$@{11@}} expands to the eleventh argument, to replace the current
1831 recognition of @samp{$11}.  Meanwhile, if you want to guarantee that you
1832 will get a literal @samp{$@{} in output when expanding a macro, even
1833 when you upgrade to M4 2.0, you can use nested quoting to your
1834 advantage:
1835
1836 @example
1837 define(`foo', `single quoted $`'@{1@} output')
1838 @result{}
1839 define(`bar', ``double quoted $'`@{2@} output'')
1840 @result{}
1841 foo(`a', `b')
1842 @result{}single quoted $@{1@} output
1843 bar(`a', `b')
1844 @result{}double quoted $@{2@} output
1845 @end example
1846
1847 To help you detect places in your M4 input files whose behavior might
1848 change under M4 2.0, you can use the
1849 @option{--warn-macro-sequence} command-line option (@pxref{Operation
1850 modes, , Invoking m4}) with the default regular expression.  This will
1851 add a warning any time a macro definition includes @samp{$} followed by
1852 multiple digits, or by @samp{@{}.  The warning is not enabled by
1853 default, because it triggers a number of warnings in Autoconf 2.61 (and
1854 Autoconf uses @option{-E} to treat warnings as errors), and because it
1855 will still be possible to restore older behavior in M4 2.0.
1856
1857 @comment options: --warn-macro-sequence
1858 @example
1859 $ @kbd{m4 --warn-macro-sequence}
1860 define(`foo', `$001 $@{1@} $1')
1861 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$001'
1862 @error{}m4:stdin:1: Warning: definition of `foo' contains sequence `$@{1@}'
1863 @result{}
1864 foo(`bar')
1865 @result{}bar $@{1@} bar
1866 @end example
1867
1868 @node Pseudo Arguments
1869 @section Special arguments to macros
1870
1871 @cindex special arguments to macros
1872 @cindex macros, special arguments to
1873 @cindex arguments to macros, special
1874 There is a special notation for the number of actual arguments supplied,
1875 and for all the actual arguments.
1876
1877 The number of actual arguments in a macro call is denoted by @code{$#}
1878 in the expansion text.
1879
1880 @deffn Composite nargs (@dots{})
1881 Expands to a count of the number of arguments supplied.
1882 @end deffn
1883
1884 @example
1885 define(`nargs', `$#')
1886 @result{}
1887 nargs
1888 @result{}0
1889 nargs()
1890 @result{}1
1891 nargs(`arg1', `arg2', `arg3')
1892 @result{}3
1893 nargs(`commas can be quoted, like this')
1894 @result{}1
1895 nargs(arg1#inside comments, commas do not separate arguments
1896 still arg1)
1897 @result{}1
1898 nargs((unquoted parentheses, like this, group arguments))
1899 @result{}1
1900 @end example
1901
1902 Remember that @samp{#} defaults to the comment character; if you forget
1903 quotes to inhibit the comment behavior, your macro definition may not
1904 end where you expected.
1905
1906 @example
1907 dnl Attempt to define a macro to just `$#'
1908 define(underquoted, $#)
1909 oops)
1910 @result{}
1911 underquoted
1912 @result{}0)
1913 @result{}oops
1914 @end example
1915
1916 The notation @code{$*} can be used in the expansion text to denote all
1917 the actual arguments, unquoted, with commas in between.  For example:
1918
1919 @example
1920 define(`echo', `$*')
1921 @result{}
1922 echo(arg1,    arg2, arg3 , arg4)
1923 @result{}arg1,arg2,arg3 ,arg4
1924 @end example
1925
1926 Often each argument should be quoted, and the notation @code{$@@} handles
1927 that.  It is just like @code{$*}, except that it quotes each argument.
1928 A simple example of that is:
1929
1930 @example
1931 define(`echo', `$@@')
1932 @result{}
1933 echo(arg1,    arg2, arg3 , arg4)
1934 @result{}arg1,arg2,arg3 ,arg4
1935 @end example
1936
1937 Where did the quotes go?  Of course, they were eaten when the expanded
1938 text was reread by @code{m4}.  To show the difference, try
1939
1940 @example
1941 define(`echo1', `$*')
1942 @result{}
1943 define(`echo2', `$@@')
1944 @result{}
1945 define(`foo', `This is macro `foo'.')
1946 @result{}
1947 echo1(foo)
1948 @result{}This is macro This is macro foo..
1949 echo1(`foo')
1950 @result{}This is macro foo.
1951 echo2(foo)
1952 @result{}This is macro foo.
1953 echo2(`foo')
1954 @result{}foo
1955 @end example
1956
1957 @noindent
1958 @xref{Trace}, if you do not understand this.  As another example of the
1959 difference, remember that comments encountered in arguments are passed
1960 untouched to the macro, and that quoting disables comments.
1961
1962 @example
1963 define(`echo1', `$*')
1964 @result{}
1965 define(`echo2', `$@@')
1966 @result{}
1967 define(`foo', `bar')
1968 @result{}
1969 echo1(#foo'foo
1970 foo)
1971 @result{}#foo'foo
1972 @result{}bar
1973 echo2(#foo'foo
1974 foo)
1975 @result{}#foobar
1976 @result{}bar'
1977 @end example
1978
1979 @ignore
1980 @comment Not worth putting in the manual, but this example is needed for
1981 @comment good test coverage of copying large strings across recursion
1982 @comment levels.
1983
1984 @example
1985 define(`echo', `$@@')dnl
1986 echo(echo(`01234567890123456789', `01234567890123456789')
1987 echo(`98765432109876543210', `98765432109876543210'))
1988 @result{}01234567890123456789,01234567890123456789
1989 @result{}98765432109876543210,98765432109876543210
1990 len((echo(`01234567890123456789',
1991           `01234567890123456789')echo(`98765432109876543210',
1992                                       `98765432109876543210')))
1993 @result{}84
1994 indir(`echo', indir(`echo', `01234567890123456789',
1995                             `01234567890123456789')
1996 indir(`echo', `98765432109876543210', `98765432109876543210'))
1997 @result{}01234567890123456789,01234567890123456789
1998 @result{}98765432109876543210,98765432109876543210
1999 define(`argn', `$#')dnl
2000 define(`echo1', `-$@@-')define(`echo2', `,$@@,')dnl
2001 echo1(`1', `2', `3') argn(echo1(`1', `2', `3'))
2002 @result{}-1,2,3- 3
2003 echo2(`1', `2', `3') argn(echo2(`1', `2', `3'))
2004 @result{},1,2,3, 5
2005 @end example
2006 @end ignore
2007
2008 A @samp{$} sign in the expansion text, that is not followed by anything
2009 @code{m4} understands, is simply copied to the macro expansion, as any
2010 other text is.
2011
2012 @example
2013 define(`foo', `$$$ hello $$$')
2014 @result{}
2015 foo
2016 @result{}$$$ hello $$$
2017 @end example
2018
2019 @cindex rescanning
2020 @cindex literal output
2021 @cindex output, literal
2022 If you want a macro to expand to something like @samp{$12}, the
2023 judicious use of nested quoting can put a safe character between the
2024 @code{$} and the next character, relying on the rescanning to remove the
2025 nested quote.  This will prevent @code{m4} from interpreting the
2026 @code{$} sign as a reference to an argument.
2027
2028 @example
2029 define(`foo', `no nested quote: $1')
2030 @result{}
2031 foo(`arg')
2032 @result{}no nested quote: arg
2033 define(`foo', `nested quote around $: `$'1')
2034 @result{}
2035 foo(`arg')
2036 @result{}nested quote around $: $1
2037 define(`foo', `nested empty quote after $: $`'1')
2038 @result{}
2039 foo(`arg')
2040 @result{}nested empty quote after $: $1
2041 define(`foo', `nested quote around next character: $`1'')
2042 @result{}
2043 foo(`arg')
2044 @result{}nested quote around next character: $1
2045 define(`foo', `nested quote around both: `$1'')
2046 @result{}
2047 foo(`arg')
2048 @result{}nested quote around both: arg
2049 @end example
2050
2051 @node Undefine
2052 @section Deleting a macro
2053
2054 @cindex macros, how to delete
2055 @cindex deleting macros
2056 @cindex undefining macros
2057 A macro definition can be removed with @code{undefine}:
2058
2059 @deffn Builtin undefine (@var{name}@dots{})
2060 For each argument, remove the macro @var{name}.  The macro names must
2061 necessarily be quoted, since they will be expanded otherwise.
2062
2063 The expansion of @code{undefine} is void.
2064 The macro @code{undefine} is recognized only with parameters.
2065 @end deffn
2066
2067 @example
2068 foo bar blah
2069 @result{}foo bar blah
2070 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2071 @result{}
2072 foo bar blah
2073 @result{}some other text
2074 undefine(`foo')
2075 @result{}
2076 foo bar blah
2077 @result{}foo other text
2078 undefine(`bar', `blah')
2079 @result{}
2080 foo bar blah
2081 @result{}foo bar blah
2082 @end example
2083
2084 Undefining a macro inside that macro's expansion is safe; the macro
2085 still expands to the definition that was in effect at the @samp{(}.
2086
2087 @example
2088 define(`f', ``$0':$1')
2089 @result{}
2090 f(f(f(undefine(`f')`hello world')))
2091 @result{}f:f:f:hello world
2092 f(`bye')
2093 @result{}f(bye)
2094 @end example
2095
2096 It is not an error for @var{name} to have no macro definition.  In that
2097 case, @code{undefine} does nothing.
2098
2099 @node Defn
2100 @section Renaming macros
2101
2102 @cindex macros, how to rename
2103 @cindex renaming macros
2104 @cindex macros, displaying definitions
2105 @cindex definitions, displaying macro
2106 It is possible to rename an already defined macro.  To do this, you need
2107 the builtin @code{defn}:
2108
2109 @deffn Builtin defn (@var{name}@dots{})
2110 Expands to the @emph{quoted definition} of each @var{name}.  If an
2111 argument is not a defined macro, the expansion for that argument is
2112 empty.
2113
2114 If @var{name} is a user-defined macro, the quoted definition is simply
2115 the quoted expansion text.  If, instead, there is only one @var{name}
2116 and it is a builtin, the
2117 expansion is a special token, which points to the builtin's internal
2118 definition.  This token is only meaningful as the second argument to
2119 @code{define} (and @code{pushdef}), and is silently converted to an
2120 empty string in most other contexts.  Using multiple @var{name} to
2121 combine a builtin with anything else is not supported; a warning is
2122 issued and the builtin is omitted from the final expansion.
2123
2124 The macro @code{defn} is recognized only with parameters.
2125 @end deffn
2126
2127 Its normal use is best understood through an example, which shows how to
2128 rename @code{undefine} to @code{zap}:
2129
2130 @example
2131 define(`zap', defn(`undefine'))
2132 @result{}
2133 zap(`undefine')
2134 @result{}
2135 undefine(`zap')
2136 @result{}undefine(zap)
2137 @end example
2138
2139 In this way, @code{defn} can be used to copy macro definitions, and also
2140 definitions of builtin macros.  Even if the original macro is removed,
2141 the other name can still be used to access the definition.
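
The same technique applies to user-defined macros.  In this sketch, the
names @code{greeting} and @code{salute} are invented for illustration;
after the original name is undefined, the copy still carries the
definition.

@example
define(`greeting', `Hello, $1!')
@result{}
define(`salute', defn(`greeting'))
@result{}
undefine(`greeting')
@result{}
salute(`world')
@result{}Hello, world!
greeting(`world')
@result{}greeting(world)
@end example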
2142
2143 The fact that macro definitions can be transferred also explains why you
2144 should use @code{$0}, rather than retyping a macro's name in its
2145 definition:
2146
2147 @example
2148 define(`foo', `This is `$0'')
2149 @result{}
2150 define(`bar', defn(`foo'))
2151 @result{}
2152 bar
2153 @result{}This is bar
2154 @end example
2155
2156 Macros used as string variables should be referred to through @code{defn},
2157 to avoid unwanted expansion of the text:
2158
2159 @example
2160 define(`string', `The macro dnl is very useful
2161 ')
2162 @result{}
2163 string
2164 @result{}The macro@w{ }
2165 defn(`string')
2166 @result{}The macro dnl is very useful
2167 @result{}
2168 @end example
2169
2170 @cindex rescanning
2171 However, it is important to remember that @code{m4} rescanning is purely
2172 textual.  If an unbalanced end-quote string occurs in a macro
2173 definition, the rescan will see that embedded quote as the termination
2174 of the quoted string, and the remainder of the macro's definition will
2175 be rescanned unquoted.  Thus it is a good idea to avoid unbalanced
2176 end-quotes in macro definitions or arguments to macros.
2177
2178 @example
2179 define(`foo', a'a)
2180 @result{}
2181 define(`a', `A')
2182 @result{}
2183 define(`echo', `$@@')
2184 @result{}
2185 foo
2186 @result{}A'A
2187 defn(`foo')
2188 @result{}aA'
2189 echo(foo)
2190 @result{}AA'
2191 @end example
2192
2193 On the other hand, it is possible to exploit the fact that @code{defn}
2194 can concatenate multiple macros prior to the rescanning phase, in order
2195 to join the definitions of macros that, in isolation, have unbalanced
2196 quotes.  This is particularly useful when one has used several macros to
2197 accumulate text that M4 should rescan as a whole.  In the example below,
2198 note how the use of @code{defn} on @code{l} in isolation opens a string,
2199 which is not closed until the next line; but used on @code{l} and
2200 @code{r} together results in nested quoting.
2201
2202 @example
2203 define(`l', `<[>')define(`r', `<]>')
2204 @result{}
2205 changequote(`[', `]')
2206 @result{}
2207 defn([l])defn([r])
2208 ])
2209 @result{}<[>]defn([r])
2210 @result{})
2211 defn([l], [r])
2212 @result{}<[>][<]>
2213 @end example
2214
2215 @cindex builtins, special tokens
2216 @cindex tokens, builtin macro
2217 Using @code{defn} to generate special tokens for builtin macros will
2218 generate a warning in contexts where a macro name is expected.  But in
2219 contexts that operate on text, the builtin token is just silently
2220 converted to an empty string.  As of M4 1.4.11, expansion of user macros
2221 will also preserve builtin tokens.  However, any use of builtin tokens
2222 outside of the second argument to @code{define} and @code{pushdef} is
2223 generally not portable, since earlier @acronym{GNU} M4 versions, as well
2224 as other @code{m4} implementations, vary on how such tokens are treated.
2225
2226 @example
2227 $ @kbd{m4 -d}
2228 defn(`defn')
2229 @result{}
2230 define(defn(`divnum'), `cannot redefine a builtin token')
2231 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2232 @result{}
2233 divnum
2234 @result{}0
2235 len(defn(`divnum'))
2236 @result{}0
2237 define(`echo', `$@@')
2238 @result{}
2239 define(`mydivnum', shift(echo(`', defn(`divnum'))))
2240 @result{}
2241 mydivnum
2242 @result{}0
2243 define(`', `empty-$1')
2244 @result{}
2245 defn(defn(`divnum'))
2246 @error{}m4:stdin:9: Warning: defn: invalid macro name ignored
2247 @result{}
2248 pushdef(defn(`divnum'), `oops')
2249 @error{}m4:stdin:10: Warning: pushdef: invalid macro name ignored
2250 @result{}
2251 traceon(defn(`divnum'))
2252 @error{}m4:stdin:11: Warning: traceon: invalid macro name ignored
2253 @result{}
2254 indir(defn(`divnum'), `string')
2255 @error{}m4:stdin:12: Warning: indir: invalid macro name ignored
2256 @result{}
2257 indir(`', `string')
2258 @result{}empty-string
2259 traceoff(defn(`divnum'))
2260 @error{}m4:stdin:14: Warning: traceoff: invalid macro name ignored
2261 @result{}
2262 popdef(defn(`divnum'))
2263 @error{}m4:stdin:15: Warning: popdef: invalid macro name ignored
2264 @result{}
2265 dumpdef(defn(`divnum'))
2266 @error{}m4:stdin:16: Warning: dumpdef: invalid macro name ignored
2267 @result{}
2268 undefine(defn(`divnum'))
2269 @error{}m4:stdin:17: Warning: undefine: invalid macro name ignored
2270 @result{}
2271 dumpdef(`')
2272 @error{}:@tabchar{}`empty-$1'
2273 @result{}
2274 define(`foo', `define(`$1', $2)')dnl
2275 foo(`bar', defn(`divnum'))
2276 @result{}
2277 bar
2278 @result{}0
2279 @end example
2280
2281 Also note that @code{defn} with multiple arguments can only join text
2282 macros, not builtins.  Likewise, when collecting macro arguments, a
2283 builtin token is preserved only when it occurs in isolation.  A future
2284 version of @acronym{GNU} M4 may lift these restrictions.
2285
2286 @example
2287 define(`a', `A')define(`AA', `b')
2288 @result{}
2289 defn(`a', `divnum', `a')
2290 @error{}m4:stdin:2: Warning: defn: cannot concatenate builtin `divnum'
2291 @result{}AA
2292 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2293 @error{}m4:stdin:3: Warning: defn: cannot concatenate builtin `divnum'
2294 @error{}m4:stdin:3: Warning: defn: cannot concatenate builtin `divnum'
2295 @result{}
2296 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
2297 @error{}m4:stdin:4: Warning: define: cannot concatenate builtin `divnum'
2298 @error{}m4:stdin:4: Warning: define: cannot concatenate builtin `divnum'
2299 @result{}
2300 define(`mydivnum', defn(`divnum')`a')mydivnum
2301 @error{}m4:stdin:5: Warning: define: cannot concatenate builtin `divnum'
2302 @result{}A
2303 define(`mydivnum', `a'defn(`divnum'))mydivnum
2304 @error{}m4:stdin:6: Warning: define: cannot concatenate builtin `divnum'
2305 @result{}A
2306 @end example
2307
2308 @node Pushdef
2309 @section Temporarily redefining macros
2310
2311 @cindex macros, temporary redefinition of
2312 @cindex temporary redefinition of macros
2313 @cindex redefinition of macros, temporary
2314 @cindex definition stack
2315 @cindex stack, macro definition
2316 It is possible to redefine a macro temporarily, reverting to the
2317 previous definition at a later time.  This is done with the builtins
2318 @code{pushdef} and @code{popdef}:
2319
2320 @deffn Builtin pushdef (@var{name}, @ovar{expansion})
2321 @deffnx Builtin popdef (@var{name}@dots{})
2322 Analogous to @code{define} and @code{undefine}.
2323
2324 These macros work in a stack-like fashion.  A macro is temporarily
2325 redefined with @code{pushdef}, which replaces an existing definition of
2326 @var{name}, while saving the previous definition, before the new one is
2327 installed.  If there is no previous definition, @code{pushdef} behaves
2328 exactly like @code{define}.
2329
2330 If a macro has several definitions (of which only one is accessible),
2331 the topmost definition can be removed with @code{popdef}.  If there is
2332 no previous definition, @code{popdef} behaves like @code{undefine}.
2333
2334 The expansion of both @code{pushdef} and @code{popdef} is void.
2335 The macros @code{pushdef} and @code{popdef} are recognized only with
2336 parameters.
2337 @end deffn
2338
2339 @example
2340 define(`foo', `Expansion one.')
2341 @result{}
2342 foo
2343 @result{}Expansion one.
2344 pushdef(`foo', `Expansion two.')
2345 @result{}
2346 foo
2347 @result{}Expansion two.
2348 pushdef(`foo', `Expansion three.')
2349 @result{}
2350 pushdef(`foo', `Expansion four.')
2351 @result{}
2352 popdef(`foo')
2353 @result{}
2354 foo
2355 @result{}Expansion three.
2356 popdef(`foo', `foo')
2357 @result{}
2358 foo
2359 @result{}Expansion one.
2360 popdef(`foo')
2361 @result{}
2362 foo
2363 @result{}foo
2364 @end example
2365
2366 If a macro with several definitions is redefined with @code{define}, the
2367 topmost definition is @emph{replaced} with the new definition.  If it is
2368 removed with @code{undefine}, @emph{all} the definitions are removed,
2369 and not only the topmost one.  However, @acronym{POSIX} allows other
2370 implementations that treat @code{define} as replacing an entire stack
2371 of definitions with a single new definition, so to be portable to other
2372 implementations, it may be worth explicitly using @code{popdef} and
2373 @code{pushdef} rather than relying on the @acronym{GNU} behavior of
2374 @code{define}.
2375
2376 @example
2377 define(`foo', `Expansion one.')
2378 @result{}
2379 foo
2380 @result{}Expansion one.
2381 pushdef(`foo', `Expansion two.')
2382 @result{}
2383 foo
2384 @result{}Expansion two.
2385 define(`foo', `Second expansion two.')
2386 @result{}
2387 foo
2388 @result{}Second expansion two.
2389 undefine(`foo')
2390 @result{}
2391 foo
2392 @result{}foo
2393 @end example
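
The portable alternative can be sketched as follows: the topmost
definition is replaced by popping it and pushing a new one, rather than
by calling @code{define}.  This is only an illustration of the idiom;
the output shown is what the stack behavior described above produces.

@example
define(`foo', `Expansion one.')
@result{}
pushdef(`foo', `Expansion two.')
@result{}
popdef(`foo')pushdef(`foo', `Second expansion two.')
@result{}
foo
@result{}Second expansion two.
popdef(`foo')
@result{}
foo
@result{}Expansion one.
@end example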
2394
2395 @cindex local variables
2396 @cindex variables, local
Local variables within macros are made with @code{pushdef} and
@code{popdef}.  At the start of the macro a new definition is pushed;
within the macro it is manipulated; and at the end it is popped,
revealing the former definition.
2401
2402 It is possible to temporarily redefine a builtin with @code{pushdef}
2403 and @code{defn}.
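
As a minimal sketch of both idioms (the names @code{greet}, @code{tmp}
and @code{orig_len} are hypothetical, chosen only for illustration), the
following session first uses a pushed definition as a local variable,
and then temporarily wraps the builtin @code{len}:

@example
define(`tmp', `outer')
@result{}
define(`greet', `pushdef(`tmp', `Hello, $1')tmp`'popdef(`tmp')')
@result{}
greet(`world')
@result{}Hello, world
tmp
@result{}outer
define(`orig_len', defn(`len'))
@result{}
pushdef(`len', `[orig_len(`$1')]')
@result{}
len(`abc')
@result{}[3]
popdef(`len')
@result{}
len(`abc')
@result{}3
@end example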
2404
2405 @node Indir
2406 @section Indirect call of macros
2407
2408 @cindex indirect call of macros
2409 @cindex call of macros, indirect
2410 @cindex macros, indirect call of
2411 @cindex @acronym{GNU} extensions
2412 Any macro can be called indirectly with @code{indir}:
2413
2414 @deffn Builtin indir (@var{name}, @ovar{args@dots{}})
2415 Results in a call to the macro @var{name}, which is passed the
2416 rest of the arguments @var{args}.  If @var{name} is not defined, an
2417 error message is printed, and the expansion is void.
2418
2419 The macro @code{indir} is recognized only with parameters.
2420 @end deffn
2421
2422 This can be used to call macros with computed or ``invalid''
2423 names (@code{define} allows such names to be defined):
2424
2425 @example
2426 define(`$$internal$macro', `Internal macro (name `$0')')
2427 @result{}
2428 $$internal$macro
2429 @result{}$$internal$macro
2430 indir(`$$internal$macro')
2431 @result{}Internal macro (name $$internal$macro)
2432 @end example
2433
The point here is that larger macro packages can define private macros
that will not be called by accident.  They can @emph{only} be
called through the builtin @code{indir}.
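
Because @var{name} is an ordinary argument, it can also be assembled at
expansion time.  The following sketch selects a macro by a computed
name; @code{handler_add}, @code{handler_sub} and @code{dispatch} are
hypothetical macros used only for illustration:

@example
define(`handler_add', `adding $1')
@result{}
define(`handler_sub', `subtracting $1')
@result{}
define(`dispatch', `indir(`handler_$1', shift($@@))')
@result{}
dispatch(`add', `x')
@result{}adding x
dispatch(`sub', `y')
@result{}subtracting y
@end example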
2437
2438 One other point to observe is that argument collection occurs before
2439 @code{indir} invokes @var{name}, so if argument collection changes the
2440 value of @var{name}, that will be reflected in the final expansion.
This is different from the behavior when invoking macros directly,
2442 where the definition that was in effect before argument collection is
2443 used.
2444
2445 @example
2446 $ @kbd{m4 -d}
2447 define(`f', `1')
2448 @result{}
2449 f(define(`f', `2'))
2450 @result{}1
2451 indir(`f', define(`f', `3'))
2452 @result{}3
2453 indir(`f', undefine(`f'))
2454 @error{}m4:stdin:4: Warning: indir: undefined macro `f'
2455 @result{}
2456 @end example
2457
2458 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2459 arguments, @code{indir} defers to the invoked @var{name} for whether a
2460 token representing a builtin is recognized or flattened to the empty
2461 string.
2462
2463 @example
2464 $ @kbd{m4 -d}
2465 indir(defn(`defn'), `divnum')
2466 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2467 @result{}
2468 indir(`define', defn(`defn'), `divnum')
2469 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2470 @result{}
2471 indir(`define', `foo', defn(`divnum'))
2472 @result{}
2473 foo
2474 @result{}0
2475 indir(`divert', defn(`foo'))
2476 @error{}m4:stdin:5: Warning: divert: empty string treated as 0
2477 @result{}
2478 @end example
2479
2480 Warning messages issued on behalf of an indirect macro use an
2481 unambiguous representation of the macro name, using escape sequences
2482 similar to C strings, and with colons also quoted.
2483
2484 @example
2485 define(`%%:\
2486 odd', defn(`divnum'))
2487 @result{}
2488 indir(`%%:\
2489 odd', `extra')
2490 @error{}m4:stdin:3: Warning: %%\:\\\nodd: extra arguments ignored: 1 > 0
2491 @result{}0
2492 @end example
2493
2494 @node Builtin
2495 @section Indirect call of builtins
2496
2497 @cindex indirect call of builtins
2498 @cindex call of builtins, indirect
2499 @cindex builtins, indirect call of
2500 @cindex @acronym{GNU} extensions
2501 Builtin macros can be called indirectly with @code{builtin}:
2502
2503 @deffn Builtin builtin (@var{name}, @ovar{args@dots{}})
2504 Results in a call to the builtin @var{name}, which is passed the
2505 rest of the arguments @var{args}.  If @var{name} does not name a
2506 builtin, an error message is printed, and the expansion is void.
2507
2508 The macro @code{builtin} is recognized only with parameters.
2509 @end deffn
2510
2511 This can be used even if @var{name} has been given another definition
2512 that has covered the original, or been undefined so that no macro
2513 maps to the builtin.
2514
2515 @example
2516 pushdef(`define', `hidden')
2517 @result{}
2518 undefine(`undefine')
2519 @result{}
2520 define(`foo', `bar')
2521 @result{}hidden
2522 foo
2523 @result{}foo
2524 builtin(`define', `foo', defn(`divnum'))
2525 @result{}
2526 foo
2527 @result{}0
2528 builtin(`define', `foo', `BAR')
2529 @result{}
2530 foo
2531 @result{}BAR
2532 undefine(`foo')
2533 @result{}undefine(foo)
2534 foo
2535 @result{}BAR
2536 builtin(`undefine', `foo')
2537 @result{}
2538 foo
2539 @result{}foo
2540 @end example
2541
2542 The @var{name} argument only matches the original name of the builtin,
2543 even when the @option{--prefix-builtins} option (or @option{-P},
2544 @pxref{Operation modes, , Invoking m4}) is in effect.  This is different
2545 from @code{indir}, which only tracks current macro names.
2546
2547 @comment options: -P
2548 @example
2549 $ @kbd{m4 -P}
2550 m4_builtin(`divnum')
2551 @result{}0
2552 m4_builtin(`m4_divnum')
2553 @error{}m4:stdin:2: Warning: m4_builtin: undefined builtin `m4_divnum'
2554 @result{}
2555 m4_indir(`divnum')
2556 @error{}m4:stdin:3: Warning: m4_indir: undefined macro `divnum'
2557 @result{}
2558 m4_indir(`m4_divnum')
2559 @result{}0
2560 @end example
2561
Note that @code{indir} and @code{builtin} can be used to invoke builtins
without arguments, even when they normally require parameters to be
recognized; doing so provokes a warning and results in a void expansion.
2565
2566 @example
2567 builtin
2568 @result{}builtin
2569 builtin()
2570 @error{}m4:stdin:2: Warning: builtin: undefined builtin `'
2571 @result{}
2572 builtin(`builtin')
2573 @error{}m4:stdin:3: Warning: builtin: too few arguments: 0 < 1
2574 @result{}
2575 builtin(`builtin',)
2576 @error{}m4:stdin:4: Warning: builtin: undefined builtin `'
2577 @result{}
2578 @end example
2579
2580 @ignore
2581 @comment This example is not worth putting in the manual, but it is
2582 @comment needed for full coverage.  Autoconf's m4_include relies heavily
2583 @comment on this feature.
2584
2585 @example
2586 builtin(`include', `foo')dnl
2587 @result{}bar
2588 @end example
2589
2590 @comment And this example triggers a regression present in 1.4.10b.
2591
2592 @example
2593 define(`s', `builtin(`shift', $@@)')dnl
2594 define(`loop', `ifelse(`$2', `', `-', `$1$2: $0(`$1', s(s($@@)))')')dnl
2595 loop(`1')
2596 @result{}-
2597 loop(`1', `2')
2598 @result{}12: -
2599 loop(`1', `2', `3')
2600 @result{}12: 13: -
2601 loop(`1', `2', `3', `4')
2602 @result{}12: 13: 14: -
2603 loop(`1', `2', `3', `4', `5')
2604 @result{}12: 13: 14: 15: -
2605 @end example
2606 @end ignore
2607
2608 @node Conditionals
2609 @chapter Conditionals, loops, and recursion
2610
2611 Macros, expanding to plain text, perhaps with arguments, are not quite
2612 enough.  We would like to have macros expand to different things, based
2613 on decisions taken at run-time.  For that, we need some kind of conditionals.
2614 Also, we would like to have some kind of loop construct, so we could do
2615 something a number of times, or while some condition is true.
2616
2617 @menu
2618 * Ifdef::                       Testing if a macro is defined
2619 * Ifelse::                      If-else construct, or multibranch
2620 * Shift::                       Recursion in @code{m4}
2621 * Forloop::                     Iteration by counting
2622 * Foreach::                     Iteration by list contents
2623 @end menu
2624
2625 @node Ifdef
2626 @section Testing if a macro is defined
2627
2628 @cindex conditionals
2629 There are two different builtin conditionals in @code{m4}.  The first is
2630 @code{ifdef}:
2631
2632 @deffn Builtin ifdef (@var{name}, @var{string-1}, @ovar{string-2})
2633 If @var{name} is defined as a macro, @code{ifdef} expands to
2634 @var{string-1}, otherwise to @var{string-2}.  If @var{string-2} is
2635 omitted, it is taken to be the empty string (according to the normal
2636 rules).
2637
2638 The macro @code{ifdef} is recognized only with parameters.
2639 @end deffn
2640
2641 @example
2642 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2643 @result{}foo is not defined
2644 define(`foo', `')
2645 @result{}
2646 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
2647 @result{}foo is defined
2648 ifdef(`no_such_macro', `yes', `no', `extra argument')
2649 @error{}m4:stdin:4: Warning: ifdef: extra arguments ignored: 4 > 3
2650 @result{}no
2651 @end example
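
A common use of @code{ifdef} is to supply a default only when no
definition exists yet.  In this sketch, @code{prefix} is a hypothetical
macro name; the second @code{ifdef} leaves the existing definition
untouched:

@example
ifdef(`prefix', `', `define(`prefix', `/usr/local')')
@result{}
prefix
@result{}/usr/local
ifdef(`prefix', `', `define(`prefix', `/opt')')
@result{}
prefix
@result{}/usr/local
@end example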
2652
2653 As of M4 1.4.11, @code{ifdef} transparently handles builtin tokens
2654 generated by @code{defn} (@pxref{Defn}) that occur in either
2655 @var{string}, although a warning is issued for invalid macro names.
2656
2657 @example
2658 define(`', `empty')
2659 @result{}
2660 ifdef(defn(`defn'), `yes', `no')
2661 @error{}m4:stdin:2: Warning: ifdef: invalid macro name ignored
2662 @result{}no
2663 define(`foo', ifdef(`divnum', defn(`divnum'), `undefined'))
2664 @result{}
2665 foo
2666 @result{}0
2667 @end example
2668
2669 @node Ifelse
2670 @section If-else construct, or multibranch
2671
2672 @cindex comparing strings
2673 @cindex discarding input
2674 @cindex input, discarding
2675 The other conditional, @code{ifelse}, is much more powerful.  It can be
2676 used as a way to introduce a long comment, as an if-else construct, or
2677 as a multibranch, depending on the number of arguments supplied:
2678
2679 @deffn Builtin ifelse (@var{comment})
2680 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
2681   @ovar{not-equal})
2682 @deffnx Builtin ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2683   @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
Used with only one argument, @code{ifelse} simply discards it and
produces no output.
2686
2687 If called with three or four arguments, @code{ifelse} expands into
2688 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
2689 for character), otherwise it expands to @var{not-equal}.  A final fifth
2690 argument is ignored, after triggering a warning.
2691
2692 If called with six or more arguments, and @var{string-1} and
2693 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
2694 otherwise the first three arguments are discarded and the processing
2695 starts again.
2696
2697 The macro @code{ifelse} is recognized only with parameters.
2698 @end deffn
2699
2700 Using only one argument is a common @code{m4} idiom for introducing a
2701 block comment, as an alternative to repeatedly using @code{dnl}.  This
2702 special usage is recognized by @acronym{GNU} @code{m4}, so that in this
2703 case, the warning about missing arguments is never triggered.
2704
2705 @example
2706 ifelse(`some comments')
2707 @result{}
2708 ifelse(`foo', `bar')
2709 @error{}m4:stdin:2: Warning: ifelse: too few arguments: 2 < 3
2710 @result{}
2711 @end example
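
Because the single argument is quoted, the discarded text may span
several lines.  A small sketch, assuming the comment text contains no
unbalanced quote characters:

@example
ifelse(`
This text spans several lines,
and none of it reaches the output.
')dnl
visible text
@result{}visible text
@end example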
2712
2713 Using three or four arguments provides decision points.
2714
2715 @example
2716 ifelse(`foo', `bar', `true')
2717 @result{}
2718 ifelse(`foo', `foo', `true')
2719 @result{}true
2720 define(`foo', `bar')
2721 @result{}
2722 ifelse(foo, `bar', `true', `false')
2723 @result{}true
2724 ifelse(foo, `foo', `true', `false')
2725 @result{}false
2726 @end example
2727
2728 @cindex macro, blind
2729 @cindex blind macro
2730 Notice how the first argument was used unquoted; it is common to compare
2731 the expansion of a macro with a string.  With this macro, you can now
2732 reproduce the behavior of blind builtins, where the macro is recognized
2733 only with arguments.
2734
2735 @example
2736 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
2737 @result{}
2738 foo
2739 @result{}foo
2740 foo()
2741 @result{}arguments:1
2742 foo(`a', `b', `c')
2743 @result{}arguments:3
2744 @end example
2745
2746 @cindex multibranches
2747 @cindex switch statement
2748 @cindex case statement
However, @code{ifelse} can take more than four arguments, in which case
it works like a @code{case} or @code{switch} statement in traditional
programming languages.  If @var{string-1} and
2752 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2753 the procedure is repeated with the first three arguments discarded.  This
2754 calls for an example:
2755
2756 @example
2757 ifelse(`foo', `bar', `third', `gnu', `gnats')
2758 @error{}m4:stdin:1: Warning: ifelse: extra arguments ignored: 5 > 4
2759 @result{}gnu
2760 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
2761 @result{}
2762 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
2763 @result{}seventh
2764 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
2765 @error{}m4:stdin:4: Warning: ifelse: extra arguments ignored: 8 > 7
2766 @result{}7
2767 @end example
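
In practice, the multibranch form usually compares one value against
several constants, with a trailing default.  A sketch, using a
hypothetical macro @code{color_code}:

@example
define(`color_code',
`ifelse(`$1', `red', `31',
        `$1', `green', `32',
        `$1', `blue', `34',
        `0')')
@result{}
color_code(`green')
@result{}32
color_code(`pink')
@result{}0
@end example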
2768
2769 As of M4 1.4.11, @code{ifelse} transparently handles builtin tokens
2770 generated by @code{defn} (@pxref{Defn}).  Because of this, it is always
2771 safe to compare two macro definitions, without worrying whether the
2772 macro might be a builtin.
2773
2774 @example
2775 ifelse(defn(`defn'), `', `yes', `no')
2776 @result{}no
2777 ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
2778 @result{}no
2779 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
2780 @result{}yes
2781 define(`foo', ifelse(`', `', defn(`divnum')))
2782 @result{}
2783 foo
2784 @result{}0
2785 @end example
2786
2787 @ignore
2788 @comment Stress tests, not worth documenting.
2789
2790 @comment Ensure that references compared to strings work regardless of
2791 @comment similar prefixes.
2792 @example
2793 define(`e', `$@@')define(`long', `01234567890123456789')
2794 @result{}
2795 ifelse(long, `01234567890123456789', `yes', `no')
2796 @result{}yes
2797 ifelse(`01234567890123456789', long, `yes', `no')
2798 @result{}yes
2799 ifelse(long, `01234567890123456789-', `yes', `no')
2800 @result{}no
2801 ifelse(`01234567890123456789-', long, `yes', `no')
2802 @result{}no
2803 ifelse(e(long), `01234567890123456789', `yes', `no')
2804 @result{}yes
2805 ifelse(`01234567890123456789', e(long), `yes', `no')
2806 @result{}yes
2807 ifelse(e(long), `01234567890123456789-', `yes', `no')
2808 @result{}no
2809 ifelse(`01234567890123456789-', e(long), `yes', `no')
2810 @result{}no
2811 ifelse(-e(long), `-01234567890123456789', `yes', `no')
2812 @result{}yes
2813 ifelse(-`01234567890123456789', -e(long), `yes', `no')
2814 @result{}yes
2815 ifelse(-e(long), `-01234567890123456789-', `yes', `no')
2816 @result{}no
2817 ifelse(`-01234567890123456789-', -e(long), `yes', `no')
2818 @result{}no
2819 ifelse(-e(long)-, `-01234567890123456789-', `yes', `no')
2820 @result{}yes
2821 ifelse(-`01234567890123456789-', -e(long)-, `yes', `no')
2822 @result{}yes
2823 ifelse(-e(long)-, `-01234567890123456789', `yes', `no')
2824 @result{}no
2825 ifelse(`-01234567890123456789', -e(long)-, `yes', `no')
2826 @result{}no
2827 ifelse(`-'e(long), `-01234567890123456789', `yes', `no')
2828 @result{}yes
2829 ifelse(-`01234567890123456789', `-'e(long), `yes', `no')
2830 @result{}yes
2831 ifelse(`-'e(long), `-01234567890123456789-', `yes', `no')
2832 @result{}no
2833 ifelse(`-01234567890123456789-', `-'e(long), `yes', `no')
2834 @result{}no
2835 ifelse(`-'e(long)`-', `-01234567890123456789-', `yes', `no')
2836 @result{}yes
2837 ifelse(-`01234567890123456789-', `-'e(long)`-', `yes', `no')
2838 @result{}yes
2839 ifelse(`-'e(long)`-', `-01234567890123456789', `yes', `no')
2840 @result{}no
2841 ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
2842 @result{}no
2843 @end example
2844
2845 @comment It would be nice to pass builtin tokens through m4wrap, as well
2846 @comment as allowing concatenation of builtins in ifelse and user macros.
2847 @example
2848 define(`e', `$@@')define(`q', ``$@@'')define(`u', `$*')
2849 @result{}
2850 define(`cmp', `ifelse($1, $2, `yes', `no')')define(`d', defn(`defn'))
2851 @result{}
2852 cmp(`defn(`defn')', `defn(`d')')
2853 @result{}yes
2854 cmp(`defn(`defn')', ``<defn>'')
2855 @result{}no
2856 cmp(`q(defn(`defn'))', `q(defn(`d'))')-fixme
2857 @error{}m4:stdin:5: Warning: ifelse: cannot quote builtin
2858 @error{}m4:stdin:5: Warning: ifelse: cannot quote builtin
2859 @result{}yes-fixme
2860 cmp(`q(defn(`defn'))', `q(`<defn>')')-fixme
2861 @error{}m4:stdin:6: Warning: ifelse: cannot quote builtin
2862 @result{}no-fixme
2863 cmp(`q(defn(`defn'))', ``'')-fixme
2864 @error{}m4:stdin:7: Warning: ifelse: cannot quote builtin
2865 @result{}no-fixme
2866 cmp(`q(`1', `2', defn(`defn'))', `q(`1', `2', defn(`d'))')-fixme
2867 @error{}m4:stdin:8: Warning: ifelse: cannot quote builtin
2868 @error{}m4:stdin:8: Warning: ifelse: cannot quote builtin
2869 @result{}yes-fixme
2870 cmp(`q(`1', `2', defn(`defn'))', `q(`1', `2', `<defn>')')-fixme
2871 @error{}m4:stdin:9: Warning: ifelse: cannot quote builtin
2872 @result{}no-fixme
2873 cmp(`q(`1', `2', defn(`defn'))', ```1',`2',<defn>'')-fixme
2874 @error{}m4:stdin:10: Warning: ifelse: cannot quote builtin
2875 @result{}no-fixme
2876 cmp(`q(`1', `2', defn(`defn'))', ```1',`2',`''')-fixme
2877 @error{}m4:stdin:11: Warning: ifelse: cannot quote builtin
2878 @result{}yes-fixme
2879 define(`cat', `$1`'ifelse(`$#', `1', `', `$0(shift($@@))')')
2880 @result{}
2881 cat(`define(`foo',', defn(`divnum'), `)foo')-fixme
2882 @error{}m4:stdin:13: Warning: ifelse: cannot quote builtin
2883 @result{}-fixme
2884 cat(e(`define(`bar',', defn(`divnum'), `)bar'))-fixme
2885 @error{}m4:stdin:14: Warning: ifelse: cannot quote builtin
2886 @result{}-fixme
2887 m4wrap(`u('q(`cat(`define(`baz','', defn(`divnum'), ``)baz')')`)-fixme
2888 ')
2889 @error{}m4:stdin:15: Warning: m4wrap: cannot quote builtin
2890 @result{}
2891 ^D
2892 @result{}-fixme
2893 @end example
2894 @end ignore
2895
2896 Naturally, the normal case will be slightly more advanced than these
2897 examples.  A common use of @code{ifelse} is in macros implementing loops
2898 of various kinds.
2899
2900 @node Shift
2901 @section Recursion in @code{m4}
2902
2903 @cindex recursive macros
2904 @cindex macros, recursive
2905 There is no direct support for loops in @code{m4}, but macros can be
2906 recursive.  There is no limit on the number of recursion levels, other
2907 than those enforced by your hardware and operating system.
2908
2909 @cindex loops
2910 Loops can be programmed using recursion and the conditionals described
2911 previously.
2912
2913 There is a builtin macro, @code{shift}, which can, among other things,
2914 be used for iterating through the actual arguments to a macro:
2915
2916 @deffn Builtin shift (@var{arg1}, @dots{})
2917 Takes any number of arguments, and expands to all its arguments except
2918 @var{arg1}, separated by commas, with each argument quoted.
2919
2920 The macro @code{shift} is recognized only with parameters.
2921 @end deffn
2922
2923 @example
2924 shift
2925 @result{}shift
2926 shift(`bar')
2927 @result{}
2928 shift(`foo', `bar', `baz')
2929 @result{}bar,baz
2930 @end example
2931
2932 An example of the use of @code{shift} is this macro:
2933
2934 @deffn Composite reverse (@dots{})
2935 Takes any number of arguments, and reverses their order.
2936 @end deffn
2937
2938 It is implemented as:
2939
2940 @example
2941 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
2942                           `reverse(shift($@@)), `$1'')')
2943 @result{}
2944 reverse
2945 @result{}
2946 reverse(`foo')
2947 @result{}foo
2948 reverse(`foo', `bar', `gnats', `and gnus')
2949 @result{}and gnus, gnats, bar, foo
2950 @end example
2951
2952 While not a very interesting macro, it does show how simple loops can be
2953 made with @code{shift}, @code{ifelse} and recursion.  It also shows
2954 that @code{shift} is usually used with @samp{$@@}.  Another example of
2955 this is an implementation of a short-circuiting conditional operator.
2956
2957 @cindex short-circuiting conditional
2958 @cindex conditional, short-circuiting
2959 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
2960   @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
2961 Similar to @code{ifelse}, where an equal comparison between the first
2962 two strings results in the third, otherwise the first three arguments
2963 are discarded and the process repeats.  The difference is that each
2964 @var{test-<n>} is expanded only when it is encountered.  This means that
2965 every third argument to @code{cond} is normally given one more level of
2966 quoting than the corresponding argument to @code{ifelse}.
2967 @end deffn
2968
2969 Here is the implementation of @code{cond}, along with a demonstration of
2970 how it can short-circuit the side effects in @code{side}.  Notice how
2971 all the unquoted side effects happen regardless of how many comparisons
2972 are made with @code{ifelse}, compared with only the relevant effects
2973 with @code{cond}.
2974
2975 @example
2976 define(`cond',
2977 `ifelse(`$#', `1', `$1',
2978         `ifelse($1, `$2', `$3',
2979                 `$0(shift(shift(shift($@@))))')')')dnl
2980 define(`side', `define(`counter', incr(counter))$1')dnl
2981 define(`example1',
2982 `define(`counter', `0')dnl
2983 ifelse(side(`$1'), `yes', `one comparison: ',
2984        side(`$1'), `no', `two comparisons: ',
2985        side(`$1'), `maybe', `three comparisons: ',
2986        `side(`default answer: ')')counter')dnl
2987 define(`example2',
2988 `define(`counter', `0')dnl
2989 cond(`side(`$1')', `yes', `one comparison: ',
2990      `side(`$1')', `no', `two comparisons: ',
2991      `side(`$1')', `maybe', `three comparisons: ',
2992      `side(`default answer: ')')counter')dnl
2993 example1(`yes')
2994 @result{}one comparison: 3
2995 example1(`no')
2996 @result{}two comparisons: 3
2997 example1(`maybe')
2998 @result{}three comparisons: 3
2999 example1(`feeling rather indecisive today')
3000 @result{}default answer: 4
3001 example2(`yes')
3002 @result{}one comparison: 1
3003 example2(`no')
3004 @result{}two comparisons: 2
3005 example2(`maybe')
3006 @result{}three comparisons: 3
3007 example2(`feeling rather indecisive today')
3008 @result{}default answer: 4
3009 @end example
3010
3011 Sometimes, a recursive algorithm requires adding quotes to each element,
3012 or treating multiple arguments as a single element:
3013
3014 @deffn Composite quote (@dots{})
3015 @deffnx Composite dquote (@dots{})
3016 @deffnx Composite dquote_elt (@dots{})
3017 Takes any number of arguments, and adds quoting.  With @code{quote},
3018 only one level of quoting is added, effectively removing whitespace
3019 after commas and turning multiple arguments into a single string.  With
3020 @code{dquote}, two levels of quoting are added, one around each element,
3021 and one around the list.  And with @code{dquote_elt}, two levels of
3022 quoting are added around each element.
3023 @end deffn
3024
3025 An actual implementation of these three macros is distributed as
3026 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package.  First,
3027 let's examine their usage:
3028
3029 @comment examples
3030 @example
3031 $ @kbd{m4 -I examples}
3032 include(`quote.m4')
3033 @result{}
3034 -quote-dquote-dquote_elt-
3035 @result{}----
3036 -quote()-dquote()-dquote_elt()-
3037 @result{}--`'-`'-
3038 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3039 @result{}-1-`1'-`1'-
3040 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3041 @result{}-1,2-`1',`2'-`1',`2'-
3042 define(`n', `$#')dnl
3043 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3044 @result{}-1-1-2-
3045 dquote(dquote_elt(`1', `2'))
3046 @result{}``1'',``2''
3047 dquote_elt(dquote(`1', `2'))
3048 @result{}``1',`2''
3049 @end example
3050
3051 The last two lines show that when given two arguments, @code{dquote}
3052 results in one string, while @code{dquote_elt} results in two.  Now,
3053 examine the implementation.  Note that @code{quote} and
3054 @code{dquote_elt} make decisions based on their number of arguments, so
3055 that when called without arguments, they result in nothing instead of a
3056 quoted empty string; this is so that it is possible to distinguish
3057 between no arguments and an empty first argument.  @code{dquote}, on the
3058 other hand, results in a string no matter what, since it is still
3059 possible to tell whether it was invoked without arguments based on the
3060 resulting string.
3061
3062 @comment examples
3063 @example
3064 $ @kbd{m4 -I examples}
3065 undivert(`quote.m4')dnl
3066 @result{}divert(`-1')
3067 @result{}# quote(args) - convert args to single-quoted string
3068 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3069 @result{}# dquote(args) - convert args to quoted list of quoted strings
3070 @result{}define(`dquote', ``$@@'')
3071 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3072 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3073 @result{}                             ```$1'',$0(shift($@@))')')
3074 @result{}divert`'dnl
3075 @end example
3076
3077 @cindex nine arguments, more than
3078 @cindex more than nine arguments
3079 @cindex arguments, more than nine
3080 One more useful macro based on @code{shift} allows portably selecting
3081 an arbitrary argument (usually greater than the ninth argument), without
3082 relying on the @acronym{GNU} extension of multi-digit arguments
3083 (@pxref{Arguments}).
3084
3085 @deffn Composite argn (@var{n}, @dots{})
3086 Expands to argument @var{n} out of the remaining arguments.  @var{n}
3087 must be a positive number.  Usually invoked as
3088 @samp{argn(`@var{n}',$@@)}.
3089 @end deffn
3090
3091 It is implemented as:
3092
3093 @example
3094 define(`argn', `ifelse(`$1', 1, ``$2'',
3095   `argn(decr(`$1'), shift(shift($@@)))')')
3096 @result{}
3097 argn(`1', `a')
3098 @result{}a
3099 define(`foo', `argn(`11', $@@)')
3100 @result{}
3101 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3102 @result{}k
3103 @end example
3104
3105 @node Forloop
3106 @section Iteration by counting
3107
3108 @cindex for loops
3109 @cindex loops, counting
3110 @cindex counting loops
3111 Here is an example of a loop macro that implements a simple for loop.
3112
3113 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive.  For each assignment to @var{iterator}, appends @var{text} to
the expansion of the @code{forloop}.  @var{text} may refer to
@var{iterator}.  Any definition of @var{iterator} prior to this
invocation is restored.
3120 @end deffn
3121
3122 It can, for example, be used for simple counting:
3123
3124 @comment examples
3125 @example
3126 $ @kbd{m4 -I examples}
3127 include(`forloop.m4')
3128 @result{}
3129 forloop(`i', `1', `8', `i ')
3130 @result{}1 2 3 4 5 6 7 8@w{ }
3131 @end example
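
Since @var{text} is rescanned on every iteration, it can pass the
iterator on to other macros.  A small sketch, using the builtin
@code{eval} to print the squares of the first four integers:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
forloop(`i', `1', `4', ` eval(i * i)')
@result{} 1 4 9 16
@end example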
3132
For-loops can be nested:
3134
3135 @comment examples
3136 @example
3137 $ @kbd{m4 -I examples}
3138 include(`forloop.m4')
3139 @result{}
3140 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3141 ')
3142 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3143 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3144 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3145 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3146 @result{}
3147 @end example
3148
3149 The implementation of the @code{forloop} macro is fairly
3150 straightforward.  The @code{forloop} macro itself is simply a wrapper,
3151 which saves the previous definition of the first argument, calls the
3152 internal macro @code{@w{_forloop}}, and re-establishes the saved
3153 definition of the first argument.
3154
3155 The macro @code{@w{_forloop}} expands the fourth argument once, and
3156 tests to see if the iterator has reached the final value.  If it has
3157 not finished, it increments the iterator (using the predefined macro
3158 @code{incr}, @pxref{Incr}), and recurses.
3159
3160 Here is an actual implementation of @code{forloop}, distributed as
3161 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3162
3163 @comment examples
3164 @example
3165 $ @kbd{m4 -I examples}
3166 undivert(`forloop.m4')dnl
3167 @result{}divert(`-1')
3168 @result{}# forloop(var, from, to, stmt) - simple version
3169 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3170 @result{}define(`_forloop',
3171 @result{}       `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3172 @result{}divert`'dnl
3173 @end example
3174
3175 Notice the careful use of quotes.  Certain macro arguments are left
3176 unquoted, each for its own reason.  Try to find out @emph{why} these
3177 arguments are left unquoted, and see what happens if they are quoted.
3178 (As presented, these two macros are useful but not very robust for
3179 general use.  They lack even basic error handling for cases like
@var{start} greater than @var{end}, @var{end} not numeric, or
3181 @var{iterator} not being a macro name.  See if you can improve these
3182 macros; or @pxref{Improved forloop, , Answers}).
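
For instance, with the implementation above, a @var{start} greater than
@var{end} is incremented away from @var{end}, so the termination test is
not met and the recursion runs away.  One possible guard, offered only
as a sketch (the name @code{safe_forloop} is hypothetical, and both
bounds are assumed to be valid integer expressions), checks the bounds
with @code{eval} before looping:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`safe_forloop',
`ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')
@result{}
safe_forloop(`i', `1', `3', ` i')
@result{} 1 2 3
safe_forloop(`i', `5', `1', ` i')
@result{}
@end example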
3183
3184 @node Foreach
3185 @section Iteration by list contents
3186
3187 @cindex for each loops
3188 @cindex loops, list iteration
3189 @cindex iterating over lists
3190 Here is an example of a loop macro that implements list iteration.
3191
3192 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3193 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}.  In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses.  In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string.  For each assignment to @var{iterator},
appends @var{text} to the overall expansion.  @var{text} may refer to
@var{iterator}.  Any definition of @var{iterator} prior to this
invocation is restored.
3203 @end deffn
3204
3205 As an example, this displays each word in a list inside of a sentence,
3206 using an implementation of @code{foreach} distributed as
3207 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3208 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3209
3210 @comment examples
3211 @example
3212 $ @kbd{m4 -I examples}
3213 include(`foreach.m4')
3214 @result{}
3215 foreach(`x', (foo, bar, foobar), `Word was: x
3216 ')dnl
3217 @result{}Word was: foo
3218 @result{}Word was: bar
3219 @result{}Word was: foobar
3220 include(`foreachq.m4')
3221 @result{}
3222 foreachq(`x', `foo, bar, foobar', `Word was: x
3223 ')dnl
3224 @result{}Word was: foo
3225 @result{}Word was: bar
3226 @result{}Word was: foobar
3227 @end example
3228
3229 It is possible to be more complex; each element of the @var{paren-list}
3230 or @var{quote-list} can itself be a list, to pass as further arguments
3231 to a helper macro.  This example generates a shell case statement:
3232
3233 @comment examples
3234 @example
3235 $ @kbd{m4 -I examples}
3236 include(`foreach.m4')
3237 @result{}
3238 define(`_case', `  $1)
3239     $2=" $1";;
3240 ')dnl
3241 define(`_cat', `$1$2')dnl
3242 case $`'1 in
3243 @result{}case $1 in
3244 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3245         `_cat(`_case', x)')dnl
3246 @result{}  a)
3247 @result{}    vara=" a";;
3248 @result{}  b)
3249 @result{}    varb=" b";;
3250 @result{}  c)
3251 @result{}    varc=" c";;
3252 esac
3253 @result{}esac
3254 @end example
3255
3256 The implementation of the @code{foreach} macro is a bit more involved;
3257 it is a wrapper around two helper macros.  First, @code{@w{_arg1}} is
3258 needed to grab the first element of a list.  Second,
3259 @code{@w{_foreach}} implements the recursion, successively walking
3260 through the original list.  Here is a simple implementation of
3261 @code{foreach}:
3262
3263 @comment examples
3264 @example
3265 $ @kbd{m4 -I examples}
3266 undivert(`foreach.m4')dnl
3267 @result{}divert(`-1')
3268 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3269 @result{}#   parenthesized list, simple version
3270 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3271 @result{}define(`_arg1', `$1')
3272 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3273 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3274 @result{}divert`'dnl
3275 @end example
3276
3277 Unfortunately, that implementation is not robust to macro names as list
3278 elements.  Each iteration of @code{@w{_foreach}} is stripping another
3279 layer of quotes, leading to erratic results if list elements are not
3280 already fully expanded.  The first cut at implementing @code{foreachq}
3281 takes this into account.  Also, when using quoted elements in a
3282 @var{paren-list}, the overall list must be quoted.  A @var{quote-list}
3283 has the nice property of requiring fewer characters to create a list
3284 containing the same quoted elements.  To see the difference between the
3285 two macros, we attempt to pass double-quoted macro names in a list,
3286 expecting the macro name on output after one layer of quotes is removed
3287 during list iteration and the final layer removed during the final
3288 rescan:
3289
3290 @comment examples
3291 @example
3292 $ @kbd{m4 -I examples}
3293 define(`a', `1')define(`b', `2')define(`c', `3')
3294 @result{}
3295 include(`foreach.m4')
3296 @result{}
3297 include(`foreachq.m4')
3298 @result{}
3299 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3300 ')
3301 @result{}1
3302 @result{}(2)1
3303 @result{}
3304 @result{}, x
3305 @result{})
3306 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3307 ')dnl
3308 @result{}a
3309 @result{}(b
3310 @result{}c)
3311 @end example
3312
3313 Obviously, @code{foreachq} did a better job; here is its implementation:
3314
3315 @comment examples
3316 @example
3317 $ @kbd{m4 -I examples}
3318 undivert(`foreachq.m4')dnl
3319 @result{}include(`quote.m4')dnl
3320 @result{}divert(`-1')
3321 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3322 @result{}#   quoted list, simple version
3323 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3324 @result{}define(`_arg1', `$1')
3325 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3326 @result{}  `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3327 @result{}divert`'dnl
3328 @end example
3329
3330 Notice that @code{@w{_foreachq}} had to use the helper macro
3331 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3332 embedded @code{ifelse} call does not go haywire if a list element
3333 contains a comma.  Unfortunately, this implementation of @code{foreachq}
3334 has its own severe flaw.  Whereas the @code{foreach} implementation was
3335 linear, this macro is quadratic in the number of list elements, and is
3336 much more likely to trip up the limit set by the command line option
3337 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3338 Invoking m4}).  Additionally, this implementation does not expand
3339 @samp{defn(`@var{iterator}')} very well, when compared with
3340 @code{foreach}.
3341
3342 @comment examples
3343 @example
3344 $ @kbd{m4 -I examples}
3345 include(`foreach.m4')include(`foreachq.m4')
3346 @result{}
3347 foreach(`name', `(`a', `b')', ` defn(`name')')
3348 @result{} a b
3349 foreachq(`name', ``a', `b'', ` defn(`name')')
3350 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3351 @end example
3352
3353 It is possible to have robust iteration with linear behavior and sane
3354 @var{iterator} contents for either list style.  See if you can learn
3355 from the best elements of both of these implementations to create robust
3356 macros (or @pxref{Improved foreach, , Answers}).
3357
3358 @node Debugging
3359 @chapter How to debug macros and input
3360
3361 @cindex debugging macros
3362 @cindex macros, debugging
3363 Macros written for @code{m4} often do not work as intended on the
3364 first try (as is the case with most programming languages).
3365 Fortunately, there is support for macro debugging in @code{m4}.
3366
3367 @menu
3368 * Dumpdef::                     Displaying macro definitions
3369 * Trace::                       Tracing macro calls
3370 * Debug Levels::                Controlling debugging output
3371 * Debug Output::                Saving debugging output
3372 @end menu
3373
3374 @node Dumpdef
3375 @section Displaying macro definitions
3376
3377 @cindex displaying macro definitions
3378 @cindex macros, displaying definitions
3379 @cindex definitions, displaying macro
3380 @cindex standard error, output to
3381 If you want to see what a name expands into, you can use the builtin
3382 @code{dumpdef}:
3383
3384 @deffn Builtin dumpdef (@ovar{names@dots{}})
3385 Accepts any number of arguments.  If called without any arguments,
3386 it displays the definitions of all known names, otherwise it displays
3387 the definitions of the @var{names} given.  The output is printed to the
3388 current debug file (usually standard error), and is sorted by name.  If
3389 an unknown name is encountered, a warning is printed.
3390
3391 The expansion of @code{dumpdef} is void.
3392 @end deffn
3393
3394 @example
3395 $ @kbd{m4 -d}
3396 define(`foo', `Hello world.')
3397 @result{}
3398 dumpdef(`foo')
3399 @error{}foo:@tabchar{}`Hello world.'
3400 @result{}
3401 dumpdef(`define')
3402 @error{}define:@tabchar{}<define>
3403 @result{}
3404 @end example
3405
3406 The last example shows how builtin macro definitions are displayed.
3407 The definition that is dumped corresponds to what would occur if the
3408 macro were to be called at that point, even if other definitions are
3409 still live due to redefining a macro during argument collection.
3410
3411 @example
3412 $ @kbd{m4 -d}
3413 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
3414 @result{}
3415 f(popdef(`f')dumpdef(`f'))
3416 @error{}f:@tabchar{}``$0'1'
3417 @result{}f2
3418 f(popdef(`f')dumpdef(`f'))
3419 @error{}m4:stdin:3: Warning: dumpdef: undefined macro `f'
3420 @result{}f1
3421 @end example
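
As a small sketch of the sorting behavior described above, passing
several names at once lists their definitions in name order rather than
in argument order.  The macro names @samp{a} and @samp{b} are arbitrary
and chosen only for illustration.

@example
$ @kbd{m4 -d}
define(`b', `2')define(`a', `1')
@result{}
dumpdef(`b', `a')
@error{}a:@tabchar{}`1'
@error{}b:@tabchar{}`2'
@result{}
@end example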
3422
3423 @xref{Debug Levels}, for information on controlling the details of the
3424 display.
3425
3426 @node Trace
3427 @section Tracing macro calls
3428
3429 @cindex tracing macro expansion
3430 @cindex macro expansion, tracing
3431 @cindex expansion, tracing macro
3432 @cindex standard error, output to
3433 It is possible to trace macro calls and expansions through the builtins
3434 @code{traceon} and @code{traceoff}:
3435
3436 @deffn Builtin traceon (@ovar{names@dots{}})
3437 @deffnx Builtin traceoff (@ovar{names@dots{}})
3438 When called without any arguments, @code{traceon} and @code{traceoff}
3439 will turn tracing on and off, respectively, for all currently defined
3440 macros.
3441
3442 When called with arguments, only the macros listed in @var{names} are
3443 affected, whether or not they are currently defined.
3444
3445 The expansion of @code{traceon} and @code{traceoff} is void.
3446 @end deffn
3447
3448 Whenever a traced macro is called and the arguments have been collected,
3449 the call is displayed.  If the expansion of the macro call is not void,
3450 the expansion can be displayed after the call.  The output is printed
3451 to the current debug file (defaulting to standard error, @pxref{Debug
3452 Output}).
3453
3454 @example
3455 $ @kbd{m4 -d}
3456 define(`foo', `Hello World.')
3457 @result{}
3458 define(`echo', `$@@')
3459 @result{}
3460 traceon(`foo', `echo')
3461 @result{}
3462 foo
3463 @error{}m4trace: -1- foo -> `Hello World.'
3464 @result{}Hello World.
3465 echo(`gnus', `and gnats')
3466 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
3467 @result{}gnus,and gnats
3468 @end example
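
The following sketch illustrates what ``currently defined'' means when
@code{traceon} is called without arguments: builtins such as
@code{define} are traced along with user macros, while a macro that is
first defined afterwards is not.  The names @samp{foo} and @samp{bar}
are arbitrary.

@example
$ @kbd{m4 -d}
define(`foo', `FOO')
@result{}
traceon
@result{}
define(`bar', `BAR')
@error{}m4trace: -1- define(`bar', `BAR')
@result{}
foo
@error{}m4trace: -1- foo -> `FOO'
@result{}FOO
bar
@result{}BAR
@end example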
3469
3470 The number between dashes is the depth of the expansion.  It is one most
3471 of the time, signifying an expansion at the outermost level, but it
3472 increases when macro arguments contain unquoted macro calls.  The
3473 maximum number that will appear between dashes is controlled by the
3474 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
3475 , Invoking m4}).  Additionally, the option @option{--trace} (or
3476 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
3477 parsing input.
3478
3479 @comment The explicit -dp neutralizes the testsuite default of -d.
3480 @comment options: -dp -L3 -tifelse
3481 @comment status: 1
3482 @example
3483 $ @kbd{m4 -L 3 -t ifelse}
3484 ifelse(`one level')
3485 @error{}m4trace: -1- ifelse
3486 @result{}
3487 ifelse(ifelse(ifelse(`three levels')))
3488 @error{}m4trace: -3- ifelse
3489 @error{}m4trace: -2- ifelse
3490 @error{}m4trace: -1- ifelse
3491 @result{}
3492 ifelse(ifelse(ifelse(ifelse(`four levels'))))
3493 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
3494 @end example
3495
3496 Tracing by name is an attribute that is preserved whether the macro is
3497 defined or not.  This allows the selection of macros to trace before
3498 those macros are defined.
3499
3500 @example
3501 $ @kbd{m4 -d}
3502 traceoff(`foo')
3503 @result{}
3504 traceon(`foo')
3505 @result{}
3506 foo
3507 @result{}foo
3508 define(`foo', `bar')
3509 @result{}
3510 foo
3511 @error{}m4trace: -1- foo -> `bar'
3512 @result{}bar
3513 undefine(`foo')
3514 @result{}
3515 ifdef(`foo', `yes', `no')
3516 @result{}no
3517 indir(`foo')
3518 @error{}m4:stdin:8: Warning: indir: undefined macro `foo'
3519 @result{}
3520 define(`foo', `blah')
3521 @result{}
3522 foo
3523 @error{}m4trace: -1- foo -> `blah'
3524 @result{}blah
3525 traceoff
3526 @result{}
3527 foo
3528 @result{}blah
3529 @end example
3530
3531 Tracing even works on builtins.  However, @code{defn} (@pxref{Defn})
3532 does not transfer tracing status.
3533
3534 @example
3535 $ @kbd{m4 -d}
3536 traceon(`eval', `m4_divnum')
3537 @result{}
3538 define(`m4_eval', defn(`eval'))
3539 @result{}
3540 define(`m4_divnum', defn(`divnum'))
3541 @result{}
3542 eval(divnum)
3543 @error{}m4trace: -1- eval(`0') -> `0'
3544 @result{}0
3545 m4_eval(m4_divnum)
3546 @error{}m4trace: -2- m4_divnum -> `0'
3547 @result{}0
3548 @end example
3549
3550 @xref{Debug Levels}, for information on controlling the details of the
3551 display.
3552
3553 @node Debug Levels
3554 @section Controlling debugging output
3555
3556 @cindex controlling debugging output
3557 @cindex debugging output, controlling
3558 The @option{-d} option to @code{m4} (or @option{--debug},
3559 @pxref{Debugging options, , Invoking m4}) controls the amount of detail
3560 presented in three
3561 categories of output.  Trace output is requested by @code{traceon}
3562 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
3563 relation to a macro invocation.  Debug output tracks useful events not
3564 associated with a macro invocation, and each line is prefixed by
3565 @samp{m4debug:}.  Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
3566 affected, with no prefix added to the output lines.
3567
3568 The @var{flags} following the option can be one or more of the
3569 following:
3570
3571 @table @code
3572 @item a
3573 In trace output, show the actual arguments that were collected before
3574 invoking the macro.  This applies to all macro calls if the @samp{t}
3575 flag is used, otherwise only the macros covered by calls of
3576 @code{traceon}.  Arguments are subject to length truncation specified by
3577 the command line option @option{--arglength} (or @option{-l}).
3578
3579 @item c
3580 In trace output, show several trace lines for each macro call.  A line
3581 is shown when the macro is seen, but before the arguments are collected;
3582 a second line when the arguments have been collected and a third line
3583 after the call has completed.
3584
3585 @item e
3586 In trace output, show the expansion of each macro call, if it is not
3587 void.  This applies to all macro calls if the @samp{t} flag is used,
3588 otherwise only the macros covered by calls of @code{traceon}.  The
3589 expansion is subject to length truncation specified by the command line
3590 option @option{--arglength} (or @option{-l}).
3591
3592 @item f
3593 In debug and trace output, include the name of the current input file in
3594 the output line.
3595
3596 @item i
3597 In debug output, print a message each time the current input file is
3598 changed.
3599
3600 @item l
3601 In debug and trace output, include the current input line number in the
3602 output line.
3603
3604 @item p
3605 In debug output, print a message when a named file is found through the
3606 path search mechanism (@pxref{Search Path}), giving the actual file name
3607 used.
3608
3609 @item q
3610 In trace and dumpdef output, quote actual arguments and macro expansions
3611 in the display with the current quotes.  This is useful in connection
3612 with the @samp{a} and @samp{e} flags above.
3613
3614 @item t
3615 In trace output, trace all macro calls made in this invocation of
3616 @code{m4}, regardless of the settings of @code{traceon}.
3617
3618 @item x
3619 In trace output, add a unique `macro call id' to each line of the trace
3620 output.  This is useful in connection with the @samp{c} flag above.
3621
3622 @item V
3623 A shorthand for all of the above flags.
3624 @end table
3625
3626 If no flags are specified with the @option{-d} option, the default is
3627 @samp{aeq}.  The examples throughout this manual assume the default
3628 flags.
3629
3630 @cindex @acronym{GNU} extensions
3631 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
3632 the debugging output format:
3633
3634 @deffn Builtin debugmode (@ovar{flags})
3635 The argument @var{flags} should be a subset of the letters listed above.
3636 As special cases, if the argument starts with a @samp{+}, the flags are
3637 added to the current debug flags, and if it starts with a @samp{-}, they
3638 are removed.  If no argument is present, all debugging flags are cleared
3639 (as if no @option{-d} was given), and with an empty argument the flags
3640 are reset to the default of @samp{aeq}.
3641
3642 The expansion of @code{debugmode} is void.
3643 @end deffn
3644
3645 @comment The explicit -dp neutralizes the testsuite default of -d.
3646 @comment options: -dp
3647 @example
3648 $ @kbd{m4}
3649 define(`foo', `FOO')
3650 @result{}
3651 traceon(`foo')
3652 @result{}
3653 debugmode()
3654 @result{}
3655 foo
3656 @error{}m4trace: -1- foo -> `FOO'
3657 @result{}FOO
3658 debugmode
3659 @result{}
3660 foo
3661 @error{}m4trace: -1- foo
3662 @result{}FOO
3663 debugmode(`+l')
3664 @result{}
3665 foo
3666 @error{}m4trace:8: -1- foo
3667 @result{}FOO
3668 @end example
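
As a further minimal sketch of adjusting flags incrementally, the
following session starts from the default @samp{aeq} flags, adds the
@samp{f} and @samp{l} flags so that each trace line names the input
file and line, and then removes them again along with @samp{q}, so that
the expansion is shown without quotes.  The macro @samp{foo} is
arbitrary.

@example
$ @kbd{m4 -d}
define(`foo', `FOO')
@result{}
traceon(`foo')
@result{}
debugmode(`+fl')
@result{}
foo
@error{}m4trace:stdin:4: -1- foo -> `FOO'
@result{}FOO
debugmode(`-flq')
@result{}
foo
@error{}m4trace: -1- foo -> FOO
@result{}FOO
@end example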
3669
3670 The following example demonstrates the behavior of length truncation,
3671 when specified on the command line.  Note that each argument and the
3672 final result are individually truncated.  Also, the special tokens for
3673 builtin functions are not truncated.
3674
3675 @comment options: -l6
3676 @example
3677 $ @kbd{m4 -d -l 6}
3678 define(`echo', `$@@')debugmode(`+t')
3679 @result{}
3680 echo(`1', `long string')
3681 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
3682 @result{}1,long string
3683 indir(`echo', defn(`changequote'))
3684 @error{}m4trace: -2- defn(`change...')
3685 @error{}m4trace: -1- indir(`echo', <changequote>) -> ``<changequote>''
3686 @result{}
3687 @end example
3688
3689 @node Debug Output
3690 @section Saving debugging output
3691
3692 @cindex saving debugging output
3693 @cindex debugging output, saving
3694 @cindex output, saving debugging
3695 @cindex @acronym{GNU} extensions
3696 Debug and tracing output can be redirected to files using either the
3697 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
3698 Invoking m4}), or with the builtin macro @code{debugfile}:
3699
3700 @deffn Builtin debugfile (@ovar{file})
3701 Sends all further debug and trace output to @var{file}, opened in append
3702 mode.  If @var{file} is the empty string, debug and trace output are
3703 discarded.  If @code{debugfile} is called without any arguments, debug
3704 and trace output are sent to standard error.  This does not affect
3705 warnings, error messages, or @code{errprint} output, which are
3706 always sent to standard error.  If @var{file} cannot be opened, the
3707 current debug file is unchanged, and an error is issued.
3708
3709 The expansion of @code{debugfile} is void.
3710 @end deffn
3711
3712 @example
3713 $ @kbd{m4 -d}
3714 traceon(`divnum')
3715 @result{}
3716 divnum(`extra')
3717 @error{}m4:stdin:2: Warning: divnum: extra arguments ignored: 1 > 0
3718 @error{}m4trace: -1- divnum(`extra') -> `0'
3719 @result{}0
3720 debugfile()
3721 @result{}
3722 divnum(`extra')
3723 @error{}m4:stdin:4: Warning: divnum: extra arguments ignored: 1 > 0
3724 @result{}0
3725 debugfile
3726 @result{}
3727 divnum
3728 @error{}m4trace: -1- divnum -> `0'
3729 @result{}0
3730 @end example
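
Redirecting trace output to a named file can be sketched as follows.
This example rests on two assumptions: that @file{trace.out} (a
hypothetical file name) is writable and does not already exist in the
current directory, since the file is opened in append mode; and that
the @acronym{GNU} extension of passing a file name to @code{undivert}
is available, which is used here to copy the file contents back to the
output without rescanning them.

@example
$ @kbd{m4 -d}
traceon(`divnum')
@result{}
debugfile(`trace.out')
@result{}
divnum
@result{}0
debugfile
@result{}
undivert(`trace.out')dnl
@result{}m4trace: -1- divnum -> `0'
@end example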
3731
3732 @node Input Control
3733 @chapter Input control
3734
3735 This chapter describes various builtin macros for controlling the input
3736 to @code{m4}.
3737
3738 @menu
3739 * Dnl::                         Deleting whitespace in input
3740 * Changequote::                 Changing the quote characters
3741 * Changecom::                   Changing the comment delimiters
3742 * Changeword::                  Changing the lexical structure of words
3743 * M4wrap::                      Saving text until end of input
3744 @end menu
3745
3746 @node Dnl
3747 @section Deleting whitespace in input
3748
3749 @cindex deleting whitespace in input
3750 @cindex discarding input
3751 @cindex input, discarding
3752 The builtin @code{dnl} stands for ``Discard to Next Line'':
3753
3754 @deffn Builtin dnl
3755 All characters, up to and including the next newline, are discarded
3756 without performing any macro expansion.  A warning is issued if the end
3757 of the file is encountered without a newline.
3758
3759 The expansion of @code{dnl} is void.
3760 @end deffn
3761
3762 It is often used in connection with @code{define}, to remove the
3763 newline that follows the call to @code{define}.  Thus
3764
3765 @example
3766 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
3767 foo
3768 @result{}Macro foo.
3769 @end example
3770
3771 The input up to and including the next newline is discarded, as opposed
3772 to the way comments are treated (@pxref{Comments}).
3773
3774 Usually, @code{dnl} is immediately followed by an end of line or some
3775 other whitespace.  @acronym{GNU} @code{m4} will produce a warning diagnostic if
3776 @code{dnl} is followed by an open parenthesis.  In this case, @code{dnl}
3777 will collect and process all arguments, looking for a matching close
3778 parenthesis.  All predictable side effects resulting from this
3779 collection will take place.  @code{dnl} will return no output.  The
3780 input following the matching close parenthesis up to and including the
3781 next newline, on whatever line containing it, will still be discarded.
3782
3783 @example
3784 dnl(`args are ignored, but side effects occur',
3785 define(`foo', `like this')) while this text is ignored: undefine(`foo')
3786 @error{}m4:stdin:1: Warning: dnl: extra arguments ignored: 2 > 0
3787 See how `foo' was defined, foo?
3788 @result{}See how foo was defined, like this?
3789 @end example
3790
3791 If the end of file is encountered without a newline character, a
3792 warning is issued and @code{dnl} stops consuming input.
3793
3794 @example
3795 m4wrap(`m4wrap(`2 hi
3796 ')0 hi dnl 1 hi')
3797 @result{}
3798 define(`hi', `HI')
3799 @result{}
3800 ^D
3801 @error{}m4:stdin:1: Warning: dnl: end of file treated as newline
3802 @result{}0 HI 2 HI
3803 @end example
3804
3805 @node Changequote
3806 @section Changing the quote characters
3807
3808 @cindex changing quote delimiters
3809 @cindex quote delimiters, changing
3810 @cindex delimiters, changing
3811 The default quote delimiters can be changed with the builtin
3812 @code{changequote}:
3813
3814 @deffn Builtin changequote (@dvar{start, `}, @dvar{end, '})
3815 This sets @var{start} as the new begin-quote delimiter and @var{end} as
3816 the new end-quote delimiter.  If both arguments are missing, the default
3817 quotes (@code{`} and @code{'}) are used.  If @var{start} is void, then
3818 quoting is disabled.  Otherwise, if @var{end} is missing or void, the
3819 default end-quote delimiter (@code{'}) is used.  The quote delimiters
3820 can be of any length.
3821
3822 The expansion of @code{changequote} is void.
3823 @end deffn
3824
3825 @example
3826 changequote(`[', `]')
3827 @result{}
3828 define([foo], [Macro [foo].])
3829 @result{}
3830 foo
3831 @result{}Macro foo.
3832 @end example
3833
3834 The quotation strings can safely contain eight-bit characters.
3835 @ignore
3836 @comment Yuck.  I know of no clean way to render an 8-bit character in
3837 @comment both info and dvi.  This example uses the `open-guillemot' and
3838 @comment `close-guillemot' characters of the Latin-1 character set.
3839
3840 @example
3841 define(`a', `b')
3842 @result{}
3843 «a»
3844 @result{}«b»
3845 changequote(`«', `»')
3846 @result{}
3847 «a»
3848 @result{}a
3849 @end example
3850 @end ignore
3851 If no single character is appropriate, @var{start} and @var{end} can be
3852 of any length.  Other implementations cap the delimiter length to five
3853 characters, but @acronym{GNU} has no inherent limit.
3854
3855 @example
3856 changequote(`[[[', `]]]')
3857 @result{}
3858 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
3859 @result{}
3860 foo
3861 @result{}Macro [[foo]].
3862 @end example
3863
3864 Calling @code{changequote} with @var{start} as the empty string will
3865 effectively disable the quoting mechanism, leaving no way to quote text.
3866 However, using an empty string is not portable, as some other
3867 implementations of @code{m4} revert to the default quoting, while others
3868 preserve the prior non-empty delimiter.  If @var{start} is not empty,
3869 then an empty @var{end} will use the default end-quote delimiter of
3870 @samp{'}, as otherwise, it would be impossible to end a quoted string.
3871 Again, this is not portable, as some other @code{m4} implementations
3872 reuse @var{start} as the end-quote delimiter, while others preserve the
3873 previous non-empty value.  Omitting both arguments restores the default
3874 begin-quote and end-quote delimiters; fortunately this behavior is
3875 portable to all implementations of @code{m4}.
3876
3877 @example
3878 define(`foo', `Macro `FOO'.')
3879 @result{}
3880 changequote(`', `')
3881 @result{}
3882 foo
3883 @result{}Macro `FOO'.
3884 `foo'
3885 @result{}`Macro `FOO'.'
3886 changequote(`,)
3887 @result{}
3888 foo
3889 @result{}Macro FOO.
3890 @end example
3891
3892 There is no way in @code{m4} to quote a string containing an unmatched
3893 begin-quote, except using @code{changequote} to change the current
3894 quotes.
3895
3896 If the quotes should be changed from, say, @samp{[} to @samp{[[},
3897 temporary quote characters have to be defined.  To achieve this, two
3898 calls of @code{changequote} must be made, one for the temporary quotes
3899 and one for the new quotes.
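
The two-step change can be sketched as follows.  The temporary
delimiters @samp{<<} and @samp{>>} are an arbitrary choice for this
illustration; any pair that does not clash with the old or new quotes
would serve.  The intermediate step is needed because @samp{[[} cannot
be written as a balanced quoted string while @samp{[} is still the
begin-quote delimiter.

@example
changequote(`[', `]')
@result{}
define([foo], [FOO])
@result{}
changequote([<<], [>>])
@result{}
changequote(<<[[>>, <<]]>>)
@result{}
[[foo]] foo
@result{}foo FOO
@end example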
3900
3901 Macros are recognized in preference to the begin-quote string, so if a
3902 prefix of @var{start} can be recognized as part of a potential macro
3903 name, the quoting mechanism is effectively disabled.  Unless you use
3904 @code{changeword} (@pxref{Changeword}), this means that @var{start}
3905 should not begin with a letter, digit, or @samp{_} (underscore).
3906 However, even though quoted strings are not recognized, the quote
3907 characters can still be discerned in macro expansion and in trace
3908 output.
3909
3910 @example
3911 define(`echo', `$@@')
3912 @result{}
3913 define(`hi', `HI')
3914 @result{}
3915 changequote(`q', `Q')
3916 @result{}
3917 q hi Q hi
3918 @result{}q HI Q HI
3919 echo(hi)
3920 @result{}qHIQ
3921 changequote
3922 @result{}
3923 changequote(`-', `EOF')
3924 @result{}
3925 - hi EOF hi
3926 @result{} hi  HI
3927 changequote
3928 @result{}
3929 changequote(`1', `2')
3930 @result{}
3931 hi1hi2
3932 @result{}hi1hi2
3933 hi 1hi2
3934 @result{}HI hi
3935 @end example
3936
3937 Quotes are recognized in preference to argument collection.  In
3938 particular, if @var{start} is a single @samp{(}, then argument
3939 collection is effectively disabled.  For portability with other
3940 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
3941 @samp{)} as the first character in @var{start}.
3942
3943 @example
3944 define(`echo', `$#:$@@:')
3945 @result{}
3946 define(`hi', `HI')
3947 @result{}
3948 changequote(`(',`)')
3949 @result{}
3950 echo(hi)
3951 @result{}0::hi
3952 changequote
3953 @result{}
3954 changequote(`((', `))')
3955 @result{}
3956 echo(hi)
3957 @result{}1:HI:
3958 echo((hi))
3959 @result{}0::hi
3960 changequote
3961 @result{}
3962 changequote(`,', `)')
3963 @result{}
3964 echo(hi,hi)bye)
3965 @result{}1:HIhibye:
3966 @end example
3967
3968 However, if you are not worried about portability, using @samp{(} and
3969 @samp{)} as quoting characters has an interesting property---you can use
3970 it to compute a quoted string containing the expansion of any quoted
3971 text, as long as the expansion results in both balanced quotes and
3972 balanced parentheses.  The trick is realizing @code{expand} uses
3973 @samp{$1} unquoted, to trigger its expansion using the normal quoting
3974 characters, but uses extra parentheses to group unquoted commas that
3975 occur in the expansion without consuming whitespace following those
3976 commas.  Then @code{_expand} uses @code{changequote} to convert the
3977 extra parentheses back into quoting characters.  Note that it takes two
3978 more @code{changequote} invocations to restore the original quotes.
3979 Contrast the behavior on whitespace when using @samp{$*}, via
3980 @code{quote}, to attempt the same task.
3981
3982 @example
3983 changequote(`[', `]')dnl
3984 define([a], [1, (b)])dnl
3985 define([b], [2])dnl
3986 define([quote], [[$*]])dnl
3987 define([expand], [_$0(($1))])dnl
3988 define([_expand],
3989   [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
3990 expand([a, a, [a, a], [[a, a]]])
3991 @result{}1, (2), 1, (2), a, a, [a, a]
3992 quote(a, a, [a, a], [[a, a]])
3993 @result{}1,(2),1,(2),a, a,[a, a]
3994 @end example
3995
3996 If @var{end} is a prefix of @var{start}, the end-quote will be
3997 recognized in preference to a nested begin-quote.  In particular,
3998 changing the quotes to have the same string for @var{start} and
3999 @var{end} disables nesting of quotes.  When quote nesting is disabled,
4000 it is impossible to double-quote strings across macro expansions, so
4001 using the same string is not done very often.
4002
4003 @example
4004 define(`hi', `HI')
4005 @result{}
4006 changequote(`""', `"')
4007 @result{}
4008 ""hi"""hi"
4009 @result{}hihi
4010 ""hi" ""hi"
4011 @result{}hi hi
4012 ""hi"" "hi"
4013 @result{}hi" "HI"
4014 changequote
4015 @result{}
4016 `hi`hi'hi'
4017 @result{}hi`hi'hi
4018 changequote(`"', `"')
4019 @result{}
4020 "hi"hi"hi"
4021 @result{}hiHIhi
4022 @end example
4023
4024 @ignore
4025 @comment And another stress test, not worth documenting in the manual.
4026 @example
4027 define(`aaaaaaaaaaaaaaaaaaaa', `A')define(`q', `"$@@"')
4028 @result{}
4029 changequote(`"', `"')
4030 @result{}
4031 q(q("aaaaaaaaaaaaaaaaaaaa", "a"))
4032 @result{}A,a
4033 @end example
4034 @end ignore
4035
4036 It is an error if the end of file occurs within a quoted string.
4037
4038 @comment status: 1
4039 @example
4040 `hello world'
4041 @result{}hello world
4042 `dangling quote
4043 ^D
4044 @error{}m4:stdin:2: end of file in string
4045 @end example
4046
4047 @comment status: 1
4048 @example
4049 ifelse(`dangling quote
4050 ^D
4051 @error{}m4:stdin:1: ifelse: end of file in string
4052 @end example
4053
4054 @node Changecom
4055 @section Changing the comment delimiters
4056
4057 @cindex changing comment delimiters
4058 @cindex comment delimiters, changing
4059 @cindex delimiters, changing
4060 The default comment delimiters can be changed with the builtin
4061 macro @code{changecom}:
4062
4063 @deffn Builtin changecom (@ovar{start}, @dvar{end, @key{NL}})
4064 This sets @var{start} as the new begin-comment delimiter and @var{end}
4065 as the new end-comment delimiter.  If both arguments are missing, or
4066 @var{start} is void, then comments are disabled.  Otherwise, if
4067 @var{end} is missing or void, the default end-comment delimiter of
4068 newline is used.  The comment delimiters can be of any length.
4069
4070 The expansion of @code{changecom} is void.
4071 @end deffn
4072
4073 @example
4074 define(`comment', `COMMENT')
4075 @result{}
4076 # A normal comment
4077 @result{}# A normal comment
4078 changecom(`/*', `*/')
4079 @result{}
4080 # Not a comment anymore
4081 @result{}# Not a COMMENT anymore
4082 But: /* this is a comment now */ while this is not a comment
4083 @result{}But: /* this is a comment now */ while this is not a COMMENT
4084 @end example
4085
4086 @cindex comments, copied to output
4087 Note how comments are copied to the output, much as if they were quoted
4088 strings.  If you want the text inside a comment expanded, quote the
4089 begin-comment delimiter.
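
As a brief sketch of that technique, quoting the default begin-comment
delimiter @samp{#} turns it into ordinary text, so the rest of the line
is scanned for macros as usual.  The macro @samp{comment} is defined
here only for illustration.

@example
define(`comment', `COMMENT')
@result{}
# This comment is copied verbatim
@result{}# This comment is copied verbatim
`#' This comment text is expanded
@result{}# This COMMENT text is expanded
@end example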
4090
4091 Calling @code{changecom} without any arguments, or with @var{start} as
4092 the empty string, will effectively disable the commenting mechanism.  To
4093 restore the original comment start of @samp{#}, you must explicitly ask
4094 for it.  If @var{start} is not empty, then an empty @var{end} will use
4095 the default end-comment delimiter of newline, as otherwise, it would be
4096 impossible to end a comment.  However, this is not portable, as some
4097 other @code{m4} implementations preserve the previous non-empty
4098 delimiters instead.
4099
4100 @example
4101 define(`comment', `COMMENT')
4102 @result{}
4103 changecom
4104 @result{}
4105 # Not a comment anymore
4106 @result{}# Not a COMMENT anymore
4107 changecom(`#', `')
4108 @result{}
4109 # comment again
4110 @result{}# comment again
4111 @end example
4112
4113 The comment strings can safely contain eight-bit characters.
4114 @ignore
4115 @comment Yuck.  I know of no clean way to render an 8-bit character in
4116 @comment both info and dvi.  This example uses the `open-guillemot' and
4117 @comment `close-guillemot' characters of the Latin-1 character set.
4118
4119 @example
4120 define(`a', `b')
4121 @result{}
4122 «a»
4123 @result{}«b»
4124 changecom(`«', `»')
4125 @result{}
4126 «a»
4127 @result{}«a»
4128 @end example
4129 @end ignore
4130 If no single character is appropriate, @var{start} and @var{end} can be
4131 of any length.  Other implementations cap the delimiter length to five
4132 characters, but @acronym{GNU} has no inherent limit.
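
A minimal sketch of multi-character comment delimiters follows; the
delimiters @samp{<!--} and @samp{-->} are chosen only as a familiar
illustration, and the macro @samp{hi} is arbitrary.

@example
define(`hi', `HI')
@result{}
changecom(`<!--', `-->')
@result{}
hi <!-- hi --> hi
@result{}HI <!-- hi --> HI
@end example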
4133
4134 Comments are recognized in preference to macros.  However, this is not
4135 compatible with other implementations, where macros and even quoting
4136 take precedence over comments, so it may change in a future release.
4137 For portability, this means that @var{start} should not begin with a
4138 letter, digit, or @samp{_} (underscore), and that neither the
4139 start-quote nor the start-comment string should be a prefix of the
4140 other.
4141
4142 @example
4143 define(`hi', `HI')
4144 @result{}
4145 define(`hi1hi2', `hello')
4146 @result{}
4147 changecom(`q', `Q')
4148 @result{}
4149 q hi Q hi
4150 @result{}q hi Q HI
4151 changecom(`1', `2')
4152 @result{}
4153 hi1hi2
4154 @result{}hello
4155 hi 1hi2
4156 @result{}HI 1hi2
4157 @end example
4158
4159 Comments are recognized in preference to argument collection.  In
4160 particular, if @var{start} is a single @samp{(}, then argument
4161 collection is effectively disabled.  For portability with other
4162 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
4163 @samp{)} as the first character in @var{start}.
4164
4165 @example
4166 define(`echo', `$#:$*:$@@:')
4167 @result{}
4168 define(`hi', `HI')
4169 @result{}
4170 changecom(`(',`)')
4171 @result{}
4172 echo(hi)
4173 @result{}0:::(hi)
4174 changecom
4175 @result{}
4176 changecom(`((', `))')
4177 @result{}
4178 echo(hi)
4179 @result{}1:HI:HI:
4180 echo((hi))
4181 @result{}0:::((hi))
4182 changecom(`,', `)')
4183 @result{}
4184 echo(hi,hi)bye)
4185 @result{}1:HI,hi)bye:HI,hi)bye:
4186 changecom
4187 @result{}
4188 echo(hi,`,`'hi',hi)
4189 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
4190 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
4191 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
4192 @end example
4193
4194 It is an error if the end of file occurs within a comment.
4195
4196 @comment status: 1
4197 @example
4198 changecom(`/*', `*/')
4199 @result{}
4200 /*dangling comment
4201 ^D
4202 @error{}m4:stdin:2: end of file in comment
4203 @end example
4204
4205 @comment status: 1
4206 @example
4207 changecom(`/*', `*/')
4208 @result{}
4209 len(/*dangling comment
4210 ^D
4211 @error{}m4:stdin:2: len: end of file in comment
4212 @end example
4213
4214 @node Changeword
4215 @section Changing the lexical structure of words
4216
4217 @cindex lexical structure of words
4218 @cindex words, lexical structure of
4219 @cindex syntax, changing
4220 @cindex changing syntax
4221 @cindex regular expressions
4222 @quotation
4223 The macro @code{changeword} and all associated functionality is
4224 experimental.  It is only available if the @option{--enable-changeword}
4225 option was given to @code{configure}, at @acronym{GNU} @code{m4} installation
4226 time.  The functionality will go away in the future, to be replaced by
4227 other new features that are more efficient at providing the same
capabilities.  @emph{Do not rely on it}.  Please send comments
about it the same way you would report bugs.
4230 @end quotation
4231
4232 A file being processed by @code{m4} is split into quoted strings, words
4233 (potential macro names) and simple tokens (any other single character).
4234 Initially a word is defined by the following regular expression:
4235
4236 @comment ignore
4237 @example
4238 [_a-zA-Z][_a-zA-Z0-9]*
4239 @end example
4240
4241 Using @code{changeword}, you can change this regular expression:
4242
4243 @deffn {Optional builtin} changeword (@var{regex})
4244 Changes the regular expression for recognizing macro names to be
4245 @var{regex}.  If @var{regex} is empty, use
4246 @samp{[_a-zA-Z][_a-zA-Z0-9]*}.  @var{regex} must obey the constraint
4247 that every prefix of the desired final pattern is also accepted by the
4248 regular expression.  If @var{regex} contains grouping parentheses, the
4249 macro invoked is the portion that matched the first group, rather than
4250 the entire matching string.
4251
4252 The expansion of @code{changeword} is void.
4253 The macro @code{changeword} is recognized only with parameters.
4254 @end deffn
4255
4256 Relaxing the lexical rules of @code{m4} might be useful (for example) if
4257 you wanted to apply translations to a file of numbers:
4258
4259 @example
4260 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4261 ')m4exit(`77')')dnl
4262 changeword(`[_a-zA-Z0-9]+')
4263 @result{}
4264 define(`1', `0')1
4265 @result{}0
4266 @end example
4267
4268 Tightening the lexical rules is less useful, because it will generally
4269 make some of the builtins unavailable.  You could use it to prevent
4270 accidental call of builtins, for example:
4271
4272 @example
4273 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4274 ')m4exit(`77')')dnl
4275 define(`_indir', defn(`indir'))
4276 @result{}
4277 changeword(`_[_a-zA-Z0-9]*')
4278 @result{}
4279 esyscmd(`foo')
4280 @result{}esyscmd(foo)
4281 _indir(`esyscmd', `echo hi')
4282 @result{}hi
4283 @result{}
4284 @end example
4285
4286 Because @code{m4} constructs its words a character at a time, there
4287 is a restriction on the regular expressions that may be passed to
4288 @code{changeword}.  This is that if your regular expression accepts
4289 @samp{foo}, it must also accept @samp{f} and @samp{fo}.
4290
4291 @example
4292 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4293 ')m4exit(`77')')dnl
4294 define(`foo
4295 ', `bar
4296 ')
4297 @result{}
4298 dnl This example wants to recognize changeword, dnl, and `foo\n'.
4299 dnl First, we check that our regexp will match.
4300 regexp(`changeword', `[cd][a-z]*\|foo[
4301 ]')
4302 @result{}0
4303 regexp(`foo
4304 ', `[cd][a-z]*\|foo[
4305 ]')
4306 @result{}0
4307 regexp(`f', `[cd][a-z]*\|foo[
4308 ]')
4309 @result{}-1
4310 foo
4311 @result{}foo
4312 changeword(`[cd][a-z]*\|foo[
4313 ]')
4314 @result{}
4315 dnl Even though `foo\n' matches, we forgot to allow `f'.
4316 foo
4317 @result{}foo
4318 changeword(`[cd][a-z]*\|fo*[
4319 ]?')
4320 @result{}
4321 dnl Now we can call `foo\n'.
4322 foo
4323 @result{}bar
4324 @end example
4325
4326 @ignore
4327 @comment One more test of including newline in a macro name; but this
4328 @comment does not need to be displayed in the manual.  This ensures
4329 @comment that line numbering is correct when dnl cuts across include
4330 @comment file boundaries, and when __file__ or __line__ is the last
4331 @comment token in an include file.
4332
4333 @example
4334 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4335 ')m4exit(`77')')dnl
4336 define(`bar
4337 ', defn(`dnl'))dnl
4338 define(`baz', `dnl
4339 include(`foo') ignored
4340 dnl')dnl
4341 changeword(`\([_a-zA-Z][_a-zA-Z0-9]*\|bar
4342 \)')
4343 @result{}
4344 __file__:__line__
4345 @result{}stdin:10
4346 include(`foo') ignored
4347 __file__:__line__
4348 @result{}stdin:12
4349 baz ignored
4350 __file__:__line__
4351 @result{}stdin:14
4352 define(`bar
4353 ', defn(`__file__'))
4354 @result{}
4355 include(`foo')
4356 @result{}../examples/foo
4357 define(`bar
4358 ', defn(`__line__'))
4359 @result{}
4360 include(`foo')
4361 @result{}1
4362 __file__:__line__
4363 @result{}stdin:21
4364 @end example
4365 @end ignore
4366
4367 @code{changeword} has another function.  If the regular expression
4368 supplied contains any grouped subexpressions, then text outside
4369 the first of these is discarded before symbol lookup.  So:
4370
4371 @example
4372 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4373 ')m4exit(`77')')dnl
4374 ifdef(`__unix__', ,
4375       `errprint(` skipping: syscmd does not have unix semantics
4376 ')m4exit(`77')')dnl
4377 changecom(`/*', `*/')dnl
4378 define(`foo', `bar')dnl
4379 changeword(`#\([_a-zA-Z0-9]*\)')
4380 @result{}
4381 #esyscmd(`echo foo \#foo')
4382 @result{}foo bar
4383 @result{}
4384 @end example
4385
4386 @code{m4} now requires a @samp{#} mark at the beginning of every
4387 macro invocation, so one can use @code{m4} to preprocess plain
4388 text without losing various words like @samp{divert}.
4389
4390 In @code{m4}, macro substitution is based on text, while in @TeX{}, it
4391 is based on tokens.  @code{changeword} can throw this difference into
4392 relief.  For example, here is the same idea represented in @TeX{} and
4393 @code{m4}.  First, the @TeX{} version:
4394
4395 @comment ignore
4396 @example
4397 \def\a@{\message@{Hello@}@}
4398 \catcode`\@@=0
4399 \catcode`\\=12
4400 @@a
4401 @@bye
4402 @result{}Hello
4403 @end example
4404
4405 @noindent
4406 Then, the @code{m4} version:
4407
4408 @example
4409 ifdef(`changeword', `', `errprint(` skipping: no changeword support
4410 ')m4exit(`77')')dnl
4411 define(`a', `errprint(`Hello')')dnl
4412 changeword(`@@\([_a-zA-Z0-9]*\)')
4413 @result{}
4414 @@a
4415 @result{}errprint(Hello)
4416 @end example
4417
4418 In the @TeX{} example, the first line defines a macro @code{a} to
4419 print the message @samp{Hello}.  The second line defines @key{@@} to
4420 be usable instead of @key{\} as an escape character.  The third line
4421 defines @key{\} to be a normal printing character, not an escape.
4422 The fourth line invokes the macro @code{a}.  So, when @TeX{} is run
4423 on this file, it displays the message @samp{Hello}.
4424
4425 When the @code{m4} example is passed through @code{m4}, it outputs
@samp{errprint(Hello)}.  The reason for this is that @TeX{} does
lexical analysis of a macro definition when the macro is @emph{defined}.
4428 @code{m4} just stores the text, postponing the lexical analysis until
4429 the macro is @emph{used}.
4430
4431 You should note that using @code{changeword} will slow @code{m4} down
4432 by a factor of about seven, once it is changed to something other
4433 than the default regular expression.  You can invoke @code{changeword}
4434 with the empty string to restore the default word definition, and regain
4435 the parsing speed.
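
As a sketch of restoring the default, assuming the same
@samp{#}-prefixed pattern used earlier in this section and a
hypothetical macro @code{foo}, an empty argument switches word
recognition back to the default rules:

@example
ifdef(`changeword', `', `errprint(` skipping: no changeword support
')m4exit(`77')')dnl
changecom(`/*', `*/')dnl
define(`foo', `bar')dnl
changeword(`#\([_a-zA-Z0-9]*\)')
@result{}
foo #foo
@result{}foo bar
#changeword(`')
@result{}
foo #foo
@result{}bar #bar
@end example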
4436
4437 @node M4wrap
4438 @section Saving text until end of input
4439
4440 @cindex saving input
4441 @cindex input, saving
4442 @cindex deferring expansion
4443 @cindex expansion, deferring
It is possible to `save' some text until the end of the normal input has
been seen; @code{m4} then reads the saved text once the normal input has
been exhausted.  This feature is normally used to
4447 initiate cleanup actions before normal exit, e.g., deleting temporary
4448 files.
4449
4450 To save input text, use the builtin @code{m4wrap}:
4451
4452 @deffn Builtin m4wrap (@var{string}, @dots{})
4453 Stores @var{string} in a safe place, to be reread when end of input is
4454 reached.  As a @acronym{GNU} extension, additional arguments are
4455 concatenated with a space to the @var{string}.
4456
4457 The expansion of @code{m4wrap} is void.
4458 The macro @code{m4wrap} is recognized only with parameters.
4459 @end deffn
4460
4461 @example
4462 define(`cleanup', `This is the `cleanup' action.
4463 ')
4464 @result{}
4465 m4wrap(`cleanup')
4466 @result{}
4467 This is the first and last normal input line.
4468 @result{}This is the first and last normal input line.
4469 ^D
4470 @result{}This is the cleanup action.
4471 @end example
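
The @acronym{GNU} extension of joining additional arguments with a
space can be sketched as follows; the text itself is only illustrative:

@example
m4wrap(`Goodbye,', `world.
')
@result{}
^D
@result{}Goodbye, world.
@end example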
4472
4473 The saved input is only reread when the end of normal input is seen, and
4474 not if @code{m4exit} is used to exit @code{m4}.
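
A minimal sketch of this difference, with illustrative text: because
@code{m4exit} ends processing immediately, the wrapped text is never
output.

@example
m4wrap(`This text is never output.
')
@result{}
m4exit
@end example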
4475
4476 @comment FIXME: this contradicts POSIX, which requires that "If the
4477 @comment m4wrap macro is used multiple times, the arguments specified
4478 @comment shall be processed in the order in which the m4wrap macros were
4479 @comment processed."
4480 It is safe to call @code{m4wrap} from saved text, but then the order in
4481 which the saved text is reread is undefined.  If @code{m4wrap} is not used
recursively, the saved pieces of text are reread in the reverse of the order
in which they were saved (LIFO---last in, first out).  However, this
4484 behavior is likely to change in a future release, to match
4485 @acronym{POSIX}, so you should not depend on this order.
4486
4487 Here is an example of implementing a factorial function using
4488 @code{m4wrap}:
4489
4490 @example
4491 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
4492 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
4493 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
4494 @result{}
4495 f(`10')
4496 @result{}
4497 ^D
4498 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
4499 @end example
4500
4501 Invocations of @code{m4wrap} at the same recursion level are
4502 concatenated and rescanned as usual:
4503
4504 @example
4505 define(`aa', `AA
4506 ')
4507 @result{}
4508 m4wrap(`a')m4wrap(`a')
4509 @result{}
4510 ^D
4511 @result{}AA
4512 @end example
4513
4514 @noindent
4515 however, the transition between recursion levels behaves like an end of
4516 file condition between two input files.
4517
4518 @comment status: 1
4519 @example
4520 m4wrap(`m4wrap(`)')len(abc')
4521 @result{}
4522 ^D
4523 @error{}m4:stdin:1: len: end of file in argument list
4524 @end example
4525
4526 @node File Inclusion
4527 @chapter File inclusion
4528
4529 @cindex file inclusion
4530 @cindex inclusion, of files
4531 @code{m4} allows you to include named files at any point in the input.
4532
4533 @menu
4534 * Include::                     Including named files
4535 * Search Path::                 Searching for include files
4536 @end menu
4537
4538 @node Include
4539 @section Including named files
4540
4541 There are two builtin macros in @code{m4} for including files:
4542
4543 @deffn Builtin include (@var{file})
4544 @deffnx Builtin sinclude (@var{file})
4545 Both macros cause the file named @var{file} to be read by
4546 @code{m4}.  When the end of the file is reached, input is resumed from
4547 the previous input file.
4548
4549 The expansion of @code{include} and @code{sinclude} is therefore the
4550 contents of @var{file}.
4551
4552 If @var{file} does not exist (or cannot be read), the expansion is void,
4553 and @code{include} will fail with an error while @code{sinclude} is
4554 silent.  The empty string counts as a file that does not exist.
4555
4556 The macros @code{include} and @code{sinclude} are recognized only with
4557 parameters.
4558 @end deffn
4559
4560 @comment status: 1
4561 @example
4562 include(`n')
4563 @error{}m4:stdin:1: include: cannot open `n': No such file or directory
4564 @result{}
4565 include()
4566 @error{}m4:stdin:2: include: cannot open `': No such file or directory
4567 @result{}
4568 sinclude(`n')
4569 @result{}
4570 sinclude()
4571 @result{}
4572 @end example
4573
4574 The rest of this section assumes that @code{m4} is invoked with the
4575 @option{-I} option (@pxref{Preprocessor features, , Invoking m4})
4576 pointing to the @file{m4-@value{VERSION}/@/examples}
4577 directory shipped as part of the @acronym{GNU} @code{m4} package.  The
4578 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
4579 contains the lines:
4580
4581 @comment ignore
4582 @example
4583 $ @kbd{cat examples/incl.m4}
4584 @result{}Include file start
4585 @result{}foo
4586 @result{}Include file end
4587 @end example
4588
4589 Normally file inclusion is used to insert the contents of a file
4590 into the input stream.  The contents of the file will be read by
4591 @code{m4} and macro calls in the file will be expanded:
4592
4593 @comment examples
4594 @example
4595 $ @kbd{m4 -I examples}
4596 define(`foo', `FOO')
4597 @result{}
4598 include(`incl.m4')
4599 @result{}Include file start
4600 @result{}FOO
4601 @result{}Include file end
4602 @result{}
4603 @end example
4604
4605 The fact that @code{include} and @code{sinclude} expand to the contents
4606 of the file can be used to define macros that operate on entire files.
4607 Here is an example, which defines @samp{bar} to expand to the contents
4608 of @file{incl.m4}:
4609
4610 @comment examples
4611 @example
4612 $ @kbd{m4 -I examples}
4613 define(`bar', include(`incl.m4'))
4614 @result{}
4615 This is `bar':  >>bar<<
4616 @result{}This is bar:  >>Include file start
4617 @result{}foo
4618 @result{}Include file end
4619 @result{}<<
4620 @end example
4621
4622 This use of @code{include} is not trivial, though, as files can contain
4623 quotes, commas, and parentheses, which can interfere with the way the
4624 @code{m4} parser works.  @acronym{GNU} @code{m4} seamlessly concatenates
4625 the file contents with the next character, even if the included file
4626 ended in the middle of a comment, string, or macro call.  These
4627 conditions are only treated as end of file errors if specified as input
4628 files on the command line.
4629
4630 In @acronym{GNU} @code{m4}, an alternative method of reading files is
4631 using @code{undivert} (@pxref{Undivert}) on a named file.
4632
4633 @node Search Path
4634 @section Searching for include files
4635
4636 @cindex search path for included files
4637 @cindex included files, search path for
4638 @cindex @acronym{GNU} extensions
@acronym{GNU} @code{m4} allows included files to be found in directories
other than the current working directory.
4641
4642 @cindex @env{M4PATH}
4643 If the @option{--prepend-include} or @option{-B} command-line option was
4644 provided (@pxref{Preprocessor features, , Invoking m4}), those
directories are searched first, in the reverse of the order in which those
options were listed on the command line.  Then @code{m4} looks in the
current working directory.  Next come the directories specified with the
4648 @option{--include} or @option{-I} option, in the order found on the
4649 command line.  Finally, if the @env{M4PATH} environment variable is set,
4650 it is expected to contain a colon-separated list of directories, which
4651 will be searched in order.
4652
4653 If the automatic search for include-files causes trouble, the @samp{p}
4654 debug flag (@pxref{Debug Levels}) can help isolate the problem.
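
As a sketch of the search order, assume hypothetical directories
@file{early} and @file{late}, a hypothetical input file
@file{input.m4}, and an illustrative @env{M4PATH} value.  With the
following invocation, an included file is sought in @file{early}, then
in the current working directory, then in @file{late}, and finally in
@file{/usr/local/share/m4}.  The @samp{p} debug flag is assumed here to
be enabled with @option{-dp}:

@comment ignore
@example
$ @kbd{M4PATH=/usr/local/share/m4 m4 -dp -B early -I late input.m4}
@end example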
4655
4656 @node Diversions
4657 @chapter Diverting and undiverting output
4658
4659 @cindex deferring output
4660 Diversions are a way of temporarily saving output.  The output of
4661 @code{m4} can at any time be diverted to a temporary file, and be
4662 reinserted into the output stream, @dfn{undiverted}, again at a later
4663 time.
4664
4665 @cindex @env{TMPDIR}
4666 Numbered diversions are counted from 0 upwards, diversion number 0
4667 being the normal output stream.  The number of simultaneous diversions
4668 is limited mainly by the memory used to describe them, because @acronym{GNU}
4669 @code{m4} tries to keep diversions in memory.  However, there is a
4670 limit to the overall memory usable by all diversions taken altogether
4671 (512K, currently).  When this maximum is about to be exceeded,
4672 a temporary file is opened to receive the contents of the biggest
4673 diversion still in memory, freeing this memory for other diversions.
4674 When creating the temporary file, @code{m4} honors the value of the
4675 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
4676 So, it is theoretically possible that the number and aggregate size of
4677 diversions is limited only by available disk space.
4678
4679 @ignore
4680 @comment We need to test spilled diversions, but don't need to expose
4681 @comment this highly repetitive test in the manual.
4682
4683 @example
4684 divert(`-1')define(`f', `.')
4685 define(`f', defn(`f')defn(`f'))
4686 define(`f', defn(`f')defn(`f'))
4687 define(`f', defn(`f')defn(`f'))
4688 define(`f', defn(`f')defn(`f'))
4689 define(`f', defn(`f')defn(`f'))
4690 define(`f', defn(`f')defn(`f'))
4691 define(`f', defn(`f')defn(`f'))
4692 define(`f', defn(`f')defn(`f'))
4693 define(`f', defn(`f')defn(`f'))
4694 define(`f', defn(`f')defn(`f'))
4695 define(`f', defn(`f')defn(`f'))
4696 define(`f', defn(`f')defn(`f'))
4697 define(`f', defn(`f')defn(`f'))
4698 define(`f', defn(`f')defn(`f'))
4699 define(`f', defn(`f')defn(`f'))
4700 define(`f', defn(`f')defn(`f'))
4701 define(`f', defn(`f')defn(`f'))
4702 define(`f', defn(`f')defn(`f'))
4703 define(`f', defn(`f')defn(`f'))
4704 define(`f', defn(`f')defn(`f'))
4705 divert`'dnl
4706 len(f)
4707 @result{}1048576
4708 divert(`1')
4709 f
4710 divert(`-1')undivert
4711 @end example
4712
4713 @comment Another test of spilled diversions.
4714
4715 @example
4716 divert(`-1')define(`f', `.')
4717 define(`f', defn(`f')defn(`f'))
4718 define(`f', defn(`f')defn(`f'))
4719 define(`f', defn(`f')defn(`f'))
4720 define(`f', defn(`f')defn(`f'))
4721 define(`f', defn(`f')defn(`f'))
4722 define(`f', defn(`f')defn(`f'))
4723 define(`f', defn(`f')defn(`f'))
4724 define(`f', defn(`f')defn(`f'))
4725 define(`f', defn(`f')defn(`f'))
4726 define(`f', defn(`f')defn(`f'))
4727 define(`f', defn(`f')defn(`f'))
4728 define(`f', defn(`f')defn(`f'))
4729 define(`f', defn(`f')defn(`f'))
4730 define(`f', defn(`f')defn(`f'))
4731 define(`f', defn(`f')defn(`f'))
4732 define(`f', defn(`f')defn(`f'))
4733 define(`f', defn(`f')defn(`f'))
4734 define(`f', defn(`f')defn(`f'))
4735 define(`f', defn(`f')defn(`f'))
4736 define(`f', defn(`f')defn(`f'))
4737 divert`'dnl
4738 len(f)
4739 @result{}1048576
4740 divert(`1')
4741 f
4742 m4exit
4743 @end example
4744 @end ignore
4745
4746 Diversions make it possible to generate output in a different order than
the input was read.  This allows topological sorting of
dependencies.  For example, @acronym{GNU} Autoconf makes use of
4749 diversions under the hood to ensure that the expansion of a prerequisite
4750 macro appears in the output prior to the expansion of a dependent macro,
4751 regardless of which order the two macros were invoked in the user's
4752 input file.
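
A minimal sketch of such reordering follows, with illustrative text;
the body is written before the header but appears after it in the
output:

@example
divert(`2')body
divert(`1')header
divert`'dnl
^D
@result{}header
@result{}body
@end example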
4753
4754 @menu
4755 * Divert::                      Diverting output
4756 * Undivert::                    Undiverting output
4757 * Divnum::                      Diversion numbers
4758 * Cleardivert::                 Discarding diverted text
4759 @end menu
4760
4761 @node Divert
4762 @section Diverting output
4763
4764 @cindex diverting output to files
4765 @cindex output, diverting to files
4766 @cindex files, diverting output to
4767 Output is diverted using @code{divert}:
4768
4769 @deffn Builtin divert (@dvar{number, 0})
4770 The current diversion is changed to @var{number}.  If @var{number} is left
4771 out or empty, it is assumed to be zero.  If @var{number} cannot be
4772 parsed, the diversion is unchanged.
4773
4774 The expansion of @code{divert} is void.
4775 @end deffn
4776
When all the @code{m4} input has been processed, all existing
4778 diversions are automatically undiverted, in numerical order.
4779
4780 @example
4781 divert(`1')
4782 This text is diverted.
4783 divert
4784 @result{}
4785 This text is not diverted.
4786 @result{}This text is not diverted.
4787 ^D
4788 @result{}
4789 @result{}This text is diverted.
4790 @end example
4791
4792 Several calls of @code{divert} with the same argument do not overwrite
4793 the previous diverted text, but append to it.  Diversions are printed
4794 after any wrapped text is expanded.
4795
4796 @example
4797 define(`text', `TEXT')
4798 @result{}
4799 divert(`1')`diverted text.'
4800 divert
4801 @result{}
4802 m4wrap(`Wrapped text precedes ')
4803 @result{}
4804 ^D
4805 @result{}Wrapped TEXT precedes diverted text.
4806 @end example
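
To illustrate the appending behavior on its own, here is a small sketch
with illustrative text:

@example
divert(`1')first
divert`'dnl
divert(`1')second
divert`'dnl
^D
@result{}first
@result{}second
@end example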
4807
4808 @cindex discarding input
4809 @cindex input, discarding
4810 If output is diverted to a negative diversion, it is simply discarded.
4811 This can be used to suppress unwanted output.  A common example of
4812 unwanted output is the trailing newlines after macro definitions.  Here
4813 is a common programming idiom in @code{m4} for avoiding them.
4814
4815 @example
4816 divert(`-1')
4817 define(`foo', `Macro `foo'.')
4818 define(`bar', `Macro `bar'.')
4819 divert
4820 @result{}
4821 @end example
4822
4823 @cindex @acronym{GNU} extensions
4824 Traditional implementations only supported ten diversions.  But as a
4825 @acronym{GNU} extension, diversion numbers can be as large as positive
4826 integers will allow, rather than treating a multi-digit diversion number
4827 as a request to discard text.
4828
4829 @example
4830 divert(eval(`1<<28'))world
4831 divert(`2')hello
4832 ^D
4833 @result{}hello
4834 @result{}world
4835 @end example
4836
4837 Note that @code{divert} is an English word, but also an active macro
4838 without arguments.  When processing plain text, the word might appear in
4839 normal text and be unintentionally swallowed as a macro invocation.  One
4840 way to avoid this is to use the @option{-P} option to rename all
4841 builtins (@pxref{Operation modes, , Invoking m4}).  Another is to write
4842 a wrapper that requires a parameter to be recognized.
4843
4844 @example
4845 We decided to divert the stream for irrigation.
4846 @result{}We decided to  the stream for irrigation.
4847 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
4848 @result{}
4849 divert(`-1')
4850 Ignored text.
4851 divert(`0')
4852 @result{}
4853 We decided to divert the stream for irrigation.
4854 @result{}We decided to divert the stream for irrigation.
4855 @end example
4856
4857 @node Undivert
4858 @section Undiverting output
4859
4860 Diverted text can be undiverted explicitly using the builtin
4861 @code{undivert}:
4862
4863 @deffn Builtin undivert (@ovar{diversions@dots{}})
4864 Undiverts the numeric @var{diversions} given by the arguments, in the
4865 order given.  If no arguments are supplied, all diversions are
4866 undiverted, in numerical order.
4867
4868 @cindex file inclusion
4869 @cindex inclusion, of files
4870 @cindex @acronym{GNU} extensions
4871 As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
4872 strings, which are treated as the names of files to copy into the output
4873 without expansion.  A warning is issued if a file could not be opened.
4874
4875 The expansion of @code{undivert} is void.
4876 @end deffn
4877
4878 @example
4879 divert(`1')
4880 This text is diverted.
4881 divert
4882 @result{}
4883 This text is not diverted.
4884 @result{}This text is not diverted.
4885 undivert(`1')
4886 @result{}
4887 @result{}This text is diverted.
4888 @result{}
4889 @end example
4890
4891 Notice the last two blank lines.  One of them comes from the newline
4892 following @code{undivert}, the other from the newline that followed the
4893 @code{divert}!  A diversion often starts with a blank line like this.
4894
4895 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
4896 but rather copied directly to the current output, and it is therefore
4897 not an error to undivert into a diversion.  Undiverting the empty string
4898 is the same as specifying diversion 0; in either case nothing happens
4899 since the output has already been flushed.
4900
4901 @example
4902 divert(`1')diverted text
4903 divert
4904 @result{}
4905 undivert()
4906 @result{}
4907 undivert(`0')
4908 @result{}
4909 undivert
4910 @result{}diverted text
4911 @result{}
4912 @end example
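
The following sketch, using a hypothetical macro @code{foo}, shows that
undiverted text is not rescanned: the quoted @samp{foo} was diverted
without expansion and remains unexpanded when it is brought back.

@example
define(`foo', `FOO')
@result{}
divert(`1')`foo' and foo
divert
@result{}
undivert(`1')
@result{}foo and FOO
@result{}
@end example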
4913
4914 When a diversion has been undiverted, the diverted text is discarded,
4915 and it is not possible to bring back diverted text more than once.
4916
4917 @example
4918 divert(`1')
4919 This text is diverted first.
4920 divert(`0')undivert(`1')dnl
4921 @result{}
4922 @result{}This text is diverted first.
4923 undivert(`1')
4924 @result{}
4925 divert(`1')
4926 This text is also diverted but not appended.
4927 divert(`0')undivert(`1')dnl
4928 @result{}
4929 @result{}This text is also diverted but not appended.
4930 @end example
4931
4932 Attempts to undivert the current diversion are silently ignored.  Thus,
4933 when the current diversion is not 0, the current diversion does not get
4934 rearranged among the other diversions.
4935
4936 @example
4937 divert(`1')one
4938 divert(`2')two
4939 divert(`3')three
4940 divert(`2')undivert`'dnl
4941 divert`'undivert`'dnl
4942 @result{}two
4943 @result{}one
4944 @result{}three
4945 @end example
4946
4947 @cindex @acronym{GNU} extensions
4948 @cindex file inclusion
4949 @cindex inclusion, of files
4950 @acronym{GNU} @code{m4} allows named files to be undiverted.  Given a
4951 non-numeric argument, the contents of the file named will be copied,
4952 uninterpreted, to the current output.  This complements the builtin
4953 @code{include} (@pxref{Include}).  To illustrate the difference, assume
4954 the file @file{foo} contains:
4955
4956 @comment ignore
4957 @example
4958 $ @kbd{cat foo}
4959 bar
4960 @end example
4961
4962 @noindent
4963 then
4964
4965 @example
4966 define(`bar', `BAR')
4967 @result{}
4968 undivert(`foo')
4969 @result{}bar
4970 @result{}
4971 include(`foo')
4972 @result{}BAR
4973 @result{}
4974 @end example
4975
4976 If the file is not found (or cannot be read), an error message is
4977 issued, and the expansion is void.  It is possible to intermix files
4978 and diversion numbers.
4979
4980 @example
4981 divert(`1')diversion one
4982 divert(`2')undivert(`foo')dnl
4983 divert(`3')diversion three
4984 divert`'dnl
4985 undivert(`1', `2', `foo', `3')dnl
4986 @result{}diversion one
4987 @result{}bar
4988 @result{}bar
4989 @result{}diversion three
4990 @end example
4991
4992 @node Divnum
4993 @section Diversion numbers
4994
4995 @cindex diversion numbers
4996 The current diversion is tracked by the builtin @code{divnum}:
4997
4998 @deffn Builtin divnum
4999 Expands to the number of the current diversion.
5000 @end deffn
5001
5002 @example
5003 Initial divnum
5004 @result{}Initial 0
5005 divert(`1')
5006 Diversion one: divnum
5007 divert(`2')
5008 Diversion two: divnum
5009 ^D
5010 @result{}
5011 @result{}Diversion one: 1
5012 @result{}
5013 @result{}Diversion two: 2
5014 @end example
5015
5016 @node Cleardivert
5017 @section Discarding diverted text
5018
5019 @cindex discarding diverted text
5020 @cindex diverted text, discarding
5021 Often it is not known, when output is diverted, whether the diverted
text is actually needed.  Since all non-empty diversions are brought back
5023 on the main output stream when the end of input is seen, a method of
5024 discarding a diversion is needed.  If all diversions should be
5025 discarded, the easiest is to end the input to @code{m4} with
5026 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
5027
5028 @example
5029 divert(`1')
5030 Diversion one: divnum
5031 divert(`2')
5032 Diversion two: divnum
5033 divert(`-1')
5034 undivert
5035 ^D
5036 @end example
5037
5038 @noindent
5039 No output is produced at all.
5040
5041 Clearing selected diversions can be done with the following macro:
5042
5043 @deffn Composite cleardivert (@ovar{diversions@dots{}})
5044 Discard the contents of each of the listed numeric @var{diversions}.
5045 @end deffn
5046
5047 @example
5048 define(`cleardivert',
5049 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
5050 @result{}
5051 @end example
5052
5053 It is called just like @code{undivert}, but the effect is to clear the
diversions given by the arguments.  (This macro has a nasty bug!  You
5055 should try to see if you can find it and correct it; or @pxref{Improved
5056 cleardivert, , Answers}).
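
As a usage sketch with illustrative text, clearing diversion 1 leaves
only diversion 2 to be output at the end of input:

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert(`2')two
divert`'dnl
cleardivert(`1')dnl
^D
@result{}two
@end example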
5057
5058 @node Text handling
5059 @chapter Macros for text handling
5060
5061 There are a number of builtins in @code{m4} for manipulating text in
5062 various ways, extracting substrings, searching, substituting, and so on.
5063
5064 @menu
5065 * Len::                         Calculating length of strings
5066 * Index macro::                 Searching for substrings
5067 * Regexp::                      Searching for regular expressions
5068 * Substr::                      Extracting substrings
5069 * Translit::                    Translating characters
5070 * Patsubst::                    Substituting text by regular expression
5071 * Format::                      Formatting strings (printf-like)
5072 @end menu
5073
5074 @node Len
5075 @section Calculating length of strings
5076
5077 @cindex length of strings
5078 @cindex strings, length of
5079 The length of a string can be calculated by @code{len}:
5080
5081 @deffn Builtin len (@var{string})
5082 Expands to the length of @var{string}, as a decimal number.
5083
5084 The macro @code{len} is recognized only with parameters.
5085 @end deffn
5086
5087 @example
5088 len()
5089 @result{}0
5090 len(`abcdef')
5091 @result{}6
5092 @end example
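
Because arguments are expanded before @code{len} is applied, quoting
determines which string is measured.  A brief sketch, assuming a
hypothetical macro @code{greeting}:

@example
define(`greeting', `hello world')
@result{}
len(greeting)
@result{}11
len(`greeting')
@result{}8
@end example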
5093
5094 @node Index macro
5095 @section Searching for substrings
5096
5097 @cindex substrings, locating
5098 Searching for substrings is done with @code{index}:
5099
5100 @deffn Builtin index (@var{string}, @var{substring})
5101 Expands to the index of the first occurrence of @var{substring} in
5102 @var{string}.  The first character in @var{string} has index 0.  If
5103 @var{substring} does not occur in @var{string}, @code{index} expands to
5104 @samp{-1}.
5105
5106 The macro @code{index} is recognized only with parameters.
5107 @end deffn
5108
5109 @example
5110 index(`gnus, gnats, and armadillos', `nat')
5111 @result{}7
5112 index(`gnus, gnats, and armadillos', `dag')
5113 @result{}-1
5114 @end example
5115
5116 Omitting @var{substring} evokes a warning, but still produces output;
5117 contrast this with an empty @var{substring}.
5118
5119 @example
5120 index(`abc')
5121 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
5122 @result{}0
5123 index(`abc', `')
5124 @result{}0
5125 index(`abc', `b')
5126 @result{}1
5127 @end example
5128
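A common idiom is to turn the result of @code{index} into a yes/no
test.  The following is only a sketch: the macro name @code{contains}
is hypothetical and not part of @code{m4}, and it relies on the
@code{ifelse} builtin to compare the result against @samp{-1}.

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')
@result{}
contains(`gnus, gnats, and armadillos', `gnat')
@result{}yes
contains(`gnus, gnats, and armadillos', `dag')
@result{}no
@end example
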
5129 @node Regexp
5130 @section Searching for regular expressions
5131
5132 @cindex basic regular expressions
5133 @cindex regular expressions
5134 @cindex expressions, regular
5135 @cindex @acronym{GNU} extensions
5136 Searching for regular expressions is done with the builtin
5137 @code{regexp}:
5138
5139 @deffn Builtin regexp (@var{string}, @var{regexp}, @ovar{replacement})
5140 Searches for @var{regexp} in @var{string}.  The syntax for regular
5141 expressions is the same as in @acronym{GNU} Emacs, which is similar to
5142 @acronym{BRE, Basic Regular Expressions} in @acronym{POSIX}.
5143 @ifnothtml
5144 @xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs
5145 Manual}.
5146 @end ifnothtml
5147 @ifhtml
5148 See
5149 @uref{http://www.gnu.org/@/software/@/emacs/@/manual/@/emacs.html#Regexps,
5150 Syntax of Regular Expressions} in the @acronym{GNU} Emacs Manual.
5151 @end ifhtml
5152 Support for @acronym{ERE, Extended Regular Expressions} is not
5153 available, but will be added in @acronym{GNU} M4 2.0.
5154
5155 If @var{replacement} is omitted, @code{regexp} expands to the index of
5156 the first match of @var{regexp} in @var{string}.  If @var{regexp} does
5157 not match anywhere in @var{string}, it expands to -1.
5158
5159 If @var{replacement} is supplied, and there was a match, @code{regexp}
5160 changes the expansion to this argument, with @samp{\@var{n}} substituted
5161 by the text matched by the @var{n}th parenthesized sub-expression of
5162 @var{regexp}, up to nine sub-expressions.  The escape @samp{\&} is
5163 replaced by the text of the entire regular expression matched.  For
5164 all other characters, @samp{\} treats the next character literally.  A
5165 warning is issued if there were fewer sub-expressions than the
5166 @samp{\@var{n}} requested, or if there is a trailing @samp{\}.  If there
5167 was no match, @code{regexp} expands to the empty string.
5168
5169 The macro @code{regexp} is recognized only with parameters.
5170 @end deffn
5171
5172 @example
5173 regexp(`GNUs not Unix', `\<[a-z]\w+')
5174 @result{}5
5175 regexp(`GNUs not Unix', `\<Q\w*')
5176 @result{}-1
5177 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
5178 @result{}*** Unix *** nix ***
5179 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
5180 @result{}
5181 @end example
5182
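The replacement form is convenient for pulling fields out of a string.
As a sketch (the input string below is made up for illustration), the
following reformats the first two numeric components it finds.  The
text around the match is discarded, since @code{regexp} expands to the
replacement alone.

@example
regexp(`version 1.4.11', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example
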
Here are some more examples of the handling of backslashes:
5184
5185 @example
5186 regexp(`abc', `\(b\)', `\\\10\a')
5187 @result{}\b0a
5188 regexp(`abc', `b', `\1\')
5189 @error{}m4:stdin:2: Warning: regexp: sub-expression 1 not present
5190 @error{}m4:stdin:2: Warning: regexp: trailing \ ignored in replacement
5191 @result{}
5192 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
5193 @error{}m4:stdin:3: Warning: regexp: sub-expression 4 not present
5194 @error{}m4:stdin:3: Warning: regexp: sub-expression 5 not present
5195 @error{}m4:stdin:3: Warning: regexp: sub-expression 6 not present
5196 @result{}c
5197 @end example
5198
5199 Omitting @var{regexp} evokes a warning, but still produces output;
5200 contrast this with an empty @var{regexp} argument.
5201
5202 @example
5203 regexp(`abc')
5204 @error{}m4:stdin:1: Warning: regexp: too few arguments: 1 < 2
5205 @result{}0
5206 regexp(`abc', `')
5207 @result{}0
5208 regexp(`abc', `', `\\def')
5209 @result{}\def
5210 @end example
5211
5212 @node Substr
5213 @section Extracting substrings
5214
5215 @cindex extracting substrings
5216 @cindex substrings, extracting
5217 Substrings are extracted with @code{substr}:
5218
5219 @deffn Builtin substr (@var{string}, @var{from}, @ovar{length})
5220 Expands to the substring of @var{string}, which starts at index
5221 @var{from}, and extends for @var{length} characters, or to the end of
5222 @var{string}, if @var{length} is omitted.  The starting index of a string
5223 is always 0.  The expansion is empty if there is an error parsing
5224 @var{from} or @var{length}, if @var{from} is beyond the end of
5225 @var{string}, or if @var{length} is negative.
5226
5227 The macro @code{substr} is recognized only with parameters.
5228 @end deffn
5229
5230 @example
5231 substr(`gnus, gnats, and armadillos', `6')
5232 @result{}gnats, and armadillos
5233 substr(`gnus, gnats, and armadillos', `6', `5')
5234 @result{}gnats
5235 @end example
5236
5237 Omitting @var{from} evokes a warning, but still produces output.
5238
5239 @example
5240 substr(`abc')
5241 @error{}m4:stdin:1: Warning: substr: too few arguments: 1 < 2
5242 @result{}abc
5243 substr(`abc',)
5244 @error{}m4:stdin:2: Warning: substr: empty string treated as 0
5245 @result{}abc
5246 @end example
5247
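The builtins @code{substr}, @code{index}, and @code{len} combine
naturally to extract the text that follows a delimiter.  In this sketch,
the macro name @code{rest_after} is hypothetical, and no attempt is made
to handle a delimiter that does not occur in the string.

@example
define(`rest_after',
`substr(`$1', eval(index(`$1', `$2') + len(`$2')))')
@result{}
rest_after(`key=value', `=')
@result{}value
rest_after(`gnus, gnats, and armadillos', `, ')
@result{}gnats, and armadillos
@end example
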
5248 @node Translit
5249 @section Translating characters
5250
5251 @cindex translating characters
5252 @cindex characters, translating
5253 Character translation is done with @code{translit}:
5254
5255 @deffn Builtin translit (@var{string}, @var{chars}, @ovar{replacement})
5256 Expands to @var{string}, with each character that occurs in
5257 @var{chars} translated into the character from @var{replacement} with
5258 the same index.
5259
5260 If @var{replacement} is shorter than @var{chars}, the excess characters
5261 of @var{chars} are deleted from the expansion; if @var{chars} is
5262 shorter, the excess characters in @var{replacement} are silently
5263 ignored.  If @var{replacement} is omitted, all characters in
5264 @var{string} that are present in @var{chars} are deleted from the
5265 expansion.  If a character appears more than once in @var{chars}, only
5266 the first instance is used in making the translation.  Only a single
5267 translation pass is made, even if characters in @var{replacement} also
5268 appear in @var{chars}.
5269
5270 As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
5271 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
5272 letters) or @samp{0-9} (meaning all digits).  To include a dash @samp{-}
5273 in @var{chars} or @var{replacement}, place it first or last in the
5274 entire string, or as the last character of a range.  Back-to-back ranges
5275 can share a common endpoint.  It is not an error for the last character
5276 in the range to be `larger' than the first.  In that case, the range
5277 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
5278 The expansion of a range is dependent on the underlying encoding of
5279 characters, so using ranges is not always portable between machines.
5280
5281 The macro @code{translit} is recognized only with parameters.
5282 @end deffn
5283
5284 @example
5285 translit(`GNUs not Unix', `A-Z')
5286 @result{}s not nix
5287 translit(`GNUs not Unix', `a-z', `A-Z')
5288 @result{}GNUS NOT UNIX
5289 translit(`GNUs not Unix', `A-Z', `z-a')
5290 @result{}tmfs not fnix
5291 translit(`+,-12345', `+--1-5', `<;>a-c-a')
5292 @result{}<;>abcba
5293 translit(`abcdef', `aabdef', `bcged')
5294 @result{}bgced
5295 @end example
5296
5297 In the @sc{ascii} encoding, the first example deletes all uppercase
5298 letters, the second converts lowercase to uppercase, and the third
5299 `mirrors' all uppercase letters, while converting them to lowercase.
The first two cases are by far the most common, even though they are not
5301 portable to @sc{ebcdic} or other encodings.  The fourth example shows a
5302 range ending in @samp{-}, as well as back-to-back ranges.  The final
5303 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
5304 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
5305 @samp{e} are swapped, and the @samp{f} is discarded.
5306
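Two further sketches, both assuming the @sc{ascii} encoding: the first
uses a lone dash as the entire @var{chars} argument, and the second
uses back-to-back ranges to rotate the alphabet by 13 places.  The macro
name @code{rot13} is hypothetical and is defined only for illustration.

@example
translit(`2008-05-17', `-', `/')
@result{}2008/05/17
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello, world')
@result{}Uryyb, jbeyq
@end example
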
5307 @ignore
5308 @comment No need to fight 8-bit characters, as it is difficult to get
5309 @comment rendering right in both info and dvi.
5310
5311 @example
5312 translit(`«abc~', `~-»')
5313 @result{}abc
5314 @end example
5315 @end ignore
5316
5317 Omitting @var{chars} evokes a warning, but still produces output.
5318
5319 @example
5320 translit(`abc')
5321 @error{}m4:stdin:1: Warning: translit: too few arguments: 1 < 2
5322 @result{}abc
5323 @end example
5324
5325 @node Patsubst
5326 @section Substituting text by regular expression
5327
5328 @cindex basic regular expressions
5329 @cindex regular expressions
5330 @cindex expressions, regular
5331 @cindex pattern substitution
5332 @cindex substitution by regular expression
5333 @cindex @acronym{GNU} extensions
5334 Global substitution in a string is done by @code{patsubst}:
5335
5336 @deffn Builtin patsubst (@var{string}, @var{regexp}, @ovar{replacement})
5337 Searches @var{string} for matches of @var{regexp}, and substitutes
5338 @var{replacement} for each match.  The syntax for regular expressions
5339 is the same as in @acronym{GNU} Emacs (@pxref{Regexp}).
5340
5341 The parts of @var{string} that are not covered by any match of
5342 @var{regexp} are copied to the expansion.  Whenever a match is found, the
5343 search proceeds from the end of the match, so a character from
5344 @var{string} will never be substituted twice.  If @var{regexp} matches a
5345 string of zero length, the start position for the search is incremented,
5346 to avoid infinite loops.
5347
5348 When a replacement is to be made, @var{replacement} is inserted into
5349 the expansion, with @samp{\@var{n}} substituted by the text matched by
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
5351 nine sub-expressions.  The escape @samp{\&} is replaced by the text of
5352 the entire regular expression matched.  For all other characters,
5353 @samp{\} treats the next character literally.  A warning is issued if
5354 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
5355 if there is a trailing @samp{\}.
5356
5357 The @var{replacement} argument can be omitted, in which case the text
5358 matched by @var{regexp} is deleted.
5359
5360 The macro @code{patsubst} is recognized only with parameters.
5361 @end deffn
5362
5363 @example
5364 patsubst(`GNUs not Unix', `^', `OBS: ')
5365 @result{}OBS: GNUs not Unix
5366 patsubst(`GNUs not Unix', `\<', `OBS: ')
5367 @result{}OBS: GNUs OBS: not OBS: Unix
5368 patsubst(`GNUs not Unix', `\w*', `(\&)')
5369 @result{}(GNUs)() (not)() (Unix)()
5370 patsubst(`GNUs not Unix', `\w+', `(\&)')
5371 @result{}(GNUs) (not) (Unix)
5372 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
5373 @result{}GN not@w{ }
5374 patsubst(`GNUs not Unix', `not', `NOT\')
5375 @error{}m4:stdin:6: Warning: patsubst: trailing \ ignored in replacement
5376 @result{}GNUs NOT Unix
5377 @end example
5378
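Because the search resumes after each match and never rescans replaced
text, a single call makes exactly one pass.  A small sketch with
made-up input: the first call leaves two @samp{a} characters rather
than collapsing all four into one, and the second swaps the two sides
of each @samp{=} using sub-expressions.

@example
patsubst(`aaaa', `aa', `a')
@result{}aa
patsubst(`a=1, b=2', `\(\w+\)=\(\w+\)', `\2=\1')
@result{}1=a, 2=b
@end example
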
5379 Here is a slightly more realistic example, which capitalizes individual
5380 words or whole sentences, by substituting calls of the macros
5381 @code{upcase} and @code{downcase} into the strings.
5382
5383 @deffn Composite upcase (@var{text})
5384 @deffnx Composite downcase (@var{text})
5385 @deffnx Composite capitalize (@var{text})
5386 Expand to @var{text}, but with capitalization changed: @code{upcase}
5387 changes all letters to upper case, @code{downcase} changes all letters
5388 to lower case, and @code{capitalize} changes the first character of each
5389 word to upper case and the remaining characters to lower case.
5390 @end deffn
5391
5392 First, an example of their usage, using implementations distributed in
5393 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
5394
5395 @comment examples
5396 @example
5397 $ @kbd{m4 -I examples}
5398 include(`capitalize.m4')
5399 @result{}
5400 upcase(`GNUs not Unix')
5401 @result{}GNUS NOT UNIX
5402 downcase(`GNUs not Unix')
5403 @result{}gnus not unix
5404 capitalize(`GNUs not Unix')
5405 @result{}Gnus Not Unix
5406 @end example
5407
5408 Now for the implementation.  There is a helper macro @code{_capitalize}
5409 which puts only its first word in mixed case.  Then @code{capitalize}
5410 merely parses out the words, and replaces them with an invocation of
5411 @code{_capitalize}.  (As presented here, the @code{capitalize} macro has
5412 some subtle flaws.  You should try to see if you can find and correct
5413 them; or @pxref{Improved capitalize, , Answers}).
5414
5415 @comment examples
5416 @example
5417 $ @kbd{m4 -I examples}
5418 undivert(`capitalize.m4')dnl
5419 @result{}divert(`-1')
5420 @result{}# upcase(text)
5421 @result{}# downcase(text)
5422 @result{}# capitalize(text)
5423 @result{}#   change case of text, simple version
5424 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
5425 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
5426 @result{}define(`_capitalize',
5427 @result{}       `regexp(`$1', `^\(\w\)\(\w*\)',
5428 @result{}               `upcase(`\1')`'downcase(`\2')')')
5429 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
5430 @result{}divert`'dnl
5431 @end example
5432
5433 While @code{regexp} replaces the whole input with the replacement as
5434 soon as there is a match, @code{patsubst} replaces each
5435 @emph{occurrence} of a match and preserves non-matching pieces:
5436
5437 @example
5438 define(`patreg',
5439 `patsubst($@@)
5440 regexp($@@)')dnl
5441 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
5442 @result{}bar FOO baz FOO
5443 @result{}FOO
5444 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
5445 @result{}bab abb 212
5446 @result{}bab
5447 @end example
5448
5449 Omitting @var{regexp} evokes a warning, but still produces output;
5450 contrast this with an empty @var{regexp} argument.
5451
5452 @example
5453 patsubst(`abc')
5454 @error{}m4:stdin:1: Warning: patsubst: too few arguments: 1 < 2
5455 @result{}abc
5456 patsubst(`abc', `')
5457 @result{}abc
5458 patsubst(`abc', `', `\\-')
5459 @result{}\-a\-b\-c\-
5460 @end example
5461
5462 @node Format
5463 @section Formatting strings (printf-like)
5464
5465 @cindex formatted output
5466 @cindex output, formatted
5467 @cindex @acronym{GNU} extensions
5468 Formatted output can be made with @code{format}:
5469
5470 @deffn Builtin format (@var{format-string}, @dots{})
5471 Works much like the C function @code{printf}.  The first argument
5472 @var{format-string} can contain @samp{%} specifications which are
5473 satisfied by additional arguments, and the expansion of @code{format} is
5474 the formatted string.
5475
5476 The macro @code{format} is recognized only with parameters.
5477 @end deffn
5478
5479 Its use is best described by a few examples:
5480
5481 @comment This test is a bit fragile, if someone tries to port to a
5482 @comment platform without infinity.
5483 @example
5484 define(`foo', `The brown fox jumped over the lazy dog')
5485 @result{}
5486 format(`The string "%s" uses %d characters', foo, len(foo))
5487 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
5488 format(`%*.*d', `-1', `-1', `1')
5489 @result{}1
5490 format(`%.0f', `56789.9876')
5491 @result{}56790
5492 len(format(`%-*X', `5000', `1'))
5493 @result{}5000
5494 ifelse(format(`%010F', `infinity'), `       INF', `success',
5495        format(`%010F', `infinity'), `  INFINITY', `success',
5496        format(`%010F', `infinity'))
5497 @result{}success
5498 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
5499        format(`%.1A', `1.999'), `0X2.0P+0', `success',
5500        format(`%.1A', `1.999'))
5501 @result{}success
5502 @end example
5503
5504 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
5505 example shows how @code{format} can be used to produce tabular output.
5506
5507 @comment examples
5508 @example
5509 $ @kbd{m4 -I examples}
5510 include(`forloop.m4')
5511 @result{}
5512 forloop(`i', `1', `10', `format(`%6d squared is %10d
5513 ', i, eval(i**2))')
5514 @result{}     1 squared is          1
5515 @result{}     2 squared is          4
5516 @result{}     3 squared is          9
5517 @result{}     4 squared is         16
5518 @result{}     5 squared is         25
5519 @result{}     6 squared is         36
5520 @result{}     7 squared is         49
5521 @result{}     8 squared is         64
5522 @result{}     9 squared is         81
5523 @result{}    10 squared is        100
5524 @result{}
5525 @end example
5526
5527 The builtin @code{format} is modeled after the ANSI C @samp{printf}
5528 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
5529 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
5530 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
5531 @samp{%}; it supports field widths and precisions, and the flags
5532 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}.  For
integer specifiers, the length modifiers @samp{hh}, @samp{h}, and
@samp{l} are recognized, and for floating point specifiers, the length
modifier @samp{l} is recognized.  Items not yet supported include
5536 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
5537 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
5538 modifiers, and any platform extensions available in the native
5539 @code{printf}.  For more details on the functioning of @code{printf},
5540 see the C Library Manual, or the @acronym{POSIX} specification (for
5541 example, @samp{%a} is supported even on platforms that haven't yet
5542 implemented C99 hexadecimal floating point output natively).
5543
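The following sketch, using arbitrary sample values, exercises a few of
the flags, field widths, and precisions listed above:

@example
format(`%05d|%-5d|%+d', `42', `42', `42')
@result{}00042|42   |+42
format(`%04x', `255')
@result{}00ff
format(`%6.2f|%.3s', `3.14159', `abcdef')
@result{}  3.14|abc
@end example
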
5544 Warnings are issued for unrecognized specifiers, an improper number of
5545 arguments, or difficulty parsing an argument according to the format
5546 string (such as overflow or extra characters).  It is anticipated that a
5547 future release of @acronym{GNU} @code{m4} will support more specifiers.
5548 Likewise, escape sequences are not yet recognized.
5549
5550 @example
5551 format(`%p', `0')
5552 @error{}m4:stdin:1: Warning: format: unrecognized specifier in `%p'
5553 @result{}
5554 format(`%*d', `')
5555 @error{}m4:stdin:2: Warning: format: empty string treated as 0
5556 @error{}m4:stdin:2: Warning: format: too few arguments: 2 < 3
5557 @result{}0
5558 format(`%.1f', `2a')
5559 @error{}m4:stdin:3: Warning: format: non-numeric argument `2a'
5560 @result{}2.0
5561 @end example
5562
5563 @node Arithmetic
5564 @chapter Macros for doing arithmetic
5565
5566 @cindex arithmetic
5567 @cindex integer arithmetic
5568 Integer arithmetic is included in @code{m4}, with a C-like syntax.  As
5569 convenient shorthands, there are builtins for simple increment and
5570 decrement operations.
5571
5572 @menu
5573 * Incr::                        Decrement and increment operators
5574 * Eval::                        Evaluating integer expressions
5575 @end menu
5576
5577 @node Incr
5578 @section Decrement and increment operators
5579
5580 @cindex decrement operator
5581 @cindex increment operator
5582 Increment and decrement of integers are supported using the builtins
5583 @code{incr} and @code{decr}:
5584
5585 @deffn Builtin incr (@var{number})
5586 @deffnx Builtin decr (@var{number})
5587 Expand to the numerical value of @var{number}, incremented
5588 or decremented, respectively, by one.  Except for the empty string, the
5589 expansion is empty if @var{number} could not be parsed.
5590
5591 The macros @code{incr} and @code{decr} are recognized only with
5592 parameters.
5593 @end deffn
5594
5595 @example
5596 incr(`4')
5597 @result{}5
5598 decr(`7')
5599 @result{}6
5600 incr()
5601 @error{}m4:stdin:3: Warning: incr: empty string treated as 0
5602 @result{}1
5603 decr()
5604 @error{}m4:stdin:4: Warning: decr: empty string treated as 0
5605 @result{}-1
5606 @end example
5607
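A frequent use of @code{incr} is to maintain a counter by redefining a
macro in terms of its own current value.  In this sketch, the macro
names @code{counter} and @code{next} are hypothetical:

@example
define(`counter', `0')
@result{}
define(`next', `define(`counter', incr(counter))counter')
@result{}
next next next
@result{}1 2 3
@end example
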
5608 @node Eval
5609 @section Evaluating integer expressions
5610
5611 @cindex integer expression evaluation
5612 @cindex evaluation, of integer expressions
5613 @cindex expressions, evaluation of integer
5614 Integer expressions are evaluated with @code{eval}:
5615
5616 @deffn Builtin eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
5617 Expands to the value of @var{expression}.  The expansion is empty
5618 if a problem is encountered while parsing the arguments.  If specified,
5619 @var{radix} and @var{width} control the format of the output.
5620
5621 Calculations are done with 32-bit signed numbers.  Overflow silently
5622 results in wraparound.  A warning is issued if division by zero is
5623 attempted, or if @var{expression} could not be parsed.
5624
5625 Expressions can contain the following operators, listed in order of
5626 decreasing precedence.
5627
5628 @table @samp
5629 @item ()
5630 Parentheses
5631 @item +  -  ~  !
5632 Unary plus and minus, and bitwise and logical negation
5633 @item **
5634 Exponentiation
5635 @item *  /  %
5636 Multiplication, division, and modulo
5637 @item +  -
5638 Addition and subtraction
5639 @item <<  >>
5640 Shift left or right
5641 @item >  >=  <  <=
5642 Relational operators
5643 @item ==  !=
5644 Equality operators
5645 @item &
5646 Bitwise and
5647 @item ^
5648 Bitwise exclusive-or
5649 @item |
5650 Bitwise or
5651 @item &&
5652 Logical and
5653 @item ||
5654 Logical or
5655 @end table
5656
5657 The macro @code{eval} is recognized only with parameters.
5658 @end deffn
5659
5660 All binary operators, except exponentiation, are left associative.  C
5661 operators that perform variable assignment, such as @samp{+=} or
5662 @samp{--}, are not implemented, since @code{eval} only operates on
5663 constants, not variables.  Attempting to use them results in an error.
5664 However, since traditional implementations treated @samp{=} as an
5665 undocumented alias for @samp{==} as opposed to an assignment operator,
5666 this usage is supported as a special case.  Be aware that a future
5667 version of @acronym{GNU} M4 may support assignment semantics as an
5668 extension when @acronym{POSIX} mode is not requested, and that using
5669 @samp{=} to check equality is not portable.
5670
5671 @comment status: 1
5672 @example
5673 eval(`2 = 2')
5674 @error{}m4:stdin:1: Warning: eval: recommend ==, not =, for equality
5675 @result{}1
5676 eval(`++0')
5677 @error{}m4:stdin:2: eval: invalid operator: ++0
5678 @result{}
5679 eval(`0 |= 1')
5680 @error{}m4:stdin:3: eval: invalid operator: 0 |= 1
5681 @result{}
5682 @end example
5683
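As a brief sketch of the bitwise and shift operators in the table above,
and of left associativity, using arbitrary operands:

@example
eval(`6 & 3')
@result{}2
eval(`6 | 3')
@result{}7
eval(`6 ^ 3')
@result{}5
eval(`1 << 4')
@result{}16
eval(`~0')
@result{}-1
eval(`10 - 4 - 3')
@result{}3
eval(`100 / 10 / 5')
@result{}2
@end example
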
5684 Note that some older @code{m4} implementations use @samp{^} as an
alternate operator for exponentiation, although @acronym{POSIX}
5686 requires the C behavior of bitwise exclusive-or.  The precedence of the
5687 negation operators, @samp{~} and @samp{!}, was traditionally lower than
5688 equality.  The unary operators could not be used reliably more than once
5689 on the same term without intervening parentheses.  The traditional
5690 precedence of the equality operators @samp{==} and @samp{!=} was
5691 identical instead of lower than the relational operators such as
5692 @samp{<}, even through @acronym{GNU} M4 1.4.8.  Starting with version
5693 1.4.9, @acronym{GNU} M4 correctly follows @acronym{POSIX} precedence
5694 rules.  M4 scripts designed to be portable between releases must be
5695 aware that parentheses may be required to enforce C precedence rules.
5696 Likewise, division by zero, even in the unused branch of a
5697 short-circuiting operator, is not always well-defined in other
5698 implementations.
5699
5700 Following are some examples where the current version of M4 follows C
5701 precedence rules, but where older versions and some other
5702 implementations of @code{m4} require explicit parentheses to get the
5703 correct result:
5704
5705 @example
5706 eval(`1 == 2 > 0')
5707 @result{}1
5708 eval(`(1 == 2) > 0')
5709 @result{}0
5710 eval(`! 0 * 2')
5711 @result{}2
5712 eval(`! (0 * 2)')
5713 @result{}1
5714 eval(`1 | 1 ^ 1')
5715 @result{}1
5716 eval(`(1 | 1) ^ 1')
5717 @result{}0
5718 eval(`+ + - ~ ! ~ 0')
5719 @result{}1
5720 eval(`2 || 1 / 0')
5721 @result{}1
5722 eval(`0 || 1 / 0')
5723 @error{}m4:stdin:9: Warning: eval: divide by zero: 0 || 1 / 0
5724 @result{}
5725 eval(`0 && 1 % 0')
5726 @result{}0
5727 eval(`2 && 1 % 0')
5728 @error{}m4:stdin:11: Warning: eval: modulo by zero: 2 && 1 % 0
5729 @result{}
5730 @end example
5731
5732 @cindex @acronym{GNU} extensions
5733 As a @acronym{GNU} extension, the operator @samp{**} performs integral
5734 exponentiation.  The operator is right-associative, and if evaluated,
5735 the exponent must be non-negative, and at least one of the arguments
5736 must be non-zero, or a warning is issued.
5737
5738 @example
5739 eval(`2 ** 3 ** 2')
5740 @result{}512
5741 eval(`(2 ** 3) ** 2')
5742 @result{}64
5743 eval(`0 ** 1')
5744 @result{}0
5745 eval(`2 ** 0')
5746 @result{}1
5747 eval(`0 ** 0')
@error{}m4:stdin:5: Warning: eval: divide by zero: 0 ** 0
@result{}
5750 eval(`4 ** -2')
5751 @error{}m4:stdin:6: Warning: eval: negative exponent: 4 ** -2
5752 @result{}
5753 @end example
5754
Within @var{expression} (but not @var{radix} or @var{width}), numbers
without a special prefix are decimal.  A simple @samp{0} prefix
introduces an octal number, and @samp{0x} introduces a hexadecimal
number.  As @acronym{GNU} extensions, @samp{0b} introduces a binary
number, and @samp{0r} introduces a number expressed in any radix between
1 and 36: the prefix should be immediately followed by the decimal
expression of the radix, a colon, then the digits making up the number.
For radix 1, leading zeros are ignored, and all remaining digits must be
@samp{1}; for all other radices, the digits are @samp{0}, @samp{1},
@samp{2}, @dots{}.  Beyond @samp{9}, the digits are @samp{a}, @samp{b},
@dots{} up to @samp{z}.  Lower and upper case letters can be used
interchangeably in number prefixes and as digits.
5767
5768 Parentheses may be used to group subexpressions whenever needed.  For the
5769 relational operators, a true relation returns @code{1}, and a false
relation returns @code{0}.
5771
5772 Here are a few examples of use of @code{eval}.
5773
5774 @example
5775 eval(`-3 * 5')
5776 @result{}-15
5777 eval(`-99 / 10')
5778 @result{}-9
5779 eval(`-99 % 10')
5780 @result{}-9
5781 eval(`99 % -10')
5782 @result{}9
5783 eval(index(`Hello world', `llo') >= 0)
5784 @result{}1
5785 eval(`0r1:0111 + 0b100 + 0r3:12')
5786 @result{}12
5787 define(`square', `eval(`($1) ** 2')')
5788 @result{}
5789 square(`9')
5790 @result{}81
5791 square(square(`5')` + 1')
5792 @result{}676
5793 define(`foo', `666')
5794 @result{}
5795 eval(`foo / 6')
5796 @error{}m4:stdin:11: Warning: eval: bad expression: foo / 6
5797 @result{}
5798 eval(foo / 6)
5799 @result{}111
5800 @end example
5801
5802 As the last two lines show, @code{eval} does not handle macro
5803 names, even if they expand to a valid expression (or part of a valid
5804 expression).  Therefore all macros must be expanded before they are
5805 passed to @code{eval}.
5806
5807 Some calculations are not portable to other implementations, since they
5808 have undefined semantics in C, but @acronym{GNU} @code{m4} has
5809 well-defined behavior on overflow.  When shifting, an out-of-range shift
amount is implicitly brought into the range of 32-bit signed integers
using an implicit bit-wise and with @samp{0x1f}.
5812
5813 @example
5814 define(`max_int', eval(`0x7fffffff'))
5815 @result{}
5816 define(`min_int', incr(max_int))
5817 @result{}
5818 eval(min_int` < 0')
5819 @result{}1
5820 eval(max_int` > 0')
5821 @result{}1
5822 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
5823 @result{}overflow occurred
5824 min_int
5825 @result{}-2147483648
5826 eval(`0x80000000 % -1')
5827 @result{}0
5828 eval(`-4 >> 1')
5829 @result{}-2
5830 eval(`-4 >> 33')
5831 @result{}-2
5832 @end example
5833
5834 If @var{radix} is specified, it specifies the radix to be used in the
5835 expansion.  The default radix is 10; this is also the case if
5836 @var{radix} is the empty string.  A warning results if the radix is
5837 outside the range of 1 through 36, inclusive.  The result of @code{eval}
5838 is always taken to be signed.  No radix prefix is output, and for
5839 radices greater than 10, the digits are lower case.  The @var{width}
5840 argument specifies the minimum output width, excluding any negative
5841 sign.  The result is zero-padded to extend the expansion to the
5842 requested width.  A warning results if the width is negative.  If
5843 @var{radix} or @var{width} is out of bounds, the expansion of
5844 @code{eval} is empty.
5845
5846 @example
5847 eval(`666', `10')
5848 @result{}666
5849 eval(`666', `11')
5850 @result{}556
5851 eval(`666', `6')
5852 @result{}3030
5853 eval(`666', `6', `10')
5854 @result{}0000003030
5855 eval(`-666', `6', `10')
5856 @result{}-0000003030
5857 eval(`10', `', `0')
5858 @result{}10
5859 `0r1:'eval(`10', `1', `11')
5860 @result{}0r1:01111111111
5861 eval(`10', `16')
5862 @result{}a
5863 eval(`1', `37')
5864 @error{}m4:stdin:9: Warning: eval: radix 37 out of range
5865 @result{}
5866 eval(`1', , `-1')
5867 @error{}m4:stdin:10: Warning: eval: negative width
5868 @result{}
5869 eval()
5870 @error{}m4:stdin:11: Warning: eval: empty string treated as 0
5871 @result{}0
5872 @end example
5873
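Because no radix prefix is output, a result produced with a non-decimal
@var{radix} must have a suitable prefix added back before it can be fed
to @code{eval} again.  A brief sketch of such a round trip, limited to
non-negative values:

@example
eval(`255', `16')
@result{}ff
eval(`0x'eval(`255', `16'))
@result{}255
eval(`0b'eval(`5', `2') + 1)
@result{}6
@end example
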
5874 @node Shell commands
5875 @chapter Macros for running shell commands
5876
5877 @cindex UNIX commands, running
5878 @cindex executing shell commands
5879 @cindex running shell commands
5880 @cindex shell commands, running
5881 @cindex commands, running shell
5882 There are a few builtin macros in @code{m4} that allow you to run shell
5883 commands from within @code{m4}.
5884
5885 Note that the definition of a valid shell command is system dependent.
On UNIX systems, commands are typically interpreted by
@command{/bin/sh}.  But on other systems, such as native Windows, the
shell understands a different command syntax.  Some examples in this
chapter assume
5889 @command{/bin/sh}, and also demonstrate how to quit early with a known
5890 exit value if this is not the case.
5891
5892 @menu
5893 * Platform macros::             Determining the platform
5894 * Syscmd::                      Executing simple commands
5895 * Esyscmd::                     Reading the output of commands
5896 * Sysval::                      Exit status
5897 * Mkstemp::                     Making temporary files
5898 @end menu
5899
5900 @node Platform macros
5901 @section Determining the platform
5902
5903 @cindex platform macros
5904 Sometimes it is desirable for an input file to know which platform
5905 @code{m4} is running on.  @acronym{GNU} @code{m4} provides several
5906 macros that are predefined to expand to the empty string; checking for
5907 their existence will confirm platform details.
5908
5909 @deffn {Optional builtin} __gnu__
5910 @deffnx {Optional builtin} __os2__
5911 @deffnx {Optional builtin} os2
5912 @deffnx {Optional builtin} __unix__
5913 @deffnx {Optional builtin} unix
5914 @deffnx {Optional builtin} __windows__
5915 @deffnx {Optional builtin} windows
5916 Each of these macros is conditionally defined as needed to describe the
5917 environment of @code{m4}.  If defined, each macro expands to the empty
5918 string.  For now, these macros silently ignore all arguments, but in a
5919 future release of M4, they might warn if arguments are present.
5920 @end deffn
5921
5922 When @acronym{GNU} extensions are in effect (that is, when you did not
5923 use the @option{-G} option, @pxref{Limits control, , Invoking m4}),
5924 @acronym{GNU} @code{m4} will define the macro @code{@w{__gnu__}} to
5925 expand to the empty string.
5926
5927 @example
5928 $ @kbd{m4}
5929 __gnu__
5930 @result{}
5931 __gnu__(`ignored')
5932 @result{}
5933 Extensions are ifdef(`__gnu__', `active', `inactive')
5934 @result{}Extensions are active
5935 @end example
5936
5937 @comment options: -G
5938 @example
5939 $ @kbd{m4 -G}
5940 __gnu__
5941 @result{}__gnu__
5942 __gnu__(`ignored')
5943 @result{}__gnu__(ignored)
5944 Extensions are ifdef(`__gnu__', `active', `inactive')
5945 @result{}Extensions are inactive
5946 @end example
5947
5948 On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
5949 by default, or @code{unix} when the @option{-G} option is specified.
5950
5951 On native Windows systems, @acronym{GNU} @code{m4} will define
5952 @code{@w{__windows__}} by default, or @code{windows} when the
5953 @option{-G} option is specified.
5954
5955 On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
5956 by default, or @code{os2} when the @option{-G} option is specified.
5957
5958 If @acronym{GNU} @code{m4} does not provide a platform macro for your system,
5959 please report that as a bug.
5960
5961 @example
5962 define(`provided', `0')
5963 @result{}
5964 ifdef(`__unix__', `define(`provided', incr(provided))')
5965 @result{}
5966 ifdef(`__windows__', `define(`provided', incr(provided))')
5967 @result{}
5968 ifdef(`__os2__', `define(`provided', incr(provided))')
5969 @result{}
5970 provided
5971 @result{}1
5972 @end example
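
A single macro can collect these checks.  The following is only a
sketch: the name @code{platform} is not a builtin but a hypothetical
helper introduced here, and it assumes that @acronym{GNU} extensions are
in effect, so that the @samp{__}-decorated names are the ones defined.
The expansion depends on where @code{m4} runs, so no output is shown for
the final line; on a UNIX system it would be expected to end in
@samp{unix}.

@comment ignore
@example
define(`platform',
  `ifdef(`__windows__', `windows',
         `ifdef(`__os2__', `os2',
                `ifdef(`__unix__', `unix', `unknown')')')')
@result{}
Running on platform
@end example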
5973
5974 @node Syscmd
5975 @section Executing simple commands
5976
5977 Any shell command can be executed, using @code{syscmd}:
5978
5979 @deffn Builtin syscmd (@var{shell-command})
5980 Executes @var{shell-command} as a shell command.
5981
5982 The expansion of @code{syscmd} is void, @emph{not} the output from
5983 @var{shell-command}!  Output or error messages from @var{shell-command}
5984 are not read by @code{m4}.  @xref{Esyscmd}, if you need to process the
5985 command output.
5986
5987 Prior to executing the command, @code{m4} flushes its buffers.
5988 The default standard input, output and error of @var{shell-command} are
5989 the same as those of @code{m4}.
5990
5991 The macro @code{syscmd} is recognized only with parameters.
5992 @end deffn
5993
5994 @example
5995 define(`foo', `FOO')
5996 @result{}
5997 syscmd(`echo foo')
5998 @result{}foo
5999 @result{}
6000 @end example
6001
6002 Note that the command's own trailing newline is output, followed by the
6003 newline that appeared after the macro; @code{syscmd} itself expands to nothing.
6004
6005 The following is an example of @var{shell-command} using the same
6006 standard input as @code{m4}:
6007
6008 @comment ignore
6009 @example
6010 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
6011 @result{}
6012 @end example
6013
6014 @ignore
6015 @comment If the user types the example below with stdin being an
6016 @comment interactive terminal, then cat will hang waiting for additional
6017 @comment input after m4 has exited.  But the testsuite is using a pipe
6018 @comment for stdin.  Hence, we have two versions - the one we feed the
6019 @comment testsuite below, and the one we display to the user above that
6020 @comment more accurately shows what the testsuite is really doing but
6021 @comment which the testsuite cannot parse.
6022
6023 @example
6024 m4wrap(`syscmd(`cat')')
6025 @result{}
6026 ^D
6027 @end example
6028 @end ignore
6029
6030 This example tells @code{m4} to read all of its input before executing the wrapped
6031 text, then hand a valid (albeit emptied) pipe as standard input for the
6032 @code{cat} subcommand.  Therefore, you should be careful when using
6033 standard input (either by specifying no files, or by passing @samp{-} as
6034 a file name on the command line, @pxref{Command line files, , Invoking
6035 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
6036 that consume data from standard input.  When standard input is a
6037 seekable file, the subprocess will pick up with the next character not
6038 yet processed by @code{m4}; when it is a pipe or other non-seekable
6039 file, there is no guarantee how much data will already be buffered by
6040 @code{m4} and thus unavailable to the child.
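
One way to sidestep the problem is to give the child its own standard
input.  The following sketch assumes a UNIX-like shell where
@file{/dev/null} exists; the redirection means that @code{cat} sees an
immediate end of file and cannot consume any of the input intended for
@code{m4}.

@comment ignore
@example
syscmd(`cat </dev/null')
@result{}
@end example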
6041
6042 @node Esyscmd
6043 @section Reading the output of commands
6044
6045 @cindex @acronym{GNU} extensions
6046 If you want @code{m4} to read the output of a shell command, use
6047 @code{esyscmd}:
6048
6049 @deffn Builtin esyscmd (@var{shell-command})
6050 Expands to the standard output of the shell command
6051 @var{shell-command}.
6052
6053 Prior to executing the command, @code{m4} flushes its buffers.
6054 The default standard input and standard error of @var{shell-command} are
6055 the same as those of @code{m4}.  The error output of @var{shell-command}
6056 is not a part of the expansion: it will appear along with the error
6057 output of @code{m4}.
6058
6059 The macro @code{esyscmd} is recognized only with parameters.
6060 @end deffn
6061
6062 @example
6063 define(`foo', `FOO')
6064 @result{}
6065 esyscmd(`echo foo')
6066 @result{}FOO
6067 @result{}
6068 @end example
6069
6070 Note how the expansion of @code{esyscmd} keeps the trailing newline of
6071 the command, as well as using the newline that appeared after the macro.
6072
6073 Just as with @code{syscmd}, care must be exercised when sharing standard
6074 input between @code{m4} and the child process of @code{esyscmd}.
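
If the trailing newline is unwanted, it can be removed before the text
is used.  A minimal sketch, assuming a shell with the usual @code{echo}
and that no other newlines in the output need to be preserved, deletes
every newline with @code{translit} (@pxref{Translit}):

@comment ignore
@example
[translit(esyscmd(`echo hello'), `
')]
@result{}[hello]
@end example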
6075
6076 @node Sysval
6077 @section Exit status
6078
6079 @cindex UNIX commands, exit status from
6080 @cindex exit status from shell commands
6081 @cindex shell commands, exit status from
6082 @cindex commands, exit status from shell
6083 @cindex status of shell commands
6084 To see whether a shell command succeeded, use @code{sysval}:
6085
6086 @deffn Builtin sysval
6087 Expands to the exit status of the last shell command run with
6088 @code{syscmd} or @code{esyscmd}.  Expands to 0 if no command has been
6089 run yet.
6090 @end deffn
6091
6092 @example
6093 sysval
6094 @result{}0
6095 syscmd(`false')
6096 @result{}
6097 ifelse(sysval, `0', `zero', `non-zero')
6098 @result{}non-zero
6099 syscmd(`exit 2')
6100 @result{}
6101 sysval
6102 @result{}2
6103 syscmd(`true')
6104 @result{}
6105 sysval
6106 @result{}0
6107 esyscmd(`false')
6108 @result{}
6109 ifelse(sysval, `0', `zero', `non-zero')
6110 @result{}non-zero
6111 esyscmd(`exit 2')
6112 @result{}
6113 sysval
6114 @result{}2
6115 esyscmd(`true')
6116 @result{}
6117 sysval
6118 @result{}0
6119 @end example
6120
6121 @code{sysval} results in 127 if there was a problem executing the
6122 command, for example, if the system-imposed argument length is exceeded,
6123 or if there were not enough resources to fork.  It is not possible to
6124 distinguish between failed execution and successful execution that had
6125 an exit status of 127.
6126
6127 On UNIX platforms, where it is possible to detect when command execution
6128 is terminated by a signal, rather than a normal exit, the result is the
6129 signal number shifted left by eight bits.
6130
6131 @comment This test has difficulties being portable, even on platforms
6132 @comment where syscmd invokes /bin/sh.  Kill is not portable with signal
6133 @comment names.  According to autoconf, the only portable signal numbers
6134 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM).  But
6135 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
6136 @comment exits normally rather than letting the signal terminate it).
6137 @comment Also, TERM is flaky, as it can also kill the running m4 on
6138 @comment systems where /bin/sh does not create its own process group.
6139 @comment And PIPE is unreliable, since people tend to run with it
6140 @comment ignored, with m4 inheriting that choice.  That leaves KILL as
6141 @comment the only signal we can reliably test.
6142 @example
6143 dnl This test assumes kill is a shell builtin, and that signals are
6144 dnl recognizable.
6145 ifdef(`__unix__', ,
6146       `errprint(` skipping: syscmd does not have unix semantics
6147 ')m4exit(`77')')dnl
6148 syscmd(`kill -9 $$')
6149 @result{}
6150 sysval
6151 @result{}2304
6152 syscmd()
6153 @result{}
6154 sysval
6155 @result{}0
6156 esyscmd(`kill -9 $$')
6157 @result{}
6158 sysval
6159 @result{}2304
6160 @end example
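
Since the signal number is shifted left by eight bits, it can be
recovered with @code{eval} (@pxref{Eval}).  This sketch makes the same
assumptions as the previous example (UNIX semantics, with @code{kill}
available to the shell):

@comment ignore
@example
syscmd(`kill -9 $$')
@result{}
eval(sysval >> 8)
@result{}9
@end example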
6161
6162 @node Mkstemp
6163 @section Making temporary files
6164
6165 @cindex temporary file names
6166 @cindex files, names of temporary
6167 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6168 temporary file, for output or for some other purpose.  There is a
6169 builtin macro, @code{mkstemp}, for making a temporary file:
6170
6171 @deffn Builtin mkstemp (@var{template})
6172 @deffnx Builtin maketemp (@var{template})
6173 Expands to the quoted name of a new, empty file, made from the string
6174 @var{template}, which should end with the string @samp{XXXXXX}.  The six
6175 @samp{X} characters are then replaced with random characters matching
6176 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
6177 name unique.  If fewer than six @samp{X} characters are found at the end
6178 of @var{template}, the result will be longer than the template.  The
6179 created file will have access permissions as if by @kbd{chmod =rw,go=},
6180 meaning that the current umask of the @code{m4} process is taken into
6181 account, and at most only the current user can read and write the file.
6182
6183 The traditional behavior, standardized by @acronym{POSIX}, is that
6184 @code{maketemp} merely replaces the trailing @samp{X} with the process
6185 id, without creating a file or quoting the expansion, and without
6186 ensuring that the resulting
6187 string is a unique file name.  In part, this means that using the same
6188 @var{template} twice in the same input file will result in the same
6189 expansion.  This behavior is a security hole, as it is very easy for
6190 another process to guess the name that will be generated, and thus
6191 interfere with a subsequent use of @code{syscmd} trying to manipulate
6192 that file name.  Hence, @acronym{POSIX} has recommended that all new
6193 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
6194 and that users of @code{m4} check for its existence.
6195
6196 The expansion is void and an error is issued if a temporary file could
6197 not be created.
6198
6199 The macros @code{mkstemp} and @code{maketemp} are recognized only with
6200 parameters.
6201 @end deffn
6202
6203 If you try this next example, you will most likely get different output
6204 for the two file names, since the replacement characters are randomly
6205 chosen:
6206
6207 @comment ignore
6208 @example
6209 $ @kbd{m4}
6210 define(`tmp', `oops')
6211 @result{}
6212 maketemp(`/tmp/fooXXXXXX')
6213 @result{}/tmp/fooa07346
6214 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
6215       `define(`mkstemp', defn(`maketemp'))dnl
6216 errprint(`warning: potentially insecure maketemp implementation
6217 ')')
6218 @result{}
6219 mkstemp(`doc')
6220 @result{}docQv83Uw
6221 @end example
6222
6223 @cindex @acronym{GNU} extensions
6224 Unless you use the @option{--traditional} command line option (or
6225 @option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
6226 version of @code{maketemp} is secure.  This means that passing the same
6227 template to multiple calls will generate multiple files.  However, we
6228 recommend that you use the new @code{mkstemp} macro, introduced in
6229 @acronym{GNU} M4 1.4.8, which is secure even in traditional mode.  Also,
6230 as of M4 1.4.11, the secure implementation quotes the resulting file
6231 name, so that you are guaranteed to know what file was created even if
6232 the random file name happens to match an existing macro.  Notice that
6233 this example is careful to use @code{defn} to avoid unintended expansion
6234 of @samp{foo}.
6235
6236 @example
6237 $ @kbd{m4}
6238 define(`foo', `errprint(`oops')')
6239 @result{}
6240 syscmd(`rm -f foo-??????')sysval
6241 @result{}0
6242 define(`file1', maketemp(`foo-XXXXXX'))dnl
6243 ifelse(esyscmd(`echo \` foo-?????? \''), ` foo-?????? ',
6244        `no file', `created')
6245 @result{}created
6246 define(`file2', maketemp(`foo-XX'))dnl
6247 define(`file3', mkstemp(`foo-XXXXXX'))dnl
6248 ifelse(len(defn(`file1')), len(defn(`file2')),
6249        `same length', `different')
6250 @result{}same length
6251 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
6252 @result{}different file
6253 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
6254 @result{}different file
6255 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
6256 @result{}different file
6257 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
6258 @result{}
6259 sysval
6260 @result{}0
6261 @end example
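
A typical use is to create a scratch file, let a command fill it, read
it back, and then remove it.  The following is a sketch only; the macro
name @code{scratch} and the template are arbitrary choices made for
illustration, and a UNIX-like shell is assumed.  The file name is always
retrieved through @code{defn}, so that it is never rescanned for macros.

@comment ignore
@example
define(`scratch', mkstemp(`scratch-XXXXXX'))dnl
syscmd(`echo hello > 'defn(`scratch'))dnl
include(defn(`scratch'))dnl
@result{}hello
syscmd(`rm -f 'defn(`scratch'))dnl
@end example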
6262
6263 @ignore
6264 @c Not worth documenting, but make sure we don't leave trailing NUL in
6265 @c the expansion.
6266
6267 @example
6268 syscmd(`rm -f foo??????')sysval
6269 @result{}0
6270 len(mkstemp(`fooXXXXX'))
6271 @result{}9
6272 syscmd(`rm foo??????')sysval
6273 @result{}0
6274 @end example
6275
6276 @c Likewise, and ensure that traditional mode leaves the result unquoted
6277 @c without creating a file.
6278
6279 @comment options: -G
6280 @example
6281 syscmd(`rm -f foo-*')sysval
6282 @result{}0
6283 len(maketemp(`foo-XXXXX'))
6284 @error{}m4:stdin:2: Warning: maketemp: recommend using mkstemp instead
6285 @result{}9
6286 define(`abc', `def')
6287 @result{}
6288 maketemp(`foo-abc')
6289 @result{}foo-def
6290 @error{}m4:stdin:4: Warning: maketemp: recommend using mkstemp instead
6291 syscmd(`test -f foo-*')sysval
6292 @result{}1
6293 @end example
6294 @end ignore
6295
6296 @node Miscellaneous
6297 @chapter Miscellaneous builtin macros
6298
6299 This chapter describes various builtins that do not really belong in
6300 any of the previous chapters.
6301
6302 @menu
6303 * Errprint::                    Printing error messages
6304 * Location::                    Printing current location
6305 * M4exit::                      Exiting from @code{m4}
6306 @end menu
6307
6308 @node Errprint
6309 @section Printing error messages
6310
6311 @cindex printing error messages
6312 @cindex error messages, printing
6313 @cindex messages, printing error
6314 @cindex standard error, output to
6315 You can print error messages using @code{errprint}:
6316
6317 @deffn Builtin errprint (@var{message}, @dots{})
6318 Prints @var{message} and the rest of the arguments to standard error,
6319 separated by spaces.  Standard error is used, regardless of the
6320 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
6321
6322 The expansion of @code{errprint} is void.
6323 The macro @code{errprint} is recognized only with parameters.
6324 @end deffn
6325
6326 @example
6327 errprint(`Invalid arguments to forloop
6328 ')
6329 @error{}Invalid arguments to forloop
6330 @result{}
6331 errprint(`1')errprint(`2',`3
6332 ')
6333 @error{}12 3
6334 @result{}
6335 @end example
6336
6337 A trailing newline is @emph{not} printed automatically, so it should be
6338 supplied as part of the argument, as in the example.  Unfortunately, the
6339 exact output of @code{errprint} is not very portable to other @code{m4}
6340 implementations: @acronym{POSIX} requires that all arguments be printed,
6341 but some implementations of @code{m4} only print the first.
6342 Furthermore, some @acronym{BSD} implementations always append a newline
6343 for each @code{errprint} call, regardless of whether the last argument
6344 already had one, and @acronym{POSIX} is silent on whether this is
6345 acceptable.
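
When portability matters, one workaround is to hand @code{errprint} a
single argument.  In this sketch, @code{say} is a hypothetical wrapper,
not a builtin; it joins its arguments with @samp{$*} (so they end up
separated by commas rather than spaces) and supplies the newline itself:

@example
define(`say', `errprint(`$*
')')
@result{}
say(`one', `two')
@error{}one,two
@result{}
@end example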
6346
6347 @node Location
6348 @section Printing current location
6349
6350 @cindex location, input
6351 @cindex input location
6352 To make it possible to specify the location of an error, three
6353 utility builtins exist:
6354
6355 @deffn Builtin __file__
6356 @deffnx Builtin __line__
6357 @deffnx Builtin __program__
6358 Expand to the quoted name of the current input file, the
6359 current input line number in that file, and the quoted name of the
6360 current invocation of @code{m4}.
6361 @end deffn
6362
6363 @example
6364 errprint(__program__:__file__:__line__: `input error
6365 ')
6366 @error{}m4:stdin:1: input error
6367 @result{}
6368 @end example
6369
6370 Line numbers start at 1 for each file.  If the file was found due to the
6371 @option{-I} option or @env{M4PATH} environment variable, that is
6372 reflected in the file name.  The syncline option (@option{-s},
6373 @pxref{Preprocessor features, , Invoking m4}), and the
6374 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debug Levels}),
6375 also use this notion of current file and line.  Redefining the three
6376 location macros has no effect on syncline, debug, warning, or error
6377 message output.
6378
6379 This example reuses the file @file{incl.m4} mentioned earlier
6380 (@pxref{Include}):
6381
6382 @comment examples
6383 @example
6384 $ @kbd{m4 -I examples}
6385 define(`foo', ``$0' called at __file__:__line__')
6386 @result{}
6387 foo
6388 @result{}foo called at stdin:2
6389 include(`incl.m4')
6390 @result{}Include file start
6391 @result{}foo called at ../examples/incl.m4:2
6392 @result{}Include file end
6393 @result{}
6394 @end example
6395
6396 The location of macros invoked during the rescanning of macro expansion
6397 text corresponds to the location in the file where the expansion was
6398 triggered, regardless of how many newline characters the expansion text
6399 contains.  As of @acronym{GNU} M4 1.4.8, the location of text wrapped
6400 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
6401 @code{m4wrap} was invoked.  Previous versions, however, behaved as
6402 though wrapped text came from line 0 of the file ``''.
6403
6404 @example
6405 define(`echo', `$@@')
6406 @result{}
6407 define(`foo', `echo(__line__
6408 __line__)')
6409 @result{}
6410 echo(__line__
6411 __line__)
6412 @result{}4
6413 @result{}5
6414 m4wrap(`foo
6415 ')
6416 @result{}
6417 foo(errprint(__line__
6418 __line__
6419 ))
6420 @error{}8
6421 @error{}9
6422 @result{}8
6423 @result{}8
6424 __line__
6425 @result{}11
6426 ^D
6427 @result{}6
6428 @result{}6
6429 @end example
6430
6431 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
6432 terminology.  If you invoke @code{m4} through an absolute path or a link
6433 with a different spelling, rather than by relying on a @env{PATH} search
6434 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
6435 The intent is that you can use it to produce error messages with the
6436 same formatting that @code{m4} produces internally.  It can also be used
6437 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
6438 @code{m4} that is currently running, rather than whatever version of
6439 @code{m4} happens to be first in @env{PATH}.  It was first introduced in
6440 @acronym{GNU} M4 1.4.6.
6441
6442 @node M4exit
6443 @section Exiting from @code{m4}
6444
6445 @cindex exiting from @code{m4}
6446 @cindex status, setting @code{m4} exit
6447 If you need to exit from @code{m4} before the entire input has been
6448 read, you can use @code{m4exit}:
6449
6450 @deffn Builtin m4exit (@dvar{code, 0})
6451 Causes @code{m4} to exit, with exit status @var{code}.  If @var{code} is
6452 left out, the exit status is zero.  If @var{code} cannot be parsed, or
6453 is outside the range of 0 to 255, the exit status is one.  No further
6454 input is read, and all wrapped and diverted text is discarded.
6455 @end deffn
6456
6457 @example
6458 m4wrap(`This text is lost due to `m4exit'.')
6459 @result{}
6460 divert(`1') So is this.
6461 divert
6462 @result{}
6463 m4exit And this is never read.
6464 @end example
6465
6466 A common use of this is to abort processing:
6467
6468 @deffn Composite fatal_error (@var{message})
6469 Abort processing with an error message and non-zero status.  Prefix
6470 @var{message} with details about where the error occurred, and print the
6471 resulting string to standard error.
6472 @end deffn
6473
6474 @comment status: 1
6475 @example
6476 define(`fatal_error',
6477        `errprint(__program__:__file__:__line__`: fatal error: $*
6478 ')m4exit(`1')')
6479 @result{}
6480 fatal_error(`this is a BAD one, buster')
6481 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
6482 @end example
6483
6484 After this macro call, @code{m4} will exit with exit status 1.  This macro
6485 is only intended for error exits, since the normal exit procedures are
6486 not followed, i.e., diverted text is not undiverted, and saved text
6487 (@pxref{M4wrap}) is not reread.  (This macro could be made more robust
6488 to earlier versions of @code{m4}.  You should try to see if you can find
6489 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
6490
6491 Note that it is still possible for the exit status to be different from
6492 what was requested by @code{m4exit}.  If @code{m4} detects some other
6493 error, such as a write error on standard output, the exit status will be
6494 non-zero even if @code{m4exit} requested zero.
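
The exit status requested by @code{m4exit} can be observed from the
invoking shell.  This sketch assumes a POSIX-like shell, and uses the
same quoting as the earlier @code{syscmd} pipeline to pass the input on
the command line:

@comment ignore
@example
$ @kbd{echo "m4exit(\`2')" | m4; echo $?}
@result{}2
@end example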
6495
6496 If standard input is seekable, then the file will be positioned at the
6497 next unread character.  If it is a pipe or other non-seekable file,
6498 then there are no guarantees how much data @code{m4} might have read
6499 into buffers, and thus discarded.
6500
6501 @node Frozen files
6502 @chapter Fast loading of frozen state
6503
6504 Some bigger @code{m4} applications may be built over a common base
6505 containing hundreds of definitions and other costly initializations.
6506 Usually, the common base is kept in one or more declarative files,
6507 which are either listed on each @code{m4} invocation prior to the
6508 user's input file, or else pulled in by each input file via @code{include}.
6509
6510 Reading the common base of a big application, over and over again, may
6511 be time consuming.  @acronym{GNU} @code{m4} offers some machinery to
6512 speed up the start of an application using lengthy common bases.
6513
6514 @menu
6515 * Using frozen files::          Using frozen files
6516 * Frozen file format::          Frozen file format
6517 @end menu
6518
6519 @node Using frozen files
6520 @section Using frozen files
6521
6522 @cindex fast loading of frozen files
6523 @cindex frozen files for fast loading
6524 @cindex initialization, frozen state
6525 @cindex dumping into frozen file
6526 @cindex reloading a frozen file
6527 @cindex @acronym{GNU} extensions
6528 Suppose a user has a library of @code{m4} initializations in
6529 @file{base.m4}, which is then used with multiple input files:
6530
6531 @comment ignore
6532 @example
6533 $ @kbd{m4 base.m4 input1.m4}
6534 $ @kbd{m4 base.m4 input2.m4}
6535 $ @kbd{m4 base.m4 input3.m4}
6536 @end example
6537
6538 Rather than spending time parsing the fixed contents of @file{base.m4}
6539 every time, the user can instead execute:
6540
6541 @comment ignore
6542 @example
6543 $ @kbd{m4 -F base.m4f base.m4}
6544 @end example
6545
6546 @noindent
6547 once, and further execute, as often as needed:
6548
6549 @comment ignore
6550 @example
6551 $ @kbd{m4 -R base.m4f input1.m4}
6552 $ @kbd{m4 -R base.m4f input2.m4}
6553 $ @kbd{m4 -R base.m4f input3.m4}
6554 @end example
6555
6556 @noindent
6557 with the varying input.  The first call, containing the @option{-F}
6558 option, only reads and executes file @file{base.m4}, defining
6559 various application macros and computing other initializations.
6560 Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
6561 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
6562 file which contains a kind of snapshot of the @code{m4} internal state.
6563
6564 Later calls, containing the @option{-R} option, are able to reload
6565 the internal state of @code{m4}, from @file{base.m4f},
6566 @emph{prior} to reading any other input files.  This means that,
6567 instead of starting with a virgin copy of @code{m4}, input will be
6568 read after having effectively recovered the effect of a prior run.
6569 In our example, the effect is the same as if file @file{base.m4} had
6570 been read anew.  However, this effect is achieved a lot faster.
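
As a concrete sketch of that workflow, the session below assumes a
POSIX-like shell; the file names and the macro @code{greeting} are made
up for illustration.

@comment ignore
@example
$ @kbd{echo "define(\`greeting', \`hello')dnl" > base.m4}
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{echo greeting | m4 -R base.m4f}
@result{}hello
@end example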
6571
6572 Only one frozen file may be created or read in any one @code{m4}
6573 invocation.  It is not possible to recover two frozen files at once.
6574 However, frozen files may be updated incrementally, through using
6575 @option{-R} and @option{-F} options simultaneously.  For example, if
6576 some care is taken, the command:
6577
6578 @comment ignore
6579 @example
6580 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
6581 @end example
6582
6583 @noindent
6584 could be broken down in the following sequence, accumulating the same
6585 output:
6586
6587 @comment ignore
6588 @example
6589 $ @kbd{m4 -F file1.m4f file1.m4}
6590 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
6591 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
6592 $ @kbd{m4 -R file3.m4f file4.m4}
6593 @end example
6594
6595 Some care is necessary because not every effort has been made for
6596 this to work in all cases.  In particular, the trace attribute of
6597 macros is not handled, nor the current setting of @code{changeword}.
6598 Currently, @code{m4wrap} and @code{sysval} also have problems.
6599 Also, interactions for some options of @code{m4}, being used in one call
6600 and not in the next, have not been fully analyzed yet.  On the other
6601 hand, you may be confident that stacks of @code{pushdef} definitions
6602 are handled correctly, as well as undefined or renamed builtins, and
6603 changed strings for quotes or comments.  And future releases of
6604 @acronym{GNU} M4 will improve on the utility of frozen files.
6605
6606 @ignore
6607 @c This example is not worth putting in the manual, but caused core
6608 @c dumps in all versions prior to 1.4.11.
6609
6610 @comment options: -F /dev/null
6611 @example
6612 ifdef(`__unix__', ,
6613       `errprint(` skipping: /dev/null not known to exist
6614 ')m4exit(`77')')dnl
6615 traceon(`undefined')dnl
6616 @end example
6617 @end ignore
6618
6619 When an @code{m4} run is to be frozen, the automatic undiversion
6620 which takes place at end of execution is inhibited.  Instead, all
6621 positively numbered diversions are saved into the frozen file.
6622 The active diversion number is also transmitted.
6623
6624 A frozen file to be reloaded need not reside in the current directory.
6625 It is looked up the same way as an @code{include} file (@pxref{Search
6626 Path}).
6627
6628 If the frozen file was generated with a newer version of @code{m4}, and
6629 contains directives that an older @code{m4} cannot parse, attempting to
6630 load the frozen file with option @option{-R} will cause @code{m4} to
6631 exit with status 63 to indicate version mismatch.
6632
6633 @node Frozen file format
6634 @section Frozen file format
6635
6636 @cindex frozen file format
6637 @cindex file format, frozen file
6638 Frozen files are sharable across architectures.  It is safe to write
6639 a frozen file on one machine and read it on another, given that the
6640 second machine uses the same or newer version of @acronym{GNU} @code{m4}.
6641 It is conventional, but not required, to give a frozen file the suffix
6642 of @code{.m4f}.
6643
6644 These are simple (editable) text files, made up of directives,
6645 each starting with a capital letter and ending with a newline
6646 (@key{NL}).  Wherever a directive is expected, the character
6647 @samp{#} introduces a comment line; empty lines are also ignored if they
6648 are not part of an embedded string.
6649 In the following descriptions, each @var{len} refers to the length of
6650 the corresponding string @var{str} in the next line of input.  Numbers
6651 are always expressed in decimal.  There are no escape characters.  The
6652 directives are:
6653
6654 @table @code
6655 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
6656 Uses @var{str1} and @var{str2} as the begin-comment and
6657 end-comment strings.  If omitted, then @samp{#} and @key{NL} are the
6658 comment delimiters.
6659
6660 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
6661 Selects diversion @var{number}, making it current, then copies
6662 @var{str} into the current diversion.  @var{number} may be a negative
6663 number for a nonexistent diversion.  To merely specify an active
6664 selection, use this command with an empty @var{str}.  With 0 as the
6665 diversion @var{number}, @var{str} will be issued on standard output
6666 at reload time.  @acronym{GNU} @code{m4} will not produce the @samp{D}
6667 directive with non-zero length for diversion 0, but this can be done
6668 with manual edits.  This directive may
6669 appear more than once for the same diversion, in which case the
6670 diversion is the concatenation of the various uses.  If omitted, then
6671 diversion 0 is current.
6672
6673 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
6674 Defines, through @code{pushdef}, a definition for @var{str1}
6675 expanding to the function whose builtin name is @var{str2}.  If the
6676 builtin does not exist (for example, if the frozen file was produced by
6677 a copy of @code{m4} compiled with changeword support, but the version
6678 of @code{m4} reloading was compiled without it), the reload is silent,
6679 but any subsequent use of the definition of @var{str1} will result in
6680 a warning.  This directive may appear more than once for the same name,
6681 and its order, along with @samp{T}, is important.  If omitted, you will
6682 have no access to any builtins.
6683
6684 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
6685 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
6686 strings.  If omitted, then @samp{`} and @samp{'} are the quote
6687 delimiters.
6688
6689 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
6690 Defines, through @code{pushdef}, a definition for @var{str1}
6691 expanding to the text given by @var{str2}.  This directive may appear
6692 more than once for the same name, and its order, along with @samp{F}, is
6693 important.
6694
6695 @item V @var{number} @key{NL}
6696 Confirms the format of the file.  @code{m4} @value{VERSION} only creates
6697 and understands frozen files where @var{number} is 1.  This directive
6698 must be the first non-comment in the file, and may not appear more than
6699 once.
6700 @end table
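
To make the layout concrete, here is a small hand-written fragment that
follows the rules above.  It is an illustration only, not the verbatim
output of any @code{m4} release, and the macro @code{greeting} is
hypothetical.  It records a text definition of @code{greeting} as
@samp{hello}, keeps the builtin @code{define} reachable under its own
name, and leaves diversion 0 current with nothing pending (the empty
line after @samp{D0,0} is the zero-length @var{str}):

@comment ignore
@example
# Hand-written sketch of a frozen file
V1
T8,5
greetinghello
F6,6
definedefine
D0,0

# End of sketch
@end example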
6701
6702 @node Compatibility
6703 @chapter Compatibility with other versions of @code{m4}
6704
6705 @cindex compatibility
6706 This chapter describes many of the differences between this
6707 implementation of @code{m4} and other implementations found under
6708 UNIX, such as System V Release 3, Solaris, and @acronym{BSD} flavors.
6709 In particular, it lists the known differences and extensions to
6710 @acronym{POSIX}.  However, the list is not necessarily comprehensive.
6711
6712 At the time of this writing, @acronym{POSIX} 2001 (also known as IEEE
6713 Std 1003.1-2001) is the latest standard, although a new version of
6714 @acronym{POSIX} is under development and includes several proposals for
6715 modifying what @code{m4} is required to do.  The requirements for
6716 @code{m4} are shared between @acronym{SUSv3} and @acronym{POSIX}, and
6717 can be viewed at
6718 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
6719
6720 @menu
6721 * Extensions::                  Extensions in @acronym{GNU} M4
6722 * Incompatibilities::           Facilities in System V m4 not in GNU M4
6723 * Other Incompatibilities::     Other incompatibilities
6724 @end menu
6725
6726 @node Extensions
6727 @section Extensions in @acronym{GNU} M4
6728
6729 @cindex @acronym{GNU} extensions
6730 @cindex @acronym{POSIX}
6731 This version of @code{m4} contains a few facilities that do not exist
6732 in System V @code{m4}.  These extra facilities are all suppressed by
6733 using the @option{-G} command line option (@pxref{Limits control, ,
6734 Invoking m4}), unless overridden by other command line options.
6735
6736 @itemize @bullet
6737 @item
6738 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
6739 several digits, while the System V @code{m4} only accepts one digit.
6740 This allows macros in @acronym{GNU} @code{m4} to take any number of
6741 arguments, and not only nine (@pxref{Arguments}).
6742
6743 This means that @code{define(`foo', `$11')} is ambiguous between
6744 implementations.  To portably choose between grabbing the first
6745 parameter and appending 1 to the expansion, or grabbing the eleventh
6746 parameter, you can do the following:
6747
6748 @example
6749 define(`a1', `A1')
6750 @result{}
6751 dnl First argument, concatenated with 1
6752 define(`_1', `$1')define(`first1', `_1($@@)1')
6753 @result{}
6754 dnl Eleventh argument, portable
6755 define(`_9', `$9')define(`eleventh', `_9(shift(shift($@@)))')
6756 @result{}
6757 dnl Eleventh argument, GNU style
6758 define(`Eleventh', `$11')
6759 @result{}
6760 first1(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
6761 @result{}A1
6762 eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
6763 @result{}k
6764 Eleventh(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k')
6765 @result{}k
6766 @end example
6767
6768 @noindent
6769 Also see the @code{argn} macro (@pxref{Shift}).
6770
6771 @item
6772 The @code{divert} (@pxref{Divert}) macro can manage more than 9
6773 diversions.  @acronym{GNU} @code{m4} treats all positive numbers as valid
6774 diversions, rather than discarding diversions greater than 9.
6775
6776 @item
6777 Files included with @code{include} and @code{sinclude} are sought in a
6778 user specified search path, if they are not found in the working
6779 directory.  The search path is specified by the @option{-I} option and the
6780 @env{M4PATH} environment variable (@pxref{Search Path}).
6781
6782 @item
6783 Arguments to @code{undivert} can be non-numeric, in which case the named
6784 file will be included uninterpreted in the output (@pxref{Undivert}).
6785
6786 @item
6787 Formatted output is supported through the @code{format} builtin, which
6788 is modeled after the C library function @code{printf} (@pxref{Format}).
6789
6790 @item
6791 Searches and text substitution through basic regular expressions are
6792 supported by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
6793 (@pxref{Patsubst}) builtins.  Some @acronym{BSD} implementations use
6794 extended regular expressions instead.
6795
6796 @item
6797 The output of shell commands can be read into @code{m4} with
6798 @code{esyscmd} (@pxref{Esyscmd}).
6799
6800 @item
6801 There is indirect access to any builtin macro with @code{builtin}
6802 (@pxref{Builtin}).
6803
6804 @item
6805 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
6806
6807 @item
6808 The name of the program, the current input file, and the current input
6809 line number are accessible through the builtins @code{@w{__program__}},
6810 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
6811
6812 @item
6813 The format of the output from @code{dumpdef} and macro tracing can be
6814 controlled with @code{debugmode} (@pxref{Debug Levels}).
6815
6816 @item
6817 The destination of trace and debug output can be controlled with
6818 @code{debugfile} (@pxref{Debug Output}).
6819
6820 @item
6821 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
6822 creating a new file with a unique name on every invocation, rather than
6823 following the insecure behavior of replacing the trailing @samp{X}
6824 characters with the @code{m4} process id.
6825
6826 @item
6827 @acronym{POSIX} only requires support for the command line options
6828 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
6829 by @acronym{GNU} M4 are extensions.  @xref{Invoking m4}, for a
6830 description of these options.
6831
6832 The debugging and tracing facilities in @acronym{GNU} @code{m4} are much
6833 more extensive than in most other versions of @code{m4}.
6834 @end itemize
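
As a brief, hedged illustration of a few of the extensions listed above,
the following sketch exercises @code{indir}, @code{builtin},
@code{format}, and @code{patsubst}.  The macro name @code{greet} is
hypothetical and exists only for this example.  Since all four builtins
are extensions, portable input should not rely on them.

@comment ignore
@example
define(`greet', `Hello, $1')
@result{}
indir(`greet', `world')
@result{}Hello, world
builtin(`len', `abcdef')
@result{}6
format(`%05d', `42')
@result{}00042
patsubst(`hello world', `o', `0')
@result{}hell0 w0rld
@end example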
6835
6836 @node Incompatibilities
6837 @section Facilities in System V @code{m4} not in @acronym{GNU} @code{m4}
6838
6839 The version of @code{m4} from System V contains a few facilities that
6840 have not been implemented in @acronym{GNU} @code{m4} yet.  Additionally,
6841 @acronym{POSIX} requires some behaviors that @acronym{GNU} @code{m4} has not
6842 implemented yet.  Relying on these behaviors is non-portable, as a
6843 future release of @acronym{GNU} @code{m4} may change.
6844
6845 @itemize @bullet
6846 @item
6847 @acronym{POSIX} requires support for multiple arguments to @code{defn},
6848 without any clarification on how @code{defn} behaves when one of the
6849 multiple arguments names a builtin.  System V @code{m4} and some other
6850 implementations allow mixing builtins and text macros into a single
6851 macro.  @acronym{GNU} @code{m4} only supports joining multiple text
6852 arguments, although a future implementation may lift this restriction to
6853 behave more like System V@.  The only portable way to join text macros
6854 with builtins is via helper macros and implicit concatenation of macro
6855 results.
6856
6857 @item
6858 @acronym{POSIX} requires an application to exit with non-zero status if
6859 it wrote an error message to stderr.  This has not yet been consistently
6860 implemented for the various builtins that are required to issue an error
6861 (such as @code{eval} (@pxref{Eval}) when an argument cannot be parsed).
6862
6863 @item
6864 Some traditional implementations only allow reading standard input
6865 once, but @acronym{GNU} @code{m4} correctly handles multiple instances
6866 of @samp{-} on the command line.
6867
6868 @item
6869 @acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
6870 (first-in, first-out) order, but @acronym{GNU} @code{m4} currently uses
6871 LIFO order.  Furthermore, @acronym{POSIX} states that only the first
6872 argument to @code{m4wrap} is saved for later evaluation, but
6873 @acronym{GNU} @code{m4} saves and processes all arguments, with output
6874 separated by spaces.
6875
6876 However, it is possible to emulate @acronym{POSIX} behavior by
6877 including the file @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4}
6878 from the distribution:
6879
6880 @example
6881 undivert(`wrapfifo.m4')dnl
6882 @result{}dnl Redefine m4wrap to have FIFO semantics.
6883 @result{}define(`_m4wrap_level', `0')dnl
6884 @result{}define(`m4wrap',
6885 @result{}`ifdef(`m4wrap'_m4wrap_level,
6886 @result{}       `define(`m4wrap'_m4wrap_level,
6887 @result{}               defn(`m4wrap'_m4wrap_level)`$1')',
6888 @result{}       `builtin(`m4wrap', `define(`_m4wrap_level',
6889 @result{}                                  incr(_m4wrap_level))dnl
6890 @result{}m4wrap'_m4wrap_level)dnl
6891 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
6892 include(`wrapfifo.m4')
6893 @result{}
6894 m4wrap(`a`'m4wrap(`c
6895 ', `d')')m4wrap(`b')
6896 @result{}
6897 ^D
6898 @result{}abc
6899 @end example
6900
6901 @item
6902 @acronym{POSIX} states that builtins that require arguments, but are
6903 called without arguments, have undefined behavior.  Traditional
6904 implementations simply behave as though empty strings had been passed.
6905 For example, @code{a`'define`'b} would expand to @code{ab}.  But
6906 @acronym{GNU} @code{m4} ignores certain builtins if they have missing
6907 arguments, giving @code{adefineb} for the above example.
6908
6909 @item
6910 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
6911 by undefining the entire stack of previous definitions, as if doing
6912 @code{undefine(`f')} first.  @acronym{GNU} @code{m4} replaces just the top
6913 definition on the stack, as if doing @code{popdef(`f')} followed by
6914 @code{pushdef(`f',`1')}.  @acronym{POSIX} allows either behavior.
6915
6916 @item
6917 @acronym{POSIX} 2001 requires @code{syscmd} (@pxref{Syscmd}) to evaluate
6918 command output for macro expansion, but this was a mistake that is
6919 anticipated to be corrected in the next version of @acronym{POSIX}.
6920 @acronym{GNU} @code{m4} follows traditional behavior in @code{syscmd}
6921 where output is not rescanned, and provides the extension @code{esyscmd}
6922 that does scan the output.
6923
6924 @item
6925 At one point, @acronym{POSIX} required @code{changequote(@var{arg})}
6926 (@pxref{Changequote}) to use newline as the close quote, but this was a
6927 bug, and the next version of @acronym{POSIX} is anticipated to state
6928 that using empty strings or just one argument is unspecified.
6929 Meanwhile, the @acronym{GNU} @code{m4} behavior of treating an empty
6930 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
6931 repeating the start-quote delimiter, and @acronym{BSD} treats it as leaving the
6932 previous end-quote delimiter unchanged.  For predictable results, never
6933 call @code{changequote} with just one argument, or with empty strings for
6934 arguments.
6935
6936 @item
6937 At one point, @acronym{POSIX} required @code{changecom(@var{arg},)}
6938 (@pxref{Changecom}) to make it impossible to end a comment, but this was
6939 a bug, and the next version of @acronym{POSIX} is anticipated to state
6940 that using empty strings is unspecified.  Meanwhile, the @acronym{GNU}
6941 @code{m4} behavior of treating an empty end-comment delimiter as newline
6942 is not portable, as @acronym{BSD} treats it as leaving the previous end-comment
6943 delimiter unchanged.  It is also impossible in @acronym{BSD} implementations to
6944 disable comments, even though that is required by @acronym{POSIX}.  For
6945 predictable results, never call @code{changecom} with empty strings for
6946 arguments.
6947
6948 @item
6949 Most implementations of @code{m4} give macros a higher precedence than
6950 comments when parsing, meaning that if the start delimiter given to
6951 @code{changecom} (@pxref{Changecom}) starts with a macro name, comments
6952 are effectively disabled.  @acronym{POSIX} does not specify what the
6953 precedence is, so the parser in this version of @acronym{GNU} @code{m4}
6954 recognizes comments, then macros, then quoted strings.
6955
6956 @item
6957 Traditional implementations allow argument collection, but not string
6958 and comment processing, to span file boundaries.  Thus, if @file{a.m4}
6959 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
6960 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
6961 gives an error message that the end of file was encountered inside a
6962 macro with @acronym{GNU} @code{m4}.  On the other hand, traditional
6963 implementations do end of file processing for files included with
6964 @code{include} or @code{sinclude} (@pxref{Include}), while @acronym{GNU}
6965 @code{m4} seamlessly integrates the content of those files.  Thus
6966 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
6967 giving an error.
6968
6969 @item
6970 Traditional @code{m4} treats @code{traceon} (@pxref{Trace}) without
6971 arguments as a global variable, independent of named macro tracing.
6972 Also, once a macro is undefined, named tracing of that macro is lost.
6973 On the other hand, when @acronym{GNU} @code{m4} encounters
6974 @code{traceon} without
6975 arguments, it turns tracing on for all existing definitions at the time,
6976 but does not trace future definitions; @code{traceoff} without arguments
6977 turns tracing off for all definitions regardless of whether they were
6978 also traced by name; and tracing by name, such as with @option{-tfoo} at
6979 the command line or @code{traceon(`foo')} in the input, is an attribute
6980 that is preserved even if the macro is currently undefined.
6981
6982 @item
6983 @acronym{POSIX} requires @code{eval} (@pxref{Eval}) to treat all
6984 operators with the same precedence as C@.  However, earlier versions of
6985 @acronym{GNU} @code{m4} followed the traditional behavior of other
6986 @code{m4} implementations, where bitwise and logical negation (@samp{~}
6987 and @samp{!}) had lower precedence than equality operators; and where
6988 equality operators (@samp{==} and @samp{!=}) had the same precedence as
6989 relational operators (such as @samp{<}).  Use explicit parentheses to
6990 ensure proper precedence.  As extensions to @acronym{POSIX},
6991 @acronym{GNU} @code{m4} gives well-defined semantics to operations that
6992 C leaves undefined, such as when overflow occurs, when shifting negative
6993 numbers, or when performing division by zero.  @acronym{POSIX} also
6994 requires @samp{=} to cause an error, but many traditional
6995 implementations allowed it as an alias for @samp{==}.
6996
6997 @item
6998 @acronym{POSIX} 2001 requires @code{translit} (@pxref{Translit}) to
6999 treat each character of the second and third arguments literally.
7000 However, it is anticipated that the next version of @acronym{POSIX} will
7001 allow the @acronym{GNU} @code{m4} behavior of treating @samp{-} as a
7002 range operator.
7003
7004 @item
7005 @acronym{POSIX} requires @code{m4} to honor the locale environment
7006 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
7007 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
7008 implemented in @acronym{GNU} @code{m4}.
7009
7010 @item
7011 @acronym{POSIX} states that only unquoted leading newlines and blanks
7012 (that is, space and tab) are ignored when collecting macro arguments.
7013 However, this appears to be a bug in @acronym{POSIX}, since most
7014 traditional implementations also ignore all whitespace (formfeed,
7015 carriage return, and vertical tab).  @acronym{GNU} @code{m4} follows
7016 tradition and ignores all leading unquoted whitespace.
7017 @end itemize
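
The following sketch illustrates two of the points above as they apply
to @acronym{GNU} @code{m4}; the macro name @code{f} is arbitrary and
chosen only for this example.  First, @code{define} replaces only the
top of a @code{pushdef} stack, so the earlier definition resurfaces
after @code{popdef}; a traditional implementation that discards the
whole stack would instead leave @code{f} undefined at that point.
Second, explicit parentheses in @code{eval} make the grouping of
negation and equality independent of which precedence rules are in
effect.

@comment ignore
@example
define(`f', `base')pushdef(`f', `pushed')
@result{}
define(`f', `replaced')f
@result{}replaced
popdef(`f')f
@result{}base
eval(`(!2) == 1')
@result{}0
eval(`!(2 == 1)')
@result{}1
@end example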
7018
7019 @node Other Incompatibilities
7020 @section Other incompatibilities
7021
7022 There are a few other incompatibilities between this implementation of
7023 @code{m4}, and the System V version.
7024
7025 @itemize @bullet
7026 @item
7027 @acronym{GNU} @code{m4} implements sync lines differently from System V
7028 @code{m4}, when text is being diverted.  @acronym{GNU} @code{m4} outputs
7029 the sync lines when the text is being diverted, whereas System V @code{m4}
7030 outputs them when the diverted text is being brought back.
7031
7032 The problem is which lines and file names should be attached to text
7033 that is being, or has been, diverted.  System V @code{m4} regards all
7034 the diverted text as being generated by the source line containing the
7035 @code{undivert} call, whereas @acronym{GNU} @code{m4} regards the
7036 diverted text as being generated at the time it is diverted.
7037
7038 The sync line option is used mostly when using @code{m4} as
7039 a front end to a compiler.  If a diverted line causes a compiler error,
7040 the error messages should most probably refer to the place where the
7041 diversion was made, and not where it was inserted again.
7042
7043 @comment options: -s
7044 @example
7045 divert(2)2
7046 divert(1)1
7047 divert`'0
7048 @result{}#line 3 "stdin"
7049 @result{}0
7050 ^D
7051 @result{}#line 2 "stdin"
7052 @result{}1
7053 @result{}#line 1 "stdin"
7054 @result{}2
7055 @end example
7056
7057 The current @code{m4} implementation has a limitation that the syncline
7058 output at the start of each diversion occurs no matter what, even if the
7059 previous diversion did not end with a newline.  This goes contrary to
7060 the claim that synclines appear on a line by themselves, so this
7061 limitation may be corrected in a future version of @code{m4}.  In the
7062 meantime, when using @option{-s}, it is wisest to make sure all
7063 diversions end with newline.
7064
7065 @item
7066 @acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential
7067 definitions like:
7068
7069 @example
7070 define(`x', `x')
7071 @result{}
7072 define(`x', `x ')
7073 @result{}
7074 @end example
7075
7076 @cindex rescanning
7077 There is nothing inherently wrong with defining @samp{x} to
7078 return @samp{x}.  The wrong thing is to expand @samp{x} unquoted,
7079 because that would cause an infinite rescan loop.
7080 In @code{m4}, one might use macros to hold strings, as we do for
7081 variables in other programming languages, further checking them with:
7082
7083 @comment ignore
7084 @example
7085 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
7086 @end example
7087
7088 @noindent
7089 In cases like this one, an interdiction for a macro to hold its own name
7090 would be a useless limitation.  Of course, this leaves more rope for the
7091 @acronym{GNU} @code{m4} user to hang himself!  Rescanning hangs may be
7092 avoided through careful programming, a little like for endless loops in
7093 traditional programming languages.
7094 @end itemize
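
As a small sketch of the last point (the message strings are
illustrative only), a macro that holds its own name can still be
inspected safely, because @code{defn} returns the definition quoted and
it is therefore not expanded again:

@comment ignore
@example
define(`x', `x')
@result{}
defn(`x')
@result{}x
ifelse(defn(`x'), `x', `holds its own name', `holds something else')
@result{}holds its own name
@end example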
7095
7096 @node Answers
7097 @chapter Correct version of some examples
7098
7099 Some of the examples in this manual are buggy or not very robust, for
7100 demonstration purposes.  Improved versions of these composite macros are
7101 presented here.
7102
7103 @menu
7104 * Improved exch::               Solution for @code{exch}
7105 * Improved forloop::            Solution for @code{forloop}
7106 * Improved foreach::            Solution for @code{foreach}
7107 * Improved cleardivert::        Solution for @code{cleardivert}
7108 * Improved capitalize::         Solution for @code{capitalize}
7109 * Improved fatal_error::        Solution for @code{fatal_error}
7110 @end menu
7111
7112 @node Improved exch
7113 @section Solution for @code{exch}
7114
7115 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
7116 to double quote their arguments.  A nicer definition, which lets
7117 clients follow the rule of thumb of one level of quoting per level of
7118 parentheses, involves adding quotes in the definition of @code{exch}, as
7119 follows:
7120
7121 @example
7122 define(`exch', ``$2', `$1'')
7123 @result{}
7124 define(exch(`expansion text', `macro'))
7125 @result{}
7126 macro
7127 @result{}expansion text
7128 @end example
7129
7130 @node Improved forloop
7131 @section Solution for @code{forloop}
7132
7133 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
7134 into an infinite loop if given an iterator that is not parsed as a macro
7135 name.  It does not do any sanity checking on its numeric bounds, and
7136 only permits decimal numbers for bounds.  Here is an improved version,
7137 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
7138 version also optimizes based on the fact that the starting bound does
7139 not need to be passed to the helper @code{@w{_forloop}}.
7140
7141 @comment examples
7142 @example
7143 $ @kbd{m4 -I examples}
7144 undivert(`forloop2.m4')dnl
7145 @result{}divert(`-1')
7146 @result{}# forloop(var, from, to, stmt) - improved version:
7147 @result{}#   works even if VAR is not a strict macro name
7148 @result{}#   performs sanity check that FROM is larger than TO
7149 @result{}#   allows complex numerical expressions in TO and FROM
7150 @result{}define(`forloop', `ifelse(eval(`($3) >= ($2)'), `1',
7151 @result{}  `pushdef(`$1', eval(`$2'))_$0(`$1',
7152 @result{}    eval(`$3'), `$4')popdef(`$1')')')
7153 @result{}define(`_forloop',
7154 @result{}  `$3`'ifelse(indir(`$1'), `$2', `',
7155 @result{}    `define(`$1', incr(indir(`$1')))$0($@@)')')
7156 @result{}divert`'dnl
7157 include(`forloop2.m4')
7158 @result{}
7159 forloop(`i', `2', `1', `no iteration occurs')
7160 @result{}
7161 forloop(`', `1', `2', ` odd iterator name')
7162 @result{} odd iterator name odd iterator name
7163 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
7164 @result{} 0xa 0xb 0xc
7165 forloop(`i', `a', `b', `non-numeric bounds')
7166 @error{}m4:stdin:6: Warning: eval: bad expression (bad input): (b) >= (a)
7167 @result{}
7168 @end example
7169
7170 One other change to notice is that the improved version uses @samp{_$0}
7171 rather than @samp{_forloop} to invoke the helper routine.  In general,
7172 this is a good practice to follow, because then the set of macros can be
7173 uniformly transformed.  The following example shows a transformation
7174 that doubles the current quoting and appends a suffix @samp{2} to each
7175 transformed macro.  If @code{forloop} refers to the literal
7176 @samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
7177 the intended @code{_forloop2}, and the mixing of quoting paradigms leads
7178 to an infinite recursion loop in this example.
7179
7180 @comment options: -L9
7181 @comment status: 1
7182 @comment examples
7183 @example
7184 $ @kbd{m4 -d -L 9 -I examples}
7185 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
7186 @result{}
7187 define(`double', `define(`$1'`2',
7188   arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
7189 @result{}
7190 double(`forloop')double(`_forloop')defn(`forloop2')
7191 @result{}ifelse(eval(``($3) >= ($2)''), ``1'',
7192 @result{}  ``pushdef(``$1'', eval(``$2''))_$0(``$1'',
7193 @result{}    eval(``$3''), ``$4'')popdef(``$1'')'')
7194 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7195 @result{}
7196 changequote(`[', `]')changequote([``], [''])
7197 @result{}
7198 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7199 @result{}
7200 changequote`'include(`forloop.m4')
7201 @result{}
7202 double(`forloop')double(`_forloop')defn(`forloop2')
7203 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
7204 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
7205 @result{}
7206 changequote(`[', `]')changequote([``], [''])
7207 @result{}
7208 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
7209 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
7210 @end example
7211
7212 Of course, it is possible to make even more improvements, such as
7213 adding an optional step argument, or allowing iteration through
7214 descending sequences.  @acronym{GNU} Autoconf provides some of these
7215 additional bells and whistles in its @code{m4_for} macro.
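
As a rough sketch of what such an extension might look like, the
following hypothetical macro @code{forstep} (not shipped in the
@file{examples} directory, and not the Autoconf implementation) takes a
step as its fourth argument and the loop body as its fifth.  It iterates
upward for a positive step and downward for a negative one, and performs
no iteration when the step is zero.  Like @code{forloop2.m4}, it passes
its bounds through @code{eval} and recurses via @samp{$0}.

@comment ignore
@example
define(`forstep', `ifelse(eval(
  `(($4) > 0 && ($2) <= ($3)) || (($4) < 0 && ($2) >= ($3))'), `1',
  `pushdef(`$1', eval(`$2'))$5`'popdef(`$1')$0(`$1',
    eval(`($2) + ($4)'), `$3', `$4', `$5')')')
@result{}
forstep(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forstep(`i', `10', `1', `-4', ` i')
@result{} 10 6 2
forstep(`i', `1', `10', `0', ` i')
@result{}
@end example

@noindent
This sketch redefines the iterator on every pass rather than
incrementing it in place, which keeps the definition short at the cost
of one extra @code{pushdef} and @code{popdef} per iteration.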
7216
7217 @node Improved foreach
7218 @section Solution for @code{foreach}
7219
7220 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
7221 presented earlier each have flaws.  First, we will examine and fix the
7222 quadratic behavior of @code{foreachq}:
7223
7224 @comment examples
7225 @example
7226 $ @kbd{m4 -I examples}
7227 include(`foreachq.m4')
7228 @result{}
7229 traceon(`shift')debugmode(`aq')
7230 @result{}
7231 foreachq(`x', ``1', `2', `3', `4'', `x
7232 ')dnl
7233 @result{}1
7234 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7235 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7236 @result{}2
7237 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7238 @error{}m4trace: -3- shift(`2', `3', `4')
7239 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7240 @error{}m4trace: -2- shift(`2', `3', `4')
7241 @result{}3
7242 @error{}m4trace: -5- shift(`1', `2', `3', `4')
7243 @error{}m4trace: -4- shift(`2', `3', `4')
7244 @error{}m4trace: -3- shift(`3', `4')
7245 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7246 @error{}m4trace: -3- shift(`2', `3', `4')
7247 @error{}m4trace: -2- shift(`3', `4')
7248 @result{}4
7249 @error{}m4trace: -6- shift(`1', `2', `3', `4')
7250 @error{}m4trace: -5- shift(`2', `3', `4')
7251 @error{}m4trace: -4- shift(`3', `4')
7252 @error{}m4trace: -3- shift(`4')
7253 @end example
7254
7255 @cindex quadratic behavior, avoiding
7256 @cindex avoiding quadratic behavior
7257 Each successive iteration was adding more quoted @code{shift}
7258 invocations, and the entire list contents were passing through every
7259 iteration.  In general, when recursing, it is a good idea to make the
7260 recursion use fewer arguments, rather than adding additional quoted
7261 uses of @code{shift}.  By doing so, @code{m4} uses less memory, invokes
7262 fewer macros, is less likely to run into machine limits, and most
7263 importantly, performs faster.  The fixed version of @code{foreachq} can
7264 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
7265
7266 @comment examples
7267 @example
7268 $ @kbd{m4 -I examples}
7269 include(`foreachq2.m4')
7270 @result{}
7271 undivert(`foreachq2.m4')dnl
7272 @result{}include(`quote.m4')dnl
7273 @result{}divert(`-1')
7274 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7275 @result{}#   quoted list, improved version
7276 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
7277 @result{}define(`_arg1q', ``$1'')
7278 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
7279 @result{}define(`_foreachq', `ifelse(`$2', `', `',
7280 @result{}  `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
7281 @result{}divert`'dnl
7282 traceon(`shift')debugmode(`aq')
7283 @result{}
7284 foreachq(`x', ``1', `2', `3', `4'', `x
7285 ')dnl
7286 @result{}1
7287 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7288 @result{}2
7289 @error{}m4trace: -3- shift(`2', `3', `4')
7290 @result{}3
7291 @error{}m4trace: -3- shift(`3', `4')
7292 @result{}4
7293 @end example
7294
7295 Note that the fixed version calls unquoted helper macros in
7296 @code{@w{_foreachq}} to trim elements immediately; those helper macros
7297 in turn must re-supply the layer of quotes lost in the macro invocation.
7298 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
7299 element, with @code{@w{_arg1}} of the earlier implementation that
7300 returned the first list element directly.  Additionally, by calling the
7301 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
7302 contains unexpanded macros.
7303
7304 The astute m4 programmer might notice that the solution above still uses
7305 more macro invocations than strictly necessary.  Note that @samp{$2},
7306 which contains an arbitrarily long quoted list, is expanded and
7307 rescanned three times per iteration of @code{_foreachq}.  Furthermore,
7308 every iteration of the algorithm effectively unboxes then reboxes the
7309 list, which costs a couple of macro invocations.  It is possible to
7310 rewrite the algorithm by swapping the order of the arguments to
7311 @code{_foreachq} in order to operate on an unboxed list in the first
7312 place, and by using the fixed-length @samp{$#} instead of an arbitrary
7313 length list as the key to end recursion.  The result is eight macro
7314 invocations per loop, instead of nine.  This alternative approach is
7315 available as @file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
7316
7317 @comment examples
7318 @example
7319 $ @kbd{m4 -I examples}
7320 include(`foreachq3.m4')
7321 @result{}
7322 undivert(`foreachq3.m4')dnl
7323 @result{}divert(`-1')
7324 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7325 @result{}#   quoted list, alternate improved version
7326 @result{}define(`foreachq',
7327 @result{}`pushdef(`$1')_$0(`$1', `$3'ifelse(`$2', `', `',
7328 @result{}  `, $2'))popdef(`$1')')
7329 @result{}define(`_foreachq', `ifelse(`$#', `2', `',
7330 @result{}  `define(`$1', `$3')$2`'$0(`$1', `$2'ifelse(`$#', `3', `',
7331 @result{}    `, shift(shift(shift($@@)))'))')')
7332 @result{}divert`'dnl
7333 traceon(`shift')debugmode(`aq')
7334 @result{}
7335 foreachq(`x', ``1', `2', `3', `4'', `x
7336 ')dnl
7337 @result{}1
7338 @error{}m4trace: -4- shift(`x', `x
7339 @error{}', `1', `2', `3', `4')
7340 @error{}m4trace: -3- shift(`x
7341 @error{}', `1', `2', `3', `4')
7342 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7343 @result{}2
7344 @error{}m4trace: -4- shift(`x', `x
7345 @error{}', `2', `3', `4')
7346 @error{}m4trace: -3- shift(`x
7347 @error{}', `2', `3', `4')
7348 @error{}m4trace: -2- shift(`2', `3', `4')
7349 @result{}3
7350 @error{}m4trace: -4- shift(`x', `x
7351 @error{}', `3', `4')
7352 @error{}m4trace: -3- shift(`x
7353 @error{}', `3', `4')
7354 @error{}m4trace: -2- shift(`3', `4')
7355 @result{}4
7356 @end example
7357
7358 Prior to M4 1.4.11, every instance of @samp{$@@} was rescanned as it was
7359 encountered.  Thus, the @file{foreachq3.m4} alternative used much less
7360 memory than @file{foreachq2.m4}, and executed as much as 10% faster,
7361 since each iteration encountered fewer @samp{$@@}.  However, the
7362 implementation of rescanning every byte in @samp{$@@} was quadratic in
7363 the number of bytes scanned (for example, making the broken version in
7364 @file{foreachq.m4} cubic, rather than quadratic, in behavior).  Once the
7365 underlying M4 implementation was improved in 1.4.11 to reuse results of
7366 previous scans, both styles of @code{foreachq} became linear in the
7367 number of bytes scanned, and the difference in timing is no longer
7368 noticeable; in fact, after the change, the @file{foreachq2.m4} version
7369 uses slightly less memory since it tracks fewer arguments per macro
7370 invocation.
7371
7372 For yet another approach, the improved version of @code{foreach},
7373 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
7374 overquotes the arguments to @code{@w{_foreach}} to begin with, using
7375 @code{dquote_elt}.  Then @code{@w{_foreach}} can just use
7376 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
7377 front:
7378
7379 @comment examples
7380 @example
7381 $ @kbd{m4 -I examples}
7382 include(`foreach2.m4')
7383 @result{}
7384 undivert(`foreach2.m4')dnl
7385 @result{}include(`quote.m4')dnl
7386 @result{}divert(`-1')
7387 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
7388 @result{}#   parenthesized list, improved version
7389 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
7390 @result{}  (dquote(dquote_elt$2)), `$3')popdef(`$1')')
7391 @result{}define(`_arg1', `$1')
7392 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
7393 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
7394 @result{}divert`'dnl
7395 traceon(`shift')debugmode(`aq')
7396 @result{}
7397 foreach(`x', `(`1', `2', `3', `4')', `x
7398 ')dnl
7399 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7400 @error{}m4trace: -4- shift(`2', `3', `4')
7401 @error{}m4trace: -4- shift(`3', `4')
7402 @result{}1
7403 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
7404 @result{}2
7405 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
7406 @result{}3
7407 @error{}m4trace: -3- shift(``3'', ``4'')
7408 @result{}4
7409 @error{}m4trace: -3- shift(``4'')
7410 @end example
7411
7412 In summary, recursion over list elements is trickier than it appeared at
7413 first glance, but provides a powerful idiom within @code{m4} processing.
7414 As a final demonstration, both list styles are now able to handle
7415 several scenarios that would wreak havoc on one or both of the original
7416 implementations.  This points out one other difference between the
7417 list styles.  @code{foreach} evaluates unquoted list elements only once,
7418 in preparation for calling @code{@w{_foreach}}, and similarly for
7419 @code{foreachq} as provided by @file{foreachq3.m4}.  But
7420 @code{foreachq}, as provided by @file{foreachq2.m4},
7421 evaluates unquoted list elements twice while visiting the first list
7422 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}.  When
7423 deciding which list style to use, one must take into account whether
7424 repeating the side effects of unquoted list elements will have any
7425 detrimental effects.
7426
7427 @comment examples
7428 @example
7429 $ @kbd{m4 -d -I examples}
7430 include(`foreach2.m4')
7431 @result{}
7432 include(`foreachq2.m4')
7433 @result{}
7434 dnl 0-element list:
7435 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
7436 @result{} /@w{ }
7437 dnl 1-element list of empty element
7438 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
7439 @result{}<> / <>
7440 dnl 2-element list of empty elements
7441 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
7442 @result{}<><> / <><>
7443 dnl 1-element list of a comma
7444 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
7445 @result{}<,> / <,>
7446 dnl 2-element list of unbalanced parentheses
7447 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
7448 @result{}<(><)> / <(><)>
7449 define(`ab', `oops')dnl using defn(`iterator')
7450 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
7451  foreachq(`x', ``a', `b'', `defn(`x')')
7452 @result{}ab / ab
7453 define(`active', `ACT, IVE')
7454 @result{}
7455 traceon(`active')
7456 @result{}
7457 dnl list of unquoted macros; expansion occurs before recursion
7458 foreach(`x', `(active, active)', `<x>
7459 ')dnl
7460 @error{}m4trace: -4- active -> `ACT, IVE'
7461 @error{}m4trace: -4- active -> `ACT, IVE'
7462 @result{}<ACT>
7463 @result{}<IVE>
7464 @result{}<ACT>
7465 @result{}<IVE>
7466 foreachq(`x', `active, active', `<x>
7467 ')dnl
7468 @error{}m4trace: -3- active -> `ACT, IVE'
7469 @error{}m4trace: -3- active -> `ACT, IVE'
7470 @result{}<ACT>
7471 @error{}m4trace: -3- active -> `ACT, IVE'
7472 @error{}m4trace: -3- active -> `ACT, IVE'
7473 @result{}<IVE>
7474 @result{}<ACT>
7475 @result{}<IVE>
7476 dnl list of quoted macros; expansion occurs during recursion
7477 foreach(`x', `(`active', `active')', `<x>
7478 ')dnl
7479 @error{}m4trace: -1- active -> `ACT, IVE'
7480 @result{}<ACT, IVE>
7481 @error{}m4trace: -1- active -> `ACT, IVE'
7482 @result{}<ACT, IVE>
7483 foreachq(`x', ``active', `active'', `<x>
7484 ')dnl
7485 @error{}m4trace: -1- active -> `ACT, IVE'
7486 @result{}<ACT, IVE>
7487 @error{}m4trace: -1- active -> `ACT, IVE'
7488 @result{}<ACT, IVE>
7489 dnl list of double-quoted macro names; no expansion
7490 foreach(`x', `(``active'', ``active'')', `<x>
7491 ')dnl
7492 @result{}<active>
7493 @result{}<active>
7494 foreachq(`x', ```active'', ``active''', `<x>
7495 ')dnl
7496 @result{}<active>
7497 @result{}<active>
7498 @end example
7499
7500 @ignore
7501 @comment Not worth putting in the manual, but make sure that performance
7502 @comment on recursive algorithms is not quadratic.
7503
7504 @comment boxed recursion
7505
7506 @comment examples
7507 @comment options: -Dlimit=10 -Dverbose
7508 @example
7509 $ @kbd{m4 -I examples -Dlimit=10 -Dverbose}
7510 include(`loop.m4')dnl
7511 @result{} 1 2 3 4 5 6 7 8 9 10
7512 @end example
7513
7514 @comment examples
7515 @comment options: -Dlimit=2500
7516 @example
7517 $ @kbd{m4 -I examples -Dlimit=2500}
7518 include(`loop.m4')dnl
7519 @end example
7520
7521 @comment examples
7522 @comment options: -Dlimit=10000
7523 @example
7524 $ @kbd{m4 -I examples -Dlimit=10000}
7525 define(`debug', `define(`popdef',`divert`'i')')
7526 @result{}
7527 include(`loop.m4')dnl
7528 @result{}10000
7529 @end example
7530
7531 @comment unboxed recursion
7532
7533 @comment examples
7534 @comment options: -Dlimit=10 -Dverbose -Dalt
7535 @example
7536 $ @kbd{m4 -I examples -Dlimit=10 -Dverbose -Dalt}
7537 include(`loop.m4')dnl
7538 @result{} 1 2 3 4 5 6 7 8 9 10
7539 @end example
7540
7541 @comment examples
7542 @comment options: -Dlimit=2500 -Dalt
7543 @example
7544 $ @kbd{m4 -I examples -Dlimit=2500 -Dalt}
7545 include(`loop.m4')dnl
7546 @end example
7547
7548 @comment examples
7549 @comment options: -Dlimit=10000 -Dalt
7550 @example
7551 $ @kbd{m4 -I examples -Dlimit=10000 -Dalt}
7552 define(`debug', `define(`popdef',`divert`'i')')
7553 @result{}
7554 include(`loop.m4')dnl
7555 @result{}10000
7556 @end example
7557
7558 @end ignore
7559
7560 @node Improved cleardivert
7561 @section Solution for @code{cleardivert}
7562
7563 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
7564 called without arguments to clear all pending diversions.  That is
7565 because calling @code{undivert} with an empty string as its argument is
7566 different from calling it with no arguments at all.  Compare the earlier
7567 definition with one that takes the number of arguments into account:
7568
7569 @example
7570 define(`cleardivert',
7571   `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
7572 @result{}
7573 divert(`1')one
7574 divert
7575 @result{}
7576 cleardivert
7577 @result{}
7578 undivert
7579 @result{}one
7580 @result{}
7581 define(`cleardivert',
7582   `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
7583     `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
7584 @result{}
7585 divert(`2')two
7586 divert
7587 @result{}
7588 cleardivert
7589 @result{}
7590 undivert
7591 @result{}
7592 @end example
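
The underlying difference can also be observed by calling
@code{undivert} directly.  In this short sketch, the call with an empty
string leaves diversion 1 pending, while the call with no arguments
flushes it:

@example
divert(`1')one
divert
@result{}
undivert(`')
@result{}
undivert
@result{}one
@result{}
@end example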
7593
7594 @node Improved capitalize
7595 @section Solution for @code{capitalize}
7596
7597 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
7598 not allow clients to follow the quoting rule of thumb.  Consider the
7599 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
7600 difference between calling @code{capitalize} with the expansion of a
7601 macro, expanding the result of a case change, and changing the case of a
7602 double-quoted string:
7603
7604 @comment examples
7605 @example
7606 $ @kbd{m4 -I examples}
7607 include(`capitalize.m4')dnl
7608 define(`active', `act1, ive')dnl
7609 define(`Active', `Act2, Ive')dnl
7610 define(`ACTIVE', `ACT3, IVE')dnl
7611 upcase(active)
7612 @result{}ACT1,IVE
7613 upcase(`active')
7614 @result{}ACT3, IVE
7615 upcase(``active'')
7616 @result{}ACTIVE
7617 downcase(ACTIVE)
7618 @result{}act3,ive
7619 downcase(`ACTIVE')
7620 @result{}act1, ive
7621 downcase(``ACTIVE'')
7622 @result{}active
7623 capitalize(active)
7624 @result{}Act1
7625 capitalize(`active')
7626 @result{}Active
7627 capitalize(``active'')
7628 @result{}_capitalize(`active')
7629 define(`A', `OOPS')
7630 @result{}
7631 capitalize(active)
7632 @result{}OOPSct1
7633 capitalize(`active')
7634 @result{}OOPSctive
7635 @end example
7636
7637 First, when @code{capitalize} is called with more than one argument, it
7638 throws away the later arguments, whereas @code{upcase} and
7639 @code{downcase} use @samp{$*} to collect them all.  The fix is simple:
7640 use @samp{$*} consistently.
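
The difference between @samp{$1} and @samp{$*} is easy to see in
isolation.  In this minimal sketch, @code{first} and @code{all} are
hypothetical names chosen only for illustration; they are not part of
@file{capitalize.m4}:

@example
define(`first', `$1')define(`all', `$*')dnl
first(`a', `b') / all(`a', `b')
@result{}a / a,b
@end example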
7641
7642 Next, with single-quoting, @code{capitalize} outputs a single character,
7643 a set of quotes, then the rest of the characters, making it impossible
7644 to invoke @code{Active} after the fact, and allowing the alternate macro
7645 @code{A} to interfere.  Here, the solution is to use additional quoting
7646 in the helper macros, then pass the final over-quoted output string
7647 through @code{_arg1} to remove the extra quoting and finally invoke the
7648 concatenated portions as a single string.
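
The effect of the extra quoting and of @code{_arg1} can be sketched with
hand-written pieces standing in for the output of the helper macros.
This illustrates the idea only and is not the shipped code:

@example
define(`_arg1', `$1')dnl
define(`Active', `Act2, Ive')dnl
define(`A', `OOPS')dnl
A`ctive' / `A'`ctive' / _arg1(`A'`ctive')
@result{}OOPSctive / Active / Act2, Ive
@end example

Quoting both pieces keeps @code{A} from expanding, and collecting them
as a single argument lets @code{_arg1} rescan the joined text, so that
@code{Active} is finally recognized as a macro name.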
7649
7650 Finally, when passed a double-quoted string, the nested macro
7651 @code{_capitalize} is never invoked because it ends up nested inside
7652 quotes.  This one is the toughest to fix.  In short, we have no idea how
7653 many levels of quotes are in effect on the substring being altered by
7654 @code{patsubst}.  If the replacement string cannot be expressed entirely
7655 in terms of literal text and backslash substitutions, then we need a
7656 mechanism to guarantee that the helper macros are invoked outside of
7657 quotes.  In other words, this sounds like a job for @code{changequote}
7658 (@pxref{Changequote}).  By changing the active quoting characters, we
7659 can guarantee that replacement text injected by @code{patsubst} always
7660 occurs in the middle of a string that has exactly one level of
7661 over-quoting using alternate quotes; so the replacement text closes the
7662 quoted string, invokes the helper macros, then reopens the quoted
7663 string.  In turn, that means the replacement text has unbalanced quotes,
7664 necessitating another round of @code{changequote}.
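
The root of the problem can be reproduced on a small scale.  In the
sketch below, @code{show} is a hypothetical helper invented for this
illustration.  With a single-quoted subject the replacement text is
rescanned outside of quotes and the helper runs; with a double-quoted
subject the same replacement lands inside a quoted string and is left
as literal text:

@example
define(`show', `<$1>')dnl
patsubst(`a b c', `b', `show(`\&')')
@result{}a <b> c
patsubst(``a b c'', `b', `show(`\&')')
@result{}a show(`b') c
@end example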
7665
7666 In the fixed version below (also shipped as
7667 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
7668 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
7669 strings are chosen so as to be less likely to appear in the text being
7670 converted).  The helpers @code{_to_alt} and @code{_from_alt} merely
7671 reduce the number of characters required to perform a
7672 @code{changequote}, since the definition changes twice.  The outermost
7673 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
7674 with alternate quoting; the innermost pair is used so that the third
7675 argument to @code{patsubst} can contain an unbalanced
7676 @samp{]>>}/@samp{<<[} pair.  Note that @code{upcase} and @code{downcase}
7677 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
7678 they contain nested quotes but are invoked with the alternate quoting
7679 scheme in effect.
7680
7681 @comment examples
7682 @example
7683 $ @kbd{m4 -I examples}
7684 include(`capitalize2.m4')dnl
7685 define(`active', `act1, ive')dnl
7686 define(`Active', `Act2, Ive')dnl
7687 define(`ACTIVE', `ACT3, IVE')dnl
7688 define(`A', `OOPS')dnl
7689 capitalize(active; `active'; ``active''; ```actIVE''')
7690 @result{}Act1,Ive; Act2, Ive; Active; `Active'
7691 undivert(`capitalize2.m4')dnl
7692 @result{}divert(`-1')
7693 @result{}# upcase(text)
7694 @result{}# downcase(text)
7695 @result{}# capitalize(text)
7696 @result{}#   change case of text, improved version
7697 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
7698 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
7699 @result{}define(`_arg1', `$1')
7700 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
7701 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
7702 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
7703 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
7704 @result{}define(`_capitalize_alt',
7705 @result{}  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
7706 @result{}    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
7707 @result{}define(`capitalize',
7708 @result{}  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
7709 @result{}    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
7710 @result{}divert`'dnl
7711 @end example
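
To see what @code{_to_alt} and @code{_from_alt} do on their own, they
can be tried outside of @code{capitalize}.  The following sketch copies
the two definitions from the listing above; the sample strings are
arbitrary.  While the alternate quotes are active, the default quote
characters are ordinary text, and vice versa:

@example
define(`_to_alt', `changequote(`<<[', `]>>')')dnl
define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')dnl
_to_alt()<<[`quoted']>> _from_alt()`<<[quoted]>>'
@result{}`quoted' <<[quoted]>>
@end example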
7712
7713 @node Improved fatal_error
7714 @section Solution for @code{fatal_error}
7715
7716 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
7717 of @acronym{GNU} M4 earlier than 1.4.8, where invoking
7718 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
7719 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
7720 though all files start at line 1.  Furthermore, versions earlier than
7721 1.4.6 did not support the @code{@w{__program__}} macro.  If you want
7722 @code{fatal_error} to work across the entire 1.4.x release series, a
7723 better implementation would be:
7724
7725 @comment status: 1
7726 @example
7727 define(`fatal_error',
7728   `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
7729 `:ifelse(__line__, `0', `',
7730     `__file__:__line__:')` fatal error: $*
7731 ')m4exit(`1')')
7732 @result{}
7733 m4wrap(`divnum(`demo of internal message')
7734 fatal_error(`inside wrapped text')')
7735 @result{}
7736 ^D
7737 @error{}m4:stdin:6: Warning: divnum: extra arguments ignored: 1 > 0
7738 @result{}0
7739 @error{}m4:stdin:6: fatal error: inside wrapped text
7740 @end example
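
The @code{ifdef} test used for @code{@w{__program__}} is an instance of
a general feature-test idiom: check whether a name is defined and fall
back to literal text otherwise.  A minimal sketch, where
@code{no_such_macro} is a hypothetical name used only for illustration:

@example
ifdef(`no_such_macro', `defined', `fallback')
@result{}fallback
define(`no_such_macro')
@result{}
ifdef(`no_such_macro', `defined', `fallback')
@result{}defined
@end example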
7741
7742 @c ========================================================== Appendices
7743
7744 @node Copying This Package
7745 @appendix How to make copies of the overall M4 package
7746 @cindex License, code
7747
7748 This appendix covers the license for copying the source code of the
7749 overall M4 package.  This manual is under a different set of
7750 restrictions, covered later (@pxref{Copying This Manual}).
7751
7752 @menu
7753 * GNU General Public License::  License for copying the M4 package
7754 @end menu
7755
7756 @node GNU General Public License
7757 @appendixsec License for copying the M4 package
7758 @cindex GPL, GNU General Public License
7759 @cindex GNU General Public License
7760 @cindex General Public License (GPL), GNU
7761 @include gpl-3.0.texi
7762
7763 @node Copying This Manual
7764 @appendix How to make copies of this manual
7765 @cindex License, manual
7766
7767 This appendix covers the license for copying this manual.  Note that
7768 some of the longer examples in this manual are also distributed in the
7769 directory @file{m4-@value{VERSION}/@/examples/}, where a more
7770 permissive license is in effect when copying just the examples.
7771
7772 @menu
7773 * GNU Free Documentation License::  License for copying this manual
7774 @end menu
7775
7776 @node GNU Free Documentation License
7777 @appendixsec License for copying this manual
7778 @cindex FDL, GNU Free Documentation License
7779 @cindex GNU Free Documentation License
7780 @cindex Free Documentation License (FDL), GNU
7781 @include fdl.texi
7782
7783 @node Indices
7784 @appendix Indices of concepts and macros
7785
7786 @menu
7787 * Macro index::                 Index for all @code{m4} macros
7788 * Concept index::               Index for many concepts
7789 @end menu
7790
7791 @node Macro index
7792 @appendixsec Index for all @code{m4} macros
7793
7794 This index covers all @code{m4} builtins, as well as several useful
7795 composite macros.  References point only to the place where each
7796 macro is first introduced.
7797
7798 @printindex fn
7799
7800 @node Concept index
7801 @appendixsec Index for many concepts
7802
7803 @printindex cp
7804
7805 @bye
7806
7807 @c Local Variables:
7808 @c coding: iso-8859-1
7809 @c fill-column: 72
7810 @c ispell-local-dictionary: "american"
7811 @c indent-tabs-mode: nil
7812 @c whitespace-check-buffer-indent: nil
7813 @c End: