doc/m4.texi

   1 \input texinfo @c -*- texinfo -*-
   2 @comment ========================================================
   3 @comment %**start of header
   4 @setfilename m4.info
   5 @include version.texi
   6 @settitle GNU M4 @value{VERSION} macro processor
   7 @setchapternewpage odd
   8 @ifnothtml
   9 @setcontentsaftertitlepage
  10 @end ifnothtml
  11 @finalout
  12
  13 @set beta
  14
  15 @c @tabchar{}
  16 @c ----------
  17 @c The testsuite expects literal tab output in some examples, but
  18 @c literal tabs in texinfo lead to formatting issues.
  19 @macro tabchar
  20 @       @c
  21 @end macro
  22
  23 @c @ovar{ARG}
  24 @c -------------------
  25 @c The ARG is an optional argument.  To be used for macro arguments in
  26 @c their documentation (@defmac).
  27 @macro ovar{varname}
  28 @r{[}@var{\varname\}@r{]}@c
  29 @end macro
  30
  31 @c @dvar{ARG, DEFAULT}
  32 @c -------------------
  33 @c The ARG is an optional argument, defaulting to DEFAULT.  To be used
  34 @c for macro arguments in their documentation (@defmac).
  35 @macro dvar{varname, default}
  36 @r{[}@var{\varname\} = @samp{\default\}@r{]}@c
  37 @end macro
  38
  39 @comment %**end of header
  40 @comment ========================================================
  41
  42 @copying
  43
  44 This manual (@value{UPDATED}) is for @acronym{GNU} M4 (version
  45 @value{VERSION}), a package containing an implementation of the m4 macro
  46 language.
  47
  48 Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999,
  49 2000, 2001, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software
  50 Foundation, Inc.
  51
  52 @quotation
  53 Permission is granted to copy, distribute and/or modify this document
  54 under the terms of the @acronym{GNU} Free Documentation License,
  55 Version 1.3 or any later version published by the Free Software
  56 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
  57 Back-Cover Texts.  A copy of the license is included in the section
  58 entitled ``@acronym{GNU} Free Documentation License.''
  59 @end quotation
  60 @end copying
  61
  62 @dircategory Text creation and manipulation
  63 @direntry
  64 * M4: (m4).                     A powerful macro processor.
  65 @end direntry
  66
  67 @titlepage
  68 @title GNU M4, version @value{VERSION}
  69 @subtitle A powerful macro processor
  70 @subtitle Edition @value{EDITION}, @value{UPDATED}
  71 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
  72 @author Gary V. Vaughan, and Eric Blake
  73 @author (@email{bug-m4@@gnu.org})
  74
  75 @page
  76 @vskip 0pt plus 1filll
  77 @insertcopying
  78 @end titlepage
  79
  80 @contents
  81
  82 @ifnottex
  83 @node Top
  84 @top GNU M4
  85 @insertcopying
  86 @end ifnottex
  87
  88 @acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
  89 processor.  It is mostly SVR4 compatible, although it has some
  90 extensions (for example, handling more than 9 positional parameters
  91 to macros).  @code{m4} also has builtin functions for including
  92 files, running shell commands, doing arithmetic, etc.  Autoconf needs
  93 @acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
  94 running them.
  95
  96 @acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
  97 subsequent changes by Fran@,{c}ois Pinard and other volunteers
  98 on the Internet.  All names and email addresses can be found in the
  99 files @file{m4-@value{VERSION}/@/AUTHORS} and
 100 @file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
 101 distribution.
 102
 103 @ifclear beta
 104 This is release @value{VERSION}.  It is now considered stable:  future
 105 releases on this branch are only meant to fix bugs, increase speed, or
 106 improve documentation.
 107 @end ifclear
 108
 109 @ifset beta
 110 This is BETA release @value{VERSION}.  This is a development release,
 111 and as such, is prone to bugs, crashes, unforeseen features, incomplete
 112 documentation@dots{}; use it at your own peril.  In case of
 113 problems, please do not hesitate to report them (see the
 114 @file{m4-@value{VERSION}/@/README} file in the distribution).
 115 @xref{Experiments}.
 116 @end ifset
 117
 118 @menu
 119 * Preliminaries::               Introduction and preliminaries
 120 * Invoking m4::                 Invoking @code{m4}
 121 * Syntax::                      Lexical and syntactic conventions
 122
 123 * Macros::                      How to invoke macros
 124 * Definitions::                 How to define new macros
 125 * Conditionals::                Conditionals, loops, and recursion
 126
 127 * Debugging::                   How to debug macros and input
 128
 129 * Input Control::               Input control
 130 * File Inclusion::              File inclusion
 131 * Diversions::                  Diverting and undiverting output
 132
 133 * Modules::                     Extending M4 with dynamic runtime modules
 134
 135 * Text handling::               Macros for text handling
 136 * Arithmetic::                  Macros for doing arithmetic
 137 * Shell commands::              Macros for running shell commands
 138 * Miscellaneous::               Miscellaneous builtin macros
 139 * Frozen files::                Fast loading of frozen state
 140
 141 * Compatibility::               Compatibility with other versions of @code{m4}
 142 * Answers::                     Correct version of some examples
 143
 144 * Copying This Package::        How to make copies of the overall M4 package
 145 * Copying This Manual::         How to make copies of this manual
 146 * Indices::                     Indices of concepts and macros
 147
 148 @detailmenu
 149  --- The Detailed Node Listing ---
 150
 151 Introduction and preliminaries
 152
 153 * Intro::                       Introduction to @code{m4}
 154 * History::                     Historical references
 155 * Bugs::                        Problems and bugs
 156 * Manual::                      Using this manual
 157
 158 Invoking @code{m4}
 159
 160 * Operation modes::             Command line options for operation modes
 161 * Dynamic loading features::    Command line options for dynamic loading
 162 * Preprocessor features::       Command line options for preprocessor features
 163 * Limits control::              Command line options for limits control
 164 * Frozen state::                Command line options for frozen state
 165 * Debugging options::           Command line options for debugging
 166 * Command line files::          Specifying input files on the command line
 167
 168 Lexical and syntactic conventions
 169
 170 * Names::                       Macro names
 171 * Quoted strings::              Quoting input to @code{m4}
 172 * Comments::                    Comments in @code{m4} input
 173 * Other tokens::                Other kinds of input tokens
 174 * Input processing::            How @code{m4} copies input to output
 175 * Regular expression syntax::   How @code{m4} interprets regular expressions
 176
 177 How to invoke macros
 178
 179 * Invocation::                  Macro invocation
 180 * Inhibiting Invocation::       Preventing macro invocation
 181 * Macro Arguments::             Macro arguments
 182 * Quoting Arguments::           On Quoting Arguments to macros
 183 * Macro expansion::             Expanding macros
 184
 185 How to define new macros
 186
 187 * Define::                      Defining a new macro
 188 * Arguments::                   Arguments to macros
 189 * Pseudo Arguments::            Special arguments to macros
 190 * Undefine::                    Deleting a macro
 191 * Defn::                        Renaming macros
 192 * Pushdef::                     Temporarily redefining macros
 193 * Renamesyms::                  Renaming macros with regular expressions
 194
 195 * Indir::                       Indirect call of macros
 196 * Builtin::                     Indirect call of builtins
 197 * M4symbols::                   Getting the defined macro names
 198
 199 Conditionals, loops, and recursion
 200
 201 * Ifdef::                       Testing if a macro is defined
 202 * Ifelse::                      If-else construct, or multibranch
 203 * Shift::                       Recursion in @code{m4}
 204 * Forloop::                     Iteration by counting
 205 * Foreach::                     Iteration by list contents
 206 * Stacks::                      Working with definition stacks
 207 * Composition::                 Building macros with macros
 208
 209 How to debug macros and input
 210
 211 * Dumpdef::                     Displaying macro definitions
 212 * Trace::                       Tracing macro calls
 213 * Debugmode::                   Controlling debugging options
 214 * Debuglen::                    Limiting debug output
 215 * Debugfile::                   Saving debugging output
 216
 217 Input control
 218
 219 * Dnl::                         Deleting whitespace in input
 220 * Changequote::                 Changing the quote characters
 221 * Changecom::                   Changing the comment delimiters
 222 * Changeresyntax::              Changing the regular expression syntax
 223 * Changesyntax::                Changing the lexical structure of the input
 224 * M4wrap::                      Saving text until end of input
 225
 226 File inclusion
 227
 228 * Include::                     Including named files
 229 * Search Path::                 Searching for include files
 230
 231 Diverting and undiverting output
 232
 233 * Divert::                      Diverting output
 234 * Undivert::                    Undiverting output
 235 * Divnum::                      Diversion numbers
 236 * Cleardivert::                 Discarding diverted text
 237
 238 Extending M4 with dynamic runtime modules
 239
 240 * M4modules::                   Listing loaded modules
 241 * Load::                        Loading additional modules
 242 * Unload::                      Removing loaded modules
 243 * Refcount::                    Tracking module references
 244 * Standard Modules::            Standard bundled modules
 245
 246 Macros for text handling
 247
 248 * Len::                         Calculating length of strings
 249 * Index macro::                 Searching for substrings
 250 * Regexp::                      Searching for regular expressions
 251 * Substr::                      Extracting substrings
 252 * Translit::                    Translating characters
 253 * Patsubst::                    Substituting text by regular expression
 254 * Format::                      Formatting strings (printf-like)
 255
 256 Macros for doing arithmetic
 257
 258 * Incr::                        Decrement and increment operators
 259 * Eval::                        Evaluating integer expressions
 260 * Mpeval::                      Multiple precision arithmetic
 261
 262 Macros for running shell commands
 263
 264 * Platform macros::             Determining the platform
 265 * Syscmd::                      Executing simple commands
 266 * Esyscmd::                     Reading the output of commands
 267 * Sysval::                      Exit status
 268 * Mkstemp::                     Making temporary files
 269 * Mkdtemp::                     Making temporary directories
 270
 271 Miscellaneous builtin macros
 272
 273 * Errprint::                    Printing error messages
 274 * Location::                    Printing current location
 275 * M4exit::                      Exiting from @code{m4}
 276 * Syncoutput::                  Turning on and off sync lines
 277
 278 Fast loading of frozen state
 279
 280 * Using frozen files::          Using frozen files
 281 * Frozen file format 1::        Frozen file format 1
 282 * Frozen file format 2::        Frozen file format 2
 283
 284 Compatibility with other versions of @code{m4}
 285
 286 * Extensions::                  Extensions in @acronym{GNU} M4
 287 * Incompatibilities::           Other incompatibilities
 288 * Experiments::                 Experimental features in @acronym{GNU} M4
 289
 290 Correct version of some examples
 291
 292 * Improved exch::               Solution for @code{exch}
 293 * Improved forloop::            Solution for @code{forloop}
 294 * Improved foreach::            Solution for @code{foreach}
 295 * Improved copy::               Solution for @code{copy}
 296 * Improved m4wrap::             Solution for @code{m4wrap}
 297 * Improved cleardivert::        Solution for @code{cleardivert}
 298 * Improved capitalize::         Solution for @code{capitalize}
 299 * Improved fatal_error::        Solution for @code{fatal_error}
 300
 301 How to make copies of the overall M4 package
 302
 303 * GNU General Public License::  License for copying the M4 package
 304
 305 How to make copies of this manual
 306
 307 * GNU Free Documentation License::  License for copying this manual
 308
 309 Indices of concepts and macros
 310
 311 * Macro index::                 Index for all @code{m4} macros
 312 * Concept index::               Index for many concepts
 313
 314 @end detailmenu
 315 @end menu
 316
 317 @node Preliminaries
 318 @chapter Introduction and preliminaries
 319
 320 This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
 321 comes from, how to read and use this documentation, how to call the
 322 @code{m4} program, and how to report bugs about it.  It concludes by
 323 giving tips for reading the remainder of the manual.
 324
 325 The following chapters then detail all the features of the @code{m4}
 326 language, as shipped in the @acronym{GNU} M4 package.
 327
 328 @menu
 329 * Intro::                       Introduction to @code{m4}
 330 * History::                     Historical references
 331 * Bugs::                        Problems and bugs
 332 * Manual::                      Using this manual
 333 @end menu
 334
 335 @node Intro
 336 @section Introduction to @code{m4}
 337
 338 @cindex overview of @code{m4}
 339 @code{m4} is a macro processor, in the sense that it copies its
 340 input to the output, expanding macros as it goes.  Macros are either
 341 builtin or user-defined, and can take any number of arguments.
 342 Besides just doing macro expansion, @code{m4} has builtin functions
 343 for including named files, running shell commands, doing integer
 344 arithmetic, manipulating text in various ways, performing recursion,
 345 etc.@dots{}  @code{m4} can be used either as a front-end to a compiler,
 346 or as a macro processor in its own right.
 347
 348 The @code{m4} macro processor is widely available on all UNIXes, and has
 349 been standardized by @acronym{POSIX}.
 350 Usually, only a small percentage of users are aware of its existence.
 351 However, those who find it often become committed users.  The
 352 popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
 353 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
 354 for many to install it, while these people will not themselves
 355 program in @code{m4}.  @acronym{GNU} @code{m4} is mostly compatible with the
 356 System V, Release 3 version, except for some minor differences.
 357 @xref{Compatibility}, for more details.
 358
 359 Some people find @code{m4} to be fairly addictive.  They first use
 360 @code{m4} for simple problems, then take bigger and bigger challenges,
 361 learning how to write complex sets of @code{m4} macros along the way.
 362 Once really addicted, users take to writing sophisticated @code{m4}
 363 applications even to solve simple problems, devoting more time to
 364 debugging their @code{m4} scripts than doing real work.  Beware that
 365 @code{m4} may be dangerous for the health of compulsive programmers.
 366
 367 @node History
 368 @section Historical references
 369
 370 @cindex history of @code{m4}
 371 @cindex @acronym{GNU} M4, history of
 372 @code{GPM} was an important ancestor of @code{m4}.  See
 373 C. Strachey: ``A General Purpose Macro generator'', Computer Journal
 374 8,3 (1965), pp.@: 225 ff.  @code{GPM} is also succinctly described in
 375 David Gries's classic ``Compiler Construction for Digital Computers''.
 376
 377 The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
 378 Addison-Wesley, Inc.@: (1976) describes and implements a Unix
 379 macro-processor language, which inspired Dennis Ritchie to write
 380 @code{m3}, a macro processor for the AP-3 minicomputer.
 381
 382 Kernighan and Ritchie then joined forces to develop the original
 383 @code{m4}, as described in ``The M4 Macro Processor'', Bell
 384 Laboratories (1977).  It had only 21 builtin macros.
 385
 386 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
 387 the true intricacies of real life: macros can be recognized without
 388 being pre-announced, skipping whitespace or end-of-lines is easier,
 389 more constructs are builtin instead of derived, etc.
 390
 391 Originally, the Kernighan and Plauger macro-processor, and then
 392 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
 393 that is, the @code{Ratfor} equivalent of @code{cpp}.  Later, @code{m4}
 394 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
 395
 396 Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
 397 @code{m4},
 398 in 1990, with the aim of removing the artificial limitations in many
 399 of the traditional @code{m4} implementations, such as maximum line
 400 length, macro size, or number of macros.
 401
 402 The late Professor A. Dain Samples described and implemented a further
 403 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
 404 Language: 2nd edition'', Electronic Announcement on comp.compilers
 405 newsgroup (1992).
 406
 407 Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
 408 1992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
 409 the stable release for 10 years.  It was at this time that @acronym{GNU}
 410 Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
 411 engine, since all other implementations of @code{m4} had too many
 412 limitations.
 413
 414 More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which
 415 addressed some long-standing bugs in the venerable 1.4 release.  Then in
 416 2005, Gary V. Vaughan collected together the many patches to
 417 @acronym{GNU} @code{m4} 1.4 that were floating around the net and
 418 released 1.4.3 and 1.4.4.  And in 2006, Eric Blake joined the team and
 419 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
 420 More bug fixes were incorporated in 2007, with releases 1.4.9 and
 421 1.4.10.  Eric continued with some portability fixes for 1.4.11 and
 422 1.4.12 in 2008, and 1.4.13 in 2009.
 423
 424 Additionally, in 2008, Eric rewrote the scanning engine to reduce
 425 recursive evaluation from quadratic to linear complexity.  This was
 426 released as M4 1.6 in 2009.  The 1.x branch series remains open for bug
 427 fixes.
 428
 429 Meanwhile, development was underway for new features for @code{m4},
 430 such as dynamic module loading and additional builtins, practically
 431 rewriting the entire code base.  This development has spurred
 432 improvements to other @acronym{GNU} software, such as @acronym{GNU}
 433 Libtool.  @acronym{GNU} M4 2.0 is the result of this effort.
 434
 435 @node Bugs
 436 @section Problems and bugs
 437
 438 @cindex reporting bugs
 439 @cindex bug reports
 440 @cindex suggestions, reporting
 441 If you have problems with @acronym{GNU} M4 or think you've found a bug,
 442 please report it.  Before reporting a bug, make sure you've actually
 443 found a real bug.  Carefully reread the documentation and see if it
 444 really says you can do what you're trying to do.  If it's not clear
 445 whether you should be able to do something or not, report that too; it's
 446 a bug in the documentation!
 447
 448 Before reporting a bug or trying to fix it yourself, try to isolate it
 449 to the smallest possible input file that reproduces the problem.  Then
 450 send us the input file and the exact results @code{m4} gave you.  Also
 451 say what you expected to occur; this will help us decide whether the
 452 problem was really in the documentation.
 453
 454 Once you've got a precise problem, send e-mail to
 455 @email{bug-m4@@gnu.org}.  Please include the version number of @code{m4}
 456 you are using.  You can get this information with the command
 457 @kbd{m4 --version}.  You can also run @kbd{make check} to generate the
 458 file @file{tests/@/testsuite.log}, useful for including in your report.
 459
 460 Non-bug suggestions are always welcome as well.  If you have questions
 461 about things that are unclear in the documentation or are just obscure
 462 features, please report them too.
 463
 464 @node Manual
 465 @section Using this manual
 466
 467 @cindex examples, understanding
 468 This manual contains a number of examples of @code{m4} input and output,
 469 and a simple notation is used to distinguish input, output and error
 470 messages from @code{m4}.  Examples are set out from the normal text, and
 471 shown in a fixed-width font, like this:
 472
 473 @comment ignore
 474 @example
 475 This is an example of an example!
 476 @end example
 477
 478 To distinguish input from output, all output from @code{m4} is prefixed
 479 by the string @samp{@result{}}, and all error messages by the string
 480 @samp{@error{}}.  When showing how command line options affect matters,
 481 the command line is shown with a prompt @samp{$ @kbd{like this}};
 482 otherwise, you can assume that a simple @kbd{m4} invocation will work.
 483 Thus:
 484
 485 @comment ignore
 486 @example
 487 $ @kbd{command line to invoke m4}
 488 Example of input line
 489 @result{}Output line from m4
 490 @error{}and an error message
 491 @end example
 492
 493 The sequence @samp{^D} in an example indicates the end of the input
 494 file.  The sequence @samp{@key{NL}} refers to the newline character.
 495 The majority of these examples are self-contained, and you can run them
 496 with similar results.  In fact, the testsuite that is bundled in the
 497 @acronym{GNU} M4 package consists in part of the examples
 498 in this document!  Some of the examples assume that your current
 499 directory is located where you unpacked the installation, so if you plan
 500 on following along, you may find it helpful to do this now:
 501
 502 @comment ignore
 503 @example
 504 $ @kbd{cd m4-@value{VERSION}}
 505 @end example
 506
 507 As each of the predefined macros in @code{m4} is described, a prototype
 508 call of the macro will be shown, giving descriptive names to the
 509 arguments, e.g.,
 510
 511 @deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @
 512   @ovar{argument}@dots{})
 513 This is a sample prototype.  There is not really a macro named
 514 @code{example}, but this documents that if there were, it would be a
 515 Composite macro, rather than a Builtin, and would be provided by the
 516 module @code{none}.
 517
 518 It requires at least one argument, @var{string}.  Remember that in
 519 @code{m4}, there must not be a space between the macro name and the
 520 opening parenthesis, unless it was intended to call the macro without
 521 any arguments.  The brackets around @var{count} and @var{argument} show
 522 that these arguments are optional.  If @var{count} is omitted, the macro
 523 behaves as if @var{count} were @samp{1}, whereas if @var{argument} is omitted,
 524 the macro behaves as if it were the empty string.  A blank argument is
 525 not the same as an omitted argument.  For example, @samp{example(`a')},
 526 @samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave
 527 identically with @var{count} set to @samp{1}; while @samp{example(`a',)}
 528 and @samp{example(`a',`')} would explicitly pass the empty string for
 529 @var{count}.  The ellipses (@samp{@dots{}}) show that the macro
 530 processes additional arguments after @var{argument}, rather than
 531 ignoring them.
 532 @end deffn
 533
 534 Each builtin definition will list, in parentheses, the module that must
 535 be loaded to use that macro.  The standard modules include
 536 @samp{m4} (which is always available), @samp{gnu} (for @acronym{GNU}-specific
 537 m4 extensions), and @samp{traditional} (for compatibility with System V
 538 m4).  @xref{Modules}.
 539
 540 @cindex numbers
 541 All macro arguments in @code{m4} are strings, but some are given
 542 special interpretation, e.g., as numbers, file names, regular
 543 expressions, etc.  The documentation for each macro will state how the
 544 parameters are interpreted, and what happens if the argument cannot be
 545 parsed according to the desired interpretation.  Unless specified
 546 otherwise, a parameter specified to be a number is parsed as a decimal,
 547 even if the argument has leading zeros; and parsing the empty string as
 548 a number results in 0 rather than an error, although a warning will be
 549 issued.
 550
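The default numeric rule can be seen with a builtin that takes a plain
number.  The following is an illustrative sketch rather than a tested
example; it assumes that @code{incr} (@pxref{Incr}) follows the default
decimal parsing described above.  The second call also issues a warning
on standard error about the empty string; the warning is omitted here
because its exact wording is not reproduced in this sketch.

@comment ignore
@example
incr(`010')
@result{}11
incr(`')
@result{}1
@end example
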
 551 This document consistently writes and uses @dfn{builtin}, without a
 552 hyphen, as if it were an English word.  This is how the @code{builtin}
 553 primitive is spelled within @code{m4}.
 554
 555 @node Invoking m4
 556 @chapter Invoking @code{m4}
 557
 558 @cindex command line
 559 @cindex invoking @code{m4}
 560 The format of the @code{m4} command is:
 561
 562 @comment ignore
 563 @example
 564 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
 565 @end example
 566
 567 @cindex command line, options
 568 @cindex options, command line
 569 @cindex @env{POSIXLY_CORRECT}
 570 All options begin with @samp{-}, or if long option names are used, with
 571 @samp{--}.  A long option name need not be written completely; any
 572 unambiguous prefix is sufficient.  @acronym{POSIX} requires @code{m4} to
 573 recognize arguments intermixed with files, even when
 574 @env{POSIXLY_CORRECT} is set in the environment.  Most options take
 575 effect at startup regardless of their position, but some are documented
 576 below as taking effect after any files that occurred earlier in the
 577 command line.  The argument @option{--} is a marker to denote the end of
 578 options.
 579
 580 With short options, options that do not take arguments may be combined
 581 into a single command line argument with subsequent options, options
 582 with mandatory arguments may be provided either as a single command line
 583 argument or as two arguments, and options with optional arguments must
 584 be provided as a single argument.  In other words,
 585 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
 586 @kbd{m4 -Q -P -D foo -d ./a -d+f}, although the latter form is
 587 considered canonical.
 588
 589 With long options, options with mandatory arguments may be provided with
 590 an equal sign (@samp{=}) in a single argument, or as two arguments, and
 591 options with optional arguments must be provided as a single argument.
 592 In other words, @kbd{m4 --def foo --debug a} is equivalent to
 593 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
 594 considered canonical (not to mention more robust, in case a future
 595 version of @code{m4} introduces an option named @option{--default}).
 596
 597 @code{m4} understands the following options, grouped by functionality.
 598
 599 @menu
 600 * Operation modes::             Command line options for operation modes
 601 * Dynamic loading features::    Command line options for dynamic loading
 602 * Preprocessor features::       Command line options for preprocessor features
 603 * Limits control::              Command line options for limits control
 604 * Frozen state::                Command line options for frozen state
 605 * Debugging options::           Command line options for debugging
 606 * Command line files::          Specifying input files on the command line
 607 @end menu
 608
 609 @node Operation modes
 610 @section Command line options for operation modes
 611
 612 Several options control the overall operation of @code{m4}:
 613
 614 @table @code
 615 @item --help
 616 Print a help summary on standard output, then immediately exit
 617 @code{m4} without reading any input files or performing any other
 618 actions.
 619
 620 @item --version
 621 Print the version number of the program on standard output, then
 622 immediately exit @code{m4} without reading any input files or
 623 performing any other actions.
 624
 625 @item -b
 626 @itemx --batch
 627 Makes this invocation of @code{m4} non-interactive.  This means that
 628 output will be buffered, and an interrupt or pipe write error will halt
 629 execution.  If neither
 630 @option{-b} nor @option{-i} is specified, this is activated by default
 631 when any input files are specified, or when either standard input or
 632 standard error is not a terminal.  Note that this means that @kbd{m4}
 633 alone might be interactive, but @kbd{m4 -} is not, even though both
 634 commands process only standard input.  If both @option{-b} and
 635 @option{-i} are specified, only the last one takes effect.
 636
 637 @item -c
 638 @itemx --discard-comments
 639 Discard all comments instead of copying them to the output.
 640
 641 @item -E
 642 @itemx --fatal-warnings
 643 @cindex errors, fatal
 644 @cindex fatal errors
 645 Controls the effect of warnings.  If unspecified, then execution
 646 continues and exit status is unaffected when a warning is printed.  If
 647 specified exactly once, warnings become fatal; when one is issued,
 648 execution continues, but the exit status will be non-zero.  If specified
 649 multiple times, then execution halts with non-zero status the first time
 650 a warning is issued.  The introduction of behavior levels is new to M4
 651 1.4.9; for behavior consistent with earlier versions, you should specify
 652 @option{-E} twice.
 653
 654
 655 For backwards compatibility reasons, using @option{-E} behaves as if an
 656 implicit @option{--debug=-d} option is also present.  This is so that
 657 scripts written for older M4 versions will not fail if they used
 658 constructs that were previously silently allowed, but would now trigger
 659 a warning.
 660
 661 @example
 662 $ @kbd{m4}
 663 defn(`oops')
 664 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
 665 @result{}
 666 ^D
 667 @end example
 668
 669 @comment ignore
 670 @example
 671 $ @kbd{echo $?}
 672 @result{}0
 673 @end example
 674
 675 @comment options: -E
 676 @example
 677 $ @kbd{m4 -E}
 678 defn(`oops')
 679 @result{}
 680 ^D
 681 @end example
 682
 683 @comment ignore
 684 @example
 685 $ @kbd{echo $?}
 686 @result{}0
 687 @end example
 688
 689 @comment options: -E -d
 690 @comment status: 1
 691 @example
 692 $ @kbd{m4 -E -d}
 693 defn(`oops')
 694 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
 695 @result{}
 696 ^D
 697 @end example
 698
 699 @comment ignore
 700 @example
 701 $ @kbd{echo $?}
 702 @result{}1
 703 @end example
 704
 705 @item -i
 706 @itemx --interactive
 707 @itemx -e
 708 Makes this invocation of @code{m4} interactive.  This means that all
 709 output will be unbuffered, interrupts will be ignored, and behavior on
 710 pipe write errors is inherited from the parent process.  If neither
 711 @option{-b} nor @option{-i} is specified, this is activated by default
 712 when no input files are specified, and when both standard input and
 713 standard error are terminals (similar to the way that /bin/sh determines
 714 when to be interactive).  If both @option{-b} and @option{-i} are
 715 specified, only the last one takes effect.  The spelling @option{-e}
 716 exists for compatibility with other @code{m4} implementations, and
 717 issues a warning because it may be withdrawn in a future version of
 718 @acronym{GNU} M4.
 719
 720 @item -P
 721 @itemx --prefix-builtins
 722 Internally modify @emph{all} builtin macro names so they all start with
 723 the prefix @samp{m4_}.  For example, using this option, one should write
 724 @samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}}
 725 instead of @samp{@w{__file__}}.  This option has no effect if @option{-R}
 726 is also specified.
 727
 728 @item -Q
 729 @itemx --quiet
 730 @itemx --silent
 731 Suppress warnings, such as missing or superfluous arguments in macro
 732 calls, or treating the empty string as zero.  Error messages are still
 733 printed.  The distinction between error and warning is fuzzy, and if
 734 you encounter a situation where the message output did not match your
 735 expectations, please report that as a bug.  This option is implied if
 736 @env{POSIXLY_CORRECT} is set in the environment.
 737
 738 @item -r@r{[}@var{resyntax-spec}@r{]}
 739 @itemx --regexp-syntax@r{[}=@var{resyntax-spec}@r{]}
 740 Set the regular expression syntax according to @var{resyntax-spec}.
 741 When this option is not given, or @var{resyntax-spec} is omitted,
 742 @acronym{GNU} M4 uses the flavor @code{GNU_M4}, which provides
 743 emacs-compatible regular expressions.  @xref{Changeresyntax}, for more
 744 details on the format and meaning of @var{resyntax-spec}.  This option
 745 may be given more than once, and order with respect to file names is
 746 significant.
 747
 748 @item --safer
 749 Cripple the following builtins, since each can perform potentially
 750 unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}),
 751 @code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}),
 752 @code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}).
 753 An attempt to use any of these macros will result in an error.  This
 754 option is intended to make it safer to preprocess an input file of
 755 unknown origin.
 756
 757 @item -W
 758 @itemx --warnings
 759 Enable warnings.  Warnings are on by default unless
 760 @env{POSIXLY_CORRECT} was set in the environment; this option exists to
 761 allow overriding @option{--silent}.
 762 @comment FIXME should we accept -Wall, -Wnone, -Wcategory,
 763 @comment -Wno-category...?
 764 @end table
 765
 766 @node Dynamic loading features
 767 @section Command line options for dynamic loading
 768
 769 On platforms that support dynamic libraries, there are some options
 770 that affect dynamic loading.
 771
 772 @table @code
 773 @item -M @var{directory}
 774 @itemx --module-directory=@var{directory}
 775 Specify an alternate @var{directory} to search for modules.  This option
 776 can be used multiple times to add several different directories to the
 777 module search path.  @xref{Modules}, for more details.
 778
 779 @item -m @var{module}
 780 @itemx --load-module=@var{module}
 781 Load @var{module} before parsing more input files.  @var{module} is
 782 searched for in each directory of the module search path, until the
 783 first match is found or the list is exhausted.  @xref{Modules}, for more
 784 details.  By default, the modules @samp{m4}, @samp{traditional}, and
 785 @samp{gnu} are preloaded, although this can be controlled during
 786 configuration with the @option{--with-modules} option to
 787 @file{m4-@value{VERSION}/@/configure}.  This option may be given more
 788 than once, and order with respect to file names is significant.
 789
 790 @item --unload-module=@var{module}
 791 Unload @var{module} before parsing more input files.  @xref{Modules},
 792 for more details.  This option may be given more than once, and order
 793 with respect to file names is significant.
 794 @end table
 795
 796 @node Preprocessor features
 797 @section Command line options for preprocessor features
 798
 799 @cindex macro definitions, on the command line
 800 @cindex command line, macro definitions on the
 801 @cindex preprocessor features
 802 Several options allow @code{m4} to behave more like a preprocessor.
 803 Macro definitions and deletions can be made on the command line, the
 804 search path can be altered, and the output file can track where the
 805 input came from.  These features occur with the following options:
 806
 807 @table @code
 808 @item -B @var{directory}
 809 @itemx --prepend-include=@var{directory}
 810 Make @code{m4} search @var{directory} for included files, prior to
 811 searching the current working directory.  @xref{Search Path}, for more
 812 details.  This option may be given more than once.  Some other
 813 implementations of @code{m4} use @option{-B @var{number}} to change their
 814 hard-coded limits, but that is unnecessary in @acronym{GNU} where the
 815 only limit is your hardware capability.  So although it is unlikely that
 816 you will want to include a relative directory whose name is purely
 817 numeric, @acronym{GNU} @code{m4} will warn you about this potential
 818 compatibility issue; you can avoid the warning by using the long
 819 spelling, or by using @samp{./@var{number}} if you really meant it.
 820
 821 @item -D @var{name}@r{[}=@var{value}@r{]}
 822 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
 823 This enters @var{name} into the symbol table.  If @samp{=@var{value}} is
 824 missing, the value is taken to be the empty string.  The @var{value} can
 825 be any string, and the macro can be defined to take arguments, just as
 826 if it was defined from within the input.  This option may be given more
 827 than once; order with respect to file names is significant, and
 828 redefining the same @var{name} loses the previous value.
 829
 830 @item --import-environment
 831 Imports every variable in the environment as a macro.  This is done
 832 before @option{-D} and @option{-U}, so they can override the
 833 environment.
 834
 835 @item -I @var{directory}
 836 @itemx --include=@var{directory}
 837 Make @code{m4} search @var{directory} for included files that are not
 838 found in the current working directory.  @xref{Search Path}, for more
 839 details.  This option may be given more than once.
 840
 841 @item --popdef=@var{name}
 842 This deletes the top-most meaning @var{name} might have.  Obviously,
 843 only predefined macros can be deleted in this way.  This option may be
 844 given more than once; popping a @var{name} that does not have a
 845 definition is silently ignored.  Order is significant with respect to
 846 file names.
 847
 848 @item -p @var{name}@r{[}=@var{value}@r{]}
 849 @itemx --pushdef=@var{name}@r{[}=@var{value}@r{]}
 850 This enters @var{name} into the symbol table.  If @samp{=@var{value}} is
 851 missing, the value is taken to be the empty string.  The @var{value} can
 852 be any string, and the macro can be defined to take arguments, just as
 853 if it was defined from within the input.  This option may be given more
 854 than once; order with respect to file names is significant, and
 855 redefining the same @var{name} adds another definition to its stack.
 856
 857 @item -s
 858 @itemx --synclines
 859 Short for @option{--syncoutput=1}, turning on synchronization lines
 860 (sometimes called @dfn{synclines}).
 861
 862 @item --syncoutput@r{[}=@var{state}@r{]}
 863 @cindex synchronization lines
 864 @cindex location, input
 865 @cindex input location
 866 Control the generation of synchronization lines from the command line.
 867 Synchronization lines are for use by the C preprocessor or other
 868 similar tools.  Order is significant with respect to file names.  This
 869 option is useful, for example, when @code{m4} is used as a
 870 front end to a compiler.  Source file name and line number information
 871 is conveyed by directives of the form @samp{#line @var{linenum}
 872 "@var{file}"}, which are inserted as needed into the middle of the
 873 output.  Such directives mean that the following line originated or was
 874 expanded from the contents of input file @var{file} at line
 875 @var{linenum}.  The @samp{"@var{file}"} part is often omitted when
 876 the file name did not change from the previous directive.
 877
 878 Synchronization directives are always given on complete lines by
 879 themselves.  When a synchronization discrepancy occurs in the middle of
 880 an output line, the associated synchronization directive is delayed
 881 until the next newline that does not occur in the middle of a quoted
 882 string or comment.  @xref{Syncoutput}, for runtime control.  @var{state}
 883 is interpreted the same as the argument to @code{syncoutput}; if
 884 @var{state} is omitted, or @option{--syncoutput} is not used,
 885 synchronization lines are disabled.
 886
 887 @item -U @var{name}
 888 @itemx --undefine=@var{name}
 889 This deletes any predefined meaning @var{name} might have.  Obviously,
 890 only predefined macros can be deleted in this way.  This option may be
 891 given more than once; undefining a @var{name} that does not have a
 892 definition is silently ignored.  Order is significant with respect to
 893 file names.
 894 @end table
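
As a brief sketch of the behavior described for @option{-D} (the macro
name @samp{greet} is hypothetical, chosen only for illustration), a
definition given on the command line can refer to its arguments just
like one made with @code{define}:

@comment ignore
@example
$ @kbd{m4 -D'greet=Hello, $1!'}
greet(`world')
@result{}Hello, world!
@end example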
 895
 896 @node Limits control
 897 @section Command line options for limits control
 898
 899 There are some limits within @code{m4} that can be tuned.  For
 900 compatibility, @code{m4} also accepts some options that control limits
 901 in other implementations, but which are automatically unbounded (limited
 902 only by your hardware and operating system constraints) in @acronym{GNU}
 903 @code{m4}.
 904
 905 @table @code
 906 @item -g
 907 @itemx --gnu
 908 Enable all the extensions in this implementation.  This is on by
 909 default unless @env{POSIXLY_CORRECT} is set in the environment; it
 910 exists to allow overriding @option{--traditional}.
 911
 912 @item -G
 913 @itemx --posix
 914 @itemx --traditional
 915 Suppress all the extensions made in this implementation, compared to the
 916 System V version.  @xref{Compatibility}, for a list of these.  This
 917 loads the @samp{traditional} module in place of the @samp{gnu} module.
 918 It is implied if @env{POSIXLY_CORRECT} is set in the environment.
 919
 920 @item -L @var{num}
 921 @itemx --nesting-limit=@var{num}
 922 @cindex nesting limit
 923 @cindex limit, nesting
 924 Artificially limit the nesting of macro calls to @var{num} levels,
 925 stopping program execution if this limit is ever exceeded.  When not
 926 specified, nesting is limited to 1024 levels.  A value of zero means
 927 unlimited; but then heavily nested code could potentially cause a stack
 928 overflow.  @var{num} can have an optional scaling suffix.
 929 @comment FIXME - need a node on what scaling suffixes are supported (see
 930 @comment [info coreutils 'block size'] for ideas), and need to consider
 931 @comment whether builtins should also understand scaling suffixes:
 932 @comment eval, mpeval, perhaps format
 933
 934 The precise effect of this option might be more correctly associated
 935 with textual nesting than dynamic recursion.  It has been useful
 936 when some complex @code{m4} input was generated by mechanical means.
 937 Most users would never need this option.  If shown to be obtrusive,
 938 this option (which is still experimental) might well disappear.
 939
 940 @cindex rescanning
 941 This option does @emph{not} have the ability to break endless
 942 rescanning loops, since these do not necessarily consume much memory
 943 or stack space.  Through clever usage of rescanning loops, one can
 944 request complex, time-consuming computations from @code{m4} with useful
 945 results.  Putting limitations in this area would reduce the power of @code{m4}.
 946 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
 947 only the simplest example (but @pxref{Compatibility}).  Expecting @acronym{GNU}
 948 @code{m4} to detect these would be a little like expecting a compiler
 949 system to detect and diagnose endless loops: it is quite a @emph{hard}
 950 problem in general, if not undecidable!
 951
 952 @item -H @var{num}
 953 @itemx --hashsize=@var{num}
 954 @itemx --word-regexp=@var{regexp}
 955 These options are present only for compatibility with previous versions
 956 of @acronym{GNU} @code{m4}.  They do nothing except issue a warning, because the
 957 symbol table size is not fixed anymore, and because the new
 958 @code{changesyntax} feature is more efficient than the withdrawn
 959 experimental @code{changeword}.  These options will eventually disappear
 960 in future releases.
 961
 962 @item -S @var{num}
 963 @itemx -T @var{num}
 964 These options are present for compatibility with System V @code{m4}, but
 965 do nothing in this implementation.  They may disappear in future
 966 releases, and issue a warning to that effect.
 967 @end table
 968
 969 @node Frozen state
 970 @section Command line options for frozen state
 971
 972 @acronym{GNU} @code{m4} comes with a feature of freezing internal state
 973 (@pxref{Frozen files}).  This can be used to speed up @code{m4}
 974 execution when reusing a common initialization script.
 975
 976 @table @code
 977 @item -F @var{file}
 978 @itemx --freeze-state=@var{file}
 979 Once execution is finished, write out the frozen state on the specified
 980 @var{file}.  It is conventional, but not required, for @var{file} to end
 981 in @samp{.m4f}.
 982
 983 @item -R @var{file}
 984 @itemx --reload-state=@var{file}
 985 Before execution starts, recover the internal state from the specified
 986 frozen @var{file}.  The options @option{-D}, @option{-U}, @option{-t},
 987 @option{-m}, @option{-r}, and @option{--import-environment} take effect
 988 after state is reloaded, but before the input files are read.
 989 @end table
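
A minimal sketch of the intended workflow follows; the file names
@file{common.m4} and @file{common.m4f} and the macro @samp{greeting}
are hypothetical.  The first @code{m4} command freezes the state left
behind by the initialization script, and the second reloads that state
instead of parsing the script again:

@comment ignore
@example
$ @kbd{cat common.m4}
define(`greeting', `hello')dnl
$ @kbd{m4 -F common.m4f common.m4}
$ @kbd{echo greeting | m4 -R common.m4f}
@result{}hello
@end example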
 990
 991 @node Debugging options
 992 @section Command line options for debugging
 993
 994 Finally, there are several options for aiding in debugging @code{m4}
 995 scripts.
 996
 997 @table @code
 998 @item -d@r{[}@r{[}-@r{|}+@r{]}@var{flags}@r{]}
 999 @itemx --debug@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
1000 @itemx --debugmode@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
1001 Set the debug-level according to the flags @var{flags}.  The debug-level
1002 controls the format and amount of information presented by the debugging
1003 functions.  @xref{Debugmode}, for more details on the format and
1004 meaning of @var{flags}.  If omitted, @var{flags} defaults to
1005 @samp{+adeq}.  If the option occurs multiple times, @var{flags} starting
1006 with @samp{-} or @samp{+} are cumulative, while @var{flags} starting
1007 with a letter override all earlier settings.  The debug-level starts
1008 with @samp{d} enabled and all other flags disabled.  To disable all
1009 previously set flags, specify an explicit @var{flags} of @samp{-V}.  For
1010 backward compatibility reasons, the option @option{--fatal-warnings}
1011 implies @samp{--debug=-d} as part of its effects.  The spelling
1012 @option{--debug} is recognized as an unambiguous option for
1013 compatibility with earlier versions of @acronym{GNU} M4, but for
1014 consistency with the builtin name, you can also use the spelling
1015 @option{--debugmode}.  Order is significant with respect to file names.
1016
1017 The cumulative effect of the various options in this example is
1018 equivalent to a single invocation of @code{debugmode(`adlqx')}:
1019
1020 @comment options: -d-V -d+lx --debug --debugmode=-e
1021 @example
1022 $ @kbd{m4 -d+lx --debug --debugmode=-e}
1023 traceon(`len')
1024 @result{}
1025 len(`123')
1026 @error{}m4trace:2: -1- id 2: len(`123')
1027 @result{}3
1028 @end example
1029
1030 @item --debugfile@r{[}=@var{file}@r{]}
1031 @itemx -o @var{file}
1032 @itemx --error-output=@var{file}
1033 Redirect debug messages and trace output to the
1034 named @var{file}.  Warnings, error messages, and @code{errprint} output
1035 are still printed to standard error.  Output from @code{dumpdef} goes to
1036 this file when the debug level @code{o} is not set (@pxref{Debugmode}).
1037 If these options are not used, or
1038 if @var{file} is unspecified (only possible for @option{--debugfile}),
1039 debug output goes to standard error; if @var{file} is the empty string,
1040 debug output is discarded.  @xref{Debugfile}, for more details.  The
1041 option @option{--debugfile} may be given more than once, and order is
1042 significant with respect to file names.  The spellings @option{-o} and
1043 @option{--error-output} are misleading and
1044 inconsistent with other @acronym{GNU} tools; using those spellings will
1045 evoke a warning, and they may be withdrawn or change semantics in a
1046 future release.
1047
1048 @item -l @var{num}
1049 @itemx --debuglen=@var{num}
1050 @itemx --arglength=@var{num}
1051 Restrict the size of the output generated by macro tracing or by
1052 @code{dumpdef} to @var{num} characters per string.  If unspecified or
1053 zero, output is unlimited.  @xref{Debuglen}, for more details.
1054 @var{num} can have an optional scaling suffix.  The spelling
1055 @option{--arglength} is deprecated, since it does not match the
1056 @code{debuglen} macro; using it will evoke a warning, and it may be
1057 withdrawn in a future release.
1058 @comment FIXME - Should we add an option that controls whether output
1059 @comment strings are sanitized with escape sequences, so that dumpdef is
1060 @comment truly one line per macro?
1061 @comment FIXME - see comment on --nesting-limit about NUM.
1062
1063 @item -t @var{name}
1064 @itemx --trace=@var{name}
1065 @itemx --traceon=@var{name}
1066 This enables tracing for the macro @var{name}, at any point where it is
1067 defined.  @var{name} need not be defined when this option is given.
1068 This option may be given more than once, and order is significant with
1069 respect to file names.  @xref{Trace}, for more details.
1070
1071 @item --traceoff=@var{name}
1072 This disables tracing for the macro @var{name}, at any point where it is
1073 defined.  @var{name} need not be defined when this option is given.
1074 This option may be given more than once, and order is significant with
1075 respect to file names.  @xref{Trace}, for more details.
1076 @end table
1077
1078 @node Command line files
1079 @section Specifying input files on the command line
1080
1081 @cindex command line, file names on the
1082 @cindex file names, on the command line
1083 The remaining arguments on the command line are taken to be input file
1084 names.  If no names are present, standard input is read.  A file
1085 name of @file{-} is taken to mean standard input.  It is
1086 conventional, but not required, for input files to end in @samp{.m4}.
1087
1088 The input files are read in the sequence given.  Standard input can be
1089 read more than once, so the file name @file{-} may appear multiple times
1090 on the command line; this makes a difference when input is from a
1091 terminal or other special file type.  It is an error if an input file
1092 ends in the middle of argument collection, a comment, or a quoted
1093 string.
1094 @comment FIXME - it would be nicer if we let these three things
1095 @comment continue across file boundaries, provided that we warn in
1096 @comment interactive use when switching to stdin in a non-default parse
1097 @comment state.
1098
1099 Various options, such as @option{--define} (@option{-D}), @option{--undefine}
1100 (@option{-U}), @option{--synclines} (@option{-s}), @option{--trace}
1101 (@option{-t}), @option{--regexp-syntax} (@option{-r}), and
1102 @option{--load-module} (@option{-m}), only take effect after processing
1103 input from any file names that occur earlier on the command line.  For
1104 example, assume the file @file{foo} contains:
1105
1106 @comment file: foo
1107 @example
1108 $ @kbd{cat foo}
1109 bar
1110 @end example
1111
1112 The text @samp{bar} can then be redefined over multiple uses of
1113 @file{foo}:
1114
1115 @comment options: -Dbar=hello foo -Dbar=world foo
1116 @example
1117 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
1118 @result{}hello
1119 @result{}world
1120 @end example
1121
1122 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
1123 exit status of @code{m4} will be 0 for success, 1 for general failure
1124 (such as problems with reading an input file), and 63 for version
1125 mismatch (@pxref{Using frozen files}).
1126
1127 If you need to read a file whose name starts with a @file{-}, you can
1128 specify it as @samp{./-file}, or use @option{--} to mark the end of
1129 options.
1130
1131 @node Syntax
1132 @chapter Lexical and syntactic conventions
1133
1134 @cindex input tokens
1135 @cindex tokens
1136 As @code{m4} reads its input, it separates it into @dfn{tokens}.  A
1137 token is either a name, a quoted string, or any single character that
1138 is not part of either a name or a string.  Input to @code{m4} can also
1139 contain comments.  @acronym{GNU} @code{m4} does not yet understand
1140 multibyte locales; all operations are byte-oriented rather than
1141 character-oriented (although if your locale uses a single byte
1142 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1143 However, @code{m4} is eight-bit clean, so you can
1144 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1145 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1146 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1147
1148 @comment FIXME - each builtin needs to document how it handles NUL, then
1149 @comment update the above paragraph to mention that NUL is now handled
1150 @comment transparently.
1151
1152 @menu
1153 * Names::                       Macro names
1154 * Quoted strings::              Quoting input to @code{m4}
1155 * Comments::                    Comments in @code{m4} input
1156 * Other tokens::                Other kinds of input tokens
1157 * Input processing::            How @code{m4} copies input to output
1158 * Regular expression syntax::   How @code{m4} interprets regular expressions
1159 @end menu
1160
1161 @node Names
1162 @section Macro names
1163
1164 @cindex names
1165 @cindex words
1166 A name is any sequence of letters, digits, and the character @samp{_}
1167 (underscore), where the first character is not a digit.  @code{m4} will
1168 use the longest such sequence found in the input.  If a name has a
1169 macro definition, it will be subject to macro expansion
1170 (@pxref{Macros}).  Names are case-sensitive.
1171
1172 Examples of valid names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
1173
1174 The definitions of letters, digits and other input characters can be
1175 changed at any time, using the builtin macro @code{changesyntax}.
1176 @xref{Changesyntax}, for more information.
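
The following sketch illustrates the longest-match and case-sensitivity
rules, assuming the default syntax and that @samp{foo} is the only
macro defined by the user (the name is chosen purely for illustration):

@example
define(`foo', `bar')
@result{}
foo foo_ foo1 Foo
@result{}bar foo_ foo1 Foo
@end example

@noindent
Only the first word is expanded, since @samp{foo_}, @samp{foo1}, and
@samp{Foo} are all distinct names without a definition.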
1177
1178 @node Quoted strings
1179 @section Quoting input to @code{m4}
1180
1181 @cindex quoted string
1182 @cindex string, quoted
1183 A quoted string is a sequence of characters surrounded by quote
1184 strings, defaulting to
1185 @samp{`} (grave-accent, also known as back-tick, with UCS value U+0060)
1186 and @samp{'} (apostrophe, also known as single-quote, with UCS value
1187 U+0027), where the nested begin and end quotes within the
1188 string are balanced.  The value of a string token is the text, with one
1189 level of quotes stripped off.  Thus
1190
1191 @comment ignore
1192 @example
1193 `'
1194 @result{}
1195 @end example
1196
1197 @noindent
1198 is the empty string, and double-quoting turns into single-quoting.
1199
1200 @comment ignore
1201 @example
1202 ``quoted''
1203 @result{}`quoted'
1204 @end example
1205
1206 The quote characters can be changed at any time, using the builtin macros
1207 @code{changequote} (@pxref{Changequote}) or @code{changesyntax}
1208 (@pxref{Changesyntax}).
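
As a further sketch, assuming the default quote strings and a
hypothetical macro @samp{x}, quoting a name prevents its expansion, and
only the outermost level of nested quotes is stripped:

@example
define(`x', `expanded')
@result{}
x `x' ``x''
@result{}expanded x `x'
@end example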
1209
1210 @node Comments
1211 @section Comments in @code{m4} input
1212
1213 @cindex comments
1214 Comments in @code{m4} are normally delimited by the characters @samp{#}
1215 and newline.  All characters between the comment delimiters are ignored,
1216 but the entire comment (including the delimiters) is passed through to
1217 the output, unless you supply the @option{--discard-comments} or
1218 @option{-c} option at the command line (@pxref{Operation modes, ,
1219 Invoking m4}).  When discarding comments, the comment delimiters are
1220 discarded, even if the close-comment string is a newline.
1221
1222 Comments cannot be nested, so the first newline after a @samp{#} ends
1223 the comment.  The commenting effect of the begin-comment string
1224 can be inhibited by quoting it.
1225
1226 @example
1227 $ @kbd{m4}
1228 `quoted text' # `commented text'
1229 @result{}quoted text # `commented text'
1230 `quoting inhibits' `#' `comments'
1231 @result{}quoting inhibits # comments
1232 @end example
1233
1234 @comment options: -c
1235 @example
1236 $ @kbd{m4 -c}
1237 `quoted text' # `commented text'
1238 `quoting inhibits' `#' `comments'
1239 @result{}quoted text quoting inhibits # comments
1240 @end example
1241
1242 The comment delimiters can be changed to any string at any time, using
1243 the builtin macros @code{changecom} (@pxref{Changecom}) or
1244 @code{changesyntax} (@pxref{Changesyntax}).
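
The following sketch, which assumes the default comment delimiters and
a hypothetical macro @samp{x}, shows that macro names inside a comment
are copied to the output without being expanded:

@example
define(`x', `expanded')
@result{}
x # x
@result{}expanded # x
@end example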
1245
1246 @node Other tokens
1247 @section Other kinds of input tokens
1248
1249 @cindex tokens, special
1250 Any character that is not part of a name, a quoted string, or a
1251 comment is a token by itself.  When not in the context of macro
1252 expansion, all of these tokens are just copied to output.  However,
1253 during macro expansion, whitespace characters (space, tab, newline,
1254 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1255 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1256 roles, explained later.  Which characters actually perform these roles
1257 can be adjusted with @code{changesyntax} (@pxref{Changesyntax}).
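
For instance, the role of whitespace during argument collection can be
observed with the builtin @code{len}.  This is only a sketch, assuming
the default syntax: unquoted leading whitespace in an argument is
discarded, whereas trailing whitespace is retained.

@example
len(   abc)
@result{}3
len(abc   )
@result{}6
@end example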
1258
1259 @node Input processing
1260 @section How @code{m4} copies input to output
1261
1262 As @code{m4} reads the input token by token, it copies each token
1263 directly to the output.
1264
1265 The exception is when it finds a word with a macro definition.  In that
1266 case @code{m4} will calculate the macro's expansion, possibly reading
1267 more input to get the arguments.  It then inserts the expansion in front
1268 of the remaining input.  In other words, the resulting text from a macro
1269 call will be read and parsed into tokens again.
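
The rescanning step can be sketched with two hypothetical macros,
@samp{a} and @samp{b}: the expansion of @samp{a} is inserted in front
of the remaining input, where it is recognized as a call to @samp{b}
and expanded in turn.

@example
define(`a', `b')
@result{}
define(`b', `c')
@result{}
a
@result{}c
@end example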
1270
1271 @code{m4} expands a macro as soon as possible.  If it finds a macro call
1272 when collecting the arguments to another, it will expand the second call
1273 first.  This process continues until there are no more macro calls to
1274 expand and all the input has been consumed.
1275
1276 For a running example, examine how @code{m4} handles this input:
1277
1278 @comment ignore
1279 @example
1280 format(`Result is %d', eval(`2**15'))
1281 @end example
1282
1283 @noindent
1284 First, @code{m4} sees that the token @samp{format} is a macro name, so
1285 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1286 and @samp{@w{ }}, before encountering another potential macro.  Sure
1287 enough, @samp{eval} is a macro name, so the nested argument collection
1288 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the @code{eval} macro
1289 with the lone argument of @samp{2**15}.  The expansion of
1290 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1291 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1292 combined with the next @samp{)}, the @code{format} macro now has all its
1293 arguments, as if the user had typed:
1294
1295 @comment ignore
1296 @example
1297 format(`Result is %d', 32768)
1298 @end example
1299
1300 @noindent
1301 The @code{format} macro expands to @samp{Result is 32768}, and we have another
1302 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1303 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1304 @samp{8}.  None of these are macros, so the final output is
1305
1306 @comment ignore
1307 @example
1308 @result{}Result is 32768
1309 @end example
1310
1311 As a more complicated example, we will walk through an actual code example
1312 from the Gnulib project@footnote{Derived from a patch in
1313 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1314 and a followup patch in
1315 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1316 showing both a buggy approach and the desired results.  The goal is a
1317 macro that outputs a shell assignment statement, forming the name of the
1318 shell variable from its argument by converting it to uppercase and
1319 prepending a prefix.  The original attempt looks like this:
1320
1321 @example
1322 changequote([,])dnl
1323 define([gl_STRING_MODULE_INDICATOR],
1324   [
1325     dnl comment
1326     GNULIB_]translit([$1],[a-z],[A-Z])[=1
1327   ])dnl
1328   gl_STRING_MODULE_INDICATOR([strcase])
1329 @result{} @w{ }
1330 @result{}        GNULIB_strcase=1
1331 @result{} @w{ }
1332 @end example
1333
1334 Oops -- the argument did not get capitalized.  And although the manual
1335 is not able to easily show it, both lines that appear empty actually
1336 contain two trailing spaces.  By stepping through the parse, it is easy
1337 to see what happened.  First, @code{m4} sees the token
1338 @samp{changequote}, which it recognizes as a macro, followed by
1339 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1340 argument list.  The macro expands to the empty string, but changes the
1341 quoting characters to something more useful for generating shell code
1342 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1343 but unbalanced @samp{[]} tend to be rare).  Also in the first line,
1344 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1345 macro that consumes the rest of the line, resulting in no output for
1346 that line.
1347
1348 The second line starts a macro definition.  @code{m4} sees the token
1349 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1350 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}.  Because an unquoted
1351 comma was encountered, the first argument is known to be the expansion
1352 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1353 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1354 whitespace is discarded as part of argument collection.  Then comes a
1355 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1356 comment@key{NL}@ @ @ @ GNULIB_]}.  This is followed by the token
1357 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1358 macro expansion has started.
1359
1360 The arguments to @code{translit} are given by the tokens @samp{(},
1361 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1362 @samp{)}.  All three string arguments are expanded (or in other words,
1363 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1364 capitalization, the result of the macro is @samp{$1}.  This expansion is
1365 rescanned, resulting in the two literal characters @samp{$} and
1366 @samp{1}.
1367
1368 Scanning of the outer macro resumes, and picks up with
1369 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}.  The collected pieces of
1370 expanded text are concatenated, with the end result that the macro
1371 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1372 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1373 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1374
1375 The final line is then parsed, beginning with @samp{ } and @samp{ }
1376 that are output literally.  Then @samp{gl_STRING_MODULE_INDICATOR} is
1377 recognized as a macro name, with an argument list of @samp{(},
1378 @samp{[strcase]}, and @samp{)}.  Since the definition of the macro
1379 contains the sequence @samp{$1}, that sequence is replaced with the
1380 argument @samp{strcase} prior to starting the rescan.  The rescan sees
1381 @samp{@key{NL}} and four spaces, which are output literally, then
1382 @samp{dnl}, which discards the text @samp{ comment@key{NL}}.  Next
1383 come four more spaces, also output literally, and the token
1384 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1385 substitution.  Since that is not a macro name, it is output literally,
1386 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1387 two more spaces.  Finally, the original @samp{@key{NL}} seen after the
1388 macro invocation is scanned and output literally.
1389
1390 Now for a corrected approach.  This rearranges the use of newlines and
1391 whitespace so that less whitespace is output (which, although harmless
1392 to shell scripts, can be visually unappealing), and fixes the quoting
1393 issues so that the capitalization occurs when the macro
1394 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1395 defined.  It also adds another layer of quoting to the first argument of
1396 @code{translit}, to ensure that the output will be rescanned as a string
1397 rather than a potential uppercase macro name needing further expansion.
1398
1399 @example
1400 changequote([,])dnl
1401 define([gl_STRING_MODULE_INDICATOR],
1402   [dnl comment
1403   GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1404 ])dnl
1405   gl_STRING_MODULE_INDICATOR([strcase])
1406 @result{}    GNULIB_STRCASE=1
1407 @end example
1408
1409 The parsing of the first line is unchanged.  The second line sees the
1410 name of the macro to define, then sees the discarded @samp{@key{NL}}
1411 and two spaces, as before.  But this time, the next token is
1412 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1413 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1414 @samp{)} to end the macro definition and @samp{dnl} to skip the
1415 newline.  No early expansion of @code{translit} occurs, so the entire
1416 string becomes the definition of the macro.
1417
1418 The final line is then parsed, beginning with two spaces that are
1419 output literally, and an invocation of
1420 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1421 Again, the @samp{$1} in the macro definition is substituted prior to
1422 rescanning.  Rescanning first encounters @samp{dnl}, and discards
1423 @samp{ comment@key{NL}}.  Then two spaces are output literally.  Next
1424 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1425 output literally.  The token @samp{[]} is an empty string, so it does
1426 not affect output.  Then the token @samp{translit} is encountered.
1427
1428 This time, the arguments to @code{translit} are parsed as @samp{(},
1429 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1430 @samp{[A-Z]}, and @samp{)}.  The two spaces are discarded, and
1431 @code{translit} produces the desired result @samp{[STRCASE]}.  This is
1432 rescanned, but since it is a string, the quotes are stripped and the
1433 only output is a literal @samp{STRCASE}.
1434 Then the scanner sees @samp{=} and @samp{1}, which are output
1435 literally, followed by @samp{dnl} which discards the rest of the
1436 definition of @code{gl_STRING_MODULE_INDICATOR}.  The newline at the
1437 end of output is the literal @samp{@key{NL}} that appeared after the
1438 invocation of the macro.
1439
1440 The order in which @code{m4} expands the macros can be further explored
1441 using the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
1442
1443 @node Regular expression syntax
1444 @section How @code{m4} interprets regular expressions
1445
1446 There are several contexts where @code{m4} parses an argument as a
1447 regular expression.  This section describes the various flavors of
1448 regular expressions.  @xref{Changeresyntax}.
1449
1450 @include regexprops-generic.texi
1451
1452 @node Macros
1453 @chapter How to invoke macros
1454
1455 This chapter covers macro invocation, macro arguments and how macro
1456 expansion is treated.
1457
1458 @menu
1459 * Invocation::                  Macro invocation
1460 * Inhibiting Invocation::       Preventing macro invocation
1461 * Macro Arguments::             Macro arguments
1462 * Quoting Arguments::           On Quoting Arguments to macros
1463 * Macro expansion::             Expanding macros
1464 @end menu
1465
1466 @node Invocation
1467 @section Macro invocation
1468
1469 @cindex macro invocation
1470 @cindex invoking macros
1471 A macro invocation has one of the forms
1472
1473 @comment ignore
1474 @example
1475 name
1476 @end example
1477
1478 @noindent
1479 which is a macro invocation without any arguments, or
1480
1481 @comment ignore
1482 @example
1483 name(arg1, arg2, @dots{}, arg@var{n})
1484 @end example
1485
1486 @noindent
1487 which is a macro invocation with @var{n} arguments.  Macros can have any
1488 number of arguments.  All arguments are strings, but different macros
1489 might interpret the arguments in different ways.
1490
1491 The opening parenthesis @emph{must} follow the @var{name} directly, with
1492 no spaces in between.  If it does not, the macro is called with no
1493 arguments at all.
1494
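The effect of a stray space can be seen in the following sketch, where
@code{greet} is a hypothetical macro chosen only for illustration.  In
the second call, the space causes @code{greet} to be expanded with no
arguments, and the parenthesized text is then copied to the output as
ordinary text.

@example
dnl greet is a made-up macro used only for this illustration
define(`greet', `Hello, $1!')
@result{}
greet(`world')
@result{}Hello, world!
greet (`world')
@result{}Hello, ! (world)
@end example
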
1495 For a macro call to have no arguments, the parentheses @emph{must} be
1496 left out.  The macro call
1497
1498 @comment ignore
1499 @example
1500 name()
1501 @end example
1502
1503 @noindent
1504 is a macro call with one argument, which is the empty string, not a call
1505 with no arguments.
1506
1507 @node Inhibiting Invocation
1508 @section Preventing macro invocation
1509
1510 An innovation of the @code{m4} language, compared to some of its
1511 predecessors (like Strachey's @code{GPM}, for example), is the ability
1512 to recognize macro calls without resorting to any special, prefixed
1513 invocation character.  While generally useful, this feature might
1514 sometimes be the source of spurious, unwanted macro calls.  So, @acronym{GNU}
1515 @code{m4} offers several mechanisms or techniques for inhibiting the
1516 recognition of names as macro calls.
1517
1518 @cindex @acronym{GNU} extensions
1519 @cindex blind macro
1520 @cindex macro, blind
1521 First of all, many builtin macros cannot meaningfully be called without
1522 arguments.  As a @acronym{GNU} extension, for any of these macros,
1523 whenever an opening parenthesis does not immediately follow their name,
1524 the builtin macro call is not triggered.  This solves the most usual
1525 cases, like for @samp{include} or @samp{eval}.  Later in this document,
1526 the sentence ``This macro is recognized only with parameters'' refers to
1527 this specific provision of @acronym{GNU} M4, also known as a blind
1528 builtin macro.  For the builtins defined by @acronym{POSIX} that bear
1529 this disclaimer, @acronym{POSIX} specifically states that invoking those
1530 builtins without arguments is unspecified, because many other
1531 implementations simply invoke the builtin as though it were given one
1532 empty argument instead.
1533
1534 @example
1535 $ @kbd{m4}
1536 eval
1537 @result{}eval
1538 eval(`1')
1539 @result{}1
1540 @end example
1541
1542 There is also a command line option (@option{--prefix-builtins}, or
1543 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1544 builtin macros with a prefix of @samp{m4_} at startup.  The option has
1545 no effect whatsoever on user defined macros.  For example, with this option,
1546 one has to write @code{m4_dnl} and even @code{m4_m4exit}.  It also has
1547 no effect on whether a macro requires parameters.
1548
1549 @comment options: -P
1550 @example
1551 $ @kbd{m4 -P}
1552 eval
1553 @result{}eval
1554 eval(`1')
1555 @result{}eval(1)
1556 m4_eval
1557 @result{}m4_eval
1558 m4_eval(`1')
1559 @result{}1
1560 @end example
1561
1562 Another alternative is to redefine problematic macros to a name less
1563 likely to cause conflicts (@pxref{Definitions}).  Or the parsing engine
1564 can be changed to redefine what constitutes a valid macro name
1565 (@pxref{Changesyntax}).
1566
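As a sketch of the renaming approach, the following moves the
@code{dnl} builtin to the name @code{m4_dnl}.  The new name is an
arbitrary choice made here to mirror @option{--prefix-builtins}; the
example relies on @code{defn} and @code{undefine}, which are described
later (@pxref{Defn} and @pxref{Undefine}).

@example
dnl m4_dnl is an arbitrary new name picked for this illustration
define(`m4_dnl', defn(`dnl'))undefine(`dnl')
@result{}
the word dnl is now ordinary text
@result{}the word dnl is now ordinary text
m4_dnl this text is discarded, newline included
visible
@result{}visible
@end example
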
1567 Of course, the simplest way to prevent a name from being interpreted
1568 as a call to an existing macro is to quote it.  The remainder of
1569 this section studies a little more deeply how quoting affects macro
1570 invocation, and how quoting can be used to inhibit macro invocation.
1571
1572 Even if quoting is usually done over the whole macro name, it can also
1573 be done over only a few characters of this name (provided, of course,
1574 that the unquoted portions are not also a macro).  It is also possible
1575 to quote the empty string, but this works only @emph{inside} the name.
1576 For example:
1577
1578 @example
1579 `divert'
1580 @result{}divert
1581 `d'ivert
1582 @result{}divert
1583 di`ver't
1584 @result{}divert
1585 div`'ert
1586 @result{}divert
1587 @end example
1588
1589 @noindent
1590 all yield the string @samp{divert}.  While in both:
1591
1592 @example
1593 `'divert
1594 @result{}
1595 divert`'
1596 @result{}
1597 @end example
1598
1599 @noindent
1600 the @code{divert} builtin macro will be called, which expands to the
1601 empty string.
1602
1603 @cindex rescanning
1604 The output of macro evaluations is always rescanned.  In the following
1605 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1606 if @code{m4}
1607 had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1608
1609 @example
1610 define(`cde', `CDE')
1611 @result{}
1612 define(`x', `substr(ab')
1613 @result{}
1614 define(`y', `cde, `1', `3')')
1615 @result{}
1616 x`'y
1617 @result{}bCD
1618 @end example
1619
1620 Unquoted strings on either side of a quoted string are subject to
1621 being recognized as macro names.  In the following example, quoting the
1622 empty string allows for the second @code{macro} to be recognized as such:
1623
1624 @example
1625 define(`macro', `m')
1626 @result{}
1627 macro(`m')macro
1628 @result{}mmacro
1629 macro(`m')`'macro
1630 @result{}mm
1631 @end example
1632
1633 Quoting may prevent recognizing as a macro name the concatenation of a
1634 macro expansion with the surrounding characters.  In this example:
1635
1636 @example
1637 define(`macro', `di$1')
1638 @result{}
1639 macro(`v')`ert'
1640 @result{}divert
1641 macro(`v')ert
1642 @result{}
1643 @end example
1644
1645 @noindent
1646 the input will produce the string @samp{divert}.  When the quotes were
1647 removed, the @code{divert} builtin was called instead.
1648
1649 @node Macro Arguments
1650 @section Macro arguments
1651
1652 @cindex macros, arguments to
1653 @cindex arguments to macros
1654 When a name is seen, and it has a macro definition, it will be expanded
1655 as a macro.
1656
1657 If the name is followed by an opening parenthesis, the arguments will be
1658 collected before the macro is called.  If too few arguments are
1659 supplied, the missing arguments are taken to be the empty string.
1660 However, some builtins are documented to behave differently for a
1661 missing optional argument than for an explicit empty string.  If there
1662 are too many arguments, the excess arguments are ignored.  Unquoted
1663 leading whitespace is stripped off all arguments, but whitespace
1664 generated by a macro expansion or occurring after a macro that expanded
1665 to an empty string remains intact.  Whitespace includes space, tab,
1666 newline, carriage return, vertical tab, and formfeed.
1667
1668 @example
1669 define(`macro', `$1')
1670 @result{}
1671 macro( unquoted leading space lost)
1672 @result{}unquoted leading space lost
1673 macro(` quoted leading space kept')
1674 @result{} quoted leading space kept
1675 macro(
1676  divert `unquoted space kept after expansion')
1677 @result{} unquoted space kept after expansion
1678 macro(macro(`
1679 ')`whitespace from expansion kept')
1680 @result{}
1681 @result{}whitespace from expansion kept
1682 macro(`unquoted trailing whitespace kept'
1683 )
1684 @result{}unquoted trailing whitespace kept
1685 @result{}
1686 @end example
1687
1688 @cindex warnings, suppressing
1689 @cindex suppressing warnings
1690 Normally @code{m4} will issue warnings if a builtin macro is called
1691 with an inappropriate number of arguments, but they can be suppressed with
1692 the @option{--quiet} command line option (or @option{--silent}, or
1693 @option{-Q}, @pxref{Operation modes, , Invoking m4}).  For user
1694 defined macros, there is no check of the number of arguments given.
1695
1696 @example
1697 $ @kbd{m4}
1698 index(`abc')
1699 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
1700 @result{}0
1701 index(`abc',)
1702 @result{}0
1703 index(`abc', `b', `0', `ignored')
1704 @error{}m4:stdin:3: warning: index: extra arguments ignored: 4 > 3
1705 @result{}1
1706 @end example
1707
1708 @comment options: -Q
1709 @example
1710 $ @kbd{m4 -Q}
1711 index(`abc')
1712 @result{}0
1713 index(`abc',)
1714 @result{}0
1715 index(`abc', `b', `', `ignored')
1716 @result{}1
1717 @end example
1718
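User-defined macros, by contrast, never trigger such warnings.  In this
sketch, the hypothetical macro @code{pair} silently treats a missing
second argument as empty and silently drops a third one.

@example
dnl pair is a made-up macro used only for this illustration
define(`pair', `[$1] [$2]')
@result{}
pair(`a')
@result{}[a] []
pair(`a', `b', `c')
@result{}[a] [b]
@end example
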
1719 Macros are expanded normally during argument collection, and whatever
1720 commas, quotes and parentheses that might show up in the resulting
1721 expanded text will serve to define the arguments as well.  Thus, if
1722 @var{foo} expands to @samp{, b, c}, the macro call
1723
1724 @comment ignore
1725 @example
1726 bar(a foo, d)
1727 @end example
1728
1729 @noindent
1730 is a macro call with four arguments, which are @samp{a }, @samp{b},
1731 @samp{c} and @samp{d}.  To understand why the first argument contains
1732 whitespace, remember that unquoted leading whitespace is never part
1733 of an argument, but trailing whitespace always is.
1734
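The claim can be checked with a small sketch.  Here @code{foo} and
@code{bar} are defined only for illustration; @code{bar} prints its
argument count via @code{$#} (@pxref{Pseudo Arguments}) and brackets
each of its first four arguments so that the whitespace is visible.

@example
dnl foo and bar are made-up definitions used only for this illustration
define(`foo', `, b, c')
@result{}
define(`bar', `$#: [$1] [$2] [$3] [$4]')
@result{}
bar(a foo, d)
@result{}4: [a ] [b] [c] [d]
@end example
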
1735 It is possible for a macro's definition to change during argument
1736 collection, in which case the expansion uses the definition that was in
1737 effect at the time the opening @samp{(} was seen.
1738
1739 @example
1740 define(`f', `1')
1741 @result{}
1742 f(define(`f', `2'))
1743 @result{}1
1744 f
1745 @result{}2
1746 @end example
1747
1748 It is an error if the end of file occurs while collecting arguments.
1749
1750 @comment status: 1
1751 @example
1752 hello world
1753 @result{}hello world
1754 define(
1755 ^D
1756 @error{}m4:stdin:2: define: end of file in argument list
1757 @end example
1758
1759 @node Quoting Arguments
1760 @section On Quoting Arguments to macros
1761
1762 @cindex quoted macro arguments
1763 @cindex macros, quoted arguments to
1764 @cindex arguments, quoted macro
1765 Each argument has unquoted leading whitespace removed.  Within each
1766 argument, all unquoted parentheses must match.  For example, if
1767 @var{foo} is a macro,
1768
1769 @comment ignore
1770 @example
1771 foo(() (`(') `(')
1772 @end example
1773
1774 @noindent
1775 is a macro call, with one argument, whose value is @samp{() (() (}.
1776 Commas separate arguments, except when they occur inside quotes,
1777 comments, or unquoted parentheses.  @xref{Pseudo Arguments}, for
1778 examples.
1779
1780 It is common practice to quote all arguments to macros, unless you are
1781 sure you want the arguments expanded.  Thus, in the above
1782 example with the parentheses, the `right' way to do it is like this:
1783
1784 @comment ignore
1785 @example
1786 foo(`() (() (')
1787 @end example
1788
1789 @cindex quoting rule of thumb
1790 @cindex rule of thumb, quoting
1791 It is, however, in certain cases necessary (because nested expansion
1792 must occur to create the arguments for the outer macro) or convenient
1793 (because it uses fewer characters) to leave out quotes for some
1794 arguments, and there is nothing wrong in doing it.  It just makes life a
1795 bit harder, if you are not careful to follow a consistent quoting style.
1796 For consistency, this manual follows the rule of thumb that each layer
1797 of parentheses introduces another layer of single quoting, except when
1798 showing the consequences of quoting rules.  This is done even when the
1799 quoted string cannot be a macro, such as with integers when you have not
1800 changed the syntax via @code{changesyntax} (@pxref{Changesyntax}).
1801
1802 The quoting rule of thumb of one level of quoting per parentheses has a
1803 nice property: when a macro name appears inside parentheses, you can
1804 determine when it will be expanded.  If it is not quoted, it will be
1805 expanded prior to the outer macro, so that its expansion becomes the
1806 argument.  If it is single-quoted, it will be expanded after the outer
1807 macro.  And if it is double-quoted, it will be used as literal text
1808 instead of a macro name.
1809
1810 @example
1811 define(`active', `ACT, IVE')
1812 @result{}
1813 define(`show', `$1 $1')
1814 @result{}
1815 show(active)
1816 @result{}ACT ACT
1817 show(`active')
1818 @result{}ACT, IVE ACT, IVE
1819 show(``active'')
1820 @result{}active active
1821 @end example
1822
1823 @node Macro expansion
1824 @section Macro expansion
1825
1826 @cindex macros, expansion of
1827 @cindex expansion of macros
1828 When the arguments, if any, to a macro call have been collected, the
1829 macro is expanded, and the expansion text is pushed back onto the input
1830 (unquoted), and reread.  The expansion text from one macro call might
1831 therefore result in more macros being called, if the calls are included,
1832 completely or partially, in the first macro call's expansion.
1833
1834 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1835 @var{bar} expands to @samp{Hello world}, the input
1836
1837 @comment options: -Dbar='Hello world' -Dfoo=bar
1838 @example
1839 $ @kbd{m4 -Dbar="Hello world" -Dfoo=bar}
1840 foo
1841 @result{}Hello world
1842 @end example
1843
1844 @noindent
1845 will expand first to @samp{bar}, and when this is reread and
1846 expanded, into @samp{Hello world}.
1847
1848 @node Definitions
1849 @chapter How to define new macros
1850
1851 @cindex macros, how to define new
1852 @cindex defining new macros
1853 Macros can be defined, redefined and deleted in several different ways.
1854 Also, it is possible to redefine a macro without losing a previous
1855 value, and bring back the original value at a later time.
1856
1857 @menu
1858 * Define::                      Defining a new macro
1859 * Arguments::                   Arguments to macros
1860 * Pseudo Arguments::            Special arguments to macros
1861 * Undefine::                    Deleting a macro
1862 * Defn::                        Renaming macros
1863 * Pushdef::                     Temporarily redefining macros
1864 * Renamesyms::                  Renaming macros with regular expressions
1865
1866 * Indir::                       Indirect call of macros
1867 * Builtin::                     Indirect call of builtins
1868 * M4symbols::                   Getting the defined macro names
1869 @end menu
1870
1871 @node Define
1872 @section Defining a macro
1873
1874 The normal way to define or redefine macros is to use the builtin
1875 @code{define}:
1876
1877 @deffn {Builtin (m4)} define (@var{name}, @ovar{expansion})
1878 Defines @var{name} to expand to @var{expansion}.  If
1879 @var{expansion} is not given, it is taken to be empty.
1880
1881 The expansion of @code{define} is void.
1882 The macro @code{define} is recognized only with parameters.
1883 @end deffn
1884 @comment Other implementations, such as Solaris, can define a macro
1885 @comment with a builtin token attached to text:
1886 @comment  define(foo, a`'defn(`divnum')b)
1887 @comment  defn(`foo') => ab
1888 @comment  dumpdef(`foo') => foo: a<divnum>b
1889 @comment  len(defn(`foo')) => 3
1890 @comment  index(defn(`foo'), defn(`divnum')) => 1
1891 @comment  foo => a0b
1892 @comment It may be worth making some changes to support this behavior,
1893 @comment or something similar to it.
1894 @comment
1895 @comment But be sure it has sane semantics, with potentially deferred
1896 @comment expansion of builtins.  For example, this should not warn
1897 @comment about trying to access the definition of an undefined macro:
1898 @comment  define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops')
1899 @comment Also, think how to handle conflicting argument counts:
1900 @comment  define(`bar', defn(`dnl', `len'))
1901
1902 The following example defines the macro @var{foo} to expand to the text
1903 @samp{Hello world.}.
1904
1905 @example
1906 define(`foo', `Hello world.')
1907 @result{}
1908 foo
1909 @result{}Hello world.
1910 @end example
1911
1912 The empty line in the output is there because the newline is not
1913 a part of the macro definition, and it is consequently copied to
1914 the output.  This can be avoided by use of the macro @code{dnl}.
1915 @xref{Dnl}, for details.
1916
1917 The first argument to @code{define} should be quoted; otherwise, if the
1918 macro is already defined, you will be defining a different macro.  This
1919 example shows the problems with underquoting, since we did not want to
1920 redefine @code{one}:
1921
1922 @example
1923 define(foo, one)
1924 @result{}
1925 define(foo, two)
1926 @result{}
1927 one
1928 @result{}two
1929 @end example
1930
1931 @cindex @acronym{GNU} extensions
1932 @acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
1933 definition of a macro if it has several definitions from @code{pushdef}
1934 (@pxref{Pushdef}).  Some other implementations of @code{m4} replace all
1935 definitions of a macro with @code{define}.  @xref{Incompatibilities},
1936 for more details.
1937
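A minimal sketch of the @acronym{GNU} behavior, using a throwaway macro
@code{a}: after @code{pushdef}, a plain @code{define} replaces only the
most recent definition, so @code{popdef} still uncovers the original
one.

@example
dnl a is a throwaway macro used only for this illustration
define(`a', `1')pushdef(`a', `2')
@result{}
a
@result{}2
define(`a', `3')
@result{}
a
@result{}3
popdef(`a')
@result{}
a
@result{}1
@end example
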
1938 As a @acronym{GNU} extension, the first argument to @code{define} does
1939 not have to be a simple word.
1940 It can be any text string, even the empty string.  A macro with a
1941 non-standard name cannot be invoked in the normal way, as the name is
1942 not recognized.  It can only be referenced by the builtins @code{indir}
1943 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
1944
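For instance, in the following sketch the name @samp{my-macro} (an
arbitrary name chosen for illustration) contains a hyphen, so the
scanner never sees it as a single word; the definition is nevertheless
reachable through @code{indir} and @code{defn}.

@example
dnl my-macro is an arbitrary name picked for this illustration
define(`my-macro', `hyphenated name')
@result{}
my-macro
@result{}my-macro
indir(`my-macro')
@result{}hyphenated name
defn(`my-macro')
@result{}hyphenated name
@end example
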
1945 @cindex arrays
1946 Arrays and associative arrays can be simulated by using non-standard
1947 macro names.
1948
1949 @deffn Composite array (@var{index})
1950 @deffnx Composite array_set (@var{index}, @ovar{value})
1951 Provide access to entries within an array.  @code{array} reads the entry
1952 at location @var{index}, and @code{array_set} assigns @var{value} to
1953 location @var{index}.
1954 @end deffn
1955
1956 @example
1957 define(`array', `defn(format(``array[%d]'', `$1'))')
1958 @result{}
1959 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1960 @result{}
1961 array_set(`4', `array element no. 4')
1962 @result{}
1963 array_set(`17', `array element no. 17')
1964 @result{}
1965 array(`4')
1966 @result{}array element no. 4
1967 array(eval(`10 + 7'))
1968 @result{}array element no. 17
1969 @end example
1970
1971 Change the @samp{%d} to @samp{%s} and it is an associative array.
1972
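A sketch of that variation follows; the names @code{assoc} and
@code{assoc_set}, as well as the stored keys and values, are made up
for illustration.

@example
dnl assoc and assoc_set are made-up names for this illustration
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')
@result{}
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')
@result{}
assoc_set(`apple', `red')
@result{}
assoc_set(`banana', `yellow')
@result{}
assoc(`banana')
@result{}yellow
assoc(`apple')
@result{}red
@end example
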
1973 @node Arguments
1974 @section Arguments to macros
1975
1976 @cindex macros, arguments to
1977 @cindex arguments to macros
1978 Macros can have arguments.  The @var{n}th argument is denoted by
1979 @code{$@var{n}} in the expansion text, and is replaced by the @var{n}th actual
1980 argument, when the macro is expanded.  Replacement of arguments happens
1981 before rescanning, regardless of how many nesting levels of quoting
1982 appear in the expansion.  Here is an example of a macro with
1983 two arguments.
1984
1985 @deffn Composite exch (@var{arg1}, @var{arg2})
1986 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
1987 their order.
1988 @end deffn
1989
1990 @example
1991 define(`exch', `$2, $1')
1992 @result{}
1993 exch(`arg1', `arg2')
1994 @result{}arg2, arg1
1995 @end example
1996
1997 This can be used, for example, if you like the arguments to
1998 @code{define} to be reversed.
1999
2000 @example
2001 define(`exch', `$2, $1')
2002 @result{}
2003 define(exch(``expansion text'', ``macro''))
2004 @result{}
2005 macro
2006 @result{}expansion text
2007 @end example
2008
2009 @xref{Quoting Arguments}, for an explanation of the double quotes.
2010 (You should try to improve this example so that clients of @code{exch}
2011 do not have to double quote; or @pxref{Improved exch, , Answers}).
2012
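To illustrate that argument replacement ignores quoting depth, the
following sketch (with a made-up macro @code{foo} and argument
@samp{bar}) places @code{$1} at three nesting levels.  All three
occurrences are replaced; only afterwards does rescanning strip one
level of quotes from each piece.

@example
dnl foo is a made-up macro used only for this illustration
define(`foo', `$1 `$1' ``$1''')
@result{}
foo(`bar')
@result{}bar bar `bar'
@end example
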
2013 @cindex @acronym{GNU} extensions
2014 @acronym{GNU} @code{m4} allows the number following the @samp{$} to
2015 consist of one
2016 or more digits, allowing macros to have any number of arguments.  This
2017 is not so in UNIX implementations of @code{m4}, which only recognize
2018 one digit.
2019 @comment FIXME - See Austin group XCU ERN 111.  POSIX says that $11 must
2020 @comment be the first argument concatenated with 1, and instead reserves
2021 @comment ${11} for implementation use.  Once this is implemented, the
2022 @comment documentation needs to reflect how these extended arguments
2023 @comment are handled, as well as backwards compatibility issues with
2024 @comment 1.4.x.  Also, consider adding further extensions such as
2025 @comment ${1-default}, which expands to `default' if $1 is empty.
2026
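A brief sketch of the multi-digit form, using a made-up macro
@code{tenth}; it assumes the @acronym{GNU} behavior described above and
should not be relied on in portable input.

@example
dnl tenth is a made-up macro used only for this illustration
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example
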
2027 As a special case, the zeroth argument, @code{$0}, is always the name
2028 of the macro being expanded.
2029
2030 @example
2031 define(`test', ``Macro name: $0'')
2032 @result{}
2033 test
2034 @result{}Macro name: test
2035 @end example
2036
2037 If you want quoted text to appear as part of the expansion text,
2038 remember that quotes can be nested in quoted strings.  Thus, in
2039
2040 @example
2041 define(`foo', `This is macro `foo'.')
2042 @result{}
2043 foo
2044 @result{}This is macro foo.
2045 @end example
2046
2047 @noindent
2048 the @samp{foo} in the expansion text is @emph{not} expanded, since it is
2049 a quoted string, and not a name.
2050
2051 @node Pseudo Arguments
2052 @section Special arguments to macros
2053
2054 @cindex special arguments to macros
2055 @cindex macros, special arguments to
2056 @cindex arguments to macros, special
2057 There is a special notation for the number of actual arguments supplied,
2058 and for all the actual arguments.
2059
2060 The number of actual arguments in a macro call is denoted by @code{$#}
2061 in the expansion text.
2062
2063 @deffn Composite nargs (@dots{})
2064 Expands to a count of the number of arguments supplied.
2065 @end deffn
2066
2067 @example
2068 define(`nargs', `$#')
2069 @result{}
2070 nargs
2071 @result{}0
2072 nargs()
2073 @result{}1
2074 nargs(`arg1', `arg2', `arg3')
2075 @result{}3
2076 nargs(`commas can be quoted, like this')
2077 @result{}1
2078 nargs(arg1#inside comments, commas do not separate arguments
2079 still arg1)
2080 @result{}1
2081 nargs((unquoted parentheses, like this, group arguments))
2082 @result{}1
2083 @end example
2084
2085 Remember that @samp{#} defaults to the comment character; if you forget
2086 quotes to inhibit the comment behavior, your macro definition may not
2087 end where you expected.
2088
2089 @example
2090 dnl Attempt to define a macro to just `$#'
2091 define(underquoted, $#)
2092 oops)
2093 @result{}
2094 underquoted
2095 @result{}0)
2096 @result{}oops
2097 @end example
2098
2099 The notation @code{$*} can be used in the expansion text to denote all
2100 the actual arguments, unquoted, with commas in between.  For example
2101
2102 @example
2103 define(`echo', `$*')
2104 @result{}
2105 echo(arg1,    arg2, arg3 , arg4)
2106 @result{}arg1,arg2,arg3 ,arg4
2107 @end example
2108
2109 Often each argument should be quoted, and the notation @code{$@@} handles
2110 that.  It is just like @code{$*}, except that it quotes each argument.
2111 A simple example of that is:
2112
2113 @example
2114 define(`echo', `$@@')
2115 @result{}
2116 echo(arg1,    arg2, arg3 , arg4)
2117 @result{}arg1,arg2,arg3 ,arg4
2118 @end example
2119
2120 Where did the quotes go?  Of course, they were eaten when the expanded
2121 text was reread by @code{m4}.  To show the difference, try
2122
2123 @example
2124 define(`echo1', `$*')
2125 @result{}
2126 define(`echo2', `$@@')
2127 @result{}
2128 define(`foo', `This is macro `foo'.')
2129 @result{}
2130 echo1(foo)
2131 @result{}This is macro This is macro foo..
2132 echo1(`foo')
2133 @result{}This is macro foo.
2134 echo2(foo)
2135 @result{}This is macro foo.
2136 echo2(`foo')
2137 @result{}foo
2138 @end example
2139
2140 @noindent
2141 @xref{Trace}, if you do not understand this.  As another example of the
2142 difference, remember that comments encountered in arguments are passed
2143 untouched to the macro, and that quoting disables comments.
2144
2145 @example
2146 define(`echo1', `$*')
2147 @result{}
2148 define(`echo2', `$@@')
2149 @result{}
2150 define(`foo', `bar')
2151 @result{}
2152 echo1(#foo'foo
2153 foo)
2154 @result{}#foo'foo
2155 @result{}bar
2156 echo2(#foo'foo
2157 foo)
2158 @result{}#foobar
2159 @result{}bar'
2160 @end example
2161
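The distinction matters when arguments are passed on to another macro.
In this sketch, @code{nargs} is defined as earlier in this section,
while @code{via_star} and @code{via_at} are made-up wrappers: with
@code{$*}, a quoted comma inside an argument is re-exposed and splits
the argument in two, whereas @code{$@@} keeps it intact.

@example
dnl via_star and via_at are made-up wrappers for this illustration
define(`nargs', `$#')
@result{}
define(`via_star', `nargs($*)')
@result{}
define(`via_at', `nargs($@@)')
@result{}
via_star(`a,b', `c')
@result{}3
via_at(`a,b', `c')
@result{}2
@end example
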
2162 A @samp{$} sign in the expansion text, that is not followed by anything
2163 @code{m4} understands, is simply copied to the macro expansion, as any
2164 other text is.
2165
2166 @example
2167 define(`foo', `$$$ hello $$$')
2168 @result{}
2169 foo
2170 @result{}$$$ hello $$$
2171 @end example
2172
2173 @cindex rescanning
2174 @cindex literal output
2175 @cindex output, literal
2176 If you want a macro to expand to something like @samp{$12}, the
2177 judicious use of nested quoting can put a safe character between the
2178 @code{$} and the next character, relying on the rescanning to remove the
2179 nested quote.  This will prevent @code{m4} from interpreting the
2180 @code{$} sign as a reference to an argument.
2181
2182 @example
2183 define(`foo', `no nested quote: $1')
2184 @result{}
2185 foo(`arg')
2186 @result{}no nested quote: arg
2187 define(`foo', `nested quote around $: `$'1')
2188 @result{}
2189 foo(`arg')
2190 @result{}nested quote around $: $1
2191 define(`foo', `nested empty quote after $: $`'1')
2192 @result{}
2193 foo(`arg')
2194 @result{}nested empty quote after $: $1
2195 define(`foo', `nested quote around next character: $`1'')
2196 @result{}
2197 foo(`arg')
2198 @result{}nested quote around next character: $1
2199 define(`foo', `nested quote around both: `$1'')
2200 @result{}
2201 foo(`arg')
2202 @result{}nested quote around both: arg
2203 @end example
2204
2205 @node Undefine
2206 @section Deleting a macro
2207
2208 @cindex macros, how to delete
2209 @cindex deleting macros
2210 @cindex undefining macros
2211 A macro definition can be removed with @code{undefine}:
2212
2213 @deffn {Builtin (m4)} undefine (@var{name}@dots{})
2214 For each argument, remove the macro @var{name}.  The macro names must
2215 necessarily be quoted, since they will be expanded otherwise.  If an
2216 argument is not a defined macro, then the @samp{d} debug level controls
2217 whether a warning is issued (@pxref{Debugmode}).
2218
2219 The expansion of @code{undefine} is void.
2220 The macro @code{undefine} is recognized only with parameters.
2221 @end deffn
2222
2223 @example
2224 foo bar blah
2225 @result{}foo bar blah
2226 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2227 @result{}
2228 foo bar blah
2229 @result{}some other text
2230 undefine(`foo')
2231 @result{}
2232 foo bar blah
2233 @result{}foo other text
2234 undefine(`bar', `blah')
2235 @result{}
2236 foo bar blah
2237 @result{}foo bar blah
2238 @end example
2239
2240 Undefining a macro inside that macro's expansion is safe; the macro
2241 still expands to the definition that was in effect at the @samp{(}.
2242
2243 @example
2244 define(`f', ``$0':$1')
2245 @result{}
2246 f(f(f(undefine(`f')`hello world')))
2247 @result{}f:f:f:hello world
2248 f(`bye')
2249 @result{}f(bye)
2250 @end example
2251
2252 As of M4 1.6, @code{undefine} can warn if @var{name} is not a macro, by
2253 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2254 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2255 m4}).
2256
2257 @example
2258 $ @kbd{m4}
2259 undefine(`a')
2260 @error{}m4:stdin:1: warning: undefine: undefined macro 'a'
2261 @result{}
2262 debugmode(`-d')
2263 @result{}
2264 undefine(`a')
2265 @result{}
2266 @end example
2267
2268 @node Defn
2269 @section Renaming macros
2270
2271 @cindex macros, how to rename
2272 @cindex renaming macros
2273 @cindex macros, displaying definitions
2274 @cindex definitions, displaying macro
2275 It is possible to rename an already defined macro.  To do this, you need
2276 the builtin @code{defn}:
2277
2278 @deffn {Builtin (m4)} defn (@var{name}@dots{})
2279 Expands to the @emph{quoted definition} of each @var{name}.  If an
2280 argument is not a defined macro, the expansion for that argument is
2281 empty, and the @samp{d} debug level controls whether a warning is issued
2282 (@pxref{Debugmode}).
2283
2284 If @var{name} is a user-defined macro, the quoted definition is simply
2285 the quoted expansion text.  If, instead, @var{name} is a builtin, the
2286 expansion is a special token, which points to the builtin's internal
2287 definition.  This token is meaningful primarily as the second argument to
2288 @code{define} (and @code{pushdef}), and is silently converted to an
2289 empty string in many other contexts.
2290
2291 The macro @code{defn} is recognized only with parameters.
2292 @end deffn
2293
2294 Its normal use is best understood through an example, which shows how to
2295 rename @code{undefine} to @code{zap}:
2296
2297 @example
2298 define(`zap', defn(`undefine'))
2299 @result{}
2300 zap(`undefine')
2301 @result{}
2302 undefine(`zap')
2303 @result{}undefine(zap)
2304 @end example
2305
2306 In this way, @code{defn} can be used to copy macro definitions, and also
2307 definitions of builtin macros.  Even if the original macro is removed,
2308 the other name can still be used to access the definition.
2309
2310 The fact that macro definitions can be transferred also explains why you
2311 should use @code{$0}, rather than retyping a macro's name in its
2312 definition:
2313
2314 @example
2315 define(`foo', `This is `$0'')
2316 @result{}
2317 define(`bar', defn(`foo'))
2318 @result{}
2319 bar
2320 @result{}This is bar
2321 @end example
2322
2323 Macros used as string variables should be referred to through @code{defn},
2324 to avoid unwanted expansion of the text:
2325
2326 @example
2327 define(`string', `The macro dnl is very useful
2328 ')
2329 @result{}
2330 string
2331 @result{}The macro@w{ }
2332 defn(`string')
2333 @result{}The macro dnl is very useful
2334 @result{}
2335 @end example
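
The same property makes @code{defn} convenient for extending the text
held in such a macro without expanding what it already contains.  The
following is a minimal sketch; the names @code{a} and @code{items} are
hypothetical.

@example
dnl Illustrative names only.
define(`a', `A')define(`items', `a')
@result{}
define(`items', defn(`items')`, b')
@result{}
defn(`items')
@result{}a, b
items
@result{}A, b
@end example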
2336
2337 @cindex rescanning
2338 However, it is important to remember that @code{m4} rescanning is purely
2339 textual.  If an unbalanced end-quote string occurs in a macro
2340 definition, the rescan will see that embedded quote as the termination
2341 of the quoted string, and the remainder of the macro's definition will
2342 be rescanned unquoted.  Thus it is a good idea to avoid unbalanced
2343 end-quotes in macro definitions or arguments to macros.
2344
2345 @example
2346 define(`foo', a'a)
2347 @result{}
2348 define(`a', `A')
2349 @result{}
2350 define(`echo', `$@@')
2351 @result{}
2352 foo
2353 @result{}A'A
2354 defn(`foo')
2355 @result{}aA'
2356 echo(foo)
2357 @result{}AA'
2358 @end example
2359
2360 On the other hand, it is possible to exploit the fact that @code{defn}
2361 can concatenate multiple macros prior to the rescanning phase, in order
2362 to join the definitions of macros that, in isolation, have unbalanced
2363 quotes.  This is particularly useful when one has used several macros to
2364 accumulate text that M4 should rescan as a whole.  In the example below,
2365 note how the use of @code{defn} on @code{l} in isolation opens a string,
2366 which is not closed until the next line; but used on @code{l} and
2367 @code{r} together results in nested quoting.
2368
2369 @example
2370 define(`l', `<[>')define(`r', `<]>')
2371 @result{}
2372 changequote(`[', `]')
2373 @result{}
2374 defn([l])defn([r])
2375 ])
2376 @result{}<[>]defn([r])
2377 @result{})
2378 defn([l], [r])
2379 @result{}<[>][<]>
2380 @end example
2381
2382 @cindex builtins, special tokens
2383 @cindex tokens, builtin macro
2384 Using @code{defn} to generate special tokens for builtin macros will
2385 generate a warning in contexts where a macro name is expected.  But in
2386 contexts that operate on text, the builtin token is just silently
2387 converted to an empty string.  As of M4 1.6, expansion of user macros
2388 will also preserve builtin tokens.  However, any use of builtin tokens
2389 outside of the second argument to @code{define} and @code{pushdef} is
2390 generally not portable, since earlier @acronym{GNU} M4 versions, as well
2391 as other @code{m4} implementations, vary on how such tokens are treated.
2392
2393 @example
2394 $ @kbd{m4 -d}
2395 defn(`defn')
2396 @result{}
2397 define(defn(`divnum'), `cannot redefine a builtin token')
2398 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2399 @result{}
2400 divnum
2401 @result{}0
2402 len(defn(`divnum'))
2403 @result{}0
2404 define(`echo', `$@@')
2405 @result{}
2406 define(`mydivnum', shift(echo(`', defn(`divnum'))))
2407 @result{}
2408 mydivnum
2409 @result{}0
2410 define(`', `empty-$1')
2411 @result{}
2412 defn(defn(`divnum'))
2413 @error{}m4:stdin:9: warning: defn: invalid macro name ignored
2414 @result{}
2415 pushdef(defn(`divnum'), `oops')
2416 @error{}m4:stdin:10: warning: pushdef: invalid macro name ignored
2417 @result{}
2418 traceon(defn(`divnum'))
2419 @error{}m4:stdin:11: warning: traceon: invalid macro name ignored
2420 @result{}
2421 indir(defn(`divnum'), `string')
2422 @error{}m4:stdin:12: warning: indir: invalid macro name ignored
2423 @result{}
2424 indir(`', `string')
2425 @result{}empty-string
2426 traceoff(defn(`divnum'))
2427 @error{}m4:stdin:14: warning: traceoff: invalid macro name ignored
2428 @result{}
2429 popdef(defn(`divnum'))
2430 @error{}m4:stdin:15: warning: popdef: invalid macro name ignored
2431 @result{}
2432 dumpdef(defn(`divnum'))
2433 @error{}m4:stdin:16: warning: dumpdef: invalid macro name ignored
2434 @result{}
2435 undefine(defn(`divnum'))
2436 @error{}m4:stdin:17: warning: undefine: invalid macro name ignored
2437 @result{}
2438 dumpdef(`')
2439 @error{}:@tabchar{}`empty-$1'
2440 @result{}
2441 m4symbols(defn(`divnum'))
2442 @error{}m4:stdin:19: warning: m4symbols: invalid macro name ignored
2443 @result{}
2444 define(`foo', `define(`$1', $2)')dnl
2445 foo(`bar', defn(`divnum'))
2446 @result{}
2447 bar
2448 @result{}0
2449 @end example
2450
2451 As of M4 1.6, @code{defn} can warn if @var{name} is not a macro, by
2452 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2453 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2454 m4}).  Also, @code{defn} with multiple arguments can join text with
2455 builtin tokens.  However, when defining a macro via @code{define} or
2456 @code{pushdef}, a warning is issued and the builtin token ignored if the
2457 builtin token does not occur in isolation.  A future version of
2458 @acronym{GNU} M4 may lift this restriction.
2459
2460 @example
2461 $ @kbd{m4 -d}
2462 defn(`foo')
2463 @error{}m4:stdin:1: warning: defn: undefined macro 'foo'
2464 @result{}
2465 debugmode(`-d')
2466 @result{}
2467 defn(`foo')
2468 @result{}
2469 define(`a', `A')define(`AA', `b')
2470 @result{}
2471 traceon(`defn', `define')
2472 @result{}
2473 defn(`a', `divnum', `a')
2474 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A''
2475 @result{}AA
2476 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2477 @error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
2478 @error{}m4:stdin:7: warning: define: cannot concatenate builtins
2479 @error{}m4trace: -1- define(`mydivnum', `<divnum><divnum>') -> `'
2480 @result{}
2481 traceoff(`defn', `define')dumpdef(`mydivnum')
2482 @error{}mydivnum:@tabchar{}`'
2483 @result{}
2484 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
2485 @error{}m4:stdin:9: warning: define: cannot concatenate builtins
2486 @result{}
2487 define(`mydivnum', defn(`divnum')`a')mydivnum
2488 @error{}m4:stdin:10: warning: define: cannot concatenate builtins
2489 @result{}A
2490 define(`mydivnum', `a'defn(`divnum'))mydivnum
2491 @error{}m4:stdin:11: warning: define: cannot concatenate builtins
2492 @result{}A
2493 define(`q', ``$@@'')
2494 @result{}
2495 define(`foo', q(`a', defn(`divnum')))foo
2496 @error{}m4:stdin:13: warning: define: cannot concatenate builtins
2497 @result{}a,
2498 ifdef(`foo', `yes', `no')
2499 @result{}yes
2500 @end example
2501
2502 @node Pushdef
2503 @section Temporarily redefining macros
2504
2505 @cindex macros, temporary redefinition of
2506 @cindex temporary redefinition of macros
2507 @cindex redefinition of macros, temporary
2508 @cindex definition stack
2509 @cindex pushdef stack
2510 @cindex stack, macro definition
2511 It is possible to redefine a macro temporarily, reverting to the
2512 previous definition at a later time.  This is done with the builtins
2513 @code{pushdef} and @code{popdef}:
2514
2515 @deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion})
2516 @deffnx {Builtin (m4)} popdef (@var{name}@dots{})
2517 Analogous to @code{define} and @code{undefine}.
2518
2519 These macros work in a stack-like fashion.  A macro is temporarily
2520 redefined with @code{pushdef}, which replaces an existing definition of
2521 @var{name}, while saving the previous definition, before the new one is
2522 installed.  If there is no previous definition, @code{pushdef} behaves
2523 exactly like @code{define}.
2524
2525 If a macro has several definitions (of which only one is accessible),
2526 the topmost definition can be removed with @code{popdef}.  If there is
2527 no previous definition, @code{popdef} behaves like @code{undefine}, and
2528 if there is no definition at all, the @samp{d} debug level controls
2529 whether a warning is issued (@pxref{Debugmode}).
2530
2531 The expansion of both @code{pushdef} and @code{popdef} is void.
2532 The macros @code{pushdef} and @code{popdef} are recognized only with
2533 parameters.
2534 @end deffn
2535
2536 @example
2537 define(`foo', `Expansion one.')
2538 @result{}
2539 foo
2540 @result{}Expansion one.
2541 pushdef(`foo', `Expansion two.')
2542 @result{}
2543 foo
2544 @result{}Expansion two.
2545 pushdef(`foo', `Expansion three.')
2546 @result{}
2547 pushdef(`foo', `Expansion four.')
2548 @result{}
2549 popdef(`foo')
2550 @result{}
2551 foo
2552 @result{}Expansion three.
2553 popdef(`foo', `foo')
2554 @result{}
2555 foo
2556 @result{}Expansion one.
2557 popdef(`foo')
2558 @result{}
2559 foo
2560 @result{}foo
2561 @end example
2562
2563 If a macro with several definitions is redefined with @code{define}, the
2564 topmost definition is @emph{replaced} with the new definition.  If it is
2565 removed with @code{undefine}, @emph{all} the definitions are removed,
2566 and not only the topmost one.  However, @acronym{POSIX} allows other
2567 implementations that treat @code{define} as replacing an entire stack
2568 of definitions with a single new definition, so to be portable to other
2569 implementations, it may be worth explicitly using @code{popdef} and
2570 @code{pushdef} rather than relying on the @acronym{GNU} behavior of
2571 @code{define}.
2572
2573 @example
2574 define(`foo', `Expansion one.')
2575 @result{}
2576 foo
2577 @result{}Expansion one.
2578 pushdef(`foo', `Expansion two.')
2579 @result{}
2580 foo
2581 @result{}Expansion two.
2582 define(`foo', `Second expansion two.')
2583 @result{}
2584 foo
2585 @result{}Second expansion two.
2586 undefine(`foo')
2587 @result{}
2588 foo
2589 @result{}foo
2590 @end example
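
One portable way to replace only the topmost definition, sketched here
with the same illustrative macro @code{foo}, is to pop it and push the
replacement explicitly.  This does not depend on how @code{define}
treats the rest of the stack.

@example
dnl Illustrative names only.
define(`foo', `Expansion one.')
@result{}
pushdef(`foo', `Expansion two.')
@result{}
popdef(`foo')pushdef(`foo', `Second expansion two.')
@result{}
foo
@result{}Second expansion two.
popdef(`foo')
@result{}
foo
@result{}Expansion one.
@end example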
2591
2592 @cindex local variables
2593 @cindex variables, local
2594 Local variables within macros are made with @code{pushdef} and
2595 @code{popdef}.  At the start of the macro a new definition is pushed;
2596 within the macro it is manipulated; and at the end it is popped,
2597 revealing the former definition.
2598
2599 It is possible to temporarily redefine a builtin with @code{pushdef}
2600 and @code{defn}.
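
As a sketch of the local variable idiom (the names @code{tmp} and
@code{greet} are hypothetical), the macro below pushes a private
definition of @code{tmp}, uses it, and pops it again, so that the
caller's definition of @code{tmp} is untouched afterwards.

@example
dnl Illustrative names only.
define(`tmp', `outer')
@result{}
define(`greet', `pushdef(`tmp', `Hello, $1')tmp`'popdef(`tmp')')
@result{}
greet(`world')
@result{}Hello, world
tmp
@result{}outer
@end example

A builtin can be wrapped temporarily in a similar way.  In this sketch,
which is only one possible arrangement, @code{defn} keeps the original
@code{len} reachable under the hypothetical name @code{orig_len} while
a text macro is pushed on top of it.

@example
dnl Illustrative names only.
define(`orig_len', defn(`len'))
@result{}
pushdef(`len', `<orig_len($@@)>')
@result{}
len(`abc')
@result{}<3>
popdef(`len')
@result{}
len(`abc')
@result{}3
@end example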
2601
2602 As of M4 1.6, @code{popdef} can warn if @var{name} is not a macro, by
2603 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2604 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2605 m4}).
2606
2607 @example
2608 define(`a', `1')
2609 @result{}
2610 popdef
2611 @result{}popdef
2612 popdef(`a', `a')
2613 @error{}m4:stdin:3: warning: popdef: undefined macro 'a'
2614 @result{}
2615 debugmode(`-d')
2616 @result{}
2617 popdef(`a')
2618 @result{}
2619 @end example
2620
2621 @node Renamesyms
2622 @section Renaming macros with regular expressions
2623
2624 @cindex regular expressions
2625 @cindex macros, how to rename
2626 @cindex renaming macros
2627 @cindex @acronym{GNU} extensions
2628 Sometimes it is desirable to rename multiple symbols without having to
2629 use a long sequence of calls to @code{define}.  The @code{renamesyms}
2630 builtin allows this:
2631
2632 @deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @
2633   @ovar{resyntax})
2634 Global renaming of macros is done by @code{renamesyms}, which selects
2635 all macros with names that match @var{regexp}, and renames each match
2636 according to @var{replacement}.  It is unspecified what happens if the
2637 rename causes multiple macros to map to the same name.
2638 @comment FIXME - right now, collisions cause a core dump on some platforms:
2639 @comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa')
2640
2641 If @var{resyntax} is given, the particular flavor of regular
2642 expression understood with respect to @var{regexp} can be changed from
2643 the current default.  @xref{Changeresyntax}, for details of the values
2644 that can be given for this argument.
2645
2646 A macro that does not have a name that matches @var{regexp} is left
2647 with its original name.  If only part of the name matches, any part of
2648 the name that is not covered by @var{regexp} is copied to the
2649 replacement name.  Whenever a match is found in the name, the search
2650 proceeds from the end of the match, so no character in the original
2651 name can be substituted twice.  If @var{regexp} matches a string of
2652 zero length, the start position for the continued search is
2653 incremented to avoid infinite loops.
2654
2655 Where a replacement is to be made, @var{replacement} replaces the
2656 matched text in the original name, with @samp{\@var{n}} substituted by
2657 the text matched by the @var{n}th parenthesized sub-expression of
2658 @var{regexp}, and @samp{\&} being the text matched by the entire
2659 regular expression.
2660
2661 The expansion of @code{renamesyms} is void.
2662 The macro @code{renamesyms} is recognized only with parameters.
2663 This macro was added in M4 2.0.
2664 @end deffn
2665
2666 The following example starts with a rename similar to the
2667 @option{--prefix-builtins} option (or @option{-P}), prefixing every
2668 macro with @code{m4_}.  However, note that @option{-P} only renames M4
2669 builtin macros, even if other macros were defined previously, while
2670 @code{renamesyms} will rename any macros that match when it runs,
2671 including text macros.  The rest of the example demonstrates the
2672 behavior of unanchored regular expressions in symbol renaming.
2673
2674 @comment options: -Dfoo=bar -P
2675 @example
2676 $ @kbd{m4 -Dfoo=bar -P}
2677 foo
2678 @result{}bar
2679 m4_foo
2680 @result{}m4_foo
2681 m4_defn(`foo')
2682 @result{}bar
2683 @end example
2684
2685 @example
2686 $ @kbd{m4}
2687 define(`foo', `bar')
2688 @result{}
2689 renamesyms(`^.*$', `m4_\&')
2690 @result{}
2691 foo
2692 @result{}foo
2693 m4_foo
2694 @result{}bar
2695 m4_defn(`m4_foo')
2696 @result{}bar
2697 m4_renamesyms(`f', `g')
2698 @result{}
2699 m4_igdeg(`m4_goo', `m4_goo')
2700 @result{}bar
2701 @end example
2702
2703 If @var{resyntax} is given, @var{regexp} must be given according to
2704 the syntax chosen, though the default regular expression syntax
2705 remains unchanged for other invocations.  Here is a more realistic
2706 example that performs a similar renaming on macros, except that it
2707 ignores macros with names that begin with @samp{_}, and avoids creating
2708 macros with names that begin with @samp{m4_m4}.
2709
2710 @example
2711 renamesyms(`^[^_]\w*$', `m4_\&')
2712 @result{}
2713 m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED')
2714 @result{}
2715 m4_wrap(__line__
2716 )
2717 @result{}
2718 ^D
2719 @result{}3
2720 @end example
2721
2722 When a symbol has multiple definitions, thanks to @code{pushdef}, the
2723 entire stack is renamed.
2724
2725 @example
2726 pushdef(`foo', `1')pushdef(`foo', `2')
2727 @result{}
2728 renamesyms(`^foo$', `bar')
2729 @result{}
2730 bar
2731 @result{}2
2732 popdef(`bar')bar
2733 @result{}1
2734 popdef(`bar')bar
2735 @result{}bar
2736 @end example
2737
2738 @node Indir
2739 @section Indirect call of macros
2740
2741 @cindex indirect call of macros
2742 @cindex call of macros, indirect
2743 @cindex macros, indirect call of
2744 @cindex @acronym{GNU} extensions
2745 Any macro can be called indirectly with @code{indir}:
2746
2747 @deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}})
2748 Results in a call to the macro @var{name}, which is passed the rest of
2749 the arguments @var{args}.  If @var{name} is not defined, the expansion
2750 is void, and the @samp{d} debug level controls whether a warning is
2751 issued (@pxref{Debugmode}).
2752
2753 The macro @code{indir} is recognized only with parameters.
2754 @end deffn
2755
2756 This can be used to call macros with computed or ``invalid''
2757 names (@code{define} allows such names to be defined):
2758
2759 @example
2760 define(`$$internal$macro', `Internal macro (name `$0')')
2761 @result{}
2762 $$internal$macro
2763 @result{}$$internal$macro
2764 indir(`$$internal$macro')
2765 @result{}Internal macro (name $$internal$macro)
2766 @end example
2767
2768 The point here is that larger macro packages can define private macros
2769 that will not be called by accident.  They can @emph{only} be
2770 called through the builtin @code{indir}.
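
A computed name works the same way.  In the following sketch, where
the @code{pkg} names are hypothetical, a public macro builds the name
of a private handler from its first argument and calls it through
@code{indir}.  Typing the private name directly does not invoke it.

@example
dnl Illustrative names only.
define(`pkg:on_start', `starting $1')define(`pkg:on_stop', `stopping $1')
@result{}
define(`pkg_event', `indir(`pkg:on_$1', `$2')')
@result{}
pkg_event(`start', `server')
@result{}starting server
pkg_event(`stop', `server')
@result{}stopping server
pkg:on_start(`server')
@result{}pkg:on_start(server)
@end example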
2771
2772 One other point to observe is that argument collection occurs before
2773 @code{indir} invokes @var{name}, so if argument collection changes the
2774 value of @var{name}, that will be reflected in the final expansion.
2775 This is different from the behavior when invoking macros directly,
2776 where the definition that was in effect before argument collection is
2777 used.
2778
2779 @example
2780 $ @kbd{m4 -d}
2781 define(`f', `1')
2782 @result{}
2783 f(define(`f', `2'))
2784 @result{}1
2785 indir(`f', define(`f', `3'))
2786 @result{}3
2787 indir(`f', undefine(`f'))
2788 @error{}m4:stdin:4: warning: indir: undefined macro 'f'
2789 @result{}
2790 debugmode(`-d')
2791 @result{}
2792 indir(`f')
2793 @result{}
2794 @end example
2795
2796 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2797 arguments, @code{indir} defers to the invoked @var{name} for whether a
2798 token representing a builtin is recognized or flattened to the empty
2799 string.
2800
2801 @example
2802 $ @kbd{m4 -d}
2803 indir(defn(`defn'), `divnum')
2804 @error{}m4:stdin:1: warning: indir: invalid macro name ignored
2805 @result{}
2806 indir(`define', defn(`defn'), `divnum')
2807 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2808 @result{}
2809 indir(`define', `foo', defn(`divnum'))
2810 @result{}
2811 foo
2812 @result{}0
2813 indir(`divert', defn(`foo'))
2814 @error{}m4:stdin:5: warning: divert: empty string treated as 0
2815 @result{}
2816 @end example
2817
2818 Warning messages issued on behalf of an indirect macro use an
2819 unambiguous representation of the macro name, using escape sequences
2820 similar to C strings, and with colons also quoted.
2821
2822 @example
2823 define(`%%:\
2824 odd', defn(`divnum'))
2825 @result{}
2826 indir(`%%:\
2827 odd', `extra')
2828 @error{}m4:stdin:3: warning: %%\:\\\nodd: extra arguments ignored: 1 > 0
2829 @result{}0
2830 @end example
2831
2832 @node Builtin
2833 @section Indirect call of builtins
2834
2835 @cindex indirect call of builtins
2836 @cindex call of builtins, indirect
2837 @cindex builtins, indirect call of
2838 @cindex @acronym{GNU} extensions
2839 Builtin macros can be called indirectly with @code{builtin}:
2840
2841 @deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}})
2842 @deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1})
2843 Results in a call to the builtin @var{name}, which is passed the
2844 rest of the arguments @var{args}.  If @var{name} does not name a
2845 builtin, the expansion is void, and the @samp{d} debug level controls
2846 whether a warning is issued (@pxref{Debugmode}).
2847
2848 As a special case, if @var{name} is exactly the special token
2849 representing the @code{builtin} macro, as obtained by @code{defn}
2850 (@pxref{Defn}), then @var{args} must consist of a single @var{name1},
2851 and the expansion is the special token representing the builtin macro
2852 named by @var{name1}.
2853
2854 The macro @code{builtin} is recognized only with parameters.
2855 @end deffn
2856
2857 This can be used even if @var{name} has been given another definition
2858 that has covered the original, or been undefined so that no macro
2859 maps to the builtin.
2860
2861 @example
2862 pushdef(`define', `hidden')
2863 @result{}
2864 undefine(`undefine')
2865 @result{}
2866 define(`foo', `bar')
2867 @result{}hidden
2868 foo
2869 @result{}foo
2870 builtin(`define', `foo', defn(`divnum'))
2871 @result{}
2872 foo
2873 @result{}0
2874 builtin(`define', `foo', `BAR')
2875 @result{}
2876 foo
2877 @result{}BAR
2878 undefine(`foo')
2879 @result{}undefine(foo)
2880 foo
2881 @result{}BAR
2882 builtin(`undefine', `foo')
2883 @result{}
2884 foo
2885 @result{}foo
2886 @end example
2887
2888 The @var{name} argument only matches the original name of the builtin,
2889 even when the @option{--prefix-builtins} option (or @option{-P},
2890 @pxref{Operation modes, , Invoking m4}) is in effect.  This is different
2891 from @code{indir}, which only tracks current macro names.
2892
2893 @comment options: -P
2894 @example
2895 $ @kbd{m4 -P}
2896 m4_builtin(`divnum')
2897 @result{}0
2898 m4_builtin(`m4_divnum')
2899 @error{}m4:stdin:2: warning: m4_builtin: undefined builtin 'm4_divnum'
2900 @result{}
2901 m4_indir(`divnum')
2902 @error{}m4:stdin:3: warning: m4_indir: undefined macro 'divnum'
2903 @result{}
2904 m4_indir(`m4_divnum')
2905 @result{}0
2906 m4_debugmode(`-d')
2907 @result{}
2908 m4_builtin(`m4_divnum')
2909 @result{}
2910 @end example
2911
2912 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2913 without arguments, even when they normally require parameters to be
2914 recognized; but it will provoke a warning, and the expansion will behave
2915 as though empty strings had been passed as the required arguments.
2916
2917 @example
2918 builtin
2919 @result{}builtin
2920 builtin()
2921 @error{}m4:stdin:2: warning: builtin: undefined builtin ''
2922 @result{}
2923 builtin(`builtin')
2924 @error{}m4:stdin:3: warning: builtin: too few arguments: 0 < 1
2925 @result{}
2926 builtin(`builtin',)
2927 @error{}m4:stdin:4: warning: builtin: undefined builtin ''
2928 @result{}
2929 builtin(`builtin', ``'
2930 ')
2931 @error{}m4:stdin:5: warning: builtin: undefined builtin '`\'\n'
2932 @result{}
2933 indir(`index')
2934 @error{}m4:stdin:7: warning: index: too few arguments: 0 < 2
2935 @result{}0
2936 @end example
2937
2938 Normally, once a builtin macro is undefined, the only way to retrieve
2939 its functionality is by defining a new macro that expands to
2940 @code{builtin} under the hood.  But this extra layer of expansion is
2941 slightly inefficient, not to mention the fact that it is not robust to
2942 changes in the current quoting scheme due to @code{changequote}
2943 (@pxref{Changequote}).  On the other hand, defining a macro to the
2944 special token produced by @code{defn} (@pxref{Defn}) is very efficient,
2945 and avoids the need for quoting within the macro definition; but
2946 @code{defn} only works if the desired macro is already defined by some
2947 other name.  So @code{builtin} provides a special case where it is
2948 possible to retrieve the same special token representing a builtin as
2949 what @code{defn} would provide, were the desired macro still defined.
2950 This feature is activated by passing @code{defn(`builtin')} as the first
2951 argument to @code{builtin}.  Normally, passing a special token representing a
2952 macro as @var{name} results in a warning and an empty expansion, but in
2953 this case, if the second argument @var{name1} names a valid builtin,
2954 there is no warning and the expansion is the appropriate special
2955 token.  In fact, with just the @code{builtin} macro accessible, it is
2956 possible to reconstitute the entire startup state of @code{m4}.
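
As a minimal sketch of that recovery, assuming the special case behaves
as described above, the following restores a single builtin after it
has been undefined.  The choice of @code{len} is only illustrative.

@example
dnl Restores one builtin; the choice of len is illustrative.
undefine(`len')
@result{}
len(`abc')
@result{}len(abc)
define(`len', builtin(defn(`builtin'), `len'))
@result{}
len(`abc')
@result{}3
@end example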
2957
2958 In the example below, compare the number of macro invocations performed
2959 by @code{defn1} and @code{defn2}, and the differences once quoting is
2960 changed.
2961
2962 @example
2963 $ @kbd{m4 -d}
2964 undefine(`defn')
2965 @result{}
2966 define(`foo', `bar')
2967 @result{}
2968 define(`defn1', `builtin(`defn', $@@)')
2969 @result{}
2970 define(`defn2', builtin(builtin(`defn', `builtin'), `defn'))
2971 @result{}
2972 dumpdef(`defn1', `defn2')
2973 @error{}defn1:@tabchar{}`builtin(`defn', $@@)'
2974 @error{}defn2:@tabchar{}<defn>
2975 @result{}
2976 traceon
2977 @result{}
2978 defn1(`foo')
2979 @error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')'
2980 @error{}m4trace: -1- builtin(`defn', `foo') -> ``bar''
2981 @result{}bar
2982 defn2(`foo')
2983 @error{}m4trace: -1- defn2(`foo') -> ``bar''
2984 @result{}bar
2985 traceoff
2986 @error{}m4trace: -1- traceoff -> `'
2987 @result{}
2988 changequote(`[', `]')
2989 @result{}
2990 defn1([foo])
2991 @error{}m4:stdin:11: warning: builtin: undefined builtin '`defn\''
2992 @result{}
2993 defn2([foo])
2994 @result{}bar
2995 define([defn1], [builtin([defn], $@@)])
2996 @result{}
2997 defn1([foo])
2998 @result{}bar
2999 changequote
3000 @result{}
3001 defn1(`foo')
3002 @error{}m4:stdin:16: warning: builtin: undefined builtin '[defn]'
3003 @result{}
3004 @end example
3005
3006 @node M4symbols
3007 @section Getting the defined macro names
3008
3009 @cindex macro names, listing
3010 @cindex listing macro names
3011 @cindex currently defined macros
3012 @cindex @acronym{GNU} extensions
3013 The names of the currently defined macros can be accessed by
3014 @code{m4symbols}:
3015
3016 @deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}})
3017 Without arguments, @code{m4symbols} expands to a sorted list of quoted
3018 strings, separated by commas.  This contrasts with @code{dumpdef}
3019 (@pxref{Dumpdef}), whose output cannot be accessed by @code{m4}
3020 programs.
3021
3022 When given arguments, @code{m4symbols} returns the sorted subset of the
3023 @var{names} currently defined, and silently ignores the rest.
3024 This macro was added in M4 2.0.
3025 @end deffn
3026
3027 @example
3028 m4symbols(`ifndef', `ifdef', `define', `undef')
3029 @result{}define,ifdef
3030 @end example
3031
3032 @node Conditionals
3033 @chapter Conditionals, loops, and recursion
3034
3035 Macros, expanding to plain text, perhaps with arguments, are not quite
3036 enough.  We would like to have macros expand to different things, based
3037 on decisions taken at run-time.  For that, we need some kind of conditionals.
3038 Also, we would like to have some kind of loop construct, so we could do
3039 something a number of times, or while some condition is true.
3040
3041 @menu
3042 * Ifdef::                       Testing if a macro is defined
3043 * Ifelse::                      If-else construct, or multibranch
3044 * Shift::                       Recursion in @code{m4}
3045 * Forloop::                     Iteration by counting
3046 * Foreach::                     Iteration by list contents
3047 * Stacks::                      Working with definition stacks
3048 * Composition::                 Building macros with macros
3049 @end menu
3050
3051 @node Ifdef
3052 @section Testing if a macro is defined
3053
3054 @cindex conditionals
3055 There are two different builtin conditionals in @code{m4}.  The first is
3056 @code{ifdef}:
3057
3058 @deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2})
3059 If @var{name} is defined as a macro, @code{ifdef} expands to
3060 @var{string-1}, otherwise to @var{string-2}.  If @var{string-2} is
3061 omitted, it is taken to be the empty string (according to the normal
3062 rules).
3063
3064 The macro @code{ifdef} is recognized only with parameters.
3065 @end deffn
3066
3067 @example
3068 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3069 @result{}foo is not defined
3070 define(`foo', `')
3071 @result{}
3072 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3073 @result{}foo is defined
3074 ifdef(`no_such_macro', `yes', `no', `extra argument')
3075 @error{}m4:stdin:4: warning: ifdef: extra arguments ignored: 4 > 3
3076 @result{}no
3077 @end example
3078
3079 As of M4 1.6, @code{ifdef} transparently handles builtin tokens
3080 generated by @code{defn} (@pxref{Defn}) that occur in either
3081 @var{string}, although a warning is issued for invalid macro names.
3082
3083 @example
3084 define(`', `empty')
3085 @result{}
3086 ifdef(defn(`defn'), `yes', `no')
3087 @error{}m4:stdin:2: warning: ifdef: invalid macro name ignored
3088 @result{}no
3089 define(`foo', ifdef(`divnum', defn(`divnum'), `undefined'))
3090 @result{}
3091 foo
3092 @result{}0
3093 @end example
3094
3095 @node Ifelse
3096 @section If-else construct, or multibranch
3097
3098 @cindex comparing strings
3099 @cindex discarding input
3100 @cindex input, discarding
3101 The other conditional, @code{ifelse}, is much more powerful.  It can be
3102 used as a way to introduce a long comment, as an if-else construct, or
3103 as a multibranch, depending on the number of arguments supplied:
3104
3105 @deffn {Builtin (m4)} ifelse (@var{comment})
3106 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
3107   @ovar{not-equal})
3108 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
3109   @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
3110 Used with only one argument, @code{ifelse} simply discards it and
3111 produces no output.
3112
3113 If called with three or four arguments, @code{ifelse} expands into
3114 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
3115 for character), otherwise it expands to @var{not-equal}.  A final fifth
3116 argument is ignored, after triggering a warning.
3117
3118 If called with six or more arguments, and @var{string-1} and
3119 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
3120 otherwise the first three arguments are discarded and the processing
3121 starts again.
3122
3123 The macro @code{ifelse} is recognized only with parameters.
3124 @end deffn
3125
3126 Using only one argument is a common @code{m4} idiom for introducing a
3127 block comment, as an alternative to repeatedly using @code{dnl}.  This
3128 special usage is recognized by @acronym{GNU} @code{m4}, so that in this
3129 case, the warning about missing arguments is never triggered.
3130
3131 @example
3132 ifelse(`some comments')
3133 @result{}
3134 ifelse(`foo', `bar')
3135 @error{}m4:stdin:2: warning: ifelse: too few arguments: 2 < 3
3136 @result{}
3137 @end example
3138
3139 Using three or four arguments provides decision points.
3140
3141 @example
3142 ifelse(`foo', `bar', `true')
3143 @result{}
3144 ifelse(`foo', `foo', `true')
3145 @result{}true
3146 define(`foo', `bar')
3147 @result{}
3148 ifelse(foo, `bar', `true', `false')
3149 @result{}true
3150 ifelse(foo, `foo', `true', `false')
3151 @result{}false
3152 @end example
3153
3154 @cindex macro, blind
3155 @cindex blind macro
3156 Notice how the first argument was used unquoted; it is common to compare
3157 the expansion of a macro with a string.  With this macro, you can now
3158 reproduce the behavior of blind builtins, where the macro is recognized
3159 only with arguments.
3160
3161 @example
3162 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
3163 @result{}
3164 foo
3165 @result{}foo
3166 foo()
3167 @result{}arguments:1
3168 foo(`a', `b', `c')
3169 @result{}arguments:3
3170 @end example
3171
3172 For an example of a way to make defining blind macros easier, see
3173 @ref{Composition}.
3174
3175 @cindex multibranches
3176 @cindex switch statement
3177 @cindex case statement
When given more than four arguments, @code{ifelse} works like a
@code{case} or @code{switch} statement in traditional programming
languages.  If @var{string-1} and @var{string-2} are equal,
@code{ifelse} expands into @var{equal-1}; otherwise, the first three
arguments are discarded and the procedure is repeated on the remaining
arguments.  This calls for an example:
3184
3185 @example
3186 ifelse(`foo', `bar', `third', `gnu', `gnats')
3187 @error{}m4:stdin:1: warning: ifelse: extra arguments ignored: 5 > 4
3188 @result{}gnu
3189 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
3190 @result{}
3191 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
3192 @result{}seventh
3193 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
3194 @error{}m4:stdin:4: warning: ifelse: extra arguments ignored: 8 > 7
3195 @result{}7
3196 @end example
3197
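The multibranch form is often wrapped in a macro, so that one value can
be compared against several candidates.  The following is a minimal
sketch using a hypothetical macro @code{color_code} (not part of
@code{m4}); the final unpaired argument serves as the default branch.

@example
define(`color_code',
`ifelse(`$1', `red', `1',
        `$1', `green', `2',
        `$1', `blue', `3',
        `0')')
@result{}
color_code(`green')
@result{}2
color_code(`purple')
@result{}0
@end example
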
3198 As of M4 1.6, @code{ifelse} transparently handles builtin tokens
3199 generated by @code{defn} (@pxref{Defn}).  Because of this, it is always
3200 safe to compare two macro definitions, without worrying whether the
3201 macro might be a builtin.
3202
3203 @example
3204 ifelse(defn(`defn'), `', `yes', `no')
3205 @result{}no
3206 ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
3207 @result{}no
3208 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
3209 @result{}yes
3210 define(`foo', ifelse(`', `', defn(`divnum')))
3211 @result{}
3212 foo
3213 @result{}0
3214 @end example
3215
3216 Naturally, the normal case will be slightly more advanced than these
3217 examples.  A common use of @code{ifelse} is in macros implementing loops
3218 of various kinds.
3219
3220 @node Shift
3221 @section Recursion in @code{m4}
3222
3223 @cindex recursive macros
3224 @cindex macros, recursive
3225 There is no direct support for loops in @code{m4}, but macros can be
3226 recursive.  There is no limit on the number of recursion levels, other
3227 than those enforced by your hardware and operating system.
3228
3229 @cindex loops
3230 Loops can be programmed using recursion and the conditionals described
3231 previously.
3232
3233 There is a builtin macro, @code{shift}, which can, among other things,
3234 be used for iterating through the actual arguments to a macro:
3235
3236 @deffn {Builtin (m4)} shift (@var{arg1}, @dots{})
3237 Takes any number of arguments, and expands to all its arguments except
3238 @var{arg1}, separated by commas, with each argument quoted.
3239
3240 The macro @code{shift} is recognized only with parameters.
3241 @end deffn
3242
3243 @example
3244 shift
3245 @result{}shift
3246 shift(`bar')
3247 @result{}
3248 shift(`foo', `bar', `baz')
3249 @result{}bar,baz
3250 @end example
3251
3252 An example of the use of @code{shift} is this macro:
3253
3254 @cindex reversing arguments
3255 @cindex arguments, reversing
3256 @deffn Composite reverse (@dots{})
3257 Takes any number of arguments, and reverses their order.
3258 @end deffn
3259
3260 It is implemented as:
3261
3262 @example
3263 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3264                           `reverse(shift($@@)), `$1'')')
3265 @result{}
3266 reverse
3267 @result{}
3268 reverse(`foo')
3269 @result{}foo
3270 reverse(`foo', `bar', `gnats', `and gnus')
3271 @result{}and gnus, gnats, bar, foo
3272 @end example
3273
3274 While not a very interesting macro, it does show how simple loops can be
3275 made with @code{shift}, @code{ifelse} and recursion.  It also shows
3276 that @code{shift} is usually used with @samp{$@@}.  Another example of
3277 this is an implementation of a short-circuiting conditional operator.
3278
3279 @cindex short-circuiting conditional
3280 @cindex conditional, short-circuiting
3281 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
3282   @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
3283 Similar to @code{ifelse}, where an equal comparison between the first
3284 two strings results in the third, otherwise the first three arguments
3285 are discarded and the process repeats.  The difference is that each
3286 @var{test-<n>} is expanded only when it is encountered.  This means that
each @var{test-<n>} argument to @code{cond} (the first, fourth, seventh,
and so on) is normally given one more level of quoting than the
corresponding argument to @code{ifelse}.
3289 @end deffn
3290
3291 Here is the implementation of @code{cond}, along with a demonstration of
3292 how it can short-circuit the side effects in @code{side}.  Notice how
3293 all the unquoted side effects happen regardless of how many comparisons
3294 are made with @code{ifelse}, compared with only the relevant effects
3295 with @code{cond}.
3296
3297 @example
3298 define(`cond',
3299 `ifelse(`$#', `1', `$1',
3300         `ifelse($1, `$2', `$3',
3301                 `$0(shift(shift(shift($@@))))')')')dnl
3302 define(`side', `define(`counter', incr(counter))$1')dnl
3303 define(`example1',
3304 `define(`counter', `0')dnl
3305 ifelse(side(`$1'), `yes', `one comparison: ',
3306        side(`$1'), `no', `two comparisons: ',
3307        side(`$1'), `maybe', `three comparisons: ',
3308        `side(`default answer: ')')counter')dnl
3309 define(`example2',
3310 `define(`counter', `0')dnl
3311 cond(`side(`$1')', `yes', `one comparison: ',
3312      `side(`$1')', `no', `two comparisons: ',
3313      `side(`$1')', `maybe', `three comparisons: ',
3314      `side(`default answer: ')')counter')dnl
3315 example1(`yes')
3316 @result{}one comparison: 3
3317 example1(`no')
3318 @result{}two comparisons: 3
3319 example1(`maybe')
3320 @result{}three comparisons: 3
3321 example1(`feeling rather indecisive today')
3322 @result{}default answer: 4
3323 example2(`yes')
3324 @result{}one comparison: 1
3325 example2(`no')
3326 @result{}two comparisons: 2
3327 example2(`maybe')
3328 @result{}three comparisons: 3
3329 example2(`feeling rather indecisive today')
3330 @result{}default answer: 4
3331 @end example
3332
3333 @cindex joining arguments
3334 @cindex arguments, joining
3335 @cindex concatenating arguments
3336 Another common task that requires iteration is joining a list of
3337 arguments into a single string.
3338
3339 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3340 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3341 Generate a single-quoted string, consisting of each @var{arg} separated
3342 by @var{separator}.  While @code{joinall} always outputs a
3343 @var{separator} between arguments, @code{join} avoids the
3344 @var{separator} for an empty @var{arg}.
3345 @end deffn
3346
3347 Here are some examples of its usage, based on the implementation
3348 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3349 package:
3350
3351 @comment examples
3352 @example
3353 $ @kbd{m4 -I examples}
3354 include(`join.m4')
3355 @result{}
3356 join,join(`-'),join(`-', `'),join(`-', `', `')
3357 @result{},,,
3358 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3359 @result{},,,-
3360 join(`-', `1')
3361 @result{}1
3362 join(`-', `1', `2', `3')
3363 @result{}1-2-3
3364 join(`', `1', `2', `3')
3365 @result{}123
3366 join(`-', `', `1', `', `', `2', `')
3367 @result{}1-2
3368 joinall(`-', `', `1', `', `', `2', `')
3369 @result{}-1---2-
3370 join(`,', `1', `2', `3')
3371 @result{}1,2,3
3372 define(`nargs', `$#')dnl
3373 nargs(join(`,', `1', `2', `3'))
3374 @result{}1
3375 @end example
3376
3377 Examining the implementation shows some interesting points about several
3378 m4 programming idioms.
3379
3380 @comment examples
3381 @example
3382 $ @kbd{m4 -I examples}
3383 undivert(`join.m4')dnl
3384 @result{}divert(`-1')
3385 @result{}# join(sep, args) - join each non-empty ARG into a single
3386 @result{}# string, with each element separated by SEP
3387 @result{}define(`join',
3388 @result{}`ifelse(`$#', `2', ``$2'',
3389 @result{}  `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3390 @result{}define(`_join',
3391 @result{}`ifelse(`$#$2', `2', `',
3392 @result{}  `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3393 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3394 @result{}# into a single string, with each element separated by SEP
3395 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3396 @result{}define(`_joinall',
3397 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3398 @result{}divert`'dnl
3399 @end example
3400
3401 First, notice that this implementation creates helper macros
3402 @code{_join} and @code{_joinall}.  This division of labor makes it
3403 easier to output the correct number of @var{separator} instances:
3404 @code{join} and @code{joinall} are responsible for the first argument,
3405 without a separator, while @code{_join} and @code{_joinall} are
3406 responsible for all remaining arguments, always outputting a separator
3407 when outputting an argument.
3408
Next, observe how @code{join} decides whether to iterate to itself,
because the first @var{arg} was empty, or to output the argument and
switch over to @code{_join}.  If the argument is non-empty, then the nested
3412 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3413 with the @samp{$0} to form the next macro name to invoke.  The
3414 @code{joinall} implementation is simpler since it does not have to
3415 suppress empty @var{arg}; it always executes once then defers to
3416 @code{_joinall}.
3417
3418 Another important idiom is the idea that @var{separator} is reused for
3419 each iteration.  Each iteration has one less argument, but rather than
3420 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3421 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
3422
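To see the same idiom in isolation, here is a minimal sketch using a
hypothetical macro @code{prefix_each} (not part of the distributed
examples).  It keeps @samp{$1} on every iteration while consuming one of
the remaining arguments each time.

@example
define(`prefix_each',
`ifelse(`$#', `1', `', `$#', `2', ``$1$2'',
  ``$1$2' $0(`$1', shift(shift($@@)))')')
@result{}
prefix_each(`-I', `src', `lib', `test')
@result{}-Isrc -Ilib -Itest
@end example
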
3423 Next, notice that it is possible to compare more than one condition in a
3424 single @code{ifelse} test.  The test of @samp{$#$2} against @samp{2}
3425 allows @code{_join} to iterate for two separate reasons---either there
3426 are still more than two arguments, or there are exactly two arguments
3427 but the last argument is not empty.
3428
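The combined test can be sketched on its own with a hypothetical macro
@code{check} (an illustration only, not part of @file{join.m4}).  The
concatenation @samp{$#$2} equals @samp{2} only when there are exactly
two arguments and the second one is empty.

@example
define(`check', `ifelse(`$#$2', `2', `stop', `continue')')
@result{}
check(`-', `')
@result{}stop
check(`-', `x')
@result{}continue
check(`-', `', `x')
@result{}continue
@end example
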
3429 Finally, notice that these macros require exactly two arguments to
3430 terminate recursion, but that they still correctly result in empty
3431 output when given no @var{args} (i.e., zero or one macro argument).  On
3432 the first pass when there are too few arguments, the @code{shift}
3433 results in no output, but leaves an empty string to serve as the
3434 required second argument for the second pass.  Put another way,
3435 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3436 former guarantees at least two arguments.
3437
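The difference is easy to observe by counting arguments.  In this
sketch, @code{nargs}, @code{plain} and @code{padded} are hypothetical
helper names chosen for illustration.  Note that @samp{nargs()} counts
one empty argument, which is why @code{plain} reports @samp{1} even when
invoked without arguments.

@example
define(`nargs', `$#')dnl
define(`plain', `nargs($@@)')dnl
define(`padded', `nargs(`$1', shift($@@))')dnl
plain, plain(`a'), plain(`a', `b')
@result{}1, 1, 2
padded, padded(`a'), padded(`a', `b')
@result{}2, 2, 2
@end example
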
3438 @cindex quote manipulation
3439 @cindex manipulating quotes
3440 Sometimes, a recursive algorithm requires adding quotes to each element,
3441 or treating multiple arguments as a single element:
3442
3443 @deffn Composite quote (@dots{})
3444 @deffnx Composite dquote (@dots{})
3445 @deffnx Composite dquote_elt (@dots{})
3446 Takes any number of arguments, and adds quoting.  With @code{quote},
3447 only one level of quoting is added, effectively removing whitespace
3448 after commas and turning multiple arguments into a single string.  With
3449 @code{dquote}, two levels of quoting are added, one around each element,
3450 and one around the list.  And with @code{dquote_elt}, two levels of
3451 quoting are added around each element.
3452 @end deffn
3453
3454 An actual implementation of these three macros is distributed as
3455 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package.  First,
3456 let's examine their usage:
3457
3458 @comment examples
3459 @example
3460 $ @kbd{m4 -I examples}
3461 include(`quote.m4')
3462 @result{}
3463 -quote-dquote-dquote_elt-
3464 @result{}----
3465 -quote()-dquote()-dquote_elt()-
3466 @result{}--`'-`'-
3467 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3468 @result{}-1-`1'-`1'-
3469 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3470 @result{}-1,2-`1',`2'-`1',`2'-
3471 define(`n', `$#')dnl
3472 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3473 @result{}-1-1-2-
3474 dquote(dquote_elt(`1', `2'))
3475 @result{}``1'',``2''
3476 dquote_elt(dquote(`1', `2'))
3477 @result{}``1',`2''
3478 @end example
3479
3480 The last two lines show that when given two arguments, @code{dquote}
3481 results in one string, while @code{dquote_elt} results in two.  Now,
3482 examine the implementation.  Note that @code{quote} and
3483 @code{dquote_elt} make decisions based on their number of arguments, so
3484 that when called without arguments, they result in nothing instead of a
3485 quoted empty string; this is so that it is possible to distinguish
3486 between no arguments and an empty first argument.  @code{dquote}, on the
3487 other hand, results in a string no matter what, since it is still
3488 possible to tell whether it was invoked without arguments based on the
3489 resulting string.
3490
3491 @comment examples
3492 @example
3493 $ @kbd{m4 -I examples}
3494 undivert(`quote.m4')dnl
3495 @result{}divert(`-1')
3496 @result{}# quote(args) - convert args to single-quoted string
3497 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3498 @result{}# dquote(args) - convert args to quoted list of quoted strings
3499 @result{}define(`dquote', ``$@@'')
3500 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3501 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3502 @result{}                             ```$1'',$0(shift($@@))')')
3503 @result{}divert`'dnl
3504 @end example
3505
3506 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3507 than @samp{joinall(`,', @var{args})} for producing the same output.
3508
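As a quick check that the two forms agree, the following sketch uses the
two example files discussed above:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`join.m4')include(`quote.m4')
@result{}
quote(`a', `b', `c')
@result{}a,b,c
joinall(`,', `a', `b', `c')
@result{}a,b,c
@end example
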
3509 @cindex nine arguments, more than
3510 @cindex more than nine arguments
3511 @cindex arguments, more than nine
3512 One more useful macro based on @code{shift} allows portably selecting
3513 an arbitrary argument (usually greater than the ninth argument), without
3514 relying on the @acronym{GNU} extension of multi-digit arguments
3515 (@pxref{Arguments}).
3516
3517 @deffn Composite argn (@var{n}, @dots{})
3518 Expands to argument @var{n} out of the remaining arguments.  @var{n}
3519 must be a positive number.  Usually invoked as
3520 @samp{argn(`@var{n}',$@@)}.
3521 @end deffn
3522
3523 It is implemented as:
3524
3525 @example
3526 define(`argn', `ifelse(`$1', 1, ``$2'',
3527   `argn(decr(`$1'), shift(shift($@@)))')')
3528 @result{}
3529 argn(`1', `a')
3530 @result{}a
3531 define(`foo', `argn(`11', $@@)')
3532 @result{}
3533 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3534 @result{}k
3535 @end example
3536
3537 @node Forloop
3538 @section Iteration by counting
3539
3540 @cindex for loops
3541 @cindex loops, counting
3542 @cindex counting loops
3543 Here is an example of a loop macro that implements a simple for loop.
3544
3545 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3546 Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive.  For each assignment to @var{iterator}, appends @var{text} to
3549 the expansion of the @code{forloop}.  @var{text} may refer to
3550 @var{iterator}.  Any definition of @var{iterator} prior to this
3551 invocation is restored.
3552 @end deffn
3553
3554 It can, for example, be used for simple counting:
3555
3556 @comment examples
3557 @example
3558 $ @kbd{m4 -I examples}
3559 include(`forloop.m4')
3560 @result{}
3561 forloop(`i', `1', `8', `i ')
3562 @result{}1 2 3 4 5 6 7 8@w{ }
3563 @end example
3564
3565 For-loops can be nested, like:
3566
3567 @comment examples
3568 @example
3569 $ @kbd{m4 -I examples}
3570 include(`forloop.m4')
3571 @result{}
3572 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3573 ')
3574 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3575 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3576 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3577 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3578 @result{}
3579 @end example
3580
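The restoration of an earlier definition of @var{iterator} can be seen
in the following sketch, where the value @samp{outer} is an arbitrary
placeholder chosen for illustration:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`i', `outer')
@result{}
forloop(`i', `1', `3', `i ')
@result{}1 2 3@w{ }
i
@result{}outer
@end example
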
3581 The implementation of the @code{forloop} macro is fairly
3582 straightforward.  The @code{forloop} macro itself is simply a wrapper,
3583 which saves the previous definition of the first argument, calls the
3584 internal macro @code{@w{_forloop}}, and re-establishes the saved
3585 definition of the first argument.
3586
3587 The macro @code{@w{_forloop}} expands the fourth argument once, and
3588 tests to see if the iterator has reached the final value.  If it has
3589 not finished, it increments the iterator (using the predefined macro
3590 @code{incr}, @pxref{Incr}), and recurses.
3591
3592 Here is an actual implementation of @code{forloop}, distributed as
3593 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3594
3595 @comment examples
3596 @example
3597 $ @kbd{m4 -I examples}
3598 undivert(`forloop.m4')dnl
3599 @result{}divert(`-1')
3600 @result{}# forloop(var, from, to, stmt) - simple version
3601 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3602 @result{}define(`_forloop',
3603 @result{}       `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3604 @result{}divert`'dnl
3605 @end example
3606
3607 Notice the careful use of quotes.  Certain macro arguments are left
3608 unquoted, each for its own reason.  Try to find out @emph{why} these
3609 arguments are left unquoted, and see what happens if they are quoted.
3610 (As presented, these two macros are useful but not very robust for
3611 general use.  They lack even basic error handling for cases like
@var{start} greater than @var{end}, @var{end} not numeric, or
3613 @var{iterator} not being a macro name.  See if you can improve these
3614 macros; or @pxref{Improved forloop, , Answers}).
3615
3616 @node Foreach
3617 @section Iteration by list contents
3618
3619 @cindex for each loops
3620 @cindex loops, list iteration
3621 @cindex iterating over lists
3622 Here is an example of a loop macro that implements list iteration.
3623
3624 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3625 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3626 Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}.  In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses.  In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string.  For each assignment to @var{iterator},
appends @var{text} to the overall expansion.  @var{text} may refer to
3633 @var{iterator}.  Any definition of @var{iterator} prior to this
3634 invocation is restored.
3635 @end deffn
3636
3637 As an example, this displays each word in a list inside of a sentence,
3638 using an implementation of @code{foreach} distributed as
3639 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3640 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3641
3642 @comment examples
3643 @example
3644 $ @kbd{m4 -I examples}
3645 include(`foreach.m4')
3646 @result{}
3647 foreach(`x', (foo, bar, foobar), `Word was: x
3648 ')dnl
3649 @result{}Word was: foo
3650 @result{}Word was: bar
3651 @result{}Word was: foobar
3652 include(`foreachq.m4')
3653 @result{}
3654 foreachq(`x', `foo, bar, foobar', `Word was: x
3655 ')dnl
3656 @result{}Word was: foo
3657 @result{}Word was: bar
3658 @result{}Word was: foobar
3659 @end example
3660
3661 It is possible to be more complex; each element of the @var{paren-list}
3662 or @var{quote-list} can itself be a list, to pass as further arguments
3663 to a helper macro.  This example generates a shell case statement:
3664
3665 @comment examples
3666 @example
3667 $ @kbd{m4 -I examples}
3668 include(`foreach.m4')
3669 @result{}
3670 define(`_case', `  $1)
3671     $2=" $1";;
3672 ')dnl
3673 define(`_cat', `$1$2')dnl
3674 case $`'1 in
3675 @result{}case $1 in
3676 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3677         `_cat(`_case', x)')dnl
3678 @result{}  a)
3679 @result{}    vara=" a";;
3680 @result{}  b)
3681 @result{}    varb=" b";;
3682 @result{}  c)
3683 @result{}    varc=" c";;
3684 esac
3685 @result{}esac
3686 @end example
3687
3688 The implementation of the @code{foreach} macro is a bit more involved;
3689 it is a wrapper around two helper macros.  First, @code{@w{_arg1}} is
3690 needed to grab the first element of a list.  Second,
3691 @code{@w{_foreach}} implements the recursion, successively walking
3692 through the original list.  Here is a simple implementation of
3693 @code{foreach}:
3694
3695 @comment examples
3696 @example
3697 $ @kbd{m4 -I examples}
3698 undivert(`foreach.m4')dnl
3699 @result{}divert(`-1')
3700 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3701 @result{}#   parenthesized list, simple version
3702 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3703 @result{}define(`_arg1', `$1')
3704 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3705 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3706 @result{}divert`'dnl
3707 @end example
3708
3709 Unfortunately, that implementation is not robust to macro names as list
elements.  Each iteration of @code{@w{_foreach}} strips another
layer of quotes, leading to erratic results if list elements are not
3712 already fully expanded.  The first cut at implementing @code{foreachq}
3713 takes this into account.  Also, when using quoted elements in a
3714 @var{paren-list}, the overall list must be quoted.  A @var{quote-list}
3715 has the nice property of requiring fewer characters to create a list
3716 containing the same quoted elements.  To see the difference between the
3717 two macros, we attempt to pass double-quoted macro names in a list,
3718 expecting the macro name on output after one layer of quotes is removed
3719 during list iteration and the final layer removed during the final
3720 rescan:
3721
3722 @comment examples
3723 @example
3724 $ @kbd{m4 -I examples}
3725 define(`a', `1')define(`b', `2')define(`c', `3')
3726 @result{}
3727 include(`foreach.m4')
3728 @result{}
3729 include(`foreachq.m4')
3730 @result{}
3731 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3732 ')
3733 @result{}1
3734 @result{}(2)1
3735 @result{}
3736 @result{}, x
3737 @result{})
3738 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3739 ')dnl
3740 @result{}a
3741 @result{}(b
3742 @result{}c)
3743 @end example
3744
3745 Obviously, @code{foreachq} did a better job; here is its implementation:
3746
3747 @comment examples
3748 @example
3749 $ @kbd{m4 -I examples}
3750 undivert(`foreachq.m4')dnl
3751 @result{}include(`quote.m4')dnl
3752 @result{}divert(`-1')
3753 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3754 @result{}#   quoted list, simple version
3755 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3756 @result{}define(`_arg1', `$1')
3757 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3758 @result{}  `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3759 @result{}divert`'dnl
3760 @end example
3761
3762 Notice that @code{@w{_foreachq}} had to use the helper macro
3763 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3764 embedded @code{ifelse} call does not go haywire if a list element
3765 contains a comma.  Unfortunately, this implementation of @code{foreachq}
3766 has its own severe flaw.  Whereas the @code{foreach} implementation was
3767 linear, this macro is quadratic in the number of list elements, and is
3768 much more likely to trip up the limit set by the command line option
3769 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3770 Invoking m4}).  Additionally, this implementation does not expand
3771 @samp{defn(`@var{iterator}')} very well, when compared with
3772 @code{foreach}.
3773
3774 @comment examples
3775 @example
3776 $ @kbd{m4 -I examples}
3777 include(`foreach.m4')include(`foreachq.m4')
3778 @result{}
3779 foreach(`name', `(`a', `b')', ` defn(`name')')
3780 @result{} a b
3781 foreachq(`name', ``a', `b'', ` defn(`name')')
3782 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3783 @end example
3784
3785 It is possible to have robust iteration with linear behavior and sane
3786 @var{iterator} contents for either list style.  See if you can learn
3787 from the best elements of both of these implementations to create robust
3788 macros (or @pxref{Improved foreach, , Answers}).
3789
3790 @node Stacks
3791 @section Working with definition stacks
3792
3793 @cindex definition stack
3794 @cindex pushdef stack
3795 @cindex stack, macro definition
3796 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3797 operation in @code{m4}.  Normally, only the topmost definition in a
3798 stack is important, but sometimes, it is desirable to manipulate the
3799 entire definition stack.
3800
3801 @deffn Composite stack_foreach (@var{macro}, @var{action})
3802 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3803 For each of the @code{pushdef} definitions associated with @var{macro},
3804 invoke the macro @var{action} with a single argument of that definition.
3805 @code{stack_foreach} visits the oldest definition first, while
3806 @code{stack_foreach_lifo} visits the current definition first.
3807 @var{action} should not modify or dereference @var{macro}.  There are a
3808 few special macros, such as @code{defn}, which cannot be used as the
3809 @var{macro} parameter.
3810 @end deffn
3811
3812 A sample implementation of these macros is distributed in the file
3813 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3814
3815 @comment examples
3816 @example
3817 $ @kbd{m4 -I examples}
3818 include(`stack.m4')
3819 @result{}
3820 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3821 @result{}
3822 define(`show', ``$1'
3823 ')
3824 @result{}
3825 stack_foreach(`a', `show')dnl
3826 @result{}1
3827 @result{}2
3828 @result{}3
3829 stack_foreach_lifo(`a', `show')dnl
3830 @result{}3
3831 @result{}2
3832 @result{}1
3833 @end example
3834
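Because @var{action} is an ordinary macro name, it can also accumulate
state rather than produce output.  As a sketch, the hypothetical macros
@code{depth} and @code{count} below compute the number of definitions on
the stack of @code{a}; the last line confirms that the stack is intact
afterwards.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack.m4')
@result{}
pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
@result{}
define(`depth', `0')define(`count', `define(`depth', incr(depth))')
@result{}
stack_foreach(`a', `count')depth
@result{}3
a
@result{}3
@end example
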
3835 Now for the implementation.  Note the definition of a helper macro,
@code{_stack_reverse}, which destructively moves the contents of one
stack of definitions, in reverse order, into the temporary macro
@samp{tmp-$1}.  By calling the helper twice, the original order is
3839 restored back into the macro @samp{$1}; since the operation is
3840 destructive, this explains why @samp{$1} must not be modified or
3841 dereferenced during the traversal.  The caller can then inject
3842 additional code to pass the definition currently being visited to
3843 @samp{$2}.  The choice of helper names is intentional; since @samp{-} is
3844 not valid as part of a macro name, there is no risk of conflict with a
3845 valid macro name, and the code is guaranteed to use @code{defn} where
3846 necessary.  Finally, note that any macro used in the traversal of a
3847 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3848 handled by @code{stack_foreach}, since the macro would temporarily be
3849 undefined during the algorithm.
3850
3851 @comment examples
3852 @example
3853 $ @kbd{m4 -I examples}
3854 undivert(`stack.m4')dnl
3855 @result{}divert(`-1')
3856 @result{}# stack_foreach(macro, action)
3857 @result{}# Invoke ACTION with a single argument of each definition
3858 @result{}# from the definition stack of MACRO, starting with the oldest.
3859 @result{}define(`stack_foreach',
3860 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3861 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3862 @result{}# stack_foreach_lifo(macro, action)
3863 @result{}# Invoke ACTION with a single argument of each definition
3864 @result{}# from the definition stack of MACRO, starting with the newest.
3865 @result{}define(`stack_foreach_lifo',
3866 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3867 @result{}`_stack_reverse(`tmp-$1', `$1')')
3868 @result{}define(`_stack_reverse',
3869 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3870 @result{}divert`'dnl
3871 @end example
3872
3873 @node Composition
3874 @section Building macros with macros
3875
3876 @cindex macro composition
3877 @cindex composing macros
3878 Since m4 is a macro language, it is possible to write macros that
3879 can build other macros.  First on the list is a way to automate the
3880 creation of blind macros.
3881
3882 @cindex macro, blind
3883 @cindex blind macro
3884 @deffn Composite define_blind (@var{name}, @ovar{value})
3885 Defines @var{name} as a blind macro, such that @var{name} will expand to
3886 @var{value} only when given explicit arguments.  @var{value} should not
3887 be the result of @code{defn} (@pxref{Defn}).  This macro is only
3888 recognized with parameters, and results in an empty string.
3889 @end deffn
3890
3891 Defining a macro to define another macro can be a bit tricky.  We want
3892 to use a literal @samp{$#} in the argument to the nested @code{define}.
3893 However, if @samp{$} and @samp{#} are adjacent in the definition of
3894 @code{define_blind}, then it would be expanded as the number of
3895 arguments to @code{define_blind} rather than the intended number of
3896 arguments to @var{name}.  The solution is to pass the difficult
3897 characters through extra arguments to a helper macro
@code{_define_blind}.  When composing macros, a common idiom is to use
a helper macro to concatenate text that forms parameters in the
composed macro, so that the text is not interpreted as a parameter of
the composing macro.
3902
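To see the problem concretely, consider a naive attempt without the
helper macro.  The name @code{bad_blind} is hypothetical and used only
for this sketch.  Here @samp{$#} and @samp{$0} are replaced while
@code{bad_blind} itself is being expanded, so the resulting macro
compares @samp{2} against @samp{0} and is never blind.

@example
define(`bad_blind',
`define(`$1', `ifelse(`$#', `0', ``$0'', `$2')')')
@result{}
bad_blind(`foo', `arguments were $*')
@result{}
foo
@result{}arguments were@w{ }
defn(`foo')
@result{}ifelse(`2', `0', ``bad_blind'', `arguments were $*')
@end example
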
3903 As for the limitation against using @code{defn}, there are two reasons.
3904 If a macro was previously defined with @code{define_blind}, then it can
3905 safely be renamed to a new blind macro using plain @code{define}; using
3906 @code{define_blind} to rename it just adds another layer of
3907 @code{ifelse}, occupying memory and slowing down execution.  And if a
3908 macro is a builtin, then it would result in an attempt to define a macro
3909 consisting of both text and a builtin token; this is not supported, and
3910 the builtin token is flattened to an empty string.
3911
3912 With that explanation, here's the definition, and some sample usage.
3913 Notice that @code{define_blind} is itself a blind macro.
3914
3915 @example
3916 $ @kbd{m4 -d}
3917 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3918 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3919 @result{}
3920 define(`_define_blind', `define(`$1',
3921 `ifelse(`$3', `0', ``$4'', `$2')')')
3922 @result{}
3923 define_blind
3924 @result{}define_blind
3925 define_blind(`foo', `arguments were $*')
3926 @result{}
3927 foo
3928 @result{}foo
3929 foo(`bar')
3930 @result{}arguments were bar
3931 define(`blah', defn(`foo'))
3932 @result{}
3933 blah
3934 @result{}blah
3935 blah(`a', `b')
3936 @result{}arguments were a,b
3937 defn(`blah')
3938 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
3939 @end example
3940
3941 @cindex currying arguments
3942 @cindex argument currying
3943 Another interesting composition tactic is argument @dfn{currying}, or
3944 factoring a macro that takes multiple arguments for use in a context
3945 that provides exactly one argument.
3946
3947 @deffn Composite curry (@var{macro}, @dots{})
3948 Expand to a macro call that takes exactly one argument, then appends
3949 that argument to the original arguments and invokes @var{macro} with the
3950 resulting list of arguments.
3951 @end deffn
3952
3953 A demonstration of currying makes the intent of this macro a little more
3954 obvious.  The macro @code{stack_foreach} mentioned earlier is an example
3955 of a context that provides exactly one argument to a macro name.  But
3956 coupled with currying, we can invoke @code{reverse} with two arguments
3957 for each definition of a macro stack.  This example uses the file
3958 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
3959 distribution.
3960
3961 @comment examples
3962 @example
3963 $ @kbd{m4 -I examples}
3964 include(`curry.m4')include(`stack.m4')
3965 @result{}
3966 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3967                           `reverse(shift($@@)), `$1'')')
3968 @result{}
3969 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3970 @result{}
3971 stack_foreach(`a', `:curry(`reverse', `4')')
3972 @result{}:1, 4:2, 4:3, 4
3973 curry(`curry', `reverse', `1')(`2')(`3')
3974 @result{}3, 2, 1
3975 @end example
3976
3977 Now for the implementation.  Notice how @code{curry} leaves off with a
3978 macro name but no open parenthesis, while still in the middle of
3979 collecting arguments for @samp{$1}.  The macro @code{_curry} is the
3980 helper macro that takes one argument, then adds it to the list and
3981 finally supplies the closing parenthesis.  The use of a comma inside the
3982 @code{shift} call allows currying to also work for a macro that takes
3983 one argument, although it often makes more sense to invoke that macro
3984 directly rather than going through @code{curry}.
3985
3986 @comment examples
3987 @example
3988 $ @kbd{m4 -I examples}
3989 undivert(`curry.m4')dnl
3990 @result{}divert(`-1')
3991 @result{}# curry(macro, args)
3992 @result{}# Expand to a macro call that takes one argument, then invoke
3993 @result{}# macro(args, extra).
3994 @result{}define(`curry', `$1(shift($@@,)_$0')
3995 @result{}define(`_curry', ``$1')')
3996 @result{}divert`'dnl
3997 @end example
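
As a further illustration (a sketch that is not part of the distributed
examples), currying can supply the leading arguments of a builtin ahead
of time, leaving only the final argument to be provided later.  In the
first call below, @code{curry} expands to an unfinished call to
@code{substr}, and @code{_curry} then appends @samp{3} as the last
argument before supplying the closing parenthesis.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`curry.m4')
@result{}
curry(`substr', `abcdef', `2')(`3')
@result{}cde
curry(`index', `hello world')(`wor')
@result{}6
@end example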
3998
3999 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
4000 tokens, which are silently flattened to the empty string when passed
4001 through another text macro.  The following example demonstrates a usage
4002 of @code{curry} that works in M4 1.6, but is not portable to earlier
4003 versions:
4004
4005 @comment examples
4006 @example
4007 $ @kbd{m4 -I examples}
4008 include(`curry.m4')
4009 @result{}
4010 curry(`define', `mylen')(defn(`len'))
4011 @result{}
4012 mylen(`abc')
4013 @result{}3
4014 @end example
4015
4016 @cindex renaming macros
4017 @cindex copying macros
4018 @cindex macros, copying
4019 Putting the last few concepts together, it is possible to copy or rename
4020 an entire stack of macro definitions.
4021
4022 @deffn Composite copy (@var{source}, @var{dest})
4023 @deffnx Composite rename (@var{source}, @var{dest})
4024 Ensure that @var{dest} is undefined, then define it to the same stack of
4025 definitions currently in @var{source}.  @code{copy} leaves @var{source}
4026 unchanged, while @code{rename} undefines @var{source}.  There are only a
4027 few macros, such as @code{copy} or @code{defn}, which cannot be copied
4028 via this macro.
4029 @end deffn
4030
4031 The implementation is relatively straightforward, although since it uses
4032 @code{curry}, it is unable to copy builtin macros when used with M4
4033 1.4.x.  See if you can design a portable version that works across all
4034 M4 versions, or @pxref{Improved copy, , Answers}.
4035
4036 @comment examples
4037 @example
4038 $ @kbd{m4 -I examples}
4039 include(`curry.m4')include(`stack.m4')
4040 @result{}
4041 define(`rename', `copy($@@)undefine(`$1')')dnl
4042 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
4043 ')m4exit(`1')',
4044    `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
4045 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
4046 @result{}
4047 copy(`a', `b')
4048 @result{}
4049 rename(`b', `c')
4050 @result{}
4051 a b c
4052 @result{}2 b 2
4053 popdef(`a', `c')a c
4054 @result{}0 0
4055 popdef(`a', `c')a c
4056 @result{}1 1
4057 @end example
4058
4059 @node Debugging
4060 @chapter How to debug macros and input
4061
4062 @cindex debugging macros
4063 @cindex macros, debugging
4064 Macros written for @code{m4} often do not work as intended on the first
4065 try (as is the case with most programming languages).
4066 Fortunately, there is support for macro debugging in @code{m4}.
4067
4068 @menu
4069 * Dumpdef::                     Displaying macro definitions
4070 * Trace::                       Tracing macro calls
4071 * Debugmode::                   Controlling debugging options
4072 * Debuglen::                    Limiting debug output
4073 * Debugfile::                   Saving debugging output
4074 @end menu
4075
4076 @node Dumpdef
4077 @section Displaying macro definitions
4078
4079 @cindex displaying macro definitions
4080 @cindex macros, displaying definitions
4081 @cindex definitions, displaying macro
4082 @cindex standard error, output to
4083 If you want to see what a name expands into, you can use the builtin
4084 @code{dumpdef}:
4085
4086 @deffn {Builtin (m4)} dumpdef (@ovar{name@dots{}})
4087 Accepts any number of arguments.  If called without any arguments, it
4088 displays the definitions of all known names; otherwise it displays the
4089 definitions of each @var{name} given, sorted by name.  If a @var{name}
4090 is undefined, the @samp{d} debug level controls whether a warning is
4091 issued (@pxref{Debugmode}).  Likewise, the @samp{o} debug level controls
4092 whether the output is issued to standard error or the current debug
4093 file (@pxref{Debugfile}).
4094
4095 The expansion of @code{dumpdef} is void.
4096 @end deffn
4097
4098 @example
4099 $ @kbd{m4 -d}
4100 define(`foo', `Hello world.')
4101 @result{}
4102 dumpdef(`foo')
4103 @error{}foo:@tabchar{}`Hello world.'
4104 @result{}
4105 dumpdef(`define')
4106 @error{}define:@tabchar{}<define>
4107 @result{}
4108 @end example
4109
4110 The last example shows how builtin macro definitions are displayed.
4111 The definition that is dumped corresponds to what would occur if the
4112 macro were to be called at that point, even if other definitions are
4113 still live due to redefining a macro during argument collection.
4114
4115 @example
4116 $ @kbd{m4 -d}
4117 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
4118 @result{}
4119 f(popdef(`f')dumpdef(`f'))
4120 @error{}f:@tabchar{}``$0'1'
4121 @result{}f2
4122 f(popdef(`f')dumpdef(`f'))
4123 @error{}m4:stdin:3: warning: dumpdef: undefined macro 'f'
4124 @result{}f1
4125 debugmode(`-d')
4126 @result{}
4127 dumpdef(`f')
4128 @result{}
4129 @end example
4130
4131 @xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and
4132 @samp{s} flags affect the details of the display.  Remember, the
4133 @samp{q} flag is implied when the @option{--debug} option (@option{-d},
4134 @pxref{Debugging options, , Invoking m4}) is used in the command line
4135 without arguments.  Also, @option{--debuglen} (@pxref{Debuglen}) can affect
4136 output, by truncating longer strings (but not builtin and module names).
4137
4138 @comment options: -ds -l3
4139 @example
4140 $ @kbd{m4 -ds -l 3}
4141 pushdef(`foo', `1 long string')
4142 @result{}
4143 pushdef(`foo', defn(`divnum'))
4144 @result{}
4145 pushdef(`foo', `3')
4146 @result{}
4147 debugmode(`+m')
4148 @result{}
4149 dumpdef(`foo', `dnl', `indir', `__gnu__')
4150 @error{}__gnu__:@tabchar{}@{gnu@}
4151 @error{}dnl:@tabchar{}<dnl>@{m4@}
4152 @error{}foo:@tabchar{}3, <divnum>@{m4@}, 1 l...
4153 @error{}indir:@tabchar{}<indir>@{gnu@}
4154 @result{}
4155 debugmode(`-ms')debugmode(`+q')
4156 @result{}
4157 dumpdef(`foo')
4158 @error{}foo:@tabchar{}`3'
4159 @result{}
4160 @end example
4161
4162 @node Trace
4163 @section Tracing macro calls
4164
4165 @cindex tracing macro expansion
4166 @cindex macro expansion, tracing
4167 @cindex expansion, tracing macro
4168 @cindex standard error, output to
4169 It is possible to trace macro calls and expansions through the builtins
4170 @code{traceon} and @code{traceoff}:
4171
4172 @deffn {Builtin (m4)} traceon (@ovar{names@dots{}})
4173 @deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}})
4174 When called without any arguments, @code{traceon} and @code{traceoff}
4175 will turn tracing on and off, respectively, for all macros, identical to
4176 using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}).
4177
4178 When called with arguments, only the macros listed in @var{names} are
4179 affected, whether or not they are currently defined.  A macro's
4180 expansion will be traced if global tracing is on, or if the individual
4181 macro tracing flag is set; to avoid tracing a macro, both the global
4182 flag and the macro must have tracing off.
4183
4184 The expansion of @code{traceon} and @code{traceoff} is void.
4185 @end deffn
4186
4187 Whenever a traced macro is called and the arguments have been collected,
4188 the call is displayed.  If the expansion of the macro call is not void,
4189 the expansion can be displayed after the call.  The output is printed
4190 to the current debug file (defaulting to standard error,
4191 @pxref{Debugfile}).
4192
4193 @example
4194 $ @kbd{m4 -d}
4195 define(`foo', `Hello World.')
4196 @result{}
4197 define(`echo', `$@@')
4198 @result{}
4199 traceon(`foo', `echo')
4200 @result{}
4201 foo
4202 @error{}m4trace: -1- foo -> `Hello World.'
4203 @result{}Hello World.
4204 echo(`gnus', `and gnats')
4205 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
4206 @result{}gnus,and gnats
4207 @end example
4208
4209 The number between dashes is the depth of the expansion.  It is one most
4210 of the time, signifying an expansion at the outermost level, but it
4211 increases when macro arguments contain unquoted macro calls.  The
4212 maximum number that will appear between dashes is controlled by the
4213 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
4214 , Invoking m4}).  Additionally, the option @option{--trace} (or
4215 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
4216 parsing input.
4217
4218 @comment options: -d-V -L3 -tifelse
4219 @comment status: 1
4220 @example
4221 $ @kbd{m4 -L 3 -t ifelse}
4222 ifelse(`one level')
4223 @error{}m4trace: -1- ifelse
4224 @result{}
4225 ifelse(ifelse(ifelse(`three levels')))
4226 @error{}m4trace: -3- ifelse
4227 @error{}m4trace: -2- ifelse
4228 @error{}m4trace: -1- ifelse
4229 @result{}
4230 ifelse(ifelse(ifelse(ifelse(`four levels'))))
4231 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
4232 @end example
4233
4234 Tracing by name is an attribute that is preserved whether the macro is
4235 defined or not.  This allows the selection of macros to trace before
4236 those macros are defined.
4237
4238 @example
4239 $ @kbd{m4 -d}
4240 traceoff(`foo')
4241 @result{}
4242 traceon(`foo')
4243 @result{}
4244 foo
4245 @result{}foo
4246 defn(`foo')
4247 @error{}m4:stdin:4: warning: defn: undefined macro 'foo'
4248 @result{}
4249 undefine(`foo')
4250 @error{}m4:stdin:5: warning: undefine: undefined macro 'foo'
4251 @result{}
4252 pushdef(`foo')
4253 @result{}
4254 popdef(`foo')
4255 @result{}
4256 popdef(`foo')
4257 @error{}m4:stdin:8: warning: popdef: undefined macro 'foo'
4258 @result{}
4259 define(`foo', `bar')
4260 @result{}
4261 foo
4262 @error{}m4trace: -1- foo -> `bar'
4263 @result{}bar
4264 undefine(`foo')
4265 @result{}
4266 ifdef(`foo', `yes', `no')
4267 @result{}no
4268 indir(`foo')
4269 @error{}m4:stdin:13: warning: indir: undefined macro 'foo'
4270 @result{}
4271 define(`foo', `blah')
4272 @result{}
4273 foo
4274 @error{}m4trace: -1- foo -> `blah'
4275 @result{}blah
4276 @end example
4277
4278 Tracing even works on builtins.  However, @code{defn} (@pxref{Defn})
4279 does not transfer tracing status.
4280
4281 @example
4282 $ @kbd{m4 -d}
4283 traceon(`traceon')
4284 @result{}
4285 traceon(`traceoff')
4286 @error{}m4trace: -1- traceon(`traceoff') -> `'
4287 @result{}
4288 traceoff(`traceoff')
4289 @error{}m4trace: -1- traceoff(`traceoff') -> `'
4290 @result{}
4291 traceoff(`traceon')
4292 @result{}
4293 traceon(`eval', `m4_divnum')
4294 @result{}
4295 define(`m4_eval', defn(`eval'))
4296 @result{}
4297 define(`m4_divnum', defn(`divnum'))
4298 @result{}
4299 eval(divnum)
4300 @error{}m4trace: -1- eval(`0') -> `0'
4301 @result{}0
4302 m4_eval(m4_divnum)
4303 @error{}m4trace: -2- m4_divnum -> `0'
4304 @result{}0
4305 @end example
4306
4307 As of @acronym{GNU} M4 2.0, named macro tracing is independent of global
4308 tracing status; calling @code{traceoff} without arguments turns off the
4309 global trace flag, but does not turn off tracing for macros where
4310 tracing was requested by name.  Likewise, calling @code{traceon} without
4311 arguments will affect tracing of macros that are not defined yet.  This
4312 behavior matches traditional implementations of @code{m4}.
4313
4314 @example
4315 $ @kbd{m4 -d}
4316 traceon
4317 @result{}
4318 define(`foo', `bar')
4319 @error{}m4trace: -1- define(`foo', `bar') -> `'
4320 @result{}
4321 foo # traced, even though foo was not defined at traceon
4322 @error{}m4trace: -1- foo -> `bar'
4323 @result{}bar # traced, even though foo was not defined at traceon
4324 traceoff(`foo')
4325 @error{}m4trace: -1- traceoff(`foo') -> `'
4326 @result{}
4327 foo # traced, since global tracing is still on
4328 @error{}m4trace: -1- foo -> `bar'
4329 @result{}bar # traced, since global tracing is still on
4330 traceon(`foo')
4331 @error{}m4trace: -1- traceon(`foo') -> `'
4332 @result{}
4333 traceoff
4334 @error{}m4trace: -1- traceoff -> `'
4335 @result{}
4336 foo # traced, since foo is now traced by name
4337 @error{}m4trace: -1- foo -> `bar'
4338 @result{}bar # traced, since foo is now traced by name
4339 traceoff(`foo')
4340 @result{}
4341 foo # untraced
4342 @result{}bar # untraced
4343 @end example
4344
4345 However, @acronym{GNU} M4 prior to 2.0 had slightly different
4346 semantics, where @code{traceon} without arguments only affected symbols
4347 that were defined at that moment, and @code{traceoff} without arguments
4348 stopped all tracing, even when tracing was requested by macro name.  The
4349 addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it
4350 possible to write a file that approximates the older semantics
4351 regardless of which version of @acronym{GNU} M4 is in use.
4352
4353 @comment options: -d-V
4354 @example
4355 $ @kbd{m4}
4356 ifdef(`m4symbols',
4357   `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)',
4358     `builtin(`traceon', $@@)')')dnl
4359 define(`traceoff', `ifelse(`$#', `0',
4360     `builtin(`traceoff')builtin(`traceoff', m4symbols)',
4361     `builtin(`traceoff', $@@)')')')dnl
4362 define(`a', `1')
4363 @result{}
4364 traceon # called before b is defined, so b is not traced
4365 @result{} # called before b is defined, so b is not traced
4366 define(`b', `2')
4367 @error{}m4trace: -1- define
4368 @result{}
4369 a b
4370 @error{}m4trace: -1- a
4371 @result{}1 2
4372 traceon(`b')
4373 @error{}m4trace: -1- traceon
4374 @error{}m4trace: -1- ifelse
4375 @error{}m4trace: -1- builtin
4376 @result{}
4377 a b
4378 @error{}m4trace: -1- a
4379 @error{}m4trace: -1- b
4380 @result{}1 2
4381 traceoff # stops tracing b, even though it was traced by name
4382 @error{}m4trace: -1- traceoff
4383 @error{}m4trace: -1- ifelse
4384 @error{}m4trace: -1- builtin
4385 @error{}m4trace: -2- m4symbols
4386 @error{}m4trace: -1- builtin
4387 @result{} # stops tracing b, even though it was traced by name
4388 a b
4389 @result{}1 2
4390 @end example
4391
4392 @xref{Debugmode}, for information on controlling the details of the
4393 display.  The format of the trace output is not specified by
4394 @acronym{POSIX}, and varies between implementations of @code{m4}.
4395
4396 Starting with M4 1.6, tracing also works via @code{indir}
4397 (@pxref{Indir}).  However, since tracing is an attribute tracked by
4398 macro names, and @code{builtin} bypasses macro names (@pxref{Builtin}),
4399 it is not possible for @code{builtin} to trace which subsidiary builtin
4400 it invokes.  If you are worried about tracking all invocations of a
4401 given builtin, you should also trace @code{builtin}, or enable global
4402 tracing (the @samp{t} debug level, @pxref{Debugmode}).
4403
4404 @example
4405 $ @kbd{m4 -d}
4406 define(`my_defn', defn(`defn'))undefine(`defn')
4407 @result{}
4408 define(`foo', `bar')traceon(`foo', `defn', `my_defn')
4409 @result{}
4410 foo
4411 @error{}m4trace: -1- foo -> `bar'
4412 @result{}bar
4413 indir(`foo')
4414 @error{}m4trace: -1- foo -> `bar'
4415 @result{}bar
4416 my_defn(`foo')
4417 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4418 @result{}bar
4419 indir(`my_defn', `foo')
4420 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4421 @result{}bar
4422 builtin(`defn', `foo')
4423 @result{}bar
4424 debugmode(`+cxt')
4425 @result{}
4426 builtin(`defn', builtin(`shift', `', `foo'))
4427 @error{}m4trace: -1- id 12: builtin ... = <builtin>
4428 @error{}m4trace: -2- id 13: builtin ... = <builtin>
4429 @error{}m4trace: -2- id 13: builtin(`shift', `', `foo') -> ``foo''
4430 @error{}m4trace: -1- id 12: builtin(`defn', `foo') -> ``bar''
4431 @result{}bar
4432 indir(`my_defn', indir(`shift', `', `foo'))
4433 @error{}m4trace: -1- id 14: indir ... = <indir>
4434 @error{}m4trace: -2- id 15: indir ... = <indir>
4435 @error{}m4trace: -2- id 15: shift ... = <shift>
4436 @error{}m4trace: -2- id 15: shift(`', `foo') -> ``foo''
4437 @error{}m4trace: -2- id 15: indir(`shift', `', `foo') -> ``foo''
4438 @error{}m4trace: -1- id 14: my_defn ... = <defn>
4439 @error{}m4trace: -1- id 14: my_defn(`foo') -> ``bar''
4440 @error{}m4trace: -1- id 14: indir(`my_defn', `foo') -> ``bar''
4441 @result{}bar
4442 @end example
4443
4444 @node Debugmode
4445 @section Controlling debugging options
4446
4447 @cindex controlling debugging output
4448 @cindex debugging output, controlling
4449 The @option{--debug} option to @code{m4} (also spelled
4450 @option{--debugmode} or @option{-d}, @pxref{Debugging options, ,
4451 Invoking m4}) controls the amount of detail presented in three
4452 categories of output.  Trace output is requested by @code{traceon}
4453 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
4454 relation to a macro invocation.  Debug output tracks useful events not
4455 associated with a macro invocation, and each line is prefixed by
4456 @samp{m4debug:}.  Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
4457 affected, with no prefix added to the output lines.
4458
4459 The @var{flags} following the option can be one or more of the
4460 following:
4461
4462 @table @code
4463 @item a
4464 In trace output, show the actual arguments that were collected before
4465 invoking the macro.  Arguments are subject to length truncation
4466 specified by @code{debuglen} (@pxref{Debuglen}).
4467
4468 @item c
4469 In trace output, show an additional line for each macro call, when the
4470 macro is seen, but before the arguments are collected, and show the
4471 definition of the macro that will be used for the expansion.  By
4472 default, only one line is printed, after all arguments are collected and
4473 the expansion determined.  The definition is subject to length
4474 truncation specified by @code{debuglen} (@pxref{Debuglen}).  This is
4475 often used with the @samp{x} flag.
4476
4477 @item d
4478 Output a warning on any attempt to dereference an undefined macro via
4479 @code{builtin}, @code{defn}, @code{dumpdef}, @code{indir},
4480 @code{popdef}, or @code{undefine}.  Note that @code{ifdef},
4481 @code{m4symbols},
4482 @code{traceon}, and @code{traceoff} do not dereference undefined macros.
4483 Like any other warning, the warnings enabled by this flag go to standard
4484 error regardless of the current @code{debugfile} setting, and will
4485 change exit status if the command line option @option{--fatal-warnings}
4486 was specified.  This flag is useful in diagnosing spelling mistakes in
4487 macro names.  It is enabled by default when neither @option{--debug} nor
4488 @option{--fatal-warnings} is specified on the command line.
4489
4490 @item e
4491 In trace output, show the expansion of each macro call.  The expansion
4492 is subject to length truncation specified by @code{debuglen}
4493 (@pxref{Debuglen}).
4494
4495 @item f
4496 In debug and trace output, include the name of the current input file in
4497 the output line.
4498
4499 @item i
4500 In debug output, print a message each time the current input file is
4501 changed.
4502
4503 @item l
4504 In debug and trace output, include the current input line number in the
4505 output line.
4506
4507 @item m
4508 In debug output, print a message each time a module is manipulated
4509 (@pxref{Modules}).  In trace output when the @samp{c} flag is in effect,
4510 and in dumpdef output, follow builtin macros with their module name,
4511 surrounded by braces (@samp{@{@}}).
4512
4513 @item o
4514 Output @code{dumpdef} data to standard error instead of the current
4515 debug file.  This can be useful when post-processing trace output, where
4516 interleaving dumpdef and trace output can cause ambiguities.
4517
4518 @item p
4519 In debug output, print a message when a named file is found through the
4520 path search mechanism (@pxref{Search Path}), giving the actual file name
4521 used.
4522
4523 @item q
4524 In trace and dumpdef output, quote actual arguments and macro expansions
4525 in the display with the current quotes.  This is useful in connection
4526 with the @samp{a} and @samp{e} flags above.
4527
4528 @item s
4529 In dumpdef output, show the entire stack of definitions associated with
4530 a symbol via @code{pushdef}.
4531
4532 @item t
4533 In trace output, trace all macro calls made in this invocation of
4534 @code{m4}.  This is equivalent to using @code{traceon} without
4535 arguments.
4536
4537 @item x
4538 In trace output, add a unique `macro call id' to each line of the trace
4539 output.  This is useful in connection with the @samp{c} flag above, to
4540 match where a macro is first recognized with where it is finally
4541 expanded, in spite of intermediate expansions that occur while
4542 collecting arguments.  It can also be used in isolation to determine how
4543 many macros have been expanded.
4544
4545 @item V
4546 A shorthand for all of the above flags.
4547 @end table
4548
4549 As special cases, if @var{flags} starts with a @samp{+}, the named flags
4550 are enabled without impacting other flags, and if it starts with a
4551 @samp{-}, the named flags are disabled without impacting other flags.
4552 Without either of these starting characters, @var{flags} simply replaces
4553 the previous setting.
4554 @comment FIXME - should we accept usage like debugmode(+fl-q)?  Also,
4555 @comment should we add debugmode(?) which expands to the current
4556 @comment enabled flags, and debugmode(e?) which expands to e if e is
4557 @comment currently enabled?
4558
4559 If no flags are specified with the @option{--debug} option, the default is
4560 @samp{+adeq}.  Many examples in this manual show their output using
4561 default flags.
4562
4563 @cindex @acronym{GNU} extensions
4564 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4565 the debugging output format:
4566
4567 @deffn {Builtin (gnu)} debugmode (@ovar{flags})
4568 The argument @var{flags} should be a subset of the letters listed above.
4569 If no argument is present, all debugging flags are cleared (as if
4570 @var{flags} were an explicit @samp{-V}).  With an empty argument, the
4571 most common flags are enabled (as if @var{flags} were an explicit
4572 @samp{+adeq}).  If an unknown flag is encountered, an error is issued.
4573
4574 The expansion of @code{debugmode} is void.
4575 @end deffn
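
A brief sketch of the prefix rules, assuming an invocation with a bare
@option{-d} so that the session starts from the default flags: the
argument @samp{-e} removes only the expansion from each trace line,
while a plain @samp{e} replaces every other flag, dropping both the
argument list and the quotes.  The macro @code{foo} here is purely
illustrative.

@example
$ @kbd{m4 -d}
define(`foo', `FOO$1')traceon(`foo')
@result{}
foo(`x')
@error{}m4trace: -1- foo(`x') -> `FOOx'
@result{}FOOx
debugmode(`-e')
@result{}
foo(`x')
@error{}m4trace: -1- foo(`x')
@result{}FOOx
debugmode(`e')
@result{}
foo(`x')
@error{}m4trace: -1- foo -> FOOx
@result{}FOOx
@end example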
4576
4577 @comment options: -d-V
4578 @example
4579 $ @kbd{m4}
4580 define(`foo', `FOO$1')
4581 @result{}
4582 traceon(`foo', `divnum')
4583 @result{}
4584 debugmode()dnl same as debugmode(`+adeq')
4585 foo
4586 @error{}m4trace: -1- foo -> `FOO'
4587 @result{}FOO
4588 debugmode(`V')debugmode(`-q')
4589 @error{}m4trace:stdin:5: -1- id 7: debugmode ... = <debugmode>@{gnu@}
4590 @error{}m4trace:stdin:5: -1- id 7: debugmode(`-q') -> `'
4591 @result{}
4592 foo(
4593 `BAR')
4594 @error{}m4trace:stdin:6: -1- id 8: foo ... = FOO$1
4595 @error{}m4trace:stdin:6: -1- id 8: foo(BAR) -> FOOBAR
4596 @result{}FOOBAR
4597 debugmode`'dnl same as debugmode(`-V')
4598 @error{}m4trace:stdin:8: -1- id 9: debugmode ... = <debugmode>@{gnu@}
4599 @error{}m4trace:stdin:8: -1- id 9: debugmode ->@w{ }
4600 foo
4601 @error{}m4trace: -1- foo
4602 @result{}FOO
4603 debugmode(`+clmx')
4604 @result{}
4605 foo(divnum)
4606 @error{}m4trace:11: -1- id 13: foo ... = FOO$1
4607 @error{}m4trace:11: -2- id 14: divnum ... = <divnum>@{m4@}
4608 @error{}m4trace:11: -2- id 14: divnum
4609 @error{}m4trace:11: -1- id 13: foo
4610 @result{}FOO0
4611 debugmode(`-m')
4612 @result{}
4613 @end example
4614
4615 This example shows the effects of the debug flags that are not related
4616 to macro tracing.
4617
4618 @comment examples
4619 @comment options: -dip
4620 @example
4621 $ @kbd{m4 -dip -I examples}
4622 @error{}m4debug: input read from 'stdin'
4623 define(`foo', `m4wrap(`wrapped text
4624 ')dnl')
4625 @result{}
4626 include(`incl.m4')dnl
4627 @error{}m4debug: path search for 'incl.m4' found 'examples/incl.m4'
4628 @error{}m4debug: input read from 'examples/incl.m4'
4629 @result{}Include file start
4630 @result{}Include file end
4631 @error{}m4debug: input reverted to stdin, line 3
4632 ^D
4633 @error{}m4debug: input exhausted
4634 @error{}m4debug: input from m4wrap recursion level 1
4635 @result{}wrapped text
4636 @error{}m4debug: input from m4wrap exhausted
4637 @end example
4638
4639 @node Debuglen
4640 @section Limiting debug output
4641
4642 @cindex @acronym{GNU} extensions
4643 @cindex arglength
4644 @cindex debuglen
4645 @cindex limiting trace output length
4646 @cindex trace output, limiting length
4647 @cindex dumpdef output, limiting length
4648 When debugging, sometimes it is desirable to reduce the clutter of
4649 arbitrary-length strings, because the prefix carries enough information
4650 to understand the issues.  The builtin macro @code{debuglen}, along with
4651 the command line option counterpart @option{--debuglen} (or @option{-l},
4652 @pxref{Debugging options, , Invoking m4}), allow on-the-fly control of
4653 debugging string lengths:
4654
4655 @deffn {Builtin (gnu)} debuglen (@var{len})
4656 The argument @var{len} is an integer that controls how much of
4657 arbitrary-length strings should be output during trace and dumpdef
4658 output.  If @var{len} is non-zero, then strings longer than that
4659 length are truncated, and @samp{...} included in the output to show that
4660 truncation took place.  A warning is issued if @var{len} cannot be
4661 parsed as an integer.
4662 @comment FIXME - make this understand an optional suffix, similar to how
4663 @comment --debuglen does.  Also, we need a section documenting scaling
4664 @comment suffixes.
4665 @comment FIXME - should we allow len to be `?', meaning expand to the
4666 @comment current value?
4667
4668 The macro @code{debuglen} is recognized only with parameters.
4669 @end deffn
4670
4671 The following example demonstrates the behavior of length truncation.
4672 Note that each argument and the final result are individually truncated.
4673 Also, the special tokens for builtin functions are not truncated.
4674
4675 @comment options: -l6 -techo -tdefn
4676 @example
4677 $ @kbd{m4 -d -l 6 -t echo -t defn}
4678 debuglen(`oops')
4679 @error{}m4:stdin:1: warning: debuglen: non-numeric argument 'oops'
4680 @result{}
4681 define(`echo', `$@@')
4682 @result{}
4683 echo(`1', `long string')
4684 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4685 @result{}1,long string
4686 indir(`echo', defn(`changequote'))
4687 @error{}m4trace: -2- defn(`change...') -> `<changequote>'
4688 @error{}m4trace: -1- echo(<changequote>) -> ``<changequote>''
4689 @result{}
4690 debuglen
4691 @result{}debuglen
4692 debuglen(`0')
4693 @result{}
4694 echo(`long string')
4695 @error{}m4trace: -1- echo(`long string') -> ``long string''
4696 @result{}long string
4697 debuglen(`12')
4698 @result{}
4699 echo(`long string')
4700 @error{}m4trace: -1- echo(`long string') -> ``long string...'
4701 @result{}long string
4702 @end example
4703
4704 @node Debugfile
4705 @section Saving debugging output
4706
4707 @cindex saving debugging output
4708 @cindex debugging output, saving
4709 @cindex output, saving debugging
4710 @cindex @acronym{GNU} extensions
4711 Debug and tracing output can be redirected to files using either the
4712 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4713 Invoking m4}), or with the builtin macro @code{debugfile}:
4714
4715 @deffn {Builtin (gnu)} debugfile (@ovar{file})
4716 Send all further debug and trace output to @var{file}, opened in append
4717 mode.  If @var{file} is the empty string, debug and trace output are
4718 discarded.  If @code{debugfile} is called without any arguments, debug
4719 and trace output are sent to standard error.  Output from @code{dumpdef}
4720 is sent to this file if the debug level @samp{o} is not set
4721 (@pxref{Debugmode}).  This does not affect
4722 warnings, error messages, or @code{errprint} output, which are
4723 always sent to standard error.  If @var{file} cannot be opened, the
4724 current debug file is unchanged, and an error is issued.
4725
4726 When the @option{--safer} option (@pxref{Operation modes, , Invoking
4727 m4}) is in effect, @var{file} must be empty or omitted, since otherwise
4728 an input file could cause the modification of arbitrary files.
4729
4730 The expansion of @code{debugfile} is void.
4731 @end deffn
4732
4733 @example
4734 $ @kbd{m4 -d}
4735 traceon(`divnum')
4736 @result{}
4737 divnum(`extra')
4738 @error{}m4:stdin:2: warning: divnum: extra arguments ignored: 1 > 0
4739 @error{}m4trace: -1- divnum(`extra') -> `0'
4740 @result{}0
4741 debugfile()
4742 @result{}
4743 divnum(`extra')
4744 @error{}m4:stdin:4: warning: divnum: extra arguments ignored: 1 > 0
4745 @result{}0
4746 debugfile
4747 @result{}
4748 divnum
4749 @error{}m4trace: -1- divnum -> `0'
4750 @result{}0
4751 @end example
4752
4753 Although the @option{--safer} option cripples @code{debugfile} to a
4754 limited subset of capabilities, you may still use the @option{--debugfile}
4755 option from the command line with no restrictions.
4756
4757 @comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l
4758 @comment status: 1
4759 @example
4760 $ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq}
4761 foo # traced to `trace'
4762 @result{}bar # traced to `trace'
4763 debugfile(`file')
4764 @error{}m4:stdin:2: debugfile: disabled by --safer
4765 @result{}
4766 foo # traced to `trace'
4767 @result{}bar # traced to `trace'
4768 debugfile()
4769 @result{}
4770 foo # trace discarded
4771 @result{}bar # trace discarded
4772 debugfile
4773 @result{}
4774 foo # traced to stderr
4775 @error{}m4trace:7: -1- foo -> `bar'
4776 @result{}bar # traced to stderr
4777 undivert(`trace')dnl
4778 @result{}m4trace:1: -1- foo -> `bar'
4779 @result{}m4trace:3: -1- foo -> `bar'
4780 @end example
4781
4782 Sometimes it is useful to post-process trace output, even though there
4783 is no standardized format for trace output.  In this situation, forcing
4784 @code{dumpdef} to output to standard error instead of the default of the
4785 current debug file will avoid any ambiguities between the two types of
4786 output; it also allows debugging via @code{dumpdef} when debug output is
4787 discarded.
4788
4789 @example
4790 $ @kbd{m4 -d}
4791 traceon(`divnum')
4792 @result{}
4793 divnum
4794 @error{}m4trace: -1- divnum -> `0'
4795 @result{}0
4796 dumpdef(`divnum')
4797 @error{}divnum:@tabchar{}<divnum>
4798 @result{}
4799 debugfile(`')
4800 @result{}
4801 divnum
4802 @result{}0
4803 dumpdef(`divnum')
4804 @result{}
4805 debugmode(`+o')
4806 @result{}
4807 divnum
4808 @result{}0
4809 dumpdef(`divnum')
4810 @error{}divnum:@tabchar{}<divnum>
4811 @result{}
4812 @end example
4813
4814 @node Input Control
4815 @chapter Input control
4816
4817 This chapter describes various builtin macros for controlling the input
4818 to @code{m4}.
4819
4820 @menu
4821 * Dnl::                         Deleting whitespace in input
4822 * Changequote::                 Changing the quote characters
4823 * Changecom::                   Changing the comment delimiters
4824 * Changeresyntax::              Changing the regular expression syntax
4825 * Changesyntax::                Changing the lexical structure of the input
4826 * M4wrap::                      Saving text until end of input
4827 @end menu
4828
4829 @node Dnl
4830 @section Deleting whitespace in input
4831
4832 @cindex deleting whitespace in input
4833 @cindex discarding input
4834 @cindex input, discarding
4835 The builtin @code{dnl} stands for ``Discard to Next Line'':
4836
4837 @deffn {Builtin (m4)} dnl
4838 All characters, up to and including the next newline, are discarded
4839 without performing any macro expansion.  A warning is issued if the end
4840 of the file is encountered without a newline.
4841
4842 The expansion of @code{dnl} is void.
4843 @end deffn
4844
It is often used in connection with @code{define}, to remove the
newline that follows the call to @code{define}:
4847
4848 @example
4849 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4850 foo
4851 @result{}Macro foo.
4852 @end example
4853
The input up to and including the next newline is discarded.  This
differs from the treatment of comments (@pxref{Comments}), which are
copied to the output unless the command line option
@option{--discard-comments} is in effect
(@pxref{Operation modes, , Invoking m4}).
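
The following sketch contrasts the two, assuming
@option{--discard-comments} is not in effect.  The comment is copied to
the output along with its newline, whereas the text consumed by
@code{dnl} disappears together with the newline, so the next input line
continues the same output line.

@example
define(`foo', `bar')dnl
foo # foo is not expanded inside a comment
@result{}bar # foo is not expanded inside a comment
foo dnl foo is not expanded here either
foo
@result{}bar bar
@end example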
4858
4859 Usually, @code{dnl} is immediately followed by an end of line or some
4860 other whitespace.  @acronym{GNU} @code{m4} will produce a warning diagnostic if
4861 @code{dnl} is followed by an open parenthesis.  In this case, @code{dnl}
4862 will collect and process all arguments, looking for a matching close
4863 parenthesis.  All predictable side effects resulting from this
4864 collection will take place.  @code{dnl} will return no output.  The
input following the matching close parenthesis, up to and including the
next newline on whichever line contains it, will still be discarded.
4867
4868 @example
4869 dnl(`args are ignored, but side effects occur',
4870 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4871 @error{}m4:stdin:1: warning: dnl: extra arguments ignored: 2 > 0
4872 See how `foo' was defined, foo?
4873 @result{}See how foo was defined, like this?
4874 @end example
4875
If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
4878
4879 @example
4880 m4wrap(`m4wrap(`2 hi
4881 ')0 hi dnl 1 hi')
4882 @result{}
4883 define(`hi', `HI')
4884 @result{}
4885 ^D
4886 @error{}m4:stdin:1: warning: dnl: end of file treated as newline
4887 @result{}0 HI 2 HI
4888 @end example
4889
4890 @node Changequote
4891 @section Changing the quote characters
4892
4893 @cindex changing quote delimiters
4894 @cindex quote delimiters, changing
4895 @cindex delimiters, changing
4896 The default quote delimiters can be changed with the builtin
4897 @code{changequote}:
4898
4899 @deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '})
4900 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4901 the new end-quote delimiter.  If both arguments are missing, the default
4902 quotes (@code{`} and @code{'}) are used.  If @var{start} is void, then
4903 quoting is disabled.  Otherwise, if @var{end} is missing or void, the
4904 default end-quote delimiter (@code{'}) is used.  The quote delimiters
4905 can be of any length.
4906
4907 The expansion of @code{changequote} is void.
4908 @end deffn
4909
4910 @example
4911 changequote(`[', `]')
4912 @result{}
4913 define([foo], [Macro [foo].])
4914 @result{}
4915 foo
4916 @result{}Macro foo.
4917 @end example
4918
4919 The quotation strings can safely contain eight-bit characters.
4920 If no single character is appropriate, @var{start} and @var{end} can be
of any length.  Other implementations cap the delimiter length at five
characters, but @acronym{GNU} @code{m4} has no inherent limit.
4923
4924 @example
4925 changequote(`[[[', `]]]')
4926 @result{}
4927 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4928 @result{}
4929 foo
4930 @result{}Macro [[foo]].
4931 @end example
4932
4933 Calling @code{changequote} with @var{start} as the empty string will
4934 effectively disable the quoting mechanism, leaving no way to quote text.
4935 However, using an empty string is not portable, as some other
4936 implementations of @code{m4} revert to the default quoting, while others
4937 preserve the prior non-empty delimiter.  If @var{start} is not empty,
4938 then an empty @var{end} will use the default end-quote delimiter of
4939 @samp{'}, as otherwise, it would be impossible to end a quoted string.
4940 Again, this is not portable, as some other @code{m4} implementations
4941 reuse @var{start} as the end-quote delimiter, while others preserve the
4942 previous non-empty value.  Omitting both arguments restores the default
4943 begin-quote and end-quote delimiters; fortunately this behavior is
4944 portable to all implementations of @code{m4}.
4945
4946 @example
4947 define(`foo', `Macro `FOO'.')
4948 @result{}
4949 changequote(`', `')
4950 @result{}
4951 foo
4952 @result{}Macro `FOO'.
4953 `foo'
4954 @result{}`Macro `FOO'.'
4955 changequote(`,)
4956 @result{}
4957 foo
4958 @result{}Macro FOO.
4959 @end example
4960
4961 There is no way in @code{m4} to quote a string containing an unmatched
4962 begin-quote, except using @code{changequote} to change the current
4963 quotes.
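
As a minimal sketch of this technique, the following switches to
@samp{[} and @samp{]} (an arbitrary choice of replacement delimiters) so
that a lone @samp{`} can appear inside a quoted string, and then
restores the default quotes.

@example
changequote(`[', `]')
@result{}
[Use ` to begin a quoted string]
@result{}Use ` to begin a quoted string
changequote
@result{}
`quoted again'
@result{}quoted again
@end example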
4964
4965 If the quotes should be changed from, say, @samp{[} to @samp{[[},
4966 temporary quote characters have to be defined.  To achieve this, two
4967 calls of @code{changequote} must be made, one for the temporary quotes
4968 and one for the new quotes.
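
A direct call cannot work, because @samp{[[} cannot be written as a
balanced quoted string while @samp{[} is itself the begin-quote
delimiter.  The following sketch moves from @samp{[} and @samp{]} to
@samp{[[} and @samp{]]}; the temporary delimiters @samp{<} and @samp{>}
are an arbitrary choice, and any strings that do not clash with the old
or new quotes should serve equally well.

@example
changequote(`[', `]')
@result{}
changequote([<], [>])
@result{}
changequote(<[[>, <]]>)
@result{}
define([[foo]], [[Macro [[foo]].]])
@result{}
foo
@result{}Macro foo.
@end example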
4969
4970 Macros are recognized in preference to the begin-quote string, so if a
4971 prefix of @var{start} can be recognized as part of a potential macro
4972 name, the quoting mechanism is effectively disabled.  Unless you use
4973 @code{changesyntax} (@pxref{Changesyntax}), this means that @var{start}
4974 should not begin with a letter, digit, or @samp{_} (underscore).
4975 However, even though quoted strings are not recognized, the quote
4976 characters can still be discerned in macro expansion and in trace
4977 output.
4978
4979 @example
4980 define(`echo', `$@@')
4981 @result{}
4982 define(`hi', `HI')
4983 @result{}
4984 changequote(`q', `Q')
4985 @result{}
4986 q hi Q hi
4987 @result{}q HI Q HI
4988 echo(hi)
4989 @result{}qHIQ
4990 changequote
4991 @result{}
4992 changequote(`-', `EOF')
4993 @result{}
4994 - hi EOF hi
4995 @result{} hi  HI
4996 changequote
4997 @result{}
4998 changequote(`1', `2')
4999 @result{}
5000 hi1hi2
5001 @result{}hi1hi2
5002 hi 1hi2
5003 @result{}HI hi
5004 @end example
5005
5006 Quotes are recognized in preference to argument collection.  In
5007 particular, if @var{start} is a single @samp{(}, then argument
5008 collection is effectively disabled.  For portability with other
5009 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5010 @samp{)} as the first character in @var{start}.
5011
5012 @example
5013 define(`echo', `$#:$@@:')
5014 @result{}
5015 define(`hi', `HI')
5016 @result{}
5017 changequote(`(',`)')
5018 @result{}
5019 echo(hi)
5020 @result{}0::hi
5021 changequote
5022 @result{}
5023 changequote(`((', `))')
5024 @result{}
5025 echo(hi)
5026 @result{}1:HI:
5027 echo((hi))
5028 @result{}0::hi
5029 changequote
5030 @result{}
5031 changequote(`,', `)')
5032 @result{}
5033 echo(hi,hi)bye)
5034 @result{}1:HIhibye:
5035 @end example
5036
However, if you are not worried about portability, using @samp{(} and
@samp{)} as quoting characters has an interesting property: you can use
them to compute a quoted string containing the expansion of any quoted
5040 text, as long as the expansion results in both balanced quotes and
5041 balanced parentheses.  The trick is realizing @code{expand} uses
5042 @samp{$1} unquoted, to trigger its expansion using the normal quoting
5043 characters, but uses extra parentheses to group unquoted commas that
5044 occur in the expansion without consuming whitespace following those
5045 commas.  Then @code{_expand} uses @code{changequote} to convert the
5046 extra parentheses back into quoting characters.  Note that it takes two
5047 more @code{changequote} invocations to restore the original quotes.
5048 Contrast the behavior on whitespace when using @samp{$*}, via
5049 @code{quote}, to attempt the same task.
5050
5051 @example
5052 changequote(`[', `]')dnl
5053 define([a], [1, (b)])dnl
5054 define([b], [2])dnl
5055 define([quote], [[$*]])dnl
5056 define([expand], [_$0(($1))])dnl
5057 define([_expand],
5058   [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
5059 expand([a, a, [a, a], [[a, a]]])
5060 @result{}1, (2), 1, (2), a, a, [a, a]
5061 quote(a, a, [a, a], [[a, a]])
5062 @result{}1,(2),1,(2),a, a,[a, a]
5063 @end example
5064
5065 If @var{end} is a prefix of @var{start}, the end-quote will be
5066 recognized in preference to a nested begin-quote.  In particular,
5067 changing the quotes to have the same string for @var{start} and
5068 @var{end} disables nesting of quotes.  When quote nesting is disabled,
5069 it is impossible to double-quote strings across macro expansions, so
using the same string for both delimiters is rarely done.
5071
5072 @example
5073 define(`hi', `HI')
5074 @result{}
5075 changequote(`""', `"')
5076 @result{}
5077 ""hi"""hi"
5078 @result{}hihi
5079 ""hi" ""hi"
5080 @result{}hi hi
5081 ""hi"" "hi"
5082 @result{}hi" "HI"
5083 changequote
5084 @result{}
5085 `hi`hi'hi'
5086 @result{}hi`hi'hi
5087 changequote(`"', `"')
5088 @result{}
5089 "hi"hi"hi"
5090 @result{}hiHIhi
5091 @end example
5092
5093 It is an error if the end of file occurs within a quoted string.
5094
5095 @comment status: 1
5096 @example
5097 `hello world'
5098 @result{}hello world
5099 `dangling quote
5100 ^D
5101 @error{}m4:stdin:2: end of file in string
5102 @end example
5103
5104 @comment status: 1
5105 @example
5106 ifelse(`dangling quote
5107 ^D
5108 @error{}m4:stdin:1: ifelse: end of file in string
5109 @end example
5110
5111 @node Changecom
5112 @section Changing the comment delimiters
5113
5114 @cindex changing comment delimiters
5115 @cindex comment delimiters, changing
5116 @cindex delimiters, changing
5117 The default comment delimiters can be changed with the builtin
5118 macro @code{changecom}:
5119
5120 @deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}})
5121 This sets @var{start} as the new begin-comment delimiter and @var{end}
5122 as the new end-comment delimiter.  If both arguments are missing, or
5123 @var{start} is void, then comments are disabled.  Otherwise, if
5124 @var{end} is missing or void, the default end-comment delimiter of
5125 newline is used.  The comment delimiters can be of any length.
5126
5127 The expansion of @code{changecom} is void.
5128 @end deffn
5129
5130 @example
5131 define(`comment', `COMMENT')
5132 @result{}
5133 # A normal comment
5134 @result{}# A normal comment
5135 changecom(`/*', `*/')
5136 @result{}
5137 # Not a comment anymore
5138 @result{}# Not a COMMENT anymore
5139 But: /* this is a comment now */ while this is not a comment
5140 @result{}But: /* this is a comment now */ while this is not a COMMENT
5141 @end example
5142
5143 @cindex comments, copied to output
5144 Note how comments are copied to the output, much as if they were quoted
5145 strings.  If you want the text inside a comment expanded, quote the
5146 begin-comment delimiter.
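
A minimal sketch of the difference, using the default comment
delimiters: the quoted @samp{#} is copied to the output as ordinary
text, so the rest of the line is scanned for macros as usual.

@example
define(`comment', `COMMENT')
@result{}
# A normal comment
@result{}# A normal comment
`#' An expanded comment
@result{}# An expanded COMMENT
@end example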
5147
5148 Calling @code{changecom} without any arguments, or with @var{start} as
5149 the empty string, will effectively disable the commenting mechanism.  To
5150 restore the original comment start of @samp{#}, you must explicitly ask
5151 for it.  If @var{start} is not empty, then an empty @var{end} will use
5152 the default end-comment delimiter of newline, as otherwise, it would be
5153 impossible to end a comment.  However, this is not portable, as some
5154 other @code{m4} implementations preserve the previous non-empty
5155 delimiters instead.
5156
5157 @example
5158 define(`comment', `COMMENT')
5159 @result{}
5160 changecom
5161 @result{}
5162 # Not a comment anymore
5163 @result{}# Not a COMMENT anymore
5164 changecom(`#', `')
5165 @result{}
5166 # comment again
5167 @result{}# comment again
5168 @end example
5169
5170 The comment strings can safely contain eight-bit characters.
5171 If no single character is appropriate, @var{start} and @var{end} can be
of any length.  Other implementations cap the delimiter length at five
characters, but @acronym{GNU} @code{m4} has no inherent limit.
5174
5175 As of M4 1.6, macros and quotes are recognized in preference to
5176 comments, so if a prefix of @var{start} can be recognized as part of a
5177 potential macro name, or confused with a quoted string, the comment
5178 mechanism is effectively disabled (earlier versions of @acronym{GNU} M4
5179 favored comments, but this was inconsistent with other implementations).
5180 Unless you use @code{changesyntax} (@pxref{Changesyntax}), this means
5181 that @var{start} should not begin with a letter, digit, or @samp{_}
5182 (underscore), and that neither the start-quote nor the start-comment
5183 string should be a prefix of the other.
5184
5185 @example
5186 define(`hi', `HI')
5187 @result{}
5188 define(`hi1hi2', `hello')
5189 @result{}
5190 changecom(`q', `Q')
5191 @result{}
5192 q hi Q hi
5193 @result{}q HI Q HI
5194 changecom(`1', `2')
5195 @result{}
5196 hi1hi2
5197 @result{}hello
5198 hi 1hi2
5199 @result{}HI 1hi2
5200 changecom(`[[', `]]')
5201 @result{}
5202 changequote(`[[[', `]]]')
5203 @result{}
5204 [hi]
5205 @result{}[HI]
5206 [[hi]]
5207 @result{}[[hi]]
5208 [[[hi]]]
5209 @result{}hi
5210 changequote
5211 @result{}
5212 changecom(`[[[', `]]]')
5213 @result{}
5214 changequote(`[[', `]]')
5215 @result{}
5216 [[hi]]
5217 @result{}hi
5218 [[[hi]]]
5219 @result{}[hi]
5220 @end example
5221
5222 Comments are recognized in preference to argument collection.  In
5223 particular, if @var{start} is a single @samp{(}, then argument
5224 collection is effectively disabled.  For portability with other
5225 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5226 @samp{)} as the first character in @var{start}.
5227
5228 @example
5229 define(`echo', `$#:$*:$@@:')
5230 @result{}
5231 define(`hi', `HI')
5232 @result{}
5233 changecom(`(',`)')
5234 @result{}
5235 echo(hi)
5236 @result{}0:::(hi)
5237 changecom
5238 @result{}
5239 changecom(`((', `))')
5240 @result{}
5241 echo(hi)
5242 @result{}1:HI:HI:
5243 echo((hi))
5244 @result{}0:::((hi))
5245 changecom(`,', `)')
5246 @result{}
5247 echo(hi,hi)bye)
5248 @result{}1:HI,hi)bye:HI,hi)bye:
5249 changecom
5250 @result{}
5251 echo(hi,`,`'hi',hi)
5252 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
5253 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
5254 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
5255 @end example
5256
5257 It is an error if the end of file occurs within a comment.
5258
5259 @comment status: 1
5260 @example
5261 changecom(`/*', `*/')
5262 @result{}
5263 /*dangling comment
5264 ^D
5265 @error{}m4:stdin:2: end of file in comment
5266 @end example
5267
5268 @comment status: 1
5269 @example
5270 changecom(`/*', `*/')
5271 @result{}
5272 len(/*dangling comment
5273 ^D
5274 @error{}m4:stdin:2: len: end of file in comment
5275 @end example
5276
5277 @node Changeresyntax
5278 @section Changing the regular expression syntax
5279
5280 @cindex regular expression syntax, changing
5281 @cindex basic regular expressions
5282 @cindex extended regular expressions
5283 @cindex regular expressions
5284 @cindex expressions, regular
5285 @cindex syntax, changing regular expression
5286 @cindex flavors of regular expressions
5287 @cindex @acronym{GNU} extensions
The @acronym{GNU} extensions @code{patsubst}, @code{regexp}, and, more
recently, @code{renamesyms} each deal with regular expressions.  There
5290 are multiple flavors of regular expressions, so the
5291 @code{changeresyntax} builtin exists to allow choosing the default
5292 flavor:
5293
5294 @deffn {Builtin (gnu)} changeresyntax (@var{resyntax})
5295 Changes the default regular expression syntax used by M4 according to
5296 the value of @var{resyntax}, equivalent to passing @var{resyntax} as the
5297 argument to the command line option @option{--regexp-syntax}
5298 (@pxref{Operation modes, , Invoking m4}).  If @var{resyntax} is empty,
the default flavor reverts to the @code{GNU_M4} style, compatible
with Emacs.
5301
5302 @var{resyntax} can be any one of the values in the table below.  Case is
5303 not important, and @samp{-} or @samp{ } can be substituted for @samp{_} in
5304 the given names.  If @var{resyntax} is unrecognized, a warning is
5305 issued and the default flavor is not changed.
5306
5307 @table @dfn
5308 @item AWK
5309 @xref{awk regular expression syntax}, for details.
5310
5311 @item BASIC
5312 @itemx ED
5313 @itemx POSIX_BASIC
5314 @itemx SED
5315 @xref{posix-basic regular expression syntax}, for details.
5316
5317 @item BSD_M4
@itemx EXTENDED
5319 @itemx POSIX_EXTENDED
5320 @xref{posix-extended regular expression syntax}, for details.
5321
5322 @item GNU_AWK
5323 @itemx GAWK
5324 @xref{gnu-awk regular expression syntax}, for details.
5325
5326 @item GNU_EGREP
5327 @itemx EGREP
5328 @xref{egrep regular expression syntax}, for details.
5329
5330 @item GNU_M4
@itemx EMACS
5332 @itemx GNU_EMACS
5333 @xref{emacs regular expression syntax}, for details.  This is the
5334 default regular expression flavor.
5335
5336 @item GREP
5337 @xref{grep regular expression syntax}, for details.
5338
5339 @item MINIMAL
5340 @itemx POSIX_MINIMAL
5341 @itemx POSIX_MINIMAL_BASIC
5342 @xref{posix-minimal-basic regular expression syntax}, for details.
5343
5344 @item POSIX_AWK
5345 @xref{posix-awk regular expression syntax}, for details.
5346
5347 @item POSIX_EGREP
5348 @xref{posix-egrep regular expression syntax}, for details.
5349 @end table
5350
5351 The expansion of @code{changeresyntax} is void.
5352 The macro @code{changeresyntax} is recognized only with parameters.
5353 This macro was added in M4 2.0.
5354 @end deffn
5355
As an example of how @var{resyntax} is matched, the first three calls
below all select the @samp{GNU_M4} regular expression flavor, while the
fourth is rejected with a warning:
5358
5359 @example
5360 changeresyntax(`gnu m4')
5361 @result{}
5362 changeresyntax(`GNU-m4')
5363 @result{}
5364 changeresyntax(`Gnu_M4')
5365 @result{}
5366 changeresyntax(`unknown')
5367 @error{}m4:stdin:4: warning: changeresyntax: bad syntax-spec: 'unknown'
5368 @result{}
5369 @end example
5370
5371 Using @code{changeresyntax} makes it possible to omit the optional
5372 @var{resyntax} parameter to other macros, while still using a different
5373 regular expression flavor.
5374
5375 @example
5376 patsubst(`ab', `a|b', `c')
5377 @result{}ab
5378 patsubst(`ab', `a\|b', `c')
5379 @result{}cc
5380 patsubst(`ab', `a|b', `c', `EXTENDED')
5381 @result{}cc
5382 changeresyntax(`EXTENDED')
5383 @result{}
5384 patsubst(`ab', `a|b', `c')
5385 @result{}cc
5386 patsubst(`ab', `a\|b', `c')
5387 @result{}ab
5388 @end example
5389
5390 @node Changesyntax
5391 @section Changing the lexical structure of the input
5392
5393 @cindex lexical structure of the input
5394 @cindex input, lexical structure of the
5395 @cindex syntax table
5396 @cindex changing syntax
5397 @cindex @acronym{GNU} extensions
5398 @quotation
The macro @code{changesyntax} and all associated functionality are
experimental (@pxref{Experiments}).  The functionality might change in
the future.  Please direct your comments about it the same way you would
for bugs.
5403 @end quotation
5404
5405 The input to @code{m4} is read character by character, and these
5406 characters are grouped together to form input tokens (such as macro
5407 names, strings, comments, etc.).
5408
5409 Each token is parsed according to certain rules.  For example, a macro
5410 name starts with a letter or @samp{_} and consists of the longest
5411 possible string of letters, @samp{_} and digits.  But who is to decide
what characters are letters, digits, quotes, white space?  Earlier, the
operating system decided; now you do.  The builtin macro
5414 @code{changesyntax} is used to change the way @code{m4} parses the input
5415 stream into tokens.
5416
5417 @deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
5418 Each @var{syntax-spec} is a two-part string.  The first part is a
5419 command, consisting of a single character describing a syntax category,
5420 and an optional one-character action.  The action can be @samp{-} to
5421 remove the listed characters from that category, @samp{=} to set the
5422 category to the listed characters
5423 and reassign all other characters previously in that category to
5424 `Other', or @samp{+} to add the listed characters to the category
5425 without affecting other characters.  If an action is not specified, but
5426 additional characters are present, then @samp{=} is assumed.
5427
5428 The remaining characters of each @var{syntax-spec} form the set of
5429 characters to perform the action on for that syntax category.  Character
5430 ranges are expanded as for @code{translit} (@pxref{Translit}).  To start
5431 the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
5432 be specified.
5433
5434 If @var{syntax-spec} is just a category, and no action or characters
5435 were specified, then all characters in that category are reset to their
5436 default state.  A warning is issued if the category character is not
5437 valid.  If @var{syntax-spec} is the empty string, then all categories
5438 are reset to their default state.
5439
5440 Syntax categories are divided into basic and context.  Every input
5441 byte belongs to exactly one basic syntax category.  Additionally, any
5442 byte can be assigned to a context category regardless of its current
5443 basic category.  Context categories exist because a character can
5444 behave differently when parsed in isolation than when it occurs in
5445 context to close out a token started by another basic category (for
example, @kbd{newline} defaults to the basic category `White space' as
5447 well as the context category `End comment').
5448
5449 The following table describes the case-insensitive designation for each
5450 syntax category (the first byte in @var{syntax-spec}), and a description
5451 of what each category controls.
5452
5453 @multitable @columnfractions .06 .20 .13 .55
5454 @headitem Code @tab Category @tab Type @tab Description
5455
5456 @item @kbd{W} @tab @dfn{Words} @tab Basic
5457 @tab Characters that can start a macro name.  Defaults to the letters as
5458 defined by the locale, and the character @samp{_}.
5459
5460 @item @kbd{D} @tab @dfn{Digits} @tab Basic
5461 @tab Characters that, together with the letters, form the remainder of a
5462 macro name.  Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
5463 other digits defined by the locale.
5464
5465 @item @kbd{S} @tab @dfn{White space} @tab Basic
5466 @tab Characters that should be trimmed from the beginning of each argument to
5467 a macro call.  The defaults are space, tab, newline, carriage return,
5468 form feed, and vertical tab, and any others as defined by the locale.
5469
5470 @item @kbd{(} @tab @dfn{Open parenthesis} @tab Basic
5471 @tab Characters that open the argument list of a macro call.  The default is
5472 the single character @samp{(}.
5473
5474 @item @kbd{)} @tab @dfn{Close parenthesis} @tab Basic
5475 @tab Characters that close the argument list of a macro call.  The default
5476 is the single character @samp{)}.
5477
5478 @item @kbd{,} @tab @dfn{Argument separator} @tab Basic
5479 @tab Characters that separate the arguments of a macro call.  The default is
5480 the single character @samp{,}.
5481
5482 @item @kbd{L} @tab @dfn{Left quote} @tab Basic
5483 @tab The set of characters that can start a single-character quoted string.
5484 The default is the single character @samp{`}.  For multiple-character
5485 quote delimiters, use @code{changequote} (@pxref{Changequote}).
5486
5487 @item @kbd{R} @tab @dfn{Right quote} @tab Context
5488 @tab The set of characters that can end a single-character quoted string.
5489 The default is the single character @samp{'}.  For multiple-character
5490 quote delimiters, use @code{changequote} (@pxref{Changequote}).  Note
5491 that @samp{'} also defaults to the syntax category `Other', when it
5492 appears in isolation.
5493
5494 @item @kbd{B} @tab @dfn{Begin comment} @tab Basic
5495 @tab The set of characters that can start a single-character comment.  The
5496 default is the single character @samp{#}.  For multiple-character
5497 comment delimiters, use @code{changecom} (@pxref{Changecom}).
5498
5499 @item @kbd{E} @tab @dfn{End comment} @tab Context
5500 @tab The set of characters that can end a single-character comment.  The
5501 default is the single character @kbd{newline}.  For multiple-character
5502 comment delimiters, use @code{changecom} (@pxref{Changecom}).  Note that
5503 newline also defaults to the syntax category `White space', when it
5504 appears in isolation.
5505
5506 @item @kbd{$} @tab @dfn{Dollar} @tab Context
5507 @tab Characters that can introduce an argument reference in the body of a
5508 macro.  The default is the single character @samp{$}.
5509
5510 @comment FIXME - implement ${10} argument parsing.
5511 @item @kbd{@{} @tab @dfn{Left brace} @tab Context
5512 @tab Characters that introduce an extended argument reference in the body of
5513 a macro immediately after a character in the Dollar category.  The
5514 default is the single character @samp{@{}.
5515
5516 @item @kbd{@}} @tab @dfn{Right brace} @tab Context
5517 @tab Characters that conclude an extended argument reference in the body of a
5518 macro.  The default is the single character @samp{@}}.
5519
5520 @item @kbd{O} @tab @dfn{Other} @tab Basic
5521 @tab Characters that have no special syntactical meaning to @code{m4}.
5522 Defaults to all characters except those in the categories above.
5523
5524 @item @kbd{A} @tab @dfn{Active} @tab Basic
5525 @tab Characters that themselves, alone, form macro names.  This is a
5526 @acronym{GNU} extension, and active characters have lower precedence
5527 than comments.  By default, no characters are active.
5528
5529 @item @kbd{@@} @tab @dfn{Escape} @tab Basic
5530 @tab Characters that must precede macro names for them to be recognized.
5531 This is a @acronym{GNU} extension.  When an escape character is defined,
5532 then macros are not recognized unless the escape character is present;
5533 however, the macro name, visible by @samp{$0} in macro definitions, does
5534 not include the escape character.  By default, no characters are
5535 escapes.
5536
5537 @comment FIXME - we should also consider supporting:
5538 @comment @item @kbd{I} @tab @dfn{Ignore} @tab Basic
5539 @comment @tab Characters that are ignored if they appear in
5540 @comment the input; perhaps defaulting to '\0'.
5541 @end multitable
5542
5543 The expansion of @code{changesyntax} is void.
5544 The macro @code{changesyntax} is recognized only with parameters.  Use
5545 this macro with caution, as it is possible to change the syntax in such
5546 a way that no further macros can be recognized by @code{m4}.
5547 This macro was added in M4 2.0.
5548 @end deffn
5549
With @code{changesyntax} we can modify which characters form a word.  For
example, we can make @samp{.} a valid character in a macro name, or even
start a macro name with a digit.
5553
5554 @example
5555 define(`test.1', `TEST ONE')
5556 @result{}
5557 define(`1', `one')
5558 @result{}
5559 __file__
5560 @result{}stdin
5561 test.1
5562 @result{}test.1
5563 dnl Add `.' and remove `_'.
5564 changesyntax(`W+.', `W-_')
5565 @result{}
5566 __file__
5567 @result{}__file__
5568 test.1
5569 @result{}TEST ONE
5570 dnl Set words to include numbers.
5571 changesyntax(`W=a-zA-Z0-9_')
5572 @result{}
5573 __file__
5574 @result{}stdin
5575 test.1
5576 @result{}test.one
5577 dnl Reset words to default (a-zA-Z_).
5578 changesyntax(`W')
5579 @result{}
5580 __file__
5581 @result{}stdin
5582 test.1
5583 @result{}test.1
5584 @end example
5585
5586 Another possibility is to change the syntax of a macro call.
5587
5588 @example
5589 define(`test', `$#')
5590 @result{}
5591 test(a, b, c)
5592 @result{}3
5593 dnl Change macro syntax.
5594 changesyntax(`(<', `,|', `)>')
5595 @result{}
5596 test(a, b, c)
5597 @result{}0(a, b, c)
5598 test<a|b|c>
5599 @result{}3
5600 @end example
5601
Leading whitespace is normally stripped from macro arguments in @code{m4},
but changing the syntax categories can prevent this.  The use of
5604 @code{format} is an alternative to using a literal tab character.
5605
5606 @example
5607 define(`test', `$1$2$3')
5608 @result{}
5609 test(`a', `b', `c')
5610 @result{}abc
5611 dnl Don't ignore whitespace.
5612 changesyntax(`O 'format(``%c'', `9')`
5613 ')
5614 @result{}
5615 test(a, b,
5616 c)
5617 @result{}a b
5618 @result{}c
5619 @end example
5620
It is possible to redefine the @samp{$} used to indicate macro arguments
in user-defined macros.  Dollar class syntax elements are copied to the
output if there is no valid expansion.
5624
5625 @example
5626 define(`argref', `Dollar: $#, Question: ?#')
5627 @result{}
5628 argref(1, 2, 3)
5629 @result{}Dollar: 3, Question: ?#
5630 dnl Change argument identifier.
5631 changesyntax(`$?')
5632 @result{}
5633 argref(1,2,3)
5634 @result{}Dollar: $#, Question: 3
5635 define(`escape', `$?`'1$?1?')
5636 @result{}
5637 escape(foo)
5638 @result{}$?1$foo?
5639 dnl Multiple argument identifiers.
5640 changesyntax(`$+$')
5641 @result{}
5642 argref(1, 2, 3)
5643 @result{}Dollar: 3, Question: 3
5644 @end example
5645
Macro calls can be given a @TeX{}- or Texinfo-like syntax using an
escape.  If one or more characters are defined as escapes, macro names
5648 are only recognized if preceded by an escape character.
5649
5650 If the escape is not followed by what is normally a word (a letter
5651 optionally followed by letters and/or numerals), that single character
5652 is returned as a macro name.
5653
5654 As always, words without a macro definition cause no error message.
5655 They and the escape character are simply output.
5656
5657 @example
5658 define(`foo', `bar')
5659 @result{}
5660 dnl Require @@ escape before any macro.
5661 changesyntax(`@@@@')
5662 @result{}
5663 foo
5664 @result{}foo
5665 @@foo
5666 @result{}bar
5667 @@bar
5668 @result{}@@bar
5669 @@dnl Change escape character.
5670 @@changesyntax(`@@\', `O@@')
5671 @result{}
5672 foo
5673 @result{}foo
5674 @@foo
5675 @result{}@@foo
5676 \foo
5677 @result{}bar
5678 define(`#', `No comment')
5679 @result{}define(#, No comment)
5680 \define(`#', `No comment')
5681 @result{}
5682 \# \foo # Comment \foo
5683 @result{}No comment bar # Comment \foo
5684 @end example
5685
5686 Active characters are known from @TeX{}.  In @code{m4} an active
5687 character is always seen as a one-letter word, and so, if it has a macro
5688 definition, the macro will be called.
5689
5690 @example
5691 define(`@@', `TEST')
5692 @result{}
5693 define(`a@@a', `hello')
5694 @result{}
5695 define(`a', `A')
5696 @result{}
5697 @@
5698 @result{}@@
5699 a@@a
5700 @result{}A@@A
5701 dnl Make @@ active.
5702 changesyntax(`A@@')
5703 @result{}
5704 @@
5705 @result{}TEST
5706 a@@a
5707 @result{}ATESTa
5708 @end example
5709
5710 There is obviously an overlap between @code{changesyntax} and
5711 @code{changequote}, since there are now two ways to modify quote
5712 delimiters.  To avoid incompatibilities, if the quotes are modified by
5713 @code{changequote}, any characters previously set to either quote
5714 delimiter by @code{changesyntax} are first demoted to the other category
5715 (@samp{O}), so the result is only a single set of quotes.  In the other
5716 direction, if quotes were already disabled, or if both the start and end
5717 delimiter set by @code{changequote} are single bytes, then
5718 @code{changesyntax} preserves those settings.  But if either delimiter
5719 occupies multiple bytes, @code{changesyntax} first disables both
5720 delimiters.  Quotes can be disabled via @code{changesyntax} by emptying
5721 the left quote basic category (@samp{L}).  Meanwhile, the right quote
5722 context category (@samp{R}) will never be empty; if a
5723 @code{changesyntax} action would otherwise leave that category empty,
5724 then the default end delimiter from @code{changequote} (@samp{'}) is
5725 used; thus, it is never possible to get @code{m4} in a state where a
5726 quoted string cannot be terminated.  These interactions apply to comment
5727 delimiters as well, @i{mutatis mutandis} with @code{changecom}.
5728
5729 @example
5730 define(`test', `TEST')
5731 @result{}
5732 dnl Add additional single-byte delimiters.
5733 changesyntax(`L+<', `R+>')
5734 @result{}
5735 <test> `test' [test] <<test>>
5736 @result{}test test [TEST] <test>
5737 dnl Use standard interface, overriding changesyntax settings.
5738 changequote(<[>, `]')
5739 @result{}
5740 <test> `test' [test] <<test>>
5741 @result{}<TEST> `TEST' test <<TEST>>
5742 dnl Introduce multi-byte delimiters.
5743 changequote([<<], [>>])
5744 @result{}
5745 <test> `test' [test] <<test>>
5746 @result{}<TEST> `TEST' [TEST] test
5747 dnl Change end quote, effectively disabling quotes.
5748 changesyntax(<<R]>>)
5749 @result{}
5750 <test> `test' [test] <<test>>
5751 @result{}<TEST> `TEST' [TEST] <<TEST>>
5752 dnl Change beginning quote, make ] normal, thus making ' end quote.
5753 changesyntax(L`, R-])
5754 @result{}
5755 <test> `test' [test] <<test>>
5756 @result{}<TEST> test [TEST] <<TEST>>
5757 dnl Set multi-byte quote; unrelated changes don't impact it.
5758 changequote(`<<', `>>')changesyntax(<<@@\>>)
5759 @result{}
5760 <\test> `\test' [\test] <<\test>>
5761 @result{}<TEST> `TEST' [TEST] \test
5762 @end example
5763
5764 If several characters are assigned to a category that forms
5765 single-character tokens, all such characters are treated as equal.  Any open
5766 parenthesis will match any close parenthesis, etc.
5767
5768 @example
5769 dnl Go crazy with symbols.
5770 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
5771 @result{}
5772 eval@{2**4-1; 2: 8>
5773 @result{}00001111
5774 @end example
5775
5776 The syntax table is initialized to be backwards compatible, so if you
5777 never call @code{changesyntax}, nothing will have changed.
5778
5779 For now, debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)}
5780 to show macro calls; and macro expansions that result in a list of
5781 arguments (such as @samp{$@@} or @code{shift}) use @samp{,}, regardless
5782 of the current syntax settings.  However, this is likely to change in a
5783 future release, so it should not be relied on, particularly since it is
5784 next to impossible to write recursive macros if the argument separator
5785 doesn't match between expansion and rescanning.
5786
5787 @c FIXME - changing syntax of , should not break iterative macros.
5788 @example
5789 $ @kbd{m4 -d}
5790 changesyntax(`,=|')traceon(`foo')define(`foo'|`$#:$@@')
5791 @result{}
5792 foo(foo(1|2|3))
5793 @error{}m4trace: -2- foo(`1', `2', `3') -> `3:`1',`2',`3''
5794 @error{}m4trace: -1- foo(`3:1,2,3') -> `1:`3:1,2,3''
5795 @result{}1:3:1,2,3
5796 @end example
5797
5798 @node M4wrap
5799 @section Saving text until end of input
5800
5801 @cindex saving input
5802 @cindex input, saving
5803 @cindex deferring expansion
5804 @cindex expansion, deferring
5805 It is possible to `save' some text until the end of the normal input has
5806 been seen.  The saved text is read again by @code{m4} once the
5807 normal input has been exhausted.  This feature is normally used to
5808 initiate cleanup actions before normal exit, e.g., deleting temporary
5809 files.
5810
5811 To save input text, use the builtin @code{m4wrap}:
5812
5813 @deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
5814 Stores @var{string} in a safe place, to be reread when end of input is
5815 reached.  As a @acronym{GNU} extension, additional arguments are
5816 concatenated with a space to the @var{string}.
5817
5818 Successive invocations of @code{m4wrap} accumulate saved text in
5819 first-in, first-out order, as required by @acronym{POSIX}.
5820
5821 The expansion of @code{m4wrap} is void.
5822 The macro @code{m4wrap} is recognized only with parameters.
5823 @end deffn
5824
5825 @example
5826 define(`cleanup', `This is the `cleanup' action.
5827 ')
5828 @result{}
5829 m4wrap(`cleanup')
5830 @result{}
5831 This is the first and last normal input line.
5832 @result{}This is the first and last normal input line.
5833 ^D
5834 @result{}This is the cleanup action.
5835 @end example
5836
5837 The saved input is only reread when the end of normal input is seen, and
5838 not if @code{m4exit} is used to exit @code{m4}.
5839
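To illustrate, here is a minimal sketch (the wrapped text is an arbitrary
placeholder): because the input ends with @code{m4exit} rather than by
reaching end of file, the wrapped text is discarded without being
output.

@example
m4wrap(`This text is never output.
')
@result{}
m4exit(`0')
@end example
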
5840 It is safe to call @code{m4wrap} from wrapped text, where all the
5841 recursively wrapped text is deferred until the current wrapped text is
5842 exhausted.  As of M4 1.6, when @code{m4wrap} is not used recursively,
5843 the saved pieces of text are reread in the same order in which they were
5844 saved (FIFO---first in, first out), as required by @acronym{POSIX}.
5845
5846 @example
5847 m4wrap(`1
5848 ')
5849 @result{}
5850 m4wrap(`2', `3
5851 ')
5852 @result{}
5853 ^D
5854 @result{}1
5855 @result{}2 3
5856 @end example
5857
5858 However, earlier versions had reverse ordering (LIFO---last in, first
5859 out), as this behavior is more like the semantics of the C function
5860 @code{atexit}.  It is possible to emulate @acronym{POSIX} behavior even
5861 with older versions of @acronym{GNU} M4 by including the file
5862 @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
5863 distribution:
5864
5865 @comment examples
5866 @example
5867 $ @kbd{m4 -I examples}
5868 undivert(`wrapfifo.m4')dnl
5869 @result{}dnl Redefine m4wrap to have FIFO semantics.
5870 @result{}define(`_m4wrap_level', `0')dnl
5871 @result{}define(`m4wrap',
5872 @result{}`ifdef(`m4wrap'_m4wrap_level,
5873 @result{}       `define(`m4wrap'_m4wrap_level,
5874 @result{}               defn(`m4wrap'_m4wrap_level)`$1')',
5875 @result{}       `builtin(`m4wrap', `define(`_m4wrap_level',
5876 @result{}                                  incr(_m4wrap_level))dnl
5877 @result{}m4wrap'_m4wrap_level)dnl
5878 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5879 include(`wrapfifo.m4')
5880 @result{}
5881 m4wrap(`a`'m4wrap(`c
5882 ', `d')')m4wrap(`b')
5883 @result{}
5884 ^D
5885 @result{}abc
5886 @end example
5887
5888 It is likewise possible to emulate LIFO behavior without resorting to
5889 the @acronym{GNU} M4 extension of @code{builtin}, by including the file
5890 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
5891 distribution.  (Unfortunately, both examples shown here share some
5892 subtle bugs.  See if you can find and correct them; or @pxref{Improved
5893 m4wrap, , Answers}).
5894
5895 @comment examples
5896 @example
5897 $ @kbd{m4 -I examples}
5898 undivert(`wraplifo.m4')dnl
5899 @result{}dnl Redefine m4wrap to have LIFO semantics.
5900 @result{}define(`_m4wrap_level', `0')dnl
5901 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
5902 @result{}define(`m4wrap',
5903 @result{}`ifdef(`m4wrap'_m4wrap_level,
5904 @result{}       `define(`m4wrap'_m4wrap_level,
5905 @result{}               `$1'defn(`m4wrap'_m4wrap_level))',
5906 @result{}       `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
5907 @result{}m4wrap'_m4wrap_level)dnl
5908 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5909 include(`wraplifo.m4')
5910 @result{}
5911 m4wrap(`a`'m4wrap(`c
5912 ', `d')')m4wrap(`b')
5913 @result{}
5914 ^D
5915 @result{}bac
5916 @end example
5917
5918 Here is an example of implementing a factorial function using
5919 @code{m4wrap}:
5920
5921 @example
5922 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
5923 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
5924 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
5925 @result{}
5926 f(`10')
5927 @result{}
5928 ^D
5929 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
5930 @end example
5931
5932 Invocations of @code{m4wrap} at the same recursion level are
5933 concatenated and rescanned as usual:
5934
5935 @example
5936 define(`ab', `AB
5937 ')
5938 @result{}
5939 m4wrap(`a')m4wrap(`b')
5940 @result{}
5941 ^D
5942 @result{}AB
5943 @end example
5944
5945 @noindent
5946 however, the transition between recursion levels behaves like an end of
5947 file condition between two input files.
5948
5949 @comment status: 1
5950 @example
5951 m4wrap(`m4wrap(`)')len(abc')
5952 @result{}
5953 ^D
5954 @error{}m4:stdin:1: len: end of file in argument list
5955 @end example
5956
5957 As of M4 1.6, @code{m4wrap} transparently handles builtin tokens
5958 generated by @code{defn} (@pxref{Defn}).  However, for portability, it
5959 is better to defer the evaluation of @code{defn} along with the rest of
5960 the wrapped text, as is done for @code{foo} in the example below, rather
5961 than computing the builtin token up front, as is done for @code{bar}.
5962
5963 @example
5964 m4wrap(`define(`foo', defn(`divnum'))foo
5965 ')
5966 @result{}
5967 m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar
5968 ')
5969 @result{}
5970 ^D
5971 @result{}0
5972 @result{}0
5973 @end example
5974
5975 @node File Inclusion
5976 @chapter File inclusion
5977
5978 @cindex file inclusion
5979 @cindex inclusion, of files
5980 @code{m4} allows you to include named files at any point in the input.
5981
5982 @menu
5983 * Include::                     Including named files
5984 * Search Path::                 Searching for include files
5985 @end menu
5986
5987 @node Include
5988 @section Including named files
5989
5990 There are two builtin macros in @code{m4} for including files:
5991
5992 @deffn {Builtin (m4)} include (@var{file})
5993 @deffnx {Builtin (m4)} sinclude (@var{file})
5994 Both macros cause the file named @var{file} to be read by
5995 @code{m4}.  When the end of the file is reached, input is resumed from
5996 the previous input file.
5997
5998 The expansion of @code{include} and @code{sinclude} is therefore the
5999 contents of @var{file}.
6000
6001 If @var{file} does not exist, is a directory, or cannot otherwise be
6002 read, the expansion is void,
6003 and @code{include} will fail with an error while @code{sinclude} is
6004 silent.  The empty string counts as a file that does not exist.
6005
6006 The macros @code{include} and @code{sinclude} are recognized only with
6007 parameters.
6008 @end deffn
6009
6010 @comment status: 1
6011 @example
6012 include(`n')
6013 @error{}m4:stdin:1: include: cannot open 'n': No such file or directory
6014 @result{}
6015 include()
6016 @error{}m4:stdin:2: include: cannot open '': No such file or directory
6017 @result{}
6018 sinclude(`n')
6019 @result{}
6020 sinclude()
6021 @result{}
6022 @end example
6023
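One plausible use of @code{sinclude} is to pick up an optional file of
local overrides.  The following sketch assumes a hypothetical file
@file{site-local.m4} that does not exist, and a hypothetical macro
@code{site_greeting} that such a file might define; since the file is
missing, @code{sinclude} silently expands to nothing and the fallback
value is used.

@example
sinclude(`site-local.m4')dnl
define(`greeting', `ifdef(`site_greeting', `site_greeting', `hello')')
@result{}
greeting
@result{}hello
@end example
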
6024 This section uses the @option{--include} command-line option (or
6025 @option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab
6026 files from the @file{m4-@value{VERSION}/@/examples}
6027 directory shipped as part of the @acronym{GNU} @code{m4} package.  The
6028 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
6029 contains the lines:
6030
6031 @comment ignore
6032 @example
6033 $ @kbd{cat examples/incl.m4}
6034 @result{}Include file start
6035 @result{}foo
6036 @result{}Include file end
6037 @end example
6038
6039 Normally file inclusion is used to insert the contents of a file
6040 into the input stream.  The contents of the file will be read by
6041 @code{m4} and macro calls in the file will be expanded:
6042
6043 @comment examples
6044 @example
6045 $ @kbd{m4 -I examples}
6046 define(`foo', `FOO')
6047 @result{}
6048 include(`incl.m4')
6049 @result{}Include file start
6050 @result{}FOO
6051 @result{}Include file end
6052 @result{}
6053 @end example
6054
6055 The fact that @code{include} and @code{sinclude} expand to the contents
6056 of the file can be used to define macros that operate on entire files.
6057 Here is an example, which defines @samp{bar} to expand to the contents
6058 of @file{incl.m4}:
6059
6060 @comment examples
6061 @example
6062 $ @kbd{m4 -I examples}
6063 define(`bar', include(`incl.m4'))
6064 @result{}
6065 This is `bar':  >>bar<<
6066 @result{}This is bar:  >>Include file start
6067 @result{}foo
6068 @result{}Include file end
6069 @result{}<<
6070 @end example
6071
6072 This use of @code{include} is not trivial, though, as files can contain
6073 quotes, commas, and parentheses, which can interfere with the way the
6074 @code{m4} parser works.  @acronym{GNU} @code{m4} seamlessly concatenates
6075 the file contents with the next character, even if the included file
6076 ended in the middle of a comment, string, or macro call.  These
6077 conditions are only treated as end of file errors if specified as input
6078 files on the command line.
6079
6080 In @acronym{GNU} @code{m4}, an alternative method of reading files is
6081 using @code{undivert} (@pxref{Undivert}) on a named file.
6082
6083 @node Search Path
6084 @section Searching for include files
6085
6086 @cindex search path for included files
6087 @cindex included files, search path for
6088 @cindex @acronym{GNU} extensions
6089 @acronym{GNU} @code{m4} allows included files to be found in directories
6090 other than the current working directory.
6091
6092 @cindex @env{M4PATH}
6093 If the @option{--prepend-include} or @option{-B} command-line option was
6094 provided (@pxref{Preprocessor features, , Invoking m4}), those
6095 directories are searched first, in the reverse of the order in which they
6096 were listed on the command line.  Then @code{m4} looks in the current working
6097 directory.  Next come the directories specified with the
6098 @option{--include} or @option{-I} option, in the order found on the
6099 command line.  Finally, if the @env{M4PATH} environment variable is set,
6100 it is expected to contain a colon-separated list of directories, which
6101 will be searched in order.
6102
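As an illustration of this ordering, consider the following invocation,
in which all of the directory names and the file @file{input.m4} are
hypothetical:

@comment ignore
@example
$ @kbd{M4PATH=/opt/m4/lib:/opt/m4/extra m4 -B b1 -B b2 -I i1 -I i2 input.m4}
@end example

@noindent
By the rules above, a file such as @file{incl.m4} named in
@code{include} would be looked for in @file{b2}, @file{b1}, the current
working directory, @file{i1}, @file{i2}, @file{/opt/m4/lib}, and finally
@file{/opt/m4/extra}.
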
6103 If the automatic search for include-files causes trouble, the @samp{p}
6104 debug flag (@pxref{Debugmode}) can help isolate the problem.
6105
6106 @node Diversions
6107 @chapter Diverting and undiverting output
6108
6109 @cindex deferring output
6110 Diversions are a way of temporarily saving output.  The output of
6111 @code{m4} can at any time be diverted to a temporary file, and be
6112 reinserted into the output stream, @dfn{undiverted}, again at a later
6113 time.
6114
6115 @cindex @env{TMPDIR}
6116 Numbered diversions are counted from 0 upwards, diversion number 0
6117 being the normal output stream.  @acronym{GNU}
6118 @code{m4} tries to keep diversions in memory.  However, there is a
6119 limit to the overall memory usable by all diversions taken together
6120 (512K, currently).  When this maximum is about to be exceeded,
6121 a temporary file is opened to receive the contents of the biggest
6122 diversion still in memory, freeing this memory for other diversions.
6123 When creating the temporary file, @code{m4} honors the value of the
6124 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
6125 Thus, the amount of available disk space provides the only real limit on
6126 the number and aggregate size of diversions.
6127
6128 Diversions make it possible to generate output in a different order than
6129 the input was read.  It is possible to implement topological sorting of
6130 dependencies.  For example, @acronym{GNU} Autoconf makes use of
6131 diversions under the hood to ensure that the expansion of a prerequisite
6132 macro appears in the output prior to the expansion of a dependent macro,
6133 regardless of which order the two macros were invoked in the user's
6134 input file.
6135
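As a minimal sketch of such reordering (the diversion numbers and the
text are arbitrary), the dependent text below is read first but diverted
to a higher-numbered diversion, so it is output after the prerequisite
text once all diversions are flushed at the end of input:

@example
divert(`2')dnl
Dependent text.
divert(`1')dnl
Prerequisite text.
divert`'dnl
^D
@result{}Prerequisite text.
@result{}Dependent text.
@end example
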
6136 @menu
6137 * Divert::                      Diverting output
6138 * Undivert::                    Undiverting output
6139 * Divnum::                      Diversion numbers
6140 * Cleardivert::                 Discarding diverted text
6141 @end menu
6142
6143 @node Divert
6144 @section Diverting output
6145
6146 @cindex diverting output to files
6147 @cindex output, diverting to files
6148 @cindex files, diverting output to
6149 Output is diverted using @code{divert}:
6150
6151 @deffn {Builtin (m4)} divert (@dvar{number, 0}, @ovar{text})
6152 The current diversion is changed to @var{number}.  If @var{number} is left
6153 out or empty, it is assumed to be zero.  If @var{number} cannot be
6154 parsed, the diversion is unchanged.
6155
6156 @cindex @acronym{GNU} extensions
6157 As a @acronym{GNU} extension, if optional @var{text} is supplied and
6158 @var{number} was valid, then @var{text} is immediately output to the
6159 new diversion, regardless of whether the expansion of @code{divert}
6160 occurred while collecting arguments for another macro.
6161
6162 The expansion of @code{divert} is void.
6163 @end deffn
6164
6165 When all the @code{m4} input has been processed, all existing
6166 diversions are automatically undiverted, in numerical order.
6167
6168 @example
6169 divert(`1')
6170 This text is diverted.
6171 divert
6172 @result{}
6173 This text is not diverted.
6174 @result{}This text is not diverted.
6175 ^D
6176 @result{}
6177 @result{}This text is diverted.
6178 @end example
6179
6180 Several calls of @code{divert} with the same argument do not overwrite
6181 the previous diverted text, but append to it.  Diversions are printed
6182 after any wrapped text is expanded.
6183
6184 @example
6185 define(`text', `TEXT')
6186 @result{}
6187 divert(`1')`diverted text.'
6188 divert
6189 @result{}
6190 m4wrap(`Wrapped text precedes ')
6191 @result{}
6192 ^D
6193 @result{}Wrapped TEXT precedes diverted text.
6194 @end example
6195
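The appending behavior can be seen in the following sketch, where two
separate calls to @samp{divert(`1')} contribute to the same diversion
(the text is arbitrary):

@example
divert(`1')first
divert(`2')second
divert(`1')third
divert`'dnl
^D
@result{}first
@result{}third
@result{}second
@end example
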
6196 @cindex discarding input
6197 @cindex input, discarding
6198 If output is diverted to a negative diversion, it is simply discarded.
6199 This can be used to suppress unwanted output.  A common example of
6200 unwanted output is the trailing newlines after macro definitions.  Here
6201 is a common programming idiom in @code{m4} for avoiding them.
6202
6203 @example
6204 divert(`-1')
6205 define(`foo', `Macro `foo'.')
6206 define(`bar', `Macro `bar'.')
6207 divert
6208 @result{}
6209 @end example
6210
6211 @cindex @acronym{GNU} extensions
6212 Traditional implementations only supported ten diversions.  But as a
6213 @acronym{GNU} extension, diversion numbers can be as large as positive
6214 integers will allow, rather than treating a multi-digit diversion number
6215 as a request to discard text.
6216
6217 @example
6218 divert(eval(`1<<28'))world
6219 divert(`2')hello
6220 ^D
6221 @result{}hello
6222 @result{}world
6223 @end example
6224
6225 The ability to immediately output extra text is a @acronym{GNU}
6226 extension, but it can prove useful for ensuring that text goes to a
6227 particular diversion no matter how many pending macro expansions are in
6228 progress.  For a demonstration of why this is useful, it is important to
6229 understand in the example below why @samp{one} is output in diversion 2,
6230 not diversion 1, while @samp{three} and @samp{five} both end up in the
6231 correctly numbered diversion.  The key point is that when @code{divert}
6232 is executed unquoted as part of the argument collection of another
6233 macro, the side effect takes place immediately, but the text @samp{one}
6234 is not passed to any diversion until after the @samp{divert(`2')} and
6235 the enclosing @code{echo} have also taken place.  The example with
6236 @samp{three} shows how following the quoting rule of thumb delays the
6237 invocation of @code{divert} until it is not nested in any argument
6238 collection context, while the example with @samp{five} shows the use of
6239 the optional argument to speed up the output process.
6240
6241 @example
6242 define(`echo', `$1')
6243 @result{}
6244 echo(divert(`1')`one'divert(`2'))`'dnl
6245 echo(`divert(`3')three`'divert(`4')')`'dnl
6246 echo(divert(`5', `five')divert(`6'))`'dnl
6247 divert
6248 @result{}
6249 undivert(`1')
6250 @result{}
6251 undivert(`2')
6252 @result{}one
6253 undivert(`3')
6254 @result{}three
6255 undivert(`4')
6256 @result{}
6257 undivert(`5')
6258 @result{}five
6259 undivert(`6')
6260 @result{}
6261 @end example
6262
6263 Note that @code{divert} is an English word, but also a macro that is
6264 recognized even without arguments.  When processing plain text, the word
6265 might appear and be unintentionally swallowed as a macro invocation.  One
6266 way to avoid this is to use the @option{-P} option to rename all
6267 builtins (@pxref{Operation modes, , Invoking m4}).  Another is to write
6268 a wrapper that requires a parameter to be recognized.
6269
6270 @example
6271 We decided to divert the stream for irrigation.
6272 @result{}We decided to  the stream for irrigation.
6273 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
6274 @result{}
6275 divert(`-1')
6276 Ignored text.
6277 divert(`0')
6278 @result{}
6279 We decided to divert the stream for irrigation.
6280 @result{}We decided to divert the stream for irrigation.
6281 @end example
6282
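The @option{-P} approach can be sketched as follows; with every builtin
renamed to carry an @samp{m4_} prefix, the bare word @samp{divert} is no
longer a macro, and diversions are controlled with @code{m4_divert}
instead:

@comment options: -P
@example
$ @kbd{m4 -P}
We decided to divert the stream for irrigation.
@result{}We decided to divert the stream for irrigation.
m4_divert(`-1')
Ignored text.
m4_divert(`0')
@result{}
@end example
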
6283 @node Undivert
6284 @section Undiverting output
6285
6286 Diverted text can be undiverted explicitly using the builtin
6287 @code{undivert}:
6288
6289 @deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}})
6290 Undiverts the numeric @var{diversions} given by the arguments, in the
6291 order given.  If no arguments are supplied, all diversions are
6292 undiverted, in numerical order.
6293
6294 @cindex file inclusion
6295 @cindex inclusion, of files
6296 @cindex @acronym{GNU} extensions
6297 As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
6298 strings, which are treated as the names of files to copy into the output
6299 without expansion.  A warning is issued if a file could not be opened.
6300
6301 The expansion of @code{undivert} is void.
6302 @end deffn
6303
6304 @example
6305 divert(`1')
6306 This text is diverted.
6307 divert
6308 @result{}
6309 This text is not diverted.
6310 @result{}This text is not diverted.
6311 undivert(`1')
6312 @result{}
6313 @result{}This text is diverted.
6314 @result{}
6315 @end example
6316
6317 Notice the last two blank lines.  One of them comes from the newline
6318 following @code{undivert}, the other from the newline that followed the
6319 @code{divert}!  A diversion often starts with a blank line like this.
6320
6321 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
6322 but rather copied directly to the current output, and it is therefore
6323 not an error to undivert into a diversion.  Undiverting the empty string
6324 is the same as specifying diversion 0; in either case nothing happens
6325 since the output has already been flushed.
6326
6327 @example
6328 divert(`1')diverted text
6329 divert
6330 @result{}
6331 undivert()
6332 @result{}
6333 undivert(`0')
6334 @result{}
6335 undivert
6336 @result{}diverted text
6337 @result{}
6338 divert(`1')more
6339 divert(`2')undivert(`1')diverted text`'divert
6340 @result{}
6341 undivert(`1')
6342 @result{}
6343 undivert(`2')
6344 @result{}more
6345 @result{}diverted text
6346 @end example
6347
6348 When a diversion has been undiverted, the diverted text is discarded,
6349 and it is not possible to bring back diverted text more than once.
6350
6351 @example
6352 divert(`1')
6353 This text is diverted first.
6354 divert(`0')undivert(`1')dnl
6355 @result{}
6356 @result{}This text is diverted first.
6357 undivert(`1')
6358 @result{}
6359 divert(`1')
6360 This text is also diverted but not appended.
6361 divert(`0')undivert(`1')dnl
6362 @result{}
6363 @result{}This text is also diverted but not appended.
6364 @end example
6365
6366 Attempts to undivert the current diversion are silently ignored.  Thus,
6367 when the current diversion is not 0, the current diversion does not get
6368 rearranged among the other diversions.
6369
6370 @example
6371 divert(`1')one
6372 divert(`2')two
6373 divert(`3')three
6374 divert(`4')four
6375 divert(`5')five
6376 divert(`2')undivert(`5', `2', `4')dnl
6377 undivert`'dnl effectively undivert(`1', `2', `3', `4', `5')
6378 divert`'undivert`'dnl
6379 @result{}two
6380 @result{}five
6381 @result{}four
6382 @result{}one
6383 @result{}three
6384 @end example
6385
6386 @cindex @acronym{GNU} extensions
6387 @cindex file inclusion
6388 @cindex inclusion, of files
6389 @acronym{GNU} @code{m4} allows named files to be undiverted.  Given a
6390 non-numeric argument, the contents of the file named will be copied,
6391 uninterpreted, to the current output.  This complements the builtin
6392 @code{include} (@pxref{Include}).  To illustrate the difference, assume
6393 the file @file{foo} contains:
6394
6395 @comment file: foo
6396 @example
6397 $ @kbd{cat foo}
6398 bar
6399 @end example
6400
6401 @noindent
6402 then
6403
6404 @example
6405 define(`bar', `BAR')
6406 @result{}
6407 undivert(`foo')
6408 @result{}bar
6409 @result{}
6410 include(`foo')
6411 @result{}BAR
6412 @result{}
6413 @end example
6414
6415 If the file is not found (or cannot be read), an error message is
6416 issued, and the expansion is void.  It is possible to intermix files
6417 and diversion numbers.
6418
6419 @example
6420 divert(`1')diversion one
6421 divert(`2')undivert(`foo')dnl
6422 divert(`3')diversion three
6423 divert`'dnl
6424 undivert(`1', `2', `foo', `3')dnl
6425 @result{}diversion one
6426 @result{}bar
6427 @result{}bar
6428 @result{}diversion three
6429 @end example
6430
6431 @node Divnum
6432 @section Diversion numbers
6433
6434 @cindex diversion numbers
6435 The current diversion is tracked by the builtin @code{divnum}:
6436
6437 @deffn {Builtin (m4)} divnum
6438 Expands to the number of the current diversion.
6439 @end deffn
6440
6441 @example
6442 Initial divnum
6443 @result{}Initial 0
6444 divert(`1')
6445 Diversion one: divnum
6446 divert(`2')
6447 Diversion two: divnum
6448 ^D
6449 @result{}
6450 @result{}Diversion one: 1
6451 @result{}
6452 @result{}Diversion two: 2
6453 @end example
6454
6455 @node Cleardivert
6456 @section Discarding diverted text
6457
6458 @cindex discarding diverted text
6459 @cindex diverted text, discarding
6460 Often it is not known, when output is diverted, whether the diverted
6461 text is actually needed.  Since all non-empty diversions are brought back
6462 to the main output stream when the end of input is seen, a method of
6463 discarding a diversion is needed.  If all diversions should be
6464 discarded, the easiest way is to end the input to @code{m4} with
6465 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
6466
6467 @example
6468 divert(`1')
6469 Diversion one: divnum
6470 divert(`2')
6471 Diversion two: divnum
6472 divert(`-1')
6473 undivert
6474 ^D
6475 @end example
6476
6477 @noindent
6478 No output is produced at all.
6479
6480 Clearing selected diversions can be done with the following macro:
6481
6482 @deffn Composite cleardivert (@ovar{diversions@dots{}})
6483 Discard the contents of each of the listed numeric @var{diversions}.
6484 @end deffn
6485
6486 @example
6487 define(`cleardivert',
6488 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
6489 @result{}
6490 @end example
6491
6492 It is called just like @code{undivert}, but the effect is to clear the
6493 diversions given by the arguments.  (This macro has a nasty bug!  You
6494 should try to see if you can find it and correct it; or @pxref{Improved
6495 cleardivert, , Answers}).
6496
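A brief sketch of its use, with arbitrary diversion contents and an
explicit diversion number as the only argument: diversion 1 is
discarded, while diversion 2 is still flushed at the end of input.

@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')Diversion one
divert(`2')Diversion two
divert`'dnl
cleardivert(`1')
@result{}
^D
@result{}Diversion two
@end example
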
6497 @node Modules
6498 @chapter Extending M4 with dynamic runtime modules
6499
6500 @cindex modules
6501 @cindex dynamic modules
6502 @cindex loadable modules
6503 @acronym{GNU} M4 1.4.x had a monolithic architecture.  All of its
6504 functionality was contained in a single binary, and additional macros
6505 could be added only by writing more code in the M4 language, or at the
6506 extreme by hacking the sources and recompiling the whole thing to make
6507 a custom M4 installation.
6508
6509 Starting with release 2.0, M4 uses Libtool's @code{libltdl} facilities
6510 (@pxref{Using libltdl, , libltdl, libtool, The GNU Libtool Manual})
6511 to move all of M4's builtins out to pluggable modules.  Unless compile
6512 time options are set to change the default build, the installed M4 2.0
6513 binary is virtually identical to 1.4.x, supporting the same builtins.
6514 However, an optional module can be loaded into the running M4 interpreter
6515 to provide a new @code{load} builtin.  This facilitates runtime
6516 extension of the M4 builtin macro list using compiled C code linked
6517 against a new shared library, typically named @file{libm4.so}.
6518
6519 For example, you might want to add a @code{setenv} builtin to M4, to
6520 use before invoking @code{esyscmd}.  We might write a @file{setenv.c}
6521 something like this:
6522
6523 @comment ignore
6524 @example
6525 #include "m4module.h"
6526
6527 M4BUILTIN(setenv);
6528
6529 m4_builtin m4_builtin_table[] =
6530 @{
6531   /* name      handler         flags             minargs maxargs */
6532   @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2,      3 @},
6533
6534   @{ NULL,     NULL,           0,                0,      0 @}
6535 @};
6536
6537 /**
6538  * setenv(NAME, VALUE, [OVERWRITE])
6539  **/
6540 M4BUILTIN_HANDLER (setenv)
6541 @{
6542   int overwrite = 1;
6543
6544   if (argc >= 4)
6545     if (!m4_numeric_arg (context, argc, argv, 3, &overwrite))
6546       return;
6547
6548   setenv (M4ARG (1), M4ARG (2), overwrite);
6549 @}
6550 @end example
6551
6552 Then, having compiled and linked the module, in (somewhat contrived)
6553 M4 code:
6554
6555 @comment ignore
6556 @example
6557 $ @kbd{M4MODPATH=`pwd` m4 --load-module=setenv}
6558 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6559 @result{}
6560 esyscmd(`ifconfig -a')dnl
6561 @result{}@dots{}
6562 @end example
6563
6564 Or instead of loading the module from the M4 invocation, you can use
6565 the new @code{load} builtin:
6566
6567 @comment ignore
6568 @example
6569 $ @kbd{M4MODPATH=`pwd` m4 --load-module=load}
6570 load(`setenv')
6571 @result{}
6572 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6573 @result{}
6574 @end example
6575
6576 Also, at build time, you can choose which modules to build into
6577 the core (so that they will be available without dynamic loading).
6578 SUSv3 M4 functionality is contained in the module @samp{m4}, @acronym{GNU}
6579 extensions in the module @samp{gnu}, the @code{load} builtin in the
6580 module @samp{load} and so on.
6581
6582 We hinted earlier that the @code{m4} and @code{gnu} modules are
6583 preloaded into the installed M4 binary, but it is possible to install
6584 a @emph{thinner} binary; for example, omitting the @acronym{GNU}
6585 extensions by configuring the distribution with @kbd{./configure
6586 --with-modules=m4}.  For a binary built with that option to understand
6587 code that uses @acronym{GNU} extensions, you must then run @kbd{m4
6588 --load-module=gnu}.  It is also possible to build a @emph{fatter}
6589 binary with additional modules preloaded: adding, say, the @code{load}
6590 builtin using @kbd{./configure --with-modules="m4 gnu load"}.
6591
6592 @acronym{GNU} M4 now has a facility for defining additional builtins without
6593 recompiling the sources.  In actual fact, all of the builtins provided
6594 by @acronym{GNU} M4 are loaded from such modules.  All of the builtin
6595 descriptions in this manual are annotated with the module from which
6596 they are loaded, mostly from the module @samp{m4}.
6597
6598 When you start @acronym{GNU} M4, the modules @samp{m4} and @samp{gnu} are
6599 loaded by default.  If you supply the @option{-G} option at startup, the
6600 module @samp{traditional} is loaded instead of @samp{gnu}.
6601 @xref{Compatibility}, for more details on the differences between these
6602 two modes of startup.
6603
6604 @menu
6605 * M4modules::                   Listing loaded modules
6606 * Load::                        Loading additional modules
6607 * Unload::                      Removing loaded modules
6608 * Refcount::                    Tracking module references
6609 * Standard Modules::            Standard bundled modules
6610 @end menu
6611
6612 @node M4modules
6613 @section Listing loaded modules
6614
6615 @deffn {Builtin (load)} m4modules
6616 Expands to a quoted ordered list of currently loaded modules,
6617 with the most recently loaded module at the front of the list.  Loading
a module multiple times will not affect the order of this list; the
6619 position depends on when the module was @emph{first} loaded.
6620 @end deffn
6621
6622 For example, if @acronym{GNU} @code{m4} is started with the
6623 @option{-m load} option to load the module @samp{load} and make this
6624 builtin available, @code{m4modules} will yield the following:
6625
6626 @comment options: -m load
6627 @example
6628 $ @kbd{m4 -m load}
6629 m4modules
6630 @result{}load,gnu,m4
6631 @end example
6632
6633 @node Load
6634 @section Loading additional modules
6635
6636 @deffn {Builtin (load)} load (@var{module-name})
6637 @var{module-name} will be searched for along the module search path
6638 (@pxref{Standard Modules}) and loaded if found.  Loading a module
6639 consists of running its initialization function (if any) and then adding
6640 any macros it provides to the internal table.
6641
6642 The macro @code{load} is recognized only with parameters.
6643 @end deffn
6644
6645 Once the @code{load} module has successfully loaded, use of the
6646 @samp{load} macro is entirely equivalent to the @option{-m} command line
6647 option.
6648
6649 @c The -mmpeval/--unload=mpeval pair allows the testsuite to skip this
6650 @c test if mpeval was not configured for usage.
6651 @comment options: -m load -m mpeval --unload-module=mpeval
6652 @example
6653 $ @kbd{m4 -m load}
6654 m4modules
6655 @result{}load,gnu,m4
6656 load(`mpeval')
6657 @result{}
6658 m4modules
6659 @result{}mpeval,load,gnu,m4
6660 @end example
6661
6662 @node Unload
6663 @section Removing loaded modules
6664
6665 @deffn {Builtin (load)} unload (@var{module-name})
6666 Any loaded modules that can be listed by the @code{m4modules} macro can be
6667 removed by naming them as the @var{module-name} parameter of the
6668 @code{unload} macro.  Unloading a module consists of removing all of the
6669 macros it provides from the internal table of visible macros, and
6670 running the module's finalization method (if any).
6671
6672 The macro @code{unload} is recognized only with parameters.
6673 @end deffn
6674
6675 @comment options: -m mpeval -m load
6676 @example
6677 $ @kbd{m4 -m mpeval -m load}
6678 m4modules
6679 @result{}load,mpeval,gnu,m4
6680 unload(`mpeval')
6681 @result{}
6682 m4modules
6683 @result{}load,gnu,m4
6684 @end example
6685
6686 @node Refcount
6687 @section Tracking module references
6688
6689 @deffn {Builtin (load)} refcount (@var{module-name})
6690 This macro expands to an integer representing the number of times
6691 @var{module-name} has been loaded but not yet unloaded.  No warning is
6692 issued, even if @var{module-name} does not represent a valid module.
6693
6694 The macro @code{refcount} is recognized only with parameters.
6695 @end deffn
6696
This example demonstrates tracking the reference count of the
@samp{gnu} module.
6699
6700 @comment options: -m load
6701 @example
6702 $ @kbd{m4 -m load}
6703 m4modules
6704 @result{}load,gnu,m4
6705 refcount(`gnu')
6706 @result{}1
6707 m4modules
6708 @result{}load,gnu,m4
6709 load(`gnu')
6710 @result{}
6711 refcount(`gnu')
6712 @result{}2
6713 unload(`gnu')
6714 @result{}
6715 m4modules
6716 @result{}load,gnu,m4
6717 refcount(`gnu')
6718 @result{}1
6719 unload(`gnu')
6720 @result{}
6721 m4modules
6722 @result{}load,m4
6723 refcount(`gnu')
6724 @result{}0
6725 refcount(`NoSuchModule')
6726 @result{}0
6727 @end example
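
Because @code{refcount} expands to @samp{0} for a module that is not
loaded, it can guard against loading a module a second time.  The
following is only a sketch: the macro name @code{load_once} is
hypothetical, and the output shown is what the descriptions of
@code{load} and @code{refcount} above imply rather than a tested
result.

@comment ignore
@example
$ @kbd{m4 -m load}
define(`load_once', `ifelse(refcount(`$1'), `0', `load(`$1')')')
@result{}
load_once(`gnu')
@result{}
refcount(`gnu')
@result{}1
@end example

Since @samp{gnu} is already loaded at startup, @code{load_once} expands
to nothing and the reference count stays at 1, whereas a plain
@code{load(`gnu')} would have raised it to 2.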
6728
6729 @node Standard Modules
6730 @section Standard bundled modules
6731
6732 @acronym{GNU} @code{m4} ships with several bundled modules as standard.
6733 By convention, these modules define a text macro that can be tested
6734 with @code{ifdef} when they are loaded; only the @code{m4} module lacks
6735 this feature test macro, since it is not permitted by @acronym{POSIX}.
Each of the feature test macros is intended to be used without
arguments.
6738
6739 @table @code
6740 @item m4
6741 Provides all of the builtins defined by @acronym{POSIX}.  This module
6742 is always loaded --- @acronym{GNU} @code{m4} would only be a very slow
6743 version of @command{cat} without the builtins supplied by this module.
6744
6745 @item gnu
6746 Provides all of the @acronym{GNU} extensions, as defined by
6747 @acronym{GNU} M4 through the 1.4.x release series.  It also provides a
6748 couple of feature test macros:
6749
6750 @deffn {Macro (gnu)} __gnu__
6751 Expands to the empty string, as an indication that the @samp{gnu}
6752 module is loaded.
6753 @end deffn
6754
6755 @deffn {Macro (gnu)} __m4_version__
6756 Expands to an unquoted string containing the release version number of
6757 the running @acronym{GNU} @code{m4} executable.
6758 @end deffn
6759
6760 This module is always loaded, unless the @option{-G} command line
6761 option is supplied at startup (@pxref{Limits control, , Invoking m4}).
6762
6763 @item traditional
6764 This module provides compatibility with System V @code{m4}, for anything
6765 not specified by @acronym{POSIX}, and is loaded instead of the
6766 @samp{gnu} module if the @option{-G} command line option is specified.
6767
6768 @deffn {Macro (traditional)} __traditional__
6769 Expands to the empty string, as an indication that the
6770 @samp{traditional} module is loaded.
6771 @end deffn
6772
6773 @item load
6774 This module supplies the builtins required to use modules from within a
6775 @acronym{GNU} @code{m4} program.  @xref{Modules}, for more details.  The
6776 module also defines the following macro:
6777
6778 @deffn {Macro (load)} __load__
6779 Expands to the empty string, as an indication that the @samp{load}
6780 module is loaded.
6781 @end deffn
6782
6783 @item mpeval
6784 This module provides the implementation for the experimental
6785 @code{mpeval} feature.  If the host machine does not have the
6786 @acronym{GNU} gmp library, the builtin will generate an error if called.
6787 @xref{Mpeval}, for more details.  The module also defines the following
6788 macro:
6789
6790 @deffn {Macro (mpeval)} __mpeval__
6791 Expands to the empty string, as an indication that the @samp{mpeval}
6792 module is loaded.
6793 @end deffn
6794 @end table
6795
6796 Here is an example of using the feature test macros.
6797
6798 @example
6799 $ @kbd{m4}
6800 __gnu__-__traditional__
6801 @result{}-__traditional__
6802 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6803 @result{}Extensions are active
6804 __gnu__(`ignored')
6805 @error{}m4:stdin:3: warning: __gnu__: extra arguments ignored: 1 > 0
6806 @result{}
6807 @end example
6808
6809 @comment options: -G
6810 @example
6811 $ @kbd{m4 --traditional}
6812 __gnu__-__traditional__
6813 @result{}__gnu__-
6814 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6815 @result{}Minimal features
6816 @end example
6817
6818 Since the version string is unquoted and can potentially contain macro
6819 names (for example, a beta release could be numbered @samp{1.9b}), or be
impacted by the use of @code{changesyntax}, the
6821 @code{__m4_version__} macro should generally be used via @code{defn}
6822 rather than directly invoked (@pxref{Defn}).  In general, feature tests
6823 are more reliable than version number checks, so exercise caution when
6824 using this macro.
6825
6826 @comment This test is excluded from the testsuite since it depends on a
6827 @comment texinfo macro; but builtins.at covers the same thing.
6828 @comment ignore
6829 @example
6830 defn(`__m4_version__')
6831 @result{}@value{VERSION}
6832 @end example
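
As an illustration of preferring a feature test over a version check,
the following sketch uses @code{patsubst} only when the @samp{gnu}
module is loaded, and otherwise falls back to @code{translit}, which
the @samp{m4} module always provides.  The input string is arbitrary.

@example
ifdef(`__gnu__', `patsubst(`a.b.c', `\.', `/')', `translit(`a.b.c', `.', `/')')
@result{}a/b/c
@end example

Both branches produce the same text here, so the input behaves
identically whether or not @option{-G} was given at startup.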
6833
6834 @node Text handling
6835 @chapter Macros for text handling
6836
6837 There are a number of builtins in @code{m4} for manipulating text in
6838 various ways, extracting substrings, searching, substituting, and so on.
6839
6840 @menu
6841 * Len::                         Calculating length of strings
6842 * Index macro::                 Searching for substrings
6843 * Regexp::                      Searching for regular expressions
6844 * Substr::                      Extracting substrings
6845 * Translit::                    Translating characters
6846 * Patsubst::                    Substituting text by regular expression
6847 * Format::                      Formatting strings (printf-like)
6848 @end menu
6849
6850 @node Len
6851 @section Calculating length of strings
6852
6853 @cindex length of strings
6854 @cindex strings, length of
6855 The length of a string can be calculated by @code{len}:
6856
6857 @deffn {Builtin (m4)} len (@var{string})
6858 Expands to the length of @var{string}, as a decimal number.
6859
6860 The macro @code{len} is recognized only with parameters.
6861 @end deffn
6862
6863 @example
6864 len()
6865 @result{}0
6866 len(`abcdef')
6867 @result{}6
6868 @end example
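
Quoting determines which string @code{len} measures.  In the following
sketch, the macro @code{greeting} is a hypothetical name defined only
for illustration:

@example
define(`greeting', `hello')
@result{}
len(greeting)
@result{}5
len(`greeting')
@result{}8
@end example

Unquoted, @code{greeting} is expanded while the argument is collected,
so @code{len} sees @samp{hello}; quoted, @code{len} sees the eight
characters of the name itself.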
6869
6870 @node Index macro
6871 @section Searching for substrings
6872
6873 @cindex substrings, locating
6874 Searching for substrings is done with @code{index}:
6875
6876 @deffn {Builtin (m4)} index (@var{string}, @var{substring}, @ovar{offset})
6877 Expands to the index of the first occurrence of @var{substring} in
6878 @var{string}.  The first character in @var{string} has index 0.  If
6879 @var{substring} does not occur in @var{string}, @code{index} expands to
6880 @samp{-1}.  If @var{offset} is provided, it determines the index at
6881 which the search starts; a negative @var{offset} specifies the offset
6882 relative to the end of @var{string}.
6883
6884 The macro @code{index} is recognized only with parameters.
6885 @end deffn
6886
6887 @example
6888 index(`gnus, gnats, and armadillos', `nat')
6889 @result{}7
6890 index(`gnus, gnats, and armadillos', `dag')
6891 @result{}-1
6892 @end example
6893
6894 Omitting @var{substring} evokes a warning, but still produces output;
6895 contrast this with an empty @var{substring}.
6896
6897 @example
6898 index(`abc')
6899 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
6900 @result{}0
6901 index(`abc', `')
6902 @result{}0
6903 index(`abc', `b')
6904 @result{}1
6905 @end example
6906
6907 @cindex @acronym{GNU} extensions
6908 As an extension, an @var{offset} can be provided to limit the search to
6909 the tail of the @var{string}.  A negative offset is interpreted relative
6910 to the end of @var{string}, and it is not an error if @var{offset}
6911 exceeds the bounds of @var{string}.
6912
6913 @example
6914 index(`aba', `a', `1')
6915 @result{}2
6916 index(`ababa', `ba', `-3')
6917 @result{}3
6918 index(`abc', `ab', `4')
6919 @result{}-1
6920 index(`abc', `bc', `-4')
6921 @result{}1
6922 @end example
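
Since @code{index} expands to @samp{-1} when there is no match, it
combines naturally with @code{ifelse} to form a containment test.  The
macro name @code{contains} below is hypothetical, chosen only for this
sketch:

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')
@result{}
contains(`gnus, gnats, and armadillos', `gnat')
@result{}yes
contains(`gnus, gnats, and armadillos', `dag')
@result{}no
@end example

Note that an empty @var{substring} is found at index 0, so this sketch
reports @samp{yes} for it.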
6923
6924 @node Regexp
6925 @section Searching for regular expressions
6926
6927 @cindex regular expressions
6928 @cindex expressions, regular
6929 @cindex @acronym{GNU} extensions
6930 Searching for regular expressions is done with the builtin
6931 @code{regexp}:
6932
6933 @deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax})
6934 @deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @
6935   @ovar{replacement}, @ovar{resyntax})
6936 Searches for @var{regexp} in @var{string}.
6937
6938 If @var{resyntax} is given, the particular flavor of regular expression
6939 understood with respect to @var{regexp} can be changed from the current
6940 default.  @xref{Changeresyntax}, for details of the values that can be
given for this argument.  If exactly three arguments are given, then the
6942 third argument is treated as @var{resyntax} only if it matches a known
6943 syntax name, otherwise it is treated as @var{replacement}.
6944
6945 If @var{replacement} is omitted, @code{regexp} expands to the index of
6946 the first match of @var{regexp} in @var{string}.  If @var{regexp} does
6947 not match anywhere in @var{string}, it expands to -1.
6948
6949 If @var{replacement} is supplied, and there was a match, @code{regexp}
6950 changes the expansion to this argument, with @samp{\@var{n}} substituted
6951 by the text matched by the @var{n}th parenthesized sub-expression of
6952 @var{regexp}, up to nine sub-expressions.  The escape @samp{\&} is
6953 replaced by the text of the entire regular expression matched.  For
6954 all other characters, @samp{\} treats the next character literally.  A
6955 warning is issued if there were fewer sub-expressions than the
6956 @samp{\@var{n}} requested, or if there is a trailing @samp{\}.  If there
6957 was no match, @code{regexp} expands to the empty string.
6958
6959 The macro @code{regexp} is recognized only with parameters.
6960 @end deffn
6961
6962 @example
6963 regexp(`GNUs not Unix', `\<[a-z]\w+')
6964 @result{}5
6965 regexp(`GNUs not Unix', `\<Q\w*')
6966 @result{}-1
6967 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
6968 @result{}*** Unix *** nix ***
6969 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
6970 @result{}
6971 @end example
6972
Here are some more examples of the handling of backslash:
6974
6975 @example
6976 regexp(`abc', `\(b\)', `\\\10\a')
6977 @result{}\b0a
6978 regexp(`abc', `b', `\1\')
6979 @error{}m4:stdin:2: warning: regexp: sub-expression 1 not present
6980 @error{}m4:stdin:2: warning: regexp: trailing \ ignored in replacement
6981 @result{}
6982 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
6983 @error{}m4:stdin:3: warning: regexp: sub-expression 4 not present
6984 @error{}m4:stdin:3: warning: regexp: sub-expression 5 not present
6985 @error{}m4:stdin:3: warning: regexp: sub-expression 6 not present
6986 @result{}c
6987 @end example
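
The two forms of @code{regexp} can be seen side by side on the same
input.  This is a small sketch with an arbitrary input string; the
labels @samp{major} and @samp{minor} are illustrative text in the
replacement, not part of any interface.

@example
regexp(`release 1.4.19', `[0-9]')
@result{}8
regexp(`release 1.4.19', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example

Without @var{replacement}, the expansion is the index of the first
digit; with it, the expansion is the replacement text alone, with the
unmatched parts of @var{string} discarded.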
6988
6989 Omitting @var{regexp} evokes a warning, but still produces output;
6990 contrast this with an empty @var{regexp} argument.
6991
6992 @example
6993 regexp(`abc')
6994 @error{}m4:stdin:1: warning: regexp: too few arguments: 1 < 2
6995 @result{}0
6996 regexp(`abc', `')
6997 @result{}0
6998 regexp(`abc', `', `\\def')
6999 @result{}\def
7000 @end example
7001
7002 If @var{resyntax} is given, @var{regexp} must be given according to
7003 the syntax chosen, though the default regular expression syntax
7004 remains unchanged for other invocations:
7005
7006 @example
7007 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***',
7008        `POSIX_EXTENDED')
7009 @result{}*** Unix *** nix ***
7010 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***')
7011 @result{}
7012 @end example
7013
Occasionally, you might want to pass a @var{resyntax} argument without
7015 wishing to give @var{replacement}.  If there are exactly three
7016 arguments, and the last argument is a valid @var{resyntax}, it is used
7017 as such, rather than as a replacement.
7018
7019 @example
7020 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED')
7021 @result{}9
7022 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED')
7023 @result{}POSIX_EXTENDED
7024 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `')
7025 @result{}
7026 regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `')
7027 @result{}POSIX_EXTENDED
7028 @end example
7029
7030 @node Substr
7031 @section Extracting substrings
7032
7033 @cindex extracting substrings
7034 @cindex substrings, extracting
7035 Substrings are extracted with @code{substr}:
7036
7037 @deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length}, @
7038   @ovar{replace})
7039 Performs a substring operation on @var{string}.  If @var{from} is
7040 positive, it represents the 0-based index where the substring begins.
7041 If @var{length} is omitted, the substring ends at the end of
7042 @var{string}; if it is positive, @var{length} is added to the starting
7043 index to determine the ending index.
7044
7045 @cindex @acronym{GNU} extensions
7046 As a @acronym{GNU} extension, if @var{from} is negative, it is added to
7047 the length of @var{string} to determine the starting index; if it is
7048 empty, the start of the string is used.  Likewise, if @var{length} is
7049 negative, it is added to the length of @var{string} to determine the
ending index, and an empty @var{length} behaves like an omitted
@var{length}.  It is not an error if either of the resulting indices lies
7052 outside the string, but the selected substring only contains the bytes
7053 of @var{string} that overlap the selected indices.  If the end point
7054 lies before the beginning point, the substring chosen is the empty
7055 string located at the starting index.
7056
7057 If @var{replace} is omitted, then the expansion is only the selected
substring, which may be empty.  As a @acronym{GNU} extension, if
7059 @var{replace} is provided, then the expansion is the original
7060 @var{string} with the selected substring replaced by @var{replace}.  The
7061 expansion is empty and a warning issued if @var{from} or @var{length}
7062 cannot be parsed, or if @var{replace} is provided but the selected
7063 indices do not overlap with @var{string}.
7064
7065 The macro @code{substr} is recognized only with parameters.
7066 @end deffn
7067
7068 @example
7069 substr(`gnus, gnats, and armadillos', `6')
7070 @result{}gnats, and armadillos
7071 substr(`gnus, gnats, and armadillos', `6', `5')
7072 @result{}gnats
7073 @end example
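
Combining @code{substr} with @code{index} gives a simple way to split a
string at a delimiter.  The macro names @code{head_of} and
@code{tail_of} are hypothetical, and this sketch assumes the delimiter
is a single character that actually occurs in the string:

@example
define(`head_of', `substr(`$1', `0', index(`$1', `$2'))')
@result{}
define(`tail_of', `substr(`$1', incr(index(`$1', `$2')))')
@result{}
head_of(`key=value', `=')
@result{}key
tail_of(`key=value', `=')
@result{}value
@end example

If the delimiter is absent, @code{index} expands to @samp{-1}, and the
results then depend on how negative arguments are treated, so a robust
version would test for that case first.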
7074
7075 Omitting @var{from} evokes a warning, but still produces output.  On the
7076 other hand, selecting a @var{from} or @var{length} that lies beyond
7077 @var{string} is not a problem.
7078
7079 @example
7080 substr(`abc')
7081 @error{}m4:stdin:1: warning: substr: too few arguments: 1 < 2
7082 @result{}abc
7083 substr(`abc', `')
7084 @result{}abc
7085 substr(`abc', `4')
7086 @result{}
7087 substr(`abc', `1', `4')
7088 @result{}bc
7089 @end example
7090
Negative values for @var{from} or @var{length} are @acronym{GNU}
extensions, useful for accessing a fixed-size tail of an
7093 arbitrary-length string.  Prior to M4 1.6, using these values would
7094 silently result in the empty string.  Some other implementations crash
7095 on negative values, and many treat an explicitly empty @var{length} as
7096 0, which is different from the omitted @var{length} implying the rest of
7097 the original @var{string}.
7098
7099 @example
7100 substr(`abcde', `2', `')
7101 @result{}cde
7102 substr(`abcde', `-3')
7103 @result{}cde
7104 substr(`abcde', `', `-3')
7105 @result{}ab
7106 substr(`abcde', `-6')
7107 @result{}abcde
7108 substr(`abcde', `-6', `5')
7109 @result{}abcd
7110 substr(`abcde', `-7', `1')
7111 @result{}
7112 substr(`abcde', `1', `-2')
7113 @result{}bc
7114 substr(`abcde', `-4', `-1')
7115 @result{}bcd
7116 substr(`abcde', `4', `-3')
7117 @result{}
7118 substr(`abcdefghij', `-09', `08')
7119 @result{}bcdefghi
7120 @end example
7121
7122 Another useful @acronym{GNU} extension, also added in M4 1.6, is the
7123 ability to replace a substring within the original @var{string}.  An
7124 empty length substring at the beginning or end of @var{string} is valid,
7125 but selecting a substring that does not overlap @var{string} causes a
7126 warning.
7127
7128 @example
7129 substr(`abcde', `1', `3', `t')
7130 @result{}ate
7131 substr(`abcde', `5', `', `f')
7132 @result{}abcdef
7133 substr(`abcde', `-3', `-4', `f')
7134 @result{}abfcde
7135 substr(`abcde', `-6', `1', `f')
7136 @result{}fabcde
7137 substr(`abcde', `-7', `1', `f')
7138 @error{}m4:stdin:5: warning: substr: substring out of range
7139 @result{}
7140 substr(`abcde', `6', `', `f')
7141 @error{}m4:stdin:6: warning: substr: substring out of range
7142 @result{}
7143 @end example
7144
If backwards compatibility with M4 1.4.x behavior is necessary, the
7146 following macro is sufficient to do the job (mimicking warnings about
7147 empty @var{from} or @var{length} or an ignored fourth argument is left
7148 as an exercise to the reader).
7149
7150 @example
7151 define(`substr', `ifelse(`$#', `0', ``$0'',
7152   eval(`2 < $#')`$3', `1', `',
7153   index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')')
7154 @result{}
7155 substr(`abcde', `3')
7156 @result{}de
7157 substr(`abcde', `3', `')
7158 @result{}
7159 substr(`abcde', `-1')
7160 @result{}
7161 substr(`abcde', `1', `-1')
7162 @result{}
7163 substr(`abcde', `2', `1', `C')
7164 @result{}c
7165 @end example
7166
7167 On the other hand, it is possible to portably emulate the @acronym{GNU}
7168 extension of negative @var{from} and @var{length} arguments across all
7169 @code{m4} implementations, albeit with a lot more overhead.  This
7170 example uses @code{incr} and @code{decr} to normalize @samp{-08} to
7171 something that a later @code{eval} will treat as a decimal value, rather
7172 than looking like an invalid octal number, while avoiding using these
7173 macros on an empty string.  The helper macro @code{_substr_normalize} is
7174 recursive, since it is easier to fix @var{length} after @var{from} has
7175 been normalized, with the final iteration supplying two non-negative
7176 arguments to the original builtin, now named @code{_substr}.
7177
7178 @comment options: -daq -t_substr
7179 @example
7180 $ @kbd{m4 -daq -t _substr}
7181 define(`_substr', defn(`substr'))dnl
7182 define(`substr', `ifelse(`$#', `0', ``$0'',
7183   `_$0(`$1', _$0_normalize(len(`$1'),
7184     ifelse(`$2', `', `0', `incr(decr(`$2'))'),
7185     ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl
7186 define(`_substr_normalize', `ifelse(
7187   eval(`$2 < 0 && $1 + $2 >= 0'), `1',
7188     `$0(`$1', eval(`$1 + $2'), `$3')',
7189   eval(`$2 < 0')`$3', `1', ``0', `$1'',
7190   eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1',
7191     `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))',
7192   eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'',
7193   eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')',
7194   `$3', `', ``$2', `$1'',
7195   eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1',
7196     ``$2', eval(`$1 - $2 + $3')',
7197   eval(`$3 - 0 < 0'), `1', ``$2', `0'',
7198   ``$2', `$3'')')dnl
7199 substr(`abcde', `2', `')
7200 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7201 @result{}cde
7202 substr(`abcde', `-3')
7203 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7204 @result{}cde
7205 substr(`abcde', `', `-3')
7206 @error{}m4trace: -1- _substr(`abcde', `0', `2')
7207 @result{}ab
7208 substr(`abcde', `-6')
7209 @error{}m4trace: -1- _substr(`abcde', `0', `5')
7210 @result{}abcde
7211 substr(`abcde', `-6', `5')
7212 @error{}m4trace: -1- _substr(`abcde', `0', `4')
7213 @result{}abcd
7214 substr(`abcde', `-7', `1')
7215 @error{}m4trace: -1- _substr(`abcde', `0', `0')
7216 @result{}
7217 substr(`abcde', `1', `-2')
7218 @error{}m4trace: -1- _substr(`abcde', `1', `2')
7219 @result{}bc
7220 substr(`abcde', `-4', `-1')
7221 @error{}m4trace: -1- _substr(`abcde', `1', `3')
7222 @result{}bcd
7223 substr(`abcde', `4', `-3')
7224 @error{}m4trace: -1- _substr(`abcde', `4', `0')
7225 @result{}
7226 substr(`abcdefghij', `-09', `08')
7227 @error{}m4trace: -1- _substr(`abcdefghij', `1', `8')
7228 @result{}bcdefghi
7229 @end example
7230
7231 @node Translit
7232 @section Translating characters
7233
7234 @cindex translating characters
7235 @cindex characters, translating
7236 Character translation is done with @code{translit}:
7237
7238 @deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement})
7239 Expands to @var{string}, with each character that occurs in
7240 @var{chars} translated into the character from @var{replacement} with
7241 the same index.
7242
7243 If @var{replacement} is shorter than @var{chars}, the excess characters
7244 of @var{chars} are deleted from the expansion; if @var{chars} is
7245 shorter, the excess characters in @var{replacement} are silently
7246 ignored.  If @var{replacement} is omitted, all characters in
7247 @var{string} that are present in @var{chars} are deleted from the
7248 expansion.  If a character appears more than once in @var{chars}, only
7249 the first instance is used in making the translation.  Only a single
7250 translation pass is made, even if characters in @var{replacement} also
7251 appear in @var{chars}.
7252
7253 As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
7254 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
7255 letters) or @samp{0-9} (meaning all digits).  To include a dash @samp{-}
7256 in @var{chars} or @var{replacement}, place it first or last in the
7257 entire string, or as the last character of a range.  Back-to-back ranges
7258 can share a common endpoint.  It is not an error for the last character
7259 in the range to be `larger' than the first.  In that case, the range
7260 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
7261 The expansion of a range is dependent on the underlying encoding of
7262 characters, so using ranges is not always portable between machines.
7263
7264 The macro @code{translit} is recognized only with parameters.
7265 @end deffn
7266
7267 @example
7268 translit(`GNUs not Unix', `A-Z')
7269 @result{}s not nix
7270 translit(`GNUs not Unix', `a-z', `A-Z')
7271 @result{}GNUS NOT UNIX
7272 translit(`GNUs not Unix', `A-Z', `z-a')
7273 @result{}tmfs not fnix
7274 translit(`+,-12345', `+--1-5', `<;>a-c-a')
7275 @result{}<;>abcba
7276 translit(`abcdef', `aabdef', `bcged')
7277 @result{}bgced
7278 @end example
7279
7280 In the @sc{ascii} encoding, the first example deletes all uppercase
7281 letters, the second converts lowercase to uppercase, and the third
7282 `mirrors' all uppercase letters, while converting them to lowercase.
The first two cases are by far the most common, even though they are not
7284 portable to @sc{ebcdic} or other encodings.  The fourth example shows a
7285 range ending in @samp{-}, as well as back-to-back ranges.  The final
7286 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
7287 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
7288 @samp{e} are swapped, and the @samp{f} is discarded.
7289
7290 Omitting @var{chars} evokes a warning, but still produces output.
7291
7292 @example
7293 translit(`abc')
7294 @error{}m4:stdin:1: warning: translit: too few arguments: 1 < 2
7295 @result{}abc
7296 @end example
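
Because only a single translation pass is made, a mapping may send
characters onto each other without interference.  The following sketch
builds a rot13 transformation from ranges; the macro name @code{rot13}
is hypothetical, and the example assumes the @sc{ascii} encoding, like
the range examples above.

@example
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`Hello')
@result{}Uryyb
rot13(rot13(`Hello'))
@result{}Hello
@end example

Each of the four ranges in @var{replacement} stands alone, since none
of them is immediately followed by @samp{-}.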
7297
7298 @node Patsubst
7299 @section Substituting text by regular expression
7300
7301 @cindex regular expressions
7302 @cindex expressions, regular
7303 @cindex pattern substitution
7304 @cindex substitution by regular expression
7305 @cindex @acronym{GNU} extensions
7306 Global substitution in a string is done by @code{patsubst}:
7307
7308 @deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @
7309   @ovar{replacement}, @ovar{resyntax})
7310 Searches @var{string} for matches of @var{regexp}, and substitutes
7311 @var{replacement} for each match.
7312
7313 If @var{resyntax} is given, the particular flavor of regular expression
7314 understood with respect to @var{regexp} can be changed from the current
7315 default.  @xref{Changeresyntax}, for details of the values that can be
given for this argument.  Unlike @code{regexp}, if exactly three
arguments are given, the third argument is always treated as
7318 @var{replacement}, even if it matches a known syntax name.
7319
7320 The parts of @var{string} that are not covered by any match of
7321 @var{regexp} are copied to the expansion.  Whenever a match is found, the
7322 search proceeds from the end of the match, so a character from
7323 @var{string} will never be substituted twice.  If @var{regexp} matches a
7324 string of zero length, the start position for the search is incremented,
7325 to avoid infinite loops.
7326
7327 When a replacement is to be made, @var{replacement} is inserted into
7328 the expansion, with @samp{\@var{n}} substituted by the text matched by
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
7330 nine sub-expressions.  The escape @samp{\&} is replaced by the text of
7331 the entire regular expression matched.  For all other characters,
7332 @samp{\} treats the next character literally.  A warning is issued if
7333 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
7334 if there is a trailing @samp{\}.
7335
7336 The @var{replacement} argument can be omitted, in which case the text
7337 matched by @var{regexp} is deleted.
7338
7339 The macro @code{patsubst} is recognized only with parameters.
7340 @end deffn
7341
7342 When used with two arguments, @code{regexp} returns the position of the
7343 match, but @code{patsubst} deletes the match:
7344
7345 @example
7346 patsubst(`GNUs not Unix', `^', `OBS: ')
7347 @result{}OBS: GNUs not Unix
7348 patsubst(`GNUs not Unix', `\<', `OBS: ')
7349 @result{}OBS: GNUs OBS: not OBS: Unix
7350 patsubst(`GNUs not Unix', `\w*', `(\&)')
7351 @result{}(GNUs)() (not)() (Unix)()
7352 patsubst(`GNUs not Unix', `\w+', `(\&)')
7353 @result{}(GNUs) (not) (Unix)
7354 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
7355 @result{}GN not@w{ }
7356 patsubst(`GNUs not Unix', `not', `NOT\')
7357 @error{}m4:stdin:6: warning: patsubst: trailing \ ignored in replacement
7358 @result{}GNUs NOT Unix
7359 @end example
7360
7361 Here is a slightly more realistic example, which capitalizes individual
7362 words or whole sentences, by substituting calls of the macros
7363 @code{upcase} and @code{downcase} into the strings.
7364
7365 @deffn Composite upcase (@var{text})
7366 @deffnx Composite downcase (@var{text})
7367 @deffnx Composite capitalize (@var{text})
7368 Expand to @var{text}, but with capitalization changed: @code{upcase}
7369 changes all letters to upper case, @code{downcase} changes all letters
7370 to lower case, and @code{capitalize} changes the first character of each
7371 word to upper case and the remaining characters to lower case.
7372 @end deffn
7373
7374 First, an example of their usage, using implementations distributed in
7375 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
7376
7377 @comment examples
7378 @example
7379 $ @kbd{m4 -I examples}
7380 include(`capitalize.m4')
7381 @result{}
7382 upcase(`GNUs not Unix')
7383 @result{}GNUS NOT UNIX
7384 downcase(`GNUs not Unix')
7385 @result{}gnus not unix
7386 capitalize(`GNUs not Unix')
7387 @result{}Gnus Not Unix
7388 @end example
7389
7390 Now for the implementation.  There is a helper macro @code{_capitalize}
7391 which puts only its first word in mixed case.  Then @code{capitalize}
7392 merely parses out the words, and replaces them with an invocation of
7393 @code{_capitalize}.  (As presented here, the @code{capitalize} macro has
7394 some subtle flaws.  You should try to see if you can find and correct
7395 them; or @pxref{Improved capitalize, , Answers}).
7396
7397 @comment examples
7398 @example
7399 $ @kbd{m4 -I examples}
7400 undivert(`capitalize.m4')dnl
7401 @result{}divert(`-1')
7402 @result{}# upcase(text)
7403 @result{}# downcase(text)
7404 @result{}# capitalize(text)
7405 @result{}#   change case of text, simple version
7406 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
7407 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
7408 @result{}define(`_capitalize',
7409 @result{}       `regexp(`$1', `^\(\w\)\(\w*\)',
7410 @result{}               `upcase(`\1')`'downcase(`\2')')')
7411 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
7412 @result{}divert`'dnl
7413 @end example
7414
7415 If @var{resyntax} is given, @var{regexp} must be given according to
7416 the syntax chosen, though the default regular expression syntax
7417 remains unchanged for other invocations:
7418
7419 @example
7420 define(`epatsubst',
7421        `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl
7422 epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_')
7423 @result{}_bar_ _baz_
7424 patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_')
7425 @result{}_bar_ _baz_
7426 @end example
7427
7428 While @code{regexp} replaces the whole input with the replacement as
7429 soon as there is a match, @code{patsubst} replaces each
7430 @emph{occurrence} of a match and preserves non-matching pieces:
7431
7432 @example
7433 define(`patreg',
7434 `patsubst($@@)
7435 regexp($@@)')dnl
7436 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
7437 @result{}bar FOO baz FOO
7438 @result{}FOO
7439 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
7440 @result{}bab abb 212
7441 @result{}bab
7442 @end example
7443
7444 Omitting @var{regexp} evokes a warning, but still produces output;
7445 contrast this with an empty @var{regexp} argument.
7446
7447 @example
7448 patsubst(`abc')
7449 @error{}m4:stdin:1: warning: patsubst: too few arguments: 1 < 2
7450 @result{}abc
7451 patsubst(`abc', `')
7452 @result{}abc
7453 patsubst(`abc', `', `\\-')
7454 @result{}\-a\-b\-c\-
7455 @end example
7456
7457 @node Format
7458 @section Formatting strings (printf-like)
7459
7460 @cindex formatted output
7461 @cindex output, formatted
7462 @cindex @acronym{GNU} extensions
7463 Formatted output can be made with @code{format}:
7464
7465 @deffn {Builtin (gnu)} format (@var{format-string}, @dots{})
7466 Works much like the C function @code{printf}.  The first argument
7467 @var{format-string} can contain @samp{%} specifications which are
7468 satisfied by additional arguments, and the expansion of @code{format} is
7469 the formatted string.
7470
7471 The macro @code{format} is recognized only with parameters.
7472 @end deffn
7473
7474 Its use is best described by a few examples:
7475
7476 @comment This test is a bit fragile, if someone tries to port to a
7477 @comment platform without infinity.
7478 @example
7479 define(`foo', `The brown fox jumped over the lazy dog')
7480 @result{}
7481 format(`The string "%s" uses %d characters', foo, len(foo))
7482 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
7483 format(`%*.*d', `-1', `-1', `1')
7484 @result{}1
7485 format(`%.0f', `56789.9876')
7486 @result{}56790
7487 len(format(`%-*X', `5000', `1'))
7488 @result{}5000
7489 ifelse(format(`%010F', `infinity'), `       INF', `success',
7490        format(`%010F', `infinity'), `  INFINITY', `success',
7491        format(`%010F', `infinity'))
7492 @result{}success
7493 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
7494        format(`%.1A', `1.999'), `0X2.0P+0', `success',
7495        format(`%.1A', `1.999'))
7496 @result{}success
7497 format(`%g', `0xa.P+1')
7498 @result{}20
7499 @end example
7500
7501 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
7502 example shows how @code{format} can be used to produce tabular output.
7503
7504 @comment examples
7505 @example
7506 $ @kbd{m4 -I examples}
7507 include(`forloop.m4')
7508 @result{}
7509 forloop(`i', `1', `10', `format(`%6d squared is %10d
7510 ', i, eval(i**2))')
7511 @result{}     1 squared is          1
7512 @result{}     2 squared is          4
7513 @result{}     3 squared is          9
7514 @result{}     4 squared is         16
7515 @result{}     5 squared is         25
7516 @result{}     6 squared is         36
7517 @result{}     7 squared is         49
7518 @result{}     8 squared is         64
7519 @result{}     9 squared is         81
7520 @result{}    10 squared is        100
7521 @result{}
7522 @end example
7523
7524 The builtin @code{format} is modeled after the ANSI C @samp{printf}
7525 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
7526 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
7527 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
7528 @samp{%}; it supports field widths and precisions, and the flags
7529 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}.  For
integer specifiers, the length modifiers @samp{hh}, @samp{h}, and
@samp{l} are recognized, and for floating point specifiers, the length
modifier @samp{l} is recognized.  Items not yet supported include
7533 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
7534 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
7535 modifiers, and any platform extensions available in the native
@code{printf}.  Note that @samp{%a} is supported even on platforms that
have not yet implemented C99 hexadecimal floating point output natively.
For more details on the functioning of @code{printf}, see the C Library
Manual or the @acronym{POSIX} specification.
7540
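As an illustrative sketch (these invocations are not taken from the
testsuite, and the outputs assume that the flags behave as they do in C
@code{printf}), the following shows a few of the flags, field widths,
and precisions listed above:

@example
format(`%05d|%-5d|%+d', `42', `42', `42')
@result{}00042|42   |+42
format(`%x %X %o %#x', `255', `255', `8', `255')
@result{}ff FF 10 0xff
format(`%.3s|%8.2f|%e', `abcdef', `3.14159', `12345.678')
@result{}abc|    3.14|1.234568e+04
format(`100%%')
@result{}100%
@end example
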
7541 @c FIXME - format still needs some improvements.
7542 Warnings are issued for unrecognized specifiers, an improper number of
7543 arguments, or difficulty parsing an argument according to the format
7544 string (such as overflow or extra characters).  It is anticipated that a
7545 future release of @acronym{GNU} @code{m4} will support more specifiers.
7546 Likewise, escape sequences are not yet recognized.
7547
7548 @example
7549 format(`%p', `0')
7550 @error{}m4:stdin:1: warning: format: unrecognized specifier in '%p'
7551 @result{}p
7552 format(`%*d', `')
7553 @error{}m4:stdin:2: warning: format: empty string treated as 0
7554 @error{}m4:stdin:2: warning: format: too few arguments: 2 < 3
7555 @result{}0
7556 format(`%.1f', `2a')
7557 @error{}m4:stdin:3: warning: format: non-numeric argument '2a'
7558 @result{}2.0
7559 @end example
7560
7561 @node Arithmetic
7562 @chapter Macros for doing arithmetic
7563
7564 @cindex arithmetic
7565 @cindex integer arithmetic
7566 Integer arithmetic is included in @code{m4}, with a C-like syntax.  As
7567 convenient shorthands, there are builtins for simple increment and
7568 decrement operations.
7569
7570 @menu
7571 * Incr::                        Decrement and increment operators
7572 * Eval::                        Evaluating integer expressions
7573 * Mpeval::                      Multiple precision arithmetic
7574 @end menu
7575
7576 @node Incr
7577 @section Decrement and increment operators
7578
7579 @cindex decrement operator
7580 @cindex increment operator
7581 Increment and decrement of integers are supported using the builtins
7582 @code{incr} and @code{decr}:
7583
7584 @deffn {Builtin (m4)} incr (@var{number})
7585 @deffnx {Builtin (m4)} decr (@var{number})
7586 Expand to the numerical value of @var{number}, incremented
7587 or decremented, respectively, by one.  Except for the empty string, the
7588 expansion is empty if @var{number} could not be parsed.
7589
7590 The macros @code{incr} and @code{decr} are recognized only with
7591 parameters.
7592 @end deffn
7593
7594 @example
7595 incr(`4')
7596 @result{}5
7597 decr(`7')
7598 @result{}6
7599 incr()
7600 @error{}m4:stdin:3: warning: incr: empty string treated as 0
7601 @result{}1
7602 decr()
7603 @error{}m4:stdin:4: warning: decr: empty string treated as 0
7604 @result{}-1
7605 @end example
7606
7607 The builtin macros @code{incr} and @code{decr} are recognized only when
7608 given arguments.
7609
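A common use of these builtins is maintaining a counter.  The following
is a minimal sketch; the macro names @code{counter} and @code{next} are
hypothetical and chosen only for illustration:

@example
define(`counter', `0')
@result{}
define(`next', `define(`counter', incr(counter))counter')
@result{}
next
@result{}1
next
@result{}2
decr(decr(`10'))
@result{}8
@end example
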
7610 @node Eval
7611 @section Evaluating integer expressions
7612
7613 @cindex integer expression evaluation
7614 @cindex evaluation, of integer expressions
7615 @cindex expressions, evaluation of integer
7616 Integer expressions are evaluated with @code{eval}:
7617
7618 @deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
7619 Expands to the value of @var{expression}.  The expansion is empty
7620 if a problem is encountered while parsing the arguments.  If specified,
7621 @var{radix} and @var{width} control the format of the output.
7622
7623 Calculations are done with signed numbers, using at least 31-bit
7624 precision, but as a @acronym{GNU} extension, @code{m4} will use wider
7625 integers if available.  Precision is finite, based on the platform's
7626 notion of @code{intmax_t}, and overflow silently results in wraparound.
7627 A warning is issued if division by zero is attempted, or if
7628 @var{expression} could not be parsed.
7629
7630 Expressions can contain the following operators, listed in order of
7631 decreasing precedence.
7632
7633 @table @samp
7634 @item ()
7635 Parentheses
7636 @item +  -  ~  !
7637 Unary plus and minus, and bitwise and logical negation
7638 @item **
7639 Exponentiation
7640 @item *  /  %  \
7641 Multiplication, division, modulo, and ratio
7642 @item +  -
7643 Addition and subtraction
7644 @item <<  >>  >>>
7645 Shift left, shift right, unsigned shift right
7646 @item >  >=  <  <=
7647 Relational operators
7648 @item ==  !=
7649 Equality operators
7650 @item &
7651 Bitwise and
7652 @item ^
7653 Bitwise exclusive-or
7654 @item |
7655 Bitwise or
7656 @item &&
7657 Logical and
7658 @item ||
7659 Logical or
7660 @item ?:
7661 Conditional ternary
7662 @item ,
7663 Sequential evaluation
7664 @end table
7665
7666 The macro @code{eval} is recognized only with parameters.
7667 @end deffn
7668
7669 All binary operators, except exponentiation, are left associative.  C
7670 operators that perform variable assignment, such as @samp{+=} or
7671 @samp{--}, are not implemented, since @code{eval} only operates on
constants, not variables.  Attempting to use them results in a warning
and an empty expansion.
7673 @comment FIXME - since XCU ERN 137 is approved, we could provide an
7674 @comment extension that supported assignment operators.
7675
7676 Note that some older @code{m4} implementations use @samp{^} as an
alternate operator for exponentiation, although @acronym{POSIX}
7678 requires the C behavior of bitwise exclusive-or.  The precedence of the
7679 negation operators, @samp{~} and @samp{!}, was traditionally lower than
7680 equality.  The unary operators could not be used reliably more than once
7681 on the same term without intervening parentheses.  The traditional
7682 precedence of the equality operators @samp{==} and @samp{!=} was
7683 identical instead of lower than the relational operators such as
7684 @samp{<}, even through @acronym{GNU} M4 1.4.8.  Starting with version
7685 1.4.9, @acronym{GNU} M4 correctly follows @acronym{POSIX} precedence
7686 rules.  M4 scripts designed to be portable between releases must be
7687 aware that parentheses may be required to enforce C precedence rules.
7688 Likewise, division by zero, even in the unused branch of a
7689 short-circuiting operator, is not always well-defined in other
7690 implementations.
7691
7692 Following are some examples where the current version of M4 follows C
7693 precedence rules, but where older versions and some other
7694 implementations of @code{m4} require explicit parentheses to get the
7695 correct result:
7696
7697 @example
7698 eval(`1 == 2 > 0')
7699 @result{}1
7700 eval(`(1 == 2) > 0')
7701 @result{}0
7702 eval(`! 0 * 2')
7703 @result{}2
7704 eval(`! (0 * 2)')
7705 @result{}1
7706 eval(`1 | 1 ^ 1')
7707 @result{}1
7708 eval(`(1 | 1) ^ 1')
7709 @result{}0
7710 eval(`+ + - ~ ! ~ 0')
7711 @result{}1
7712 eval(`++0')
7713 @error{}m4:stdin:8: warning: eval: invalid operator: '++0'
7714 @result{}
7715 eval(`1 = 1')
7716 @error{}m4:stdin:9: warning: eval: invalid operator: '1 = 1'
7717 @result{}
7718 eval(`0 |= 1')
7719 @error{}m4:stdin:10: warning: eval: invalid operator: '0 |= 1'
7720 @result{}
7721 eval(`2 || 1 / 0')
7722 @result{}1
7723 eval(`0 || 1 / 0')
7724 @error{}m4:stdin:12: warning: eval: divide by zero: '0 || 1 / 0'
7725 @result{}
7726 eval(`0 && 1 % 0')
7727 @result{}0
7728 eval(`2 && 1 % 0')
7729 @error{}m4:stdin:14: warning: eval: modulo by zero: '2 && 1 % 0'
7730 @result{}
7731 @end example
7732
7733 @cindex @acronym{GNU} extensions
7734 As a @acronym{GNU} extension, @code{eval} supports several operators
7735 that do not appear in C@.  A right-associative exponentiation operator
7736 @samp{**} computes the value of the left argument raised to the right,
7737 modulo the numeric precision width.  If evaluated, the exponent must be
7738 non-negative, and at least one of the arguments must be non-zero, or a
7739 warning is issued.  An unsigned shift operator @samp{>>>} allows
7740 shifting a negative number as though it were an unsigned bit pattern,
7741 which shifts in 0 bits rather than twos-complement sign-extension.  A
7742 ratio operator @samp{\} behaves like normal division @samp{/} on
7743 integers, but is provided for symmetry with @code{mpeval}.
7744 Additionally, the C operators @samp{,} and @samp{?:} are supported.
7745
7746 @example
7747 eval(`2 ** 3 ** 2')
7748 @result{}512
7749 eval(`(2 ** 3) ** 2')
7750 @result{}64
7751 eval(`0 ** 1')
7752 @result{}0
7753 eval(`2 ** 0')
7754 @result{}1
7755 eval(`0 ** 0')
7756 @result{}
7757 @error{}m4:stdin:5: warning: eval: divide by zero: '0 ** 0'
7758 eval(`4 ** -2')
7759 @error{}m4:stdin:6: warning: eval: negative exponent: '4 ** -2'
7760 @result{}
7761 eval(`2 || 4 ** -2')
7762 @result{}1
7763 eval(`(-1 >> 1) == -1')
7764 @result{}1
7765 eval(`(-1 >>> 1) > (1 << 30)')
7766 @result{}1
7767 eval(`6 \ 3')
7768 @result{}2
7769 eval(`1 ? 2 : 3')
7770 @result{}2
7771 eval(`0 ? 2 : 3')
7772 @result{}3
7773 eval(`1 ? 2 : 1/0')
7774 @result{}2
7775 eval(`0 ? 1/0 : 3')
7776 @result{}3
7777 eval(`4, 5')
7778 @result{}5
7779 @end example
7780
Within @var{expression} (but not @var{radix} or @var{width}), numbers
without a special prefix are decimal.  A simple @samp{0} prefix
introduces an octal number, and @samp{0x} introduces a hexadecimal
number.  As @acronym{GNU} extensions, @samp{0b} introduces a binary
number, and @samp{0r} introduces a number expressed in any radix between
1 and 36: the prefix should be immediately followed by the decimal
expression of the radix, a colon, then the digits making up the number.
For radix 1, leading zeros are ignored, and all remaining digits must be
@samp{1}; for all other radices, the digits are @samp{0}, @samp{1},
@samp{2}, @dots{}.  Beyond @samp{9}, the digits are @samp{a}, @samp{b},
@dots{}, up to @samp{z}.  Lower and upper case letters can be used
interchangeably in number prefixes and as number digits.
7793
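The prefixes can be sketched as follows (an illustrative example, not
taken from the testsuite):

@example
eval(`010')
@result{}8
eval(`0x1F + 0B11')
@result{}34
eval(`0r36:z')
@result{}35
eval(`0R2:101 == 0b101')
@result{}1
@end example
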
7794 Parentheses may be used to group subexpressions whenever needed.  For the
7795 relational operators, a true relation returns @code{1}, and a false
relation returns @code{0}.
7797
Here are a few examples of the use of @code{eval}.
7799
7800 @example
7801 eval(`-3 * 5')
7802 @result{}-15
7803 eval(`-99 / 10')
7804 @result{}-9
7805 eval(`-99 % 10')
7806 @result{}-9
7807 eval(`99 % -10')
7808 @result{}9
7809 eval(index(`Hello world', `llo') >= 0)
7810 @result{}1
7811 eval(`0r1:0111 + 0b100 + 0r3:12')
7812 @result{}12
7813 define(`square', `eval(`($1) ** 2')')
7814 @result{}
7815 square(`9')
7816 @result{}81
7817 square(square(`5')` + 1')
7818 @result{}676
7819 define(`foo', `666')
7820 @result{}
7821 eval(`foo / 6')
7822 @error{}m4:stdin:11: warning: eval: bad expression: 'foo / 6'
7823 @result{}
7824 eval(foo / 6)
7825 @result{}111
7826 @end example
7827
7828 As the last two lines show, @code{eval} does not handle macro
7829 names, even if they expand to a valid expression (or part of a valid
7830 expression).  Therefore all macros must be expanded before they are
7831 passed to @code{eval}.
7832 @comment update this if we add support for variables.
7833
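One way to satisfy this requirement is to leave the macro name outside
the quoted parts of the expression, so that it is expanded while the
argument is collected.  In this sketch, @code{width} and @code{area} are
hypothetical names used only for illustration:

@example
define(`width', `8')
@result{}
define(`area', `eval(width` * $1')')
@result{}
area(`3')
@result{}24
define(`width', `10')
@result{}
area(`3')
@result{}30
@end example
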
7834 Some calculations are not portable to other implementations, since they
7835 have undefined semantics in C, but @acronym{GNU} @code{m4} has
7836 well-defined behavior on overflow.  When shifting, an out-of-range shift
7837 amount is implicitly brought into the range of the precision using
7838 modulo arithmetic (for example, on 32-bit integers, this would be an
implicit bitwise and with @samp{0x1f}).  This example should work whether your
7840 platform uses 32-bit integers, 64-bit integers, or even some other
7841 atypical size.
7842
7843 @example
7844 define(`max_int', eval(`-1 >>> 1'))
7845 @result{}
7846 define(`min_int', eval(max_int` + 1'))
7847 @result{}
7848 eval(min_int` < 0')
7849 @result{}1
7850 eval(max_int` > 0')
7851 @result{}1
7852 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
7853 @result{}overflow occurred
7854 eval(`0x80000000 % -1')
7855 @result{}0
7856 eval(`-4 >> 1')
7857 @result{}-2
7858 eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2'))
7859 @result{}-2
7860 @end example
7861
7862 If @var{radix} is specified, it specifies the radix to be used in the
7863 expansion.  The default radix is 10; this is also the case if
7864 @var{radix} is the empty string.  A warning results if the radix is
7865 outside the range of 1 through 36, inclusive.  The result of @code{eval}
7866 is always taken to be signed.  No radix prefix is output, and for
7867 radices greater than 10, the digits are lower case (although some
7868 other implementations use upper case).  The output is unquoted, and
7869 subject to further macro expansion.  The @var{width}
7870 argument specifies the minimum output width, excluding any negative
7871 sign.  The result is zero-padded to extend the expansion to the
7872 requested width.  A warning results if the width is negative.  If
7873 @var{radix} or @var{width} is out of bounds, the expansion of
7874 @code{eval} is empty.
7875
7876 @example
7877 eval(`666', `10')
7878 @result{}666
7879 eval(`666', `11')
7880 @result{}556
7881 eval(`666', `6')
7882 @result{}3030
7883 eval(`666', `6', `10')
7884 @result{}0000003030
7885 eval(`-666', `6', `10')
7886 @result{}-0000003030
7887 eval(`10', `', `0')
7888 @result{}10
7889 `0r1:'eval(`10', `1', `11')
7890 @result{}0r1:01111111111
7891 eval(`10', `16')
7892 @result{}a
7893 eval(`1', `37')
7894 @error{}m4:stdin:9: warning: eval: radix out of range: 37
7895 @result{}
7896 eval(`1', , `-1')
7897 @error{}m4:stdin:10: warning: eval: negative width: -1
7898 @result{}
7899 eval()
7900 @error{}m4:stdin:11: warning: eval: empty string treated as 0
7901 @result{}0
7902 eval(` ')
7903 @error{}m4:stdin:12: warning: eval: empty string treated as 0
7904 @result{}0
7905 define(`a', `hi')eval(` 10 ', `16')
7906 @result{}hi
7907 @end example
7908
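The @var{radix} and @var{width} arguments make it easy to build small
formatting helpers.  The following sketch defines a hypothetical macro
@code{hex} that renders its argument as eight zero-padded hexadecimal
digits; the prefix is added by hand, since @code{eval} never outputs
one:

@example
define(`hex', ``0x'eval(`$1', `16', `8')')
@result{}
hex(`255')
@result{}0x000000ff
hex(`0x10 * 2')
@result{}0x00000020
eval(`5', `2', `8')
@result{}00000101
@end example
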
7909 @node Mpeval
7910 @section Multiple precision arithmetic
7911
7912 When @code{m4} is compiled with a multiple precision arithmetic library
7913 (@pxref{Experiments}), a builtin @code{mpeval} is defined.
7914
7915 @deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @
7916   @ovar{width})
7917 Behaves similarly to @code{eval}, except the calculations are done with
7918 infinite precision, and rational numbers are supported.  Numbers may be
7919 of any length.
7920
7921 The macro @code{mpeval} is recognized only with parameters.
7922 @end deffn
7923
7924 For the most part, using @code{mpeval} is similar to using @code{eval}:
7925
7926 @comment options: -m mpeval
7927 @example
7928 $ @kbd{m4 -m mpeval}
7929 mpeval(`(1 << 70) + 2 ** 68 * 3', `16')
7930 @result{}700000000000000000
7931 `0r24:'mpeval(`0r36:zYx', `24', `5')
7932 @result{}0r24:038m9
7933 @end example
7934
7935 The ratio operator, @samp{\}, is provided with the same precedence as
7936 division, and rationally divides two numbers and canonicalizes the
7937 result, whereas the division operator @samp{/} always returns the
7938 integer quotient of the division.  To convert a rational value to
7939 integral, divide (@samp{/}) by 1.  Some operators, such as @samp{%},
7940 @samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate
7941 only on integers and will truncate any rational remainder.  The unsigned
shift operator, @samp{>>>}, behaves identically to regular right
7943 shifts, @samp{>>}, since with infinite precision, it is not possible to
7944 convert a negative number to a positive using shifts.  The
7945 exponentiation operator, @samp{**}, assumes that the exponent is
7946 integral, but allows negative exponents.  With the short-circuit logical
7947 operators, @samp{||} and @samp{&&}, a non-zero result preserves the
7948 value of the argument that ended evaluation, rather than collapsing to
7949 @samp{1}.  The operators @samp{?:} and @samp{,} are always available,
7950 even in @acronym{POSIX} mode, since @code{mpeval} does not have to
7951 conform to the @acronym{POSIX} rules for @code{eval}.
7952
7953 @comment options: -m mpeval
7954 @example
7955 $ @kbd{m4 -m mpeval}
7956 mpeval(`2 / 4')
7957 @result{}0
7958 mpeval(`2 \ 4')
7959 @result{}1\2
7960 mpeval(`2 || 3')
7961 @result{}2
7962 mpeval(`1 && 3')
7963 @result{}3
7964 mpeval(`-1 >> 1')
7965 @result{}-1
7966 mpeval(`-1 >>> 1')
7967 @result{}-1
7968 @end example
7969
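As a further sketch, the following shows a value too large for
@code{eval} on typical platforms, the addition of two ratios, and the
conversion of a ratio to an integer.  The results shown follow from the
description above and assume a build where the @code{mpeval} module is
available:

@comment options: -m mpeval
@example
$ @kbd{m4 -m mpeval}
mpeval(`2 ** 100')
@result{}1267650600228229401496703205376
mpeval(`1 \ 3 + 1 \ 6')
@result{}1\2
mpeval(`(7 \ 2) / 1')
@result{}3
@end example
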
7970 @node Shell commands
7971 @chapter Macros for running shell commands
7972
7973 @cindex UNIX commands, running
7974 @cindex executing shell commands
7975 @cindex running shell commands
7976 @cindex shell commands, running
7977 @cindex commands, running shell
7978 There are a few builtin macros in @code{m4} that allow you to run shell
7979 commands from within @code{m4}.
7980
7981 Note that the definition of a valid shell command is system dependent.
7982 On UNIX systems, this is the typical @command{/bin/sh}.  But on other
7983 systems, such as native Windows, the shell has a different syntax of
7984 commands that it understands.  Some examples in this chapter assume
7985 @command{/bin/sh}, and also demonstrate how to quit early with a known
7986 exit value if this is not the case.
7987
7988 @menu
7989 * Platform macros::             Determining the platform
7990 * Syscmd::                      Executing simple commands
7991 * Esyscmd::                     Reading the output of commands
7992 * Sysval::                      Exit status
7993 * Mkstemp::                     Making temporary files
7994 * Mkdtemp::                     Making temporary directories
7995 @end menu
7996
7997 @node Platform macros
7998 @section Determining the platform
7999
8000 @cindex platform macros
8001 Sometimes it is desirable for an input file to know which platform
8002 @code{m4} is running on.  @acronym{GNU} @code{m4} provides several
8003 macros that are predefined to expand to the empty string; checking for
8004 their existence will confirm platform details.
8005
8006 @deffn {Optional builtin (gnu)} __os2__
8007 @deffnx {Optional builtin (traditional)} os2
8008 @deffnx {Optional builtin (gnu)} __unix__
8009 @deffnx {Optional builtin (traditional)} unix
8010 @deffnx {Optional builtin (gnu)} __windows__
8011 @deffnx {Optional builtin (traditional)} windows
8012 Each of these macros is conditionally defined as needed to describe the
8013 environment of @code{m4}.  If defined, each macro expands to the empty
8014 string.
8015 @end deffn
8016
8017 On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
8018 in the @samp{gnu} module, and @code{unix} in the @samp{traditional}
8019 module.
8020
8021 On native Windows systems, @acronym{GNU} @code{m4} will define
8022 @code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in
8023 the @samp{traditional} module.
8024
8025 On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
8026 in the @samp{gnu} module, and @code{os2} in the @samp{traditional}
8027 module.
8028
8029 If @acronym{GNU} M4 does not provide a platform macro for your system,
8030 please report that as a bug.
8031
8032 @example
8033 define(`provided', `0')
8034 @result{}
8035 ifdef(`__unix__', `define(`provided', incr(provided))')
8036 @result{}
8037 ifdef(`__windows__', `define(`provided', incr(provided))')
8038 @result{}
8039 ifdef(`__os2__', `define(`provided', incr(provided))')
8040 @result{}
8041 provided
8042 @result{}1
8043 @end example
8044
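These macros are typically tested with @code{ifdef} to select
platform-specific text.  The following sketch defines a hypothetical
macro @code{null_device}; the result shown assumes a UNIX system, and
would be @samp{NUL} on native Windows:

@example
ifdef(`__unix__', `define(`null_device', `/dev/null')')
@result{}
ifdef(`__windows__', `define(`null_device', `NUL')')
@result{}
null_device
@result{}/dev/null
@end example
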
8045 @node Syscmd
8046 @section Executing simple commands
8047
8048 Any shell command can be executed, using @code{syscmd}:
8049
8050 @deffn {Builtin (m4)} syscmd (@var{shell-command})
8051 Executes @var{shell-command} as a shell command.
8052
8053 The expansion of @code{syscmd} is void, @emph{not} the output from
8054 @var{shell-command}!  Output or error messages from @var{shell-command}
8055 are not read by @code{m4}.  @xref{Esyscmd}, if you need to process the
8056 command output.
8057
8058 Prior to executing the command, @code{m4} flushes its buffers.
8059 The default standard input, output and error of @var{shell-command} are
8060 the same as those of @code{m4}.
8061
8062 By default, the @var{shell-command} will be used as the argument to the
8063 @option{-c} option of the @command{/bin/sh} shell (or the version of
8064 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8065 supports that).  If you prefer a different shell, the
8066 @command{configure} script can be given the option
8067 @option{--with-syscmd-shell=@var{location}} to set the location of an
8068 alternative shell at @acronym{GNU} @code{m4} installation; the
8069 alternative shell must still support @option{-c}.
8070
8071 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8072 m4}) is in effect, @code{syscmd} results in an error, since otherwise an
8073 input file could execute arbitrary code.
8074
8075 The macro @code{syscmd} is recognized only with parameters.
8076 @end deffn
8077
8078 @example
8079 define(`foo', `FOO')
8080 @result{}
8081 syscmd(`echo foo')
8082 @result{}foo
8083 @result{}
8084 @end example
8085
Note how the output of the command keeps its trailing newline and is not
rescanned by @code{m4} (hence @samp{foo} rather than @samp{FOO}); it is
followed by the newline that appeared after the macro.
8088
8089 The following is an example of @var{shell-command} using the same
8090 standard input as @code{m4}:
8091
8092 @comment The testsuite does not know how to parse pipes from the
8093 @comment texinfo.  Fortunately, there are other tests in the testsuite
8094 @comment that test this same feature.
8095 @comment ignore
8096 @example
8097 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
8098 @result{}
8099 @end example
8100
8101 It tells @code{m4} to read all of its input before executing the wrapped
8102 text, then hands a valid (albeit emptied) pipe as standard input for the
8103 @code{cat} subcommand.  Therefore, you should be careful when using
8104 standard input (either by specifying no files, or by passing @samp{-} as
8105 a file name on the command line, @pxref{Command line files, , Invoking
8106 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
8107 that consume data from standard input.  When standard input is a
8108 seekable file, the subprocess will pick up with the next character not
8109 yet processed by @code{m4}; when it is a pipe or other non-seekable
8110 file, there is no guarantee how much data will already be buffered by
8111 @code{m4} and thus unavailable to the child.
8112
8113 Following is an example of how potentially unsafe actions can be
8114 suppressed.
8115
8116 @comment options: --safer
8117 @comment status: 1
8118 @example
8119 $ @kbd{m4 --safer}
8120 syscmd(`echo hi')
8121 @error{}m4:stdin:1: syscmd: disabled by --safer
8122 @result{}
8123 @end example
8124
8125 @node Esyscmd
8126 @section Reading the output of commands
8127
8128 @cindex @acronym{GNU} extensions
8129 If you want @code{m4} to read the output of a shell command, use
8130 @code{esyscmd}:
8131
8132 @deffn {Builtin (gnu)} esyscmd (@var{shell-command})
8133 Expands to the standard output of the shell command
8134 @var{shell-command}.
8135
8136 Prior to executing the command, @code{m4} flushes its buffers.
8137 The default standard input and standard error of @var{shell-command} are
8138 the same as those of @code{m4}.  The error output of @var{shell-command}
8139 is not a part of the expansion: it will appear along with the error
8140 output of @code{m4}.
8141
8142 By default, the @var{shell-command} will be used as the argument to the
8143 @option{-c} option of the @command{/bin/sh} shell (or the version of
8144 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8145 supports that).  If you prefer a different shell, the
8146 @command{configure} script can be given the option
8147 @option{--with-syscmd-shell=@var{location}} to set the location of an
8148 alternative shell at @acronym{GNU} @code{m4} installation; the
8149 alternative shell must still support @option{-c}.
8150
8151 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8152 m4}) is in effect, @code{esyscmd} results in an error, since otherwise
8153 an input file could execute arbitrary code.
8154
8155 The macro @code{esyscmd} is recognized only with parameters.
8156 @end deffn
8157
8158 @example
8159 define(`foo', `FOO')
8160 @result{}
8161 esyscmd(`echo foo')
8162 @result{}FOO
8163 @result{}
8164 @end example
8165
8166 Note how the expansion of @code{esyscmd} keeps the trailing newline of
8167 the command, as well as using the newline that appeared after the macro.
8168
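When the trailing newline is unwanted, it can be removed before the
output is used.  The following sketch assumes @command{/bin/sh}
semantics and uses a hypothetical macro @code{shell_line}; note that
@code{translit} deletes every newline in the output, not only the last
one, and that the command output is rescanned by @code{m4}, so it should
not contain macro names or unquoted commas:

@example
define(`shell_line', `translit(esyscmd(`$1'), `
')')
@result{}
[shell_line(`echo hi')]
@result{}[hi]
@end example
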
8169 Just as with @code{syscmd}, care must be exercised when sharing standard
8170 input between @code{m4} and the child process of @code{esyscmd}.
8171 Likewise, potentially unsafe actions can be suppressed.
8172
8173 @comment options: --safer
8174 @comment status: 1
8175 @example
8176 $ @kbd{m4 --safer}
8177 esyscmd(`echo hi')
8178 @error{}m4:stdin:1: esyscmd: disabled by --safer
8179 @result{}
8180 @end example
8181
8182 @node Sysval
8183 @section Exit status
8184
8185 @cindex UNIX commands, exit status from
8186 @cindex exit status from shell commands
8187 @cindex shell commands, exit status from
8188 @cindex commands, exit status from shell
8189 @cindex status of shell commands
8190 To see whether a shell command succeeded, use @code{sysval}:
8191
8192 @deffn {Builtin (m4)} sysval
8193 Expands to the exit status of the last shell command run with
8194 @code{syscmd} or @code{esyscmd}.  Expands to 0 if no command has been
8195 run yet.
8196 @end deffn
8197
8198 @example
8199 sysval
8200 @result{}0
8201 syscmd(`false')
8202 @result{}
8203 ifelse(sysval, `0', `zero', `non-zero')
8204 @result{}non-zero
8205 syscmd(`exit 2')
8206 @result{}
8207 sysval
8208 @result{}2
8209 syscmd(`true')
8210 @result{}
8211 sysval
8212 @result{}0
8213 esyscmd(`false')
8214 @result{}
8215 ifelse(sysval, `0', `zero', `non-zero')
8216 @result{}non-zero
8217 esyscmd(`echo dnl && exit 127')
8218 @result{}
8219 sysval
8220 @result{}127
8221 esyscmd(`true')
8222 @result{}
8223 sysval
8224 @result{}0
8225 @end example
8226
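Since @code{sysval} is an ordinary macro, it can be combined with
@code{ifelse} to branch on the outcome of a command.  The following
sketch assumes @command{/bin/sh} semantics; the macro name
@code{run_ok} is hypothetical:

@example
define(`run_ok', `syscmd(`$1')ifelse(sysval, `0', `ok', `failed')')
@result{}
run_ok(`true')
@result{}ok
run_ok(`false')
@result{}failed
run_ok(`exit 3')
@result{}failed
@end example
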
8227 @code{sysval} results in 127 if there was a problem executing the
8228 command, for example, if the system-imposed argument length is exceeded,
8229 or if there were not enough resources to fork.  It is not possible to
8230 distinguish between failed execution and successful execution that had
8231 an exit status of 127, unless there was output from the child process.
8232
8233 On UNIX platforms, where it is possible to detect when command execution
8234 is terminated by a signal, rather than a normal exit, the result is the
8235 signal number shifted left by eight bits.
8236
8237 @comment This test has difficulties being portable, even on platforms
8238 @comment where syscmd invokes /bin/sh.  Kill is not portable with signal
8239 @comment names.  According to autoconf, the only portable signal numbers
8240 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM).  But
8241 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
8242 @comment exits normally rather than letting the signal terminate it).
8243 @comment Also, TERM is flaky, as it can also kill the running m4 on
8244 @comment systems where /bin/sh does not create its own process group.
8245 @comment And PIPE is unreliable, since people tend to run with it
8246 @comment ignored, with m4 inheriting that choice.  That leaves KILL as
8247 @comment the only signal we can reliably test.
8248 @example
8249 dnl This test assumes kill is a shell builtin, and that signals are
8250 dnl recognizable.
8251 ifdef(`__unix__', ,
8252       `errprint(` skipping: syscmd does not have unix semantics
8253 ')m4exit(`77')')dnl
8254 syscmd(`kill -9 $$')
8255 @result{}
8256 sysval
8257 @result{}2304
8258 syscmd()
8259 @result{}
8260 sysval
8261 @result{}0
8262 esyscmd(`kill -9 $$')
8263 @result{}
8264 sysval
8265 @result{}2304
8266 @end example
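
Because a terminating signal is reported shifted left by eight bits, it
can be recovered with @code{eval}.  The following is only a sketch: the
helper names @code{exit_status} and @code{term_signal} are hypothetical,
not builtins, and the arguments are the literal values 2304 and 127
discussed above rather than the result of a live command.

@example
define(`exit_status', `eval(`($1) & 255')')
@result{}
define(`term_signal', `eval(`($1) >> 8')')
@result{}
term_signal(`2304')
@result{}9
exit_status(`2304')
@result{}0
exit_status(`127')
@result{}127
term_signal(`127')
@result{}0
@end example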
8267
8268 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8269 m4}) is in effect, @code{sysval} will always remain at its default value
8270 of zero.
8271
8272 @comment options: --safer
8273 @comment status: 1
8274 @example
8275 $ @kbd{m4 --safer}
8276 sysval
8277 @result{}0
8278 syscmd(`false')
8279 @error{}m4:stdin:2: syscmd: disabled by --safer
8280 @result{}
8281 sysval
8282 @result{}0
8283 @end example
8284
8285 @node Mkstemp
8286 @section Making temporary files
8287
8288 @cindex temporary file names
8289 @cindex files, names of temporary
8290 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8291 temporary file, for output or for some other purpose.  There is a
8292 builtin macro, @code{mkstemp}, for making a temporary file:
8293
8294 @deffn {Builtin (m4)} mkstemp (@var{template})
8295 @deffnx {Builtin (m4)} maketemp (@var{template})
8296 Expands to the quoted name of a new, empty file, made from the string
8297 @var{template}, which should end with the string @samp{XXXXXX}.  The six
8298 @samp{X} characters are then replaced with random characters matching
8299 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
8300 name unique.  If fewer than six @samp{X} characters are found at the end
of @var{template}, the result will be longer than the template.  The
created file will have access permissions as if by @kbd{chmod =rw,go=},
meaning that the current umask of the @code{m4} process is taken into
account, and at most the current user can read and write the file.
8305
The traditional behavior, standardized by @acronym{POSIX}, is that
@code{maketemp} merely replaces the trailing @samp{X} characters with
the process id, without creating a file or quoting the expansion, and
without ensuring that the resulting string is a unique file name.  In
part, this means that using the same
8311 @var{template} twice in the same input file will result in the same
8312 expansion.  This behavior is a security hole, as it is very easy for
8313 another process to guess the name that will be generated, and thus
8314 interfere with a subsequent use of @code{syscmd} trying to manipulate
8315 that file name.  Hence, @acronym{POSIX} has recommended that all new
8316 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
8317 and that users of @code{m4} check for its existence.
8318
The expansion is void and an error is issued if a temporary file could
not be created.
8321
When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
8323 is in effect, @code{mkstemp} and @acronym{GNU}-mode @code{maketemp}
8324 result in an error, since otherwise an input file could perform a mild
8325 denial-of-service attack by filling up a disk with multiple empty files.
8326
8327 The macros @code{mkstemp} and @code{maketemp} are recognized only with
8328 parameters.
8329 @end deffn
8330
8331 If you try this next example, you will most likely get different output
8332 for the two file names, since the replacement characters are randomly
8333 chosen:
8334
8335 @comment ignore
8336 @example
8337 $ @kbd{m4}
8338 define(`tmp', `oops')
8339 @result{}
8340 maketemp(`/tmp/fooXXXXXX')
@error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead
8342 @result{}/tmp/fooa07346
8343 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
8344       `define(`mkstemp', defn(`maketemp'))dnl
8345 errprint(`warning: potentially insecure maketemp implementation
8346 ')')
8347 @result{}
8348 mkstemp(`doc')
8349 @result{}docQv83Uw
8350 @end example
8351
8352 @comment options: --safer
8353 @comment status: 1
8354 @example
8355 $ @kbd{m4 --safer}
8356 maketemp(`/tmp/fooXXXXXX')
8357 @error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead
8358 @error{}m4:stdin:1: maketemp: disabled by --safer
8359 @result{}
8360 mkstemp(`/tmp/fooXXXXXX')
8361 @error{}m4:stdin:2: mkstemp: disabled by --safer
8362 @result{}
8363 @end example
8364
8365 @cindex @acronym{GNU} extensions
8366 Unless you use the @option{--traditional} command line option (or
8367 @option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
version of @code{maketemp} is secure.  This means that passing the same
template to multiple calls will generate multiple files.  However, we
8370 recommend that you use the new @code{mkstemp} macro, introduced in
8371 @acronym{GNU} M4 1.4.8, which is secure even in traditional mode.  Also,
8372 as of M4 1.4.11, the secure implementation quotes the resulting file
8373 name, so that you are guaranteed to know what file was created even if
8374 the random file name happens to match an existing macro.  Notice that
8375 this example is careful to use @code{defn} to avoid unintended expansion
8376 of @samp{foo}.
8377
8378 @example
8379 $ @kbd{m4}
8380 define(`foo', `errprint(`oops')')
8381 @result{}
8382 syscmd(`rm -f foo-??????')sysval
8383 @result{}0
8384 define(`file1', maketemp(`foo-XXXXXX'))dnl
8385 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8386 ifelse(esyscmd(`echo \` foo-?????? \''), `foo-??????',
8387        `no file', `created')
8388 @result{}created
8389 define(`file2', maketemp(`foo-XX'))dnl
8390 @error{}m4:stdin:6: warning: maketemp: recommend using mkstemp instead
8391 define(`file3', mkstemp(`foo-XXXXXX'))dnl
8392 ifelse(len(defn(`file1')), len(defn(`file2')),
8393        `same length', `different')
8394 @result{}same length
8395 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
8396 @result{}different file
8397 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
8398 @result{}different file
8399 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
8400 @result{}different file
8401 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
8402 @result{}
8403 sysval
8404 @result{}0
8405 @end example
8406
8407 @comment options: -G
8408 @example
8409 $ @kbd{m4 -G}
8410 syscmd(`rm -f foo-*')sysval
8411 @result{}0
8412 define(`file1', maketemp(`foo-XXXXXX'))dnl
8413 @error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead
8414 define(`file2', maketemp(`foo-XXXXXX'))dnl
8415 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8416 ifelse(file1, file2, `same', `different file')
8417 @result{}same
8418 len(maketemp(`foo-XXXXX'))
8419 @error{}m4:stdin:5: warning: maketemp: recommend using mkstemp instead
8420 @result{}9
8421 define(`abc', `def')
8422 @result{}
8423 maketemp(`foo-abc')
8424 @result{}foo-def
8425 @error{}m4:stdin:7: warning: maketemp: recommend using mkstemp instead
8426 syscmd(`test -f foo-*')sysval
8427 @result{}1
8428 @end example
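
A typical use pairs @code{mkstemp} with @code{syscmd}: create the file,
let a command fill it, read it back, and remove it.  The sketch below
assumes a UNIX-like shell and a writable @file{/tmp}; the macro name
@code{tmpfile} and the template are chosen only for illustration.

@comment ignore
@example
$ @kbd{m4}
define(`tmpfile', mkstemp(`/tmp/m4-XXXXXX'))dnl
ifelse(defn(`tmpfile'), `', `m4exit(`1')')dnl
syscmd(`echo hello > 'defn(`tmpfile'))dnl
include(defn(`tmpfile'))
@result{}hello
@result{}
syscmd(`rm -f 'defn(`tmpfile'))dnl
@end example

@noindent
Using @code{defn} rather than a bare @samp{tmpfile} keeps the file name
from being rescanned for macros, and the @code{ifelse} test stops the
run if no file could be created.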
8429
8430 @node Mkdtemp
8431 @section Making temporary directories
8432
8433 @cindex temporary directory
8434 @cindex directories, temporary
8435 @cindex @acronym{GNU} extensions
8436 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8437 temporary directory, for holding multiple temporary files; such a
8438 directory can be created with @code{mkdtemp}:
8439
8440 @deffn {Builtin (gnu)} mkdtemp (@var{template})
8441 Expands to the quoted name of a new, empty directory, made from the string
8442 @var{template}, which should end with the string @samp{XXXXXX}.  The six
8443 @samp{X} characters are then replaced with random characters matching
8444 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name
8445 unique.  If fewer than six @samp{X} characters are found at the end of
@var{template}, the result will be longer than the template.  The
created directory will have access permissions as if by @kbd{chmod
=rwx,go=}, meaning that the current umask of the @code{m4} process is
taken into account, and at most the current user can read, write,
8450 and search the directory.
8451
The expansion is void and an error is issued if a temporary directory
could not be created.
8454
When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
8456 is in effect, @code{mkdtemp} results in an error, since otherwise an
8457 input file could perform a mild denial-of-service attack by filling up a
8458 disk with multiple directories.
8459
8460 The macro @code{mkdtemp} is recognized only with parameters.
8461 This macro was added in M4 2.0.
8462 @end deffn
8463
8464 If you try this next example, you will most likely get different output
8465 for the directory names, since the replacement characters are randomly
8466 chosen:
8467
8468 @comment ignore
8469 @example
8470 $ @kbd{m4}
8471 define(`tmp', `oops')
8472 @result{}
8473 mkdtemp(`/tmp/fooXXXXXX')
8474 @result{}/tmp/foo2h89Vo
mkdtemp(`dir')
8476 @result{}dirrg079A
8477 @end example
8478
8479 @comment options: --safer
8480 @comment status: 1
8481 @example
8482 $ @kbd{m4 --safer}
8483 mkdtemp(`/tmp/fooXXXXXX')
8484 @error{}m4:stdin:1: mkdtemp: disabled by --safer
8485 @result{}
8486 @end example
8487
8488 Multiple calls with the same template will generate multiple
8489 directories.
8490
8491 @example
8492 $ @kbd{m4}
8493 syscmd(`echo foo??????')dnl
8494 @result{}foo??????
8495 define(`dir1', mkdtemp(`fooXXXXXX'))dnl
8496 ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created')
8497 @result{}created
8498 define(`dir2', mkdtemp(`fooXXXXXX'))dnl
8499 ifelse(dir1, dir2, `same', `different directories')
8500 @result{}different directories
8501 syscmd(`rmdir 'dir1 dir2)
8502 @result{}
8503 sysval
8504 @result{}0
8505 @end example
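
A temporary directory is convenient when a command produces several
files.  The following sketch assumes a UNIX-like shell and a writable
@file{/tmp}; the macro name @code{workdir} and the file names @file{a}
and @file{b} are hypothetical.

@comment ignore
@example
$ @kbd{m4}
define(`workdir', mkdtemp(`/tmp/m4dir-XXXXXX'))dnl
ifelse(defn(`workdir'), `', `m4exit(`1')')dnl
syscmd(`cd 'defn(`workdir')` && echo one > a && echo two > b')dnl
include(defn(`workdir')`/a')include(defn(`workdir')`/b')dnl
@result{}one
@result{}two
syscmd(`rm -r 'defn(`workdir'))dnl
@end example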
8506
8507 @node Miscellaneous
8508 @chapter Miscellaneous builtin macros
8509
This chapter describes various builtins that do not really belong in
8511 any of the previous chapters.
8512
8513 @menu
8514 * Errprint::                    Printing error messages
8515 * Location::                    Printing current location
8516 * M4exit::                      Exiting from @code{m4}
8517 * Syncoutput::                  Turning on and off sync lines
8518 @end menu
8519
8520 @node Errprint
8521 @section Printing error messages
8522
8523 @cindex printing error messages
8524 @cindex error messages, printing
8525 @cindex messages, printing error
8526 @cindex standard error, output to
8527 You can print error messages using @code{errprint}:
8528
8529 @deffn {Builtin (m4)} errprint (@var{message}, @dots{})
8530 Prints @var{message} and the rest of the arguments to standard error,
8531 separated by spaces.  Standard error is used, regardless of the
8532 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
8533
8534 The expansion of @code{errprint} is void.
8535 The macro @code{errprint} is recognized only with parameters.
8536 @end deffn
8537
8538 @example
8539 errprint(`Invalid arguments to forloop
8540 ')
8541 @error{}Invalid arguments to forloop
8542 @result{}
8543 errprint(`1')errprint(`2',`3
8544 ')
8545 @error{}12 3
8546 @result{}
8547 @end example
8548
8549 A trailing newline is @emph{not} printed automatically, so it should be
8550 supplied as part of the argument, as in the example.  Unfortunately, the
8551 exact output of @code{errprint} is not very portable to other @code{m4}
8552 implementations: @acronym{POSIX} requires that all arguments be printed,
8553 but some implementations of @code{m4} only print the first.
8554 Furthermore, some @acronym{BSD} implementations always append a newline
8555 for each @code{errprint} call, regardless of whether the last argument
8556 already had one, and @acronym{POSIX} is silent on whether this is
8557 acceptable.
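
One way to sidestep the first of these differences is to hand
@code{errprint} a single argument.  This is only a sketch: the macro
name @code{say} is hypothetical, and it joins its arguments with commas
(via @samp{$*}) rather than with the spaces that @code{errprint} itself
uses.  It does nothing about implementations that append an extra
newline.

@example
define(`say', `errprint(`$*
')')
@result{}
say(`one', `two')
@error{}one,two
@result{}
@end example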
8558
8559 @node Location
8560 @section Printing current location
8561
8562 @cindex location, input
8563 @cindex input location
8564 To make it possible to specify the location of an error, three
8565 utility builtins exist:
8566
8567 @deffn {Builtin (gnu)} __file__
8568 @deffnx {Builtin (gnu)} __line__
8569 @deffnx {Builtin (gnu)} __program__
8570 Expand to the quoted name of the current input file, the
8571 current input line number in that file, and the quoted name of the
8572 current invocation of @code{m4}.
8573 @end deffn
8574
8575 @example
8576 errprint(__program__:__file__:__line__: `input error
8577 ')
8578 @error{}m4:stdin:1: input error
8579 @result{}
8580 @end example
8581
8582 Line numbers start at 1 for each file.  If the file was found due to the
8583 @option{-I} option or @env{M4PATH} environment variable, that is
8584 reflected in the file name.  Synclines, via @code{syncoutput}
8585 (@pxref{Syncoutput}) or the command line option @option{--synclines}
8586 (or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the
8587 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}),
8588 also use this notion of current file and line.  Redefining the three
8589 location macros has no effect on syncline, debug, warning, or error
8590 message output.
8591
8592 This example reuses the file @file{incl.m4} mentioned earlier
8593 (@pxref{Include}):
8594
8595 @comment examples
8596 @example
8597 $ @kbd{m4 -I examples}
8598 define(`foo', ``$0' called at __file__:__line__')
8599 @result{}
8600 foo
8601 @result{}foo called at stdin:2
8602 include(`incl.m4')
8603 @result{}Include file start
8604 @result{}foo called at examples/incl.m4:2
8605 @result{}Include file end
8606 @result{}
8607 @end example
8608
8609 The location of macros invoked during the rescanning of macro expansion
8610 text corresponds to the location in the file where the expansion was
8611 triggered, regardless of how many newline characters the expansion text
8612 contains.  As of @acronym{GNU} M4 1.4.8, the location of text wrapped
8613 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
8614 @code{m4wrap} was invoked.  Previous versions, however, behaved as
8615 though wrapped text came from line 0 of the file ``''.
8616
8617 @example
8618 define(`echo', `$@@')
8619 @result{}
8620 define(`foo', `echo(__line__
8621 __line__)')
8622 @result{}
8623 echo(__line__
8624 __line__)
8625 @result{}4
8626 @result{}5
8627 m4wrap(`foo
8628 ')
8629 @result{}
8630 foo(errprint(__line__
8631 __line__
8632 ))
8633 @error{}8
8634 @error{}9
8635 @result{}8
8636 @result{}8
8637 __line__
8638 @result{}11
8639 m4wrap(`__line__
8640 ')
8641 @result{}
8642 ^D
8643 @result{}6
8644 @result{}6
8645 @result{}12
8646 @end example
8647
8648 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
8649 terminology.  If you invoke @code{m4} through an absolute path or a link
8650 with a different spelling, rather than by relying on a @env{PATH} search
8651 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
8652 The intent is that you can use it to produce error messages with the
8653 same formatting that @code{m4} produces internally.  It can also be used
8654 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
8655 @code{m4} that is currently running, rather than whatever version of
8656 @code{m4} happens to be first in @env{PATH}.  It was first introduced in
8657 @acronym{GNU} M4 1.4.6.
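
As a sketch of running the same copy of @code{m4} from within a command,
the following feeds a one-line script to a child process named by
@code{@w{__program__}}.  It assumes a UNIX-like shell and a program name
free of shell metacharacters.

@comment ignore
@example
$ @kbd{m4}
esyscmd(`echo "eval(6 * 7)" | '__program__)
@result{}42
@result{}
@end example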
8658
8659 @node M4exit
8660 @section Exiting from @code{m4}
8661
8662 @cindex exiting from @code{m4}
8663 @cindex status, setting @code{m4} exit
8664 If you need to exit from @code{m4} before the entire input has been
8665 read, you can use @code{m4exit}:
8666
8667 @deffn {Builtin (m4)} m4exit (@ovar{code})
8668 Causes @code{m4} to exit, with exit status @var{code}.  If @var{code} is
8669 left out, the exit status is zero.  If @var{code} cannot be parsed, or
8670 is outside the range of 0 to 255, the exit status is one.  No further
8671 input is read, and all wrapped and diverted text is discarded.
8672 @end deffn
8673
8674 @example
8675 m4wrap(`This text is lost due to `m4exit'.')
8676 @result{}
8677 divert(`1') So is this.
8678 divert
8679 @result{}
8680 m4exit And this is never read.
8681 @end example
8682
8683 A common use of this is to abort processing:
8684
8685 @deffn Composite fatal_error (@var{message})
8686 Abort processing with an error message and non-zero status.  Prefix
8687 @var{message} with details about where the error occurred, and print the
8688 resulting string to standard error.
8689 @end deffn
8690
8691 @comment status: 1
8692 @example
8693 define(`fatal_error',
8694        `errprint(__program__:__file__:__line__`: fatal error: $*
8695 ')m4exit(`1')')
8696 @result{}
8697 fatal_error(`this is a BAD one, buster')
8698 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
8699 @end example
8700
8701 After this macro call, @code{m4} will exit with exit status 1.  This macro
8702 is only intended for error exits, since the normal exit procedures are
8703 not followed, i.e., diverted text is not undiverted, and saved text
8704 (@pxref{M4wrap}) is not reread.  (This macro could be made more robust
8705 to earlier versions of @code{m4}.  You should try to see if you can find
8706 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
8707
Note that it is still possible for the exit status to be different from
8709 what was requested by @code{m4exit}.  If @code{m4} detects some other
8710 error, such as a write error on standard output, the exit status will be
8711 non-zero even if @code{m4exit} requested zero.
8712
8713 If standard input is seekable, then the file will be positioned at the
8714 next unread character.  If it is a pipe or other non-seekable file,
8715 then there are no guarantees how much data @code{m4} might have read
8716 into buffers, and thus discarded.
8717
8718 @node Syncoutput
8719 @section Turning on and off sync lines
8720
8721 @cindex toggling synchronization lines
8722 @cindex synchronization lines
8723 @cindex location, input
8724 @cindex input location
8725 It is possible to adjust whether synclines are printed to output:
8726
8727 @deffn {Builtin (gnu)} syncoutput (@var{truth})
8728 If @var{truth} matches the extended regular expression
8729 @samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the
8730 form: @samp{#line <number> ["<file>"]}.
8731
8732 If @var{truth} is empty, or matches the extended regular expression
8733 @samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off.
8734
8735 All other arguments are ignored and issue a warning.
8736
8737 The macro @code{syncoutput} is recognized only with parameters.
8738 This macro was added in M4 2.0.
8739 @end deffn
8740
8741 @example
8742 define(`twoline', `1
8743 2')
8744 @result{}
8745 changecom(`/*', `*/')
8746 @result{}
8747 define(`comment', `/*1
8748 2*/')
8749 @result{}
8750 twoline
8751 @result{}1
8752 @result{}2
8753 dnl no line
8754 syncoutput(`on')
8755 @result{}#line 8 "stdin"
8756 @result{}
8757 twoline
8758 @result{}1
8759 @result{}#line 9
8760 @result{}2
8761 dnl no line
8762 hello
8763 @result{}#line 11
8764 @result{}hello
8765 comment
8766 @result{}/*1
8767 @result{}2*/
8768 one comment `two
8769 three'
8770 @result{}#line 13
8771 @result{}one /*1
8772 @result{}2*/ two
8773 @result{}three
8774 goodbye
8775 @result{}#line 15
8776 @result{}goodbye
8777 syncoutput(`off')
8778 @result{}
8779 twoline
8780 @result{}1
8781 @result{}2
8782 syncoutput(`blah')
8783 @error{}m4:stdin:18: warning: syncoutput: unknown directive 'blah'
8784 @result{}
8785 @end example
8786
8787 Notice that a syncline is output any time a single source line expands
8788 to multiple output lines, or any time multiple source lines expand to a
8789 single output line.  When there is a one-for-one correspondence, no
8790 additional synclines are needed.
8791
8792 Synchronization lines can be used to track where input comes from; an
8793 optional file designation is printed when the syncline algorithm
8794 detects that consecutive output lines come from different files.  You
8795 can also use the @option{--synclines} command-line option (or
8796 @option{-s}, @pxref{Preprocessor features, , Invoking m4}) to start
8797 with synchronization on.  This example reuses the file @file{incl.m4}
8798 mentioned earlier (@pxref{Include}):
8799
8800 @comment examples
8801 @comment options: -s
8802 @example
8803 $ @kbd{m4 --synclines -I examples}
8804 include(`incl.m4')
8805 @result{}#line 1 "examples/incl.m4"
8806 @result{}Include file start
8807 @result{}foo
8808 @result{}Include file end
8809 @result{}#line 1 "stdin"
8810 @result{}
8811 @end example
8812
8813 @node Frozen files
8814 @chapter Fast loading of frozen state
8815
8816 Some bigger @code{m4} applications may be built over a common base
8817 containing hundreds of definitions and other costly initializations.
8818 Usually, the common base is kept in one or more declarative files,
which are listed on each @code{m4} invocation prior to the
8820 user's input file, or else each input file uses @code{include}.
8821
8822 Reading the common base of a big application, over and over again, may
8823 be time consuming.  @acronym{GNU} @code{m4} offers some machinery to
8824 speed up the start of an application using lengthy common bases.
8825
8826 @menu
8827 * Using frozen files::          Using frozen files
8828 * Frozen file format 1::        Frozen file format 1
8829 * Frozen file format 2::        Frozen file format 2
8830 @end menu
8831
8832 @node Using frozen files
8833 @section Using frozen files
8834
8835 @cindex fast loading of frozen files
8836 @cindex frozen files for fast loading
8837 @cindex initialization, frozen state
8838 @cindex dumping into frozen file
8839 @cindex reloading a frozen file
8840 @cindex @acronym{GNU} extensions
8841 Suppose a user has a library of @code{m4} initializations in
8842 @file{base.m4}, which is then used with multiple input files:
8843
8844 @comment ignore
8845 @example
8846 $ @kbd{m4 base.m4 input1.m4}
8847 $ @kbd{m4 base.m4 input2.m4}
8848 $ @kbd{m4 base.m4 input3.m4}
8849 @end example
8850
8851 Rather than spending time parsing the fixed contents of @file{base.m4}
every time, the user might instead execute:
8853
8854 @comment ignore
8855 @example
8856 $ @kbd{m4 -F base.m4f base.m4}
8857 @end example
8858
8859 @noindent
8860 once, and further execute, as often as needed:
8861
8862 @comment ignore
8863 @example
8864 $ @kbd{m4 -R base.m4f input1.m4}
8865 $ @kbd{m4 -R base.m4f input2.m4}
8866 $ @kbd{m4 -R base.m4f input3.m4}
8867 @end example
8868
8869 @noindent
8870 with the varying input.  The first call, containing the @option{-F}
8871 option, only reads and executes file @file{base.m4}, defining
8872 various application macros and computing other initializations.
8873 Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
8874 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
8875 file which contains a kind of snapshot of the @code{m4} internal state.
8876
8877 Later calls, containing the @option{-R} option, are able to reload
8878 the internal state of @code{m4}, from @file{base.m4f},
8879 @emph{prior} to reading any other input files.  This means
8880 instead of starting with a virgin copy of @code{m4}, input will be
8881 read after having effectively recovered the effect of a prior run.
In our example, the effect is the same as if file @file{base.m4} had
8883 been read anew.  However, this effect is achieved a lot faster.
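
As a concrete sketch (the contents of @file{base.m4} and the macro name
@code{greet} are hypothetical), a library consisting of a single
definition can be frozen once and then reloaded:

@comment ignore
@example
$ @kbd{cat base.m4}
@result{}define(`greet', `Hello, $1!')dnl
$ @kbd{m4 -F base.m4f base.m4}
$ @kbd{echo 'greet(world)' | m4 -R base.m4f}
@result{}Hello, world!
@end example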
8884
8885 Only one frozen file may be created or read in any one @code{m4}
8886 invocation.  It is not possible to recover two frozen files at once.
8887 However, frozen files may be updated incrementally, through using
8888 @option{-R} and @option{-F} options simultaneously.  For example, if
8889 some care is taken, the command:
8890
8891 @comment ignore
8892 @example
8893 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
8894 @end example
8895
8896 @noindent
8897 could be broken down in the following sequence, accumulating the same
8898 output:
8899
8900 @comment ignore
8901 @example
8902 $ @kbd{m4 -F file1.m4f file1.m4}
8903 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
8904 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
8905 $ @kbd{m4 -R file3.m4f file4.m4}
8906 @end example
8907
8908 Some care is necessary because the frozen file does not save all state
8909 information.  Stacks of macro definitions via @code{pushdef} are
8910 accurately stored, along with all renamed or undefined builtins, as are
8911 the current syntax rules such as from @code{changequote}.  However, the
8912 value of @code{sysval} and text saved in @code{m4wrap} are not currently
8913 preserved.  Also, changing command line options between runs may cause
8914 unexpected behavior.  A future release of @acronym{GNU} M4 may improve
8915 on the quality of frozen files.
8916
8917 When an @code{m4} run is to be frozen, the automatic undiversion
8918 which takes place at end of execution is inhibited.  Instead, all
8919 positively numbered diversions are saved into the frozen file.
8920 The active diversion number is also transmitted.
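
The following sketch illustrates this; the file name @file{div.m4} and
its contents are hypothetical.  Text diverted while freezing is not
output at that time, but appears when the reloading run finishes and
performs its own automatic undiversion.

@comment ignore
@example
$ @kbd{cat div.m4}
@result{}divert(`1')saved for later
@result{}divert(`0')dnl
$ @kbd{m4 -F div.m4f div.m4}
$ @kbd{echo now | m4 -R div.m4f}
@result{}now
@result{}saved for later
@end example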
8921
8922 A frozen file to be reloaded need not reside in the current directory.
8923 It is looked up the same way as an @code{include} file (@pxref{Search
8924 Path}).
8925
8926 If the frozen file was generated with a newer version of @code{m4}, and
8927 contains directives that an older @code{m4} cannot parse, attempting to
8928 load the frozen file with option @option{-R} will cause @code{m4} to
8929 exit with status 63 to indicate version mismatch.
8930
8931 @node Frozen file format 1
8932 @section Frozen file format 1
8933
8934 @cindex frozen file format 1
8935 @cindex file format, frozen file version 1
8936 Frozen files are sharable across architectures.  It is safe to write
8937 a frozen file on one machine and read it on another, given that the
8938 second machine uses the same or newer version of @acronym{GNU} @code{m4}.
8939 It is conventional, but not required, to give a frozen file the suffix
@file{.m4f}.
8941
8942 Older versions of @acronym{GNU} @code{m4} create frozen files with
8943 syntax version 1.  These files can be read by the current version, but
8944 are no longer produced.  Version 1 files are mostly text files, although
8945 any macros or diversions that contained nonprintable characters or long
8946 lines cause the resulting frozen file to do likewise, since there are no
8947 escape sequences.  The file can be edited to change the state that
8948 @code{m4} will start with.  It is composed of several directives, each
8949 starting with a single letter and ending with a newline (@key{NL}).
8950 Wherever a directive is expected, the character @samp{#} can be used
8951 instead to introduce a comment line; empty lines are also ignored if
8952 they are not part of an embedded string.
8953
8954 In the following descriptions, each @var{len} refers to the length of a
8955 corresponding subsequent @var{str}.  Numbers are always expressed in
8956 decimal, and an omitted number defaults to 0.  The valid directives in
8957 version 1 are:
8958
8959 @table @code
8960 @item V @var{number} @key{NL}
8961 Confirms the format of the file.  Version 1 is recognized when
8962 @var{number} is 1.  This directive must be the first non-comment in the
8963 file, and may not appear more than once.
8964
8965 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8966 Uses @var{str1} and @var{str2} as the begin-comment and
8967 end-comment strings.  If omitted, then @samp{#} and @key{NL} are the
8968 comment delimiters.
8969
8970 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
Selects diversion @var{number}, making it current, then copies @var{str}
into the current diversion.  @var{number} may be a negative number for a
8973 diversion that discards text.  To merely specify an active selection,
8974 use this command with an empty @var{str}.  With 0 as the diversion
8975 @var{number}, @var{str} will be issued on standard output at reload
8976 time.  @acronym{GNU} @code{m4} will not produce the @samp{D} directive
8977 with non-zero length for diversion 0, but this can be done with manual
8978 edits.  This directive may appear more than once for the same diversion,
8979 in which case the diversion is the concatenation of the various uses.
8980 If omitted, then diversion 0 is current.
8981
8982 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8983 Defines, through @code{pushdef}, a definition for @var{str1} expanding
8984 to the function whose builtin name is @var{str2}.  If the builtin does
8985 not exist (for example, if the frozen file was produced by a copy of
8986 @code{m4} compiled with the now-abandoned @code{changeword} support),
8987 the reload is silent, but any subsequent use of the definition of
8988 @var{str1} will result in a warning.  This directive may appear more
8989 than once for the same name, and its order, along with @samp{T}, is
8990 important.  If omitted, you will have no access to any builtins.
8991
8992 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8993 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
8994 strings.  If omitted, then @samp{`} and @samp{'} are the quote
8995 delimiters.
8996
8997 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
Defines, through @code{pushdef}, a definition for @var{str1}
8999 expanding to the text given by @var{str2}.  This directive may appear
9000 more than once for the same name, and its order, along with @samp{F}, is
9001 important.
9002 @end table
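
To make the layout concrete, here is a small hand-written file in this
format.  It is an illustration assembled from the directive descriptions
above, not the output of an actual @code{m4} run, and the macro
@code{foo} is hypothetical.  It declares version 1, installs the default
quote and comment delimiters, gives the name @code{define} its builtin
meaning, defines @code{foo} as the text @samp{bar}, and selects
diversion 0.  Each pair of strings is run together on one line, so only
the preceding lengths show where @var{str1} ends and @var{str2} begins.

@comment ignore
@example
# Hand-written illustration of frozen file format 1
V1
Q1,1
`'
C1,1
#

F6,6
definedefine
T3,3
foobar
D0,0

# end of illustration
@end example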
9003
9004 When loading format 1, the syntax categories @samp{@{} and @samp{@}} are
9005 disabled (reverting braces to be treated like plain characters).  This
9006 is because frozen files created with M4 1.4.x did not understand
9007 @samp{$@{@dots{}@}} extended argument notation, and a frozen macro that
9008 contained this character sequence should not behave differently just
9009 because a newer version of M4 reloaded the file.
9010
9011 @node Frozen file format 2
9012 @section Frozen file format 2
9013
9014 @cindex frozen file format 2
9015 @cindex file format, frozen file version 2
9016 The syntax of version 1 has some drawbacks; if any macro or diversion
9017 contained non-printable characters or long lines, the resulting frozen
9018 file would not qualify as a text file, making it harder to edit with
9019 some vendor tools.  The concatenation of multiple strings on a single
9020 line, such as for the @samp{T} directive, makes distinguishing the two
9021 strings a bit more difficult.  Finally, the format lacks support for
several items of @code{m4} state, such that a reloaded file does not
always behave the same as the original file.
9024
9025 These shortcomings have been addressed in version 2 of the frozen file
9026 syntax.  New directives have been added, and existing directives have
9027 additional, and sometimes optional, parameters.  All @var{str} instances
9028 in the grammar are now followed by @key{NL}, which makes the split
9029 between consecutive strings easier to recognize.  Strings may now
9030 contain escape sequences modeled after C, such as @samp{\n} for newline
9031 or @samp{\0} for @sc{nul}, so that the frozen file can be pure
9032 @sc{ascii} (although when hand-editing a frozen file, it is still
9033 acceptable to use the original byte rather than an escape sequence for
9034 all bytes except @samp{\}).  Also in the context of a @var{str}, the
9035 escape sequence @samp{\@key{NL}} is discarded, allowing a user to split
9036 lines that are too long for some platform tools.
9037
9038 @table @code
9039 @item V @var{number} @key{NL}
9040 Confirms the format of the file.  @code{m4} @value{VERSION} only creates
9041 frozen files where @var{number} is 2.  This directive must be the first
9042 non-comment in the file, and may not appear more than once.
9043
9044 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9045 Uses @var{str1} and @var{str2} as the begin-comment and
9046 end-comment strings.  If omitted, then @samp{#} and @key{NL} are the
9047 comment delimiters.
9048
9049 @item d @var{len} @key{NL} @var{str} @key{NL}
9050 Sets the debug flags, using @var{str} as the argument to
9051 @code{debugmode}.  If omitted, then the debug flags start in their
9052 default disabled state.
9053
9054 @item D @var{number} , @var{len} @key{NL} @var{str} @key{NL}
9055 Selects diversion @var{number}, making it current, then copies @var{str}
9056 into the current diversion.  @var{number} may be a negative number for a
9057 diversion that discards text.  To merely specify an active selection,
9058 use this command with an empty @var{str}.  With 0 as the diversion
9059 @var{number}, @var{str} will be issued on standard output at reload
9060 time.  @acronym{GNU} @code{m4} will not produce the @samp{D} directive
9061 with non-zero length for diversion 0, but this can be done with manual
9062 edits.  This directive may appear more than once for the same diversion,
9063 in which case the diversion is the concatenation of the various uses.
9064 If omitted, then diversion 0 is current.
9065
9066 @comment FIXME - the first usage, with only one string, is not supported
9067 @comment in the current code
9068 @c @item F @var{len1} @key{NL} @var{str1} @key{NL}
9069 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9070 @itemx F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9071 Defines, through @code{pushdef}, a definition for @var{str1} expanding
9072 to the function whose builtin name is given by @var{str2} (defaulting to
9073 @var{str1} if not present).  With two arguments, the builtin name is
9074 searched for among the intrinsic builtin functions only; with three
9075 arguments, the builtin name is searched for amongst the builtin
9076 functions defined by the module named by @var{str3}.
9077
9078 @item M @var{len} @key{NL} @var{str} @key{NL}
9079 Names a module which will be searched for according to the module search
9080 path and loaded.  Modules loaded from a frozen file don't add their
9081 builtin entries to the symbol table.  Modules must be loaded prior to
9082 specifying module-specific builtins via the three-argument @code{F} or
9083 @code{T}.
9084
9085 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9086 Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings.
9087 If omitted, then @samp{`} and @samp{'} are the quote delimiters.
9088
9089 @item R @var{len} @key{NL} @var{str} @key{NL}
9090 Sets the default regexp syntax, where @var{str} encodes one of the
9091 regular expression syntaxes supported by @acronym{GNU} M4.
9092 @xref{Changeresyntax}, for more details.
9093
9094 @item S @var{syntax-code} @var{len} @key{NL} @var{str} @key{NL}
9095 Defines, through @code{changesyntax}, a syntax category for each of the
9096 characters in @var{str}.  The @var{syntax-code} must be one of the
9097 characters described in @ref{Changesyntax}.
9098
9099 @item t @var{len} @key{NL} @var{str} @key{NL}
9100 Enables tracing for any macro named @var{str}, similar to using the
9101 @code{traceon} builtin.  This directive may occur more than once for
9102 multiple macros; if omitted, no macro starts out as traced.
9103
9104 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9105 @itemx T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9106 Defines, through @code{pushdef}, a definition for @var{str1} expanding to
9107 the text given by @var{str2}.  This directive may appear more than once
9108 for the same name, and its order, along with @samp{F}, is important.  If
9109 present, the optional third argument associates the macro with a module
9110 named by @var{str3}.
9111 @end table
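
As a rough illustration of the grammar above, a hand-written version 2
fragment might look like the following.  This is only a sketch derived
from the directive descriptions, not output captured from @code{m4}: the
macro name @samp{greet} and its text are hypothetical, and the leading
@samp{#} comment line assumes that comments are written as in version 1.

@comment ignore
@example
# hand-written sketch, not generated by m4
V2
Q1,1
`
'
T5,13
greet
Hello, world!
t5
greet
@end example

@noindent
Here @samp{Q} selects the default quote delimiters, @samp{T} defines
@samp{greet} as a 13-byte text macro, and @samp{t} marks it as traced.
Each @var{str} sits on its own line, which is what makes the split
between consecutive strings easy to recognize.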
9112
9113 @node Compatibility
9114 @chapter Compatibility with other versions of @code{m4}
9115
9116 @cindex compatibility
9117 This chapter describes many of the differences between this
9118 implementation of @code{m4} and other implementations found under
9119 UNIX, such as System V Release 3, Solaris, and @acronym{BSD} flavors.
9120 In particular, it lists the known differences and extensions to
9121 @acronym{POSIX}.  However, the list is not necessarily comprehensive.
9122
9123 At the time of this writing, @acronym{POSIX} 2001 (also known as IEEE
9124 Std 1003.1-2001) is the latest standard, although a new version of
9125 @acronym{POSIX} is under development and includes several proposals for
9126 modifying what @code{m4} is required to do.  The requirements for
9127 @code{m4} are shared between @acronym{SUSv3} and @acronym{POSIX}, and
9128 can be viewed at
9129 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
9130
9131 @menu
9132 * Extensions::                  Extensions in @acronym{GNU} M4
9133 * Incompatibilities::           Other incompatibilities
9134 * Experiments::                 Experimental features in @acronym{GNU} M4
9135 @end menu
9136
9137 @node Extensions
9138 @section Extensions in @acronym{GNU} M4
9139
9140 @cindex @acronym{GNU} extensions
9141 @cindex @acronym{POSIX}
9142 @cindex @env{POSIXLY_CORRECT}
9143 This version of @code{m4} contains a few facilities that do not exist
9144 in System V @code{m4}.  These extra facilities are all suppressed by
9145 using the @option{-G} command line option, unless overridden by other
9146 command line options.
9147 Most of these extensions are compatible with
9148 @uref{http://www.unix.org/single_unix_specification/,
9149 @acronym{POSIX}}; the few exceptions are suppressed if the
9150 @env{POSIXLY_CORRECT} environment variable is set.
9151
9152 @itemize @bullet
9153 @item
9154 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
9155 several digits, while the System V @code{m4} only accepts one digit.
9156 This allows macros in @acronym{GNU} @code{m4} to take any number of
9157 arguments, and not only nine (@pxref{Arguments}).
9158 @acronym{POSIX} does not allow this extension, so it is disabled if
9159 @env{POSIXLY_CORRECT} is set.
9160 @c FIXME - update this bullet when ${11} is implemented.
9161
9162 @item
9163 The @code{divert} (@pxref{Divert}) macro can manage more than 9
9164 diversions.  @acronym{GNU} @code{m4} treats all positive numbers as valid
9165 diversions, rather than discarding diversions greater than 9.
9166
9167 @item
9168 Files included with @code{include} and @code{sinclude} are sought in a
9169 user-specified search path, if they are not found in the working
9170 directory.  The search path is specified by the @option{-I} option and the
9171 @samp{M4PATH} environment variable (@pxref{Search Path}).
9172
9173 @item
9174 Arguments to @code{undivert} can be non-numeric, in which case the named
9175 file will be included uninterpreted in the output (@pxref{Undivert}).
9176
9177 @item
9178 Formatted output is supported through the @code{format} builtin, which
9179 is modeled after the C library function @code{printf} (@pxref{Format}).
9180
9181 @item
9182 Searches and text substitution through regular expressions are supported
9183 by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
9184 (@pxref{Patsubst}) builtins.
9185
9186 The syntax of regular expressions in M4 has never been clearly
9187 formalized.  While Open@acronym{BSD} M4 uses extended regular
9188 expressions for @code{regexp} and @code{patsubst}, @acronym{GNU} M4
9189 defaults to basic regular expressions, but provides
9190 @code{changeresyntax} (@pxref{Changeresyntax}) to change the flavor of
9191 regular expression syntax in use.
9192
9193 @item
9194 The output of shell commands can be read into @code{m4} with
9195 @code{esyscmd} (@pxref{Esyscmd}).
9196
9197 @item
9198 There is indirect access to any builtin macro with @code{builtin}
9199 (@pxref{Builtin}).
9200
9201 @item
9202 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
9203
9204 @item
9205 The name of the program, the current input file, and the current input
9206 line number are accessible through the builtins @code{@w{__program__}},
9207 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
9208
9209 @item
9210 The generation of sync lines can be controlled through @code{syncoutput}
9211 (@pxref{Syncoutput}).
9212
9213 @item
9214 The format of the output from @code{dumpdef} and macro tracing can be
9215 controlled with @code{debugmode} (@pxref{Debugmode}).
9216
9217 @item
9218 The destination of trace and debug output can be controlled with
9219 @code{debugfile} (@pxref{Debugfile}).
9220
9221 @item
9222 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
9223 creating a new file with a unique name on every invocation, rather than
9224 following the insecure behavior of replacing the trailing @samp{X}
9225 characters with the @code{m4} process id.  @acronym{POSIX} does not
9226 allow this extension, so @code{maketemp} is insecure if
9227 @env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in
9228 the first place.
9229
9230 @item
9231 @acronym{POSIX} only requires support for the command line options
9232 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
9233 by @acronym{GNU} M4 are extensions.  @xref{Invoking m4}, for a
9234 description of these options.
9235
9236 @item
9237 The debugging and tracing facilities in @acronym{GNU} @code{m4} are much
9238 more extensive than in most other versions of @code{m4}.
9239
9240 @item
9241 Some traditional implementations only allow reading standard input
9242 once, but @acronym{GNU} @code{m4} correctly handles multiple instances
9243 of @samp{-} on the command line.
9244
9245 @item
9246 @acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
9247 (first-in, first-out) order, and most other implementations obey this.
9248 However, versions of @acronym{GNU} @code{m4} earlier than 1.6 used
9249 LIFO order.  Furthermore, @acronym{POSIX} states that only the first
9250 argument to @code{m4wrap} is saved for later evaluation, but
9251 @acronym{GNU} @code{m4} saves and processes all arguments, with output
9252 separated by spaces.
9253
9254 @item
9255 @acronym{POSIX} states that builtins that require arguments, but are
9256 called without arguments, have undefined behavior.  Traditional
9257 implementations simply behave as though empty strings had been passed.
9258 For example, @code{a`'define`'b} would expand to @code{ab}.  But
9259 @acronym{GNU} @code{m4} ignores certain builtins if they have missing
9260 arguments, giving @code{adefineb} for the above example.
9261 @end itemize
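
A few of these extensions are easy to see in a short session.  The
following is a minimal sketch, assuming @env{POSIXLY_CORRECT} is unset
and @option{-G} is not in effect; the macro names @code{tenth} and
@code{greeting} are hypothetical and used only for illustration.

@comment ignore
@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
format(`%05d', `42')
@result{}00042
define(`greeting', `hello')
@result{}
indir(`greeting')
@result{}hello
builtin(`len', `abc')
@result{}3
@end example

@noindent
An implementation that accepts only a single digit would instead read
@samp{$10} as @samp{$1} followed by a literal @samp{0}.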
9262
9263 @node Incompatibilities
9264 @section Other incompatibilities
9265
9266 There are a few other incompatibilities between this implementation of
9267 @code{m4}, and what @acronym{POSIX} requires, or what the System V
9268 version implemented.
9269
9270 @itemize @bullet
9271 @item
9272 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
9273 by undefining the entire stack of previous definitions, as if doing
9274 @code{undefine(`f')} first.  @acronym{GNU} @code{m4} replaces just the top
9275 definition on the stack, as if doing @code{popdef(`f')} followed by
9276 @code{pushdef(`f',`1')}.  @acronym{POSIX} allows either behavior.
9277
9278 @item
9279 At one point, @acronym{POSIX} required @code{changequote(@var{arg})}
9280 (@pxref{Changequote}) to use newline as the close quote, but this was a
9281 bug, and the next version of @acronym{POSIX} is anticipated to state
9282 that using empty strings or just one argument is unspecified.
9283 Meanwhile, the @acronym{GNU} @code{m4} behavior of treating an empty
9284 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
9285 repeating the start-quote delimiter, and @acronym{BSD} treats it as leaving the
9286 previous end-quote delimiter unchanged.  For predictable results, never
9287 call @code{changequote} with just one argument, or with empty strings for
9288 arguments.
9289
9290 @item
9291 At one point, @acronym{POSIX} required @code{changecom(@var{arg},)}
9292 (@pxref{Changecom}) to make it impossible to end a comment, but this is
9293 a bug, and the next version of @acronym{POSIX} is anticipated to state
9294 that using empty strings is unspecified.  Meanwhile, the @acronym{GNU}
9295 @code{m4} behavior of treating an empty end-comment delimiter as newline
9296 is not portable, as @acronym{BSD} treats it as leaving the previous end-comment
9297 delimiter unchanged.  It is also impossible in @acronym{BSD} implementations to
9298 disable comments, even though that is required by @acronym{POSIX}.  For
9299 predictable results, never call @code{changecom} with empty strings for
9300 arguments.
9301
9302 @item
9303 Traditional implementations allow argument collection, but not string
9304 and comment processing, to span file boundaries.  Thus, if @file{a.m4}
9305 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
9306 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
9307 gives an error message that the end of file was encountered inside a
9308 macro with @acronym{GNU} @code{m4}.  On the other hand, traditional
9309 implementations do end of file processing for files included with
9310 @code{include} or @code{sinclude} (@pxref{Include}), while @acronym{GNU}
9311 @code{m4} seamlessly integrates the content of those files.  Thus
9312 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of
9313 giving an error.
9314
9315 @item
9316 @acronym{POSIX} requires @code{eval} (@pxref{Eval}) to treat all
9317 operators with the same precedence as C@.  However, earlier versions of
9318 @acronym{GNU} @code{m4} followed the traditional behavior of other
9319 @code{m4} implementations, where bitwise and logical negation (@samp{~}
9320 and @samp{!}) had lower precedence than equality operators; and where
9321 equality operators (@samp{==} and @samp{!=}) had the same precedence as
9322 relational operators (such as @samp{<}).  Use explicit parentheses to
9323 ensure proper precedence.  As extensions to @acronym{POSIX},
9324 @acronym{GNU} @code{m4} gives well-defined semantics to operations that
9325 C leaves undefined, such as when overflow occurs, when shifting negative
9326 numbers, or when performing division by zero.  @acronym{POSIX} also
9327 requires @samp{=} to cause an error, but many traditional
9328 implementations allowed it as an alias for @samp{==}.
9329
9330 @item
9331 @acronym{POSIX} 2001 requires @code{translit} (@pxref{Translit}) to
9332 treat each character of the second and third arguments literally.
9333 However, it is anticipated that the next version of @acronym{POSIX} will
9334 allow the @acronym{GNU} @code{m4} behavior of treating @samp{-} as a
9335 range operator.
9336
9337 @item
9338 @acronym{POSIX} requires @code{m4} to honor the locale environment
9339 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
9340 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
9341 implemented in @acronym{GNU} @code{m4}.
9342
9343 @item
9344 @acronym{GNU} @code{m4} implements sync lines differently from System V
9345 @code{m4}, when text is being diverted.  @acronym{GNU} @code{m4} outputs
9346 the sync lines when the text is being diverted, whereas System V @code{m4}
9347 outputs them when the diverted text is being brought back.
9348
9349 The problem is which lines and file names should be attached to text
9350 that is being, or has been, diverted.  System V @code{m4} regards all
9351 the diverted text as being generated by the source line containing the
9352 @code{undivert} call, whereas @acronym{GNU} @code{m4} regards the
9353 diverted text as being generated at the time it is diverted.
9354
9355 The sync line option is used mostly when using @code{m4} as
9356 a front end to a compiler.  If a diverted line causes a compiler error,
9357 the error messages should most probably refer to the place where the
9358 diversion was made, and not where it was inserted again.
9359
9360 @comment options: -s
9361 @example
9362 divert(2)2
9363 divert(1)1
9364 divert`'0
9365 @result{}#line 3 "stdin"
9366 @result{}0
9367 ^D
9368 @result{}#line 2 "stdin"
9369 @result{}1
9370 @result{}#line 1 "stdin"
9371 @result{}2
9372 @end example
9373
9374 @comment FIXME - this needs to be fixed before 2.0.
9375 The current @code{m4} implementation has a limitation that the syncline
9376 output at the start of each diversion occurs no matter what, even if the
9377 previous diversion did not end with a newline.  This goes contrary to
9378 the claim that synclines appear on a line by themselves, so this
9379 limitation may be corrected in a future version of @code{m4}.  In the
9380 meantime, when using @option{-s}, it is wisest to make sure all
9381 diversions end with newline.
9382
9383 @item
9384 @acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential
9385 definitions like:
9386
9387 @comment ignore
9388 @example
9389 define(`x', `x')
9390 @result{}
9391 define(`x', `x ')
9392 @result{}
9393 @end example
9394
9395 @cindex rescanning
9396 There is nothing inherently wrong with defining @samp{x} to
9397 return @samp{x}.  The wrong thing is to expand @samp{x} unquoted,
9398 because that would cause an infinite rescan loop.
9399 In @code{m4}, one might use macros to hold strings, as we do for
9400 variables in other programming languages, further checking them with:
9401
9402 @comment ignore
9403 @example
9404 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
9405 @end example
9406
9407 @noindent
9408 In cases like this one, an interdiction for a macro to hold its own name
9409 would be a useless limitation.  Of course, this leaves more rope for the
9410 @acronym{GNU} @code{m4} user to hang himself!  Rescanning hangs may be
9411 avoided through careful programming, a little like for endless loops in
9412 traditional programming languages.
9413
9414 @item
9415 @acronym{POSIX} states that only unquoted leading newlines and blanks
9416 (that is, space and tab) are ignored when collecting macro arguments.
9417 However, this appears to be a bug in @acronym{POSIX}, since most
9418 traditional implementations also ignore all whitespace (formfeed,
9419 carriage return, and vertical tab).  @acronym{GNU} @code{m4} follows
9420 tradition and ignores all leading unquoted whitespace.
9421 @end itemize
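
Two of the differences above can be observed directly.  The sketch
below shows what @acronym{GNU} @code{m4} does; the macro name @code{f}
follows the item on @code{define}, and the definitions @samp{first} and
@samp{second} are hypothetical.

@comment ignore
@example
pushdef(`f', `first')pushdef(`f', `second')
@result{}
define(`f', `1')
@result{}
f
@result{}1
popdef(`f')f
@result{}first
eval(`(!2) == 1')
@result{}0
eval(`!(2 == 1)')
@result{}1
@end example

@noindent
A traditional implementation would discard the whole stack at the
@code{define}, so the final @samp{f} would be left undefined.  The two
@code{eval} calls use explicit parentheses, so they give the same
answer regardless of which precedence rules are in effect, whereas the
unparenthesized @samp{!2 == 1} could yield either result.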
9422
9423 @node Experiments
9424 @section Experimental features in @acronym{GNU} M4
9425
9426 Certain features of @acronym{GNU} @code{m4} are experimental.
9427
9428 Some are only available if activated by an option given to
9429 @file{m4-@value{VERSION}/@/configure} at @acronym{GNU} @code{m4} installation
9430 time.  The functionality
9431 might change or even go away in the future.  @emph{Do not rely on it}.
9432 Please direct your comments about it the same way you would do for bugs.
9433
9434 @section Changesyntax
9435
9436 An experimental feature, which improves the flexibility of @code{m4},
9437 allows for changing the way the input is parsed (@pxref{Changesyntax}).
9438 No compile time option is needed for @code{changesyntax}.  The
9439 implementation is careful to not slow down @code{m4} parsing, unlike the
9440 withdrawn experiment of @code{changeword} that appeared earlier in M4
9441 1.4.x.
9442
9443 @section Multiple precision arithmetic
9444
9445 Another experimental feature, which would improve @code{m4} usefulness,
9446 allows for multiple precision rational arithmetic similar to
9447 @code{eval}.  You must have the @acronym{GNU} multi-precision (gmp)
9448 library installed, and should use @kbd{./configure --with-gmp} if you
9449 want this feature compiled in.  The current implementation is unproven
9450 and might go away.  Do not count on it yet.
9451
9452 @node Answers
9453 @chapter Correct version of some examples
9454
9455 Some of the examples in this manual are buggy or not very robust, for
9456 demonstration purposes.  Improved versions of these composite macros are
9457 presented here.
9458
9459 @menu
9460 * Improved exch::               Solution for @code{exch}
9461 * Improved forloop::            Solution for @code{forloop}
9462 * Improved foreach::            Solution for @code{foreach}
9463 * Improved copy::               Solution for @code{copy}
9464 * Improved m4wrap::             Solution for @code{m4wrap}
9465 * Improved cleardivert::        Solution for @code{cleardivert}
9466 * Improved capitalize::         Solution for @code{capitalize}
9467 * Improved fatal_error::        Solution for @code{fatal_error}
9468 @end menu
9469
9470 @node Improved exch
9471 @section Solution for @code{exch}
9472
9473 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
9474 to double quote their arguments.  A nicer definition, which lets
9475 clients follow the rule of thumb of one level of quoting per level of
9476 parentheses, involves adding quotes in the definition of @code{exch}, as
9477 follows:
9478
9479 @example
9480 define(`exch', ``$2', `$1'')
9481 @result{}
9482 define(exch(`expansion text', `macro'))
9483 @result{}
9484 macro
9485 @result{}expansion text
9486 @end example
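
For contrast, the sketch below assumes the earlier definition was the
unquoted @samp{$2, $1}; with that form, the caller has to supply the
extra layer of quotes to get the same result.

@comment ignore
@example
define(`exch', `$2, $1')
@result{}
define(exch(``expansion text'', ``macro''))
@result{}
macro
@result{}expansion text
@end example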
9487
9488 @node Improved forloop
9489 @section Solution for @code{forloop}
9490
9491 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
9492 into an infinite loop if given an iterator that is not parsed as a macro
9493 name.  It does not do any sanity checking on its numeric bounds, and
9494 only permits decimal numbers for bounds.  Here is an improved version,
9495 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
9496 version also optimizes overhead by calling four macros instead of six
9497 per iteration (excluding those in @var{text}), by not dereferencing the
9498 @var{iterator} in the helper @code{@w{_forloop}}.
9499
9500 @comment examples
9501 @example
9502 $ @kbd{m4 -I examples}
9503 undivert(`forloop2.m4')dnl
9504 @result{}divert(`-1')
9505 @result{}# forloop(var, from, to, stmt) - improved version:
9506 @result{}#   works even if VAR is not a strict macro name
9507 @result{}#   performs sanity check that FROM is larger than TO
9508 @result{}#   allows complex numerical expressions in TO and FROM
9509 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9510 @result{}  `pushdef(`$1')_$0(`$1', eval(`$2'),
9511 @result{}    eval(`$3'), `$4')popdef(`$1')')')
9512 @result{}define(`_forloop',
9513 @result{}  `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
9514 @result{}    `$0(`$1', incr(`$2'), `$3', `$4')')')
9515 @result{}divert`'dnl
9516 include(`forloop2.m4')
9517 @result{}
9518 forloop(`i', `2', `1', `no iteration occurs')
9519 @result{}
9520 forloop(`', `1', `2', ` odd iterator name')
9521 @result{} odd iterator name odd iterator name
9522 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
9523 @result{} 0xa 0xb 0xc
9524 forloop(`i', `a', `b', `non-numeric bounds')
9525 @error{}m4:stdin:6: warning: eval: bad input: '(a) <= (b)'
9526 @result{}
9527 @end example
9528
9529 One other change to notice is that the improved version used @samp{_$0}
9530 rather than @samp{_forloop} to invoke the helper routine.  In general,
9531 this is a good practice to follow, because then the set of macros can be
9532 uniformly transformed.  The following example shows a transformation
9533 that doubles the current quoting and appends a suffix @samp{2} to each
9534 transformed macro.  If @code{forloop} refers to the literal
9535 @samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
9536 the intended @code{_forloop2}, and the mixing of quoting paradigms leads
9537 to an infinite recursion loop in this example.
9538
9539 @comment options: -L9
9540 @comment status: 1
9541 @comment examples
9542 @example
9543 $ @kbd{m4 -d -L 9 -I examples}
9544 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
9545 @result{}
9546 define(`double', `define(`$1'`2',
9547   arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
9548 @result{}
9549 double(`forloop')double(`_forloop')defn(`forloop2')
9550 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
9551 @result{}  ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
9552 @result{}    eval(``$3''), ``$4'')popdef(``$1'')'')
9553 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9554 @result{}
9555 changequote(`[', `]')changequote([``], [''])
9556 @result{}
9557 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9558 @result{}
9559 changequote`'include(`forloop.m4')
9560 @result{}
9561 double(`forloop')double(`_forloop')defn(`forloop2')
9562 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
9563 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9564 @result{}
9565 changequote(`[', `]')changequote([``], [''])
9566 @result{}
9567 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9568 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
9569 @end example
9570
9571 One more optimization is still possible.  Instead of repeatedly
9572 assigning a variable then invoking or dereferencing it, it is possible
9573 to pass the current iterator value as a single argument.  Coupled with
9574 @code{curry} if other arguments are needed (@pxref{Composition}), or
9575 with helper macros if the argument is needed in more than one place in
9576 the expansion, the output can be generated with three, rather than four,
9577 macros of overhead per iteration.  Notice how the file
9578 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
9579 arguments of the helper @code{_forloop} to take two arguments that are
9580 placed around the current value.  By splitting a balanced set of
9581 parentheses across multiple arguments, the helper macro can now be
9582 shared by @code{forloop} and the new @code{forloop_arg}.
9583
9584 @comment examples
9585 @example
9586 $ @kbd{m4 -I examples}
9587 include(`forloop3.m4')
9588 @result{}
9589 undivert(`forloop3.m4')dnl
9590 @result{}divert(`-1')
9591 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
9592 @result{}#   each value between FROM and TO, without define overhead
9593 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
9594 @result{}  `_forloop(`$1', eval(`$2'), `$3(', `)')')')
9595 @result{}# forloop(var, from, to, stmt) - refactored to share code
9596 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9597 @result{}  `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
9598 @result{}    `define(`$1',', `)$4')popdef(`$1')')')
9599 @result{}define(`_forloop',
9600 @result{}  `$3`$1'$4`'ifelse(`$1', `$2', `',
9601 @result{}    `$0(incr(`$1'), `$2', `$3', `$4')')')
9602 @result{}divert`'dnl
9603 forloop(`i', `1', `3', ` i')
9604 @result{} 1 2 3
9605 define(`echo', `$@@')
9606 @result{}
9607 forloop_arg(`1', `3', ` echo')
9608 @result{} 1 2 3
9609 include(`curry.m4')
9610 @result{}
9611 forloop_arg(`1', `3', `curry(`pushdef', `a')')
9612 @result{}
9613 a
9614 @result{}3
9615 popdef(`a')a
9616 @result{}2
9617 popdef(`a')a
9618 @result{}1
9619 popdef(`a')a
9620 @result{}a
9621 @end example
9622
9623 Of course, it is possible to make even more improvements, such as
9624 adding an optional step argument, or allowing iteration through
9625 descending sequences.  @acronym{GNU} Autoconf provides some of these
9626 additional bells and whistles in its @code{m4_for} macro.
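
As a sketch of the first suggestion, a step argument can be added by
letting the helper compute the next value with @code{eval}.  The macro
name @code{forloop_step} and its argument order are hypothetical (this
is neither the shipped @file{forloop2.m4} nor Autoconf's
@code{m4_for}), and only positive steps are handled.

@comment ignore
@example
define(`forloop_step',
  `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
@end example

@noindent
Because the guard in @code{forloop_step} rejects a step that is not
positive, a zero or negative step produces no iterations rather than
recursing forever.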
9627
9628 @node Improved foreach
9629 @section Solution for @code{foreach}
9630
9631 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
9632 presented earlier each have flaws.  First, we will examine and fix the
9633 quadratic behavior of @code{foreachq}:
9634
9635 @comment examples
9636 @example
9637 $ @kbd{m4 -I examples}
9638 include(`foreachq.m4')
9639 @result{}
9640 traceon(`shift')debugmode(`aq')
9641 @result{}
9642 foreachq(`x', ``1', `2', `3', `4'', `x
9643 ')dnl
9644 @result{}1
9645 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9646 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9647 @result{}2
9648 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9649 @error{}m4trace: -3- shift(`2', `3', `4')
9650 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9651 @error{}m4trace: -2- shift(`2', `3', `4')
9652 @result{}3
9653 @error{}m4trace: -5- shift(`1', `2', `3', `4')
9654 @error{}m4trace: -4- shift(`2', `3', `4')
9655 @error{}m4trace: -3- shift(`3', `4')
9656 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9657 @error{}m4trace: -3- shift(`2', `3', `4')
9658 @error{}m4trace: -2- shift(`3', `4')
9659 @result{}4
9660 @error{}m4trace: -6- shift(`1', `2', `3', `4')
9661 @error{}m4trace: -5- shift(`2', `3', `4')
9662 @error{}m4trace: -4- shift(`3', `4')
9663 @error{}m4trace: -3- shift(`4')
9664 @end example
9665
9666 @cindex quadratic behavior, avoiding
9667 @cindex avoiding quadratic behavior
9668 Each successive iteration was adding more quoted @code{shift}
9669 invocations, and the entire list contents were passing through every
9670 iteration.  In general, when recursing, it is a good idea to make the
9671 recursion use fewer arguments, rather than adding additional quoted
9672 uses of @code{shift}.  By doing so, @code{m4} uses less memory, invokes
9673 fewer macros, is less likely to run into machine limits, and most
9674 importantly, performs faster.  The fixed version of @code{foreachq} can
9675 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
9676
9677 @comment examples
9678 @example
9679 $ @kbd{m4 -I examples}
9680 include(`foreachq2.m4')
9681 @result{}
9682 undivert(`foreachq2.m4')dnl
9683 @result{}include(`quote.m4')dnl
9684 @result{}divert(`-1')
9685 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9686 @result{}#   quoted list, improved version
9687 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
9688 @result{}define(`_arg1q', ``$1'')
9689 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
9690 @result{}define(`_foreachq', `ifelse(`$2', `', `',
9691 @result{}  `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
9692 @result{}divert`'dnl
9693 traceon(`shift')debugmode(`aq')
9694 @result{}
9695 foreachq(`x', ``1', `2', `3', `4'', `x
9696 ')dnl
9697 @result{}1
9698 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9699 @result{}2
9700 @error{}m4trace: -3- shift(`2', `3', `4')
9701 @result{}3
9702 @error{}m4trace: -3- shift(`3', `4')
9703 @result{}4
9704 @end example
9705
9706 Note that the fixed version calls unquoted helper macros in
9707 @code{@w{_foreachq}} to trim elements immediately; those helper macros
9708 in turn must re-supply the layer of quotes lost in the macro invocation.
9709 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
9710 element, with @code{@w{_arg1}} of the earlier implementation that
9711 returned the first list element directly.  Additionally, by calling the
9712 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
9713 contains unexpanded macros.
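
The difference between the two helpers can be seen in isolation.  In
this sketch, the definition of @code{_arg1} is assumed to match the
earlier implementation, and the macro @code{active} is hypothetical.

@comment ignore
@example
define(`_arg1', `$1')define(`_arg1q', ``$1'')
@result{}
define(`active', `ACTIVE')
@result{}
_arg1(`active', `b')
@result{}ACTIVE
_arg1q(`active', `b')
@result{}active
@end example

@noindent
The unquoted helper lets its result be rescanned as a macro call, while
the quoting helper hands back the element as literal text.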
9714
9715 The astute m4 programmer might notice that the solution above still uses
9716 more macro invocations than strictly necessary.  Note that @samp{$2},
9717 which contains an arbitrarily long quoted list, is expanded and
9718 rescanned three times per iteration of @code{_foreachq}.  Furthermore,
9719 every iteration of the algorithm effectively unboxes then reboxes the
9720 list, which costs a couple of macro invocations.  It is possible to
9721 rewrite the algorithm by swapping the order of the arguments to
9722 @code{_foreachq} in order to operate on an unboxed list in the first
9723 place, and by using the fixed-length @samp{$#} instead of an arbitrary
9724 length list as the key to end recursion.  The result is an overhead of
9725 six macro invocations per loop (excluding any macros in @var{text}),
9726 instead of eight.  This alternative approach is available as
9727 @file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
9728
9729 @comment examples
9730 @example
9731 $ @kbd{m4 -I examples}
9732 include(`foreachq3.m4')
9733 @result{}
9734 undivert(`foreachq3.m4')dnl
9735 @result{}divert(`-1')
9736 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9737 @result{}#   quoted list, alternate improved version
9738 @result{}define(`foreachq', `ifelse(`$2', `', `',
9739 @result{}  `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
9740 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
9741 @result{}  `define(`$1', `$4')$2`'$0(`$1', `$2',
9742 @result{}    shift(shift(shift($@@))))')')
9743 @result{}divert`'dnl
9744 traceon(`shift')debugmode(`aq')
9745 @result{}
9746 foreachq(`x', ``1', `2', `3', `4'', `x
9747 ')dnl
9748 @result{}1
9749 @error{}m4trace: -4- shift(`x', `x
9750 @error{}', `', `1', `2', `3', `4')
9751 @error{}m4trace: -3- shift(`x
9752 @error{}', `', `1', `2', `3', `4')
9753 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
9754 @result{}2
9755 @error{}m4trace: -4- shift(`x', `x
9756 @error{}', `1', `2', `3', `4')
9757 @error{}m4trace: -3- shift(`x
9758 @error{}', `1', `2', `3', `4')
9759 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9760 @result{}3
9761 @error{}m4trace: -4- shift(`x', `x
9762 @error{}', `2', `3', `4')
9763 @error{}m4trace: -3- shift(`x
9764 @error{}', `2', `3', `4')
9765 @error{}m4trace: -2- shift(`2', `3', `4')
9766 @result{}4
9767 @error{}m4trace: -4- shift(`x', `x
9768 @error{}', `3', `4')
9769 @error{}m4trace: -3- shift(`x
9770 @error{}', `3', `4')
9771 @error{}m4trace: -2- shift(`3', `4')
9772 @end example
9773
9774 Prior to M4 1.6, every instance of @samp{$@@} was rescanned as it was
9775 encountered.  Thus, the @file{foreachq3.m4} alternative used much less
9776 memory than @file{foreachq2.m4}, and executed as much as 10% faster,
9777 since each iteration encountered fewer @samp{$@@}.  However, the
9778 implementation of rescanning every byte in @samp{$@@} was quadratic in
9779 the number of bytes scanned (for example, making the broken version in
9780 @file{foreachq.m4} cubic, rather than quadratic, in behavior).  Once the
9781 underlying M4 implementation was improved in 1.6 to reuse results of
9782 previous scans, both styles of @code{foreachq} became linear in the
9783 number of bytes scanned, but the @file{foreachq3.m4} version remains
9784 noticeably faster because of fewer macro invocations.  Notice how the
9785 implementation injects an empty argument prior to expanding @samp{$2}
9786 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
9787 the third argument altogether, and ends recursion when there are three
9788 arguments left because there was nothing left to pass through
9789 @code{shift}.  Thus, each iteration only needs one @code{ifelse}, rather
9790 than the two conditionals used in the version from @file{foreachq2.m4}.
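
The effect of the injected empty argument on @samp{$#} can be checked
in isolation.  In this sketch, @code{nargs} is a hypothetical helper
(not part of the example files) that reports its argument count, and
the argument lists mimic what successive calls of @code{_foreachq}
would see for a two-element list:

@example
define(`nargs', `$#')
@result{}
nargs(`x', `body', `', `1', `2')
@result{}5
nargs(`x', `body', shift(shift(shift(`x', `body', `', `1', `2'))))
@result{}4
nargs(`x', `body', shift(shift(shift(`x', `body', `1', `2'))))
@result{}3
@end example

The count drops by one per iteration, and reaches three once the list
is exhausted.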
9791
9792 @cindex nine arguments, more than
9793 @cindex more than nine arguments
9794 @cindex arguments, more than nine
9795 So far, all of the implementations of @code{foreachq} presented have
9796 been quadratic with M4 1.4.x.  But @code{forloop} is linear, because
9797 each iteration parses a constant number of arguments.  So, it is
9798 possible to design a variant that uses @code{forloop} to do the
9799 iteration, then uses @samp{$@@} only once at the end, giving a linear
9800 result even with older M4 implementations.  This implementation relies
9801 on the @acronym{GNU} extension that @samp{$10} expands to the tenth
9802 argument rather than the first argument concatenated with @samp{0}.  The
9803 trick is to define an intermediate macro that repeats the text
9804 @code{define(`$1', `$@var{n}')$2`'}, with @var{n} set to successive
9805 integers corresponding to each argument.  The helper macro
9806 @code{_foreachq_} is needed in order to generate the literal sequences
9807 such as @samp{$1} into the intermediate macro, rather than expanding
9808 them as the arguments of @code{_foreachq}.  With this approach, no
9809 @code{shift} calls are even needed!  However, when linear recursion is
9810 available in new enough M4, the time and memory cost of using
9811 @code{forloop} to build an intermediate macro outweigh the costs of any
9812 of the previous implementations (there are seven macros of overhead per
9813 iteration instead of six in @file{foreachq3.m4}, and the entire
9814 intermediate macro must be built in memory before any iteration is
9815 expanded).  Additionally, this approach will need adjustment when a
9816 future version of M4 follows @acronym{POSIX} by no longer treating
9817 @samp{$10} as the tenth argument; the anticipation is that
9818 @samp{$@{10@}} can be used instead, although that alternative syntax is
9819 not yet supported.
9820
9821 @comment examples
9822 @example
9823 $ @kbd{m4 -I examples}
9824 include(`foreachq4.m4')
9825 @result{}
9826 undivert(`foreachq4.m4')dnl
9827 @result{}include(`forloop2.m4')dnl
9828 @result{}divert(`-1')
9829 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9830 @result{}#   quoted list, version based on forloop
9831 @result{}define(`foreachq',
9832 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
9833 @result{}define(`_foreachq',
9834 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
9835 @result{}  `$0_(`1', `2', indir(`$1'))')`popdef(
9836 @result{}    `$1')')indir(`$1', $@@)')
9837 @result{}define(`_foreachq_',
9838 @result{}``define(`$$1', `$$3')$$2`''')
9839 @result{}divert`'dnl
9840 traceon(`shift')debugmode(`aq')
9841 @result{}
9842 foreachq(`x', ``1', `2', `3', `4'', `x
9843 ')dnl
9844 @result{}1
9845 @result{}2
9846 @result{}3
9847 @result{}4
9848 @end example
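
To see what the helper contributes to the intermediate macro, it can
be invoked on its own.  This sketch repeats the definition of
@code{_foreachq_} from @file{foreachq4.m4} so that it stands alone; the
argument values @samp{3} and @samp{4} are merely illustrative of what
@code{forloop} would supply for a two-element list:

@example
define(`_foreachq_', ``define(`$$1', `$$3')$$2`''')
@result{}
_foreachq_(`1', `2', `3')
@result{}define(`$1', `$3')$2`'
_foreachq_(`1', `2', `4')
@result{}define(`$1', `$4')$2`'
@end example

Each @samp{$$@var{n}} in the helper becomes a literal @samp{$} followed
by the value of argument @var{n}, which is how the parameter references
survive into the intermediate macro.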
9849
9850 For yet another approach, the improved version of @code{foreach},
9851 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
9852 overquotes the arguments to @code{@w{_foreach}} to begin with, using
9853 @code{dquote_elt}.  Then @code{@w{_foreach}} can just use
9854 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
9855 front:
9856
9857 @comment examples
9858 @example
9859 $ @kbd{m4 -I examples}
9860 include(`foreach2.m4')
9861 @result{}
9862 undivert(`foreach2.m4')dnl
9863 @result{}include(`quote.m4')dnl
9864 @result{}divert(`-1')
9865 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
9866 @result{}#   parenthesized list, improved version
9867 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
9868 @result{}  (dquote(dquote_elt$2)), `$3')popdef(`$1')')
9869 @result{}define(`_arg1', `$1')
9870 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
9871 @result{}  `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
9872 @result{}divert`'dnl
9873 traceon(`shift')debugmode(`aq')
9874 @result{}
9875 foreach(`x', `(`1', `2', `3', `4')', `x
9876 ')dnl
9877 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9878 @error{}m4trace: -4- shift(`2', `3', `4')
9879 @error{}m4trace: -4- shift(`3', `4')
9880 @result{}1
9881 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
9882 @result{}2
9883 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
9884 @result{}3
9885 @error{}m4trace: -3- shift(``3'', ``4'')
9886 @result{}4
9887 @error{}m4trace: -3- shift(``4'')
9888 @end example
9889
9890 It is likewise possible to write a variant of @code{foreach} that
9891 performs in linear time on M4 1.4.x; the easiest method is probably
9892 writing a version of @code{foreach} that unboxes its list, then invokes
9893 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
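
One possible sketch of such a variant follows.  It is not shipped in
the examples directory; the helper name @code{_unbox} is hypothetical,
and the sketch assumes that @file{foreachq4.m4} behaves as shown
earlier.  The parenthesized list is appended to @code{_unbox}, which
returns its quoted arguments for use by @code{_foreachq}:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`foreachq4.m4')
@result{}
define(`_unbox', `$@@')
@result{}
define(`foreach',
  `ifelse(`$2', `', `', `_foreachq(`$1', `$3', _unbox$2)')')
@result{}
foreach(`x', `(`1', `2')', `<x>')
@result{}<1><2>
@end example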
9894
9895 @cindex filtering defined symbols
9896 @cindex subset of defined symbols
9897 @cindex defined symbols, filtering
9898 With a robust @code{foreachq} implementation, it is possible to create a
9899 filter on a list of defined symbols.  This next example will find all
9900 symbols that contain @samp{if} or @samp{def}, via two different
9901 approaches.  In the first approach, @code{dquote_elt} is used to
9902 overquote each list element, then @code{dquote} forms the list; that
9903 way, the iterator @code{macro} can be expanded in place because its
9904 contents are already quoted.  This approach also uses a self-modifying
9905 macro @code{sep} to provide the correct number of commas.  In the second
9906 approach, the iterator @code{macro} contains live text, so it must be
9907 used with @code{defn} to avoid unintentional expansion.  The correct
9908 number of commas is achieved by using @code{shift} to ignore the first
9909 one, although a leading space still remains.
9910
9911 @comment examples
9912 @example
9913 $ @kbd{m4 -I examples}
9914 include(`quote.m4')include(`foreachq2.m4')
9915 @result{}
9916 pushdef(`sep', `define(`sep', ``, '')')
9917 @result{}
9918 foreachq(`macro', dquote(dquote_elt(m4symbols)),
9919   `regexp(macro, `.*if.*', `sep`\&'')')
9920 @result{}ifdef, ifelse, shift
9921 popdef(`sep')
9922 @result{}
9923 shift(foreachq(`macro', dquote(m4symbols),
9924   `regexp(defn(`macro'), `def', `,` ''dquote(defn(`macro')))'))
9925 @result{} define, defn, dumpdef, ifdef, popdef, pushdef, undefine
9926 @end example
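
The self-modifying separator used in the first approach can also be
observed on its own.  In this minimal sketch, the first expansion of
@code{sep} produces no output but redefines @code{sep}, so that only
later expansions emit the comma:

@example
define(`sep', `define(`sep', ``, '')')
@result{}
sep`'a`'sep`'b`'sep`'c
@result{}a, b, c
@end example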
9927
9928 In summary, recursion over list elements is trickier than it appeared at
9929 first glance, but provides a powerful idiom within @code{m4} processing.
9930 As a final demonstration, both list styles are now able to handle
9931 several scenarios that would wreak havoc on one or both of the original
9932 implementations.  This points out one other difference between the
9933 list styles.  @code{foreach} evaluates unquoted list elements only once,
9934 in preparation for calling @code{@w{_foreach}}; the same holds for
9935 @code{foreachq} as provided by @file{foreachq3.m4} or
9936 @file{foreachq4.m4}.  But
9937 @code{foreachq}, as provided by @file{foreachq2.m4},
9938 evaluates unquoted list elements twice while visiting the first list
9939 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}.  When
9940 deciding which list style to use, one must take into account whether
9941 repeating the side effects of unquoted list elements will have any
9942 detrimental effects.
9943
9944 @comment examples
9945 @example
9946 $ @kbd{m4 -d -I examples}
9947 include(`foreach2.m4')
9948 @result{}
9949 include(`foreachq2.m4')
9950 @result{}
9951 dnl 0-element list:
9952 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
9953 @result{} /@w{ }
9954 dnl 1-element list of empty element
9955 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
9956 @result{}<> / <>
9957 dnl 2-element list of empty elements
9958 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
9959 @result{}<><> / <><>
9960 dnl 1-element list of a comma
9961 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
9962 @result{}<,> / <,>
9963 dnl 2-element list of unbalanced parentheses
9964 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
9965 @result{}<(><)> / <(><)>
9966 define(`ab', `oops')dnl using defn(`iterator')
9967 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
9968  foreachq(`x', ``a', `b'', `defn(`x')')
9969 @result{}ab / ab
9970 define(`active', `ACT, IVE')
9971 @result{}
9972 traceon(`active')
9973 @result{}
9974 dnl list of unquoted macros; expansion occurs before recursion
9975 foreach(`x', `(active, active)', `<x>
9976 ')dnl
9977 @error{}m4trace: -4- active -> `ACT, IVE'
9978 @error{}m4trace: -4- active -> `ACT, IVE'
9979 @result{}<ACT>
9980 @result{}<IVE>
9981 @result{}<ACT>
9982 @result{}<IVE>
9983 foreachq(`x', `active, active', `<x>
9984 ')dnl
9985 @error{}m4trace: -3- active -> `ACT, IVE'
9986 @error{}m4trace: -3- active -> `ACT, IVE'
9987 @result{}<ACT>
9988 @error{}m4trace: -3- active -> `ACT, IVE'
9989 @error{}m4trace: -3- active -> `ACT, IVE'
9990 @result{}<IVE>
9991 @result{}<ACT>
9992 @result{}<IVE>
9993 dnl list of quoted macros; expansion occurs during recursion
9994 foreach(`x', `(`active', `active')', `<x>
9995 ')dnl
9996 @error{}m4trace: -1- active -> `ACT, IVE'
9997 @result{}<ACT, IVE>
9998 @error{}m4trace: -1- active -> `ACT, IVE'
9999 @result{}<ACT, IVE>
10000 foreachq(`x', ``active', `active'', `<x>
10001 ')dnl
10002 @error{}m4trace: -1- active -> `ACT, IVE'
10003 @result{}<ACT, IVE>
10004 @error{}m4trace: -1- active -> `ACT, IVE'
10005 @result{}<ACT, IVE>
10006 dnl list of double-quoted macro names; no expansion
10007 foreach(`x', `(``active'', ``active'')', `<x>
10008 ')dnl
10009 @result{}<active>
10010 @result{}<active>
10011 foreachq(`x', ```active'', ``active''', `<x>
10012 ')dnl
10013 @result{}<active>
10014 @result{}<active>
10015 @end example
10016
10017 @node Improved copy
10018 @section Solution for @code{copy}
10019
10020 The macro @code{copy} presented above works with M4 1.6 and newer, but
10021 is unable to handle builtin tokens with M4 1.4.x, because it tries to
10022 pass the builtin token through the macro @code{curry}, where it is
10023 silently flattened to an empty string (@pxref{Composition}).  Rather
10024 than using the problematic @code{curry} to work around the limitation
10025 that @code{stack_foreach} expects to invoke a macro that takes exactly
10026 one argument, we can write a new macro that lets us form the exact
10027 two-argument @code{pushdef} call sequence needed, so that we are no
10028 longer passing a builtin token through a text macro.
10029
10030 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
10031   @var{sep})
10032 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
10033   @var{post}, @var{sep})
10034 For each of the @code{pushdef} definitions associated with @var{macro},
10035 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
10036 Additionally, expand @var{sep} between definitions.
10037 @code{stack_foreach_sep} visits the oldest definition first, while
10038 @code{stack_foreach_sep_lifo} visits the current definition first.  The
10039 expansion may dereference @var{macro}, but should not modify it.  There
10040 are a few special macros, such as @code{defn}, which cannot be used as
10041 the @var{macro} parameter.
10042 @end deffn
10043
10044 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
10045 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
10046 `)')}.  By supplying explicit parentheses, split among the @var{pre} and
10047 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
10048 construct macro calls with more than one argument, without passing
10049 builtin tokens through a macro call.  It is likewise possible to
10050 directly reference the stack definitions without a macro call, by
10051 leaving @var{pre} and @var{post} empty.  Thus, in addition to fixing
10052 @code{copy} on builtin tokens, the new macro also executes with fewer macro
10053 invocations.
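
As a sketch of both uses, assuming the definitions in
@file{stack_sep.m4}, the following wraps each definition in a call to
a hypothetical one-argument macro @code{show}, then references the
definitions directly with a separator:

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack_sep.m4')
@result{}
pushdef(`a', `1')pushdef(`a', `2')
@result{}
define(`show', `<$1>')
@result{}
stack_foreach_sep(`a', `show(', `)')
@result{}<1><2>
stack_foreach_sep(`a', `', `', `, ')
@result{}1, 2
@end example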
10054
10055 The new macro also adds a separator that is only output after the first
10056 iteration of the helper @code{_stack_reverse_sep}, implemented by
10057 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
10058 argument in subsequent iterations.  Note that the empty string that
10059 separates @var{sep} from @var{pre} is provided as part of the fourth
10060 argument when originally calling @code{_stack_reverse_sep}, and not by
10061 writing @code{$4`'$3} as the third argument in the recursive call; while
10062 the other approach would give the same output, it does so at the expense
10063 of increasing the argument size on each iteration of
10064 @code{_stack_reverse_sep}, which results in quadratic instead of linear
10065 execution time.  The improved stack walking macros are available in
10066 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
10067
10068 @comment examples
10069 @example
10070 $ @kbd{m4 -I examples}
10071 include(`stack_sep.m4')
10072 @result{}
10073 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
10074 ')m4exit(`1')',
10075    `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
10076 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
10077 @result{}
10078 copy(`a', `b')
10079 @result{}
10080 b
10081 @result{}0
10082 popdef(`b')
10083 @result{}
10084 b
10085 @result{}1
10086 pushdef(`c', `1')pushdef(`c', `2')
10087 @result{}
10088 stack_foreach_sep_lifo(`c', `', `', `, ')
10089 @result{}2, 1
10090 undivert(`stack_sep.m4')dnl
10091 @result{}divert(`-1')
10092 @result{}# stack_foreach_sep(macro, pre, post, sep)
10093 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
10094 @result{}# from the definition stack of MACRO, starting with the oldest, and
10095 @result{}# separated by SEP between definitions.
10096 @result{}define(`stack_foreach_sep',
10097 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
10098 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
10099 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
10100 @result{}# Like stack_foreach_sep, but starting with the newest definition.
10101 @result{}define(`stack_foreach_sep_lifo',
10102 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
10103 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
10104 @result{}define(`_stack_reverse_sep',
10105 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
10106 @result{}  `$1', `$2', `$4$3')')')
10107 @result{}divert`'dnl
10108 @end example
10109
10110 @node Improved m4wrap
10111 @section Solution for @code{m4wrap}
10112
10113 The replacement @code{m4wrap} versions presented above, designed to
10114 guarantee FIFO or LIFO order regardless of the underlying M4
10115 implementation, share a bug when dealing with wrapped text that looks
10116 like parameter expansion.  Note how the invocation of
10117 @code{m4wrap@var{n}} interprets these parameters, while using the
10118 builtin preserves them for their intended use.
10119
10120 @comment examples
10121 @example
10122 $ @kbd{m4 -I examples}
10123 include(`wraplifo.m4')
10124 @result{}
10125 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10126 ')
10127 @result{}
10128 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
10129 ')
10130 @result{}
10131 ^D
10132 @result{}m4wrap0:---0-
10133 @result{}bar:-a-a,b-2-
10134 @end example
10135
10136 Additionally, the computation of @code{_m4wrap_level} and creation of
10137 multiple @code{m4wrap@var{n}} placeholders in the original examples is
10138 more expensive in time and memory than strictly necessary.  Notice how
10139 the improved version grabs the wrapped text via @code{defn} to avoid
10140 parameter expansion, then undefines @code{_m4wrap_text}, before
10141 stripping a level of quotes with @code{_arg1} to expand the text.  That
10142 way, each level of wrapping reuses the single placeholder, which starts
10143 each nesting level in an undefined state.
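
The reason for fetching the text with @code{defn} can be shown with a
small sketch; the macro name @code{wrapped} is hypothetical.  Invoking
the macro substitutes its (absent) arguments into the text, whereas
@code{defn} returns the text with the parameter references intact:

@example
define(`wrapped', `-$1-$*-$#-')
@result{}
wrapped
@result{}---0-
defn(`wrapped')
@result{}-$1-$*-$#-
@end example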
10144
10145 Finally, it is worth emulating the @acronym{GNU} M4 extension of saving
10146 all arguments to @code{m4wrap}, separated by a space, rather than saving
10147 just the first argument.  This is done with the @code{joinall} macro
10148 documented previously (@pxref{Shift}).  The improved LIFO example is
10149 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
10150 easily be converted to a FIFO solution by swapping the adjacent
10151 invocations of @code{joinall} and @code{defn}.
10152
10153 @comment examples
10154 @example
10155 $ @kbd{m4 -I examples}
10156 include(`wraplifo2.m4')
10157 @result{}
10158 undivert(`wraplifo2.m4')dnl
10159 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
10160 @result{}include(`join.m4')dnl
10161 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
10162 @result{}define(`_arg1', `$1')dnl
10163 @result{}define(`m4wrap',
10164 @result{}`ifdef(`_$0_text',
10165 @result{}       `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
10166 @result{}       `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
10167 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
10168 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10169 ')
10170 @result{}
10171 m4wrap(`lifo text
10172 m4wrap(`nested', `', `$@@
10173 ')')
10174 @result{}
10175 ^D
10176 @result{}lifo text
10177 @result{}foo:-a-a,b-2-
10178 @result{}nested  $@@
10179 @end example
10180
10181 @node Improved cleardivert
10182 @section Solution for @code{cleardivert}
10183
10184 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
10185 called without arguments to clear all pending diversions.  That is
10186 because using @code{undivert} with an empty string for an argument differs
10187 from using it with no arguments at all.  Compare the earlier definition
10188 with one that takes the number of arguments into account:
10189
10190 @example
10191 define(`cleardivert',
10192   `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
10193 @result{}
10194 divert(`1')one
10195 divert
10196 @result{}
10197 cleardivert
10198 @result{}
10199 undivert
10200 @result{}one
10201 @result{}
10202 define(`cleardivert',
10203   `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
10204     `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
10205 @result{}
10206 divert(`2')two
10207 divert
10208 @result{}
10209 cleardivert
10210 @result{}
10211 undivert
10212 @result{}
10213 @end example
10214
10215 @node Improved capitalize
10216 @section Solution for @code{capitalize}
10217
10218 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
10219 not allow clients to follow the quoting rule of thumb.  Consider the
10220 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
10221 difference between calling @code{capitalize} with the expansion of a
10222 macro, expanding the result of a case change, and changing the case of a
10223 double-quoted string:
10224
10225 @comment examples
10226 @example
10227 $ @kbd{m4 -I examples}
10228 include(`capitalize.m4')dnl
10229 define(`active', `act1, ive')dnl
10230 define(`Active', `Act2, Ive')dnl
10231 define(`ACTIVE', `ACT3, IVE')dnl
10232 upcase(active)
10233 @result{}ACT1,IVE
10234 upcase(`active')
10235 @result{}ACT3, IVE
10236 upcase(``active'')
10237 @result{}ACTIVE
10238 downcase(ACTIVE)
10239 @result{}act3,ive
10240 downcase(`ACTIVE')
10241 @result{}act1, ive
10242 downcase(``ACTIVE'')
10243 @result{}active
10244 capitalize(active)
10245 @result{}Act1
10246 capitalize(`active')
10247 @result{}Active
10248 capitalize(``active'')
10249 @result{}_capitalize(`active')
10250 define(`A', `OOPS')
10251 @result{}
10252 capitalize(active)
10253 @result{}OOPSct1
10254 capitalize(`active')
10255 @result{}OOPSctive
10256 @end example
10257
10258 First, when @code{capitalize} was called with more than one argument, it
10259 threw away the later arguments, whereas @code{upcase} and
10260 @code{downcase} used @samp{$*} to collect them all.  The fix is simple:
10261 use @samp{$*} consistently.
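
The difference is easy to reproduce.  In this sketch, the macro names
@code{first_only} and @code{all_args} are hypothetical stand-ins for a
case-changing macro written with @samp{$1} and with @samp{$*}:

@example
define(`first_only', `translit(`$1', `a-z', `A-Z')')
@result{}
define(`all_args', `translit(`$*', `a-z', `A-Z')')
@result{}
first_only(`a', `b')
@result{}A
all_args(`a', `b')
@result{}A,B
@end example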
10262
10263 Next, with single-quoting, @code{capitalize} outputs a single character,
10264 a set of quotes, then the rest of the characters, making it impossible
10265 to invoke @code{Active} after the fact, and allowing the alternate macro
10266 @code{A} to interfere.  Here, the solution is to use additional quoting
10267 in the helper macros, then pass the final over-quoted output string
10268 through @code{_arg1} to remove the extra quoting and finally invoke the
10269 concatenated portions as a single string.
10270
10271 Finally, when passed a double-quoted string, the nested macro
10272 @code{_capitalize} is never invoked because it ended up nested inside
10273 quotes.  This one is the toughest to fix.  In short, we have no idea how
10274 many levels of quotes are in effect on the substring being altered by
10275 @code{patsubst}.  If the replacement string cannot be expressed entirely
10276 in terms of literal text and backslash substitutions, then we need a
10277 mechanism to guarantee that the helper macros are invoked outside of
10278 quotes.  In other words, this sounds like a job for @code{changequote}
10279 (@pxref{Changequote}).  By changing the active quoting characters, we
10280 can guarantee that replacement text injected by @code{patsubst} always
10281 occurs in the middle of a string that has exactly one level of
10282 over-quoting using alternate quotes; so the replacement text closes the
10283 quoted string, invokes the helper macros, then reopens the quoted
10284 string.  In turn, that means the replacement text has unbalanced quotes,
10285 necessitating another round of @code{changequote}.
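
As a small sketch of the underlying mechanism, using the hypothetical
macro name @code{greet}: text defined while alternate quotes are in
effect may contain the default quote characters as ordinary text, but
those characters regain their meaning once the default quotes are
restored:

@example
changequote(`<<[', `]>>')
@result{}
define(<<[greet]>>, <<[`hello']>>)
@result{}
greet
@result{}`hello'
changequote(<<[`]>>, <<[']>>)
@result{}
greet
@result{}hello
@end example

This is why the fixed version must track which quoting scheme is
active at each point where text is rescanned.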
10286
10287 In the fixed version below (also shipped as
10288 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
10289 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
10290 strings are chosen so as to be less likely to appear in the text being
10291 converted).  The helpers @code{_to_alt} and @code{_from_alt} merely
10292 reduce the number of characters required to perform a
10293 @code{changequote}, since the definition changes twice.  The outermost
10294 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
10295 with alternate quoting; the innermost pair is used so that the third
10296 argument to @code{patsubst} can contain an unbalanced
10297 @samp{]>>}/@samp{<<[} pair.  Note that @code{upcase} and @code{downcase}
10298 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
10299 they contain nested quotes but are invoked with the alternate quoting
10300 scheme in effect.
10301
10302 @comment examples
10303 @example
10304 $ @kbd{m4 -I examples}
10305 include(`capitalize2.m4')dnl
10306 define(`active', `act1, ive')dnl
10307 define(`Active', `Act2, Ive')dnl
10308 define(`ACTIVE', `ACT3, IVE')dnl
10309 define(`A', `OOPS')dnl
10310 capitalize(active; `active'; ``active''; ```actIVE''')
10311 @result{}Act1,Ive; Act2, Ive; Active; `Active'
10312 undivert(`capitalize2.m4')dnl
10313 @result{}divert(`-1')
10314 @result{}# upcase(text)
10315 @result{}# downcase(text)
10316 @result{}# capitalize(text)
10317 @result{}#   change case of text, improved version
10318 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
10319 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
10320 @result{}define(`_arg1', `$1')
10321 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
10322 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
10323 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
10324 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
10325 @result{}define(`_capitalize_alt',
10326 @result{}  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
10327 @result{}    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
10328 @result{}define(`capitalize',
10329 @result{}  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
10330 @result{}    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
10331 @result{}divert`'dnl
10332 @end example
10333
10334 @node Improved fatal_error
10335 @section Solution for @code{fatal_error}
10336
10337 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
10338 of @acronym{GNU} M4 earlier than 1.4.8, where invoking
10339 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
10340 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
10341 though all files start at line 1.  Furthermore, versions earlier than
10342 1.4.6 did not support the @code{@w{__program__}} macro.  If you want
10343 @code{fatal_error} to work across the entire 1.4.x release series, a
10344 better implementation would be:
10345
10346 @comment status: 1
10347 @example
10348 define(`fatal_error',
10349   `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
10350 `:ifelse(__line__, `0', `',
10351     `__file__:__line__:')` fatal error: $*
10352 ')m4exit(`1')')
10353 @result{}
10354 m4wrap(`divnum(`demo of internal message')
10355 fatal_error(`inside wrapped text')')
10356 @result{}
10357 ^D
10358 @error{}m4:stdin:6: warning: divnum: extra arguments ignored: 1 > 0
10359 @result{}0
10360 @error{}m4:stdin:6: fatal error: inside wrapped text
10361 @end example
10362
10363 @c ========================================================== Appendices
10364
10365 @node Copying This Package
10366 @appendix How to make copies of the overall M4 package
10367 @cindex License, code
10368
10369 This appendix covers the license for copying the source code of the
10370 overall M4 package.  This manual is under a different set of
10371 restrictions, covered later (@pxref{Copying This Manual}).
10372
10373 @menu
10374 * GNU General Public License::  License for copying the M4 package
10375 @end menu
10376
10377 @node GNU General Public License
10378 @appendixsec License for copying the M4 package
10379 @cindex GPL, GNU General Public License
10380 @cindex GNU General Public License
10381 @cindex General Public License (GPL), GNU
10382 @include gpl-3.0.texi
10383
10384 @node Copying This Manual
10385 @appendix How to make copies of this manual
10386 @cindex License, manual
10387
10388 This appendix covers the license for copying this manual.  Note that
10389 some of the longer examples in this manual are also distributed in the
10390 directory @file{m4-@value{VERSION}/@/examples/}, where a more
10391 permissive license is in effect when copying just the examples.
10392
10393 @menu
10394 * GNU Free Documentation License::  License for copying this manual
10395 @end menu
10396
10397 @node GNU Free Documentation License
10398 @appendixsec License for copying this manual
10399 @cindex FDL, GNU Free Documentation License
10400 @cindex GNU Free Documentation License
10401 @cindex Free Documentation License (FDL), GNU
10402 @include fdl-1.3.texi
10403
10404 @node Indices
10405 @appendix Indices of concepts and macros
10406
10407 @menu
10408 * Macro index::                 Index for all @code{m4} macros
10409 * Concept index::               Index for many concepts
10410 @end menu
10411
10412 @node Macro index
10413 @appendixsec Index for all @code{m4} macros
10414
10415 This index covers all @code{m4} builtins, as well as several useful
10416 composite macros.  References are exclusively to the places where a
10417 macro is introduced the first time.
10418
10419 @printindex fn
10420
10421 @node Concept index
10422 @appendixsec Index for many concepts
10423
10424 @printindex cp
10425
10426 @bye
10427
10428 @c Local Variables:
10429 @c fill-column: 72
10430 @c ispell-local-dictionary: "american"
10431 @c indent-tabs-mode: nil
10432 @c whitespace-check-buffer-indent: nil
10433 @c End: