1 \input texinfo @c -*- texinfo -*-
2 @comment ========================================================
3 @comment %**start of header
6 @settitle GNU M4 @value{VERSION} macro processor
9 @setcontentsaftertitlepage
17 @c The testsuite expects literal tab output in some examples, but
@c literal tabs in texinfo lead to formatting issues.
24 @c -------------------
25 @c The ARG is an optional argument. To be used for macro arguments in
26 @c their documentation (@defmac).
28 @r{[}@var{\varname\}@r{]}@c
31 @c @dvar{ARG, DEFAULT}
32 @c -------------------
33 @c The ARG is an optional argument, defaulting to DEFAULT. To be used
34 @c for macro arguments in their documentation (@defmac).
35 @macro dvar{varname, default}
36 @r{[}@var{\varname\} = @samp{\default\}@r{]}@c
39 @comment %**end of header
40 @comment ========================================================
44 This manual (@value{UPDATED}) is for @acronym{GNU} M4 (version
45 @value{VERSION}), a package containing an implementation of the m4 macro
48 Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999,
49 2000, 2001, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
52 Permission is granted to copy, distribute and/or modify this document
53 under the terms of the @acronym{GNU} Free Documentation License,
54 Version 1.2 or any later version published by the Free Software
55 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
56 Back-Cover Texts. A copy of the license is included in the section
57 entitled ``@acronym{GNU} Free Documentation License.''
61 @dircategory Text creation and manipulation
63 * M4: (m4). A powerful macro processor.
67 @title GNU M4, version @value{VERSION}
68 @subtitle A powerful macro processor
69 @subtitle Edition @value{EDITION}, @value{UPDATED}
70 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
71 @author Gary V. Vaughan, and Eric Blake
72 @author (@email{bug-m4@@gnu.org})
75 @vskip 0pt plus 1filll
87 @acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
88 processor. It is mostly SVR4 compatible, although it has some
89 extensions (for example, handling more than 9 positional parameters
90 to macros). @code{m4} also has builtin functions for including
91 files, running shell commands, doing arithmetic, etc. Autoconf needs
92 @acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
95 @acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
96 subsequent changes by Fran@,{c}ois Pinard and other volunteers
97 on the Internet. All names and email addresses can be found in the
98 files @file{m4-@value{VERSION}/@/AUTHORS} and
99 @file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
103 This is release @value{VERSION}. It is now considered stable: future
104 releases on this branch are only meant to fix bugs, increase speed, or
105 improve documentation.
109 This is BETA release @value{VERSION}. This is a development release,
110 and as such, is prone to bugs, crashes, unforeseen features, incomplete
documentation@dots{} Use it at your own peril. In case of
112 problems, please do not hesitate to report them (see the
113 @file{m4-@value{VERSION}/@/README} file in the distribution).
118 * Preliminaries:: Introduction and preliminaries
119 * Invoking m4:: Invoking @code{m4}
120 * Syntax:: Lexical and syntactic conventions
122 * Macros:: How to invoke macros
123 * Definitions:: How to define new macros
124 * Conditionals:: Conditionals, loops, and recursion
126 * Debugging:: How to debug macros and input
128 * Input Control:: Input control
129 * File Inclusion:: File inclusion
130 * Diversions:: Diverting and undiverting output
132 * Modules:: Extending M4 with dynamic runtime modules
134 * Text handling:: Macros for text handling
135 * Arithmetic:: Macros for doing arithmetic
136 * Shell commands:: Macros for running shell commands
137 * Miscellaneous:: Miscellaneous builtin macros
138 * Frozen files:: Fast loading of frozen state
140 * Compatibility:: Compatibility with other versions of @code{m4}
141 * Answers:: Correct version of some examples
143 * Copying This Package:: How to make copies of the overall M4 package
144 * Copying This Manual:: How to make copies of this manual
145 * Indices:: Indices of concepts and macros
148 --- The Detailed Node Listing ---
150 Introduction and preliminaries
152 * Intro:: Introduction to @code{m4}
153 * History:: Historical references
154 * Bugs:: Problems and bugs
155 * Manual:: Using this manual
159 * Operation modes:: Command line options for operation modes
160 * Dynamic loading features:: Command line options for dynamic loading
161 * Preprocessor features:: Command line options for preprocessor features
162 * Limits control:: Command line options for limits control
163 * Frozen state:: Command line options for frozen state
164 * Debugging options:: Command line options for debugging
165 * Command line files:: Specifying input files on the command line
167 Lexical and syntactic conventions
169 * Names:: Macro names
170 * Quoted strings:: Quoting input to @code{m4}
171 * Comments:: Comments in @code{m4} input
172 * Other tokens:: Other kinds of input tokens
173 * Input processing:: How @code{m4} copies input to output
174 * Regular expression syntax:: How @code{m4} interprets regular expressions
178 * Invocation:: Macro invocation
179 * Inhibiting Invocation:: Preventing macro invocation
180 * Macro Arguments:: Macro arguments
181 * Quoting Arguments:: On Quoting Arguments to macros
182 * Macro expansion:: Expanding macros
184 How to define new macros
186 * Define:: Defining a new macro
187 * Arguments:: Arguments to macros
188 * Pseudo Arguments:: Special arguments to macros
189 * Undefine:: Deleting a macro
190 * Defn:: Renaming macros
191 * Pushdef:: Temporarily redefining macros
192 * Renamesyms:: Renaming macros with regular expressions
194 * Indir:: Indirect call of macros
195 * Builtin:: Indirect call of builtins
196 * M4symbols:: Getting the defined macro names
198 Conditionals, loops, and recursion
200 * Ifdef:: Testing if a macro is defined
201 * Ifelse:: If-else construct, or multibranch
202 * Shift:: Recursion in @code{m4}
203 * Forloop:: Iteration by counting
204 * Foreach:: Iteration by list contents
205 * Stacks:: Working with definition stacks
206 * Composition:: Building macros with macros
208 How to debug macros and input
210 * Dumpdef:: Displaying macro definitions
211 * Trace:: Tracing macro calls
212 * Debugmode:: Controlling debugging options
213 * Debuglen:: Limiting debug output
214 * Debugfile:: Saving debugging output
218 * Dnl:: Deleting whitespace in input
219 * Changequote:: Changing the quote characters
220 * Changecom:: Changing the comment delimiters
221 * Changeresyntax:: Changing the regular expression syntax
222 * Changesyntax:: Changing the lexical structure of the input
223 * M4wrap:: Saving text until end of input
227 * Include:: Including named files
228 * Search Path:: Searching for include files
230 Diverting and undiverting output
232 * Divert:: Diverting output
233 * Undivert:: Undiverting output
234 * Divnum:: Diversion numbers
235 * Cleardivert:: Discarding diverted text
237 Extending M4 with dynamic runtime modules
239 * M4modules:: Listing loaded modules
240 * Load:: Loading additional modules
241 * Unload:: Removing loaded modules
242 * Refcount:: Tracking module references
243 * Standard Modules:: Standard bundled modules
245 Macros for text handling
247 * Len:: Calculating length of strings
248 * Index macro:: Searching for substrings
249 * Regexp:: Searching for regular expressions
250 * Substr:: Extracting substrings
251 * Translit:: Translating characters
252 * Patsubst:: Substituting text by regular expression
253 * Format:: Formatting strings (printf-like)
255 Macros for doing arithmetic
257 * Incr:: Decrement and increment operators
258 * Eval:: Evaluating integer expressions
259 * Mpeval:: Multiple precision arithmetic
261 Macros for running shell commands
263 * Platform macros:: Determining the platform
264 * Syscmd:: Executing simple commands
265 * Esyscmd:: Reading the output of commands
266 * Sysval:: Exit status
267 * Mkstemp:: Making temporary files
268 * Mkdtemp:: Making temporary directories
270 Miscellaneous builtin macros
272 * Errprint:: Printing error messages
273 * Location:: Printing current location
274 * M4exit:: Exiting from @code{m4}
275 * Syncoutput:: Turning on and off sync lines
277 Fast loading of frozen state
279 * Using frozen files:: Using frozen files
280 * Frozen file format 1:: Frozen file format 1
281 * Frozen file format 2:: Frozen file format 2
283 Compatibility with other versions of @code{m4}
285 * Extensions:: Extensions in @acronym{GNU} M4
286 * Incompatibilities:: Other incompatibilities
287 * Experiments:: Experimental features in @acronym{GNU} M4
289 Correct version of some examples
291 * Improved exch:: Solution for @code{exch}
292 * Improved forloop:: Solution for @code{forloop}
293 * Improved foreach:: Solution for @code{foreach}
294 * Improved copy:: Solution for @code{copy}
295 * Improved m4wrap:: Solution for @code{m4wrap}
296 * Improved cleardivert:: Solution for @code{cleardivert}
297 * Improved capitalize:: Solution for @code{capitalize}
298 * Improved fatal_error:: Solution for @code{fatal_error}
300 How to make copies of the overall M4 package
302 * GNU General Public License:: License for copying the M4 package
304 How to make copies of this manual
306 * GNU Free Documentation License:: License for copying this manual
308 Indices of concepts and macros
310 * Macro index:: Index for all @code{m4} macros
311 * Concept index:: Index for many concepts
317 @chapter Introduction and preliminaries
319 This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
320 comes from, how to read and use this documentation, how to call the
321 @code{m4} program, and how to report bugs about it. It concludes by
322 giving tips for reading the remainder of the manual.
324 The following chapters then detail all the features of the @code{m4}
325 language, as shipped in the @acronym{GNU} M4 package.
328 * Intro:: Introduction to @code{m4}
329 * History:: Historical references
330 * Bugs:: Problems and bugs
331 * Manual:: Using this manual
335 @section Introduction to @code{m4}
337 @cindex overview of @code{m4}
338 @code{m4} is a macro processor, in the sense that it copies its
339 input to the output, expanding macros as it goes. Macros are either
340 builtin or user-defined, and can take any number of arguments.
341 Besides just doing macro expansion, @code{m4} has builtin functions
342 for including named files, running shell commands, doing integer
343 arithmetic, manipulating text in various ways, performing recursion,
etc. @code{m4} can be used either as a front-end to a compiler,
345 or as a macro processor in its own right.
347 The @code{m4} macro processor is widely available on all UNIXes, and has
348 been standardized by @acronym{POSIX}.
349 Usually, only a small percentage of users are aware of its existence.
350 However, those who find it often become committed users. The
351 popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
352 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
353 for many to install it, while these people will not themselves
354 program in @code{m4}. @acronym{GNU} @code{m4} is mostly compatible with the
355 System V, Release 3 version, except for some minor differences.
356 @xref{Compatibility}, for more details.
358 Some people find @code{m4} to be fairly addictive. They first use
359 @code{m4} for simple problems, then take bigger and bigger challenges,
360 learning how to write complex sets of @code{m4} macros along the way.
361 Once really addicted, users pursue writing of sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
363 debugging their @code{m4} scripts than doing real work. Beware that
364 @code{m4} may be dangerous for the health of compulsive programmers.
367 @section Historical references
369 @cindex history of @code{m4}
370 @cindex @acronym{GNU} M4, history of
371 @code{GPM} was an important ancestor of @code{m4}. See
372 C. Strachey: ``A General Purpose Macro generator'', Computer Journal
8,3 (1965), pp.@: 225 ff. @code{GPM} is also succinctly described in
David Gries's classic ``Compiler Construction for Digital Computers''.
376 The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
377 Addison-Wesley, Inc.@: (1976) describes and implements a Unix
378 macro-processor language, which inspired Dennis Ritchie to write
379 @code{m3}, a macro processor for the AP-3 minicomputer.
381 Kernighan and Ritchie then joined forces to develop the original
382 @code{m4}, as described in ``The M4 Macro Processor'', Bell
383 Laboratories (1977). It had only 21 builtin macros.
385 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
386 the true intricacies of real life: macros can be recognized without
387 being pre-announced, skipping whitespace or end-of-lines is easier,
388 more constructs are builtin instead of derived, etc.
390 Originally, the Kernighan and Plauger macro-processor, and then
391 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
392 that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
393 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
395 Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
397 in 1990, with the aim of removing the artificial limitations in many
398 of the traditional @code{m4} implementations, such as maximum line
399 length, macro size, or number of macros.
401 The late Professor A. Dain Samples described and implemented a further
402 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
403 Language: 2nd edition'', Electronic Announcement on comp.compilers
406 Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
407 1992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
408 the stable release for 10 years. It was at this time that @acronym{GNU}
409 Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
410 engine, since all other implementations of @code{m4} had too many
More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which
addressed some long-standing bugs in the venerable 1.4 release. Then in
415 2005, Gary V. Vaughan collected together the many patches to
416 @acronym{GNU} @code{m4} 1.4 that were floating around the net and
417 released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
418 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
419 More bug fixes were incorporated in 2007, with the releases of 1.4.9 and
420 1.4.10, closing the series with 1.4.11 in 2008.
422 Additionally, in 2008, Eric rewrote the scanning engine to reduce
423 recursive evaluation from quadratic to linear complexity, released as M4
424 1.6. The 1.x branch series remains open for bug fixes.
426 Meanwhile, development was underway for new features for @code{m4},
427 such as dynamic module loading and additional builtins, practically
428 rewriting the entire code base. This development has spurred
429 improvements to other @acronym{GNU} software, such as @acronym{GNU}
430 Libtool. @acronym{GNU} M4 2.0 is the result of this effort.
433 @section Problems and bugs
435 @cindex reporting bugs
437 @cindex suggestions, reporting
438 If you have problems with @acronym{GNU} M4 or think you've found a bug,
439 please report it. Before reporting a bug, make sure you've actually
440 found a real bug. Carefully reread the documentation and see if it
441 really says you can do what you're trying to do. If it's not clear
442 whether you should be able to do something or not, report that too; it's
443 a bug in the documentation!
445 Before reporting a bug or trying to fix it yourself, try to isolate it
446 to the smallest possible input file that reproduces the problem. Then
447 send us the input file and the exact results @code{m4} gave you. Also
448 say what you expected to occur; this will help us decide whether the
449 problem was really in the documentation.
451 Once you've got a precise problem, send e-mail to
452 @email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
453 you are using. You can get this information with the command
454 @kbd{m4 --version}. You can also run @kbd{make check} to generate the
455 file @file{tests/@/testsuite.log}, useful for including in your report.
457 Non-bug suggestions are always welcome as well. If you have questions
458 about things that are unclear in the documentation or are just obscure
459 features, please report them too.
462 @section Using this manual
464 @cindex examples, understanding
465 This manual contains a number of examples of @code{m4} input and output,
466 and a simple notation is used to distinguish input, output and error
messages from @code{m4}. Examples are set off from the normal text and
shown in a fixed-width font, like this:
472 This is an example of an example!
475 To distinguish input from output, all output from @code{m4} is prefixed
476 by the string @samp{@result{}}, and all error messages by the string
477 @samp{@error{}}. When showing how command line options affect matters,
478 the command line is shown with a prompt @samp{$ @kbd{like this}},
479 otherwise, you can assume that a simple @kbd{m4} invocation will work.
484 $ @kbd{command line to invoke m4}
485 Example of input line
486 @result{}Output line from m4
487 @error{}and an error message
490 The sequence @samp{^D} in an example indicates the end of the input
491 file. The sequence @samp{@key{NL}} refers to the newline character.
492 The majority of these examples are self-contained, and you can run them
493 with similar results. In fact, the testsuite that is bundled in the
494 @acronym{GNU} M4 package consists in part of the examples
495 in this document! Some of the examples assume that your current
496 directory is located where you unpacked the installation, so if you plan
497 on following along, you may find it helpful to do this now:
501 $ @kbd{cd m4-@value{VERSION}}
504 As each of the predefined macros in @code{m4} is described, a prototype
505 call of the macro will be shown, giving descriptive names to the
508 @deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @
509 @ovar{argument}@dots{})
510 This is a sample prototype. There is not really a macro named
511 @code{example}, but this documents that if there were, it would be a
512 Composite macro, rather than a Builtin, and would be provided by the
515 It requires at least one argument, @var{string}. Remember that in
516 @code{m4}, there must not be a space between the macro name and the
517 opening parenthesis, unless it was intended to call the macro without
518 any arguments. The brackets around @var{count} and @var{argument} show
519 that these arguments are optional. If @var{count} is omitted, the macro
behaves as if @var{count} were @samp{1}, whereas if @var{argument} is omitted,
521 the macro behaves as if it were the empty string. A blank argument is
522 not the same as an omitted argument. For example, @samp{example(`a')},
523 @samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave
524 identically with @var{count} set to @samp{1}; while @samp{example(`a',)}
525 and @samp{example(`a',`')} would explicitly pass the empty string for
@var{count}. The ellipsis (@samp{@dots{}}) shows that the macro
527 processes additional arguments after @var{argument}, rather than
531 Each builtin definition will list, in parentheses, the module that must
532 be loaded to use that macro. The standard modules include
533 @samp{m4} (which is always available), @samp{gnu} (for @acronym{GNU} specific
534 m4 extensions), and @samp{traditional} (for compatibility with System V
538 All macro arguments in @code{m4} are strings, but some are given
539 special interpretation, e.g., as numbers, file names, regular
540 expressions, etc. The documentation for each macro will state how the
541 parameters are interpreted, and what happens if the argument cannot be
542 parsed according to the desired interpretation. Unless specified
543 otherwise, a parameter specified to be a number is parsed as a decimal,
544 even if the argument has leading zeros; and parsing the empty string as
545 a number results in 0 rather than an error, although a warning will be
548 This document consistently writes and uses @dfn{builtin}, without a
549 hyphen, as if it were an English word. This is how the @code{builtin}
550 primitive is spelled within @code{m4}.
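
The distinction drawn above between an omitted argument, a blank
argument, and a call without an argument list can be observed with a
small experiment. The following is only a sketch: it assumes that a
@acronym{GNU} @code{m4} binary named @code{m4} is available on the
search path, and @code{nargs} is a hypothetical helper macro (not part
of @code{m4}) that expands to the number of arguments it received.

```shell
# Assumption: a GNU m4 binary named `m4` is on PATH.
# `nargs` is a hypothetical helper that expands to its argument count ($#).
nargs_out=$(m4 <<'EOF'
define(`nargs', `$#')dnl
nargs(`a')
nargs(`a',)
nargs(`a',`')
nargs (`a')
EOF
)
# Show one result per input line.
printf '%s\n' "$nargs_out"
```

Under these assumptions, the first call passes one argument, the next
two each pass two (the second being the empty string), and the last
line invokes @code{nargs} with no arguments at all, because the space
before the parenthesis ends the macro call; the parenthesized text is
then copied to the output as ordinary input.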
553 @chapter Invoking @code{m4}
556 @cindex invoking @code{m4}
557 The format of the @code{m4} command is:
561 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
564 @cindex command line, options
565 @cindex options, command line
566 @cindex @env{POSIXLY_CORRECT}
567 All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
unambiguous prefix is sufficient. @acronym{POSIX} requires @code{m4} to
570 recognize arguments intermixed with files, even when
571 @env{POSIXLY_CORRECT} is set in the environment. Most options take
572 effect at startup regardless of their position, but some are documented
573 below as taking effect after any files that occurred earlier in the
574 command line. The argument @option{--} is a marker to denote the end of
With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options; options
with mandatory arguments may be provided either as a single command line
argument or as two arguments; and options with optional arguments must
be provided as a single argument. In other words,
582 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
583 @kbd{m4 -Q -P -D foo -d ./a -d+f}, although the latter form is
584 considered canonical.
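
As a hedged illustration of the clustering rules for short options, the
following sketch compares a clustered spelling with its canonical form.
It assumes that a @acronym{GNU} @code{m4} binary named @code{m4} is on
the search path; the macro name @samp{foo} and its value @samp{bar} are
arbitrary choices made for this example.

```shell
# Assumption: a GNU m4 binary named `m4` is on PATH.
# -Q and -P take no argument, so they may share one word with -D,
# whose mandatory argument may be attached or given separately.
clustered=$(printf 'foo\n' | m4 -QPDfoo=bar)
canonical=$(printf 'foo\n' | m4 -Q -P -D foo=bar)
# Both invocations should print the same expansion of `foo`.
printf '%s\n%s\n' "$clustered" "$canonical"
```

Both invocations are expected to expand @samp{foo} in the same way,
since the only difference is how the options are spelled.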
586 With long options, options with mandatory arguments may be provided with
587 an equal sign (@samp{=}) in a single argument, or as two arguments, and
588 options with optional arguments must be provided as a single argument.
589 In other words, @kbd{m4 --def foo --debug a} is equivalent to
590 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
591 considered canonical (not to mention more robust, in case a future
592 version of @code{m4} introduces an option named @option{--default}).
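
The same comparison can be sketched for long options. This example
assumes that a @acronym{GNU} @code{m4} binary named @code{m4} is on the
search path and that @option{--def} is still an unambiguous prefix of
@option{--define} in the installed version; as noted above, that is not
guaranteed for future releases. The macro name @samp{foo} and the value
@samp{bar} are arbitrary.

```shell
# Assumption: a GNU m4 binary named `m4` is on PATH, and `--def` is an
# unambiguous abbreviation of `--define` in that version.
abbreviated=$(printf 'foo\n' | m4 --def foo=bar)
canonical=$(printf 'foo\n' | m4 --define=foo=bar)
# Both invocations should print the same expansion of `foo`.
printf '%s\n%s\n' "$abbreviated" "$canonical"
```

Scripts that must keep working across versions are better served by the
canonical spelling on the second line.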
594 @code{m4} understands the following options, grouped by functionality.
597 * Operation modes:: Command line options for operation modes
598 * Dynamic loading features:: Command line options for dynamic loading
599 * Preprocessor features:: Command line options for preprocessor features
600 * Limits control:: Command line options for limits control
601 * Frozen state:: Command line options for frozen state
602 * Debugging options:: Command line options for debugging
603 * Command line files:: Specifying input files on the command line
606 @node Operation modes
607 @section Command line options for operation modes
609 Several options control the overall operation of @code{m4}:
613 Print a help summary on standard output, then immediately exit
614 @code{m4} without reading any input files or performing any other
618 Print the version number of the program on standard output, then
619 immediately exit @code{m4} without reading any input files or
620 performing any other actions.
624 Makes this invocation of @code{m4} non-interactive. This means that
625 output will be buffered, and an interrupt or pipe write error will halt
626 execution. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
628 when any input files are specified, or when either standard input or
629 standard error is not a terminal. Note that this means that @kbd{m4}
630 alone might be interactive, but @kbd{m4 -} is not, even though both
631 commands process only standard input. If both @option{-b} and
632 @option{-i} are specified, only the last one takes effect.
635 @itemx --discard-comments
636 Discard all comments instead of copying them to the output.
639 @itemx --fatal-warnings
640 @cindex errors, fatal
642 Controls the effect of warnings. If unspecified, then execution
643 continues and exit status is unaffected when a warning is printed. If
644 specified exactly once, warnings become fatal; when one is issued,
645 execution continues, but the exit status will be non-zero. If specified
646 multiple times, then execution halts with non-zero status the first time
647 a warning is issued. The introduction of behavior levels is new to M4
648 1.4.9; for behavior consistent with earlier versions, you should specify
652 For backwards compatibility reasons, using @option{-E} behaves as if an
653 implicit @option{--debug=-d} option is also present. This is so that
654 scripts written for older M4 versions will not fail if they used
655 constructs that were previously silently allowed, but would now trigger
661 @error{}m4:stdin:1: Warning: defn: undefined macro `oops'
686 @comment options: -E -d
691 @error{}m4:stdin:1: Warning: defn: undefined macro `oops'
705 Makes this invocation of @code{m4} interactive. This means that all
706 output will be unbuffered, interrupts will be ignored, and behavior on
707 pipe write errors is inherited from the parent process. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
when no input files are specified, and when both standard input and
standard error are terminals (similar to the way that @file{/bin/sh} determines
711 when to be interactive). If both @option{-b} and @option{-i} are
712 specified, only the last one takes effect. The spelling @option{-e}
713 exists for compatibility with other @code{m4} implementations, and
714 issues a warning because it may be withdrawn in a future version of
718 @itemx --prefix-builtins
719 Internally modify @emph{all} builtin macro names so they all start with
720 the prefix @samp{m4_}. For example, using this option, one should write
721 @samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}}
722 instead of @samp{@w{__file__}}. This option has no effect if @option{-R}
728 Suppress warnings, such as missing or superfluous arguments in macro
729 calls, or treating the empty string as zero. Error messages are still
730 printed. The distinction between error and warning is fuzzy, and if
731 you encounter a situation where the message output did not match your
732 expectations, please report that as a bug. This option is implied if
733 @env{POSIXLY_CORRECT} is set in the environment.
735 @item -r@r{[}@var{resyntax-spec}@r{]}
736 @itemx --regexp-syntax@r{[}=@var{resyntax-spec}@r{]}
737 Set the regular expression syntax according to @var{resyntax-spec}.
738 When this option is not given, or @var{resyntax-spec} is omitted,
739 @acronym{GNU} M4 uses the flavor @code{GNU_M4}, which provides
Emacs-compatible regular expressions. @xref{Changeresyntax}, for more
741 details on the format and meaning of @var{resyntax-spec}. This option
742 may be given more than once, and order with respect to file names is
746 Cripple the following builtins, since each can perform potentially
747 unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}),
748 @code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}),
749 @code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}).
750 An attempt to use any of these macros will result in an error. This
751 option is intended to make it safer to preprocess an input file of
756 Enable warnings. Warnings are on by default unless
757 @env{POSIXLY_CORRECT} was set in the environment; this option exists to
758 allow overriding @option{--silent}.
759 @comment FIXME should we accept -Wall, -Wnone, -Wcategory,
760 @comment -Wno-category...?
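
The effect of @option{--prefix-builtins}, described earlier in this
section, can be sketched as follows. The example assumes that a
@acronym{GNU} @code{m4} binary named @code{m4} is on the search path,
and it uses the @code{len} builtin only because its result is easy to
recognize.

```shell
# Assumption: a GNU m4 binary named `m4` is on PATH.
# With -P, builtins are only recognized under their m4_ names, so the
# unprefixed spelling `len` is treated as ordinary text.
prefixed_out=$(m4 -P <<'EOF'
m4_len(`abc') len(`abc')
EOF
)
printf '%s\n' "$prefixed_out"
```

Under these assumptions, only the @samp{m4_len} call is expanded; the
unprefixed @samp{len} is copied through, with the quotes around its
argument removed as for any other quoted input.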
763 @node Dynamic loading features
764 @section Command line options for dynamic loading
766 On platforms that support dynamic libraries, there are some options
767 that affect dynamic loading.
770 @item -M @var{directory}
771 @itemx --module-directory=@var{directory}
772 Specify an alternate @var{directory} to search for modules. This option
773 can be used multiple times to add several different directories to the
774 module search path. @xref{Modules}, for more details.
776 @item -m @var{module}
777 @itemx --load-module=@var{module}
778 Load @var{module} before parsing more input files. @var{module} is
779 searched for in each directory of the module search path, until the
780 first match is found or the list is exhausted. @xref{Modules}, for more
781 details. By default, the modules @samp{m4}, @samp{traditional}, and
782 @samp{gnu} are preloaded, although this can be controlled during
783 configuration with the @option{--with-modules} option to
784 @file{m4-@value{VERSION}/@/configure}. This option may be given more
785 than once, and order with respect to file names is significant.
787 @item --unload-module=@var{module}
788 Unload @var{module} before parsing more input files. @xref{Modules},
789 for more details. This option may be given more than once, and order
790 with respect to file names is significant.
793 @node Preprocessor features
794 @section Command line options for preprocessor features
796 @cindex macro definitions, on the command line
797 @cindex command line, macro definitions on the
798 @cindex preprocessor features
799 Several options allow @code{m4} to behave more like a preprocessor.
800 Macro definitions and deletions can be made on the command line, the
801 search path can be altered, and the output file can track where the
802 input came from. These features occur with the following options:
805 @item -B @var{directory}
806 @itemx --prepend-include=@var{directory}
807 Make @code{m4} search @var{directory} for included files, prior to
808 searching the current working directory. @xref{Search Path}, for more
809 details. This option may be given more than once. Some other
810 implementations of @code{m4} use @option{-B @var{number}} to change their
811 hard-coded limits, but that is unnecessary in @acronym{GNU} where the
812 only limit is your hardware capability. So although it is unlikely that
813 you will want to include a relative directory whose name is purely
814 numeric, @acronym{GNU} @code{m4} will warn you about this potential
815 compatibility issue; you can avoid the warning by using the long
816 spelling, or by using @samp{./@var{number}} if you really meant it.
818 @item -D @var{name}@r{[}=@var{value}@r{]}
819 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
820 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
821 missing, the value is taken to be the empty string. The @var{value} can
822 be any string, and the macro can be defined to take arguments, just as
if it were defined from within the input. This option may be given more
824 than once; order with respect to file names is significant, and
825 redefining the same @var{name} loses the previous value.
827 @item --import-environment
828 Imports every variable in the environment as a macro. This is done
829 before @option{-D} and @option{-U}, so they can override the
832 @item -I @var{directory}
833 @itemx --include=@var{directory}
834 Make @code{m4} search @var{directory} for included files that are not
835 found in the current working directory. @xref{Search Path}, for more
836 details. This option may be given more than once.
838 @item --popdef=@var{name}
839 This deletes the top-most meaning @var{name} might have. Obviously,
840 only predefined macros can be deleted in this way. This option may be
841 given more than once; popping a @var{name} that does not have a
842 definition is silently ignored. Order is significant with respect to
845 @item -p @var{name}@r{[}=@var{value}@r{]}
846 @itemx --pushdef=@var{name}@r{[}=@var{value}@r{]}
847 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
848 missing, the value is taken to be the empty string. The @var{value} can
849 be any string, and the macro can be defined to take arguments, just as
850 if it was defined from within the input. This option may be given more
851 than once; order with respect to file names is significant, and
852 redefining the same @var{name} adds another definition to its stack.
856 Short for @option{--syncoutput=1}, turning on synchronization lines
857 (sometimes called @dfn{synclines}).
859 @item --syncoutput@r{[}=@var{state}@r{]}
860 @cindex synchronization lines
861 @cindex location, input
862 @cindex input location
863 Control the generation of synchronization lines from the command line.
864 Synchronization lines are for use by the C preprocessor or other
865 similar tools. Order is significant with respect to file names. This
866 option is useful, for example, when @code{m4} is used as a
867 front end to a compiler. Source file name and line number information
868 is conveyed by directives of the form @samp{#line @var{linenum}
869 "@var{file}"}, which are inserted as needed into the middle of the
870 output. Such directives mean that the following line originated or was
871 expanded from the contents of input file @var{file} at line
872 @var{linenum}. The @samp{"@var{file}"} part is often omitted when
873 the file name did not change from the previous directive.
875 Synchronization directives are always given on complete lines by
876 themselves. When a synchronization discrepancy occurs in the middle of
877 an output line, the associated synchronization directive is delayed
878 until the next newline that does not occur in the middle of a quoted
879 string or comment. @xref{Syncoutput}, for runtime control. @var{state}
880 is interpreted the same as the argument to @code{syncoutput}; if
881 @var{state} is omitted, or @option{--syncoutput} is not used,
882 synchronization lines are disabled.
885 @itemx --undefine=@var{name}
886 This deletes any predefined meaning @var{name} might have. Obviously,
887 only predefined macros can be deleted in this way. This option may be
888 given more than once; undefining a @var{name} that does not have a
889 definition is silently ignored. Order is significant with respect to file names.
894 @section Command line options for limits control
896 There are some limits within @code{m4} that can be tuned. For
897 compatibility, @code{m4} also accepts some options that control limits
898 in other implementations, but which are automatically unbounded (limited
899 only by your hardware and operating system constraints) in @acronym{GNU} @code{m4}.
905 Enable all the extensions in this implementation. This is on by
906 default unless @env{POSIXLY_CORRECT} is set in the environment; it
907 exists to allow overriding @option{--traditional}.
912 Suppress all the extensions made in this implementation, compared to the
913 System V version. @xref{Compatibility}, for a list of these. This
914 loads the @samp{traditional} module in place of the @samp{gnu} module.
915 It is implied if @env{POSIXLY_CORRECT} is set in the environment.
918 @itemx --nesting-limit=@var{num}
919 @cindex nesting limit
920 @cindex limit, nesting
921 Artificially limit the nesting of macro calls to @var{num} levels,
922 stopping program execution if this limit is ever exceeded. When not
923 specified, nesting is limited to 1024 levels. A value of zero means
924 unlimited; but then heavily nested code could potentially cause a stack
925 overflow. @var{num} can have an optional scaling suffix.
926 @comment FIXME - need a node on what scaling suffixes are supported (see
927 @comment [info coreutils 'block size'] for ideas), and need to consider
928 @comment whether builtins should also understand scaling suffixes:
929 @comment eval, mpeval, perhaps format
931 The precise effect of this option might be more correctly associated
932 with textual nesting than dynamic recursion. It has been useful
933 when some complex @code{m4} input was generated by mechanical means.
934 Most users would never need this option. If shown to be obtrusive,
935 this option (which is still experimental) might well disappear.
938 This option does @emph{not} have the ability to break endless
939 rescanning loops, since these do not necessarily consume much memory
940 or stack space. Through clever usage of rescanning loops, one can
941 request complex, time-consuming computations from @code{m4} with useful
942 results. Imposing limits in this area would cripple the power of @code{m4}.
943 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
944 only the simplest example (but @pxref{Compatibility}). Expecting @acronym{GNU}
945 @code{m4} to detect these would be a little like expecting a compiler
946 system to detect and diagnose endless loops: it is quite a @emph{hard}
947 problem in general, if not undecidable!
950 @itemx --hashsize=@var{num}
951 @itemx --word-regexp=@var{regexp}
952 These options are present only for compatibility with previous versions
953 of @acronym{GNU} @code{m4}. They do nothing except issue a warning, because the
954 symbol table size is not fixed anymore, and because the new
955 @code{changesyntax} feature is more efficient than the withdrawn
956 experimental @code{changeword}. These options will eventually disappear
961 These options are present for compatibility with System V @code{m4}, but
962 do nothing in this implementation. They may disappear in future
963 releases, and issue a warning to that effect.
967 @section Command line options for frozen state
969 @acronym{GNU} @code{m4} comes with a feature of freezing internal state
970 (@pxref{Frozen files}). This can be used to speed up @code{m4}
971 execution when reusing a common initialization script.
975 @itemx --freeze-state=@var{file}
976 Once execution is finished, write out the frozen state on the specified
977 @var{file}. It is conventional, but not required, for @var{file} to end
981 @itemx --reload-state=@var{file}
982 Before execution starts, recover the internal state from the specified
983 frozen @var{file}. The options @option{-D}, @option{-U}, @option{-t},
984 @option{-m}, @option{-r}, and @option{--import-environment} take effect
985 after state is reloaded, but before the input files are read.
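
As a rough sketch of how the two options fit together, the session below
first freezes the state left behind by an initialization script and then
reloads it for a later run. The file names @file{init.m4},
@file{init.m4f}, and @file{input.m4} are hypothetical, chosen only for
illustration.

@example
$ @kbd{m4 --freeze-state=init.m4f init.m4}
$ @kbd{m4 --reload-state=init.m4f input.m4}
@end example

The second command is meant to see the same macro definitions as a run
that processed @file{init.m4} before @file{input.m4}, without rereading
the initialization script.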
988 @node Debugging options
989 @section Command line options for debugging
991 Finally, there are several options for aiding in debugging @code{m4}
995 @item -d@r{[}@r{[}-@r{|}+@r{]}@var{flags}@r{]}
996 @itemx --debug@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
997 @itemx --debugmode@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
998 Set the debug-level according to the flags @var{flags}. The debug-level
999 controls the format and amount of information presented by the debugging
1000 functions. @xref{Debugmode}, for more details on the format and
1001 meaning of @var{flags}. If omitted, @var{flags} defaults to
1002 @samp{+adeq}. If the option occurs multiple times, @var{flags} starting
1003 with @samp{-} or @samp{+} are cumulative, while @var{flags} starting
1004 with a letter override all earlier settings. The debug-level starts
1005 with @samp{d} enabled and all other flags disabled. To disable all
1006 previously set flags, specify an explicit @var{flags} of @samp{-V}. For
1007 backward compatibility reasons, the option @option{--fatal-warnings}
1008 implies @samp{--debug=-d} as part of its effects. The spelling
1009 @option{--debug} is recognized as an unambiguous option for
1010 compatibility with earlier versions of @acronym{GNU} M4, but for
1011 consistency with the builtin name, you can also use the spelling
1012 @option{--debugmode}. Order is significant with respect to file names.
1014 The cumulative effect of the various options in this example is
1015 equivalent to a single invocation of @code{debugmode(`adlqx')}:
1017 @comment options: -d-V -d+lx --debug --debugmode=-e
1019 $ @kbd{m4 -d+lx --debug --debugmode=-e}
1023 @error{}m4trace:2: -1- id 2: len(`123')
1027 @item --debugfile@r{[}=@var{file}@r{]}
1028 @itemx -o @var{file}
1029 @itemx --error-output=@var{file}
1030 Redirect debug messages and trace output to the
1031 named @var{file}. Warnings, error messages, and @code{errprint} output
1032 are still printed to standard error. Output from @code{dumpdef} goes to
1033 this file when the debug level @code{o} is not set (@pxref{Debugmode}).
1034 If these options are not used, or
1035 if @var{file} is unspecified (only possible for @option{--debugfile}),
1036 debug output goes to standard error; if @var{file} is the empty string,
1037 debug output is discarded. @xref{Debugfile}, for more details. The
1038 option @option{--debugfile} may be given more than once, and order is
1039 significant with respect to file names. The spellings @option{-o} and
1040 @option{--error-output} are misleading and
1041 inconsistent with other @acronym{GNU} tools; using those spellings will
1042 evoke a warning, and they may be withdrawn or change semantics in a future release.
1046 @itemx --debuglen=@var{num}
1047 @itemx --arglength=@var{num}
1048 Restrict the size of the output generated by macro tracing or by
1049 @code{dumpdef} to @var{num} characters per string. If unspecified or
1050 zero, output is unlimited. @xref{Debuglen}, for more details.
1051 @var{num} can have an optional scaling suffix. The spelling
1052 @option{--arglength} is deprecated, since it does not match the
1053 @code{debuglen} macro; using it will evoke a warning, and it may be
1054 withdrawn in a future release.
1055 @comment FIXME - Should we add an option that controls whether output
1056 @comment strings are sanitized with escape sequences, so that dumpdef is
1057 @comment truly one line per macro?
1058 @comment FIXME - see comment on --nesting-limit about NUM.
1061 @itemx --trace=@var{name}
1062 @itemx --traceon=@var{name}
1063 This enables tracing for the macro @var{name}, at any point where it is
1064 defined. @var{name} need not be defined when this option is given.
1065 This option may be given more than once, and order is significant with
1066 respect to file names. @xref{Trace}, for more details.
1068 @item --traceoff=@var{name}
1069 This disables tracing for the macro @var{name}, at any point where it is
1070 defined. @var{name} need not be defined when this option is given.
1071 This option may be given more than once, and order is significant with
1072 respect to file names. @xref{Trace}, for more details.
1075 @node Command line files
1076 @section Specifying input files on the command line
1078 @cindex command line, file names on the
1079 @cindex file names, on the command line
1080 The remaining arguments on the command line are taken to be input file
1081 names. If no names are present, standard input is read. A file
1082 name of @file{-} is taken to mean standard input. It is
1083 conventional, but not required, for input files to end in @samp{.m4}.
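
For illustration, the following sketch pipes a small script into
@code{m4} and names standard input explicitly with @file{-}. The macro
name @samp{who} is made up for this example, and omitting the @file{-}
should give the same result here.

@example
$ @kbd{echo 'define(`who', `world')hello who' | m4 -}
@result{}hello world
@end example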
1085 The input files are read in the sequence given. Standard input can be
1086 read more than once, so the file name @file{-} may appear multiple times
1087 on the command line; this makes a difference when input is from a
1088 terminal or other special file type. It is an error if an input file
1089 ends in the middle of argument collection, a comment, or a quoted string.
1091 @comment FIXME - it would be nicer if we let these three things
1092 @comment continue across file boundaries, provided that we warn in
1093 @comment interactive use when switching to stdin in a non-default parse
1096 Various options, such as @option{--define} (@option{-D}), @option{--undefine}
1097 (@option{-U}), @option{--synclines} (@option{-s}), @option{--trace}
1098 (@option{-t}), @option{--regexp-syntax} (@option{-r}), and
1099 @option{--load-module} (@option{-m}), only take effect after processing
1100 input from any file names that occur earlier on the command line. For
1101 example, assume the file @file{foo} contains:
1109 The text @samp{bar} can then be redefined over multiple uses of
1112 @comment options: -Dbar=hello foo -Dbar=world foo
1114 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
1119 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
1120 exit status of @code{m4} will be 0 for success, 1 for general failure
1121 (such as problems with reading an input file), and 63 for version
1122 mismatch (@pxref{Using frozen files}).
1124 If you need to read a file whose name starts with a @file{-}, you can
1125 specify it as @samp{./-file}, or use @option{--} to mark the end of
1129 @chapter Lexical and syntactic conventions
1131 @cindex input tokens
1133 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
1134 token is either a name, a quoted string, or any single character that
1135 is not part of either a name or a string. Input to @code{m4} can also
1136 contain comments. @acronym{GNU} @code{m4} does not yet understand
1137 multibyte locales; all operations are byte-oriented rather than
1138 character-oriented (although if your locale uses a single byte
1139 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1140 However, @code{m4} is eight-bit clean, so you can
1141 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1142 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1143 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1145 @comment FIXME - each builtin needs to document how it handles NUL, then
1146 @comment update the above paragraph to mention that NUL is now handled
1147 @comment transparently.
1150 * Names:: Macro names
1151 * Quoted strings:: Quoting input to @code{m4}
1152 * Comments:: Comments in @code{m4} input
1153 * Other tokens:: Other kinds of input tokens
1154 * Input processing:: How @code{m4} copies input to output
1155 * Regular expression syntax:: How @code{m4} interprets regular expressions
1159 @section Macro names
1163 A name is any sequence of letters, digits, and the character @samp{_}
1164 (underscore), where the first character is not a digit. @code{m4} will
1165 use the longest such sequence found in the input. If a name has a
1166 macro definition, it will be subject to macro expansion
1167 (@pxref{Macros}). Names are case-sensitive.
1169 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
1171 The definitions of letters, digits and other input characters can be
1172 changed at any time, using the builtin macro @code{changesyntax}.
1173 @xref{Changesyntax}, for more information.
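
The longest-match rule and case sensitivity described above can be
observed with a short sketch. It assumes the default syntax table and a
hypothetical macro named @samp{foo}; none of the other names are defined.

@example
define(`foo', `FOO')
@result{}
foo Foo foo_bar foo1 1foo
@result{}FOO Foo foo_bar foo1 1FOO
@end example

Only the bare @samp{foo} and the @samp{foo} that follows the digit
@samp{1} are expanded: @samp{foo_bar} and @samp{foo1} are each a single,
longer name, while a digit cannot begin a name and so is a token by
itself.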
1175 @node Quoted strings
1176 @section Quoting input to @code{m4}
1178 @cindex quoted string
1179 @cindex string, quoted
1180 A quoted string is a sequence of characters surrounded by quote
1181 strings, defaulting to
1182 @samp{`} and @samp{'}, where the nested begin and end quotes within the
1183 string are balanced. The value of a string token is the text, with one
1184 level of quotes stripped off. Thus
1193 is the empty string, and double-quoting turns into single-quoting.
1201 The quote characters can be changed at any time, using the builtin macros
1202 @code{changequote} (@pxref{Changequote}) or @code{changesyntax}
1203 (@pxref{Changesyntax}).
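
As a small sketch of how one level of quotes is stripped, assuming the
default quote characters and a hypothetical macro named @samp{foo}:

@example
define(`foo', `FOO')
@result{}
foo `foo' ``foo''
@result{}FOO foo `foo'
`a `nested' string'
@result{}a `nested' string
@end example

The unquoted @samp{foo} is expanded, the singly quoted one is protected
from expansion, and each additional level of balanced quotes survives
into the output with one level removed.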
1206 @section Comments in @code{m4} input
1209 Comments in @code{m4} are normally delimited by the characters @samp{#}
1210 and newline. All characters between the comment delimiters are ignored,
1211 but the entire comment (including the delimiters) is passed through to
1212 the output, unless you supply the @option{--discard-comments} or
1213 @option{-c} option at the command line (@pxref{Operation modes, ,
1214 Invoking m4}). When discarding comments, the comment delimiters are
1215 discarded, even if the close-comment string is a newline.
1217 Comments cannot be nested, so the first newline after a @samp{#} ends
1218 the comment. The commenting effect of the begin-comment string
1219 can be inhibited by quoting it.
1223 `quoted text' # `commented text'
1224 @result{}quoted text # `commented text'
1225 `quoting inhibits' `#' `comments'
1226 @result{}quoting inhibits # comments
1229 @comment options: -c
1232 `quoted text' # `commented text'
1233 `quoting inhibits' `#' `comments'
1234 @result{}quoted text quoting inhibits # comments
1237 The comment delimiters can be changed to any string at any time, using
1238 the builtin macros @code{changecom} (@pxref{Changecom}) or
1239 @code{changesyntax} (@pxref{Changesyntax}).
1242 @section Other kinds of input tokens
1244 @cindex tokens, special
1245 Any character that is not part of a name, a quoted string, or a
1246 comment is a token by itself. When not in the context of macro
1247 expansion, all of these tokens are just copied to output. However,
1248 during macro expansion, whitespace characters (space, tab, newline,
1249 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1250 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1251 roles, explained later. Which characters actually perform these roles
1252 can be adjusted with @code{changesyntax} (@pxref{Changesyntax}).
1254 @node Input processing
1255 @section How @code{m4} copies input to output
1257 As @code{m4} reads the input token by token, it will copy each token
1258 directly to the output immediately.
1260 The exception is when it finds a word with a macro definition. In that
1261 case @code{m4} will calculate the macro's expansion, possibly reading
1262 more input to get the arguments. It then inserts the expansion in front
1263 of the remaining input. In other words, the resulting text from a macro
1264 call will be read and parsed into tokens again.
1266 @code{m4} expands a macro as soon as possible. If it finds a macro call
1267 when collecting the arguments to another, it will expand the second call
1268 first. This process continues until there are no more macro calls to
1269 expand and all the input has been consumed.
1271 For a running example, examine how @code{m4} handles this input:
1275 format(`Result is %d', eval(`2**15'))
1279 First, @code{m4} sees that the token @samp{format} is a macro name, so
1280 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1281 and @samp{@w{ }}, before encountering another potential macro. Sure
1282 enough, @samp{eval} is a macro name, so the nested argument collection
1283 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1284 with the lone argument of @samp{2**15}. The expansion of
1285 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1286 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1287 combined with the next @samp{)}, the format macro now has all its
1288 arguments, as if the user had typed:
1292 format(`Result is %d', 32768)
1296 The format macro expands to @samp{Result is 32768}, and we have another
1297 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1298 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1299 @samp{8}. None of these are macros, so the final output is
1303 @result{}Result is 32768
1306 As a more complicated example, we will contrast an actual code example
1307 from the Gnulib project@footnote{Derived from a patch in
1308 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1309 and a followup patch in
1310 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1311 showing both a buggy approach and the desired results. The user desires
1312 to output a shell assignment statement that takes its argument and turns
1313 it into a shell variable by converting it to uppercase and prepending a
1314 prefix. The original attempt looks like this:
1318 define([gl_STRING_MODULE_INDICATOR],
1321 GNULIB_]translit([$1],[a-z],[A-Z])[=1
1323 gl_STRING_MODULE_INDICATOR([strcase])
1325 @result{} GNULIB_strcase=1
1329 Oops -- the argument did not get capitalized. And although the manual
1330 is not able to easily show it, both lines that appear empty actually
1331 contain two trailing spaces. By stepping through the parse, it is easy
1332 to see what happened. First, @code{m4} sees the token
1333 @samp{changequote}, which it recognizes as a macro, followed by
1334 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1335 argument list. The macro expands to the empty string, but changes the
1336 quoting characters to something more useful for generating shell code
1337 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1338 but unbalanced @samp{[]} tend to be rare). Also in the first line,
1339 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1340 macro that consumes the rest of the line, resulting in no output for that line.
1343 The second line starts a macro definition. @code{m4} sees the token
1344 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1345 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
1346 comma was encountered, the first argument is known to be the expansion
1347 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1348 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1349 whitespace is discarded as part of argument collection. Then comes a
1350 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1351 comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
1352 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1353 macro expansion has started.
1355 The arguments to @code{translit} are found in the tokens @samp{(},
1356 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1357 @samp{)}. All three string arguments are expanded (or in other words,
1358 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1359 capitalization, the result of the macro is @samp{$1}. This expansion is
1360 rescanned, resulting in the two literal characters @samp{$} and @samp{1}.
1363 Scanning of the outer macro resumes, and picks up with
1364 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
1365 expanded text are concatenated, with the end result that the macro
1366 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1367 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1368 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1370 The final line is then parsed, beginning with @samp{ } and @samp{ }
1371 that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
1372 recognized as a macro name, with an argument list of @samp{(},
1373 @samp{[strcase]}, and @samp{)}. Since the definition of the macro
1374 contains the sequence @samp{$1}, that sequence is replaced with the
1375 argument @samp{strcase} prior to starting the rescan. The rescan sees
1376 @samp{@key{NL}} and four spaces, which are output literally, then
1377 @samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
1378 come four more spaces, also output literally, and the token
1379 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1380 substitution. Since that is not a macro name, it is output literally,
1381 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1382 two more spaces. Finally, the original @samp{@key{NL}} seen after the
1383 macro invocation is scanned and output literally.
1385 Now for a corrected approach. This rearranges the use of newlines and
1386 whitespace so that less whitespace is output (which, although harmless
1387 to shell scripts, can be visually unappealing), and fixes the quoting
1388 issues so that the capitalization occurs when the macro
1389 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is defined.
1394 define([gl_STRING_MODULE_INDICATOR],
1396 GNULIB_[]translit([$1], [a-z], [A-Z])=1dnl
1398 gl_STRING_MODULE_INDICATOR([strcase])
1399 @result{} GNULIB_STRCASE=1
1402 The parsing of the first line is unchanged. The second line sees the
1403 name of the macro to define, then sees the discarded @samp{@key{NL}}
1404 and two spaces, as before. But this time, the next token is
1405 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([$1], [a-z],
1406 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1407 @samp{)} to end the macro definition and @samp{dnl} to skip the
1408 newline. No early expansion of @code{translit} occurs, so the entire
1409 string becomes the definition of the macro.
1411 The final line is then parsed, beginning with two spaces that are
1412 output literally, and an invocation of
1413 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1414 Again, the @samp{$1} in the macro definition is substituted prior to
1415 rescanning. Rescanning first encounters @samp{dnl}, and discards
1416 @samp{ comment@key{NL}}. Then two spaces are output literally. Next
1417 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1418 output literally. The token @samp{[]} is an empty string, so it does
1419 not affect output. Then the token @samp{translit} is encountered.
1421 This time, the arguments to @code{translit} are parsed as @samp{(},
1422 @samp{[strcase]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1423 @samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
1424 @code{translit} call produces the desired result @samp{STRCASE}. This is
1425 rescanned, but since it is not a macro name, it is output literally.
1426 Then the scanner sees @samp{=} and @samp{1}, which are output
1427 literally, followed by @samp{dnl} which discards the rest of the
1428 definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
1429 end of output is the literal @samp{@key{NL}} that appeared after the
1430 invocation of the macro.
1432 The order in which @code{m4} expands the macros can be further explored
1433 using the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
1435 @node Regular expression syntax
1436 @section How @code{m4} interprets regular expressions
1438 There are several contexts where @code{m4} parses an argument as a
1439 regular expression. This section describes the various flavors of
1440 regular expressions. @xref{Changeresyntax}.
1442 @include regexprops-generic.texi
1445 @chapter How to invoke macros
1447 This chapter covers macro invocation, macro arguments and how macro
1448 expansion is treated.
1451 * Invocation:: Macro invocation
1452 * Inhibiting Invocation:: Preventing macro invocation
1453 * Macro Arguments:: Macro arguments
1454 * Quoting Arguments:: On Quoting Arguments to macros
1455 * Macro expansion:: Expanding macros
1459 @section Macro invocation
1461 @cindex macro invocation
1462 @cindex invoking macros
1463 A macro invocation has one of the forms
1471 which is a macro invocation without any arguments, or
1475 name(arg1, arg2, @dots{}, arg@var{n})
1479 which is a macro invocation with @var{n} arguments. Macros can have any
1480 number of arguments. All arguments are strings, but different macros
1481 might interpret the arguments in different ways.
1483 The opening parenthesis @emph{must} follow the @var{name} directly, with
1484 no spaces in between. If it does not, the macro is called with no arguments.
1487 For a macro call to have no arguments, the parentheses @emph{must} be
1488 left out. The macro call
1496 is a macro call with one argument, which is the empty string, not a call with no arguments.
1499 @node Inhibiting Invocation
1500 @section Preventing macro invocation
1502 An innovation of the @code{m4} language, compared to some of its
1503 predecessors (like Strachey's @code{GPM}, for example), is the ability
1504 to recognize macro calls without resorting to any special, prefixed
1505 invocation character. While generally useful, this feature might
1506 sometimes be the source of spurious, unwanted macro calls. So, @acronym{GNU}
1507 @code{m4} offers several mechanisms or techniques for inhibiting the
1508 recognition of names as macro calls.
1510 @cindex @acronym{GNU} extensions
1512 @cindex macro, blind
1513 First of all, many builtin macros cannot meaningfully be called without
1514 arguments. As a @acronym{GNU} extension, for any of these macros,
1515 whenever an opening parenthesis does not immediately follow their name,
1516 the builtin macro call is not triggered. This solves the most usual
1517 cases, like for @samp{include} or @samp{eval}. Later in this document,
1518 the sentence ``This macro is recognized only with parameters'' refers to
1519 this specific provision of @acronym{GNU} M4, also known as a blind
1520 builtin macro. For the builtins defined by @acronym{POSIX} that bear
1521 this disclaimer, @acronym{POSIX} specifically states that invoking those
1522 builtins without arguments is unspecified, because many other
1523 implementations simply invoke the builtin as though it were given one
1524 empty argument instead.
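
A brief sketch of this behavior, using two builtins that this manual
names as typical cases: without an opening parenthesis directly after
the name, the name is simply copied to the output.

@example
eval
@result{}eval
eval(`1 + 2')
@result{}3
include
@result{}include
@end example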
1534 There is also a command line option (@option{--prefix-builtins}, or
1535 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1536 builtin macros with a prefix of @samp{m4_} at startup. The option has
1537 no effect whatsoever on user defined macros. For example, with this option,
1538 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1539 no effect on whether a macro requires parameters.
1541 @comment options: -P
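
The effect can be sketched as follows, assuming @code{m4} was started
with @option{-P}; the macro name @samp{x} is hypothetical. The unprefixed
@samp{define} is no longer a macro, so its text is copied through with
only the quotes stripped, while the prefixed form performs the
definition.

@example
define(`x', `1')
@result{}define(x, 1)
m4_define(`x', `1')
@result{}
x
@result{}1
@end example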
1554 Another alternative is to redefine problematic macros to a name less
1555 likely to cause conflicts (@pxref{Definitions}). Or the parsing engine
1556 can be changed to redefine what constitutes a valid macro name
1557 (@pxref{Changesyntax}).
1559 Of course, the simplest way to prevent a name from being interpreted
1560 as a call to an existing macro is to quote it. The remainder of
1561 this section studies a little more deeply how quoting affects macro
1562 invocation, and how quoting can be used to inhibit macro invocation.
1564 Even if quoting is usually done over the whole macro name, it can also
1565 be done over only a few characters of this name (provided, of course,
1566 that the unquoted portions are not also a macro). It is also possible
1567 to quote the empty string, but this works only @emph{inside} the name.
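
As a sketch of such partial quoting, assume the default quote characters
and that none of the unquoted fragments (@samp{ivert}, @samp{di},
@samp{t}, @samp{div}, @samp{ert}) is itself defined as a macro. The
inputs

@example
`divert'
@result{}divert
`d'ivert
@result{}divert
di`ver't
@result{}divert
div`'ert
@result{}divert
@end example
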
1582 all yield the string @samp{divert}. While in both:
1592 the @code{divert} builtin macro will be called, which expands to the empty string.
1596 The output of macro evaluations is always rescanned. In the following
1597 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1599 if @code{m4} had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1602 define(`cde', `CDE')
1604 define(`x', `substr(ab')
1606 define(`y', `cde, `1', `3')')
1612 Unquoted strings on either side of a quoted string are subject to
1613 being recognized as macro names. In the following example, quoting the
1614 empty string allows for the second @code{macro} to be recognized as such:
1617 define(`macro', `m')
1625 Quoting may prevent recognizing as a macro name the concatenation of a
1626 macro expansion with the surrounding characters. In this example:
1629 define(`macro', `di$1')
1638 the input will produce the string @samp{divert}. When the quotes were
1639 removed, the @code{divert} builtin was called instead.
1641 @node Macro Arguments
1642 @section Macro arguments
1644 @cindex macros, arguments to
1645 @cindex arguments to macros
1646 When a name is seen, and it has a macro definition, it will be expanded
1649 If the name is followed by an opening parenthesis, the arguments will be
1650 collected before the macro is called. If too few arguments are
1651 supplied, the missing arguments are taken to be the empty string.
1652 However, some builtins are documented to behave differently for a
1653 missing optional argument than for an explicit empty string. If there
1654 are too many arguments, the excess arguments are ignored. Unquoted
1655 leading whitespace is stripped off all arguments, but whitespace
1656 generated by a macro expansion or occurring after a macro that expanded
1657 to an empty string remains intact. Whitespace includes space, tab,
1658 newline, carriage return, vertical tab, and formfeed.
1661 define(`macro', `$1')
1663 macro( unquoted leading space lost)
1664 @result{}unquoted leading space lost
1665 macro(` quoted leading space kept')
1666 @result{} quoted leading space kept
1668 divert `unquoted space kept after expansion')
1669 @result{} unquoted space kept after expansion
1671 ')`whitespace from expansion kept')
1673 @result{}whitespace from expansion kept
1674 macro(`unquoted trailing whitespace kept'
1676 @result{}unquoted trailing whitespace kept
1680 @cindex warnings, suppressing
1681 @cindex suppressing warnings
1682 Normally @code{m4} will issue warnings if a builtin macro is called
1683 with an inappropriate number of arguments, but these can be suppressed with
1684 the @option{--quiet} command line option (or @option{--silent}, or
1685 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1686 defined macros, there is no check of the number of arguments given.
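
As a minimal sketch of that last point, assume a hypothetical user
macro @code{pair} (the name is chosen only for illustration) that
refers to two arguments. Calling it with one or with three arguments
produces no warning; the missing argument is empty and the extra one is
ignored:

@example
define(`pair', `[$1|$2]')dnl
pair(`a')
@result{}[a|]
pair(`a', `b', `c')
@result{}[a|b]
@end example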
1691 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
1695 index(`abc', `b', `0', `ignored')
1696 @error{}m4:stdin:3: Warning: index: extra arguments ignored: 4 > 3
1700 @comment options: -Q
1707 index(`abc', `b', `', `ignored')
1711 Macros are expanded normally during argument collection, and whatever
1712 commas, quotes and parentheses that might show up in the resulting
1713 expanded text will serve to define the arguments as well. Thus, if
1714 @var{foo} expands to @samp{, b, c}, the macro call
1722 is a macro call with four arguments, which are @samp{a }, @samp{b},
1723 @samp{c} and @samp{d}. To understand why the first argument contains
1724 whitespace, remember that unquoted leading whitespace is never part
1725 of an argument, but trailing whitespace always is.
1727 It is possible for a macro's definition to change during argument
1728 collection, in which case the expansion uses the definition that was in
1729 effect at the time the opening @samp{(} was seen.
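
A short sketch of this rule, using a throwaway macro @code{f} (the name
is only illustrative): the outer call still expands to the old
definition, while a later call sees the new one.

@example
define(`f', `1')dnl
f(define(`f', `2'))
@result{}1
f
@result{}2
@end example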
1740 It is an error if the end of file occurs while collecting arguments.
1745 @result{}hello world
1748 @error{}m4:stdin:2: define: end of file in argument list
1751 @node Quoting Arguments
1752 @section On Quoting Arguments to macros
1754 @cindex quoted macro arguments
1755 @cindex macros, quoted arguments to
1756 @cindex arguments, quoted macro
1757 Each argument has unquoted leading whitespace removed. Within each
1758 argument, all unquoted parentheses must match. For example, if
1759 @var{foo} is a macro,
1767 is a macro call, with one argument, whose value is @samp{() (() (}.
1768 Commas separate arguments, except when they occur inside quotes,
1769 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1772 It is common practice to quote all arguments to macros, unless you are
1773 sure you want the arguments expanded. Thus, in the above
1774 example with the parentheses, the `right' way to do it is like this:
1781 @cindex quoting rule of thumb
1782 @cindex rule of thumb, quoting
1783 It is, however, in certain cases necessary (because nested expansion
1784 must occur to create the arguments for the outer macro) or convenient
1785 (because it uses fewer characters) to leave out quotes for some
1786 arguments, and there is nothing wrong in doing it. It just makes life a
1787 bit harder, if you are not careful to follow a consistent quoting style.
1788 For consistency, this manual follows the rule of thumb that each layer
1789 of parentheses introduces another layer of single quoting, except when
1790 showing the consequences of quoting rules. This is done even when the
1791 quoted string cannot be a macro, such as with integers when you have not
1792 changed the syntax via @code{changesyntax} (@pxref{Changesyntax}).
1794 The quoting rule of thumb of one level of quoting per level of parentheses has a
1795 nice property: when a macro name appears inside parentheses, you can
1796 determine when it will be expanded. If it is not quoted, it will be
1797 expanded prior to the outer macro, so that its expansion becomes the
1798 argument. If it is single-quoted, it will be expanded after the outer
1799 macro. And if it is double-quoted, it will be used as literal text
1800 instead of a macro name.
1803 define(`active', `ACT, IVE')
1805 define(`show', `$1 $1')
1810 @result{}ACT, IVE ACT, IVE
1812 @result{}active active
1815 @node Macro expansion
1816 @section Macro expansion
1818 @cindex macros, expansion of
1819 @cindex expansion of macros
1820 When the arguments, if any, to a macro call have been collected, the
1821 macro is expanded, and the expansion text is pushed back onto the input
1822 (unquoted), and reread. The expansion text from one macro call might
1823 therefore result in more macros being called, if the calls are included,
1824 completely or partially, in the first macro call's expansion.
1826 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1827 @var{bar} expands to @samp{Hello world}, the input
1829 @comment options: -Dbar='Hello world' -Dfoo=bar
1831 $ @kbd{m4 -Dbar="Hello world" -Dfoo=bar}
1833 @result{}Hello world
1837 will expand first to @samp{bar}, and when this is reread and
1838 expanded, into @samp{Hello world}.
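
A call may also be only partially contained in an expansion. In the
following sketch, the hypothetical macro @code{open_len} (a name
invented for this illustration) expands to the start of a call to
@code{len}, and the rest of that call is taken from the input that
follows the macro name:

@example
define(`open_len', `len(')dnl
open_len`abc')
@result{}3
@end example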
1841 @chapter How to define new macros
1843 @cindex macros, how to define new
1844 @cindex defining new macros
1845 Macros can be defined, redefined and deleted in several different ways.
1846 Also, it is possible to redefine a macro without losing a previous
1847 value, and bring back the original value at a later time.
1850 * Define:: Defining a new macro
1851 * Arguments:: Arguments to macros
1852 * Pseudo Arguments:: Special arguments to macros
1853 * Undefine:: Deleting a macro
1854 * Defn:: Renaming macros
1855 * Pushdef:: Temporarily redefining macros
1856 * Renamesyms:: Renaming macros with regular expressions
1858 * Indir:: Indirect call of macros
1859 * Builtin:: Indirect call of builtins
1860 * M4symbols:: Getting the defined macro names
1864 @section Defining a macro
1866 The normal way to define or redefine macros is to use the builtin
1869 @deffn {Builtin (m4)} define (@var{name}, @ovar{expansion})
1870 Defines @var{name} to expand to @var{expansion}. If
1871 @var{expansion} is not given, it is taken to be empty.
1873 The expansion of @code{define} is void.
1874 The macro @code{define} is recognized only with parameters.
1876 @comment Other implementations, such as Solaris, can define a macro
1877 @comment with a builtin token attached to text:
1878 @comment define(foo, a`'defn(`divnum')b)
1879 @comment defn(`foo') => ab
1880 @comment dumpdef(`foo') => foo: a<divnum>b
1881 @comment len(defn(`foo')) => 3
1882 @comment index(defn(`foo'), defn(`divnum')) => 1
1884 @comment It may be worth making some changes to support this behavior,
1885 @comment or something similar to it.
1887 @comment But be sure it has sane semantics, with potentially deferred
1888 @comment expansion of builtins. For example, this should not warn
1889 @comment about trying to access the definition of an undefined macro:
1890 @comment define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops')
1891 @comment Also, think how to handle conflicting argument counts:
1892 @comment define(`bar', defn(`dnl', `len'))
1894 The following example defines the macro @var{foo} to expand to the text
1895 @samp{Hello world.}.
1898 define(`foo', `Hello world.')
1901 @result{}Hello world.
1904 The empty line in the output is there because the newline is not
1905 a part of the macro definition, and it is consequently copied to
1906 the output. This can be avoided by use of the macro @code{dnl}.
1907 @xref{Dnl}, for details.
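
For instance, here is a sketch of the same definition with @code{dnl}
appended, so that the newline following the definition is discarded and
no empty line is output:

@example
define(`foo', `Hello world.')dnl
foo
@result{}Hello world.
@end example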
1909 The first argument to @code{define} should be quoted; otherwise, if the
1910 macro is already defined, you will be defining a different macro. This
1911 example shows the problems with underquoting, since we did not want to
1912 redefine @code{one}:
1923 @cindex @acronym{GNU} extensions
1924 @acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
1925 definition of a macro if it has several definitions from @code{pushdef}
1926 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1927 definitions of a macro with @code{define}. @xref{Incompatibilities},
1930 As a @acronym{GNU} extension, the first argument to @code{define} does
1931 not have to be a simple word.
1932 It can be any text string, even the empty string. A macro with a
1933 non-standard name cannot be invoked in the normal way, as the name is
1934 not recognized. It can only be referenced by the builtins @code{indir}
1935 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
1938 Arrays and associative arrays can be simulated by using non-standard
1941 @deffn Composite array (@var{index})
1942 @deffnx Composite array_set (@var{index}, @ovar{value})
1943 Provide access to entries within an array. @code{array} reads the entry
1944 at location @var{index}, and @code{array_set} assigns @var{value} to
1945 location @var{index}.
1949 define(`array', `defn(format(``array[%d]'', `$1'))')
1951 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1953 array_set(`4', `array element no. 4')
1955 array_set(`17', `array element no. 17')
1958 @result{}array element no. 4
1959 array(eval(`10 + 7'))
1960 @result{}array element no. 17
1963 Change the @samp{%d} to @samp{%s} and it is an associative array.
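
A minimal sketch of that variation follows. The names @code{assoc} and
@code{assoc_set} are hypothetical, chosen only to keep this
illustration separate from @code{array}; the definitions are otherwise
the same, with @samp{%s} in place of @samp{%d}:

@example
define(`assoc', `defn(format(``assoc[%s]'', `$1'))')dnl
define(`assoc_set', `define(format(``assoc[%s]'', `$1'), `$2')')dnl
assoc_set(`color', `blue')dnl
assoc_set(`shape', `round')dnl
assoc(`color')
@result{}blue
assoc(`shape')
@result{}round
@end example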
1966 @section Arguments to macros
1968 @cindex macros, arguments to
1969 @cindex arguments to macros
1970 Macros can have arguments. The @var{n}th argument is denoted by
1971 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
1972 argument, when the macro is expanded. Replacement of arguments happens
1973 before rescanning, regardless of how many nesting levels of quoting
1974 appear in the expansion. Here is an example of a macro with
1977 @deffn Composite exch (@var{arg1}, @var{arg2})
1978 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
1983 define(`exch', `$2, $1')
1985 exch(`arg1', `arg2')
1989 This can be used, for example, if you like the arguments to
1990 @code{define} to be reversed.
1993 define(`exch', `$2, $1')
1995 define(exch(``expansion text'', ``macro''))
1998 @result{}expansion text
2001 @xref{Quoting Arguments}, for an explanation of the double quotes.
2002 (You should try to improve this example so that clients of @code{exch}
2003 do not have to double quote; or @pxref{Improved exch, , Answers}).
2005 @cindex @acronym{GNU} extensions
2006 @acronym{GNU} @code{m4} allows the number following the @samp{$} to
2008 or more digits, allowing macros to have any number of arguments. This
2009 is not so in UNIX implementations of @code{m4}, which only recognize
2011 @comment FIXME - See Austin group XCU ERN 111. POSIX says that $11 must
2012 @comment be the first argument concatenated with 1, and instead reserves
2013 @comment ${11} for implementation use. Once this is implemented, the
2014 @comment documentation needs to reflect how these extended arguments
2015 @comment are handled, as well as backwards compatibility issues with
2016 @comment 1.4.x. Also, consider adding further extensions such as
2017 @comment ${1-default}, which expands to `default' if $1 is empty.
2019 As a special case, the zeroth argument, @code{$0}, is always the name
2020 of the macro being expanded.
2023 define(`test', ``Macro name: $0'')
2026 @result{}Macro name: test
2029 If you want quoted text to appear as part of the expansion text,
2030 remember that quotes can be nested in quoted strings. Thus, in
2033 define(`foo', `This is macro `foo'.')
2036 @result{}This is macro foo.
2040 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
2041 a quoted string, and not a name.
2043 @node Pseudo Arguments
2044 @section Special arguments to macros
2046 @cindex special arguments to macros
2047 @cindex macros, special arguments to
2048 @cindex arguments to macros, special
2049 There is a special notation for the number of actual arguments supplied,
2050 and for all the actual arguments.
2052 The number of actual arguments in a macro call is denoted by @code{$#}
2053 in the expansion text.
2055 @deffn Composite nargs (@dots{})
2056 Expands to a count of the number of arguments supplied.
2060 define(`nargs', `$#')
2066 nargs(`arg1', `arg2', `arg3')
2068 nargs(`commas can be quoted, like this')
2070 nargs(arg1#inside comments, commas do not separate arguments
2073 nargs((unquoted parentheses, like this, group arguments))
2077 Remember that @samp{#} defaults to the comment character; if you forget
2078 quotes to inhibit the comment behavior, your macro definition may not
2079 end where you expected.
2082 dnl Attempt to define a macro to just `$#'
2083 define(underquoted, $#)
2091 The notation @code{$*} can be used in the expansion text to denote all
2092 the actual arguments, unquoted, with commas in between. For example
2095 define(`echo', `$*')
2097 echo(arg1, arg2, arg3 , arg4)
2098 @result{}arg1,arg2,arg3 ,arg4
2101 Often each argument should be quoted, and the notation @code{$@@} handles
2102 that. It is just like @code{$*}, except that it quotes each argument.
2103 A simple example of that is:
2106 define(`echo', `$@@')
2108 echo(arg1, arg2, arg3 , arg4)
2109 @result{}arg1,arg2,arg3 ,arg4
2112 Where did the quotes go? Of course, they were eaten when the expanded
2113 text was reread by @code{m4}. To show the difference, try
2116 define(`echo1', `$*')
2118 define(`echo2', `$@@')
2120 define(`foo', `This is macro `foo'.')
2123 @result{}This is macro This is macro foo..
2125 @result{}This is macro foo.
2127 @result{}This is macro foo.
2133 @xref{Trace}, if you do not understand this. As another example of the
2134 difference, remember that comments encountered in arguments are passed
2135 untouched to the macro, and that quoting disables comments.
2138 define(`echo1', `$*')
2140 define(`echo2', `$@@')
2142 define(`foo', `bar')
2154 A @samp{$} sign in the expansion text, that is not followed by anything
2155 @code{m4} understands, is simply copied to the macro expansion, as any
2159 define(`foo', `$$$ hello $$$')
2162 @result{}$$$ hello $$$
2166 @cindex literal output
2167 @cindex output, literal
2168 If you want a macro to expand to something like @samp{$12}, the
2169 judicious use of nested quoting can put a safe character between the
2170 @code{$} and the next character, relying on the rescanning to remove the
2171 nested quote. This will prevent @code{m4} from interpreting the
2172 @code{$} sign as a reference to an argument.
2175 define(`foo', `no nested quote: $1')
2178 @result{}no nested quote: arg
2179 define(`foo', `nested quote around $: `$'1')
2182 @result{}nested quote around $: $1
2183 define(`foo', `nested empty quote after $: $`'1')
2186 @result{}nested empty quote after $: $1
2187 define(`foo', `nested quote around next character: $`1'')
2190 @result{}nested quote around next character: $1
2191 define(`foo', `nested quote around both: `$1'')
2194 @result{}nested quote around both: arg
2198 @section Deleting a macro
2200 @cindex macros, how to delete
2201 @cindex deleting macros
2202 @cindex undefining macros
2203 A macro definition can be removed with @code{undefine}:
2205 @deffn {Builtin (m4)} undefine (@var{name}@dots{})
2206 For each argument, remove the macro @var{name}. The macro names must
2207 necessarily be quoted, since they will be expanded otherwise. If an
2208 argument is not a defined macro, then the @samp{d} debug level controls
2209 whether a warning is issued (@pxref{Debugmode}).
2211 The expansion of @code{undefine} is void.
2212 The macro @code{undefine} is recognized only with parameters.
2217 @result{}foo bar blah
2218 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2221 @result{}some other text
2225 @result{}foo other text
2226 undefine(`bar', `blah')
2229 @result{}foo bar blah
2232 Undefining a macro inside that macro's expansion is safe; the macro
2233 still expands to the definition that was in effect at the @samp{(}.
2236 define(`f', ``$0':$1')
2238 f(f(f(undefine(`f')`hello world')))
2239 @result{}f:f:f:hello world
2244 As of M4 1.6, @code{undefine} can warn if @var{name} is not a macro, by
2245 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2246 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2252 @error{}m4:stdin:1: Warning: undefine: undefined macro `a'
2261 @section Renaming macros
2263 @cindex macros, how to rename
2264 @cindex renaming macros
2265 @cindex macros, displaying definitions
2266 @cindex definitions, displaying macro
2267 It is possible to rename an already defined macro. To do this, you need
2268 the builtin @code{defn}:
2270 @deffn {Builtin (m4)} defn (@var{name}@dots{})
2271 Expands to the @emph{quoted definition} of each @var{name}. If an
2272 argument is not a defined macro, the expansion for that argument is
2273 empty, and the @samp{d} debug level controls whether a warning is issued
2274 (@pxref{Debugmode}).
2276 If @var{name} is a user-defined macro, the quoted definition is simply
2277 the quoted expansion text. If, instead, @var{name} is a builtin, the
2278 expansion is a special token, which points to the builtin's internal
2279 definition. This token is meaningful primarily as the second argument to
2280 @code{define} (and @code{pushdef}), and is silently converted to an
2281 empty string in many other contexts.
2283 The macro @code{defn} is recognized only with parameters.
2286 Its normal use is best understood through an example, which shows how to
2287 rename @code{undefine} to @code{zap}:
2290 define(`zap', defn(`undefine'))
2295 @result{}undefine(zap)
2298 In this way, @code{defn} can be used to copy macro definitions, and also
2299 definitions of builtin macros. Even if the original macro is removed,
2300 the other name can still be used to access the definition.
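
The same technique applies to user-defined macros. In this sketch, the
macro names @code{greet} and @code{salute} are hypothetical; after the
original name is undefined, the copy still works, while the original
name is no longer recognized as a macro:

@example
define(`greet', `Hello, $1!')dnl
define(`salute', defn(`greet'))dnl
undefine(`greet')dnl
salute(`world')
@result{}Hello, world!
greet(`world')
@result{}greet(world)
@end example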
2302 The fact that macro definitions can be transferred also explains why you
2303 should use @code{$0}, rather than retyping a macro's name in its
2307 define(`foo', `This is `$0'')
2309 define(`bar', defn(`foo'))
2312 @result{}This is bar
2315 Macros used as string variables should be referred through @code{defn},
2316 to avoid unwanted expansion of the text:
2319 define(`string', `The macro dnl is very useful
2323 @result{}The macro@w{ }
2325 @result{}The macro dnl is very useful
2330 However, it is important to remember that @code{m4} rescanning is purely
2331 textual. If an unbalanced end-quote string occurs in a macro
2332 definition, the rescan will see that embedded quote as the termination
2333 of the quoted string, and the remainder of the macro's definition will
2334 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2335 end-quotes in macro definitions or arguments to macros.
2342 define(`echo', `$@@')
2352 On the other hand, it is possible to exploit the fact that @code{defn}
2353 can concatenate multiple macros prior to the rescanning phase, in order
2354 to join the definitions of macros that, in isolation, have unbalanced
2355 quotes. This is particularly useful when one has used several macros to
2356 accumulate text that M4 should rescan as a whole. In the example below,
2357 note how the use of @code{defn} on @code{l} in isolation opens a string,
2358 which is not closed until the next line; but used on @code{l} and
2359 @code{r} together results in nested quoting.
2362 define(`l', `<[>')define(`r', `<]>')
2364 changequote(`[', `]')
2368 @result{}<[>]defn([r])
2374 @cindex builtins, special tokens
2375 @cindex tokens, builtin macro
2376 Using @code{defn} to generate special tokens for builtin macros will
2377 generate a warning in contexts where a macro name is expected. But in
2378 contexts that operate on text, the builtin token is just silently
2379 converted to an empty string. As of M4 1.6, expansion of user macros
2380 will also preserve builtin tokens. However, any use of builtin tokens
2381 outside of the second argument to @code{define} and @code{pushdef} is
2382 generally not portable, since earlier @acronym{GNU} M4 versions, as well
2383 as other @code{m4} implementations, vary on how such tokens are treated.
2389 define(defn(`divnum'), `cannot redefine a builtin token')
2390 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2396 define(`echo', `$@@')
2398 define(`mydivnum', shift(echo(`', defn(`divnum'))))
2402 define(`', `empty-$1')
2404 defn(defn(`divnum'))
2405 @error{}m4:stdin:9: Warning: defn: invalid macro name ignored
2407 pushdef(defn(`divnum'), `oops')
2408 @error{}m4:stdin:10: Warning: pushdef: invalid macro name ignored
2410 traceon(defn(`divnum'))
2411 @error{}m4:stdin:11: Warning: traceon: invalid macro name ignored
2413 indir(defn(`divnum'), `string')
2414 @error{}m4:stdin:12: Warning: indir: invalid macro name ignored
2417 @result{}empty-string
2418 traceoff(defn(`divnum'))
2419 @error{}m4:stdin:14: Warning: traceoff: invalid macro name ignored
2421 popdef(defn(`divnum'))
2422 @error{}m4:stdin:15: Warning: popdef: invalid macro name ignored
2424 dumpdef(defn(`divnum'))
2425 @error{}m4:stdin:16: Warning: dumpdef: invalid macro name ignored
2427 undefine(defn(`divnum'))
2428 @error{}m4:stdin:17: Warning: undefine: invalid macro name ignored
2431 @error{}:@tabchar{}`empty-$1'
2433 m4symbols(defn(`divnum'))
2434 @error{}m4:stdin:19: Warning: m4symbols: invalid macro name ignored
2436 define(`foo', `define(`$1', $2)')dnl
2437 foo(`bar', defn(`divnum'))
2443 As of M4 1.6, @code{defn} can warn if @var{name} is not a macro, by
2444 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2445 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2446 m4}). Also, @code{defn} with multiple arguments can join text with
2447 builtin tokens. However, when defining a macro via @code{define} or
2448 @code{pushdef}, a warning is issued and the builtin token ignored if the
2449 builtin token does not occur in isolation. A future version of
2450 @acronym{GNU} M4 may lift this restriction.
2455 @error{}m4:stdin:1: Warning: defn: undefined macro `foo'
2461 define(`a', `A')define(`AA', `b')
2463 traceon(`defn', `define')
2465 defn(`a', `divnum', `a')
2466 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A''
2468 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2469 @error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
2470 @error{}m4:stdin:7: Warning: define: cannot concatenate builtins
2471 @error{}m4trace: -1- define(`mydivnum', `<divnum><divnum>') -> `'
2473 traceoff(`defn', `define')dumpdef(`mydivnum')
2474 @error{}mydivnum:@tabchar{}`'
2476 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
2477 @error{}m4:stdin:9: Warning: define: cannot concatenate builtins
2479 define(`mydivnum', defn(`divnum')`a')mydivnum
2480 @error{}m4:stdin:10: Warning: define: cannot concatenate builtins
2482 define(`mydivnum', `a'defn(`divnum'))mydivnum
2483 @error{}m4:stdin:11: Warning: define: cannot concatenate builtins
2485 define(`q', ``$@@'')
2487 define(`foo', q(`a', defn(`divnum')))foo
2488 @error{}m4:stdin:13: Warning: define: cannot concatenate builtins
2490 ifdef(`foo', `yes', `no')
2495 @section Temporarily redefining macros
2497 @cindex macros, temporary redefinition of
2498 @cindex temporary redefinition of macros
2499 @cindex redefinition of macros, temporary
2500 @cindex definition stack
2501 @cindex pushdef stack
2502 @cindex stack, macro definition
2503 It is possible to redefine a macro temporarily, reverting to the
2504 previous definition at a later time. This is done with the builtins
2505 @code{pushdef} and @code{popdef}:
2507 @deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion})
2508 @deffnx {Builtin (m4)} popdef (@var{name}@dots{})
2509 Analogous to @code{define} and @code{undefine}.
2511 These macros work in a stack-like fashion. A macro is temporarily
2512 redefined with @code{pushdef}, which replaces an existing definition of
2513 @var{name}, while saving the previous definition, before the new one is
2514 installed. If there is no previous definition, @code{pushdef} behaves
2515 exactly like @code{define}.
2517 If a macro has several definitions (of which only one is accessible),
2518 the topmost definition can be removed with @code{popdef}. If there is
2519 no previous definition, @code{popdef} behaves like @code{undefine}, and
2520 if there is no definition at all, the @samp{d} debug level controls
2521 whether a warning is issued (@pxref{Debugmode}).
2523 The expansion of both @code{pushdef} and @code{popdef} is void.
2524 The macros @code{pushdef} and @code{popdef} are recognized only with
2529 define(`foo', `Expansion one.')
2532 @result{}Expansion one.
2533 pushdef(`foo', `Expansion two.')
2536 @result{}Expansion two.
2537 pushdef(`foo', `Expansion three.')
2539 pushdef(`foo', `Expansion four.')
2544 @result{}Expansion three.
2545 popdef(`foo', `foo')
2548 @result{}Expansion one.
2555 If a macro with several definitions is redefined with @code{define}, the
2556 topmost definition is @emph{replaced} with the new definition. If it is
2557 removed with @code{undefine}, @emph{all} the definitions are removed,
2558 and not only the topmost one. However, @acronym{POSIX} allows other
2559 implementations that treat @code{define} as replacing an entire stack
2560 of definitions with a single new definition, so to be portable to other
2561 implementations, it may be worth explicitly using @code{popdef} and
2562 @code{pushdef} rather than relying on the @acronym{GNU} behavior of
2566 define(`foo', `Expansion one.')
2569 @result{}Expansion one.
2570 pushdef(`foo', `Expansion two.')
2573 @result{}Expansion two.
2574 define(`foo', `Second expansion two.')
2577 @result{}Second expansion two.
2584 @cindex local variables
2585 @cindex variables, local
2586 Local variables within macros are made with @code{pushdef} and
2587 @code{popdef}. At the start of the macro a new definition is pushed,
2588 within the macro it is manipulated and at the end it is popped,
2589 revealing the former definition.
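
A small sketch of this idiom, assuming a hypothetical macro
@code{with_local} that shadows an existing macro @code{x} only for the
duration of its own expansion:

@example
define(`x', `global')dnl
define(`with_local', `pushdef(`x', `local')x`'popdef(`x')')dnl
with_local x
@result{}local global
@end example

The empty quoted string after @code{x} in the definition merely
separates the name from the following @code{popdef}.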
2591 It is possible to temporarily redefine a builtin with @code{pushdef}
2594 As of M4 1.6, @code{popdef} can warn if @var{name} is not a macro, by
2595 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2596 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2605 @error{}m4:stdin:3: Warning: popdef: undefined macro `a'
2614 @section Renaming macros with regular expressions
2616 @cindex regular expressions
2617 @cindex macros, how to rename
2618 @cindex renaming macros
2619 @cindex @acronym{GNU} extensions
2620 Sometimes it is desirable to rename multiple symbols without having to
2621 use a long sequence of calls to @code{define}. The @code{renamesyms}
2622 builtin allows this:
2624 @deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @
2626 Global renaming of macros is done by @code{renamesyms}, which selects
2627 all macros with names that match @var{regexp}, and renames each match
2628 according to @var{replacement}. It is unspecified what happens if the
2629 rename causes multiple macros to map to the same name.
2630 @comment FIXME - right now, collisions cause a core dump on some platforms:
2631 @comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa')
2633 If @var{resyntax} is given, the particular flavor of regular
2634 expression understood with respect to @var{regexp} can be changed from
2635 the current default. @xref{Changeresyntax}, for details of the values
2636 that can be given for this argument.
2638 A macro that does not have a name that matches @var{regexp} is left
2639 with its original name. If only part of the name matches, any part of
2640 the name that is not covered by @var{regexp} is copied to the
2641 replacement name. Whenever a match is found in the name, the search
2642 proceeds from the end of the match, so no character in the original
2643 name can be substituted twice. If @var{regexp} matches a string of
2644 zero length, the start position for the continued search is
2645 incremented to avoid infinite loops.
2647 Where a replacement is to be made, @var{replacement} replaces the
2648 matched text in the original name, with @samp{\@var{n}} substituted by
2649 the text matched by the @var{n}th parenthesized sub-expression of
2650 @var{regexp}, and @samp{\&} being the text matched by the entire
2653 The expansion of @code{renamesyms} is void.
2654 The macro @code{renamesyms} is recognized only with parameters.
2655 This macro was added in M4 2.0.
2658 The following example starts with a rename similar to the
2659 @option{--prefix-builtins} option (or @option{-P}), prefixing every
2660 macro with @code{m4_}. However, note that @option{-P} only renames M4
2661 builtin macros, even if other macros were defined previously, while
2662 @code{renamesyms} will rename any macros that match when it runs,
2663 including text macros. The rest of the example demonstrates the
2664 behavior of unanchored regular expressions in symbol renaming.
2666 @comment options: -Dfoo=bar -P
2668 $ @kbd{m4 -Dfoo=bar -P}
2679 define(`foo', `bar')
2681 renamesyms(`^.*$', `m4_\&')
2689 m4_renamesyms(`f', `g')
2691 m4_igdeg(`m4_goo', `m4_goo')
2695 If @var{resyntax} is given, @var{regexp} must be given according to
2696 the syntax chosen, though the default regular expression syntax
2697 remains unchanged for other invocations. Here is a more realistic
2698 example that performs a similar renaming on macros, except that it
2699 ignores macros with names that begin with @samp{_}, and avoids creating
2700 macros with names that begin with @samp{m4_m4}.
2703 renamesyms(`^[^_]\w*$', `m4_\&')
2705 m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED')
2714 When a symbol has multiple definitions, thanks to @code{pushdef}, the
2715 entire stack is renamed.
2718 pushdef(`foo', `1')pushdef(`foo', `2')
2720 renamesyms(`^foo$', `bar')
2731 @section Indirect call of macros
2733 @cindex indirect call of macros
2734 @cindex call of macros, indirect
2735 @cindex macros, indirect call of
2736 @cindex @acronym{GNU} extensions
2737 Any macro can be called indirectly with @code{indir}:
2739 @deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}})
2740 Results in a call to the macro @var{name}, which is passed the rest of
2741 the arguments @var{args}. If @var{name} is not defined, the expansion
2742 is void, and the @samp{d} debug level controls whether a warning is
2743 issued (@pxref{Debugmode}).
2745 The macro @code{indir} is recognized only with parameters.
2748 This can be used to call macros with computed or ``invalid''
2749 names (@code{define} allows such names to be defined):
2752 define(`$$internal$macro', `Internal macro (name `$0')')
2755 @result{}$$internal$macro
2756 indir(`$$internal$macro')
2757 @result{}Internal macro (name $$internal$macro)
2760 The point here is that larger macro packages can define private macros
2761 that will not be called by accident. They can @emph{only} be
2762 called through the builtin @code{indir}.
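
Computed names can be combined with such private names. In the
following sketch, the names @code{handler:start}, @code{handler:stop}
and @code{dispatch} are hypothetical; the colon keeps the handlers from
being invoked directly, and @code{dispatch} builds the name to pass to
@code{indir} from its argument:

@example
define(`handler:start', `starting')dnl
define(`handler:stop', `stopping')dnl
define(`dispatch', `indir(`handler:$1')')dnl
dispatch(`start')
@result{}starting
dispatch(`stop')
@result{}stopping
@end example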
2764 One other point to observe is that argument collection occurs before
2765 @code{indir} invokes @var{name}, so if argument collection changes the
2766 value of @var{name}, that will be reflected in the final expansion.
2767 This is different from the behavior when invoking macros directly,
2768 where the definition that was in effect before argument collection is
2777 indir(`f', define(`f', `3'))
2779 indir(`f', undefine(`f'))
2780 @error{}m4:stdin:4: Warning: indir: undefined macro `f'
2788 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2789 arguments, @code{indir} defers to the invoked @var{name} for whether a
2790 token representing a builtin is recognized or flattened to the empty
2795 indir(defn(`defn'), `divnum')
2796 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2798 indir(`define', defn(`defn'), `divnum')
2799 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2801 indir(`define', `foo', defn(`divnum'))
2805 indir(`divert', defn(`foo'))
2806 @error{}m4:stdin:5: Warning: divert: empty string treated as 0
2810 Warning messages issued on behalf of an indirect macro use an
2811 unambiguous representation of the macro name, using escape sequences
2812 similar to C strings, and with colons also quoted.
2816 odd', defn(`divnum'))
2820 @error{}m4:stdin:3: Warning: %%\:\\\nodd: extra arguments ignored: 1 > 0
2825 @section Indirect call of builtins
2827 @cindex indirect call of builtins
2828 @cindex call of builtins, indirect
2829 @cindex builtins, indirect call of
2830 @cindex @acronym{GNU} extensions
2831 Builtin macros can be called indirectly with @code{builtin}:
2833 @deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}})
2834 @deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1})
2835 Results in a call to the builtin @var{name}, which is passed the
2836 rest of the arguments @var{args}. If @var{name} does not name a
2837 builtin, the expansion is void, and the @samp{d} debug level controls
2838 whether a warning is issued (@pxref{Debugmode}).
2840 As a special case, if @var{name} is exactly the special token
2841 representing the @code{builtin} macro, as obtained by @code{defn}
2842 (@pxref{Defn}), then @var{args} must consist of a single @var{name1},
2843 and the expansion is the special token representing the builtin macro
2844 named by @var{name1}.
2846 The macro @code{builtin} is recognized only with parameters.
2849 This can be used even if @var{name} has been given another definition
2850 that has covered the original, or been undefined so that no macro
2851 maps to the builtin.
2854 pushdef(`define', `hidden')
2856 undefine(`undefine')
2858 define(`foo', `bar')
2862 builtin(`define', `foo', defn(`divnum'))
2866 builtin(`define', `foo', `BAR')
2871 @result{}undefine(foo)
2874 builtin(`undefine', `foo')
2880 The @var{name} argument only matches the original name of the builtin,
2881 even when the @option{--prefix-builtins} option (or @option{-P},
2882 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2883 from @code{indir}, which only tracks current macro names.
2885 @comment options: -P
2888 m4_builtin(`divnum')
2890 m4_builtin(`m4_divnum')
2891 @error{}m4:stdin:2: Warning: m4_builtin: undefined builtin `m4_divnum'
2894 @error{}m4:stdin:3: Warning: m4_indir: undefined macro `divnum'
2896 m4_indir(`m4_divnum')
2900 m4_builtin(`m4_divnum')
2904 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2905 without arguments, even when they normally require parameters to be
2906 recognized; but it will provoke a warning, and the expansion will behave
2907 as though empty strings had been passed as the required arguments.
2913 @error{}m4:stdin:2: Warning: builtin: undefined builtin `'
2916 @error{}m4:stdin:3: Warning: builtin: too few arguments: 0 < 1
2919 @error{}m4:stdin:4: Warning: builtin: undefined builtin `'
2921 builtin(`builtin', ``'
2923 @error{}m4:stdin:5: Warning: builtin: undefined builtin ``\'\n'
2926 @error{}m4:stdin:7: Warning: index: too few arguments: 0 < 2
2930 Normally, once a builtin macro is undefined, the only way to retrieve
2931 its functionality is by defining a new macro that expands to
2932 @code{builtin} under the hood. But this extra layer of expansion is
2933 slightly inefficient, not to mention the fact that it is not robust to
2934 changes in the current quoting scheme due to @code{changequote}
2935 (@pxref{Changequote}). On the other hand, defining a macro to the
2936 special token produced by @code{defn} (@pxref{Defn}) is very efficient,
2937 and avoids the need for quoting within the macro definition; but
2938 @code{defn} only works if the desired macro is already defined by some
2939 other name. So @code{builtin} provides a special case where it is
2940 possible to retrieve the same special token representing a builtin as
2941 what @code{defn} would provide, were the desired macro still defined.
2942 This feature is activated by passing @code{defn(`builtin')} as the first
2943 argument to @code{builtin}. Normally, passing a special token representing a
2944 macro as @var{name} results in a warning and an empty expansion, but in
2945 this case, if the second argument @var{name1} names a valid builtin,
2946 there is no warning and the expansion is the appropriate special
2947 token. In fact, with just the @code{builtin} macro accessible, it is
2948 possible to reconstitute the entire startup state of @code{m4}.
2950 In the example below, compare the number of macro invocations performed
2951 by @code{defn1} and @code{defn2}, and the differences once quoting is
2958 define(`foo', `bar')
2960 define(`defn1', `builtin(`defn', $@@)')
2962 define(`defn2', builtin(builtin(`defn', `builtin'), `defn'))
2964 dumpdef(`defn1', `defn2')
2965 @error{}defn1:@tabchar{}`builtin(`defn', $@@)'
2966 @error{}defn2:@tabchar{}<defn>
2971 @error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')'
2972 @error{}m4trace: -1- builtin(`defn', `foo') -> ``bar''
2975 @error{}m4trace: -1- defn2(`foo') -> ``bar''
2978 @error{}m4trace: -1- traceoff -> `'
2980 changequote(`[', `]')
2983 @error{}m4:stdin:11: Warning: builtin: undefined builtin ``defn\''
2987 define([defn1], [builtin([defn], $@@)])
2994 @error{}m4:stdin:16: Warning: builtin: undefined builtin `[defn]'
2999 @section Getting the defined macro names
3001 @cindex macro names, listing
3002 @cindex listing macro names
3003 @cindex currently defined macros
3004 @cindex @acronym{GNU} extensions
3005 The names of the currently defined macros can be accessed by
3008 @deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}})
3009 Without arguments, @code{m4symbols} expands to a sorted list of quoted
3010 strings, separated by commas. This contrasts with @code{dumpdef}
3011 (@pxref{Dumpdef}), whose output cannot be accessed by @code{m4}
3014 When given arguments, @code{m4symbols} returns the sorted subset of the
3015 @var{names} currently defined, and silently ignores the rest.
3016 This macro was added in M4 2.0.
3020 m4symbols(`ifndef', `ifdef', `define', `undef')
3021 @result{}define,ifdef
3025 @chapter Conditionals, loops, and recursion
3027 Macros, expanding to plain text, perhaps with arguments, are not quite
3028 enough. We would like to have macros expand to different things, based
3029 on decisions taken at run-time. For that, we need some kind of conditionals.
3030 Also, we would like to have some kind of loop construct, so we could do
3031 something a number of times, or while some condition is true.
3034 * Ifdef:: Testing if a macro is defined
3035 * Ifelse:: If-else construct, or multibranch
3036 * Shift:: Recursion in @code{m4}
3037 * Forloop:: Iteration by counting
3038 * Foreach:: Iteration by list contents
3039 * Stacks:: Working with definition stacks
3040 * Composition:: Building macros with macros
3044 @section Testing if a macro is defined
3046 @cindex conditionals
3047 There are two different builtin conditionals in @code{m4}. The first is
3050 @deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2})
3051 If @var{name} is defined as a macro, @code{ifdef} expands to
3052 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
3053 omitted, it is taken to be the empty string (according to the normal
3056 The macro @code{ifdef} is recognized only with parameters.
3060 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3061 @result{}foo is not defined
3064 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3065 @result{}foo is defined
3066 ifdef(`no_such_macro', `yes', `no', `extra argument')
3067 @error{}m4:stdin:4: Warning: ifdef: extra arguments ignored: 4 > 3
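A common use of @code{ifdef} is to supply a default definition only when
none has been provided yet. The sketch below assumes a hypothetical
macro name @code{prefix}; the values are likewise illustrative.

@example
dnl prefix is a hypothetical macro name, defined only if not yet set.
ifdef(`prefix', `', `define(`prefix', `/usr/local')')dnl
prefix/bin
@result{}/usr/local/bin
define(`prefix', `/opt')dnl
ifdef(`prefix', `', `define(`prefix', `/usr/local')')dnl
prefix/bin
@result{}/opt/bin
@end example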
3071 As of M4 1.6, @code{ifdef} transparently handles builtin tokens
3072 generated by @code{defn} (@pxref{Defn}) that occur in either
3073 @var{string}, although a warning is issued for invalid macro names.
3078 ifdef(defn(`defn'), `yes', `no')
3079 @error{}m4:stdin:2: Warning: ifdef: invalid macro name ignored
3081 define(`foo', ifdef(`divnum', defn(`divnum'), `undefined'))
3088 @section If-else construct, or multibranch
3090 @cindex comparing strings
3091 @cindex discarding input
3092 @cindex input, discarding
3093 The other conditional, @code{ifelse}, is much more powerful. It can be
3094 used as a way to introduce a long comment, as an if-else construct, or
3095 as a multibranch, depending on the number of arguments supplied:
3097 @deffn {Builtin (m4)} ifelse (@var{comment})
3098 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
3100 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
3101 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
3102 Used with only one argument, @code{ifelse} simply discards it and
3105 If called with three or four arguments, @code{ifelse} expands into
3106 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
3107 for character), otherwise it expands to @var{not-equal}. A final fifth
3108 argument is ignored, after triggering a warning.
3110 If called with six or more arguments, and @var{string-1} and
3111 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
3112 otherwise the first three arguments are discarded and the processing
3115 The macro @code{ifelse} is recognized only with parameters.
3118 Using only one argument is a common @code{m4} idiom for introducing a
3119 block comment, as an alternative to repeatedly using @code{dnl}. This
3120 special usage is recognized by @acronym{GNU} @code{m4}, so that in this
3121 case, the warning about missing arguments is never triggered.
3124 ifelse(`some comments')
3126 ifelse(`foo', `bar')
3127 @error{}m4:stdin:2: Warning: ifelse: too few arguments: 2 < 3
3131 Using three or four arguments provides decision points.
3134 ifelse(`foo', `bar', `true')
3136 ifelse(`foo', `foo', `true')
3138 define(`foo', `bar')
3140 ifelse(foo, `bar', `true', `false')
3142 ifelse(foo, `foo', `true', `false')
3146 @cindex macro, blind
3148 Notice how the first argument was used unquoted; it is common to compare
3149 the expansion of a macro with a string. With this macro, you can now
3150 reproduce the behavior of blind builtins, where the macro is recognized
3151 only with arguments.
3154 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
3159 @result{}arguments:1
3161 @result{}arguments:3
3164 For an example of a way to make defining blind macros easier, see
3167 @cindex multibranches
3168 @cindex switch statement
3169 @cindex case statement
3170 The macro @code{ifelse} can take more than four arguments. If given more
3171 than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
3172 statement in traditional programming languages. If @var{string-1} and
3173 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
3174 the procedure is repeated with the first three arguments discarded. This
3175 calls for an example:
3178 ifelse(`foo', `bar', `third', `gnu', `gnats')
3179 @error{}m4:stdin:1: Warning: ifelse: extra arguments ignored: 5 > 4
3181 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
3183 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
3185 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
3186 @error{}m4:stdin:4: Warning: ifelse: extra arguments ignored: 8 > 7
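In practice, the multibranch form is often wrapped in a macro so that
the same value is compared against each case in turn. The following is
only a sketch; @code{color_code} is a hypothetical name, and the final
argument acts as the default branch.

@example
dnl color_code is a hypothetical macro mapping a name to a number.
define(`color_code', `ifelse(`$1', `red', `1',
                             `$1', `green', `2',
                             `$1', `blue', `3',
                             `0')')dnl
color_code(`green')
@result{}2
color_code(`mauve')
@result{}0
@end example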
3190 As of M4 1.6, @code{ifelse} transparently handles builtin tokens
3191 generated by @code{defn} (@pxref{Defn}). Because of this, it is always
3192 safe to compare two macro definitions, without worrying whether the
3193 macro might be a builtin.
3196 ifelse(defn(`defn'), `', `yes', `no')
3198 ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
3200 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
3202 define(`foo', ifelse(`', `', defn(`divnum')))
3208 Naturally, the normal case will be slightly more advanced than these
3209 examples. A common use of @code{ifelse} is in macros implementing loops
3213 @section Recursion in @code{m4}
3215 @cindex recursive macros
3216 @cindex macros, recursive
3217 There is no direct support for loops in @code{m4}, but macros can be
3218 recursive. There is no limit on the number of recursion levels, other
3219 than those enforced by your hardware and operating system.
3222 Loops can be programmed using recursion and the conditionals described
3225 There is a builtin macro, @code{shift}, which can, among other things,
3226 be used for iterating through the actual arguments to a macro:
3228 @deffn {Builtin (m4)} shift (@var{arg1}, @dots{})
3229 Takes any number of arguments, and expands to all its arguments except
3230 @var{arg1}, separated by commas, with each argument quoted.
3232 The macro @code{shift} is recognized only with parameters.
3240 shift(`foo', `bar', `baz')
3244 An example of the use of @code{shift} is this macro:
3246 @cindex reversing arguments
3247 @cindex arguments, reversing
3248 @deffn Composite reverse (@dots{})
3249 Takes any number of arguments, and reverses their order.
3252 It is implemented as:
3255 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3256 `reverse(shift($@@)), `$1'')')
3262 reverse(`foo', `bar', `gnats', `and gnus')
3263 @result{}and gnus, gnats, bar, foo
3266 While not a very interesting macro, it does show how simple loops can be
3267 made with @code{shift}, @code{ifelse} and recursion. It also shows
3268 that @code{shift} is usually used with @samp{$@@}. Another example of
3269 this is an implementation of a short-circuiting conditional operator.
3271 @cindex short-circuiting conditional
3272 @cindex conditional, short-circuiting
3273 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
3274 @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
3275 Similar to @code{ifelse}, where an equal comparison between the first
3276 two strings results in the third, otherwise the first three arguments
3277 are discarded and the process repeats. The difference is that each
3278 @var{test-<n>} is expanded only when it is encountered. This means that
3279 every third argument to @code{cond} is normally given one more level of
3280 quoting than the corresponding argument to @code{ifelse}.
3283 Here is the implementation of @code{cond}, along with a demonstration of
3284 how it can short-circuit the side effects in @code{side}. Notice how
3285 all the unquoted side effects happen regardless of how many comparisons
3286 are made with @code{ifelse}, compared with only the relevant effects
3291 `ifelse(`$#', `1', `$1',
3292 `ifelse($1, `$2', `$3',
3293 `$0(shift(shift(shift($@@))))')')')dnl
3294 define(`side', `define(`counter', incr(counter))$1')dnl
3296 `define(`counter', `0')dnl
3297 ifelse(side(`$1'), `yes', `one comparison: ',
3298 side(`$1'), `no', `two comparisons: ',
3299 side(`$1'), `maybe', `three comparisons: ',
3300 `side(`default answer: ')')counter')dnl
3302 `define(`counter', `0')dnl
3303 cond(`side(`$1')', `yes', `one comparison: ',
3304 `side(`$1')', `no', `two comparisons: ',
3305 `side(`$1')', `maybe', `three comparisons: ',
3306 `side(`default answer: ')')counter')dnl
3308 @result{}one comparison: 3
3310 @result{}two comparisons: 3
3312 @result{}three comparisons: 3
3313 example1(`feeling rather indecisive today')
3314 @result{}default answer: 4
3316 @result{}one comparison: 1
3318 @result{}two comparisons: 2
3320 @result{}three comparisons: 3
3321 example2(`feeling rather indecisive today')
3322 @result{}default answer: 4
3325 @cindex joining arguments
3326 @cindex arguments, joining
3327 @cindex concatenating arguments
3328 Another common task that requires iteration is joining a list of
3329 arguments into a single string.
3331 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3332 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3333 Generate a single-quoted string, consisting of each @var{arg} separated
3334 by @var{separator}. While @code{joinall} always outputs a
3335 @var{separator} between arguments, @code{join} avoids the
3336 @var{separator} for an empty @var{arg}.
3339 Here are some examples of their usage, based on the implementation
3340 @file{m4-@value{VERSION}/@/examples/@/join.m4} distributed in this
3345 $ @kbd{m4 -I examples}
3348 join,join(`-'),join(`-', `'),join(`-', `', `')
3350 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3354 join(`-', `1', `2', `3')
3356 join(`', `1', `2', `3')
3358 join(`-', `', `1', `', `', `2', `')
3360 joinall(`-', `', `1', `', `', `2', `')
3362 join(`,', `1', `2', `3')
3364 define(`nargs', `$#')dnl
3365 nargs(join(`,', `1', `2', `3'))
3369 Examining the implementation shows some interesting points about several
3370 @code{m4} programming idioms.
3374 $ @kbd{m4 -I examples}
3375 undivert(`join.m4')dnl
3376 @result{}divert(`-1')
3377 @result{}# join(sep, args) - join each non-empty ARG into a single
3378 @result{}# string, with each element separated by SEP
3379 @result{}define(`join',
3380 @result{}`ifelse(`$#', `2', ``$2'',
3381 @result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3382 @result{}define(`_join',
3383 @result{}`ifelse(`$#$2', `2', `',
3384 @result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3385 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3386 @result{}# into a single string, with each element separated by SEP
3387 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3388 @result{}define(`_joinall',
3389 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3390 @result{}divert`'dnl
3393 First, notice that this implementation creates helper macros
3394 @code{_join} and @code{_joinall}. This division of labor makes it
3395 easier to output the correct number of @var{separator} instances:
3396 @code{join} and @code{joinall} are responsible for the first argument,
3397 without a separator, while @code{_join} and @code{_joinall} are
3398 responsible for all remaining arguments, always outputting a separator
3399 when outputting an argument.
3401 Next, observe how @code{join} decides to iterate to itself, because the
3402 first @var{arg} was empty, or to output the argument and swap over to
3403 @code{_join}. If the argument is non-empty, then the nested
3404 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3405 with the @samp{$0} to form the next macro name to invoke. The
3406 @code{joinall} implementation is simpler since it does not have to
3407 suppress empty @var{arg}; it always executes once then defers to
3410 Another important idiom is the idea that @var{separator} is reused for
3411 each iteration. Each iteration has one less argument, but rather than
3412 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3413 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
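The same idiom can be tried in isolation. The sketch below uses a
hypothetical macro @code{prefix_each}, which keeps @samp{$1} on every
iteration and consumes @samp{$2}, just as described above.

@example
dnl prefix_each is a hypothetical macro: prepend $1 to each later argument.
define(`prefix_each', `ifelse(`$#', `1', `', `$#', `2', ``$1$2'',
  ``$1$2', $0(`$1', shift(shift($@@)))')')dnl
prefix_each(`-', `a', `b', `c')
@result{}-a, -b, -c
@end example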
3415 Next, notice that it is possible to compare more than one condition in a
3416 single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
3417 allows @code{_join} to iterate for two separate reasons---either there
3418 are still more than two arguments, or there are exactly two arguments
3419 but the last argument is not empty.
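To see why the combined test works, note that @samp{$#$2} can equal
@samp{2} only when there are exactly two arguments and the second one
is empty. A small sketch, using the hypothetical name
@code{only_empty_second}:

@example
dnl only_empty_second is a hypothetical name for this illustration.
define(`only_empty_second', `ifelse(`$#$2', `2', `yes', `no')')dnl
only_empty_second(`a', `')
@result{}yes
only_empty_second(`a', `b')
@result{}no
only_empty_second(`a', `', `c')
@result{}no
@end example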
3421 Finally, notice that these macros require exactly two arguments to
3422 terminate recursion, but that they still correctly result in empty
3423 output when given no @var{args} (i.e., zero or one macro argument). On
3424 the first pass when there are too few arguments, the @code{shift}
3425 results in no output, but leaves an empty string to serve as the
3426 required second argument for the second pass. Put another way,
3427 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3428 former guarantees at least two arguments.
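The difference can be observed by counting arguments. In this sketch,
@code{nargs} is the argument counter used earlier, while @code{direct}
and @code{padded} are hypothetical names for the two forwarding styles.

@example
dnl direct and padded are hypothetical helper names.
define(`nargs', `$#')dnl
define(`direct', `nargs($@@)')dnl
define(`padded', `nargs(`$1', shift($@@))')dnl
direct(`a'), padded(`a')
@result{}1, 2
direct(`a', `b', `c'), padded(`a', `b', `c')
@result{}3, 3
@end example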
3430 @cindex quote manipulation
3431 @cindex manipulating quotes
3432 Sometimes, a recursive algorithm requires adding quotes to each element,
3433 or treating multiple arguments as a single element:
3435 @deffn Composite quote (@dots{})
3436 @deffnx Composite dquote (@dots{})
3437 @deffnx Composite dquote_elt (@dots{})
3438 Takes any number of arguments, and adds quoting. With @code{quote},
3439 only one level of quoting is added, effectively removing whitespace
3440 after commas and turning multiple arguments into a single string. With
3441 @code{dquote}, two levels of quoting are added, one around each element,
3442 and one around the list. And with @code{dquote_elt}, two levels of
3443 quoting are added around each element.
3446 An actual implementation of these three macros is distributed as
3447 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
3448 let's examine their usage:
3452 $ @kbd{m4 -I examples}
3455 -quote-dquote-dquote_elt-
3457 -quote()-dquote()-dquote_elt()-
3459 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3460 @result{}-1-`1'-`1'-
3461 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3462 @result{}-1,2-`1',`2'-`1',`2'-
3463 define(`n', `$#')dnl
3464 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3466 dquote(dquote_elt(`1', `2'))
3467 @result{}``1'',``2''
3468 dquote_elt(dquote(`1', `2'))
3472 The last two lines show that when given two arguments, @code{dquote}
3473 results in one string, while @code{dquote_elt} results in two. Now,
3474 examine the implementation. Note that @code{quote} and
3475 @code{dquote_elt} make decisions based on their number of arguments, so
3476 that when called without arguments, they result in nothing instead of a
3477 quoted empty string; this is so that it is possible to distinguish
3478 between no arguments and an empty first argument. @code{dquote}, on the
3479 other hand, results in a string no matter what, since it is still
3480 possible to tell whether it was invoked without arguments based on the
3485 $ @kbd{m4 -I examples}
3486 undivert(`quote.m4')dnl
3487 @result{}divert(`-1')
3488 @result{}# quote(args) - convert args to single-quoted string
3489 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3490 @result{}# dquote(args) - convert args to quoted list of quoted strings
3491 @result{}define(`dquote', ``$@@'')
3492 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3493 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3494 @result{} ```$1'',$0(shift($@@))')')
3495 @result{}divert`'dnl
3498 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3499 than @samp{joinall(`,', @var{args})} for producing the same output.
3501 @cindex nine arguments, more than
3502 @cindex more than nine arguments
3503 @cindex arguments, more than nine
3504 One more useful macro based on @code{shift} allows portably selecting
3505 an arbitrary argument (usually greater than the ninth argument), without
3506 relying on the @acronym{GNU} extension of multi-digit arguments
3507 (@pxref{Arguments}).
3509 @deffn Composite argn (@var{n}, @dots{})
3510 Expands to argument @var{n} out of the remaining arguments. @var{n}
3511 must be a positive number. Usually invoked as
3512 @samp{argn(`@var{n}',$@@)}.
3515 It is implemented as:
3518 define(`argn', `ifelse(`$1', 1, ``$2'',
3519 `argn(decr(`$1'), shift(shift($@@)))')')
3523 define(`foo', `argn(`11', $@@)')
3525 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3530 @section Iteration by counting
3533 @cindex loops, counting
3534 @cindex counting loops
3535 Here is an example of a loop macro that implements a simple for loop.
3537 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3538 Takes the name in @var{iterator}, which must be a valid macro name, and
3539 successively assigns it each integer value from @var{start} to @var{end},
3540 inclusive. For each assignment to @var{iterator}, appends @var{text} to
3541 the expansion of the @code{forloop}. @var{text} may refer to
3542 @var{iterator}. Any definition of @var{iterator} prior to this
3543 invocation is restored.
3546 It can, for example, be used for simple counting:
3550 $ @kbd{m4 -I examples}
3551 include(`forloop.m4')
3553 forloop(`i', `1', `8', `i ')
3554 @result{}1 2 3 4 5 6 7 8@w{ }
3557 For-loops can be nested, like:
3561 $ @kbd{m4 -I examples}
3562 include(`forloop.m4')
3564 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3566 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3567 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3568 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3569 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3573 The implementation of the @code{forloop} macro is fairly
3574 straightforward. The @code{forloop} macro itself is simply a wrapper,
3575 which saves the previous definition of the first argument, calls the
3576 internal macro @code{@w{_forloop}}, and re-establishes the saved
3577 definition of the first argument.
3579 The macro @code{@w{_forloop}} expands the fourth argument once, and
3580 tests to see if the iterator has reached the final value. If it has
3581 not finished, it increments the iterator (using the predefined macro
3582 @code{incr}, @pxref{Incr}), and recurses.
3584 Here is an actual implementation of @code{forloop}, distributed as
3585 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
3589 $ @kbd{m4 -I examples}
3590 undivert(`forloop.m4')dnl
3591 @result{}divert(`-1')
3592 @result{}# forloop(var, from, to, stmt) - simple version
3593 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3594 @result{}define(`_forloop',
3595 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3596 @result{}divert`'dnl
3599 Notice the careful use of quotes. Certain macro arguments are left
3600 unquoted, each for its own reason. Try to find out @emph{why} these
3601 arguments are left unquoted, and see what happens if they are quoted.
3602 (As presented, these two macros are useful but not very robust for
3603 general use. They lack even basic error handling for cases like
3604 @var{start} greater than @var{end}, @var{end} not numeric, or
3605 @var{iterator} not being a macro name. See if you can improve these
3606 macros; or @pxref{Improved forloop, , Answers}).
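As one possible starting point, and not the answer given elsewhere in
this manual, a wrapper can refuse to iterate when the bounds are out of
order. This sketch repeats the simple @code{forloop} definitions so that
it is self-contained; @code{checked_forloop} is a hypothetical name, and
the check assumes both bounds are valid @code{eval} expressions.

@example
dnl forloop and _forloop are repeated from the simple version;
dnl checked_forloop is a hypothetical wrapper name.
define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')dnl
define(`_forloop',
  `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')dnl
define(`checked_forloop',
  `ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')dnl
checked_forloop(`i', `1', `3', `i ')
@result{}1 2 3@w{ }
checked_forloop(`i', `3', `1', `i ')
@result{}
@end example

The wrapper only guards the ordering of the bounds; non-numeric bounds
or an invalid @var{iterator} name would still need separate handling.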
3609 @section Iteration by list contents
3611 @cindex for each loops
3612 @cindex loops, list iteration
3613 @cindex iterating over lists
3614 Here is an example of a loop macro that implements list iteration.
3616 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3617 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3618 Takes the name in @var{iterator}, which must be a valid macro name, and
3619 successively assigns it each value from @var{paren-list} or
3620 @var{quote-list}. In @code{foreach}, @var{paren-list} is a
3621 comma-separated list of elements contained in parentheses. In
3622 @code{foreachq}, @var{quote-list} is a comma-separated list of elements
3623 contained in a quoted string. For each assignment to @var{iterator},
3624 appends @var{text} to the overall expansion. @var{text} may refer to
3625 @var{iterator}. Any definition of @var{iterator} prior to this
3626 invocation is restored.
3629 As an example, this displays each word in a list inside of a sentence,
3630 using an implementation of @code{foreach} distributed as
3631 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
3632 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
3636 $ @kbd{m4 -I examples}
3637 include(`foreach.m4')
3639 foreach(`x', (foo, bar, foobar), `Word was: x
3641 @result{}Word was: foo
3642 @result{}Word was: bar
3643 @result{}Word was: foobar
3644 include(`foreachq.m4')
3646 foreachq(`x', `foo, bar, foobar', `Word was: x
3648 @result{}Word was: foo
3649 @result{}Word was: bar
3650 @result{}Word was: foobar
3653 It is possible to be more complex; each element of the @var{paren-list}
3654 or @var{quote-list} can itself be a list, to pass as further arguments
3655 to a helper macro. This example generates a shell case statement:
3659 $ @kbd{m4 -I examples}
3660 include(`foreach.m4')
3662 define(`_case', ` $1)
3665 define(`_cat', `$1$2')dnl
3668 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3669 `_cat(`_case', x)')dnl
3671 @result{} vara=" a";;
3673 @result{} varb=" b";;
3675 @result{} varc=" c";;
3680 The implementation of the @code{foreach} macro is a bit more involved;
3681 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
3682 needed to grab the first element of a list. Second,
3683 @code{@w{_foreach}} implements the recursion, successively walking
3684 through the original list. Here is a simple implementation of
3689 $ @kbd{m4 -I examples}
3690 undivert(`foreach.m4')dnl
3691 @result{}divert(`-1')
3692 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3693 @result{}# parenthesized list, simple version
3694 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3695 @result{}define(`_arg1', `$1')
3696 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3697 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3698 @result{}divert`'dnl
3701 Unfortunately, that implementation is not robust to macro names as list
3702 elements. Each iteration of @code{@w{_foreach}} is stripping another
3703 layer of quotes, leading to erratic results if list elements are not
3704 already fully expanded. The first cut at implementing @code{foreachq}
3705 takes this into account. Also, when using quoted elements in a
3706 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
3707 has the nice property of requiring fewer characters to create a list
3708 containing the same quoted elements. To see the difference between the
3709 two macros, we attempt to pass double-quoted macro names in a list,
3710 expecting the macro name on output after one layer of quotes is removed
3711 during list iteration and the final layer removed during the final
3716 $ @kbd{m4 -I examples}
3717 define(`a', `1')define(`b', `2')define(`c', `3')
3719 include(`foreach.m4')
3721 include(`foreachq.m4')
3723 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3730 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3737 Obviously, @code{foreachq} did a better job; here is its implementation:
3741 $ @kbd{m4 -I examples}
3742 undivert(`foreachq.m4')dnl
3743 @result{}include(`quote.m4')dnl
3744 @result{}divert(`-1')
3745 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3746 @result{}# quoted list, simple version
3747 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3748 @result{}define(`_arg1', `$1')
3749 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3750 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3751 @result{}divert`'dnl
3754 Notice that @code{@w{_foreachq}} had to use the helper macro
3755 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3756 embedded @code{ifelse} call does not go haywire if a list element
3757 contains a comma. Unfortunately, this implementation of @code{foreachq}
3758 has its own severe flaw. Whereas the @code{foreach} implementation was
3759 linear, this macro is quadratic in the number of list elements, and is
3760 much more likely to trip up the limit set by the command line option
3761 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3762 Invoking m4}). Additionally, this implementation does not expand
3763 @samp{defn(`@var{iterator}')} very well, when compared with
3768 $ @kbd{m4 -I examples}
3769 include(`foreach.m4')include(`foreachq.m4')
3771 foreach(`name', `(`a', `b')', ` defn(`name')')
3773 foreachq(`name', ``a', `b'', ` defn(`name')')
3774 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3777 It is possible to have robust iteration with linear behavior and sane
3778 @var{iterator} contents for either list style. See if you can learn
3779 from the best elements of both of these implementations to create robust
3780 macros (or @pxref{Improved foreach, , Answers}).
3783 @section Working with definition stacks
3785 @cindex definition stack
3786 @cindex pushdef stack
3787 @cindex stack, macro definition
3788 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3789 operation in @code{m4}. Normally, only the topmost definition in a
3790 stack is important, but sometimes, it is desirable to manipulate the
3791 entire definition stack.
3793 @deffn Composite stack_foreach (@var{macro}, @var{action})
3794 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3795 For each of the @code{pushdef} definitions associated with @var{macro},
3796 invoke the macro @var{action} with a single argument of that definition.
3797 @code{stack_foreach} visits the oldest definition first, while
3798 @code{stack_foreach_lifo} visits the current definition first.
3799 @var{action} should not modify or dereference @var{macro}. There are a
3800 few special macros, such as @code{defn}, which cannot be used as the
3801 @var{macro} parameter.
3804 A sample implementation of these macros is distributed in the file
3805 @file{m4-@value{VERSION}/@/examples/@/stack.m4}.
3809 $ @kbd{m4 -I examples}
3812 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3814 define(`show', ``$1'
3817 stack_foreach(`a', `show')dnl
3821 stack_foreach_lifo(`a', `show')dnl
3827 Now for the implementation. Note the definition of a helper macro,
3828 @code{_stack_reverse}, which destructively swaps the contents of one
3829 stack of definitions into the reverse order in the temporary macro
3830 @samp{tmp-$1}. By calling the helper twice, the original order is
3831 restored back into the macro @samp{$1}; since the operation is
3832 destructive, this explains why @samp{$1} must not be modified or
3833 dereferenced during the traversal. The caller can then inject
3834 additional code to pass the definition currently being visited to
3835 @samp{$2}. The choice of helper names is intentional; since @samp{-} is
3836 not valid as part of a macro name, there is no risk of conflict with a
3837 valid macro name, and the code is guaranteed to use @code{defn} where
3838 necessary. Finally, note that any macro used in the traversal of a
3839 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3840 handled by @code{stack_foreach}, since the macro would temporarily be
3841 undefined during the algorithm.
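
To illustrate the claim that calling the helper twice restores the
original order, here is a small sketch.  The stack contents and the
@code{show} macro are arbitrary choices made for this illustration.
After the traversal, @code{a} still holds both definitions, with the
newest on top.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`stack.m4')dnl
pushdef(`a', `1')pushdef(`a', `2')dnl
define(`show', `[$1]')dnl
stack_foreach(`a', `show')
@result{}[1][2]
a popdef(`a')a
@result{}2 1
@end example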
3845 $ @kbd{m4 -I examples}
3846 undivert(`stack.m4')dnl
3847 @result{}divert(`-1')
3848 @result{}# stack_foreach(macro, action)
3849 @result{}# Invoke ACTION with a single argument of each definition
3850 @result{}# from the definition stack of MACRO, starting with the oldest.
3851 @result{}define(`stack_foreach',
3852 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3853 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3854 @result{}# stack_foreach_lifo(macro, action)
3855 @result{}# Invoke ACTION with a single argument of each definition
3856 @result{}# from the definition stack of MACRO, starting with the newest.
3857 @result{}define(`stack_foreach_lifo',
3858 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3859 @result{}`_stack_reverse(`tmp-$1', `$1')')
3860 @result{}define(`_stack_reverse',
3861 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3862 @result{}divert`'dnl
3866 @section Building macros with macros
3868 @cindex macro composition
3869 @cindex composing macros
3870 Since @code{m4} is a macro language, it is possible to write macros that
3871 can build other macros. First on the list is a way to automate the
3872 creation of blind macros.
3874 @cindex macro, blind
3876 @deffn Composite define_blind (@var{name}, @ovar{value})
3877 Defines @var{name} as a blind macro, such that @var{name} will expand to
3878 @var{value} only when given explicit arguments. @var{value} should not
3879 be the result of @code{defn} (@pxref{Defn}). This macro is only
3880 recognized with parameters, and results in an empty string.
3883 Defining a macro to define another macro can be a bit tricky. We want
3884 to use a literal @samp{$#} in the argument to the nested @code{define}.
3885 However, if @samp{$} and @samp{#} are adjacent in the definition of
3886 @code{define_blind}, then it would be expanded as the number of
3887 arguments to @code{define_blind} rather than the intended number of
3888 arguments to @var{name}. The solution is to pass the difficult
3889 characters through extra arguments to a helper macro
3890 @code{_define_blind}. When composing macros, it is a common idiom to
3891 need a helper macro to concatenate text that forms parameters in the
3892 composed macro, rather than interpreting the text as a parameter of the
3895 As for the limitation against using @code{defn}, there are two reasons.
3896 If a macro was previously defined with @code{define_blind}, then it can
3897 safely be renamed to a new blind macro using plain @code{define}; using
3898 @code{define_blind} to rename it just adds another layer of
3899 @code{ifelse}, occupying memory and slowing down execution. And if a
3900 macro is a builtin, then it would result in an attempt to define a macro
3901 consisting of both text and a builtin token; this is not supported, and
3902 the builtin token is flattened to an empty string.
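
The effect of an adjacent @samp{$#} can be seen in isolation with a
pair of throwaway macros.  The names @code{bad}, @code{good}, and
@code{_good} are hypothetical and exist only for this sketch.  In
@code{bad}, the @samp{$#} is consumed when @code{bad} itself expands, so
the macro it defines always reports the single argument that was passed
to @code{bad}.  In @code{good}, the two characters are quoted separately
and only joined by the helper, so the defined macro counts its own
arguments.

@example
define(`bad', `define(`$1', `$# args')')dnl
define(`good', `_good(`$1', `$'`#')')dnl
define(`_good', `define(`$1', `$2 args')')dnl
bad(`foo')good(`bar')dnl
foo(`x', `y')
@result{}1 args
bar(`x', `y')
@result{}2 args
@end example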
3904 With that explanation, here's the definition, and some sample usage.
3905 Notice that @code{define_blind} is itself a blind macro.
3909 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3910 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3912 define(`_define_blind', `define(`$1',
3913 `ifelse(`$3', `0', ``$4'', `$2')')')
3916 @result{}define_blind
3917 define_blind(`foo', `arguments were $*')
3922 @result{}arguments were bar
3923 define(`blah', defn(`foo'))
3928 @result{}arguments were a,b
3930 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
3933 @cindex currying arguments
3934 @cindex argument currying
3935 Another interesting composition tactic is argument @dfn{currying}, or
3936 factoring a macro that takes multiple arguments for use in a context
3937 that provides exactly one argument.
3939 @deffn Composite curry (@var{macro}, @dots{})
3940 Expand to a macro call that takes exactly one argument, then appends
3941 that argument to the original arguments and invokes @var{macro} with the
3942 resulting list of arguments.
3945 A demonstration of currying makes the intent of this macro a little more
3946 obvious. The macro @code{stack_foreach} mentioned earlier is an example
3947 of a context that provides exactly one argument to a macro name. But
3948 coupled with currying, we can invoke @code{reverse} with two arguments
3949 for each definition of a macro stack. This example uses the file
3950 @file{m4-@value{VERSION}/@/examples/@/curry.m4} included in the
3955 $ @kbd{m4 -I examples}
3956 include(`curry.m4')include(`stack.m4')
3958 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3959 `reverse(shift($@@)), `$1'')')
3961 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3963 stack_foreach(`a', `:curry(`reverse', `4')')
3964 @result{}:1, 4:2, 4:3, 4
3965 curry(`curry', `reverse', `1')(`2')(`3')
3969 Now for the implementation. Notice how @code{curry} leaves off with a
3970 macro name but no open parenthesis, while still in the middle of
3971 collecting arguments for @samp{$1}. The macro @code{_curry} is the
3972 helper macro that takes one argument, then adds it to the list and
3973 finally supplies the closing parenthesis. The use of a comma inside the
3974 @code{shift} call allows currying to also work for a macro that takes
3975 one argument, although it often makes more sense to invoke that macro
3976 directly rather than going through @code{curry}.
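
Before reading the listing, it may help to see both cases in a short
sketch.  The two-argument macro @code{pair} is a hypothetical helper
defined only for this illustration.  In the first call,
@samp{shift($@@,)} leaves @samp{`a',`'} in the argument list of
@code{pair}, and @code{_curry} then appends @samp{b} and the closing
parenthesis.  In the second call, nothing precedes the curried argument,
yet the trailing comma inside @code{shift} still yields an empty
argument for @code{_curry} to complete.

@comment examples
@example
$ @kbd{m4 -I examples}
include(`curry.m4')dnl
define(`pair', `[$1|$2]')dnl
curry(`pair', `a')(`b')
@result{}[a|b]
curry(`len')(`abc')
@result{}3
@end example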
3980 $ @kbd{m4 -I examples}
3981 undivert(`curry.m4')dnl
3982 @result{}divert(`-1')
3983 @result{}# curry(macro, args)
3984 @result{}# Expand to a macro call that takes one argument, then invoke
3985 @result{}# macro(args, extra).
3986 @result{}define(`curry', `$1(shift($@@,)_$0')
3987 @result{}define(`_curry', ``$1')')
3988 @result{}divert`'dnl
3991 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
3992 tokens, which are silently flattened to the empty string when passed
3993 through another text macro. The following example demonstrates a usage
3994 of @code{curry} that works in M4 1.6, but is not portable to earlier
3999 $ @kbd{m4 -I examples}
4002 curry(`define', `mylen')(defn(`len'))
4008 @cindex renaming macros
4009 @cindex copying macros
4010 @cindex macros, copying
4011 Putting the last few concepts together, it is possible to copy or rename
4012 an entire stack of macro definitions.
4014 @deffn Composite copy (@var{source}, @var{dest})
4015 @deffnx Composite rename (@var{source}, @var{dest})
4016 Ensure that @var{dest} is undefined, then define it to the same stack of
4017 definitions currently in @var{source}. @code{copy} leaves @var{source}
4018 unchanged, while @code{rename} undefines @var{source}. There are only a
4019 few macros, such as @code{copy} or @code{defn}, which cannot be copied
4023 The implementation is relatively straightforward (although since it uses
4024 @code{curry}, it is unable to copy builtin macros when used with M4
4025 1.4.x. See if you can design a portable version that works across all
4026 M4 versions, or @pxref{Improved copy, , Answers}).
4030 $ @kbd{m4 -I examples}
4031 include(`curry.m4')include(`stack.m4')
4033 define(`rename', `copy($@@)undefine(`$1')')dnl
4034 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
4036 `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
4037 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
4052 @chapter How to debug macros and input
4054 @cindex debugging macros
4055 @cindex macros, debugging
4056 Macros written for @code{m4} often do not work as intended on
4057 the first try (as is the case with most programming languages).
4058 Fortunately, there is support for macro debugging in @code{m4}.
4061 * Dumpdef:: Displaying macro definitions
4062 * Trace:: Tracing macro calls
4063 * Debugmode:: Controlling debugging options
4064 * Debuglen:: Limiting debug output
4065 * Debugfile:: Saving debugging output
4069 @section Displaying macro definitions
4071 @cindex displaying macro definitions
4072 @cindex macros, displaying definitions
4073 @cindex definitions, displaying macro
4074 @cindex standard error, output to
4075 If you want to see what a name expands into, you can use the builtin
4078 @deffn {Builtin (m4)} dumpdef (@ovar{name@dots{}})
4079 Accepts any number of arguments. If called without any arguments, it
4080 displays the definitions of all known names, otherwise it displays the
4081 definitions of each @var{name} given, sorted by name. If a @var{name}
4082 is undefined, the @samp{d} debug level controls whether a warning is
4083 issued (@pxref{Debugmode}). Likewise, the @samp{o} debug level controls
4084 whether the output is issued to standard error or the current debug
4085 file (@pxref{Debugfile}).
4087 The expansion of @code{dumpdef} is void.
4092 define(`foo', `Hello world.')
4095 @error{}foo:@tabchar{}`Hello world.'
4098 @error{}define:@tabchar{}<define>
4102 The last example shows how builtin macro definitions are displayed.
4103 The definition that is dumped corresponds to what would occur if the
4104 macro were to be called at that point, even if other definitions are
4105 still live due to redefining a macro during argument collection.
4109 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
4111 f(popdef(`f')dumpdef(`f'))
4112 @error{}f:@tabchar{}``$0'1'
4114 f(popdef(`f')dumpdef(`f'))
4115 @error{}m4:stdin:3: Warning: dumpdef: undefined macro `f'
4123 @xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and
4124 @samp{s} flags affect the details of the display. Remember, the
4125 @samp{q} flag is implied when the @option{--debug} option (@option{-d},
4126 @pxref{Debugging options, , Invoking m4}) is used in the command line
4127 without arguments. Also, @option{--debuglen} (@pxref{Debuglen}) can affect
4128 output, by truncating longer strings (but not builtin and module names).
4130 @comment options: -ds -l3
4133 pushdef(`foo', `1 long string')
4135 pushdef(`foo', defn(`divnum'))
4141 dumpdef(`foo', `dnl', `indir', `__gnu__')
4142 @error{}__gnu__:@tabchar{}@{gnu@}
4143 @error{}dnl:@tabchar{}<dnl>@{m4@}
4144 @error{}foo:@tabchar{}3, <divnum>@{m4@}, 1 l...
4145 @error{}indir:@tabchar{}<indir>@{gnu@}
4147 debugmode(`-ms')debugmode(`+q')
4150 @error{}foo:@tabchar{}`3'
4155 @section Tracing macro calls
4157 @cindex tracing macro expansion
4158 @cindex macro expansion, tracing
4159 @cindex expansion, tracing macro
4160 @cindex standard error, output to
4161 It is possible to trace macro calls and expansions through the builtins
4162 @code{traceon} and @code{traceoff}:
4164 @deffn {Builtin (m4)} traceon (@ovar{names@dots{}})
4165 @deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}})
4166 When called without any arguments, @code{traceon} and @code{traceoff}
4167 will turn tracing on and off, respectively, for all macros, identical to
4168 using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}).
4170 When called with arguments, only the macros listed in @var{names} are
4171 affected, whether or not they are currently defined. A macro's
4172 expansion will be traced if global tracing is on, or if the individual
4173 macro tracing flag is set; to avoid tracing a macro, both the global
4174 flag and the macro must have tracing off.
4176 The expansion of @code{traceon} and @code{traceoff} is void.
4179 Whenever a traced macro is called and the arguments have been collected,
4180 the call is displayed. If the expansion of the macro call is not void,
4181 the expansion can be displayed after the call. The output is printed
4182 to the current debug file (defaulting to standard error,
4187 define(`foo', `Hello World.')
4189 define(`echo', `$@@')
4191 traceon(`foo', `echo')
4194 @error{}m4trace: -1- foo -> `Hello World.'
4195 @result{}Hello World.
4196 echo(`gnus', `and gnats')
4197 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
4198 @result{}gnus,and gnats
4201 The number between dashes is the depth of the expansion. It is one most
4202 of the time, signifying an expansion at the outermost level, but it
4203 increases when macro arguments contain unquoted macro calls. The
4204 maximum number that will appear between dashes is controlled by the
4205 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
4206 , Invoking m4}). Additionally, the option @option{--trace} (or
4207 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
4210 @comment options: -d-V -L3 -tifelse
4213 $ @kbd{m4 -L 3 -t ifelse}
4215 @error{}m4trace: -1- ifelse
4217 ifelse(ifelse(ifelse(`three levels')))
4218 @error{}m4trace: -3- ifelse
4219 @error{}m4trace: -2- ifelse
4220 @error{}m4trace: -1- ifelse
4222 ifelse(ifelse(ifelse(ifelse(`four levels'))))
4223 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
4226 Tracing by name is an attribute that is preserved whether the macro is
4227 defined or not. This allows the selection of macros to trace before
4228 those macros are defined.
4239 @error{}m4:stdin:4: Warning: defn: undefined macro `foo'
4242 @error{}m4:stdin:5: Warning: undefine: undefined macro `foo'
4249 @error{}m4:stdin:8: Warning: popdef: undefined macro `foo'
4251 define(`foo', `bar')
4254 @error{}m4trace: -1- foo -> `bar'
4258 ifdef(`foo', `yes', `no')
4261 @error{}m4:stdin:13: Warning: indir: undefined macro `foo'
4263 define(`foo', `blah')
4266 @error{}m4trace: -1- foo -> `blah'
4270 Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
4271 does not transfer tracing status.
4278 @error{}m4trace: -1- traceon(`traceoff') -> `'
4280 traceoff(`traceoff')
4281 @error{}m4trace: -1- traceoff(`traceoff') -> `'
4285 traceon(`eval', `m4_divnum')
4287 define(`m4_eval', defn(`eval'))
4289 define(`m4_divnum', defn(`divnum'))
4292 @error{}m4trace: -1- eval(`0') -> `0'
4295 @error{}m4trace: -2- m4_divnum -> `0'
4299 As of @acronym{GNU} M4 2.0, named macro tracing is independent of global
4300 tracing status; calling @code{traceoff} without arguments turns off the
4301 global trace flag, but does not turn off tracing for macros where
4302 tracing was requested by name. Likewise, calling @code{traceon} without
4303 arguments will affect tracing of macros that are not defined yet. This
4304 behavior matches traditional implementations of @code{m4}.
4310 define(`foo', `bar')
4311 @error{}m4trace: -1- define(`foo', `bar') -> `'
4313 foo # traced, even though foo was not defined at traceon
4314 @error{}m4trace: -1- foo -> `bar'
4315 @result{}bar # traced, even though foo was not defined at traceon
4317 @error{}m4trace: -1- traceoff(`foo') -> `'
4319 foo # traced, since global tracing is still on
4320 @error{}m4trace: -1- foo -> `bar'
4321 @result{}bar # traced, since global tracing is still on
4323 @error{}m4trace: -1- traceon(`foo') -> `'
4326 @error{}m4trace: -1- traceoff -> `'
4328 foo # traced, since foo is now traced by name
4329 @error{}m4trace: -1- foo -> `bar'
4330 @result{}bar # traced, since foo is now traced by name
4334 @result{}bar # untraced
4337 However, @acronym{GNU} M4 prior to 2.0 had slightly different
4338 semantics, where @code{traceon} without arguments only affected symbols
4339 that were defined at that moment, and @code{traceoff} without arguments
4340 stopped all tracing, even when tracing was requested by macro name. The
4341 addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it
4342 possible to write a file that approximates the older semantics
4343 regardless of which version of @acronym{GNU} M4 is in use.
4345 @comment options: -d-V
4349 `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)',
4350 `builtin(`traceon', $@@)')')dnl
4351 define(`traceoff', `ifelse(`$#', `0',
4352 `builtin(`traceoff')builtin(`traceoff', m4symbols)',
4353 `builtin(`traceoff', $@@)')')')dnl
4356 traceon # called before b is defined, so b is not traced
4357 @result{} # called before b is defined, so b is not traced
4359 @error{}m4trace: -1- define
4362 @error{}m4trace: -1- a
4365 @error{}m4trace: -1- traceon
4366 @error{}m4trace: -1- ifelse
4367 @error{}m4trace: -1- builtin
4370 @error{}m4trace: -1- a
4371 @error{}m4trace: -1- b
4373 traceoff # stops tracing b, even though it was traced by name
4374 @error{}m4trace: -1- traceoff
4375 @error{}m4trace: -1- ifelse
4376 @error{}m4trace: -1- builtin
4377 @error{}m4trace: -2- m4symbols
4378 @error{}m4trace: -1- builtin
4379 @result{} # stops tracing b, even though it was traced by name
4384 @xref{Debugmode}, for information on controlling the details of the
4385 display. The format of the trace output is not specified by
4386 @acronym{POSIX}, and varies between implementations of @code{m4}.
4388 Starting with M4 1.6, tracing also works via @code{indir}
4389 (@pxref{Indir}). However, since tracing is an attribute tracked by
4390 macro names, and @code{builtin} bypasses macro names (@pxref{Builtin}),
4391 it is not possible for @code{builtin} to trace which subsidiary builtin
4392 it invokes. If you are worried about tracking all invocations of a
4393 given builtin, you should also trace @code{builtin}, or enable global
4394 tracing (the @samp{t} debug level, @pxref{Debugmode}).
4398 define(`my_defn', defn(`defn'))undefine(`defn')
4400 define(`foo', `bar')traceon(`foo', `defn', `my_defn')
4403 @error{}m4trace: -1- foo -> `bar'
4406 @error{}m4trace: -1- foo -> `bar'
4409 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4411 indir(`my_defn', `foo')
4412 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4414 builtin(`defn', `foo')
4418 builtin(`defn', builtin(`shift', `', `foo'))
4419 @error{}m4trace: -1- id 12: builtin ... = <builtin>
4420 @error{}m4trace: -2- id 13: builtin ... = <builtin>
4421 @error{}m4trace: -2- id 13: builtin(`shift', `', `foo') -> ``foo''
4422 @error{}m4trace: -1- id 12: builtin(`defn', `foo') -> ``bar''
4424 indir(`my_defn', indir(`shift', `', `foo'))
4425 @error{}m4trace: -1- id 14: indir ... = <indir>
4426 @error{}m4trace: -2- id 15: indir ... = <indir>
4427 @error{}m4trace: -2- id 15: shift ... = <shift>
4428 @error{}m4trace: -2- id 15: shift(`', `foo') -> ``foo''
4429 @error{}m4trace: -2- id 15: indir(`shift', `', `foo') -> ``foo''
4430 @error{}m4trace: -1- id 14: my_defn ... = <defn>
4431 @error{}m4trace: -1- id 14: my_defn(`foo') -> ``bar''
4432 @error{}m4trace: -1- id 14: indir(`my_defn', `foo') -> ``bar''
4437 @section Controlling debugging options
4439 @cindex controlling debugging output
4440 @cindex debugging output, controlling
4441 The @option{--debug} option to @code{m4} (also spelled
4442 @option{--debugmode} or @option{-d}, @pxref{Debugging options, ,
4443 Invoking m4}) controls the amount of detail presented in three
4444 categories of output. Trace output is requested by @code{traceon}
4445 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
4446 relation to a macro invocation. Debug output tracks useful events not
4447 associated with a macro invocation, and each line is prefixed by
4448 @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
4449 affected, with no prefix added to the output lines.
4451 The @var{flags} following the option can be one or more of the
4456 In trace output, show the actual arguments that were collected before
4457 invoking the macro. Arguments are subject to length truncation
4458 specified by @code{debuglen} (@pxref{Debuglen}).
4461 In trace output, show an additional line for each macro call, when the
4462 macro is seen, but before the arguments are collected, and show the
4463 definition of the macro that will be used for the expansion. By
4464 default, only one line is printed, after all arguments are collected and
4465 the expansion determined. The definition is subject to length
4466 truncation specified by @code{debuglen} (@pxref{Debuglen}). This is
4467 often used with the @samp{x} flag.
4470 Output a warning on any attempt to dereference an undefined macro via
4471 @code{builtin}, @code{defn}, @code{dumpdef}, @code{indir},
4472 @code{popdef}, or @code{undefine}. Note that @code{ifdef},
4474 @code{traceon}, and @code{traceoff} do not dereference undefined macros.
4475 Like any other warning, the warnings enabled by this flag go to standard
4476 error regardless of the current @code{debugfile} setting, and will
4477 change exit status if the command line option @option{--fatal-warnings}
4478 was specified. This flag is useful in diagnosing spelling mistakes in
4479 macro names. It is enabled by default when neither @option{--debug} nor
4480 @option{--fatal-warnings} is specified on the command line.
4483 In trace output, show the expansion of each macro call. The expansion
4484 is subject to length truncation specified by @code{debuglen}
4488 In debug and trace output, include the name of the current input file in
4492 In debug output, print a message each time the current input file is
4496 In debug and trace output, include the current input line number in the
4500 In debug output, print a message each time a module is manipulated
4501 (@pxref{Modules}). In trace output when the @samp{c} flag is in effect,
4502 and in dumpdef output, follow builtin macros with their module name,
4503 surrounded by braces (@samp{@{@}}).
4506 Output @code{dumpdef} data to standard error instead of the current
4507 debug file. This can be useful when post-processing trace output, where
4508 interleaving dumpdef and trace output can cause ambiguities.
4511 In debug output, print a message when a named file is found through the
4512 path search mechanism (@pxref{Search Path}), giving the actual file name
4516 In trace and dumpdef output, quote actual arguments and macro expansions
4517 in the display with the current quotes. This is useful in connection
4518 with the @samp{a} and @samp{e} flags above.
4521 In dumpdef output, show the entire stack of definitions associated with
4522 a symbol via @code{pushdef}.
4525 In trace output, trace all macro calls made in this invocation of
4526 @code{m4}. This is equivalent to using @code{traceon} without
4530 In trace output, add a unique `macro call id' to each line of the trace
4531 output. This is useful in connection with the @samp{c} flag above, to
4532 match where a macro is first recognized with where it is finally
4533 expanded, in spite of intermediate expansions that occur while
4534 collecting arguments. It can also be used in isolation to determine how
4535 many macros have been expanded.
4538 A shorthand for all of the above flags.
4541 As special cases, if @var{flags} starts with a @samp{+}, the named flags
4542 are enabled without impacting other flags, and if it starts with a
4543 @samp{-}, the named flags are disabled without impacting other flags.
4544 Without either of these starting characters, @var{flags} simply replaces
4545 the previous setting.
4546 @comment FIXME - should we accept usage like debugmode(+fl-q)? Also,
4547 @comment should we add debugmode(?) which expands to the current
4548 @comment enabled flags, and debugmode(e?) which expands to e if e is
4549 @comment currently enabled?
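
The three forms can be compared in a short sketch that uses the builtin
@code{debugmode}, which accepts the same flag syntax.  The macro
@code{foo} and its definition are arbitrary choices for this
illustration.  The first call replaces the current setting, the second
removes only the @samp{q} flag, and the third adds only the @samp{l}
flag, so the last trace line gains a line number while remaining
unquoted.

@example
define(`foo', `bar')dnl
traceon(`foo')dnl
debugmode(`aeq')dnl
foo
@error{}m4trace: -1- foo -> `bar'
@result{}bar
debugmode(`-q')dnl
foo
@error{}m4trace: -1- foo -> bar
@result{}bar
debugmode(`+l')dnl
foo
@error{}m4trace:8: -1- foo -> bar
@result{}bar
@end example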
4551 If no flags are specified with the @option{--debug} option, the default is
4552 @samp{+adeq}. Many examples in this manual show their output using
4555 @cindex @acronym{GNU} extensions
4556 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4557 the debugging output format:
4559 @deffn {Builtin (gnu)} debugmode (@ovar{flags})
4560 The argument @var{flags} should be a subset of the letters listed above.
4561 If no argument is present, all debugging flags are cleared (as if
4562 @var{flags} were an explicit @samp{-V}). With an empty argument, the
4563 most common flags are enabled (as if @var{flags} were an explicit
4564 @samp{+adeq}). If an unknown flag is encountered, an error is issued.
4566 The expansion of @code{debugmode} is void.
4569 @comment options: -d-V
4572 define(`foo', `FOO$1')
4574 traceon(`foo', `divnum')
4576 debugmode()dnl same as debugmode(`+adeq')
4578 @error{}m4trace: -1- foo -> `FOO'
4580 debugmode(`V')debugmode(`-q')
4581 @error{}m4trace:stdin:5: -1- id 7: debugmode ... = <debugmode>@{gnu@}
4582 @error{}m4trace:stdin:5: -1- id 7: debugmode(`-q') -> `'
4586 @error{}m4trace:stdin:6: -1- id 8: foo ... = FOO$1
4587 @error{}m4trace:stdin:6: -1- id 8: foo(BAR) -> FOOBAR
4589 debugmode`'dnl same as debugmode(`-V')
4590 @error{}m4trace:stdin:8: -1- id 9: debugmode ... = <debugmode>@{gnu@}
4591 @error{}m4trace:stdin:8: -1- id 9: debugmode ->@w{ }
4593 @error{}m4trace: -1- foo
4598 @error{}m4trace:11: -1- id 13: foo ... = FOO$1
4599 @error{}m4trace:11: -2- id 14: divnum ... = <divnum>@{m4@}
4600 @error{}m4trace:11: -2- id 14: divnum
4601 @error{}m4trace:11: -1- id 13: foo
4607 This example shows the effects of the debug flags that are not related
4611 @comment options: -dip
4613 $ @kbd{m4 -dip -I examples}
4614 @error{}m4debug: input read from `stdin'
4615 define(`foo', `m4wrap(`wrapped text
4618 include(`incl.m4')dnl
4619 @error{}m4debug: path search for `incl.m4' found `examples/incl.m4'
4620 @error{}m4debug: input read from `examples/incl.m4'
4621 @result{}Include file start
4622 @result{}Include file end
4623 @error{}m4debug: input reverted to stdin, line 3
4625 @error{}m4debug: input exhausted
4626 @error{}m4debug: input from m4wrap recursion level 1
4627 @result{}wrapped text
4628 @error{}m4debug: input from m4wrap exhausted
4632 @section Limiting debug output
4634 @cindex @acronym{GNU} extensions
4637 @cindex limiting trace output length
4638 @cindex trace output, limiting length
4639 @cindex dumpdef output, limiting length
4640 When debugging, sometimes it is desirable to reduce the clutter of
4641 arbitrary-length strings, because the prefix carries enough information
4642 to understand the issues. The builtin macro @code{debuglen}, along with
4643 the command line option counterpart @option{--debuglen} (or @option{-l},
4644 @pxref{Debugging options, , Invoking m4}), allows on-the-fly control of
4645 debugging string lengths:
4647 @deffn {Builtin (gnu)} debuglen (@var{len})
4648 The argument @var{len} is an integer that controls how much of
4649 arbitrary-length strings should be output during trace and dumpdef
4650 output. If specified to a non-zero value, then strings longer than that
4651 length are truncated, and @samp{...} included in the output to show that
4652 truncation took place. A warning is issued if @var{len} cannot be
4653 parsed as an integer.
4654 @comment FIXME - make this understand an optional suffix, similar to how
4655 @comment --debuglen does. Also, we need a section documenting scaling
4657 @comment FIXME - should we allow len to be `?', meaning expand to the
4658 @comment current value?
4660 The macro @code{debuglen} is recognized only with parameters.
4663 The following example demonstrates the behavior of length truncation.
4664 Note that each argument and the final result are individually truncated.
4665 Also, the special tokens for builtin functions are not truncated.
4667 @comment options: -l6 -techo -tdefn
4669 $ @kbd{m4 -d -l 6 -t echo -t defn}
4671 @error{}m4:stdin:1: Warning: debuglen: non-numeric argument `oops'
4673 define(`echo', `$@@')
4675 echo(`1', `long string')
4676 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4677 @result{}1,long string
4678 indir(`echo', defn(`changequote'))
4679 @error{}m4trace: -2- defn(`change...') -> `<changequote>'
4680 @error{}m4trace: -1- echo(<changequote>) -> ``<changequote>''
4687 @error{}m4trace: -1- echo(`long string') -> ``long string''
4688 @result{}long string
4692 @error{}m4trace: -1- echo(`long string') -> ``long string...'
4693 @result{}long string
4697 @section Saving debugging output
4699 @cindex saving debugging output
4700 @cindex debugging output, saving
4701 @cindex output, saving debugging
4702 @cindex @acronym{GNU} extensions
4703 Debug and tracing output can be redirected to files using either the
4704 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4705 Invoking m4}), or with the builtin macro @code{debugfile}:
4707 @deffn {Builtin (gnu)} debugfile (@ovar{file})
4708 Send all further debug and trace output to @var{file}, opened in append
4709 mode. If @var{file} is the empty string, debug and trace output are
4710 discarded. If @code{debugfile} is called without any arguments, debug
4711 and trace output are sent to standard error. Output from @code{dumpdef}
4712 is sent to this file if the debug level @code{o} is not set
4713 (@pxref{Debugmode}). This does not affect
4714 warnings, error messages, or @code{errprint} output, which are
4715 always sent to standard error. If @var{file} cannot be opened, the
4716 current debug file is unchanged, and an error is issued.
4718 When the @option{--safer} option (@pxref{Operation modes, , Invoking
4719 m4}) is in effect, @var{file} must be empty or omitted, since otherwise
4720 an input file could cause the modification of arbitrary files.
4722 The expansion of @code{debugfile} is void.
4730 @error{}m4:stdin:2: Warning: divnum: extra arguments ignored: 1 > 0
4731 @error{}m4trace: -1- divnum(`extra') -> `0'
4736 @error{}m4:stdin:4: Warning: divnum: extra arguments ignored: 1 > 0
4741 @error{}m4trace: -1- divnum -> `0'
4745 Although the @option{--safer} option cripples @code{debugfile} to a
4746 limited subset of capabilities, you may still use the @option{--debugfile}
4747 option from the command line with no restrictions.
4749 @comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l
4752 $ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq}
4753 foo # traced to `trace'
4754 @result{}bar # traced to `trace'
4756 @error{}m4:stdin:2: debugfile: disabled by --safer
4758 foo # traced to `trace'
4759 @result{}bar # traced to `trace'
4762 foo # trace discarded
4763 @result{}bar # trace discarded
4766 foo # traced to stderr
4767 @error{}m4trace:7: -1- foo -> `bar'
4768 @result{}bar # traced to stderr
4769 undivert(`trace')dnl
4770 @result{}m4trace:1: -1- foo -> `bar'
4771 @result{}m4trace:3: -1- foo -> `bar'
4774 Sometimes it is useful to post-process trace output, even though there
4775 is no standardized format for trace output. In this situation, forcing
4776 @code{dumpdef} to output to standard error instead of the default of the
4777 current debug file will avoid any ambiguities between the two types of
output; it also allows debugging via @code{dumpdef} when debug output is
discarded.
4786 @error{}m4trace: -1- divnum -> `0'
4789 @error{}divnum:@tabchar{}<divnum>
4802 @error{}divnum:@tabchar{}<divnum>
4807 @chapter Input control
This chapter describes various builtin macros for controlling the input
to @code{m4}.
4813 * Dnl:: Deleting whitespace in input
4814 * Changequote:: Changing the quote characters
4815 * Changecom:: Changing the comment delimiters
4816 * Changeresyntax:: Changing the regular expression syntax
4817 * Changesyntax:: Changing the lexical structure of the input
4818 * M4wrap:: Saving text until end of input
4822 @section Deleting whitespace in input
4824 @cindex deleting whitespace in input
4825 @cindex discarding input
4826 @cindex input, discarding
4827 The builtin @code{dnl} stands for ``Discard to Next Line'':
4829 @deffn {Builtin (m4)} dnl
4830 All characters, up to and including the next newline, are discarded
4831 without performing any macro expansion. A warning is issued if the end
4832 of the file is encountered without a newline.
4834 The expansion of @code{dnl} is void.
4837 It is often used in connection with @code{define}, to remove the
4838 newline that follows the call to @code{define}. Thus
4841 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4846 The input up to and including the next newline is discarded, as opposed
4847 to the way comments are treated (@pxref{Comments}), when the command
4848 line option @option{--discard-comments} is not in effect
4849 (@pxref{Operation modes, , Invoking m4}).
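
As a small sketch of the difference (the macro @code{foo} and the
comment text are only illustrative, and the default comment delimiters
are assumed), @code{dnl} removes the rest of its line from the output
entirely, while a @samp{#} comment is copied through without expansion:

@example
define(`foo', `bar')dnl this text and the newline are discarded
foo # foo is not expanded inside a comment
@result{}bar # foo is not expanded inside a comment
@end example
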
4851 Usually, @code{dnl} is immediately followed by an end of line or some
4852 other whitespace. @acronym{GNU} @code{m4} will produce a warning diagnostic if
4853 @code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
4854 will collect and process all arguments, looking for a matching close
4855 parenthesis. All predictable side effects resulting from this
4856 collection will take place. @code{dnl} will return no output. The
4857 input following the matching close parenthesis up to and including the
4858 next newline, on whatever line containing it, will still be discarded.
4861 dnl(`args are ignored, but side effects occur',
4862 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4863 @error{}m4:stdin:1: Warning: dnl: extra arguments ignored: 2 > 0
4864 See how `foo' was defined, foo?
4865 @result{}See how foo was defined, like this?
If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
4872 m4wrap(`m4wrap(`2 hi
4878 @error{}m4:stdin:1: Warning: dnl: end of file treated as newline
4883 @section Changing the quote characters
4885 @cindex changing quote delimiters
4886 @cindex quote delimiters, changing
4887 @cindex delimiters, changing
The default quote delimiters can be changed with the builtin
macro @code{changequote}:
4891 @deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '})
4892 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4893 the new end-quote delimiter. If both arguments are missing, the default
4894 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
4895 quoting is disabled. Otherwise, if @var{end} is missing or void, the
4896 default end-quote delimiter (@code{'}) is used. The quote delimiters
4897 can be of any length.
4899 The expansion of @code{changequote} is void.
4903 changequote(`[', `]')
4905 define([foo], [Macro [foo].])
4911 The quotation strings can safely contain eight-bit characters.
4912 If no single character is appropriate, @var{start} and @var{end} can be
4913 of any length. Other implementations cap the delimiter length to five
4914 characters, but @acronym{GNU} has no inherent limit.
4917 changequote(`[[[', `]]]')
4919 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4922 @result{}Macro [[foo]].
4925 Calling @code{changequote} with @var{start} as the empty string will
4926 effectively disable the quoting mechanism, leaving no way to quote text.
4927 However, using an empty string is not portable, as some other
4928 implementations of @code{m4} revert to the default quoting, while others
4929 preserve the prior non-empty delimiter. If @var{start} is not empty,
4930 then an empty @var{end} will use the default end-quote delimiter of
4931 @samp{'}, as otherwise, it would be impossible to end a quoted string.
4932 Again, this is not portable, as some other @code{m4} implementations
4933 reuse @var{start} as the end-quote delimiter, while others preserve the
4934 previous non-empty value. Omitting both arguments restores the default
4935 begin-quote and end-quote delimiters; fortunately this behavior is
4936 portable to all implementations of @code{m4}.
4939 define(`foo', `Macro `FOO'.')
4944 @result{}Macro `FOO'.
4946 @result{}`Macro `FOO'.'
There is no way in @code{m4} to quote a string containing an unmatched
begin-quote, except using @code{changequote} to change the current
quote delimiters.
4957 If the quotes should be changed from, say, @samp{[} to @samp{[[},
4958 temporary quote characters have to be defined. To achieve this, two
4959 calls of @code{changequote} must be made, one for the temporary quotes
4960 and one for the new quotes.
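
A minimal sketch of such a two-step change follows. The temporary pair
@samp{<<} and @samp{>>} and the macro @code{foo} are arbitrary choices
made for illustration; any delimiters that do not clash with the old or
new quotes would serve.

@example
changequote(`[', `]')dnl
define([foo], [FOO])dnl
changequote([<<], [>>])dnl temporary quotes
changequote(<<[[>>, <<]]>>)dnl
[[foo]] is quoted, while foo expands
@result{}foo is quoted, while FOO expands
@end example

The intermediate step is needed because @samp{[[} cannot itself be
quoted while @samp{[} is still the begin-quote delimiter.
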
4962 Macros are recognized in preference to the begin-quote string, so if a
4963 prefix of @var{start} can be recognized as part of a potential macro
4964 name, the quoting mechanism is effectively disabled. Unless you use
4965 @code{changesyntax} (@pxref{Changesyntax}), this means that @var{start}
4966 should not begin with a letter, digit, or @samp{_} (underscore).
However, even though quoted strings are not recognized, the quote
characters can still be discerned in macro expansion and in trace
output.
4972 define(`echo', `$@@')
4976 changequote(`q', `Q')
4984 changequote(`-', `EOF')
4990 changequote(`1', `2')
4998 Quotes are recognized in preference to argument collection. In
4999 particular, if @var{start} is a single @samp{(}, then argument
5000 collection is effectively disabled. For portability with other
5001 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5002 @samp{)} as the first character in @var{start}.
5005 define(`echo', `$#:$@@:')
5009 changequote(`(',`)')
5015 changequote(`((', `))')
5023 changequote(`,', `)')
5029 However, if you are not worried about portability, using @samp{(} and
5030 @samp{)} as quoting characters has an interesting property---you can use
5031 it to compute a quoted string containing the expansion of any quoted
5032 text, as long as the expansion results in both balanced quotes and
5033 balanced parentheses. The trick is realizing @code{expand} uses
5034 @samp{$1} unquoted, to trigger its expansion using the normal quoting
5035 characters, but uses extra parentheses to group unquoted commas that
5036 occur in the expansion without consuming whitespace following those
5037 commas. Then @code{_expand} uses @code{changequote} to convert the
5038 extra parentheses back into quoting characters. Note that it takes two
5039 more @code{changequote} invocations to restore the original quotes.
5040 Contrast the behavior on whitespace when using @samp{$*}, via
5041 @code{quote}, to attempt the same task.
5044 changequote(`[', `]')dnl
5045 define([a], [1, (b)])dnl
5047 define([quote], [[$*]])dnl
5048 define([expand], [_$0(($1))])dnl
5050 [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
5051 expand([a, a, [a, a], [[a, a]]])
5052 @result{}1, (2), 1, (2), a, a, [a, a]
5053 quote(a, a, [a, a], [[a, a]])
5054 @result{}1,(2),1,(2),a, a,[a, a]
5057 If @var{end} is a prefix of @var{start}, the end-quote will be
5058 recognized in preference to a nested begin-quote. In particular,
5059 changing the quotes to have the same string for @var{start} and
5060 @var{end} disables nesting of quotes. When quote nesting is disabled,
5061 it is impossible to double-quote strings across macro expansions, so
5062 using the same string is not done very often.
5067 changequote(`""', `"')
5079 changequote(`"', `"')
5085 It is an error if the end of file occurs within a quoted string.
5090 @result{}hello world
5093 @error{}m4:stdin:2: end of file in string
5098 ifelse(`dangling quote
5100 @error{}m4:stdin:1: ifelse: end of file in string
5104 @section Changing the comment delimiters
5106 @cindex changing comment delimiters
5107 @cindex comment delimiters, changing
5108 @cindex delimiters, changing
5109 The default comment delimiters can be changed with the builtin
5110 macro @code{changecom}:
5112 @deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}})
5113 This sets @var{start} as the new begin-comment delimiter and @var{end}
5114 as the new end-comment delimiter. If both arguments are missing, or
5115 @var{start} is void, then comments are disabled. Otherwise, if
5116 @var{end} is missing or void, the default end-comment delimiter of
5117 newline is used. The comment delimiters can be of any length.
5119 The expansion of @code{changecom} is void.
5123 define(`comment', `COMMENT')
5126 @result{}# A normal comment
5127 changecom(`/*', `*/')
5129 # Not a comment anymore
5130 @result{}# Not a COMMENT anymore
5131 But: /* this is a comment now */ while this is not a comment
5132 @result{}But: /* this is a comment now */ while this is not a COMMENT
5135 @cindex comments, copied to output
5136 Note how comments are copied to the output, much as if they were quoted
5137 strings. If you want the text inside a comment expanded, quote the
5138 start comment delimiter.
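
For instance, assuming the default delimiters and an illustrative macro
named @code{comment}, quoting the @samp{#} turns it into ordinary text,
so the rest of the line is scanned for macros as usual:

@example
define(`comment', `COMMENT')dnl
# an ordinary comment
@result{}# an ordinary comment
`#' no longer a comment
@result{}# no longer a COMMENT
@end example
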
5140 Calling @code{changecom} without any arguments, or with @var{start} as
5141 the empty string, will effectively disable the commenting mechanism. To
5142 restore the original comment start of @samp{#}, you must explicitly ask
5143 for it. If @var{start} is not empty, then an empty @var{end} will use
5144 the default end-comment delimiter of newline, as otherwise, it would be
impossible to end a comment. However, this is not portable, as some
other @code{m4} implementations preserve the previous non-empty
delimiters instead.
5150 define(`comment', `COMMENT')
5154 # Not a comment anymore
5155 @result{}# Not a COMMENT anymore
5159 @result{}# comment again
5162 The comment strings can safely contain eight-bit characters.
5163 If no single character is appropriate, @var{start} and @var{end} can be
5164 of any length. Other implementations cap the delimiter length to five
5165 characters, but @acronym{GNU} has no inherent limit.
5167 Macros and quotes are recognized in preference to comments, so if a
5168 prefix of @var{start} can be recognized as part of a potential macro
5169 name, or confused with a quoted string, the comment mechanism is
5170 effectively disabled. Unless you use @code{changesyntax}
5171 (@pxref{Changesyntax}), this means that @var{start} should not begin
with a letter, digit, or @samp{_} (underscore), and that neither the
start-quote nor the start-comment string should be a prefix of the
other.
5179 define(`hi1hi2', `hello')
5191 changecom(`[[', `]]')
5193 changequote(`[[[', `]]]')
5203 changecom(`[[[', `]]]')
5205 changequote(`[[', `]]')
5213 Comments are recognized in preference to argument collection. In
5214 particular, if @var{start} is a single @samp{(}, then argument
5215 collection is effectively disabled. For portability with other
5216 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5217 @samp{)} as the first character in @var{start}.
5220 define(`echo', `$#:$*:$@@:')
5230 changecom(`((', `))')
5239 @result{}1:HI,hi)bye:HI,hi)bye:
5243 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
5244 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
5245 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
5248 It is an error if the end of file occurs within a comment.
5252 changecom(`/*', `*/')
5256 @error{}m4:stdin:2: end of file in comment
5261 changecom(`/*', `*/')
5263 len(/*dangling comment
5265 @error{}m4:stdin:2: len: end of file in comment
5268 @node Changeresyntax
5269 @section Changing the regular expression syntax
5271 @cindex regular expression syntax, changing
5272 @cindex basic regular expressions
5273 @cindex extended regular expressions
5274 @cindex regular expressions
5275 @cindex expressions, regular
5276 @cindex syntax, changing regular expression
5277 @cindex flavors of regular expressions
5278 @cindex @acronym{GNU} extensions
5279 The @acronym{GNU} extensions @code{patsubst}, @code{regexp}, and more
5280 recently, @code{renamesyms} each deal with regular expressions. There
are multiple flavors of regular expressions, so the
@code{changeresyntax} builtin exists to allow choosing the default
flavor:
5285 @deffn {Builtin (gnu)} changeresyntax (@var{resyntax})
5286 Changes the default regular expression syntax used by M4 according to
5287 the value of @var{resyntax}, equivalent to passing @var{resyntax} as the
5288 argument to the command line option @option{--regexp-syntax}
5289 (@pxref{Operation modes, , Invoking m4}). If @var{resyntax} is empty,
the default flavor is reverted to the @code{GNU_M4} style, compatible
with Emacs.
5293 @var{resyntax} can be any one of the values in the table below. Case is
5294 not important, and @samp{-} or @samp{ } can be substituted for @samp{_} in
5295 the given names. If @var{resyntax} is unrecognized, a warning is
5296 issued and the default flavor is not changed.
5300 @xref{awk regular expression syntax}, for details.
5306 @xref{posix-basic regular expression syntax}, for details.
5310 @itemx POSIX_EXTENDED
5311 @xref{posix-extended regular expression syntax}, for details.
5315 @xref{gnu-awk regular expression syntax}, for details.
5319 @xref{egrep regular expression syntax}, for details.
5324 @xref{emacs regular expression syntax}, for details. This is the
5325 default regular expression flavor.
5328 @xref{grep regular expression syntax}, for details.
5331 @itemx POSIX_MINIMAL
5332 @itemx POSIX_MINIMAL_BASIC
5333 @xref{posix-minimal-basic regular expression syntax}, for details.
5336 @xref{posix-awk regular expression syntax}, for details.
5339 @xref{posix-egrep regular expression syntax}, for details.
5342 The expansion of @code{changeresyntax} is void.
5343 The macro @code{changeresyntax} is recognized only with parameters.
5344 This macro was added in M4 2.0.
5347 For an example of how @var{resyntax} is recognized, the first three
5348 usages select the @samp{GNU_M4} regular expression flavor:
5351 changeresyntax(`gnu m4')
5353 changeresyntax(`GNU-m4')
5355 changeresyntax(`Gnu_M4')
5357 changeresyntax(`unknown')
5358 @error{}m4:stdin:4: Warning: changeresyntax: bad syntax-spec: `unknown'
5362 Using @code{changeresyntax} makes it possible to omit the optional
5363 @var{resyntax} parameter to other macros, while still using a different
5364 regular expression flavor.
5367 patsubst(`ab', `a|b', `c')
5369 patsubst(`ab', `a\|b', `c')
5371 patsubst(`ab', `a|b', `c', `EXTENDED')
5373 changeresyntax(`EXTENDED')
5375 patsubst(`ab', `a|b', `c')
5377 patsubst(`ab', `a\|b', `c')
5382 @section Changing the lexical structure of the input
5384 @cindex lexical structure of the input
5385 @cindex input, lexical structure of the
5386 @cindex syntax table
5387 @cindex changing syntax
5388 @cindex @acronym{GNU} extensions
5390 The macro @code{changesyntax} and all associated functionality is
5391 experimental (@pxref{Experiments}). The functionality might change in
the future. Please direct your comments about it the same way you would
do for bugs.
5396 The input to @code{m4} is read character by character, and these
5397 characters are grouped together to form input tokens (such as macro
5398 names, strings, comments, etc.).
5400 Each token is parsed according to certain rules. For example, a macro
5401 name starts with a letter or @samp{_} and consists of the longest
possible string of letters, @samp{_} and digits. But which characters
count as letters, digits, quotes, or white space? Previously the
operating system decided; with @code{changesyntax}, you do.
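
As a brief sketch of the default rule (the macro name @code{foo_1} is
hypothetical, and the default syntax table is assumed), a name is
recognized only when the longest run of letters, digits and @samp{_}
matches it exactly, and a leading digit is not part of the name:

@example
define(`foo_1', `bar')dnl
foo_1 1foo_1 foo_1x _foo_1
@result{}bar 1bar foo_1x _foo_1
@end example
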
5406 Input characters belong to different categories:
5410 Characters that start a macro name. Defaults to the letters as defined
5411 by the locale, and the character @samp{_}.
5414 Characters that, together with the letters, form the remainder of a
5415 macro name. Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
5416 other digits defined by the locale.
5419 Characters that should be trimmed from the beginning of each argument to
5420 a macro call. The defaults are space, tab, newline, carriage return,
5421 form feed, and vertical tab, and any others as defined by the locale.
5423 @item Open parenthesis
5424 Characters that open the argument list of a macro call. The default is
5425 the single character @samp{(}.
5427 @item Close parenthesis
5428 Characters that close the argument list of a macro call. The default
5429 is the single character @samp{)}.
5431 @item Argument separator
5432 Characters that separate the arguments of a macro call. The default is
5433 the single character @samp{,}.
5436 Characters that can introduce an argument reference in the body of a
5437 macro. The default is the single character @samp{$}.
5440 Characters that introduce an extended argument reference in the body of
5441 a macro immediately after a character in the Dollar category. The
5442 default is the single character @samp{@{}.
5445 Characters that conclude an extended argument reference in the body of a
5446 macro. The default is the single character @samp{@}}.
5449 The set of characters that can start a single-character quoted string.
5450 The default is the single character @samp{`}. For multiple-character
5451 quote delimiters, use @code{changequote} (@pxref{Changequote}).
5454 The set of characters that can start a single-character comment. The
5455 default is the single character @samp{#}. For multiple-character
5456 comment delimiters, use @code{changecom} (@pxref{Changecom}).
5459 Characters that have no special syntactical meaning to @code{m4}.
5460 Defaults to all characters except those in the categories above.
5463 Characters that themselves, alone, form macro names. This is a
5464 @acronym{GNU} extension, and active characters have lower precedence
5465 than comments. By default, no characters are active.
5468 Characters that must precede macro names for them to be recognized.
5469 This is a @acronym{GNU} extension. When an escape character is defined,
5470 then macros are not recognized unless the escape character is present;
5471 however, the macro name, visible by @samp{$0} in macro definitions, does
not include the escape character. By default, no characters are
escapes.
5475 @comment FIXME - we should also consider supporting:
5476 @comment @item Ignore - characters that are ignored if they appear in
5477 @comment the input; perhaps defaulting to '\0', category 'I'.
5481 Each character can, besides the basic syntax category, have some syntax
5482 attributes. One reason these are attributes rather than categories is
5483 that end delimiters are never recognized except when searching for the
5484 end of a token triggered by a start delimiter; the end delimiter can
have syntax properties of its own when it appears in isolation. These
attributes are:
5490 The set of characters that can end a single-character quoted string.
5491 The default is the single character @samp{'}. For multiple-character
5492 quote delimiters, use @code{changequote} (@pxref{Changequote}). Note
5493 that @samp{'} also defaults to the syntax category `Other', when it
5494 appears in isolation.
5497 The set of characters that can end a single-character comment. The
5498 default is the single character @kbd{newline}. For multiple-character
5499 comment delimiters, use @code{changecom} (@pxref{Changecom}). Note that
5500 newline also defaults to the syntax category `White space', when it
5501 appears in isolation.
5504 The builtin macro @code{changesyntax} is used to change the way
5505 @code{m4} parses the input stream into tokens.
5507 @deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
5508 Each @var{syntax-spec} is a two-part string. The first part is a
5509 command, consisting of a single character describing a syntax category,
5510 and an optional one-character action. The action can be @samp{-} to
5511 remove the listed characters from that category and reassign them to the
5512 `Other' category, @samp{=} to set the category to the listed characters
5513 and reassign all other characters previously in that category to
5514 `Other', or @samp{+} to add the listed characters to the category
5515 without affecting other characters. If an action is not specified, but
5516 additional characters are present, then @samp{=} is assumed. The
5517 case-insensitive characters for the syntax categories are:
5556 The remaining characters of each @var{syntax-spec} form the set of
5557 characters to perform the action on for that syntax category. Character
5558 ranges are expanded as for @code{translit} (@pxref{Translit}). To start
the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
be specified.
5562 If @var{syntax-spec} is just a category, and no action or characters
5563 were specified, then all characters in that category are reset to their
5564 default state. A warning is issued if the category character is not
5565 valid. If @var{syntax-spec} is the empty string, then all categories
5566 are reset to their default state.
5568 The expansion of @code{changesyntax} is void.
5569 The macro @code{changesyntax} is recognized only with parameters. Use
5570 this macro with caution, as it is possible to change the syntax in such
5571 a way that no further macros can be recognized by @code{m4}.
5572 This macro was added in M4 2.0.
5575 With @code{changesyntax} we can modify what characters form a word.
5578 define(`test.1', `TEST ONE')
5586 changesyntax(`W+.', `W-_')
5592 changesyntax(`W=a-zA-Z0-9_')
5606 Another possibility is to change the syntax of a macro call.
5609 define(`test', `$#')
5613 changesyntax(`(<', `,|', `)>')
Leading whitespace is normally removed from macro arguments in
@code{m4}, but changing the syntax categories avoids this. The use of
5623 @code{format} is an alternative to using a literal tab character.
5626 define(`test', `$1$2$3')
5630 changesyntax(`O 'format(`%c', `9'))
5636 It is possible to redefine the @samp{$} used to indicate macro arguments
5637 in user defined macros.
5640 define(`argref', `Dollar: $#, Question: ?#')
5643 @result{}Dollar: 3, Question: ?#
5644 changesyntax(`$?', `O$')
5647 @result{}Dollar: $#, Question: 3
Dollar class syntax elements are copied to the output if there is no
valid expansion.
5655 define(`escape', `$?`'1$?1?')
5663 Macro calls can be given a @TeX{} or Texinfo like syntax using an
5664 escape. If one or more characters are defined as escapes, macro names
5665 are only recognized if preceded by an escape character.
5667 If the escape is not followed by what is normally a word (a letter
5668 optionally followed by letters and/or numerals), that single character
5669 is returned as a macro name.
5671 As always, words without a macro definition cause no error message.
5672 They and the escape character are simply output.
5675 define(`foo', `bar')
5677 changesyntax(`@@@@')
5685 @@changesyntax(`@@\', `O@@')
5693 define(`#', `No comment')
5694 @result{}define(#, No comment)
5695 \define(`#', `No comment')
5697 \# \foo # Comment \foo
5698 @result{}No comment bar # Comment \foo
5701 Active characters are known from @TeX{}. In @code{m4} an active
5702 character is always seen as a one-letter word, and so, if it has a macro
5703 definition, the macro will be called.
5706 define(`@@', `TEST')
5716 There is obviously an overlap with @code{changecom} and
5717 @code{changequote}. Comment delimiters and quotes can now be defined in
5718 two different ways. To avoid incompatibilities, if the quotes are set
5719 with @code{changequote}, all other characters marked in the syntax table
5720 as quotes will revert to their normal syntax categories, leaving only
one set of defined quotes as before. If the quotes are set with
@code{changesyntax}, multiple sets of quotes can result. This applies
to comment delimiters as well, @emph{mutatis mutandis}.
5727 define(`test', `TEST')
5729 changesyntax(`L+<', `R+>')
5737 changequote(<[>, `]')
5747 If several characters are assigned to a category that forms single
5748 character tokens, all such characters are treated as equal. Any open
5749 parenthesis will match any close parenthesis, etc.
5752 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
5758 On the other hand, a multi-character start-quote sequence, which can
5759 only be created by @code{changequote}, will only be matched by the
5760 corresponding end-quote sequence. The same goes for comment delimiters.
5763 define(`test', `==$1==')
5765 changequote(`<<', `>>')
5767 changesyntax(<<L[>>, <<R]>>)
5770 @result{}==testing]==
5772 @result{}==testing>>==
5774 @result{}==testing==
5778 Note how it is possible to have both long and short quotes, if
5779 @code{changequote} is used before @code{changesyntax}.
5781 The syntax table is initialized to be backwards compatible, so if you
5782 never call @code{changesyntax}, nothing will have changed.
5784 For now, debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)}
5785 to show macro calls; and macro expansions that result in a list of
5786 arguments (such as @samp{$@@} or @code{shift}) use @samp{,}, regardless
5787 of the current syntax settings. However, this is likely to change in a
5788 future release, so it should not be relied on, particularly since it is
5789 next to impossible to write recursive macros if the argument separator
5790 doesn't match between expansion and rescanning.
5792 @c FIXME - changing syntax of , should not break iterative macros.
5795 changesyntax(`,=|')traceon(`foo')define(`foo'|`$#:$@@')
5798 @error{}m4trace: -2- foo(`1', `2', `3') -> `3:`1',`2',`3''
5799 @error{}m4trace: -1- foo(`3:1,2,3') -> `1:`3:1,2,3''
5804 @section Saving text until end of input
5806 @cindex saving input
5807 @cindex input, saving
5808 @cindex deferring expansion
5809 @cindex expansion, deferring
It is possible to `save' some text until the end of the normal input has
been seen, to be read again by @code{m4} once the normal input is
exhausted. This feature is normally used to initiate cleanup actions
before normal exit, e.g., deleting temporary files.
5816 To save input text, use the builtin @code{m4wrap}:
5818 @deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
5819 Stores @var{string} in a safe place, to be reread when end of input is
5820 reached. As a @acronym{GNU} extension, additional arguments are
5821 concatenated with a space to the @var{string}.
5823 Successive invocations of @code{m4wrap} accumulate saved text in
5824 first-in, first-out order, as required by @acronym{POSIX}.
5826 The expansion of @code{m4wrap} is void.
5827 The macro @code{m4wrap} is recognized only with parameters.
5831 define(`cleanup', `This is the `cleanup' action.
5836 This is the first and last normal input line.
5837 @result{}This is the first and last normal input line.
5839 @result{}This is the cleanup action.
5842 The saved input is only reread when the end of normal input is seen, and
5843 not if @code{m4exit} is used to exit @code{m4}.
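
A minimal sketch of this behavior, with illustrative wrapped text: the
input below is expected to produce no output, because @code{m4exit}
ends processing before the wrapped text would be reread.

@example
m4wrap(`This wrapped text is never reread.
')dnl
m4exit(`0')
@end example
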
5845 It is safe to call @code{m4wrap} from wrapped text, where all the
5846 recursively wrapped text is deferred until the current wrapped text is
5847 exhausted. As of M4 1.6, when @code{m4wrap} is not used recursively,
5848 the saved pieces of text are reread in the same order in which they were
5849 saved (FIFO---first in, first out), as required by @acronym{POSIX}.
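
The following sketch assumes M4 1.6 or later, where the FIFO order
applies. It also relies on the @acronym{GNU} extension of joining
additional arguments with a space; the saved strings are illustrative.

@example
m4wrap(`1
')dnl
m4wrap(`2', `3
')dnl
^D
@result{}1
@result{}2 3
@end example
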
5863 However, earlier versions had reverse ordering (LIFO---last in, first
5864 out), as this behavior is more like the semantics of the C function
5865 @code{atexit}. It is possible to emulate @acronym{POSIX} behavior even
5866 with older versions of @acronym{GNU} M4 by including the file
@file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
distribution.
5872 $ @kbd{m4 -I examples}
5873 undivert(`wrapfifo.m4')dnl
5874 @result{}dnl Redefine m4wrap to have FIFO semantics.
5875 @result{}define(`_m4wrap_level', `0')dnl
5876 @result{}define(`m4wrap',
5877 @result{}`ifdef(`m4wrap'_m4wrap_level,
5878 @result{} `define(`m4wrap'_m4wrap_level,
5879 @result{} defn(`m4wrap'_m4wrap_level)`$1')',
5880 @result{} `builtin(`m4wrap', `define(`_m4wrap_level',
5881 @result{} incr(_m4wrap_level))dnl
5882 @result{}m4wrap'_m4wrap_level)dnl
5883 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5884 include(`wrapfifo.m4')
5886 m4wrap(`a`'m4wrap(`c
5887 ', `d')')m4wrap(`b')
5893 It is likewise possible to emulate LIFO behavior without resorting to
5894 the @acronym{GNU} M4 extension of @code{builtin}, by including the file
5895 @file{m4-@value{VERSION}/@/examples/@/wraplifo.m4} from the
5896 distribution. (Unfortunately, both examples shown here share some
5897 subtle bugs. See if you can find and correct them; or @pxref{Improved
5898 m4wrap, , Answers}).
5902 $ @kbd{m4 -I examples}
5903 undivert(`wraplifo.m4')dnl
5904 @result{}dnl Redefine m4wrap to have LIFO semantics.
5905 @result{}define(`_m4wrap_level', `0')dnl
5906 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
5907 @result{}define(`m4wrap',
5908 @result{}`ifdef(`m4wrap'_m4wrap_level,
5909 @result{} `define(`m4wrap'_m4wrap_level,
5910 @result{} `$1'defn(`m4wrap'_m4wrap_level))',
5911 @result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
5912 @result{}m4wrap'_m4wrap_level)dnl
5913 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5914 include(`wraplifo.m4')
5916 m4wrap(`a`'m4wrap(`c
5917 ', `d')')m4wrap(`b')
Here is an example of implementing a factorial function using
@code{m4wrap}:
5927 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
5928 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
5929 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
5934 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
5937 Invocations of @code{m4wrap} at the same recursion level are
5938 concatenated and rescanned as usual:
5944 m4wrap(`a')m4wrap(`b')
5951 however, the transition between recursion levels behaves like an end of
5952 file condition between two input files.
5956 m4wrap(`m4wrap(`)')len(abc')
5959 @error{}m4:stdin:1: len: end of file in argument list
5962 As of M4 1.6, @code{m4wrap} transparently handles builtin tokens
5963 generated by @code{defn} (@pxref{Defn}). However, for portability, it
5964 is better to defer the evaluation of @code{defn} along with the rest of
5965 the wrapped text, as is done for @code{foo} in the example below, rather
5966 than computing the builtin token up front, as is done for @code{bar}.
5969 m4wrap(`define(`foo', defn(`divnum'))foo
5972 m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar
5980 @node File Inclusion
5981 @chapter File inclusion
5983 @cindex file inclusion
5984 @cindex inclusion, of files
5985 @code{m4} allows you to include named files at any point in the input.
5988 * Include:: Including named files
5989 * Search Path:: Searching for include files
5993 @section Including named files
5995 There are two builtin macros in @code{m4} for including files:
5997 @deffn {Builtin (m4)} include (@var{file})
5998 @deffnx {Builtin (m4)} sinclude (@var{file})
5999 Both macros cause the file named @var{file} to be read by
6000 @code{m4}. When the end of the file is reached, input is resumed from
6001 the previous input file.
6003 The expansion of @code{include} and @code{sinclude} is therefore the
6004 contents of @var{file}.
6006 If @var{file} does not exist, is a directory, or cannot otherwise be
6007 read, the expansion is void,
6008 and @code{include} will fail with an error while @code{sinclude} is
6009 silent. The empty string counts as a file that does not exist.
The macros @code{include} and @code{sinclude} are recognized only with
parameters.
6018 @error{}m4:stdin:1: include: cannot open `n': No such file or directory
6021 @error{}m4:stdin:2: include: cannot open `': No such file or directory
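
Because @code{sinclude} is silent about a missing file, it can be used
for optional configuration. The following is only a sketch: the file
name @file{local-overrides.m4} and the macro @code{site_name} are
hypothetical, and the output shown assumes that no such file exists
anywhere on the search path.

@example
sinclude(`local-overrides.m4')dnl
ifdef(`site_name', `', `define(`site_name', `default')')dnl
Site: site_name
@result{}Site: default
@end example

If the file did exist and defined @code{site_name}, the @code{ifdef}
would leave that definition alone.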
6029 This section uses the @option{--include} command-line option (or
6030 @option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab
6031 files from the @file{m4-@value{VERSION}/@/examples}
6032 directory shipped as part of the @acronym{GNU} @code{m4} package. The
6033 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution
6038 $ @kbd{cat examples/incl.m4}
6039 @result{}Include file start
6041 @result{}Include file end
6044 Normally file inclusion is used to insert the contents of a file
6045 into the input stream. The contents of the file will be read by
6046 @code{m4} and macro calls in the file will be expanded:
6050 $ @kbd{m4 -I examples}
6051 define(`foo', `FOO')
6054 @result{}Include file start
6056 @result{}Include file end
6060 The fact that @code{include} and @code{sinclude} expand to the contents
6061 of the file can be used to define macros that operate on entire files.
6062 Here is an example, which defines @samp{bar} to expand to the contents
6067 $ @kbd{m4 -I examples}
6068 define(`bar', include(`incl.m4'))
6070 This is `bar': >>bar<<
6071 @result{}This is bar: >>Include file start
6073 @result{}Include file end
6077 This use of @code{include} is not trivial, though, as files can contain
6078 quotes, commas, and parentheses, which can interfere with the way the
6079 @code{m4} parser works. @acronym{GNU} @code{m4} seamlessly concatenates
6080 the file contents with the next character, even if the included file
6081 ended in the middle of a comment, string, or macro call. These
6082 conditions are treated as end of file errors only when the file is named
6083 as an input file on the command line.
6085 In @acronym{GNU} @code{m4}, an alternative method of reading files is
6086 using @code{undivert} (@pxref{Undivert}) on a named file.
6089 @section Searching for include files
6091 @cindex search path for included files
6092 @cindex included files, search path for
6093 @cindex @acronym{GNU} extensions
6094 @acronym{GNU} @code{m4} allows included files to be found in directories
6095 other than the current working directory.
6097 @cindex @env{M4PATH}
6098 If the @option{--prepend-include} or @option{-B} command-line option was
6099 provided (@pxref{Preprocessor features, , Invoking m4}), those
6100 directories are searched first, in the reverse of the order in which
6101 those options were listed on the command line. Then @code{m4} looks in
6102 the current working directory. Next come the directories specified with the
6103 @option{--include} or @option{-I} option, in the order found on the
6104 command line. Finally, if the @env{M4PATH} environment variable is set,
6105 it is expected to contain a colon-separated list of directories, which
6106 will be searched in order.
6108 If the automatic search for include-files causes trouble, the @samp{p}
6109 debug flag (@pxref{Debugmode}) can help isolate the problem.
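
As an illustration of the search order described above, consider the
following invocation. The directory names (@file{b1}, @file{b2},
@file{i1}, @file{i2}, @file{p1}, @file{p2}) and the input file
@file{input.m4} are hypothetical, chosen only for this sketch.

@example
$ @kbd{M4PATH=p1:p2 m4 -B b1 -B b2 -I i1 -I i2 input.m4}
@end example

Following the rules above, an @code{include} in @file{input.m4} would
look in @file{b2}, @file{b1}, the current working directory, @file{i1},
@file{i2}, @file{p1}, and @file{p2}, in that order.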
6112 @chapter Diverting and undiverting output
6114 @cindex deferring output
6115 Diversions are a way of temporarily saving output. The output of
6116 @code{m4} can at any time be diverted to a temporary file, and be
6117 reinserted into the output stream, @dfn{undiverted}, again at a later
6120 @cindex @env{TMPDIR}
6121 Numbered diversions are counted from 0 upwards, diversion number 0
6122 being the normal output stream. The number of simultaneous diversions
6123 is limited mainly by the memory used to describe them, because @acronym{GNU}
6124 @code{m4} tries to keep diversions in memory. However, there is a
6125 limit to the overall memory usable by all diversions taken altogether
6126 (512K, currently). When this maximum is about to be exceeded,
6127 a temporary file is opened to receive the contents of the biggest
6128 diversion still in memory, freeing this memory for other diversions.
6129 When creating the temporary file, @code{m4} honors the value of the
6130 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
6131 So, it is theoretically possible that the number and aggregate size of
6132 diversions is limited only by available disk space.
6134 Diversions make it possible to generate output in a different order than
6135 the input was read. This makes it possible to implement topological sorting
6136 of dependencies. For example, @acronym{GNU} Autoconf makes use of
6137 diversions under the hood to ensure that the expansion of a prerequisite
6138 macro appears in the output prior to the expansion of a dependent macro,
6139 regardless of which order the two macros were invoked in the user's
6143 * Divert:: Diverting output
6144 * Undivert:: Undiverting output
6145 * Divnum:: Diversion numbers
6146 * Cleardivert:: Discarding diverted text
6150 @section Diverting output
6152 @cindex diverting output to files
6153 @cindex output, diverting to files
6154 @cindex files, diverting output to
6155 Output is diverted using @code{divert}:
6157 @deffn {Builtin (m4)} divert (@dvar{number, 0}, @ovar{text})
6158 The current diversion is changed to @var{number}. If @var{number} is left
6159 out or empty, it is assumed to be zero. If @var{number} cannot be
6160 parsed, the diversion is unchanged.
6162 @cindex @acronym{GNU} extensions
6163 As a @acronym{GNU} extension, if optional @var{text} is supplied and
6164 @var{number} was valid, then @var{text} is immediately output to the
6165 new diversion, regardless of whether the expansion of @code{divert}
6166 occurred while collecting arguments for another macro.
6168 The expansion of @code{divert} is void.
6171 When all the @code{m4} input has been processed, all existing
6172 diversions are automatically undiverted, in numerical order.
6176 This text is diverted.
6179 This text is not diverted.
6180 @result{}This text is not diverted.
6183 @result{}This text is diverted.
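
The numerical ordering at end of input can be seen in the following
short sketch, where diversion 2 is filled before diversion 1 but still
appears last in the output.

@example
divert(`2')two
divert(`1')one
divert`'dnl
zero
@result{}zero
^D
@result{}one
@result{}two
@end example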
6186 Several calls of @code{divert} with the same argument do not overwrite
6187 the previous diverted text, but append to it. Diversions are printed
6188 after any wrapped text is expanded.
6191 define(`text', `TEXT')
6193 divert(`1')`diverted text.'
6196 m4wrap(`Wrapped text precedes ')
6199 @result{}Wrapped TEXT precedes diverted text.
6202 @cindex discarding input
6203 @cindex input, discarding
6204 If output is diverted to a negative diversion, it is simply discarded.
6205 This can be used to suppress unwanted output. A common example of
6206 unwanted output is the trailing newlines after macro definitions. Here
6207 is a common programming idiom in @code{m4} for avoiding them.
6211 define(`foo', `Macro `foo'.')
6212 define(`bar', `Macro `bar'.')
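
A complete variant of the idiom, sketched here with the hypothetical
macros @code{greeting} and @code{target}, shows that the definitions
remain usable once output is switched back to diversion 0, while the
newlines that followed them are discarded.

@example
divert(`-1')
define(`greeting', `Hello')
define(`target', `world')
divert`'dnl
greeting, target!
@result{}Hello, world!
@end example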
6217 @cindex @acronym{GNU} extensions
6218 Traditional implementations only supported ten diversions. But as a
6219 @acronym{GNU} extension, diversion numbers can be as large as positive
6220 integers will allow, rather than treating a multi-digit diversion number
6221 as a request to discard text.
6224 divert(eval(`1<<28'))world
6231 The ability to immediately output extra text is a @acronym{GNU}
6232 extension, but it can prove useful for ensuring that text goes to a
6233 particular diversion no matter how many pending macro expansions are in
6234 progress. For a demonstration of why this is useful, it is important to
6235 understand in the example below why @samp{one} is output in diversion 2,
6236 not diversion 1, while @samp{three} and @samp{five} both end up in the
6237 correctly numbered diversion. The key point is that when @code{divert}
6238 is executed unquoted as part of the argument collection of another
6239 macro, the side effect takes place immediately, but the text @samp{one}
6240 is not passed to any diversion until after the @samp{divert(`2')} and
6241 the enclosing @code{echo} have also taken place. The example with
6242 @samp{three} shows how following the quoting rule of thumb delays the
6243 invocation of @code{divert} until it is not nested in any argument
6244 collection context, while the example with @samp{five} shows the use of
6245 the optional argument to speed up the output process.
6248 define(`echo', `$1')
6250 echo(divert(`1')`one'divert(`2'))`'dnl
6251 echo(`divert(`3')three`'divert(`4')')`'dnl
6252 echo(divert(`5', `five')divert(`6'))`'dnl
6269 Note that @code{divert} is an English word, but also an active macro
6270 without arguments. When processing plain text, the word might appear in
6271 normal text and be unintentionally swallowed as a macro invocation. One
6272 way to avoid this is to use the @option{-P} option to rename all
6273 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
6274 a wrapper that requires a parameter to be recognized.
6277 We decided to divert the stream for irrigation.
6278 @result{}We decided to the stream for irrigation.
6279 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
6285 We decided to divert the stream for irrigation.
6286 @result{}We decided to divert the stream for irrigation.
6290 @section Undiverting output
6292 Diverted text can be undiverted explicitly using the builtin
6295 @deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}})
6296 Undiverts the numeric @var{diversions} given by the arguments, in the
6297 order given. If no arguments are supplied, all diversions are
6298 undiverted, in numerical order.
6300 @cindex file inclusion
6301 @cindex inclusion, of files
6302 @cindex @acronym{GNU} extensions
6303 As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
6304 strings, which are treated as the names of files to copy into the output
6305 without expansion. A warning is issued if a file could not be opened.
6307 The expansion of @code{undivert} is void.
6312 This text is diverted.
6315 This text is not diverted.
6316 @result{}This text is not diverted.
6319 @result{}This text is diverted.
6323 Notice the last two blank lines. One of them comes from the newline
6324 following @code{undivert}, the other from the newline that followed the
6325 @code{divert}! A diversion often starts with a blank line like this.
6327 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
6328 but rather copied directly to the current output, and it is therefore
6329 not an error to undivert into a diversion. Undiverting the empty string
6330 is the same as specifying diversion 0; in either case nothing happens
6331 since the output has already been flushed.
6334 divert(`1')diverted text
6342 @result{}diverted text
6345 divert(`2')undivert(`1')diverted text`'divert
6351 @result{}diverted text
6354 When a diversion has been undiverted, the diverted text is discarded,
6355 and it is not possible to bring back diverted text more than once.
6359 This text is diverted first.
6360 divert(`0')undivert(`1')dnl
6362 @result{}This text is diverted first.
6366 This text is also diverted but not appended.
6367 divert(`0')undivert(`1')dnl
6369 @result{}This text is also diverted but not appended.
6372 Attempts to undivert the current diversion are silently ignored. Thus,
6373 when the current diversion is not 0, the current diversion does not get
6374 rearranged among the other diversions.
6382 divert(`2')undivert(`5', `2', `4')dnl
6383 undivert`'dnl effectively undivert(`1', `2', `3', `4', `5')
6384 divert`'undivert`'dnl
6392 @cindex @acronym{GNU} extensions
6393 @cindex file inclusion
6394 @cindex inclusion, of files
6395 @acronym{GNU} @code{m4} allows named files to be undiverted. Given a
6396 non-numeric argument, the contents of the file named will be copied,
6397 uninterpreted, to the current output. This complements the builtin
6398 @code{include} (@pxref{Include}). To illustrate the difference, assume
6399 the file @file{foo} contains:
6411 define(`bar', `BAR')
6421 If the file is not found (or cannot be read), an error message is
6422 issued, and the expansion is void. It is possible to intermix files
6423 and diversion numbers.
6426 divert(`1')diversion one
6427 divert(`2')undivert(`foo')dnl
6428 divert(`3')diversion three
6430 undivert(`1', `2', `foo', `3')dnl
6431 @result{}diversion one
6434 @result{}diversion three
6438 @section Diversion numbers
6440 @cindex diversion numbers
6441 The current diversion is tracked by the builtin @code{divnum}:
6443 @deffn {Builtin (m4)} divnum
6444 Expands to the number of the current diversion.
6451 Diversion one: divnum
6453 Diversion two: divnum
6456 @result{}Diversion one: 1
6458 @result{}Diversion two: 2
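
One use of @code{divnum} is to remember the current diversion so that
it can be restored later. The sketch below assumes a hypothetical
helper macro, @code{to_diversion}, that sends its second argument to the
diversion named by its first argument and then switches back; it is
meant to be invoked outside of any other macro's argument list.

@example
define(`to_diversion',
`pushdef(`_saved', divnum)divert(`$1')$2`'divert(_saved)popdef(`_saved')')dnl
to_diversion(`1', `footer text
')dnl
body text
@result{}body text
^D
@result{}footer text
@end example

The @code{pushdef} and @code{popdef} pair keeps the saved number local
to each call, so nested uses do not overwrite one another.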
6462 @section Discarding diverted text
6464 @cindex discarding diverted text
6465 @cindex diverted text, discarding
6466 Often it is not known, when output is diverted, whether the diverted
6467 text is actually needed. Since all non-empty diversions are brought back
6468 on the main output stream when the end of input is seen, a method of
6469 discarding a diversion is needed. If all diversions should be
6470 discarded, the easiest is to end the input to @code{m4} with
6471 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
6475 Diversion one: divnum
6477 Diversion two: divnum
6484 No output is produced at all.
6486 Clearing selected diversions can be done with the following macro:
6488 @deffn Composite cleardivert (@ovar{diversions@dots{}})
6489 Discard the contents of each of the listed numeric @var{diversions}.
6493 define(`cleardivert',
6494 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
6498 It is called just like @code{undivert}, but the effect is to clear the
6499 diversions given by the arguments. (This macro has a nasty bug! You
6500 should try to see if you can find it and correct it; or @pxref{Improved
6501 cleardivert, , Answers}).
6504 @chapter Extending M4 with dynamic runtime modules
6507 @cindex dynamic modules
6508 @cindex loadable modules
6509 @acronym{GNU} M4 1.4.x had a monolithic architecture. All of its
6510 functionality was contained in a single binary, and additional macros
6511 could be added only by writing more code in the M4 language, or at the
6512 extreme by hacking the sources and recompiling the whole thing to make
6513 a custom M4 installation.
6515 Starting with release 2.0, M4 uses Libtool's @code{libltdl} facilities
6516 (@pxref{Using libltdl, , libltdl, libtool, The GNU Libtool Manual})
6517 to move all of M4's builtins out to pluggable modules. Unless compile
6518 time options are set to change the default build, the installed M4 2.0
6519 binary is virtually identical to 1.4.x, supporting the same builtins.
6520 However, an optional module can be loaded into the running M4 interpreter
6521 to provide a new @code{load} builtin. This facilitates runtime
6522 extension of the M4 builtin macro list using compiled C code linked
6523 against a new shared library, typically named @file{libm4.so}.
6525 For example, you might want to add a @code{setenv} builtin to M4, to
6526 use before invoking @code{esyscmd}. We might write a @file{setenv.c}
6527 something like this:
6531 #include "m4module.h"
6535 m4_builtin m4_builtin_table[] =
6537 /* name handler flags minargs maxargs */
6538 @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2, 3 @},
6540 @{ NULL, NULL, 0, 0, 0 @}
6544 * setenv(NAME, VALUE, [OVERWRITE])
6546 M4BUILTIN_HANDLER (setenv)
6551 if (!m4_numeric_arg (context, argc, argv, 3, &overwrite))
6554 setenv (M4ARG (1), M4ARG (2), overwrite);
6558 Then, having compiled and linked the module, in (somewhat contrived)
6563 $ @kbd{M4MODPATH=`pwd` m4 --load-module=setenv}
6564 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6566 esyscmd(`ifconfig -a')dnl
6570 Or instead of loading the module from the M4 invocation, you can use
6571 the new @code{load} builtin:
6575 $ @kbd{M4MODPATH=`pwd` m4 --load-module=load}
6578 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6582 Also, at build time, you can choose which modules to build into
6583 the core (so that they will be available without dynamic loading).
6584 SUSv3 M4 functionality is contained in the module @samp{m4}, @acronym{GNU}
6585 extensions in the module @samp{gnu}, the @code{load} builtin in the
6586 module @samp{load} and so on.
6588 We hinted earlier that the @code{m4} and @code{gnu} modules are
6589 preloaded into the installed M4 binary, but it is possible to install
6590 a @emph{thinner} binary; for example, omitting the @acronym{GNU}
6591 extensions by configuring the distribution with @kbd{./configure
6592 --with-modules=m4}. For a binary built with that option to understand
6593 code that uses @acronym{GNU} extensions, you must then run @kbd{m4
6594 --load-module=gnu}. It is also possible to build a @emph{fatter}
6595 binary with additional modules preloaded: adding, say, the @code{load}
6596 builtin using @kbd{./configure --with-modules="m4 gnu load"}.
6598 @acronym{GNU} M4 now has a facility for defining additional builtins without
6599 recompiling the sources. In actual fact, all of the builtins provided
6600 by @acronym{GNU} M4 are loaded from such modules. All of the builtin
6601 descriptions in this manual are annotated with the module from which
6602 they are loaded -- mostly from the module @samp{m4}.
6604 When you start @acronym{GNU} M4, the modules @samp{m4} and @samp{gnu} are
6605 loaded by default. If you supply the @option{-G} option at startup, the
6606 module @samp{traditional} is loaded instead of @samp{gnu}.
6607 @xref{Compatibility}, for more details on the differences between these
6608 two modes of startup.
6611 * M4modules:: Listing loaded modules
6612 * Load:: Loading additional modules
6613 * Unload:: Removing loaded modules
6614 * Refcount:: Tracking module references
6615 * Standard Modules:: Standard bundled modules
6619 @section Listing loaded modules
6621 @deffn {Builtin (load)} m4modules
6622 Expands to a quoted ordered list of currently loaded modules,
6623 with the most recently loaded module at the front of the list. Loading
6624 a module multiple times will not affect the order of this list; the
6625 position depends on when the module was @emph{first} loaded.
6628 For example, if @acronym{GNU} @code{m4} is started with the
6629 @option{-m load} option to load the module @samp{load} and make this
6630 builtin available, @code{m4modules} will yield the following:
6632 @comment options: -m load
6636 @result{}load,gnu,m4
6640 @section Loading additional modules
6642 @deffn {Builtin (load)} load (@var{module-name})
6643 @var{module-name} will be searched for along the module search path
6644 (@pxref{Standard Modules}) and loaded if found. Loading a module
6645 consists of running its initialization function (if any) and then adding
6646 any macros it provides to the internal table.
6648 The macro @code{load} is recognized only with parameters.
6651 Once the @code{load} module has successfully loaded, use of the
6652 @samp{load} macro is entirely equivalent to the @option{-m} command line
6655 @c The -mmpeval/--unload=mpeval pair allows the testsuite to skip this
6656 @c test if mpeval was not configured for usage.
6657 @comment options: -m load -m mpeval --unload-module=mpeval
6661 @result{}load,gnu,m4
6665 @result{}mpeval,load,gnu,m4
6669 @section Removing loaded modules
6671 @deffn {Builtin (load)} unload (@var{module-name})
6672 Any loaded modules that can be listed by the @code{m4modules} macro can be
6673 removed by naming them as the @var{module-name} parameter of the
6674 @code{unload} macro. Unloading a module consists of removing all of the
6675 macros it provides from the internal table of visible macros, and
6676 running the module's finalization method (if any).
6678 The macro @code{unload} is recognized only with parameters.
6681 @comment options: -m mpeval -m load
6683 $ @kbd{m4 -m mpeval -m load}
6685 @result{}load,mpeval,gnu,m4
6689 @result{}load,gnu,m4
6693 @section Tracking module references
6695 @deffn {Builtin (load)} refcount (@var{module-name})
6696 This macro expands to an integer representing the number of times
6697 @var{module-name} has been loaded but not yet unloaded. No warning is
6698 issued, even if @var{module-name} does not represent a valid module.
6700 The macro @code{refcount} is recognized only with parameters.
6703 This example demonstrates tracking the reference count of the gnu
6706 @comment options: -m load
6710 @result{}load,gnu,m4
6714 @result{}load,gnu,m4
6722 @result{}load,gnu,m4
6731 refcount(`NoSuchModule')
6735 @node Standard Modules
6736 @section Standard bundled modules
6738 @acronym{GNU} @code{m4} ships with several bundled modules as standard.
6739 By convention, these modules define a text macro that can be tested
6740 with @code{ifdef} when they are loaded; only the @code{m4} module lacks
6741 this feature test macro, since it is not permitted by @acronym{POSIX}.
6742 Each of the feature test macros is intended to be used without
6747 Provides all of the builtins defined by @acronym{POSIX}. This module
6748 is always loaded --- @acronym{GNU} @code{m4} would only be a very slow
6749 version of @command{cat} without the builtins supplied by this module.
6752 Provides all of the @acronym{GNU} extensions, as defined by
6753 @acronym{GNU} M4 through the 1.4.x release series. It also provides a
6754 couple of feature test macros:
6756 @deffn {Macro (gnu)} __gnu__
6757 Expands to the empty string, as an indication that the @samp{gnu}
6761 @deffn {Macro (gnu)} __m4_version__
6762 Expands to an unquoted string containing the release version number of
6763 the running @acronym{GNU} @code{m4} executable.
6766 This module is always loaded, unless the @option{-G} command line
6767 option is supplied at startup (@pxref{Limits control, , Invoking m4}).
6770 This module provides compatibility with System V @code{m4}, for anything
6771 not specified by @acronym{POSIX}, and is loaded instead of the
6772 @samp{gnu} module if the @option{-G} command line option is specified.
6774 @deffn {Macro (traditional)} __traditional__
6775 Expands to the empty string, as an indication that the
6776 @samp{traditional} module is loaded.
6780 This module supplies the builtins required to use modules from within a
6781 @acronym{GNU} @code{m4} program. @xref{Modules}, for more details. The
6782 module also defines the following macro:
6784 @deffn {Macro (load)} __load__
6785 Expands to the empty string, as an indication that the @samp{load}
6790 This module provides the implementation for the experimental
6791 @code{mpeval} feature. If the host machine does not have the
6792 @acronym{GNU} gmp library, the builtin will generate an error if called.
6793 @xref{Mpeval}, for more details. The module also defines the following
6796 @deffn {Macro (mpeval)} __mpeval__
6797 Expands to the empty string, as an indication that the @samp{mpeval}
6802 Here is an example of using the feature test macros.
6806 __gnu__-__traditional__
6807 @result{}-__traditional__
6808 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6809 @result{}Extensions are active
6811 @error{}m4:stdin:3: Warning: __gnu__: extra arguments ignored: 1 > 0
6815 @comment options: -G
6817 $ @kbd{m4 --traditional}
6818 __gnu__-__traditional__
6820 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6821 @result{}Minimal features
6824 Since the version string is unquoted and can potentially contain macro
6825 names (for example, a beta release could be numbered @samp{1.9b}), or be
6826 impacted by the use of @code{changesyntax}, the
6827 @code{__m4_version__} macro should generally be used via @code{defn}
6828 rather than directly invoked (@pxref{Defn}). In general, feature tests
6829 are more reliable than version number checks, so exercise caution when
6832 @comment This test is excluded from the testsuite since it depends on a
6833 @comment texinfo macro; but builtins.at covers the same thing.
6836 defn(`__m4_version__')
6837 @result{}@value{VERSION}
6841 @chapter Macros for text handling
6843 There are a number of builtins in @code{m4} for manipulating text in
6844 various ways, extracting substrings, searching, substituting, and so on.
6847 * Len:: Calculating length of strings
6848 * Index macro:: Searching for substrings
6849 * Regexp:: Searching for regular expressions
6850 * Substr:: Extracting substrings
6851 * Translit:: Translating characters
6852 * Patsubst:: Substituting text by regular expression
6853 * Format:: Formatting strings (printf-like)
6857 @section Calculating length of strings
6859 @cindex length of strings
6860 @cindex strings, length of
6861 The length of a string can be calculated by @code{len}:
6863 @deffn {Builtin (m4)} len (@var{string})
6864 Expands to the length of @var{string}, as a decimal number.
6866 The macro @code{len} is recognized only with parameters.
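
As a small illustration, the following sketch measures the empty
string, a literal string, and the expansion of a macro. The macro
@code{name} is hypothetical; note that quoting it measures the name
itself rather than its expansion.

@example
len()
@result{}0
len(`abcdef')
@result{}6
define(`name', `GNU m4')dnl
len(name)
@result{}6
len(`name')
@result{}4
@end example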
6877 @section Searching for substrings
6879 @cindex substrings, locating
6880 Searching for substrings is done with @code{index}:
6882 @deffn {Builtin (m4)} index (@var{string}, @var{substring}, @ovar{offset})
6883 Expands to the index of the first occurrence of @var{substring} in
6884 @var{string}. The first character in @var{string} has index 0. If
6885 @var{substring} does not occur in @var{string}, @code{index} expands to
6886 @samp{-1}. If @var{offset} is provided, it determines the index at
6887 which the search starts; a negative @var{offset} specifies the offset
6888 relative to the end of @var{string}.
6890 The macro @code{index} is recognized only with parameters.
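
Since a missing substring yields @samp{-1}, @code{index} combines
naturally with @code{ifelse} to test for containment. The macro
@code{contains} below is a hypothetical helper written for this sketch,
not a builtin.

@example
define(`contains', `ifelse(index(`$1', `$2'), `-1', `no', `yes')')dnl
contains(`hello world', `wor')
@result{}yes
contains(`hello world', `xyz')
@result{}no
@end example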
6894 index(`gnus, gnats, and armadillos', `nat')
6896 index(`gnus, gnats, and armadillos', `dag')
6900 Omitting @var{substring} evokes a warning, but still produces output;
6901 contrast this with an empty @var{substring}.
6905 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
6913 @cindex @acronym{GNU} extensions
6914 As an extension, an @var{offset} can be provided to limit the search to
6915 the tail of the @var{string}. A negative offset is interpreted relative
6916 to the end of @var{string}, and it is not an error if @var{offset}
6917 exceeds the bounds of @var{string}.
6920 index(`aba', `a', `1')
6922 index(`ababa', `ba', `-3')
6924 index(`abc', `ab', `4')
6926 index(`abc', `bc', `-4')
6931 @section Searching for regular expressions
6933 @cindex regular expressions
6934 @cindex expressions, regular
6935 @cindex @acronym{GNU} extensions
6936 Searching for regular expressions is done with the builtin
6939 @deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax})
6940 @deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @
6941 @ovar{replacement}, @ovar{resyntax})
6942 Searches for @var{regexp} in @var{string}.
6944 If @var{resyntax} is given, the particular flavor of regular expression
6945 understood with respect to @var{regexp} can be changed from the current
6946 default. @xref{Changeresyntax}, for details of the values that can be
6947 given for this argument. If exactly three arguments are given, then the
6948 third argument is treated as @var{resyntax} only if it matches a known
6949 syntax name, otherwise it is treated as @var{replacement}.
6951 If @var{replacement} is omitted, @code{regexp} expands to the index of
6952 the first match of @var{regexp} in @var{string}. If @var{regexp} does
6953 not match anywhere in @var{string}, it expands to -1.
6955 If @var{replacement} is supplied, and there was a match, @code{regexp}
6956 changes the expansion to this argument, with @samp{\@var{n}} substituted
6957 by the text matched by the @var{n}th parenthesized sub-expression of
6958 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
6959 replaced by the text of the entire regular expression matched. For
6960 all other characters, @samp{\} treats the next character literally. A
6961 warning is issued if there were fewer sub-expressions than the
6962 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
6963 was no match, @code{regexp} expands to the empty string.
6965 The macro @code{regexp} is recognized only with parameters.
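
The two modes of operation can be compared in the following sketch,
which assumes the default regular expression syntax; the input string
is made up for illustration. Without @var{replacement} the result is an
index, while with @var{replacement} it is text built from the match.

@example
regexp(`version 1.4.19', `[0-9]')
@result{}8
regexp(`version 1.4.19', `[0-9]+\.[0-9]+', `\&')
@result{}1.4
regexp(`version 1.4.19', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example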
6969 regexp(`GNUs not Unix', `\<[a-z]\w+')
6971 regexp(`GNUs not Unix', `\<Q\w*')
6973 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
6974 @result{}*** Unix *** nix ***
6975 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
6979 Here are some more examples of the handling of backslash:
6982 regexp(`abc', `\(b\)', `\\\10\a')
6984 regexp(`abc', `b', `\1\')
6985 @error{}m4:stdin:2: Warning: regexp: sub-expression 1 not present
6986 @error{}m4:stdin:2: Warning: regexp: trailing \ ignored in replacement
6988 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
6989 @error{}m4:stdin:3: Warning: regexp: sub-expression 4 not present
6990 @error{}m4:stdin:3: Warning: regexp: sub-expression 5 not present
6991 @error{}m4:stdin:3: Warning: regexp: sub-expression 6 not present
6995 Omitting @var{regexp} evokes a warning, but still produces output;
6996 contrast this with an empty @var{regexp} argument.
7000 @error{}m4:stdin:1: Warning: regexp: too few arguments: 1 < 2
7004 regexp(`abc', `', `\\def')
7008 If @var{resyntax} is given, @var{regexp} must be given according to
7009 the syntax chosen, though the default regular expression syntax
7010 remains unchanged for other invocations:
7013 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***',
7015 @result{}*** Unix *** nix ***
7016 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***')
7020 Occasionally, you might want to pass a @var{resyntax} argument without
7021 wishing to give @var{replacement}. If there are exactly three
7022 arguments, and the last argument is a valid @var{resyntax}, it is used
7023 as such, rather than as a replacement.
7026 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED')
7028 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED')
7029 @result{}POSIX_EXTENDED
7030 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `')
7032 regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `')
7033 @result{}POSIX_EXTENDED
7037 @section Extracting substrings
7039 @cindex extracting substrings
7040 @cindex substrings, extracting
7041 Substrings are extracted with @code{substr}:
7043 @deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length}, @
7045 Performs a substring operation on @var{string}. If @var{from} is
7046 positive, it represents the 0-based index where the substring begins.
7047 If @var{length} is omitted, the substring ends at the end of
7048 @var{string}; if it is positive, @var{length} is added to the starting
7049 index to determine the ending index.
7051 @cindex @acronym{GNU} extensions
7052 As a @acronym{GNU} extension, if @var{from} is negative, it is added to
7053 the length of @var{string} to determine the starting index; if it is
7054 empty, the start of the string is used. Likewise, if @var{length} is
7055 negative, it is added to the length of @var{string} to determine the
7056 ending index, and an empty @var{length} behaves like an omitted
7057 @var{length}. It is not an error if either of the resulting indices lies
7058 outside the string, but the selected substring only contains the bytes
7059 of @var{string} that overlap the selected indices. If the end point
7060 lies before the beginning point, the substring chosen is the empty
7061 string located at the starting index.
7063 If @var{replace} is omitted, then the expansion is only the selected
7064 substring, which may be empty. As a @acronym{GNU} extension, if
7065 @var{replace} is provided, then the expansion is the original
7066 @var{string} with the selected substring replaced by @var{replace}. The
7067 expansion is empty and a warning issued if @var{from} or @var{length}
7068 cannot be parsed, or if @var{replace} is provided but the selected
7069 indices do not overlap with @var{string}.
7071 The macro @code{substr} is recognized only with parameters.
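
In practice @code{substr} is often paired with @code{index} to split a
string at a delimiter. The sketch below uses only non-negative
@var{from} and @var{length} values; the macro @code{after_colon} and the
sample string are hypothetical.

@example
define(`after_colon', `substr(`$1', incr(index(`$1', `:')))')dnl
after_colon(`key:value')
@result{}value
substr(`key:value', `0', index(`key:value', `:'))
@result{}key
@end example

If the string contains no colon, @code{index} yields @samp{-1}, so
@code{after_colon} starts at index 0 and expands to the whole string.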
7075 substr(`gnus, gnats, and armadillos', `6')
7076 @result{}gnats, and armadillos
7077 substr(`gnus, gnats, and armadillos', `6', `5')
7081 Omitting @var{from} evokes a warning, but still produces output. On the
7082 other hand, selecting a @var{from} or @var{length} that lies beyond
7083 @var{string} is not a problem.
7087 @error{}m4:stdin:1: Warning: substr: too few arguments: 1 < 2
7093 substr(`abc', `1', `4')
Using a negative value for @var{from} or @var{length} is a @acronym{GNU}
extension, useful for accessing a fixed size tail of an
7099 arbitrary-length string. Prior to M4 1.6, using these values would
7100 silently result in the empty string. Some other implementations crash
7101 on negative values, and many treat an explicitly empty @var{length} as
7102 0, which is different from the omitted @var{length} implying the rest of
7103 the original @var{string}.
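
Where the negative-index extension cannot be relied upon, the same
fixed-size tail can be computed from @code{len} and @code{eval}.  The
following is only a sketch: the helper name @code{tail} is hypothetical
and not part of @code{m4}, and it assumes the requested tail is no
longer than the string itself.

@example
define(`tail', `substr(`$1', eval(len(`$1')` - ($2)'))')dnl
tail(`abcde', `3')
@result{}cde
@end example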
7106 substr(`abcde', `2', `')
7108 substr(`abcde', `-3')
7110 substr(`abcde', `', `-3')
7112 substr(`abcde', `-6')
7114 substr(`abcde', `-6', `5')
7116 substr(`abcde', `-7', `1')
7118 substr(`abcde', `1', `-2')
7120 substr(`abcde', `-4', `-1')
7122 substr(`abcde', `4', `-3')
7124 substr(`abcdefghij', `-09', `08')
7128 Another useful @acronym{GNU} extension, also added in M4 1.6, is the
7129 ability to replace a substring within the original @var{string}. An
7130 empty length substring at the beginning or end of @var{string} is valid,
7131 but selecting a substring that does not overlap @var{string} causes a
7135 substr(`abcde', `1', `3', `t')
7137 substr(`abcde', `5', `', `f')
7139 substr(`abcde', `-3', `-4', `f')
7141 substr(`abcde', `-6', `1', `f')
7143 substr(`abcde', `-7', `1', `f')
7144 @error{}m4:stdin:5: Warning: substr: substring out of range
7146 substr(`abcde', `6', `', `f')
7147 @error{}m4:stdin:6: Warning: substr: substring out of range
If backwards compatibility with M4 1.4.x behavior is necessary, the
7152 following macro is sufficient to do the job (mimicking warnings about
7153 empty @var{from} or @var{length} or an ignored fourth argument is left
7154 as an exercise to the reader).
7157 define(`substr', `ifelse(`$#', `0', ``$0'',
7158 eval(`2 < $#')`$3', `1', `',
7159 index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')')
7161 substr(`abcde', `3')
7163 substr(`abcde', `3', `')
7165 substr(`abcde', `-1')
7167 substr(`abcde', `1', `-1')
7169 substr(`abcde', `2', `1', `C')
7173 On the other hand, it is possible to portably emulate the @acronym{GNU}
7174 extension of negative @var{from} and @var{length} arguments across all
7175 @code{m4} implementations, albeit with a lot more overhead. This
7176 example uses @code{incr} and @code{decr} to normalize @samp{-08} to
7177 something that a later @code{eval} will treat as a decimal value, rather
7178 than looking like an invalid octal number, while avoiding using these
7179 macros on an empty string. The helper macro @code{_substr_normalize} is
7180 recursive, since it is easier to fix @var{length} after @var{from} has
7181 been normalized, with the final iteration supplying two non-negative
7182 arguments to the original builtin, now named @code{_substr}.
7184 @comment options: -daq -t_substr
7186 $ @kbd{m4 -daq -t _substr}
7187 define(`_substr', defn(`substr'))dnl
7188 define(`substr', `ifelse(`$#', `0', ``$0'',
7189 `_$0(`$1', _$0_normalize(len(`$1'),
7190 ifelse(`$2', `', `0', `incr(decr(`$2'))'),
7191 ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl
7192 define(`_substr_normalize', `ifelse(
7193 eval(`$2 < 0 && $1 + $2 >= 0'), `1',
7194 `$0(`$1', eval(`$1 + $2'), `$3')',
7195 eval(`$2 < 0')`$3', `1', ``0', `$1'',
7196 eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1',
7197 `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))',
7198 eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'',
7199 eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')',
7200 `$3', `', ``$2', `$1'',
7201 eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1',
7202 ``$2', eval(`$1 - $2 + $3')',
7203 eval(`$3 - 0 < 0'), `1', ``$2', `0'',
7205 substr(`abcde', `2', `')
7206 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7208 substr(`abcde', `-3')
7209 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7211 substr(`abcde', `', `-3')
7212 @error{}m4trace: -1- _substr(`abcde', `0', `2')
7214 substr(`abcde', `-6')
7215 @error{}m4trace: -1- _substr(`abcde', `0', `5')
7217 substr(`abcde', `-6', `5')
7218 @error{}m4trace: -1- _substr(`abcde', `0', `4')
7220 substr(`abcde', `-7', `1')
7221 @error{}m4trace: -1- _substr(`abcde', `0', `0')
7223 substr(`abcde', `1', `-2')
7224 @error{}m4trace: -1- _substr(`abcde', `1', `2')
7226 substr(`abcde', `-4', `-1')
7227 @error{}m4trace: -1- _substr(`abcde', `1', `3')
7229 substr(`abcde', `4', `-3')
7230 @error{}m4trace: -1- _substr(`abcde', `4', `0')
7232 substr(`abcdefghij', `-09', `08')
7233 @error{}m4trace: -1- _substr(`abcdefghij', `1', `8')
7238 @section Translating characters
7240 @cindex translating characters
7241 @cindex characters, translating
7242 Character translation is done with @code{translit}:
7244 @deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement})
7245 Expands to @var{string}, with each character that occurs in
7246 @var{chars} translated into the character from @var{replacement} with
7249 If @var{replacement} is shorter than @var{chars}, the excess characters
7250 of @var{chars} are deleted from the expansion; if @var{chars} is
7251 shorter, the excess characters in @var{replacement} are silently
7252 ignored. If @var{replacement} is omitted, all characters in
7253 @var{string} that are present in @var{chars} are deleted from the
7254 expansion. If a character appears more than once in @var{chars}, only
7255 the first instance is used in making the translation. Only a single
7256 translation pass is made, even if characters in @var{replacement} also
7257 appear in @var{chars}.
7259 As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
7260 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
7261 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
7262 in @var{chars} or @var{replacement}, place it first or last in the
7263 entire string, or as the last character of a range. Back-to-back ranges
7264 can share a common endpoint. It is not an error for the last character
7265 in the range to be `larger' than the first. In that case, the range
7266 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
7267 The expansion of a range is dependent on the underlying encoding of
7268 characters, so using ranges is not always portable between machines.
7270 The macro @code{translit} is recognized only with parameters.
7274 translit(`GNUs not Unix', `A-Z')
7276 translit(`GNUs not Unix', `a-z', `A-Z')
7277 @result{}GNUS NOT UNIX
7278 translit(`GNUs not Unix', `A-Z', `z-a')
7279 @result{}tmfs not fnix
7280 translit(`+,-12345', `+--1-5', `<;>a-c-a')
7282 translit(`abcdef', `aabdef', `bcged')
7286 In the @sc{ascii} encoding, the first example deletes all uppercase
7287 letters, the second converts lowercase to uppercase, and the third
7288 `mirrors' all uppercase letters, while converting them to lowercase.
The first two cases are by far the most common, even though they are not
7290 portable to @sc{ebcdic} or other encodings. The fourth example shows a
7291 range ending in @samp{-}, as well as back-to-back ranges. The final
7292 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
7293 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
7294 @samp{e} are swapped, and the @samp{f} is discarded.
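
Back-to-back ranges also make simple substitution ciphers easy to
express.  As an illustrative sketch, assuming the @sc{ascii} encoding,
the hypothetical helper @code{rot13} below (not a builtin) rotates each
letter by thirteen places and leaves all other characters alone:

@example
define(`rot13', `translit(`$1', `A-Za-z', `N-ZA-Mn-za-m')')dnl
rot13(`Hello, world')
@result{}Uryyb, jbeyq
@end example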
7296 Omitting @var{chars} evokes a warning, but still produces output.
7300 @error{}m4:stdin:1: Warning: translit: too few arguments: 1 < 2
7305 @section Substituting text by regular expression
7307 @cindex regular expressions
7308 @cindex expressions, regular
7309 @cindex pattern substitution
7310 @cindex substitution by regular expression
7311 @cindex @acronym{GNU} extensions
7312 Global substitution in a string is done by @code{patsubst}:
7314 @deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @
7315 @ovar{replacement}, @ovar{resyntax})
7316 Searches @var{string} for matches of @var{regexp}, and substitutes
7317 @var{replacement} for each match.
7319 If @var{resyntax} is given, the particular flavor of regular expression
7320 understood with respect to @var{regexp} can be changed from the current
7321 default. @xref{Changeresyntax}, for details of the values that can be
given for this argument. Unlike @code{regexp}, if exactly three
arguments are given, the third argument is always treated as
7324 @var{replacement}, even if it matches a known syntax name.
7326 The parts of @var{string} that are not covered by any match of
7327 @var{regexp} are copied to the expansion. Whenever a match is found, the
7328 search proceeds from the end of the match, so a character from
7329 @var{string} will never be substituted twice. If @var{regexp} matches a
7330 string of zero length, the start position for the search is incremented,
7331 to avoid infinite loops.
7333 When a replacement is to be made, @var{replacement} is inserted into
7334 the expansion, with @samp{\@var{n}} substituted by the text matched by
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
7336 nine sub-expressions. The escape @samp{\&} is replaced by the text of
7337 the entire regular expression matched. For all other characters,
7338 @samp{\} treats the next character literally. A warning is issued if
7339 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
7340 if there is a trailing @samp{\}.
7342 The @var{replacement} argument can be omitted, in which case the text
7343 matched by @var{regexp} is deleted.
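
As a brief sketch of these rules under the default regular expression
syntax, the following calls reorder sub-expressions with
@samp{\@var{n}}, collapse a run of separators, and delete matches by
omitting @var{replacement}.  The input strings are made up for
illustration.

@example
patsubst(`key=value', `\(\w+\)=\(\w+\)', `\2=\1')
@result{}value=key
patsubst(`2009-01-31', `\([0-9]+\)-\([0-9]+\)-\([0-9]+\)', `\3/\2/\1')
@result{}31/01/2009
patsubst(`a,b,,c', `,+', ` ')
@result{}a b c
patsubst(`a1b22c333', `[0-9]+')
@result{}abc
@end example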
7345 The macro @code{patsubst} is recognized only with parameters.
7348 When used with two arguments, @code{regexp} returns the position of the
7349 match, but @code{patsubst} deletes the match:
7352 patsubst(`GNUs not Unix', `^', `OBS: ')
7353 @result{}OBS: GNUs not Unix
7354 patsubst(`GNUs not Unix', `\<', `OBS: ')
7355 @result{}OBS: GNUs OBS: not OBS: Unix
7356 patsubst(`GNUs not Unix', `\w*', `(\&)')
7357 @result{}(GNUs)() (not)() (Unix)()
7358 patsubst(`GNUs not Unix', `\w+', `(\&)')
7359 @result{}(GNUs) (not) (Unix)
7360 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
7361 @result{}GN not@w{ }
7362 patsubst(`GNUs not Unix', `not', `NOT\')
7363 @error{}m4:stdin:6: Warning: patsubst: trailing \ ignored in replacement
7364 @result{}GNUs NOT Unix
7367 Here is a slightly more realistic example, which capitalizes individual
7368 words or whole sentences, by substituting calls of the macros
7369 @code{upcase} and @code{downcase} into the strings.
7371 @deffn Composite upcase (@var{text})
7372 @deffnx Composite downcase (@var{text})
7373 @deffnx Composite capitalize (@var{text})
7374 Expand to @var{text}, but with capitalization changed: @code{upcase}
7375 changes all letters to upper case, @code{downcase} changes all letters
7376 to lower case, and @code{capitalize} changes the first character of each
7377 word to upper case and the remaining characters to lower case.
7380 First, an example of their usage, using implementations distributed in
7381 @file{m4-@value{VERSION}/@/examples/@/capitalize.m4}.
7385 $ @kbd{m4 -I examples}
7386 include(`capitalize.m4')
7388 upcase(`GNUs not Unix')
7389 @result{}GNUS NOT UNIX
7390 downcase(`GNUs not Unix')
7391 @result{}gnus not unix
7392 capitalize(`GNUs not Unix')
7393 @result{}Gnus Not Unix
7396 Now for the implementation. There is a helper macro @code{_capitalize}
7397 which puts only its first word in mixed case. Then @code{capitalize}
7398 merely parses out the words, and replaces them with an invocation of
7399 @code{_capitalize}. (As presented here, the @code{capitalize} macro has
7400 some subtle flaws. You should try to see if you can find and correct
7401 them; or @pxref{Improved capitalize, , Answers}).
7405 $ @kbd{m4 -I examples}
7406 undivert(`capitalize.m4')dnl
7407 @result{}divert(`-1')
7408 @result{}# upcase(text)
7409 @result{}# downcase(text)
7410 @result{}# capitalize(text)
7411 @result{}# change case of text, simple version
7412 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
7413 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
7414 @result{}define(`_capitalize',
7415 @result{} `regexp(`$1', `^\(\w\)\(\w*\)',
7416 @result{} `upcase(`\1')`'downcase(`\2')')')
7417 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
7418 @result{}divert`'dnl
7421 If @var{resyntax} is given, @var{regexp} must be given according to
7422 the syntax chosen, though the default regular expression syntax
7423 remains unchanged for other invocations:
7427 `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl
7428 epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_')
7429 @result{}_bar_ _baz_
7430 patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_')
7431 @result{}_bar_ _baz_
7434 While @code{regexp} replaces the whole input with the replacement as
7435 soon as there is a match, @code{patsubst} replaces each
7436 @emph{occurrence} of a match and preserves non-matching pieces:
7442 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
7443 @result{}bar FOO baz FOO
7445 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
7446 @result{}bab abb 212
7450 Omitting @var{regexp} evokes a warning, but still produces output;
7451 contrast this with an empty @var{regexp} argument.
7455 @error{}m4:stdin:1: Warning: patsubst: too few arguments: 1 < 2
7459 patsubst(`abc', `', `\\-')
7460 @result{}\-a\-b\-c\-
7464 @section Formatting strings (printf-like)
7466 @cindex formatted output
7467 @cindex output, formatted
7468 @cindex @acronym{GNU} extensions
7469 Formatted output can be made with @code{format}:
7471 @deffn {Builtin (gnu)} format (@var{format-string}, @dots{})
7472 Works much like the C function @code{printf}. The first argument
7473 @var{format-string} can contain @samp{%} specifications which are
7474 satisfied by additional arguments, and the expansion of @code{format} is
7475 the formatted string.
7477 The macro @code{format} is recognized only with parameters.
7480 Its use is best described by a few examples:
7482 @comment This test is a bit fragile, if someone tries to port to a
7483 @comment platform without infinity.
7485 define(`foo', `The brown fox jumped over the lazy dog')
7487 format(`The string "%s" uses %d characters', foo, len(foo))
7488 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
7489 format(`%*.*d', `-1', `-1', `1')
7491 format(`%.0f', `56789.9876')
7493 len(format(`%-*X', `5000', `1'))
7495 ifelse(format(`%010F', `infinity'), ` INF', `success',
7496 format(`%010F', `infinity'), ` INFINITY', `success',
7497 format(`%010F', `infinity'))
7499 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
7500 format(`%.1A', `1.999'), `0X2.0P+0', `success',
7501 format(`%.1A', `1.999'))
7503 format(`%g', `0xa.P+1')
7507 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
7508 example shows how @code{format} can be used to produce tabular output.
7512 $ @kbd{m4 -I examples}
7513 include(`forloop.m4')
7515 forloop(`i', `1', `10', `format(`%6d squared is %10d
7517 @result{} 1 squared is 1
7518 @result{} 2 squared is 4
7519 @result{} 3 squared is 9
7520 @result{} 4 squared is 16
7521 @result{} 5 squared is 25
7522 @result{} 6 squared is 36
7523 @result{} 7 squared is 49
7524 @result{} 8 squared is 64
7525 @result{} 9 squared is 81
7526 @result{} 10 squared is 100
7530 The builtin @code{format} is modeled after the ANSI C @samp{printf}
7531 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
7532 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
7533 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
7534 @samp{%}; it supports field widths and precisions, and the flags
7535 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
7536 integer specifiers, the width modifiers @samp{hh}, @samp{h}, and
7537 @samp{l} are recognized, and for floating point specifiers, the width
7538 modifier @samp{l} is recognized. Items not yet supported include
7539 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
7540 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
7541 modifiers, and any platform extensions available in the native
7542 @code{printf}. For more details on the functioning of @code{printf},
7543 see the C Library Manual, or the @acronym{POSIX} specification (for
7544 example, @samp{%a} is supported even on platforms that haven't yet
7545 implemented C99 hexadecimal floating point output natively).
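
The following sketch exercises a few of the supported flags, field
widths, and precisions; the argument values are arbitrary and chosen
only for illustration.

@example
format(`%05d', `42')
@result{}00042
format(`%x', `255')
@result{}ff
format(`%.2f', `3.14159')
@result{}3.14
format(`%-5s|', `ab')
@result{}ab   |
@end example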
7547 @c FIXME - format still needs some improvements.
7548 Warnings are issued for unrecognized specifiers, an improper number of
7549 arguments, or difficulty parsing an argument according to the format
7550 string (such as overflow or extra characters). It is anticipated that a
7551 future release of @acronym{GNU} @code{m4} will support more specifiers.
7552 Likewise, escape sequences are not yet recognized.
7556 @error{}m4:stdin:1: Warning: format: unrecognized specifier in `%p'
7559 @error{}m4:stdin:2: Warning: format: empty string treated as 0
7560 @error{}m4:stdin:2: Warning: format: too few arguments: 2 < 3
7562 format(`%.1f', `2a')
7563 @error{}m4:stdin:3: Warning: format: non-numeric argument `2a'
7568 @chapter Macros for doing arithmetic
7571 @cindex integer arithmetic
7572 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
7573 convenient shorthands, there are builtins for simple increment and
7574 decrement operations.
7577 * Incr:: Decrement and increment operators
7578 * Eval:: Evaluating integer expressions
7579 * Mpeval:: Multiple precision arithmetic
7583 @section Decrement and increment operators
7585 @cindex decrement operator
7586 @cindex increment operator
7587 Increment and decrement of integers are supported using the builtins
7588 @code{incr} and @code{decr}:
7590 @deffn {Builtin (m4)} incr (@var{number})
7591 @deffnx {Builtin (m4)} decr (@var{number})
7592 Expand to the numerical value of @var{number}, incremented
7593 or decremented, respectively, by one. Except for the empty string, the
7594 expansion is empty if @var{number} could not be parsed.
7596 The macros @code{incr} and @code{decr} are recognized only with
7606 @error{}m4:stdin:3: Warning: incr: empty string treated as 0
7609 @error{}m4:stdin:4: Warning: decr: empty string treated as 0
7613 The builtin macros @code{incr} and @code{decr} are recognized only when
7617 @section Evaluating integer expressions
7619 @cindex integer expression evaluation
7620 @cindex evaluation, of integer expressions
7621 @cindex expressions, evaluation of integer
7622 Integer expressions are evaluated with @code{eval}:
7624 @deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
7625 Expands to the value of @var{expression}. The expansion is empty
7626 if a problem is encountered while parsing the arguments. If specified,
7627 @var{radix} and @var{width} control the format of the output.
7629 Calculations are done with signed numbers, using at least 31-bit
7630 precision, but as a @acronym{GNU} extension, @code{m4} will use wider
7631 integers if available. Precision is finite, based on the platform's
7632 notion of @code{intmax_t}, and overflow silently results in wraparound.
7633 A warning is issued if division by zero is attempted, or if
7634 @var{expression} could not be parsed.
7636 Expressions can contain the following operators, listed in order of
7637 decreasing precedence.
7643 Unary plus and minus, and bitwise and logical negation
7647 Multiplication, division, modulo, and ratio
7649 Addition and subtraction
7651 Shift left, shift right, unsigned shift right
7653 Relational operators
7659 Bitwise exclusive-or
7669 Sequential evaluation
7672 The macro @code{eval} is recognized only with parameters.
7675 All binary operators, except exponentiation, are left associative. C
7676 operators that perform variable assignment, such as @samp{+=} or
7677 @samp{--}, are not implemented, since @code{eval} only operates on
7678 constants, not variables. Attempting to use them results in an error.
7679 @comment FIXME - since XCU ERN 137 is approved, we could provide an
7680 @comment extension that supported assignment operators.
7682 Note that some older @code{m4} implementations use @samp{^} as an
7683 alternate operator for the exponentiation, although @acronym{POSIX}
7684 requires the C behavior of bitwise exclusive-or. The precedence of the
7685 negation operators, @samp{~} and @samp{!}, was traditionally lower than
7686 equality. The unary operators could not be used reliably more than once
7687 on the same term without intervening parentheses. The traditional
7688 precedence of the equality operators @samp{==} and @samp{!=} was
7689 identical instead of lower than the relational operators such as
7690 @samp{<}, even through @acronym{GNU} M4 1.4.8. Starting with version
7691 1.4.9, @acronym{GNU} M4 correctly follows @acronym{POSIX} precedence
7692 rules. M4 scripts designed to be portable between releases must be
7693 aware that parentheses may be required to enforce C precedence rules.
7694 Likewise, division by zero, even in the unused branch of a
7695 short-circuiting operator, is not always well-defined in other
7698 Following are some examples where the current version of M4 follows C
7699 precedence rules, but where older versions and some other
7700 implementations of @code{m4} require explicit parentheses to get the
7706 eval(`(1 == 2) > 0')
7716 eval(`+ + - ~ ! ~ 0')
7719 @error{}m4:stdin:8: Warning: eval: invalid operator: `++0'
7722 @error{}m4:stdin:9: Warning: eval: invalid operator: `1 = 1'
7725 @error{}m4:stdin:10: Warning: eval: invalid operator: `0 |= 1'
7730 @error{}m4:stdin:12: Warning: eval: divide by zero: `0 || 1 / 0'
7735 @error{}m4:stdin:14: Warning: eval: modulo by zero: `2 && 1 % 0'
7739 @cindex @acronym{GNU} extensions
7740 As a @acronym{GNU} extension, @code{eval} supports several operators
7741 that do not appear in C@. A right-associative exponentiation operator
7742 @samp{**} computes the value of the left argument raised to the right,
7743 modulo the numeric precision width. If evaluated, the exponent must be
7744 non-negative, and at least one of the arguments must be non-zero, or a
7745 warning is issued. An unsigned shift operator @samp{>>>} allows
7746 shifting a negative number as though it were an unsigned bit pattern,
7747 which shifts in 0 bits rather than twos-complement sign-extension. A
7748 ratio operator @samp{\} behaves like normal division @samp{/} on
7749 integers, but is provided for symmetry with @code{mpeval}.
7750 Additionally, the C operators @samp{,} and @samp{?:} are supported.
7755 eval(`(2 ** 3) ** 2')
7763 @error{}m4:stdin:5: Warning: eval: divide by zero: `0 ** 0'
7765 @error{}m4:stdin:6: Warning: eval: negative exponent: `4 ** -2'
7767 eval(`2 || 4 ** -2')
7769 eval(`(-1 >> 1) == -1')
7771 eval(`(-1 >>> 1) > (1 << 30)')
Within @var{expression} (but not @var{radix} or @var{width}), numbers
7788 without a special prefix are decimal. A simple @samp{0} prefix
7789 introduces an octal number. @samp{0x} introduces a hexadecimal number.
As @acronym{GNU} extensions, @samp{0b} introduces a binary number, and
@samp{0r} introduces a number expressed in any radix between 1 and 36:
7792 the prefix should be immediately followed by the decimal expression of
7793 the radix, a colon, then the digits making the number. For radix 1,
7794 leading zeros are ignored, and all remaining digits must be @samp{1};
7795 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
7796 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
7797 to @samp{z}. Lower and upper case letters can be used interchangeably
in number prefixes and as number digits.
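
A short sketch of the prefixes just described, with values picked purely
for illustration; the last line mixes upper case prefixes with a
@samp{0r} radix specification.

@example
eval(`010')
@result{}8
eval(`0x1f')
@result{}31
eval(`0b101')
@result{}5
eval(`0r36:z')
@result{}35
eval(`0R2:1010 + 0XA')
@result{}20
@end example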
7800 Parentheses may be used to group subexpressions whenever needed. For the
7801 relational operators, a true relation returns @code{1}, and a false
relation returns @code{0}.
7804 Here are a few examples of use of @code{eval}.
7815 eval(index(`Hello world', `llo') >= 0)
7817 eval(`0r1:0111 + 0b100 + 0r3:12')
7819 define(`square', `eval(`($1) ** 2')')
7823 square(square(`5')` + 1')
7825 define(`foo', `666')
7828 @error{}m4:stdin:11: Warning: eval: bad expression: `foo / 6'
7834 As the last two lines show, @code{eval} does not handle macro
7835 names, even if they expand to a valid expression (or part of a valid
7836 expression). Therefore all macros must be expanded before they are
7837 passed to @code{eval}.
7838 @comment update this if we add support for variables.
7840 Some calculations are not portable to other implementations, since they
7841 have undefined semantics in C, but @acronym{GNU} @code{m4} has
7842 well-defined behavior on overflow. When shifting, an out-of-range shift
7843 amount is implicitly brought into the range of the precision using
7844 modulo arithmetic (for example, on 32-bit integers, this would be an
7845 implicit bit-wise and with 0x1f). This example should work whether your
7846 platform uses 32-bit integers, 64-bit integers, or even some other
7850 define(`max_int', eval(`-1 >>> 1'))
7852 define(`min_int', eval(max_int` + 1'))
7858 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
7859 @result{}overflow occurred
7860 eval(`0x80000000 % -1')
7864 eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2'))
7868 If @var{radix} is specified, it specifies the radix to be used in the
7869 expansion. The default radix is 10; this is also the case if
7870 @var{radix} is the empty string. A warning results if the radix is
7871 outside the range of 1 through 36, inclusive. The result of @code{eval}
7872 is always taken to be signed. No radix prefix is output, and for
7873 radices greater than 10, the digits are lower case (although some
7874 other implementations use upper case). The output is unquoted, and
7875 subject to further macro expansion. The @var{width}
7876 argument specifies the minimum output width, excluding any negative
7877 sign. The result is zero-padded to extend the expansion to the
7878 requested width. A warning results if the width is negative. If
7879 @var{radix} or @var{width} is out of bounds, the expansion of
7880 @code{eval} is empty.
7889 eval(`666', `6', `10')
7891 eval(`-666', `6', `10')
7892 @result{}-0000003030
7895 `0r1:'eval(`10', `1', `11')
7896 @result{}0r1:01111111111
7900 @error{}m4:stdin:9: Warning: eval: radix out of range: 37
7903 @error{}m4:stdin:10: Warning: eval: negative width: -1
7906 @error{}m4:stdin:11: Warning: eval: empty string treated as 0
7909 @error{}m4:stdin:12: Warning: eval: empty string treated as 0
7911 define(`a', `hi')eval(` 10 ', `16')
7916 @section Multiple precision arithmetic
7918 When @code{m4} is compiled with a multiple precision arithmetic library
7919 (@pxref{Experiments}), a builtin @code{mpeval} is defined.
7921 @deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @
7923 Behaves similarly to @code{eval}, except the calculations are done with
7924 infinite precision, and rational numbers are supported. Numbers may be
7927 The macro @code{mpeval} is recognized only with parameters.
7930 For the most part, using @code{mpeval} is similar to using @code{eval}:
7932 @comment options: -m mpeval
7934 $ @kbd{m4 -m mpeval}
7935 mpeval(`(1 << 70) + 2 ** 68 * 3', `16')
7936 @result{}700000000000000000
7937 `0r24:'mpeval(`0r36:zYx', `24', `5')
7941 The ratio operator, @samp{\}, is provided with the same precedence as
7942 division, and rationally divides two numbers and canonicalizes the
7943 result, whereas the division operator @samp{/} always returns the
7944 integer quotient of the division. To convert a rational value to
7945 integral, divide (@samp{/}) by 1. Some operators, such as @samp{%},
7946 @samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate
7947 only on integers and will truncate any rational remainder. The unsigned
shift operator, @samp{>>>}, behaves identically to regular right
7949 shifts, @samp{>>}, since with infinite precision, it is not possible to
7950 convert a negative number to a positive using shifts. The
7951 exponentiation operator, @samp{**}, assumes that the exponent is
7952 integral, but allows negative exponents. With the short-circuit logical
7953 operators, @samp{||} and @samp{&&}, a non-zero result preserves the
7954 value of the argument that ended evaluation, rather than collapsing to
7955 @samp{1}. The operators @samp{?:} and @samp{,} are always available,
7956 even in @acronym{POSIX} mode, since @code{mpeval} does not have to
7957 conform to the @acronym{POSIX} rules for @code{eval}.
7959 @comment options: -m mpeval
7961 $ @kbd{m4 -m mpeval}
7976 @node Shell commands
7977 @chapter Macros for running shell commands
7979 @cindex UNIX commands, running
7980 @cindex executing shell commands
7981 @cindex running shell commands
7982 @cindex shell commands, running
7983 @cindex commands, running shell
7984 There are a few builtin macros in @code{m4} that allow you to run shell
7985 commands from within @code{m4}.
7987 Note that the definition of a valid shell command is system dependent.
7988 On UNIX systems, this is the typical @command{/bin/sh}. But on other
7989 systems, such as native Windows, the shell has a different syntax of
7990 commands that it understands. Some examples in this chapter assume
7991 @command{/bin/sh}, and also demonstrate how to quit early with a known
7992 exit value if this is not the case.
7995 * Platform macros:: Determining the platform
7996 * Syscmd:: Executing simple commands
7997 * Esyscmd:: Reading the output of commands
7998 * Sysval:: Exit status
7999 * Mkstemp:: Making temporary files
8000 * Mkdtemp:: Making temporary directories
8003 @node Platform macros
8004 @section Determining the platform
8006 @cindex platform macros
8007 Sometimes it is desirable for an input file to know which platform
8008 @code{m4} is running on. @acronym{GNU} @code{m4} provides several
8009 macros that are predefined to expand to the empty string; checking for
8010 their existence will confirm platform details.
8012 @deffn {Optional builtin (gnu)} __os2__
8013 @deffnx {Optional builtin (traditional)} os2
8014 @deffnx {Optional builtin (gnu)} __unix__
8015 @deffnx {Optional builtin (traditional)} unix
8016 @deffnx {Optional builtin (gnu)} __windows__
8017 @deffnx {Optional builtin (traditional)} windows
8018 Each of these macros is conditionally defined as needed to describe the
8019 environment of @code{m4}. If defined, each macro expands to the empty
8023 On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
8024 in the @samp{gnu} module, and @code{unix} in the @samp{traditional}
8027 On native Windows systems, @acronym{GNU} @code{m4} will define
8028 @code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in
8029 the @samp{traditional} module.
8031 On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
8032 in the @samp{gnu} module, and @code{os2} in the @samp{traditional}
8035 If @acronym{GNU} M4 does not provide a platform macro for your system,
8036 please report that as a bug.
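
Because these macros expand to the empty string, they are normally
tested with @code{ifdef} rather than expanded.  The sketch below
branches on the platform; the message texts are placeholders, and which
branch is taken naturally depends on where @code{m4} is run, so no
output is shown.

@example
ifdef(`__unix__', `running on a Unix-like platform',
  `ifdef(`__windows__', `running on native Windows',
    `platform not identified')')
@end example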
8039 define(`provided', `0')
8041 ifdef(`__unix__', `define(`provided', incr(provided))')
8043 ifdef(`__windows__', `define(`provided', incr(provided))')
8045 ifdef(`__os2__', `define(`provided', incr(provided))')
8052 @section Executing simple commands
8054 Any shell command can be executed, using @code{syscmd}:
8056 @deffn {Builtin (m4)} syscmd (@var{shell-command})
8057 Executes @var{shell-command} as a shell command.
8059 The expansion of @code{syscmd} is void, @emph{not} the output from
8060 @var{shell-command}! Output or error messages from @var{shell-command}
8061 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
8064 Prior to executing the command, @code{m4} flushes its buffers.
8065 The default standard input, output and error of @var{shell-command} are
8066 the same as those of @code{m4}.
8068 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8069 m4}) is in effect, @code{syscmd} results in an error, since otherwise an
8070 input file could execute arbitrary code.
8072 The macro @code{syscmd} is recognized only with parameters.
8076 define(`foo', `FOO')
8083 Note how the output of the command keeps its trailing newline, and how
8084 the newline that appeared after the macro is output as well.
8086 The following is an example of @var{shell-command} using the same
8087 standard input as @code{m4}:
8089 @comment The testsuite does not know how to parse pipes from the
8090 @comment texinfo. Fortunately, there are other tests in the testsuite
8091 @comment that test this same feature.
8094 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
8098 It tells @code{m4} to read all of its input before executing the wrapped
8099 text, then hands a valid (albeit emptied) pipe as standard input for the
8100 @code{cat} subcommand. Therefore, you should be careful when using
8101 standard input (either by specifying no files, or by passing @samp{-} as
8102 a file name on the command line, @pxref{Command line files, , Invoking
8103 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
8104 that consume data from standard input. When standard input is a
8105 seekable file, the subprocess will pick up with the next character not
8106 yet processed by @code{m4}; when it is a pipe or other non-seekable
8107 file, there is no guarantee how much data will already be buffered by
8108 @code{m4} and thus unavailable to the child.
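
One way to keep a child process from consuming input meant for
@code{m4} is to redirect the child's standard input explicitly. The
following is a minimal sketch, assuming a POSIX-like shell and a
readable @file{/dev/null}; the choice of @code{cat} is only for
illustration:

@comment ignore
@example
syscmd(`cat </dev/null')dnl
this line is still read by m4
@result{}this line is still read by m4
@end example

Because @code{cat} reads from @file{/dev/null} rather than from the
shared standard input, it sees end of file immediately and leaves the
remaining input to @code{m4}.
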
8110 The following is an example of how potentially unsafe actions can be
suppressed:
8113 @comment options: --safer
8118 @error{}m4:stdin:1: syscmd: disabled by --safer
8123 @section Reading the output of commands
8125 @cindex @acronym{GNU} extensions
8126 If you want @code{m4} to read the output of a shell command, use
@code{esyscmd}:
8129 @deffn {Builtin (gnu)} esyscmd (@var{shell-command})
8130 Expands to the standard output of the shell command
8131 @var{shell-command}.
8133 Prior to executing the command, @code{m4} flushes its buffers.
8134 The default standard input and standard error of @var{shell-command} are
8135 the same as those of @code{m4}. The error output of @var{shell-command}
8136 is not a part of the expansion: it will appear along with the error
8137 output of @code{m4}.
8139 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8140 m4}) is in effect, @code{esyscmd} results in an error, since otherwise
8141 an input file could execute arbitrary code.
8143 The macro @code{esyscmd} is recognized only with parameters.
8147 define(`foo', `FOO')
8154 Note how the expansion of @code{esyscmd} keeps the trailing newline of
8155 the command, as well as using the newline that appeared after the macro.
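
When the trailing newline is unwanted, it can be removed before the
text is used. The following sketch relies on @code{translit} with a
single-character second argument (a literal newline) and no third
argument, which deletes every newline from the command output; it
assumes that @code{echo} is available and that the output contains no
text that would be recognized as a macro:

@example
translit(esyscmd(`echo hello'), `
')
@result{}hello
@end example

Note that this deletes all newlines, not only the last one, so it is
best suited to commands that produce a single line.
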
8157 Just as with @code{syscmd}, care must be exercised when sharing standard
8158 input between @code{m4} and the child process of @code{esyscmd}.
8159 Likewise, potentially unsafe actions can be suppressed.
8161 @comment options: --safer
8166 @error{}m4:stdin:1: esyscmd: disabled by --safer
8171 @section Exit status
8173 @cindex UNIX commands, exit status from
8174 @cindex exit status from shell commands
8175 @cindex shell commands, exit status from
8176 @cindex commands, exit status from shell
8177 @cindex status of shell commands
8178 To see whether a shell command succeeded, use @code{sysval}:
8180 @deffn {Builtin (m4)} sysval
8181 Expands to the exit status of the last shell command run with
8182 @code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
run yet.
8191 ifelse(sysval, `0', `zero', `non-zero')
8203 ifelse(sysval, `0', `zero', `non-zero')
8215 @code{sysval} results in 127 if there was a problem executing the
8216 command, for example, if the system-imposed argument length is exceeded,
8217 or if there were not enough resources to fork. It is not possible to
8218 distinguish between failed execution and successful execution that had
8219 an exit status of 127.
8221 On UNIX platforms, where it is possible to detect when command execution
8222 is terminated by a signal, rather than a normal exit, the result is the
8223 signal number shifted left by eight bits.
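
The encoding can be decoded with @code{eval}. The following sketch does
not run a command at all; instead it uses a hypothetical macro
@code{status} holding the value 2304 (signal 9 shifted left by eight
bits) to stand in for @code{sysval}:

@example
define(`status', `2304')dnl hypothetical sysval: signal 9 shifted left 8 bits
ifelse(eval(status >= 256), `1',
       `killed by signal eval(status >> 8)',
       `exited with status status')
@result{}killed by signal 9
@end example

In real input, @code{sysval} would be tested directly in place of
@code{status}, immediately after the @code{syscmd} or @code{esyscmd}
call of interest.
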
8225 @comment This test has difficulties being portable, even on platforms
8226 @comment where syscmd invokes /bin/sh. Kill is not portable with signal
8227 @comment names. According to autoconf, the only portable signal numbers
8228 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
8229 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
8230 @comment exits normally rather than letting the signal terminate it).
8231 @comment Also, TERM is flaky, as it can also kill the running m4 on
8232 @comment systems where /bin/sh does not create its own process group.
8233 @comment And PIPE is unreliable, since people tend to run with it
8234 @comment ignored, with m4 inheriting that choice. That leaves KILL as
8235 @comment the only signal we can reliably test.
8237 dnl This test assumes kill is a shell builtin, and that signals are
8240 `errprint(` skipping: syscmd does not have unix semantics
8242 syscmd(`kill -9 $$')
8250 esyscmd(`kill -9 $$')
8256 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8257 m4}) is in effect, @code{sysval} will always remain at its default value
of zero.
8260 @comment options: --safer
8267 @error{}m4:stdin:2: syscmd: disabled by --safer
8274 @section Making temporary files
8276 @cindex temporary file names
8277 @cindex files, names of temporary
8278 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8279 temporary file, for output or for some other purpose. There is a
8280 builtin macro, @code{mkstemp}, for making a temporary file:
8282 @deffn {Builtin (m4)} mkstemp (@var{template})
8283 @deffnx {Builtin (m4)} maketemp (@var{template})
8284 Expands to the quoted name of a new, empty file, made from the string
8285 @var{template}, which should end with the string @samp{XXXXXX}. The six
8286 @samp{X} characters are then replaced with random characters matching
8287 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
8288 name unique. If fewer than six @samp{X} characters are found at the end
8289 of @var{template}, the result will be longer than the template. The
8290 created file will have access permissions as if by @kbd{chmod =rw,go=},
8291 meaning that the current umask of the @code{m4} process is taken into
8292 account, and at most only the current user can read and write the file.
8294 The traditional behavior, standardized by @acronym{POSIX}, is that
8295 @code{maketemp} merely replaces the trailing @samp{X} with the process
8296 id, without creating a file or quoting the expansion, and without
8297 ensuring that the resulting
8298 string is a unique file name. In part, this means that using the same
8299 @var{template} twice in the same input file will result in the same
8300 expansion. This behavior is a security hole, as it is very easy for
8301 another process to guess the name that will be generated, and thus
8302 interfere with a subsequent use of @code{syscmd} trying to manipulate
8303 that file name. Hence, @acronym{POSIX} has recommended that all new
8304 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
8305 and that users of @code{m4} check for its existence.
8307 The expansion is void and an error is issued if a temporary file could
not be created.
8310 When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
8311 is in effect, @code{mkstemp} and @acronym{GNU}-mode @code{maketemp}
8312 result in an error, since otherwise an input file could perform a mild
8313 denial-of-service attack by filling up a disk with multiple empty files.
8315 The macros @code{mkstemp} and @code{maketemp} are recognized only with
parameters.
8319 If you try this next example, you will most likely get different output
8320 for the two file names, since the replacement characters are randomly
chosen:
8326 define(`tmp', `oops')
8328 maketemp(`/tmp/fooXXXXXX')
8329 @error{}m4:stdin:1: Warning: maketemp: recommend using mkstemp instead
8330 @result{}/tmp/fooa07346
8331 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
8332 `define(`mkstemp', defn(`maketemp'))dnl
8333 errprint(`warning: potentially insecure maketemp implementation
8340 @comment options: --safer
8344 maketemp(`/tmp/fooXXXXXX')
8345 @error{}m4:stdin:1: Warning: maketemp: recommend using mkstemp instead
8346 @error{}m4:stdin:1: maketemp: disabled by --safer
8348 mkstemp(`/tmp/fooXXXXXX')
8349 @error{}m4:stdin:2: mkstemp: disabled by --safer
8353 @cindex @acronym{GNU} extensions
8354 Unless you use the @option{--traditional} command line option (or
8355 @option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
8356 version of @code{maketemp} is secure. This means that passing the same
8357 template to multiple calls will generate multiple files. However, we
8358 recommend that you use the new @code{mkstemp} macro, introduced in
8359 @acronym{GNU} M4 1.4.8, which is secure even in traditional mode. Also,
8360 as of M4 1.4.11, the secure implementation quotes the resulting file
8361 name, so that you are guaranteed to know what file was created even if
8362 the random file name happens to match an existing macro. Notice that
8363 this example is careful to use @code{defn} to avoid unintended expansion
8368 define(`foo', `errprint(`oops')')
8370 syscmd(`rm -f foo-??????')sysval
8372 define(`file1', maketemp(`foo-XXXXXX'))dnl
8373 @error{}m4:stdin:3: Warning: maketemp: recommend using mkstemp instead
8374 ifelse(esyscmd(`echo \` foo-?????? \''), `foo-??????',
8375 `no file', `created')
8377 define(`file2', maketemp(`foo-XX'))dnl
8378 @error{}m4:stdin:6: Warning: maketemp: recommend using mkstemp instead
8379 define(`file3', mkstemp(`foo-XXXXXX'))dnl
8380 ifelse(len(defn(`file1')), len(defn(`file2')),
8381 `same length', `different')
8382 @result{}same length
8383 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
8384 @result{}different file
8385 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
8386 @result{}different file
8387 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
8388 @result{}different file
8389 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
8395 @comment options: -G
8398 syscmd(`rm -f foo-*')sysval
8400 define(`file1', maketemp(`foo-XXXXXX'))dnl
8401 @error{}m4:stdin:2: Warning: maketemp: recommend using mkstemp instead
8402 define(`file2', maketemp(`foo-XXXXXX'))dnl
8403 @error{}m4:stdin:3: Warning: maketemp: recommend using mkstemp instead
8404 ifelse(file1, file2, `same', `different file')
8406 len(maketemp(`foo-XXXXX'))
8407 @error{}m4:stdin:5: Warning: maketemp: recommend using mkstemp instead
8409 define(`abc', `def')
8413 @error{}m4:stdin:7: Warning: maketemp: recommend using mkstemp instead
8414 syscmd(`test -f foo-*')sysval
8419 @section Making temporary directories
8421 @cindex temporary directory
8422 @cindex directories, temporary
8423 @cindex @acronym{GNU} extensions
8424 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8425 temporary directory, for holding multiple temporary files; such a
8426 directory can be created with @code{mkdtemp}:
8428 @deffn {Builtin (gnu)} mkdtemp (@var{template})
8429 Expands to the quoted name of a new, empty directory, made from the string
8430 @var{template}, which should end with the string @samp{XXXXXX}. The six
8431 @samp{X} characters are then replaced with random characters matching
8432 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name
8433 unique. If fewer than six @samp{X} characters are found at the end of
8434 @var{template}, the result will be longer than the template. The
8435 created directory will have access permissions as if by @kbd{chmod
8436 =rwx,go=}, meaning that the current umask of the @code{m4} process is
8437 taken into account, and at most only the current user can read, write,
8438 and search the directory.
8440 The expansion is void and an error is issued if a temporary directory could
not be created.
8443 When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
8444 is in effect, @code{mkdtemp} results in an error, since otherwise an
8445 input file could perform a mild denial-of-service attack by filling up a
8446 disk with multiple directories.
8448 The macro @code{mkdtemp} is recognized only with parameters.
8449 This macro was added in M4 2.0.
8452 If you try this next example, you will most likely get different output
8453 for the directory names, since the replacement characters are randomly
chosen:
8459 define(`tmp', `oops')
8461 mkdtemp(`/tmp/fooXXXXXX')
8462 @result{}/tmp/foo2h89Vo
8467 @comment options: --safer
8471 mkdtemp(`/tmp/fooXXXXXX')
8472 @error{}m4:stdin:1: mkdtemp: disabled by --safer
8476 Multiple calls with the same template will generate multiple
directories.
8481 syscmd(`echo foo??????')dnl
8483 define(`dir1', mkdtemp(`fooXXXXXX'))dnl
8484 ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created')
8486 define(`dir2', mkdtemp(`fooXXXXXX'))dnl
8487 ifelse(dir1, dir2, `same', `different directories')
8488 @result{}different directories
8489 syscmd(`rmdir 'dir1 dir2)
8496 @chapter Miscellaneous builtin macros
8498 This chapter describes various builtins that do not really belong in
8499 any of the previous chapters.
8502 * Errprint:: Printing error messages
8503 * Location:: Printing current location
8504 * M4exit:: Exiting from @code{m4}
8505 * Syncoutput:: Turning on and off sync lines
8509 @section Printing error messages
8511 @cindex printing error messages
8512 @cindex error messages, printing
8513 @cindex messages, printing error
8514 @cindex standard error, output to
8515 You can print error messages using @code{errprint}:
8517 @deffn {Builtin (m4)} errprint (@var{message}, @dots{})
8518 Prints @var{message} and the rest of the arguments to standard error,
8519 separated by spaces. Standard error is used, regardless of the
8520 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
8522 The expansion of @code{errprint} is void.
8523 The macro @code{errprint} is recognized only with parameters.
8527 errprint(`Invalid arguments to forloop
8529 @error{}Invalid arguments to forloop
8531 errprint(`1')errprint(`2',`3
8537 A trailing newline is @emph{not} printed automatically, so it should be
8538 supplied as part of the argument, as in the example. Unfortunately, the
8539 exact output of @code{errprint} is not very portable to other @code{m4}
8540 implementations: @acronym{POSIX} requires that all arguments be printed,
8541 but some implementations of @code{m4} only print the first.
8542 Furthermore, some @acronym{BSD} implementations always append a newline
8543 for each @code{errprint} call, regardless of whether the last argument
8544 already had one, and @acronym{POSIX} is silent on whether this is
8548 @section Printing current location
8550 @cindex location, input
8551 @cindex input location
8552 To make it possible to specify the location of an error, three
8553 utility builtins exist:
8555 @deffn {Builtin (gnu)} __file__
8556 @deffnx {Builtin (gnu)} __line__
8557 @deffnx {Builtin (gnu)} __program__
8558 Expand to the quoted name of the current input file, the
8559 current input line number in that file, and the quoted name of the
8560 current invocation of @code{m4}.
8564 errprint(__program__:__file__:__line__: `input error
8566 @error{}m4:stdin:1: input error
8570 Line numbers start at 1 for each file. If the file was found due to the
8571 @option{-I} option or @env{M4PATH} environment variable, that is
8572 reflected in the file name. Synclines, via @code{syncoutput}
8573 (@pxref{Syncoutput}) or the command line option @option{--synclines}
8574 (or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the
8575 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}),
8576 also use this notion of current file and line. Redefining the three
8577 location macros has no effect on syncline, debug, warning, or error
message output.
8580 This example reuses the file @file{incl.m4} mentioned earlier
(@pxref{Include}):
8585 $ @kbd{m4 -I examples}
8586 define(`foo', ``$0' called at __file__:__line__')
8589 @result{}foo called at stdin:2
8591 @result{}Include file start
8592 @result{}foo called at examples/incl.m4:2
8593 @result{}Include file end
8597 The location of macros invoked during the rescanning of macro expansion
8598 text corresponds to the location in the file where the expansion was
8599 triggered, regardless of how many newline characters the expansion text
8600 contains. As of @acronym{GNU} M4 1.4.8, the location of text wrapped
8601 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
8602 @code{m4wrap} was invoked. Previous versions, however, behaved as
8603 though wrapped text came from line 0 of the file ``''.
8606 define(`echo', `$@@')
8608 define(`foo', `echo(__line__
8618 foo(errprint(__line__
8636 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
8637 terminology. If you invoke @code{m4} through an absolute path or a link
8638 with a different spelling, rather than by relying on a @env{PATH} search
8639 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
8640 The intent is that you can use it to produce error messages with the
8641 same formatting that @code{m4} produces internally. It can also be used
8642 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
8643 @code{m4} that is currently running, rather than whatever version of
8644 @code{m4} happens to be first in @env{PATH}. It was first introduced in
8645 @acronym{GNU} M4 1.4.6.
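
As a sketch of the @code{syscmd} use just described, the following
asks the running @code{m4} to report its own version. It assumes that
the name in @code{@w{__program__}} contains no characters that need
shell quoting, and that any relative path is still valid from the
current directory; the output depends on the installed version, so none
is shown:

@comment ignore
@example
syscmd(__program__` --version')
@end example

The quoted text after @code{@w{__program__}} is concatenated with the
program name to form the shell command.
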
8648 @section Exiting from @code{m4}
8650 @cindex exiting from @code{m4}
8651 @cindex status, setting @code{m4} exit
8652 If you need to exit from @code{m4} before the entire input has been
8653 read, you can use @code{m4exit}:
8655 @deffn {Builtin (m4)} m4exit (@ovar{code})
8656 Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
8657 left out, the exit status is zero. If @var{code} cannot be parsed, or
8658 is outside the range of 0 to 255, the exit status is one. No further
8659 input is read, and all wrapped and diverted text is discarded.
8663 m4wrap(`This text is lost due to `m4exit'.')
8665 divert(`1') So is this.
8668 m4exit And this is never read.
8671 A common use of this is to abort processing:
8673 @deffn Composite fatal_error (@var{message})
8674 Abort processing with an error message and non-zero status. Prefix
8675 @var{message} with details about where the error occurred, and print the
8676 resulting string to standard error.
8681 define(`fatal_error',
8682 `errprint(__program__:__file__:__line__`: fatal error: $*
8685 fatal_error(`this is a BAD one, buster')
8686 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
8689 After this macro call, @code{m4} will exit with exit status 1. This macro
8690 is only intended for error exits, since the normal exit procedures are
8691 not followed, i.e., diverted text is not undiverted, and saved text
8692 (@pxref{M4wrap}) is not reread. (This macro could be made more robust
8693 to earlier versions of @code{m4}. You should try to see if you can find
8694 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
8696 Note that it is still possible for the exit status to differ from
8697 what was requested by @code{m4exit}. If @code{m4} detects some other
8698 error, such as a write error on standard output, the exit status will be
8699 non-zero even if @code{m4exit} requested zero.
8701 If standard input is seekable, then the file will be positioned at the
8702 next unread character. If it is a pipe or other non-seekable file,
8703 then there are no guarantees how much data @code{m4} might have read
8704 into buffers, and thus discarded.
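
An explicit @var{code} lets a calling script tell different failures
apart. The following sketch exits with status 2 when the
@code{@w{__unix__}} platform macro is not defined; the status value and
the wording of the message are arbitrary choices made for illustration:

@comment ignore
@example
ifdef(`__unix__', `',
  `errprint(__program__`: this input requires a UNIX system
')m4exit(`2')')dnl
@end example

On a system where @code{@w{__unix__}} is defined, the @code{ifdef}
expands to nothing and processing continues with the next line.
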
8707 @section Turning on and off sync lines
8709 @cindex toggling synchronization lines
8710 @cindex synchronization lines
8711 @cindex location, input
8712 @cindex input location
8713 It is possible to adjust whether synclines are printed to output:
8715 @deffn {Builtin (gnu)} syncoutput (@var{truth})
8716 If @var{truth} matches the extended regular expression
8717 @samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the
8718 form: @samp{#line <number> ["<file>"]}.
8720 If @var{truth} is empty, or matches the extended regular expression
8721 @samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off.
8723 Any other argument is ignored and triggers a warning.
8725 The macro @code{syncoutput} is recognized only with parameters.
8726 This macro was added in M4 2.0.
8730 define(`twoline', `1
8733 changecom(`/*', `*/')
8735 define(`comment', `/*1
8743 @result{}#line 8 "stdin"
8771 @error{}m4:stdin:18: Warning: syncoutput: unknown directive `blah'
8775 Notice that a syncline is output any time a single source line expands
8776 to multiple output lines, or any time multiple source lines expand to a
8777 single output line. When there is a one-for-one correspondence, no
8778 additional synclines are needed.
8780 Synchronization lines can be used to track where input comes from; an
8781 optional file designation is printed when the syncline algorithm
8782 detects that consecutive output lines come from different files. You
8783 can also use the @option{--synclines} command-line option (or
8784 @option{-s}, @pxref{Preprocessor features, , Invoking m4}) to start
8785 with synchronization on. This example reuses the file @file{incl.m4}
8786 mentioned earlier (@pxref{Include}):
8789 @comment options: -s
8791 $ @kbd{m4 --synclines -I examples}
8793 @result{}#line 1 "examples/incl.m4"
8794 @result{}Include file start
8796 @result{}Include file end
8797 @result{}#line 1 "stdin"
8802 @chapter Fast loading of frozen state
8804 Some bigger @code{m4} applications may be built over a common base
8805 containing hundreds of definitions and other costly initializations.
8806 Usually, the common base is kept in one or more declarative files,
8807 which are listed on each @code{m4} invocation prior to the
8808 user's input file, or else each input file uses @code{include}.
8810 Reading the common base of a big application, over and over again, may
8811 be time consuming. @acronym{GNU} @code{m4} offers some machinery to
8812 speed up the start of an application using lengthy common bases.
8815 * Using frozen files:: Using frozen files
8816 * Frozen file format 1:: Frozen file format 1
8817 * Frozen file format 2:: Frozen file format 2
8820 @node Using frozen files
8821 @section Using frozen files
8823 @cindex fast loading of frozen files
8824 @cindex frozen files for fast loading
8825 @cindex initialization, frozen state
8826 @cindex dumping into frozen file
8827 @cindex reloading a frozen file
8828 @cindex @acronym{GNU} extensions
8829 Suppose a user has a library of @code{m4} initializations in
8830 @file{base.m4}, which is then used with multiple input files:
8834 $ @kbd{m4 base.m4 input1.m4}
8835 $ @kbd{m4 base.m4 input2.m4}
8836 $ @kbd{m4 base.m4 input3.m4}
8839 Rather than spending time parsing the fixed contents of @file{base.m4}
8840 every time, the user might instead execute:
8844 $ @kbd{m4 -F base.m4f base.m4}
8848 once, and further execute, as often as needed:
8852 $ @kbd{m4 -R base.m4f input1.m4}
8853 $ @kbd{m4 -R base.m4f input2.m4}
8854 $ @kbd{m4 -R base.m4f input3.m4}
8858 with the varying input. The first call, containing the @option{-F}
8859 option, only reads and executes file @file{base.m4}, defining
8860 various application macros and computing other initializations.
8861 Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
8862 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
8863 file which contains a kind of snapshot of the @code{m4} internal state.
8865 Later calls, containing the @option{-R} option, are able to reload
8866 the internal state of @code{m4}, from @file{base.m4f},
8867 @emph{prior} to reading any other input files. This means that,
8868 instead of starting with a fresh copy of @code{m4}, input will be
8869 read after having effectively recovered the effect of a prior run.
8870 In our example, the effect is the same as if file @file{base.m4} had
8871 been read anew. However, this effect is achieved a lot faster.
8873 Only one frozen file may be created or read in any one @code{m4}
8874 invocation. It is not possible to recover two frozen files at once.
8875 However, frozen files may be updated incrementally, through using
8876 @option{-R} and @option{-F} options simultaneously. For example, if
8877 some care is taken, the command:
8881 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
8885 could be broken down in the following sequence, accumulating the same
output:
8890 $ @kbd{m4 -F file1.m4f file1.m4}
8891 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
8892 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
8893 $ @kbd{m4 -R file3.m4f file4.m4}
8896 Some care is necessary because the frozen file does not save all state
8897 information. Stacks of macro definitions via @code{pushdef} are
8898 accurately stored, along with all renamed or undefined builtins, as are
8899 the current syntax rules such as from @code{changequote}. However, the
8900 value of @code{sysval} and text saved in @code{m4wrap} are not currently
8901 preserved. Also, changing command line options between runs may cause
8902 unexpected behavior. A future release of @acronym{GNU} M4 may improve
8903 on the quality of frozen files.
8905 When an @code{m4} run is to be frozen, the automatic undiversion
8906 which takes place at end of execution is inhibited. Instead, all
8907 positively numbered diversions are saved into the frozen file.
8908 The active diversion number is also transmitted.
8910 A frozen file to be reloaded need not reside in the current directory.
8911 It is looked up the same way as an @code{include} file (@pxref{Search
Path}).
8914 If the frozen file was generated with a newer version of @code{m4}, and
8915 contains directives that an older @code{m4} cannot parse, attempting to
8916 load the frozen file with option @option{-R} will cause @code{m4} to
8917 exit with status 63 to indicate version mismatch.
8919 @node Frozen file format 1
8920 @section Frozen file format 1
8922 @cindex frozen file format 1
8923 @cindex file format, frozen file version 1
8924 Frozen files are sharable across architectures. It is safe to write
8925 a frozen file on one machine and read it on another, given that the
8926 second machine uses the same or newer version of @acronym{GNU} @code{m4}.
8927 It is conventional, but not required, to give a frozen file the suffix
@samp{.m4f}.
8930 Older versions of @acronym{GNU} @code{m4} create frozen files with
8931 syntax version 1. These files can be read by the current version, but
8932 are no longer produced. Version 1 files are mostly text files, although
8933 any macros or diversions that contained nonprintable characters or long
8934 lines cause the resulting frozen file to do likewise, since there are no
8935 escape sequences. The file can be edited to change the state that
8936 @code{m4} will start with. It is composed of several directives, each
8937 starting with a single letter and ending with a newline (@key{NL}).
8938 Wherever a directive is expected, the character @samp{#} can be used
8939 instead to introduce a comment line; empty lines are also ignored if
8940 they are not part of an embedded string.
8942 In the following descriptions, each @var{len} refers to the length of a
8943 corresponding subsequent @var{str}. Numbers are always expressed in
8944 decimal, and an omitted number defaults to 0. The valid directives in
version 1 are:
8948 @item V @var{number} @key{NL}
8949 Confirms the format of the file. Version 1 is recognized when
8950 @var{number} is 1. This directive must be the first non-comment in the
8951 file, and may not appear more than once.
8953 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8954 Uses @var{str1} and @var{str2} as the begin-comment and
8955 end-comment strings. If omitted, then @samp{#} and @key{NL} are the
comment delimiters.
8958 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
8959 Selects diversion @var{number}, making it current, then copies @var{str}
8960 into the current diversion. @var{number} may be a negative number for a
8961 diversion that discards text. To merely specify an active selection,
8962 use this command with an empty @var{str}. With 0 as the diversion
8963 @var{number}, @var{str} will be issued on standard output at reload
8964 time. @acronym{GNU} @code{m4} will not produce the @samp{D} directive
8965 with non-zero length for diversion 0, but this can be done with manual
8966 edits. This directive may appear more than once for the same diversion,
8967 in which case the diversion is the concatenation of the various uses.
8968 If omitted, then diversion 0 is current.
8970 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8971 Defines, through @code{pushdef}, a definition for @var{str1} expanding
8972 to the function whose builtin name is @var{str2}. If the builtin does
8973 not exist (for example, if the frozen file was produced by a copy of
8974 @code{m4} compiled with the now-abandoned @code{changeword} support),
8975 the reload is silent, but any subsequent use of the definition of
8976 @var{str1} will result in a warning. This directive may appear more
8977 than once for the same name, and its order, along with @samp{T}, is
8978 important. If omitted, you will have no access to any builtins.
8980 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8981 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
8982 strings. If omitted, then @samp{`} and @samp{'} are the quote
delimiters.
8985 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8986 Defines, through @code{pushdef}, a definition for @var{str1}
8987 expanding to the text given by @var{str2}. This directive may appear
8988 more than once for the same name, and its order, along with @samp{F}, is
important.
8992 When loading format 1, the syntax categories @samp{@{} and @samp{@}} are
8993 disabled (reverting braces to be treated like plain characters). This
8994 is because frozen files created with M4 1.4.x did not understand
8995 @samp{$@{@dots{}@}} extended argument notation, and a frozen macro that
8996 contained this character sequence should not behave differently just
8997 because a newer version of M4 reloaded the file.
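
To make the grammar above concrete, here is a hand-written sketch of a
small format 1 file. It is not the output of any particular @code{m4}
release, and the macro @code{foo} is a hypothetical name. It sets the
default quote and comment delimiters, restores the @code{define}
builtin, defines @code{foo} as the text @samp{FOO}, and selects
diversion 0 with no pending text:

@comment ignore
@example
# Hypothetical hand-written frozen file, format 1
V1
Q1,1
`'
C1,1
#

F6,6
definedefine
T3,3
fooFOO
D0,0

@end example

Note how the strings of each directive are run together and can only be
separated by using the preceding lengths. The empty line after
@samp{#} is the newline serving as the end-comment string, and the
empty line after @samp{D0,0} is the zero-length @var{str}.
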
8999 @node Frozen file format 2
9000 @section Frozen file format 2
9002 @cindex frozen file format 2
9003 @cindex file format, frozen file version 2
9004 The syntax of version 1 has some drawbacks; if any macro or diversion
9005 contained non-printable characters or long lines, the resulting frozen
9006 file would not qualify as a text file, making it harder to edit with
9007 some vendor tools. The concatenation of multiple strings on a single
9008 line, such as for the @samp{T} directive, makes distinguishing the two
9009 strings a bit more difficult. Finally, the format lacks support for
9010 several items of @code{m4} state, such that a reloaded file did not
9011 always behave the same as the original file.
9013 These shortcomings have been addressed in version 2 of the frozen file
9014 syntax. New directives have been added, and existing directives have
9015 additional, and sometimes optional, parameters. All @var{str} instances
9016 in the grammar are now followed by @key{NL}, which makes the split
9017 between consecutive strings easier to recognize. Strings may now
9018 contain escape sequences modeled after C, such as @samp{\n} for newline
9019 or @samp{\0} for @sc{nul}, so that the frozen file can be pure
9020 @sc{ascii} (although when hand-editing a frozen file, it is still
9021 acceptable to use the original byte rather than an escape sequence for
9022 all bytes except @samp{\}). Also in the context of a @var{str}, the
9023 escape sequence @samp{\@key{NL}} is discarded, allowing a user to split
9024 lines that are too long for some platform tools.
9027 @item V @var{number} @key{NL}
9028 Confirms the format of the file. @code{m4} @value{VERSION} only creates
9029 frozen files where @var{number} is 2. This directive must be the first
9030 non-comment in the file, and may not appear more than once.
9032 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9033 Uses @var{str1} and @var{str2} as the begin-comment and
9034 end-comment strings. If omitted, then @samp{#} and @key{NL} are the comment delimiters.
9037 @item d @var{len} @key{NL} @var{str} @key{NL}
9038 Sets the debug flags, using @var{str} as the argument to
9039 @code{debugmode}. If omitted, then the debug flags start in their
9040 default disabled state.
9042 @item D @var{number} , @var{len} @key{NL} @var{str} @key{NL}
9043 Selects diversion @var{number}, making it current, then copies @var{str}
9044 into the current diversion. @var{number} may be a negative number for a
9045 diversion that discards text. To merely specify an active selection,
9046 use this command with an empty @var{str}. With 0 as the diversion
9047 @var{number}, @var{str} will be issued on standard output at reload
9048 time. @acronym{GNU} @code{m4} will not produce the @samp{D} directive
9049 with non-zero length for diversion 0, but this can be done with manual
9050 edits. This directive may appear more than once for the same diversion,
9051 in which case the diversion is the concatenation of the various uses.
9052 If omitted, then diversion 0 is current.
9054 @comment FIXME - the first usage, with only one string, is not supported
9055 @comment in the current code
9056 @c @item F @var{len1} @key{NL} @var{str1} @key{NL}
9057 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9058 @itemx F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9059 Defines, through @code{pushdef}, a definition for @var{str1} expanding
9060 to the function whose builtin name is given by @var{str2} (defaulting to
9061 @var{str1} if not present). With two arguments, the builtin name is
9062 searched for among the intrinsic builtin functions only; with three
9063 arguments, the builtin name is searched for amongst the builtin
9064 functions defined by the module named by @var{str3}.
9066 @item M @var{len} @key{NL} @var{str} @key{NL}
9067 Names a module which will be searched for according to the module search
9068 path and loaded. Modules loaded from a frozen file don't add their
9069 builtin entries to the symbol table. Modules must be loaded prior to
9070 specifying module-specific builtins via the three-argument @code{F} or @code{T}.
9073 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9074 Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings.
9075 If omitted, then @samp{`} and @samp{'} are the quote delimiters.
9077 @item R @var{len} @key{NL} @var{str} @key{NL}
9078 Sets the default regexp syntax, where @var{str} encodes one of the
9079 regular expression syntaxes supported by @acronym{GNU} M4.
9080 @xref{Changeresyntax}, for more details.
9082 @item S @var{syntax-code} @var{len} @key{NL} @var{str} @key{NL}
9083 Defines, through @code{changesyntax}, a syntax category for each of the
9084 characters in @var{str}. The @var{syntax-code} must be one of the
9085 characters described in @ref{Changesyntax}.
9087 @item t @var{len} @key{NL} @var{str} @key{NL}
9088 Enables tracing for any macro named @var{str}, similar to using the
9089 @code{traceon} builtin. This directive may occur more than once for
9090 multiple macros; if omitted, no macro starts out as traced.
9092 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9093 @itemx T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9094 Defines, through @code{pushdef}, a definition for @var{str1} expanding to
9095 the text given by @var{str2}. This directive may appear more than once
9096 for the same name, and its order, along with @samp{F}, is important. If
9097 present, the optional third argument associates the macro with a module
9098 named by @var{str3}.
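
As an illustration of the grammar above, here is a hand-written sketch of
a small version 2 frozen file; it is not the output of any particular
@code{m4} release. It assumes that directive lines keep the compact
spelling of version 1 files, with no blanks around the numbers, and the
macro name @samp{foo} and its text are made up for the example.

@example
# Hand-written sketch of a version 2 frozen file.
V2
Q1,1
`
'
T3,5
foo
hello
@end example

Because each @var{str} now sits on its own line, the name @samp{foo}
(length 3) and the expansion @samp{hello} (length 5) are easy to tell
apart.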
9102 @chapter Compatibility with other versions of @code{m4}
9104 @cindex compatibility
9105 This chapter describes many of the differences between this
9106 implementation of @code{m4} and other implementations found under
9107 UNIX, such as System V Release 3, Solaris, and @acronym{BSD} flavors.
9108 In particular, it lists the known differences and extensions to
9109 @acronym{POSIX}. However, the list is not necessarily comprehensive.
9111 At the time of this writing, @acronym{POSIX} 2001 (also known as IEEE
9112 Std 1003.1-2001) is the latest standard, although a new version of
9113 @acronym{POSIX} is under development and includes several proposals for
9114 modifying what @code{m4} is required to do. The requirements for
9115 @code{m4} are shared between @acronym{SUSv3} and @acronym{POSIX}, and can be viewed at
9117 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
9120 * Extensions:: Extensions in @acronym{GNU} M4
9121 * Incompatibilities:: Other incompatibilities
9122 * Experiments:: Experimental features in @acronym{GNU} M4
9126 @section Extensions in @acronym{GNU} M4
9128 @cindex @acronym{GNU} extensions
9129 @cindex @acronym{POSIX}
9130 @cindex @env{POSIXLY_CORRECT}
9131 This version of @code{m4} contains a few facilities that do not exist
9132 in System V @code{m4}. These extra facilities are all suppressed by
9133 using the @option{-G} command line option, unless overridden by other
9134 command line options.
9135 Most of these extensions are compatible with
9136 @uref{http://www.unix.org/single_unix_specification/,
9137 @acronym{POSIX}}; the few exceptions are suppressed if the
9138 @env{POSIXLY_CORRECT} environment variable is set.
9142 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
9143 several digits, while the System V @code{m4} only accepts one digit.
9144 This allows macros in @acronym{GNU} @code{m4} to take any number of
9145 arguments, and not only nine (@pxref{Arguments}).
9146 @acronym{POSIX} does not allow this extension, so it is disabled if
9147 @env{POSIXLY_CORRECT} is set.
9148 @c FIXME - update this bullet when ${11} is implemented.
9151 The @code{divert} (@pxref{Divert}) macro can manage more than 9
9152 diversions. @acronym{GNU} @code{m4} treats all positive numbers as valid
9153 diversions, rather than discarding diversions greater than 9.
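
A short sketch of this extension, assuming an interactive session with
default settings: text sent to diversion 10 is kept rather than
discarded, and an @code{undivert} without arguments brings the
diversions back in numerical order.

@example
divert(`10')ten
divert(`2')two
divert`'undivert
@result{}two
@result{}ten
@result{}
@end example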
9156 Files included with @code{include} and @code{sinclude} are sought in a
9157 user-specified search path, if they are not found in the working
9158 directory. The search path is specified by the @option{-I} option and the
9159 @env{M4PATH} environment variable (@pxref{Search Path}).
9162 Arguments to @code{undivert} can be non-numeric, in which case the named
9163 file will be included uninterpreted in the output (@pxref{Undivert}).
9166 Formatted output is supported through the @code{format} builtin, which
9167 is modeled after the C library function @code{printf} (@pxref{Format}).
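
For instance, a few conversions that behave as they would with
@code{printf} (a minimal sketch; the values are arbitrary):

@example
format(`%05d', `42')
@result{}00042
format(`%-5s|', `ab')
@result{}ab   |
format(`%x', `255')
@result{}ff
@end example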
9170 Searches and text substitution through regular expressions are supported
9171 by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
9172 (@pxref{Patsubst}) builtins.
9174 The syntax of regular expressions in M4 has never been clearly
9175 formalized. While Open@acronym{BSD} M4 uses extended regular
9176 expressions for @code{regexp} and @code{patsubst}, @acronym{GNU} M4
9177 defaults to basic regular expressions, but provides
9178 @code{changeresyntax} (@pxref{Changeresyntax}) to change the flavor of
9179 regular expression syntax in use.
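
The following sketch assumes the default regular expression syntax, in
which grouping is spelled @samp{\(} and @samp{\)}; the sample strings are
arbitrary.

@example
regexp(`GNU m4 manual', `m[0-9]')
@result{}4
patsubst(`hello world', `\(o\)', `<\1>')
@result{}hell<o> w<o>rld
@end example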
9182 The output of shell commands can be read into @code{m4} with
9183 @code{esyscmd} (@pxref{Esyscmd}).
9186 There is indirect access to any builtin macro with @code{builtin} (@pxref{Builtin}).
9190 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
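
A small sketch of the difference between the two: once a user
definition shadows @code{len}, @code{indir} still reaches whatever the
name currently means, while @code{builtin} reaches the original
function. The replacement text @samp{shadowed} is made up for the
example.

@example
define(`len', `shadowed')
@result{}
len(`abc')
@result{}shadowed
indir(`len', `abc')
@result{}shadowed
builtin(`len', `abc')
@result{}3
@end example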
9193 The name of the program, the current input file, and the current input
9194 line number are accessible through the builtins @code{@w{__program__}},
9195 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
9198 The generation of sync lines can be controlled through @code{syncoutput}
9199 (@pxref{Syncoutput}).
9202 The format of the output from @code{dumpdef} and macro tracing can be
9203 controlled with @code{debugmode} (@pxref{Debugmode}).
9206 The destination of trace and debug output can be controlled with
9207 @code{debugfile} (@pxref{Debugfile}).
9210 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
9211 creating a new file with a unique name on every invocation, rather than
9212 following the insecure behavior of replacing the trailing @samp{X}
9213 characters with the @code{m4} process id. @acronym{POSIX} does not
9214 allow this extension, so @code{maketemp} is insecure if
9215 @env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in the first place.
9219 @acronym{POSIX} only requires support for the command line options
9220 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
9221 by @acronym{GNU} M4 are extensions. @xref{Invoking m4}, for a
9222 description of these options.
9225 The debugging and tracing facilities in @acronym{GNU} @code{m4} are much
9226 more extensive than in most other versions of @code{m4}.
9229 Some traditional implementations only allow reading standard input
9230 once, but @acronym{GNU} @code{m4} correctly handles multiple instances
9231 of @samp{-} on the command line.
9234 @acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
9235 (first-in, first-out) order, and most other implementations obey this.
9236 However, versions of @acronym{GNU} @code{m4} earlier than 1.6 used
9237 LIFO order. Furthermore, @acronym{POSIX} states that only the first
9238 argument to @code{m4wrap} is saved for later evaluation, but
9239 @acronym{GNU} @code{m4} saves and processes all arguments, with output
9240 separated by spaces.
9243 @acronym{POSIX} states that builtins that require arguments, but are
9244 called without arguments, have undefined behavior. Traditional
9245 implementations simply behave as though empty strings had been passed.
9246 For example, @code{a`'define`'b} would expand to @code{ab}. But
9247 @acronym{GNU} @code{m4} ignores certain builtins if they have missing
9248 arguments, giving @code{adefineb} for the above example.
9251 @node Incompatibilities
9252 @section Other incompatibilities
9254 There are a few other incompatibilities between this implementation of
9255 @code{m4}, and what @acronym{POSIX} requires, or what the System V
9256 version implemented.
9260 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
9261 by undefining the entire stack of previous definitions, as if doing
9262 @code{undefine(`f')} first. @acronym{GNU} @code{m4} replaces just the top
9263 definition on the stack, as if doing @code{popdef(`f')} followed by
9264 @code{pushdef(`f',`1')}. @acronym{POSIX} allows either behavior.
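
The difference can be observed with a sketch like the following, which
shows the @acronym{GNU} @code{m4} result; an implementation that
discards the whole stack would leave @samp{f} undefined after the
@code{popdef}, and so print @samp{3 f} instead.

@example
define(`f', `1')pushdef(`f', `2')define(`f', `3')f popdef(`f')f
@result{}3 1
@end example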
9267 At one point, @acronym{POSIX} required @code{changequote(@var{arg})}
9268 (@pxref{Changequote}) to use newline as the close quote, but this was a
9269 bug, and the next version of @acronym{POSIX} is anticipated to state
9270 that using empty strings or just one argument is unspecified.
9271 Meanwhile, the @acronym{GNU} @code{m4} behavior of treating an empty
9272 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
9273 repeating the start-quote delimiter, and BSD treats it as leaving the
9274 previous end-quote delimiter unchanged. For predictable results, never
9275 call @code{changequote} with just one argument, or with empty strings for either argument.
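
A portable pattern, sketched here with arbitrary bracket delimiters and
a made-up macro @samp{greet}, is to always pass two non-empty
arguments, both when changing the delimiters and when restoring them:

@example
changequote(`[', `]')
@result{}
define([greet], [hello])
@result{}
greet
@result{}hello
changequote([`], ['])
@result{}
@end example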
9279 At one point, @acronym{POSIX} required @code{changecom(@var{arg},)}
9280 (@pxref{Changecom}) to make it impossible to end a comment, but this is
9281 a bug, and the next version of @acronym{POSIX} is anticipated to state
9282 that using empty strings is unspecified. Meanwhile, the @acronym{GNU}
9283 @code{m4} behavior of treating an empty end-comment delimiter as newline
9284 is not portable, as BSD treats it as leaving the previous end-comment
9285 delimiter unchanged. It is also impossible in BSD implementations to
9286 disable comments, even though that is required by @acronym{POSIX}. For
9287 predictable results, never call @code{changecom} with empty strings for either argument.
9291 Traditional implementations allow argument collection, but not string
9292 and comment processing, to span file boundaries. Thus, if @file{a.m4}
9293 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
9294 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
9295 gives an error message that the end of file was encountered inside a
9296 macro with @acronym{GNU} @code{m4}. On the other hand, traditional
9297 implementations do end of file processing for files included with
9298 @code{include} or @code{sinclude} (@pxref{Include}), while @acronym{GNU}
9299 @code{m4} seamlessly integrates the content of those files. Thus
9300 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of giving an error.
9304 @acronym{POSIX} requires @code{eval} (@pxref{Eval}) to treat all
9305 operators with the same precedence as C@. However, earlier versions of
9306 @acronym{GNU} @code{m4} followed the traditional behavior of other
9307 @code{m4} implementations, where bitwise and logical negation (@samp{~}
9308 and @samp{!}) have lower precedence than equality operators; and where
9309 equality operators (@samp{==} and @samp{!=}) had the same precedence as
9310 relational operators (such as @samp{<}). Use explicit parentheses to
9311 ensure proper precedence. As extensions to @acronym{POSIX},
9312 @acronym{GNU} @code{m4} gives well-defined semantics to operations that
9313 C leaves undefined, such as when overflow occurs, when shifting negative
9314 numbers, or when performing division by zero. @acronym{POSIX} also
9315 requires @samp{=} to cause an error, but many traditional
9316 implementations allowed it as an alias for @samp{==}.
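
For example, the following two spellings give the same answer under
either precedence scheme, because the parentheses leave nothing to
interpretation (a minimal sketch; the operands are arbitrary):

@example
eval(`(~0) == 0')
@result{}0
eval(`~(0 == 0)')
@result{}-2
@end example

Without the parentheses, @samp{~0 == 0} would yield @samp{0} with C
precedence but @samp{-2} with the traditional precedence.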
9319 @acronym{POSIX} 2001 requires @code{translit} (@pxref{Translit}) to
9320 treat each character of the second and third arguments literally.
9321 However, it is anticipated that the next version of @acronym{POSIX} will
9322 allow the @acronym{GNU} @code{m4} behavior of treating @samp{-} as a range operator.
9326 @acronym{POSIX} requires @code{m4} to honor the locale environment
9327 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
9328 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
9329 implemented in @acronym{GNU} @code{m4}.
9332 @acronym{GNU} @code{m4} implements sync lines differently from System V
9333 @code{m4}, when text is being diverted. @acronym{GNU} @code{m4} outputs
9334 the sync lines when the text is being diverted, and System V @code{m4}
9335 outputs them when the diverted text is being brought back.
9337 The problem is which lines and file names should be attached to text
9338 that is being, or has been, diverted. System V @code{m4} regards all
9339 the diverted text as being generated by the source line containing the
9340 @code{undivert} call, whereas @acronym{GNU} @code{m4} regards the
9341 diverted text as being generated at the time it is diverted.
9343 The sync line option is used mostly when using @code{m4} as
9344 a front end to a compiler. If a diverted line causes a compiler error,
9345 the error messages should most probably refer to the place where the
9346 diversion was made, and not where it was inserted again.
9348 @comment options: -s
9353 @result{}#line 3 "stdin"
9356 @result{}#line 2 "stdin"
9358 @result{}#line 1 "stdin"
9362 @comment FIXME - this needs to be fixed before 2.0.
9363 The current @code{m4} implementation has a limitation that the syncline
9364 output at the start of each diversion occurs no matter what, even if the
9365 previous diversion did not end with a newline. This goes contrary to
9366 the claim that synclines appear on a line by themselves, so this
9367 limitation may be corrected in a future version of @code{m4}. In the
9368 meantime, when using @option{-s}, it is wisest to make sure all
9369 diversions end with newline.
9372 @acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential definitions such as @code{define(`x', `x')}.
9384 There is nothing inherently wrong with defining @samp{x} to
9385 return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
9386 because that would cause an infinite rescan loop.
9387 In @code{m4}, one might use macros to hold strings, as we do for
9388 variables in other programming languages, further checking them with:
9392 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
9396 In cases like this one, an interdiction for a macro to hold its own name
9397 would be a useless limitation. Of course, this leaves more rope for the
9398 @acronym{GNU} @code{m4} user to hang himself! Rescanning hangs may be
9399 avoided through careful programming, a little like for endless loops in
9400 traditional programming languages.
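
As a concrete sketch of the safe usage, the macro below holds its own
name, and is only ever inspected through @code{defn}, so it is never
rescanned; the result strings are made up for the example.

@example
define(`x', `x')
@result{}
ifelse(defn(`x'), `x', `holds its own name', `something else')
@result{}holds its own name
@end example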
9403 @acronym{POSIX} states that only unquoted leading newlines and blanks
9404 (that is, space and tab) are ignored when collecting macro arguments.
9405 However, this appears to be a bug in @acronym{POSIX}, since most
9406 traditional implementations also ignore all whitespace (formfeed,
9407 carriage return, and vertical tab). @acronym{GNU} @code{m4} follows
9408 tradition and ignores all leading unquoted whitespace.
9412 @section Experimental features in @acronym{GNU} M4
9414 Certain features of GNU @code{m4} are experimental.
9416 Some are only available if activated by an option given to
9417 @file{m4-@value{VERSION}/@/configure} at GNU @code{m4} installation
9418 time. The functionality
9419 might change or even go away in the future. @emph{Do not rely on it}.
9420 Please direct your comments about it the same way you would do for bugs.
9422 @section Changesyntax
9424 An experimental feature, which improves the flexibility of @code{m4},
9425 allows for changing the way the input is parsed (@pxref{Changesyntax}).
9426 No compile time option is needed for @code{changesyntax}. The
9427 implementation is careful to not slow down @code{m4} parsing, unlike the
9428 withdrawn experiment of @code{changeword} that appeared in earlier M4 releases.
9431 @section Multiple precision arithmetic
9433 Another experimental feature, which would improve @code{m4} usefulness,
9434 allows for multiple precision rational arithmetic similar to
9435 @code{eval}. You must have the @acronym{GNU} multi-precision (gmp)
9436 library installed, and should use @kbd{./configure --with-gmp} if you
9437 want this feature compiled in. The current implementation is unproven
9438 and might go away. Do not count on it yet.
9441 @chapter Correct version of some examples
9443 Some of the examples in this manual are buggy or not very robust, for
9444 demonstration purposes. Improved versions of these composite macros are presented here.
9448 * Improved exch:: Solution for @code{exch}
9449 * Improved forloop:: Solution for @code{forloop}
9450 * Improved foreach:: Solution for @code{foreach}
9451 * Improved copy:: Solution for @code{copy}
9452 * Improved m4wrap:: Solution for @code{m4wrap}
9453 * Improved cleardivert:: Solution for @code{cleardivert}
9454 * Improved capitalize:: Solution for @code{capitalize}
9455 * Improved fatal_error:: Solution for @code{fatal_error}
9459 @section Solution for @code{exch}
9461 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
9462 to double quote their arguments. A nicer definition, which lets
9463 clients follow the rule of thumb of one level of quoting per level of
9464 parentheses, involves adding quotes in the definition of @code{exch}, as follows:
9468 define(`exch', ``$2', `$1'')
9470 define(exch(`expansion text', `macro'))
9473 @result{}expansion text
9476 @node Improved forloop
9477 @section Solution for @code{forloop}
9479 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
9480 into an infinite loop if given an iterator that is not parsed as a macro
9481 name. It does not do any sanity checking on its numeric bounds, and
9482 only permits decimal numbers for bounds. Here is an improved version,
9483 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
9484 version also optimizes overhead by calling four macros instead of six
9485 per iteration (excluding those in @var{text}), by not dereferencing the
9486 @var{iterator} in the helper @code{@w{_forloop}}.
9490 $ @kbd{m4 -I examples}
9491 undivert(`forloop2.m4')dnl
9492 @result{}divert(`-1')
9493 @result{}# forloop(var, from, to, stmt) - improved version:
9494 @result{}# works even if VAR is not a strict macro name
9495 @result{}# performs sanity check that FROM is larger than TO
9496 @result{}# allows complex numerical expressions in TO and FROM
9497 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9498 @result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
9499 @result{} eval(`$3'), `$4')popdef(`$1')')')
9500 @result{}define(`_forloop',
9501 @result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
9502 @result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
9503 @result{}divert`'dnl
9504 include(`forloop2.m4')
9506 forloop(`i', `2', `1', `no iteration occurs')
9508 forloop(`', `1', `2', ` odd iterator name')
9509 @result{} odd iterator name odd iterator name
9510 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
9511 @result{} 0xa 0xb 0xc
9512 forloop(`i', `a', `b', `non-numeric bounds')
9513 @error{}m4:stdin:6: Warning: eval: bad input: `(a) <= (b)'
9517 One other change to notice is that the improved version used @samp{_$0}
9518 rather than @samp{_forloop} to invoke the helper routine. In general,
9519 this is a good practice to follow, because then the set of macros can be
9520 uniformly transformed. The following example shows a transformation
9521 that doubles the current quoting and appends a suffix @samp{2} to each
9522 transformed macro. If @code{forloop} refers to the literal
9523 @samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
9524 the intended @code{_forloop2}, and the mixing of quoting paradigms leads
9525 to an infinite recursion loop in this example.
9527 @comment options: -L9
9531 $ @kbd{m4 -d -L 9 -I examples}
9532 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
9534 define(`double', `define(`$1'`2',
9535 arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
9537 double(`forloop')double(`_forloop')defn(`forloop2')
9538 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
9539 @result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
9540 @result{} eval(``$3''), ``$4'')popdef(``$1'')'')
9541 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9543 changequote(`[', `]')changequote([``], [''])
9545 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9547 changequote`'include(`forloop.m4')
9549 double(`forloop')double(`_forloop')defn(`forloop2')
9550 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
9551 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9553 changequote(`[', `]')changequote([``], [''])
9555 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9556 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
9559 One more optimization is still possible. Instead of repeatedly
9560 assigning a variable then invoking or dereferencing it, it is possible
9561 to pass the current iterator value as a single argument. Coupled with
9562 @code{curry} if other arguments are needed (@pxref{Composition}), or
9563 with helper macros if the argument is needed in more than one place in
9564 the expansion, the output can be generated with three, rather than four,
9565 macros of overhead per iteration. Notice how the file
9566 @file{m4-@value{VERSION}/@/examples/@/forloop3.m4} rearranges the
9567 arguments of the helper @code{_forloop} to take two arguments that are
9568 placed around the current value. By splitting a balanced set of
9569 parentheses across multiple arguments, the helper macro can now be
9570 shared by @code{forloop} and the new @code{forloop_arg}.
9574 $ @kbd{m4 -I examples}
9575 include(`forloop3.m4')
9577 undivert(`forloop3.m4')dnl
9578 @result{}divert(`-1')
9579 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
9580 @result{}# each value between FROM and TO, without define overhead
9581 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
9582 @result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
9583 @result{}# forloop(var, from, to, stmt) - refactored to share code
9584 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9585 @result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
9586 @result{} `define(`$1',', `)$4')popdef(`$1')')')
9587 @result{}define(`_forloop',
9588 @result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
9589 @result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
9590 @result{}divert`'dnl
9591 forloop(`i', `1', `3', ` i')
9593 define(`echo', `$@@')
9595 forloop_arg(`1', `3', ` echo')
9599 forloop_arg(`1', `3', `curry(`pushdef', `a')')
9611 Of course, it is possible to make even more improvements, such as
9612 adding an optional step argument, or allowing iteration through
9613 descending sequences. @acronym{GNU} Autoconf provides some of these
9614 additional bells and whistles in its @code{m4_for} macro.
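
As a rough sketch of the first idea, the following variant adds a
positive step argument to the @file{forloop2.m4} approach. The names
@code{forstep} and @code{_forstep} are made up for this illustration
and are not shipped with @code{m4}; descending sequences are not
handled.

@example
define(`forstep', `ifelse(eval(`($2) <= ($3) && ($4) > 0'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forstep',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forstep(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forstep(`i', `0', `-1', `1', `never expanded')
@result{}
@end example

The helper recurses only while the next value would still be within the
upper bound, so the loop ends cleanly even when the step does not land
exactly on it.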
9616 @node Improved foreach
9617 @section Solution for @code{foreach}
9619 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
9620 presented earlier each have flaws. First, we will examine and fix the
9621 quadratic behavior of @code{foreachq}:
9625 $ @kbd{m4 -I examples}
9626 include(`foreachq.m4')
9628 traceon(`shift')debugmode(`aq')
9630 foreachq(`x', ``1', `2', `3', `4'', `x
9633 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9634 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9636 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9637 @error{}m4trace: -3- shift(`2', `3', `4')
9638 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9639 @error{}m4trace: -2- shift(`2', `3', `4')
9641 @error{}m4trace: -5- shift(`1', `2', `3', `4')
9642 @error{}m4trace: -4- shift(`2', `3', `4')
9643 @error{}m4trace: -3- shift(`3', `4')
9644 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9645 @error{}m4trace: -3- shift(`2', `3', `4')
9646 @error{}m4trace: -2- shift(`3', `4')
9648 @error{}m4trace: -6- shift(`1', `2', `3', `4')
9649 @error{}m4trace: -5- shift(`2', `3', `4')
9650 @error{}m4trace: -4- shift(`3', `4')
9651 @error{}m4trace: -3- shift(`4')
9654 @cindex quadratic behavior, avoiding
9655 @cindex avoiding quadratic behavior
9656 Each successive iteration was adding more quoted @code{shift}
9657 invocations, and the entire list contents were passing through every
9658 iteration. In general, when recursing, it is a good idea to make the
9659 recursion use fewer arguments, rather than adding additional quoted
9660 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
9661 fewer macros, is less likely to run into machine limits, and most
9662 importantly, performs faster. The fixed version of @code{foreachq} can
9663 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
9667 $ @kbd{m4 -I examples}
9668 include(`foreachq2.m4')
9670 undivert(`foreachq2.m4')dnl
9671 @result{}include(`quote.m4')dnl
9672 @result{}divert(`-1')
9673 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9674 @result{}# quoted list, improved version
9675 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
9676 @result{}define(`_arg1q', ``$1'')
9677 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
9678 @result{}define(`_foreachq', `ifelse(`$2', `', `',
9679 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
9680 @result{}divert`'dnl
9681 traceon(`shift')debugmode(`aq')
9683 foreachq(`x', ``1', `2', `3', `4'', `x
9686 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9688 @error{}m4trace: -3- shift(`2', `3', `4')
9690 @error{}m4trace: -3- shift(`3', `4')
9694 Note that the fixed version calls unquoted helper macros in
9695 @code{@w{_foreachq}} to trim elements immediately; those helper macros
9696 in turn must re-supply the layer of quotes lost in the macro invocation.
9697 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
9698 element, with @code{@w{_arg1}} of the earlier implementation that
9699 returned the first list element directly. Additionally, by calling the
9700 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
9701 contains unexpanded macros.
9703 The astute m4 programmer might notice that the solution above still uses
9704 more macro invocations than strictly necessary. Note that @samp{$2},
9705 which contains an arbitrarily long quoted list, is expanded and
9706 rescanned three times per iteration of @code{_foreachq}. Furthermore,
9707 every iteration of the algorithm effectively unboxes then reboxes the
9708 list, which costs a couple of macro invocations. It is possible to
9709 rewrite the algorithm by swapping the order of the arguments to
9710 @code{_foreachq} in order to operate on an unboxed list in the first
9711 place, and by using the fixed-length @samp{$#} instead of an arbitrary
9712 length list as the key to end recursion. The result is an overhead of
9713 six macro invocations per loop (excluding any macros in @var{text}),
9714 instead of eight. This alternative approach is available as
9715 @file{m4-@value{VERSION}/@/examples/@/foreachq3.m4}:
9719 $ @kbd{m4 -I examples}
9720 include(`foreachq3.m4')
9722 undivert(`foreachq3.m4')dnl
9723 @result{}divert(`-1')
9724 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9725 @result{}# quoted list, alternate improved version
9726 @result{}define(`foreachq', `ifelse(`$2', `', `',
9727 @result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
9728 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
9729 @result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
9730 @result{} shift(shift(shift($@@))))')')
9731 @result{}divert`'dnl
9732 traceon(`shift')debugmode(`aq')
9734 foreachq(`x', ``1', `2', `3', `4'', `x
9737 @error{}m4trace: -4- shift(`x', `x
9738 @error{}', `', `1', `2', `3', `4')
9739 @error{}m4trace: -3- shift(`x
9740 @error{}', `', `1', `2', `3', `4')
9741 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
9743 @error{}m4trace: -4- shift(`x', `x
9744 @error{}', `1', `2', `3', `4')
9745 @error{}m4trace: -3- shift(`x
9746 @error{}', `1', `2', `3', `4')
9747 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9749 @error{}m4trace: -4- shift(`x', `x
9750 @error{}', `2', `3', `4')
9751 @error{}m4trace: -3- shift(`x
9752 @error{}', `2', `3', `4')
9753 @error{}m4trace: -2- shift(`2', `3', `4')
9755 @error{}m4trace: -4- shift(`x', `x
9756 @error{}', `3', `4')
9757 @error{}m4trace: -3- shift(`x
9758 @error{}', `3', `4')
9759 @error{}m4trace: -2- shift(`3', `4')
9762 Prior to M4 1.6, every instance of @samp{$@@} was rescanned as it was
9763 encountered. Thus, the @file{foreachq3.m4} alternative used much less
9764 memory than @file{foreachq2.m4}, and executed as much as 10% faster,
9765 since each iteration encountered fewer @samp{$@@}. However, the
9766 implementation of rescanning every byte in @samp{$@@} was quadratic in
9767 the number of bytes scanned (for example, making the broken version in
9768 @file{foreachq.m4} cubic, rather than quadratic, in behavior). Once the
9769 underlying M4 implementation was improved in 1.6 to reuse results of
9770 previous scans, both styles of @code{foreachq} became linear in the
9771 number of bytes scanned, but the @file{foreachq3.m4} version remains
9772 noticeably faster because of fewer macro invocations. Notice how the
9773 implementation injects an empty argument prior to expanding @samp{$2}
9774 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
9775 the third argument altogether, and ends recursion when there are three
9776 arguments left because there was nothing left to pass through
9777 @code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
9778 than the two conditionals used in the version from @file{foreachq2.m4}.
9780 @cindex nine arguments, more than
9781 @cindex more than nine arguments
9782 @cindex arguments, more than nine
9783 So far, all of the implementations of @code{foreachq} presented have
9784 been quadratic with M4 1.4.x. But @code{forloop} is linear, because
9785 each iteration parses a constant number of arguments. So, it is
9786 possible to design a variant that uses @code{forloop} to do the
9787 iteration, then uses @samp{$@@} only once at the end, giving a linear
9788 result even with older M4 implementations. This implementation relies
9789 on the @acronym{GNU} extension that @samp{$10} expands to the tenth
9790 argument rather than the first argument concatenated with @samp{0}. The
9791 trick is to define an intermediate macro that repeats the text
9792 @code{m4_define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
9793 integers corresponding to each argument. The helper macro
9794 @code{_foreachq_} is needed in order to generate the literal sequences
9795 such as @samp{$1} into the intermediate macro, rather than expanding
9796 them as the arguments of @code{_foreachq}. With this approach, no
9797 @code{shift} calls are even needed! However, when linear recursion is
9798 available in new enough M4, the time and memory costs of using
9799 @code{forloop} to build an intermediate macro outweigh the costs of any
9800 of the previous implementations (there are seven macros of overhead per
9801 iteration instead of six in @file{foreachq3.m4}, and the entire
9802 intermediate macro must be built in memory before any iteration is
9803 expanded). Additionally, this approach will need adjustment when a
9804 future version of M4 follows @acronym{POSIX} by no longer treating
9805 @samp{$10} as the tenth argument; the anticipation is that
9806 @samp{$@{10@}} can be used instead, although that alternative syntax is
9811 $ @kbd{m4 -I examples}
9812 include(`foreachq4.m4')
9814 undivert(`foreachq4.m4')dnl
9815 @result{}include(`forloop2.m4')dnl
9816 @result{}divert(`-1')
9817 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9818 @result{}# quoted list, version based on forloop
9819 @result{}define(`foreachq',
9820 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
9821 @result{}define(`_foreachq',
9822 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
9823 @result{} `$0_(`1', `2', indir(`$1'))')`popdef(
9824 @result{} `$1')')indir(`$1', $@@)')
9825 @result{}define(`_foreachq_',
9826 @result{}``define(`$$1', `$$3')$$2`''')
9827 @result{}divert`'dnl
9828 traceon(`shift')debugmode(`aq')
9830 foreachq(`x', ``1', `2', `3', `4'', `x
9838 For yet another approach, the improved version of @code{foreach},
9839 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
9840 overquotes the arguments to @code{@w{_foreach}} to begin with, using
9841 @code{dquote_elt}. Then @code{@w{_foreach}} can just use
9842 @code{@w{_arg1}} to remove the extra layer of quoting that was added up front.
9847 $ @kbd{m4 -I examples}
9848 include(`foreach2.m4')
9850 undivert(`foreach2.m4')dnl
9851 @result{}include(`quote.m4')dnl
9852 @result{}divert(`-1')
9853 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
9854 @result{}# parenthesized list, improved version
9855 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
9856 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
9857 @result{}define(`_arg1', `$1')
9858 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
9859 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
9860 @result{}divert`'dnl
9861 traceon(`shift')debugmode(`aq')
9863 foreach(`x', `(`1', `2', `3', `4')', `x
9865 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9866 @error{}m4trace: -4- shift(`2', `3', `4')
9867 @error{}m4trace: -4- shift(`3', `4')
9869 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
9871 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
9873 @error{}m4trace: -3- shift(``3'', ``4'')
9875 @error{}m4trace: -3- shift(``4'')
9878 It is likewise possible to write a variant of @code{foreach} that
9879 performs in linear time on M4 1.4.x; the easiest method is probably
9880 writing a version of @code{foreach} that unboxes its list, then invokes
9881 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
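
One possible shape for such a wrapper is sketched below. The helper
name @code{_unbox} is invented for this illustration and is not part of
the shipped examples; the sketch assumes the @code{_foreachq} definition
from @file{foreachq4.m4} shown earlier, and treats an empty second
argument as an empty list.

@example
$ @kbd{m4 -I examples}
include(`foreachq4.m4')dnl
define(`_unbox', `$@@')dnl
define(`foreach', `ifelse(`$2', `', `',
  `_foreachq(`$1', `$3', _unbox$2)')')dnl
foreach(`x', `(`1', `2', `3')', `<x>')
@result{}<1><2><3>
@end example

Because @code{_unbox} is expanded while the arguments of
@code{_foreachq} are being collected, the parenthesized list is turned
into separate quoted arguments before the @code{forloop}-based helper
ever sees it.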
9883 @cindex filtering defined symbols
9884 @cindex subset of defined symbols
9885 @cindex defined symbols, filtering
9886 With a robust @code{foreachq} implementation, it is possible to create a
9887 filter on a list of defined symbols. This next example will find all
9888 symbols that contain @samp{if} or @samp{def}, via two different
9889 approaches. In the first approach, @code{dquote_elt} is used to
9890 overquote each list element, then @code{dquote} forms the list; that
9891 way, the iterator @code{macro} can be expanded in place because its
9892 contents are already quoted. This approach also uses a self-modifying
9893 macro @code{sep} to provide the correct number of commas. In the second
9894 approach, the iterator @code{macro} contains live text, so it must be
9895 used with @code{defn} to avoid unintentional expansion. The correct
9896 number of commas is achieved by using @code{shift} to ignore the first
9897 one, although a leading space still remains.
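
The self-modifying separator can be seen in isolation in the following
sketch, which is independent of @code{foreachq}: the first expansion of
@code{sep} produces no output and merely redefines @code{sep}, so only
later expansions yield a comma.

@example
define(`sep', `define(`sep', `, ')')dnl
sep`a'sep`b'sep`c'
@result{}a, b, c
@end example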
9901 $ @kbd{m4 -I examples}
9902 include(`quote.m4')include(`foreachq2.m4')
9904 pushdef(`sep', `define(`sep', ``, '')')
9906 foreachq(`macro', dquote(dquote_elt(m4symbols)),
9907 `regexp(macro, `.*if.*', `sep`\&'')')
9908 @result{}ifdef, ifelse, shift
9911 shift(foreachq(`macro', dquote(m4symbols),
9912 `regexp(defn(`macro'), `def', `,` ''dquote(defn(`macro')))'))
9913 @result{} define, defn, dumpdef, ifdef, popdef, pushdef, undefine
9916 In summary, recursion over list elements is trickier than it appeared at
9917 first glance, but provides a powerful idiom within @code{m4} processing.
9918 As a final demonstration, both list styles are now able to handle
9919 several scenarios that would wreak havoc on one or both of the original
9920 implementations. This points out one other difference between the
9921 list styles. @code{foreach} evaluates unquoted list elements only once,
9922 in preparation for calling @code{@w{_foreach}}, and similarly for
9923 @code{foreachq} as provided by @file{foreachq3.m4} or
9924 @file{foreachq4.m4}. But
9925 @code{foreachq}, as provided by @file{foreachq2.m4},
9926 evaluates unquoted list elements twice while visiting the first list
9927 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
9928 deciding which list style to use, one must take into account whether
9929 repeating the side effects of unquoted list elements will have any
9930 detrimental effects.
9934 $ @kbd{m4 -d -I examples}
9935 include(`foreach2.m4')
9937 include(`foreachq2.m4')
9940 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
9942 dnl 1-element list of empty element
9943 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
9945 dnl 2-element list of empty elements
9946 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
9947 @result{}<><> / <><>
9948 dnl 1-element list of a comma
9949 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
9951 dnl 2-element list of unbalanced parentheses
9952 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
9953 @result{}<(><)> / <(><)>
9954 define(`ab', `oops')dnl using defn(`iterator')
9955 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
9956 foreachq(`x', ``a', `b'', `defn(`x')')
9958 define(`active', `ACT, IVE')
9962 dnl list of unquoted macros; expansion occurs before recursion
9963 foreach(`x', `(active, active)', `<x>
9965 @error{}m4trace: -4- active -> `ACT, IVE'
9966 @error{}m4trace: -4- active -> `ACT, IVE'
9971 foreachq(`x', `active, active', `<x>
9973 @error{}m4trace: -3- active -> `ACT, IVE'
9974 @error{}m4trace: -3- active -> `ACT, IVE'
9976 @error{}m4trace: -3- active -> `ACT, IVE'
9977 @error{}m4trace: -3- active -> `ACT, IVE'
9981 dnl list of quoted macros; expansion occurs during recursion
9982 foreach(`x', `(`active', `active')', `<x>
9984 @error{}m4trace: -1- active -> `ACT, IVE'
9986 @error{}m4trace: -1- active -> `ACT, IVE'
9988 foreachq(`x', ``active', `active'', `<x>
9990 @error{}m4trace: -1- active -> `ACT, IVE'
9992 @error{}m4trace: -1- active -> `ACT, IVE'
9994 dnl list of double-quoted macro names; no expansion
9995 foreach(`x', `(``active'', ``active'')', `<x>
9999 foreachq(`x', ```active'', ``active''', `<x>
10005 @node Improved copy
10006 @section Solution for @code{copy}
10008 The macro @code{copy} presented above works with M4 1.6 and newer, but
10009 is unable to handle builtin tokens with M4 1.4.x, because it tries to
10010 pass the builtin token through the macro @code{curry}, where it is
10011 silently flattened to an empty string (@pxref{Composition}). Rather
10012 than using the problematic @code{curry} to work around the limitation
10013 that @code{stack_foreach} expects to invoke a macro that takes exactly
10014 one argument, we can write a new macro that lets us form the exact
10015 two-argument @code{pushdef} call sequence needed, so that we are no
10016 longer passing a builtin token through a text macro.
10018 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @var{sep})
10020 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
10021 @var{post}, @var{sep})
10022 For each of the @code{pushdef} definitions associated with @var{macro},
10023 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
10024 Additionally, expand @var{sep} between definitions.
10025 @code{stack_foreach_sep} visits the oldest definition first, while
10026 @code{stack_foreach_sep_lifo} visits the current definition first. The
10027 expansion may dereference @var{macro}, but should not modify it. There
10028 are a few special macros, such as @code{defn}, which cannot be used as
10029 the @var{macro} parameter.
10032 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
10033 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
10034 `)')}. By supplying explicit parentheses, split among the @var{pre} and
10035 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
10036 construct macro calls with more than one argument, without passing
10037 builtin tokens through a macro call. It is likewise possible to
10038 directly reference the stack definitions without a macro call, by
10039 leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
10040 @code{copy} on builtin tokens, it also executes with fewer macro invocations.
10043 The new macro also adds a separator that is only output after the first
10044 iteration of the helper @code{_stack_reverse_sep}, implemented by
10045 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
10046 argument in subsequent iterations. Note that the empty string that
10047 separates @var{sep} from @var{pre} is provided as part of the fourth
10048 argument when originally calling @code{_stack_reverse_sep}, and not by
10049 writing @code{$4`'$3} as the third argument in the recursive call; while
10050 the other approach would give the same output, it does so at the expense
10051 of increasing the argument size on each iteration of
10052 @code{_stack_reverse_sep}, which results in quadratic instead of linear
10053 execution time. The improved stack walking macros are available in
10054 @file{m4-@value{VERSION}/@/examples/@/stack_sep.m4}:
10058 $ @kbd{m4 -I examples}
10059 include(`stack_sep.m4')
10061 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
10063 `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
10064 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
10074 pushdef(`c', `1')pushdef(`c', `2')
10076 stack_foreach_sep_lifo(`c', `', `', `, ')
10078 undivert(`stack_sep.m4')dnl
10079 @result{}divert(`-1')
10080 @result{}# stack_foreach_sep(macro, pre, post, sep)
10081 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
10082 @result{}# from the definition stack of MACRO, starting with the oldest, and
10083 @result{}# separated by SEP between definitions.
10084 @result{}define(`stack_foreach_sep',
10085 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
10086 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
10087 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
10088 @result{}# Like stack_foreach_sep, but starting with the newest definition.
10089 @result{}define(`stack_foreach_sep_lifo',
10090 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
10091 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
10092 @result{}define(`_stack_reverse_sep',
10093 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
10094 @result{} `$1', `$2', `$4$3')')')
10095 @result{}divert`'dnl
10098 @node Improved m4wrap
10099 @section Solution for @code{m4wrap}
10101 The replacement @code{m4wrap} versions presented above, designed to
10102 guarantee FIFO or LIFO order regardless of the underlying M4
10103 implementation, share a bug when dealing with wrapped text that looks
10104 like parameter expansion. Note how the invocation of
10105 @code{m4wrap@var{n}} interprets these parameters, while using the
10106 builtin preserves them for their intended use.
10110 $ @kbd{m4 -I examples}
10111 include(`wraplifo.m4')
10113 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10116 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
10120 @result{}m4wrap0:---0-
10121 @result{}bar:-a-a,b-2-
10124 Additionally, the computation of @code{_m4wrap_level} and creation of
10125 multiple @code{m4wrap@var{n}} placeholders in the original examples is
10126 more expensive in time and memory than strictly necessary. Notice how
10127 the improved version grabs the wrapped text via @code{defn} to avoid
10128 parameter expansion, then undefines @code{_m4wrap_text}, before
10129 stripping a level of quotes with @code{_arg1} to expand the text. That
10130 way, each level of wrapping reuses the single placeholder, which starts
10131 each nesting level in an undefined state.
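
The difference between expanding a macro and fetching its text with
@code{defn} can be sketched as follows; the macro name @code{saved} is
purely illustrative. Expanding the macro substitutes its parameters,
while @code{defn} returns the quoted definition untouched.

@example
define(`saved', `-$1-$#-')dnl
saved
@result{}--0-
defn(`saved')
@result{}-$1-$#-
@end example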
10133 Finally, it is worth emulating the @acronym{GNU} M4 extension of saving
10134 all arguments to @code{m4wrap}, separated by a space, rather than saving
10135 just the first argument. This is done with the @code{joinall} macro
10136 documented previously (@pxref{Shift}). The improved LIFO example is
10137 shipped as @file{m4-@value{VERSION}/@/examples/@/wraplifo2.m4}, and can
10138 easily be converted to a FIFO solution by swapping the adjacent
10139 invocations of @code{joinall} and @code{defn}.
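
As a small illustration of the joining step on its own, assuming the
@code{joinall} definition shipped in @file{join.m4} and described
earlier (@pxref{Shift}), the separator is placed between every pair of
arguments:

@example
$ @kbd{m4 -I examples}
include(`join.m4')dnl
joinall(` ', `one', `two')
@result{}one two
@end example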
10143 $ @kbd{m4 -I examples}
10144 include(`wraplifo2.m4')
10146 undivert(`wraplifo2.m4')dnl
10147 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
10148 @result{}include(`join.m4')dnl
10149 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
10150 @result{}define(`_arg1', `$1')dnl
10151 @result{}define(`m4wrap',
10152 @result{}`ifdef(`_$0_text',
10153 @result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
10154 @result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
10155 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
10156 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10160 m4wrap(`nested', `', `$@@
10165 @result{}foo:-a-a,b-2-
10166 @result{}nested $@@
10169 @node Improved cleardivert
10170 @section Solution for @code{cleardivert}
10172 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
10173 called without arguments to clear all pending diversions. That is
10174 because using @code{undivert} with an empty string for an argument is
10175 different from using it with no arguments at all. Compare the earlier definition
10176 with one that takes the number of arguments into account:
10179 define(`cleardivert',
10180 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
10190 define(`cleardivert',
10191 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
10192 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
10203 @node Improved capitalize
10204 @section Solution for @code{capitalize}
10206 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
10207 not allow clients to follow the quoting rule of thumb. Consider the
10208 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
10209 difference between calling @code{capitalize} with the expansion of a
10210 macro, expanding the result of a case change, and changing the case of a
10211 double-quoted string:
10215 $ @kbd{m4 -I examples}
10216 include(`capitalize.m4')dnl
10217 define(`active', `act1, ive')dnl
10218 define(`Active', `Act2, Ive')dnl
10219 define(`ACTIVE', `ACT3, IVE')dnl
10230 downcase(``ACTIVE'')
10234 capitalize(`active')
10236 capitalize(``active'')
10237 @result{}_capitalize(`active')
10238 define(`A', `OOPS')
10242 capitalize(`active')
10246 First, when @code{capitalize} was called with more than one argument, it
10247 threw away the later arguments, whereas @code{upcase} and
10248 @code{downcase} used @samp{$*} to collect them all. The fix is simple:
10249 use @samp{$*} consistently.
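
The effect can be sketched with two throwaway macros, whose names
@code{first_up} and @code{all_up} are invented here and differ only in
whether they reference @samp{$1} or @samp{$*}:

@example
define(`first_up', `translit(`$1', `a-z', `A-Z')')dnl
define(`all_up', `translit(`$*', `a-z', `A-Z')')dnl
first_up(`one', `two')
@result{}ONE
all_up(`one', `two')
@result{}ONE,TWO
@end example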
10251 Next, with single-quoting, @code{capitalize} outputs a single character,
10252 a set of quotes, then the rest of the characters, making it impossible
10253 to invoke @code{Active} after the fact, and allowing the alternate macro
10254 @code{A} to interfere. Here, the solution is to use additional quoting
10255 in the helper macros, then pass the final over-quoted output string
10256 through @code{_arg1} to remove the extra quoting and finally invoke the
10257 concatenated portions as a single string.
10259 Finally, when passed a double-quoted string, the nested macro
10260 @code{_capitalize} is never invoked because it ended up nested inside
10261 quotes. This one is the toughest to fix. In short, we have no idea how
10262 many levels of quotes are in effect on the substring being altered by
10263 @code{patsubst}. If the replacement string cannot be expressed entirely
10264 in terms of literal text and backslash substitutions, then we need a
10265 mechanism to guarantee that the helper macros are invoked outside of
10266 quotes. In other words, this sounds like a job for @code{changequote}
10267 (@pxref{Changequote}). By changing the active quoting characters, we
10268 can guarantee that replacement text injected by @code{patsubst} always
10269 occurs in the middle of a string that has exactly one level of
10270 over-quoting using alternate quotes; so the replacement text closes the
10271 quoted string, invokes the helper macros, then reopens the quoted
10272 string. In turn, that means the replacement text has unbalanced quotes,
10273 necessitating another round of @code{changequote}.
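
The underlying mechanism can be sketched in a few lines, unrelated to
@code{capitalize} itself; the macro @code{foo} is only a placeholder.
While the alternate quotes are active, the usual quote characters are
ordinary text, so they pass through a rescan untouched, and they regain
their meaning once the default quotes are restored.

@example
changequote(`<<[', `]>>')dnl
define(<<[foo]>>, <<[bar `baz']>>)dnl
foo
@result{}bar `baz'
changequote(<<[`]>>, <<[']>>)dnl
foo
@result{}bar baz
@end example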
10275 In the fixed version below (also shipped as
10276 @file{m4-@value{VERSION}/@/examples/@/capitalize2.m4}), @code{capitalize}
10277 uses the alternate quotes of @samp{<<[} and @samp{]>>} (the longer
10278 strings are chosen so as to be less likely to appear in the text being
10279 converted). The helpers @code{_to_alt} and @code{_from_alt} merely
10280 reduce the number of characters required to perform a
10281 @code{changequote}, since the definition changes twice. The outermost
10282 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
10283 with alternate quoting; the innermost pair is used so that the third
10284 argument to @code{patsubst} can contain an unbalanced
10285 @samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
10286 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
10287 they contain nested quotes but are invoked with the alternate quoting
10292 $ @kbd{m4 -I examples}
10293 include(`capitalize2.m4')dnl
10294 define(`active', `act1, ive')dnl
10295 define(`Active', `Act2, Ive')dnl
10296 define(`ACTIVE', `ACT3, IVE')dnl
10297 define(`A', `OOPS')dnl
10298 capitalize(active; `active'; ``active''; ```actIVE''')
10299 @result{}Act1,Ive; Act2, Ive; Active; `Active'
10300 undivert(`capitalize2.m4')dnl
10301 @result{}divert(`-1')
10302 @result{}# upcase(text)
10303 @result{}# downcase(text)
10304 @result{}# capitalize(text)
10305 @result{}# change case of text, improved version
10306 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
10307 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
10308 @result{}define(`_arg1', `$1')
10309 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
10310 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
10311 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
10312 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
10313 @result{}define(`_capitalize_alt',
10314 @result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
10315 @result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
10316 @result{}define(`capitalize',
10317 @result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
10318 @result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
10319 @result{}divert`'dnl
10322 @node Improved fatal_error
10323 @section Solution for @code{fatal_error}
10325 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
10326 of @acronym{GNU} M4 earlier than 1.4.8, where invoking
10327 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
10328 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
10329 though all files start at line 1. Furthermore, versions earlier than
10330 1.4.6 did not support the @code{@w{__program__}} macro. If you want
10331 @code{fatal_error} to work across the entire 1.4.x release series, a
10332 better implementation would be:
10336 define(`fatal_error',
10337 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
10338 `:ifelse(__line__, `0', `',
10339 `__file__:__line__:')` fatal error: $*
10342 m4wrap(`divnum(`demo of internal message')
10343 fatal_error(`inside wrapped text')')
10346 @error{}m4:stdin:6: Warning: divnum: extra arguments ignored: 1 > 0
10348 @error{}m4:stdin:6: fatal error: inside wrapped text
10351 @c ========================================================== Appendices
10353 @node Copying This Package
10354 @appendix How to make copies of the overall M4 package
10355 @cindex License, code
10357 This appendix covers the license for copying the source code of the
10358 overall M4 package. This manual is under a different set of
10359 restrictions, covered later (@pxref{Copying This Manual}).
10362 * GNU General Public License:: License for copying the M4 package
10365 @node GNU General Public License
10366 @appendixsec License for copying the M4 package
10367 @cindex GPL, GNU General Public License
10368 @cindex GNU General Public License
10369 @cindex General Public License (GPL), GNU
10370 @include gpl-3.0.texi
10372 @node Copying This Manual
10373 @appendix How to make copies of this manual
10374 @cindex License, manual
10376 This appendix covers the license for copying this manual. Note that
10377 some of the longer examples in this manual are also distributed in the
10378 directory @file{m4-@value{VERSION}/@/examples/}, where a more
10379 permissive license is in effect when copying just the examples.
10382 * GNU Free Documentation License:: License for copying this manual
10385 @node GNU Free Documentation License
10386 @appendixsec License for copying this manual
10387 @cindex FDL, GNU Free Documentation License
10388 @cindex GNU Free Documentation License
10389 @cindex Free Documentation License (FDL), GNU
10390 @include fdl-1.3.texi
10393 @appendix Indices of concepts and macros
10396 * Macro index:: Index for all @code{m4} macros
10397 * Concept index:: Index for many concepts
10401 @appendixsec Index for all @code{m4} macros
10403 This index covers all @code{m4} builtins, as well as several useful
10404 composite macros. References are exclusively to the places where a
10405 macro is introduced the first time.
10409 @node Concept index
10410 @appendixsec Index for many concepts
10416 @c Local Variables:
10418 @c ispell-local-dictionary: "american"
10419 @c indent-tabs-mode: nil
10420 @c whitespace-check-buffer-indent: nil