1 \input texinfo @c -*- texinfo -*-
2 @comment ========================================================
3 @comment %**start of header
5 @settitle GNU M4 macro processor
8 @setcontentsaftertitlepage
18 @c The testsuite expects literal tab output in some examples, but
19 @c literal tabs in texinfo lead to formatting issues.
25 @c -------------------
26 @c The ARG is an optional argument. To be used for macro arguments in
27 @c their documentation.
29 @r{[}@var{\varname\}@r{]}
32 @c @dvar{ARG, DEFAULT}
33 @c -------------------
34 @c The ARG is an optional argument, defaulting to DEFAULT. To be used
35 @c for macro arguments in their documentation.
36 @macro dvar{varname, default}
37 @r{[}@var{\varname\} = @samp{\default\}@r{]}
40 @comment %**end of header
41 @comment ========================================================
45 This manual is for @acronym{GNU} M4 (version @value{VERSION}, @value{UPDATED}),
46 a package containing an implementation of the m4 macro language.
48 Copyright @copyright{} 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999,
49 2000, 2001, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
52 Permission is granted to copy, distribute and/or modify this document
53 under the terms of the @acronym{GNU} Free Documentation License,
54 Version 1.2 or any later version published by the Free Software
55 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
56 Back-Cover Texts. A copy of the license is included in the section
57 entitled ``@acronym{GNU} Free Documentation License.''
61 @dircategory GNU programming tools
63 * M4: (m4). A powerful macro processor.
67 @title GNU M4, version @value{VERSION}
68 @subtitle A powerful macro processor
69 @subtitle Edition @value{EDITION}, @value{UPDATED}
70 @author by Ren@'e Seindal
71 @author and Gary V. Vaughan
72 @author and Eric Blake
75 @vskip 0pt plus 1filll
87 @acronym{GNU} @code{m4} is an implementation of the traditional UNIX macro
88 processor. It is mostly SVR4 compatible, although it has some
89 extensions (for example, handling more than 9 positional parameters
90 to macros). @code{m4} also has builtin functions for including
91 files, running shell commands, doing arithmetic, etc. Autoconf needs
@acronym{GNU} @code{m4} for generating @file{configure} scripts, but not for
running them.
95 @acronym{GNU} @code{m4} was originally written by Ren@'e Seindal, with
96 subsequent changes by Fran@,{c}ois Pinard and other volunteers
97 on the Internet. All names and email addresses can be found in the
98 files @file{m4-@value{VERSION}/@/AUTHORS} and
99 @file{m4-@value{VERSION}/@/THANKS} from the @acronym{GNU} M4
103 This is release @value{VERSION}. It is now considered stable: future
104 releases on this branch are only meant to fix bugs, increase speed, or
105 improve documentation.
This is BETA release @value{VERSION}. This is a development release,
and as such, is prone to bugs, crashes, unforeseen features, and incomplete
documentation, so use it at your own peril. In case of
112 problems, please do not hesitate to report them (see the
113 @file{m4-@value{VERSION}/@/README} file in the distribution).
118 * Preliminaries:: Introduction and preliminaries
119 * Invoking m4:: Invoking @code{m4}
120 * Syntax:: Lexical and syntactic conventions
122 * Macros:: How to invoke macros
123 * Definitions:: How to define new macros
124 * Conditionals:: Conditionals, loops, and recursion
126 * Debugging:: How to debug macros and input
128 * Input Control:: Input control
129 * File Inclusion:: File inclusion
130 * Diversions:: Diverting and undiverting output
132 * Modules:: Extending M4 with dynamic runtime modules
134 * Text handling:: Macros for text handling
135 * Arithmetic:: Macros for doing arithmetic
136 * Shell commands:: Macros for running shell commands
137 * Miscellaneous:: Miscellaneous builtin macros
138 * Frozen files:: Fast loading of frozen state
140 * Compatibility:: Compatibility with other versions of @code{m4}
141 * Answers:: Correct version of some examples
142 * Copying This Manual:: How to make copies of this manual
143 * Indices:: Indices of concepts and macros
146 --- The Detailed Node Listing ---
148 Introduction and preliminaries
150 * Intro:: Introduction to @code{m4}
151 * History:: Historical references
152 * Bugs:: Problems and bugs
153 * Manual:: Using this manual
157 * Operation modes:: Command line options for operation modes
158 * Dynamic loading features:: Command line options for dynamic loading
159 * Preprocessor features:: Command line options for preprocessor features
160 * Limits control:: Command line options for limits control
161 * Frozen state:: Command line options for frozen state
162 * Debugging options:: Command line options for debugging
163 * Command line files:: Specifying input files on the command line
165 Lexical and syntactic conventions
167 * Names:: Macro names
168 * Quoted strings:: Quoting input to @code{m4}
169 * Comments:: Comments in @code{m4} input
170 * Other tokens:: Other kinds of input tokens
171 * Input processing:: How @code{m4} copies input to output
172 * Regular expression syntax:: How @code{m4} interprets regular expressions
176 * Invocation:: Macro invocation
177 * Inhibiting Invocation:: Preventing macro invocation
178 * Macro Arguments:: Macro arguments
179 * Quoting Arguments:: On Quoting Arguments to macros
180 * Macro expansion:: Expanding macros
182 How to define new macros
184 * Define:: Defining a new macro
185 * Arguments:: Arguments to macros
186 * Pseudo Arguments:: Special arguments to macros
187 * Undefine:: Deleting a macro
188 * Defn:: Renaming macros
189 * Pushdef:: Temporarily redefining macros
190 * Renamesyms:: Renaming macros with regular expressions
192 * Indir:: Indirect call of macros
193 * Builtin:: Indirect call of builtins
194 * M4symbols:: Getting the defined macro names
196 Conditionals, loops, and recursion
198 * Ifdef:: Testing if a macro is defined
199 * Ifelse:: If-else construct, or multibranch
200 * Shift:: Recursion in @code{m4}
201 * Forloop:: Iteration by counting
202 * Foreach:: Iteration by list contents
204 How to debug macros and input
206 * Dumpdef:: Displaying macro definitions
207 * Trace:: Tracing macro calls
208 * Debugmode:: Controlling debugging options
209 * Debuglen:: Limiting debug output
210 * Debugfile:: Saving debugging output
214 * Dnl:: Deleting whitespace in input
215 * Changequote:: Changing the quote characters
216 * Changecom:: Changing the comment delimiters
217 * Changeresyntax:: Changing the regular expression syntax
218 * Changesyntax:: Changing the lexical structure of the input
219 * M4wrap:: Saving text until end of input
223 * Include:: Including named files
224 * Search Path:: Searching for include files
226 Diverting and undiverting output
228 * Divert:: Diverting output
229 * Undivert:: Undiverting output
230 * Divnum:: Diversion numbers
231 * Cleardivert:: Discarding diverted text
233 Extending M4 with dynamic runtime modules
235 * M4modules:: Listing loaded modules
236 * Load:: Loading additional modules
237 * Unload:: Removing loaded modules
238 * Standard Modules:: Standard bundled modules
240 Macros for text handling
242 * Len:: Calculating length of strings
243 * Index macro:: Searching for substrings
244 * Regexp:: Searching for regular expressions
245 * Substr:: Extracting substrings
246 * Translit:: Translating characters
247 * Patsubst:: Substituting text by regular expression
248 * Format:: Formatting strings (printf-like)
250 Macros for doing arithmetic
252 * Incr:: Decrement and increment operators
253 * Eval:: Evaluating integer expressions
254 * Mpeval:: Multiple precision arithmetic
256 Macros for running shell commands
258 * Platform macros:: Determining the platform
259 * Syscmd:: Executing simple commands
260 * Esyscmd:: Reading the output of commands
261 * Sysval:: Exit status
262 * Mkstemp:: Making temporary files
263 * Mkdtemp:: Making temporary directories
265 Miscellaneous builtin macros
267 * Errprint:: Printing error messages
268 * Location:: Printing current location
269 * M4exit:: Exiting from @code{m4}
270 * Syncoutput:: Turning on and off sync lines
272 Fast loading of frozen state
274 * Using frozen files:: Using frozen files
275 * Frozen file format 1:: Frozen file format 1
276 * Frozen file format 2:: Frozen file format 2
278 Compatibility with other versions of @code{m4}
280 * Extensions:: Extensions in @acronym{GNU} M4
281 * Incompatibilities:: Other incompatibilities
282 * Experiments:: Experimental features in @acronym{GNU} M4
284 Correct version of some examples
286 * Improved exch:: Solution for @code{exch}
287 * Improved forloop:: Solution for @code{forloop}
288 * Improved foreach:: Solution for @code{foreach}
289 * Improved cleardivert:: Solution for @code{cleardivert}
290 * Improved fatal_error:: Solution for @code{fatal_error}
292 How to make copies of this manual
294 * GNU Free Documentation License:: License for copying this manual
296 Indices of concepts and macros
298 * Concept index:: Index for many concepts
299 * Macro index:: Index for all @code{m4} macros
305 @chapter Introduction and preliminaries
307 This first chapter explains what @acronym{GNU} @code{m4} is, where @code{m4}
308 comes from, how to read and use this documentation, how to call the
309 @code{m4} program, and how to report bugs about it. It concludes by
310 giving tips for reading the remainder of the manual.
312 The following chapters then detail all the features of the @code{m4}
313 language, as shipped in the @acronym{GNU} M4 package.
316 * Intro:: Introduction to @code{m4}
317 * History:: Historical references
318 * Bugs:: Problems and bugs
319 * Manual:: Using this manual
323 @section Introduction to @code{m4}
325 @code{m4} is a macro processor, in the sense that it copies its
326 input to the output, expanding macros as it goes. Macros are either
327 builtin or user-defined, and can take any number of arguments.
328 Besides just doing macro expansion, @code{m4} has builtin functions
329 for including named files, running shell commands, doing integer
330 arithmetic, manipulating text in various ways, performing recursion,
331 etc.@dots{} @code{m4} can be used either as a front-end to a compiler,
332 or as a macro processor in its own right.
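
As a minimal sketch of this copy-and-expand behavior, the following
input defines a hypothetical macro named @code{greet} (chosen only for
illustration) and then uses it. Text that names no macro is copied
unchanged, and the newline that follows the @code{define} call is copied
as well, which accounts for the blank output line. Lines starting with
@samp{@result{}} show the output.

@example
define(`greet', `Hello, $1!')
@result{}
greet(`world')
@result{}Hello, world!
This line has no macros.
@result{}This line has no macros.
@end example
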
334 The @code{m4} macro processor is widely available on all UNIXes, and has
335 been standardized by @acronym{POSIX}.
336 Usually, only a small percentage of users are aware of its existence.
337 However, those who find it often become committed users. The
338 popularity of @acronym{GNU} Autoconf, which requires @acronym{GNU}
339 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will not themselves
341 program in @code{m4}. @acronym{GNU} @code{m4} is mostly compatible with the
342 System V, Release 3 version, except for some minor differences.
343 @xref{Compatibility}, for more details.
345 Some people find @code{m4} to be fairly addictive. They first use
346 @code{m4} for simple problems, then take bigger and bigger challenges,
347 learning how to write complex sets of @code{m4} macros along the way.
Once really addicted, users go on to write sophisticated @code{m4}
applications even to solve simple problems, devoting more time to
debugging their @code{m4} scripts than to doing real work. Beware that
351 @code{m4} may be dangerous for the health of compulsive programmers.
354 @section Historical references
@code{GPM} was an important ancestor of @code{m4}. See
C. Strachey: ``A General Purpose Macrogenerator'', Computer Journal
8,3 (1965), pp. 225 ff. @code{GPM} is also succinctly described in
David Gries's classic ``Compiler Construction for Digital Computers''.
361 The classic B. Kernighan and P.J. Plauger: ``Software Tools'',
362 Addison-Wesley, Inc. (1976) describes and implements a Unix
363 macro-processor language, which inspired Dennis Ritchie to write
364 @code{m3}, a macro processor for the AP-3 minicomputer.
366 Kernighan and Ritchie then joined forces to develop the original
367 @code{m4}, as described in ``The M4 Macro Processor'', Bell
368 Laboratories (1977). It had only 21 builtin macros.
370 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
371 the true intricacies of real life: macros can be recognized without
372 being pre-announced, skipping whitespace or end-of-lines is easier,
373 more constructs are builtin instead of derived, etc.
375 Originally, the Kernighan and Plauger macro-processor, and then
376 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
377 that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
378 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
380 Ren@'e Seindal released his implementation of @code{m4}, @acronym{GNU}
382 in 1990, with the aim of removing the artificial limitations in many
383 of the traditional @code{m4} implementations, such as maximum line
384 length, macro size, or number of macros.
386 The late Professor A. Dain Samples described and implemented a further
387 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
388 Language: 2nd edition'', Electronic Announcement on comp.compilers
391 Fran@,{c}ois Pinard took over maintenance of @acronym{GNU} @code{m4} in
392 1992, until 1994 when he released @acronym{GNU} @code{m4} 1.4, which was
393 the stable release for 10 years. It was at this time that @acronym{GNU}
394 Autoconf decided to require @acronym{GNU} @code{m4} as its underlying
395 engine, since all other implementations of @code{m4} had too many
More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which
addressed some long-standing bugs in the venerable 1.4 release. Then in
400 2005, Gary V. Vaughan collected together the many patches to
401 @acronym{GNU} @code{m4} 1.4 that were floating around the net and
402 released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
403 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8. The
404 1.4.x series remains open for bug fixes, including release 1.4.9 in
407 Meanwhile, development was underway for new features for @code{m4},
408 such as dynamic module loading and additional builtins, practically
409 rewriting the entire code base. This development has spurred
410 improvements to other @acronym{GNU} software, such as @acronym{GNU}
411 Libtool. @acronym{GNU} M4 2.0 is the result of this effort.
414 @section Problems and bugs
416 If you have problems with @acronym{GNU} M4 or think you've found a bug,
417 please report it. Before reporting a bug, make sure you've actually
418 found a real bug. Carefully reread the documentation and see if it
419 really says you can do what you're trying to do. If it's not clear
420 whether you should be able to do something or not, report that too; it's
421 a bug in the documentation!
423 Before reporting a bug or trying to fix it yourself, try to isolate it
424 to the smallest possible input file that reproduces the problem. Then
425 send us the input file and the exact results @code{m4} gave you. Also
426 say what you expected to occur; this will help us decide whether the
427 problem was really in the documentation.
429 Once you've got a precise problem, send e-mail to (Internet)
430 @email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
431 you are using. You can get this information with the command
432 @kbd{m4 --version}. You can also run @kbd{make check} to generate the file
433 @file{tests/@/testsuite.log}, useful for including in your report.
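
A report might be assembled along the following lines. This is only a
sketch: the file names @file{bug.m4}, @file{bug.out}, and @file{bug.err}
are hypothetical placeholders for your own reduced input file and the
captured output and error messages.

@example
$ @kbd{m4 --version}
$ @kbd{m4 bug.m4 > bug.out 2> bug.err}
@end example
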
435 Non-bug suggestions are always welcome as well. If you have questions
436 about things that are unclear in the documentation or are just obscure
437 features, please report them too.
440 @section Using this manual
442 This manual contains a number of examples of @code{m4} input and output,
443 and a simple notation is used to distinguish input, output and error
messages from @code{m4}. Examples are set out from the normal text, and
shown in a fixed width font, like this:
449 This is an example of an example!
452 To distinguish input from output, all output from @code{m4} is prefixed
453 by the string @samp{@result{}}, and all error messages by the string
@samp{@error{}}. When showing how command line options affect matters,
the command line is shown with a prompt @samp{$ @kbd{like this}};
otherwise, you can assume that a simple @kbd{m4} invocation will work.
461 $ @kbd{command line to invoke m4}
462 Example of input line
463 @result{}Output line from m4
464 @error{}and an error message
467 The sequence @samp{^D} in an example indicates the end of the input file.
468 The majority of these examples are self-contained, and you can run them
469 with similar results. In fact, the testsuite that is bundled in the
470 @acronym{GNU} M4 package consists in part of the examples
471 in this document! Some of the examples assume that your current
472 directory is located where you unpacked the installation, so if you plan
473 on following along, you may find it helpful to do this now:
477 $ @kbd{cd m4-@value{VERSION}}
480 As each of the predefined macros in @code{m4} is described, a prototype
481 call of the macro will be shown, giving descriptive names to the
484 @deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @
485 @ovar{argument}@dots{})
486 This is a sample prototype. There is not really a macro named
487 @code{example}, but this documents that if there were, it would be a
488 Composite macro, rather than a Builtin, and would be provided by the
491 It requires at least one argument, @var{string}. Remember that in
492 @code{m4}, there must not be a space between the macro name and the
493 opening parenthesis, unless it was intended to call the macro without
494 any arguments. The brackets around @var{count} and @var{argument} show
495 that these arguments are optional. If @var{count} is omitted, the macro
behaves as if @var{count} were @samp{1}, whereas if @var{argument} is omitted,
497 the macro behaves as if it were the empty string. A blank argument is
498 not the same as an omitted argument. For example, @samp{example(`a')},
499 @samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave
500 identically with @var{count} set to @samp{1}; while @samp{example(`a',)}
501 and @samp{example(`a',`')} would explicitly pass the empty string for
502 @var{count}. The ellipses (@samp{@dots{}}) show that the macro
processes additional arguments after @var{argument}, rather than
ignoring them.
507 Each builtin definition will list, in parentheses, the module that must
508 be loaded to use that macro. The standard modules include
509 @samp{m4} (which is always available), @samp{gnu} (for @acronym{GNU} specific
m4 extensions), and @samp{traditional} (for compatibility with System V
@code{m4}).
514 All macro arguments in @code{m4} are strings, but some are given
515 special interpretation, e.g., as numbers, file names, regular
516 expressions, etc. The documentation for each macro will state how the
517 parameters are interpreted, and what happens if the argument cannot be
518 parsed according to the desired interpretation. Unless specified
519 otherwise, a parameter specified to be a number is parsed as a decimal,
520 even if the argument has leading zeros; and parsing the empty string as
a number results in 0 rather than an error, although a warning will be
issued.
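
As a small illustration, assuming the standard @code{incr} and
@code{eval} builtins are available: @code{incr} takes a number, so a
leading zero does not change the base, whereas @code{eval} takes an
expression and documents its own rule that a leading @samp{0}
introduces an octal number.

@example
incr(`010')
@result{}11
eval(`010')
@result{}8
@end example
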
524 This document consistently writes and uses @dfn{builtin}, without a
525 hyphen, as if it were an English word. This is how the @code{builtin}
526 primitive is spelled within @code{m4}.
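
The earlier remarks about omitted versus blank arguments, and about the
space before the opening parenthesis, can be checked with a short
sketch. The macro @code{nargs} is hypothetical, defined here only for
illustration; it expands to @samp{$#}, the number of arguments in the
call.

@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`a',)
@result{}2
nargs (`a')
@result{}0 (a)
@end example

In the last line, the space means that @code{nargs} is called without
arguments, and the parenthesized text is then copied to the output as
ordinary input.
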
529 @chapter Invoking @code{m4}
531 The format of the @code{m4} command is:
535 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
538 @cindex command line, options
539 @cindex options, command line
540 @cindex @env{POSIXLY_CORRECT}
541 All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
unambiguous prefix is sufficient. @acronym{POSIX} requires @code{m4} to
543 unambiguous prefix is sufficient. @acronym{POSIX} requires @code{m4} to
544 recognize arguments intermixed with files, even when
545 @env{POSIXLY_CORRECT} is set in the environment. Most options take
546 effect at startup regardless of their position, but some are documented
547 below as taking effect after any files that occurred earlier in the
command line. The argument @option{--} is a marker to denote the end of
options.
With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options; options
with mandatory arguments may be provided either as a single command line
argument or as two arguments; and options with optional arguments must
be provided as a single argument. In other words,
556 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
557 @kbd{m4 -Q -P -D foo -d -d+f -- ./a}, although the latter form is
558 considered canonical.
560 With long options, options with mandatory arguments may be provided with
561 an equal sign (@samp{=}) in a single argument, or as two arguments, and
562 options with optional arguments must be provided as a single argument.
563 In other words, @kbd{m4 --def foo --debug a} is equivalent to
564 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
565 considered canonical (not to mention more robust, in case a future
566 version of @code{m4} introduces an option named @option{--default}).
568 @code{m4} understands the following options, grouped by functionality.
571 * Operation modes:: Command line options for operation modes
572 * Dynamic loading features:: Command line options for dynamic loading
573 * Preprocessor features:: Command line options for preprocessor features
574 * Limits control:: Command line options for limits control
575 * Frozen state:: Command line options for frozen state
576 * Debugging options:: Command line options for debugging
577 * Command line files:: Specifying input files on the command line
580 @node Operation modes
581 @section Command line options for operation modes
583 Several options control the overall operation of @code{m4}:
587 Print a help summary on standard output, then immediately exit
588 @code{m4} without reading any input files or performing any other
592 Print the version number of the program on standard output, then
593 immediately exit @code{m4} without reading any input files or
594 performing any other actions.
598 Makes this invocation of @code{m4} non-interactive. This means that
599 output will be buffered, and interrupts will halt execution. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
601 when any input files are specified, or when either standard input or
602 standard error is not a terminal. Note that this means that @kbd{m4}
603 alone might be interactive, but @kbd{m4 -} is not, even though both
604 commands process only standard input. If both @option{-b} and
605 @option{-i} are specified, only the last one takes effect.
608 @itemx --discard-comments
609 Discard all comments instead of copying them to the output.
612 @itemx --fatal-warnings
613 Stop execution and exit @code{m4} once the first warning or error has
614 been issued, considering all of them to be fatal.
619 Makes this invocation of @code{m4} interactive. This means that all
620 output will be unbuffered, and interrupts will be ignored. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
when no input files are specified, and when both standard input and
standard error are terminals (similar to the way that @file{/bin/sh} determines
624 when to be interactive). If both @option{-b} and @option{-i} are
625 specified, only the last one takes effect. The spelling @option{-e}
626 exists for compatibility with other @code{m4} implementations, and
627 issues a warning because it may be withdrawn in a future version of
631 @itemx --prefix-builtins
632 Internally modify @emph{all} builtin macro names so they all start with
633 the prefix @samp{m4_}. For example, using this option, one should write
634 @samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}}
635 instead of @samp{@w{__file__}}. This option has no effect if @option{-R}
641 Suppress warnings, such as missing or superfluous arguments in macro
642 calls, or treating the empty string as zero. Error messages are still
643 printed. The distinction between error and warning is fuzzy, and if
644 you encounter a situation where the message output did not match your
645 expectations, please report that as a bug. This option is implied if
646 @env{POSIXLY_CORRECT} is set in the environment.
648 @item -r@r{[}@var{RESYNTAX-SPEC}@r{]}
649 @itemx --regexp-syntax@r{[}=@var{RESYNTAX-SPEC}@r{]}
650 Set the regular expression syntax according to @var{RESYNTAX-SPEC}.
When this option is not given, or @var{RESYNTAX-SPEC} is omitted,
@acronym{GNU} M4 uses Emacs-compatible regular expressions.
653 @xref{Changeresyntax}, for more details on the format and meaning of
654 @var{RESYNTAX-SPEC}. This option may be given more than once, and order
655 with respect to file names is significant.
658 Cripple the following builtins, since each can perform potentially
659 unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}),
660 @code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}),
661 @code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}).
662 An attempt to use any of these macros will result in an error. This
663 option is intended to make it safer to preprocess an input file of
668 Enable warnings. Warnings are on by default unless
669 @env{POSIXLY_CORRECT} was set in the environment; this option exists to
670 allow overriding @option{--silent}.
671 @comment FIXME should we accept -Wall, -Wnone, -Wcategory,
672 @comment -Wno-category...?
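
To illustrate the effect of @option{--prefix-builtins} described above,
here is a sketch of a short session. The macro name @code{x} is chosen
only for illustration. Because @code{define} is no longer a macro name
under this option, the first line is copied to the output with its
quotes stripped instead of being expanded.

@example
$ @kbd{m4 -P}
define(`x', `1')
@result{}define(x, 1)
m4_define(`x', `1')
@result{}
x
@result{}1
@end example
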
675 @node Dynamic loading features
676 @section Command line options for dynamic loading
678 On platforms that support dynamic libraries, there are some options
679 that affect dynamic loading.
682 @item -M @var{DIRECTORY}
683 @itemx --module-directory=@var{DIRECTORY}
684 Specify an alternate @var{DIRECTORY} to search for modules. This option
685 can be used multiple times to add several different directories to the
686 module search path. @xref{Modules}, for more details.
688 @item -m @var{MODULE}
689 @itemx --load-module=@var{MODULE}
690 Load @var{MODULE} before parsing more input files. @var{MODULE} is
691 searched for in each directory of the module search path, until the
692 first match is found or the list is exhausted. @xref{Modules}, for more
693 details. By default, the modules @samp{m4}, @samp{traditional}, and
694 @samp{gnu} are preloaded, although this can be controlled during
695 configuration with the @option{--with-modules} option to
696 @file{m4-@value{VERSION}/@/configure}. This option may be given more
697 than once, and order with respect to file names is significant.
699 @item --unload-module=@var{MODULE}
700 Unload @var{MODULE} before parsing more input files. @xref{Modules},
701 for more details. This option may be given more than once, and order
702 with respect to file names is significant.
705 @node Preprocessor features
706 @section Command line options for preprocessor features
708 @cindex macro definitions, on the command line
709 @cindex command line, macro definitions on the
710 Several options allow @code{m4} to behave more like a preprocessor.
711 Macro definitions and deletions can be made on the command line, the
712 search path can be altered, and the output file can track where the
713 input came from. These features occur with the following options:
716 @item -B @var{DIRECTORY}
717 @itemx --prepend-include=@var{DIRECTORY}
718 Make @code{m4} search @var{DIRECTORY} for included files, prior to
719 searching the current working directory. @xref{Search Path}, for more
720 details. This option may be given more than once. Some other
721 implementations of @code{m4} use @code{-B @var{number}} to change their
722 hard-coded limits, but that is unnecessary in @acronym{GNU} where the
723 only limit is your hardware capability. So although it is unlikely that
724 you will want to include a relative directory whose name is purely
725 numeric, @acronym{GNU} @code{m4} will warn you about this potential
726 compatibility issue; you can avoid the warning by using the long
727 spelling, or by using @samp{./@var{number}} if you really meant it.
729 @item -D @var{NAME}@r{[}=@var{VALUE}@r{]}
730 @itemx --define=@var{NAME}@r{[}=@var{VALUE}@r{]}
731 This enters @var{NAME} into the symbol table, before any input files are
732 read. If @samp{=@var{VALUE}} is missing, the value is taken to be the
733 empty string. The @var{VALUE} can be any string, and the macro can be
734 defined to take arguments, just as if it was defined from within the
735 input. This option may be given more than once; order with respect to
736 file names is significant, and redefining the same @var{NAME} loses the
739 @item --import-environment
740 Imports every variable in the environment as a macro. This is done
741 before @option{-D} and @option{-U}, so they can override the
744 @item -I @var{DIRECTORY}
745 @itemx --include=@var{DIRECTORY}
746 Make @code{m4} search @var{DIRECTORY} for included files that are not
747 found in the current working directory. @xref{Search Path}, for more
748 details. This option may be given more than once.
750 @item --popdef=@var{NAME}
751 This deletes the top-most meaning @var{NAME} might have. Obviously,
752 only predefined macros can be deleted in this way. This option may be
753 given more than once; popping a @var{NAME} that does not have a
definition is silently ignored. Order is significant with respect to
file names.
757 @item -p @var{NAME}@r{[}=@var{VALUE}@r{]}
758 @itemx --pushdef=@var{NAME}@r{[}=@var{VALUE}@r{]}
759 This enters @var{NAME} into the symbol table, before any input files are
760 read. If @samp{=@var{VALUE}} is missing, the value is taken to be the
761 empty string. The @var{VALUE} can be any string, and the macro can be
762 defined to take arguments, just as if it was defined from within the
763 input. This option may be given more than once; order with respect to
764 file names is significant, and redefining the same @var{NAME} adds
765 another definition to its stack.
769 Short for @option{--syncoutput=1}, turning synchronization lines on.
@item --syncoutput@r{[}=@var{STATE}@r{]}
772 Control the generation of synchronization lines from the command line.
773 Synchronization lines are for use by the C preprocessor or other
774 similar tools. Order is significant with respect to file names. This
775 option is useful, for example, when @code{m4} is used as a
776 front end to a compiler. Source file name and line number information
777 is conveyed by directives of the form @samp{#line @var{linenum}
778 "@var{file}"}, which are inserted as needed into the middle of the
779 output. Such directives mean that the following line originated or was
780 expanded from the contents of input file @var{file} at line
781 @var{linenum}. The @samp{"@var{file}"} part is often omitted when
782 the file name did not change from the previous directive.
784 Synchronization directives are always given on complete lines by
785 themselves. When a synchronization discrepancy occurs in the middle of
786 an output line, the associated synchronization directive is delayed
787 until the beginning of the next generated line. @xref{Syncoutput}, for
runtime control. @var{STATE} is interpreted the same as the argument to
@code{syncoutput}; if @var{STATE} is omitted, or @option{--syncoutput}
is not used, synchronization lines are disabled.
793 @itemx --undefine=@var{NAME}
794 This deletes any predefined meaning @var{NAME} might have. Obviously,
795 only predefined macros can be deleted in this way. This option may be
796 given more than once; undefining a @var{NAME} that does not have a
definition is silently ignored. Order is significant with respect to
file names.
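
As a sketch of how command line definitions and deletions behave, the
following session defines two macros with @option{-D} and removes the
builtin @code{len} with @option{-U}. The macro names @code{greeting}
and @code{twice} are hypothetical, chosen only for illustration; the
second definition is quoted for the shell so that @samp{$1} reaches
@code{m4} unchanged.

@example
$ @kbd{m4 -Dgreeting=hello '-Dtwice=$1$1' -Ulen}
greeting, world
@result{}hello, world
twice(`ab')
@result{}abab
len(`abc')
@result{}len(abc)
@end example

Since @code{len} no longer names a macro, its call is copied to the
output as plain text, with the quotes stripped.
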
802 @section Command line options for limits control
804 There are some limits within @code{m4} that can be tuned. For
805 compatibility, @code{m4} also accepts some options that control limits
806 in other implementations, but which are automatically unbounded (limited
807 only by your hardware and operating system constraints) in @acronym{GNU}
813 Enable all the extensions in this implementation. This is on by
814 default unless @env{POSIXLY_CORRECT} is set in the environment; it
815 exists to allow overriding @option{--traditional}.
820 Suppress all the extensions made in this implementation, compared to the
821 System V version. @xref{Compatibility}, for a list of these. This
822 loads the @samp{traditional} module in place of the @samp{gnu} module.
823 It is implied if @env{POSIXLY_CORRECT} is set in the environment.
826 @itemx --nesting-limit=@var{NUM}
827 Artificially limit the nesting of macro calls to @var{NUM} levels,
828 stopping program execution if this limit is ever exceeded. When not
829 specified, nesting is limited to 1024 levels. A value of zero means
830 unlimited; but then heavily nested code could potentially cause a stack
831 overflow. @var{NUM} can have an optional scaling suffix.
832 @comment FIXME - need a node on what scaling suffixes are supported (see
833 @comment [info coreutils 'block size'] for ideas), and need to consider
834 @comment whether builtins should also understand scaling suffixes:
835 @comment eval, mpeval, perhaps format
837 The precise effect of this option might be more correctly associated
838 with textual nesting than dynamic recursion. It has been useful
839 when some complex @code{m4} input was generated by mechanical means.
840 Most users would never need this option. If shown to be obtrusive,
841 this option (which is still experimental) might well disappear.
843 This option does @emph{not} have the ability to break endless
844 rescanning loops, since these do not necessarily consume much memory
845 or stack space. Through clever usage of rescanning loops, one can
846 request complex, time-consuming computations from @code{m4} with useful
results.  Putting limitations in this area would undermine the power of @code{m4}.
848 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
849 only the simplest example (but @pxref{Compatibility}). Expecting @acronym{GNU}
850 @code{m4} to detect these would be a little like expecting a compiler
system to detect and diagnose endless loops: it is quite a @emph{hard}
852 problem in general, if not undecidable!
855 @itemx --hashsize=@var{NUM}
857 @itemx --diversions=@var{NUM}
858 @itemx --word-regexp=@var{REGEXP}
859 These options are present only for compatibility with previous versions
of @acronym{GNU} @code{m4}.  They do nothing except issue a warning, because the
861 symbol table size and number of diversions are not fixed anymore, and
862 because the new @code{changesyntax} feature is more efficient than the
863 withdrawn experimental @code{changeword}. These options will eventually
864 disappear in future releases.
868 These options are present for compatibility with System V @code{m4}, but
869 do nothing in this implementation. They may disappear in future
870 releases, and issue a warning to that effect.
874 @section Command line options for frozen state
876 @acronym{GNU} @code{m4} comes with a feature of freezing internal state
877 (@pxref{Frozen files}). This can be used to speed up @code{m4}
878 execution when reusing a common initialization script.
882 @itemx --freeze-state=@var{FILE}
883 Once execution is finished, write out the frozen state on the specified
884 @var{FILE}. It is conventional, but not required, for @var{FILE} to end
888 @itemx --reload-state=@var{FILE}
889 Before execution starts, recover the internal state from the specified
890 frozen @var{FILE}. The options @option{-D}, @option{-U}, @option{-t},
891 @option{-m}, @option{-r}, and @option{--import-environment} take effect
892 after state is reloaded, but before the input files are read.
895 @node Debugging options
896 @section Command line options for debugging
898 Finally, there are several options for aiding in debugging @code{m4}
902 @item -d@r{[}@var{FLAGS}@r{]}
903 @itemx --debug@r{[}=@var{FLAGS}@r{]}
904 @itemx --debugmode@r{[}=@var{FLAGS}@r{]}
905 Set the debug-level according to the flags @var{FLAGS}. The debug-level
906 controls the format and amount of information presented by the debugging
907 functions. @xref{Debugmode}, for more details on the format and
908 meaning of @var{FLAGS}. If omitted, @var{FLAGS} defaults to
909 @samp{aeq}. When the option is presented multiple times, if later
910 @var{FLAGS} starts with @samp{-} or @samp{+}, they are cumulative,
911 otherwise the later flags override all earlier occurrences. The
912 spelling @option{--debug} is recognized as an unambiguous option for
913 compatibility with earlier versions of @acronym{GNU} M4, but for
914 consistency with the builtin name, you can also use the spelling
915 @option{--debugmode}.
917 @item --debugfile=@var{FILE}
919 @itemx --error-output=@var{FILE}
Redirect debug and trace output to the named @var{FILE}.  Warnings,
error messages, and the output of @code{errprint} and @code{dumpdef}
are still printed to standard error.  If this option is not given, debug
923 output goes to standard error; if @var{FILE} is the empty string, debug
924 output is discarded. @xref{Debugfile}, for more details. The
925 spellings @option{-o} and @option{--error-output} are misleading and
926 inconsistent with other @acronym{GNU} tools; using those spellings will
927 evoke a warning, and they may be withdrawn or change semantics in a
931 @itemx --debuglen=@var{NUM}
932 @itemx --arglength=@var{NUM}
933 Restrict the size of the output generated by macro tracing or by
934 @code{dumpdef} to @var{NUM} characters per string. If unspecified or
935 zero, output is unlimited. @xref{Debuglen}, for more details.
936 @var{NUM} can have an optional scaling suffix. The spelling
937 @option{--arglength} is deprecated, since it does not match the
938 @code{debuglen} macro; using it will evoke a warning, and it may be
939 withdrawn in a future release.
940 @comment FIXME - Should we add an option that controls whether output
941 @comment strings are sanitized with escape sequences, so that dumpdef is
942 @comment truly one line per macro?
943 @comment FIXME - see comment on --nesting-limit about NUM.
946 @itemx --trace=@var{NAME}
947 @itemx --traceon=@var{NAME}
948 This enables tracing for the macro @var{NAME}, at any point where it is
949 defined. @var{NAME} need not be defined when this option is given.
950 This option may be given more than once, and order is significant with
951 respect to file names. @xref{Trace}, for more details.
953 @item --traceoff=@var{NAME}
954 This disables tracing for the macro @var{NAME}, at any point where it is
955 defined. @var{NAME} need not be defined when this option is given.
956 This option may be given more than once, and order is significant with
957 respect to file names. @xref{Trace}, for more details.
960 @node Command line files
961 @section Specifying input files on the command line
963 @cindex command line, file names on the
964 @cindex file names, on the command line
965 The remaining arguments on the command line are taken to be input file
966 names. If no names are present, standard input is read. A file
967 name of @file{-} is taken to mean standard input. It is
968 conventional, but not required, for input files to end in @samp{.m4}.
970 The input files are read in the sequence given. Standard input can be
971 read more than once, so the file name @file{-} may appear multiple times
972 on the command line; this makes a difference when input is from a
973 terminal or other special file type. It is an error if an input file
974 ends in the middle of argument collection, a comment, or a quoted
976 @comment FIXME - it would be nicer if we let these three things
977 @comment continue across file boundaries, provided that we warn in
978 @comment interactive use when switching to stdin in a non-default parse
981 Various options, such as @option{--define} (@option{-D}), @option{--undefine}
982 (@option{-U}), @option{--synclines} (@option{-s}), @option{--trace}
983 (@option{-t}), @option{--regexp-syntax} (@option{-r}), and
984 @option{--load-module} (@option{-m}), only take effect after processing
985 input from any file names that occur earlier on the command line. For
986 example, assume the file @file{foo} contains:
994 The text @samp{bar} can then be redefined over multiple uses of
997 @comment options: -Dbar=hello foo -Dbar=world foo
999 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
1004 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
1005 exit status of @code{m4} will be 0 for success, 1 for general failure
1006 (such as problems with reading an input file), and 63 for version
1007 mismatch (@pxref{Using frozen files}).
1009 If you need to read a file whose name starts with a @file{-}, you can
1010 specify it as @samp{./-file}, or use @option{--} to mark the end of
1014 @chapter Lexical and syntactic conventions
1016 @cindex input tokens
1018 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
token is either a name, a quoted string, or any single character that
is not part of either a name or a string.  Input to @code{m4} can also
1021 contain comments. @acronym{GNU} @code{m4} does not yet understand
1022 multibyte locales; all operations are byte-oriented rather than
1023 character-oriented (although if your locale uses a single byte
1024 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1025 However, @code{m4} is eight-bit clean, so you can
1026 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1027 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1028 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1031 * Names:: Macro names
1032 * Quoted strings:: Quoting input to @code{m4}
1033 * Comments:: Comments in @code{m4} input
1034 * Other tokens:: Other kinds of input tokens
1035 * Input processing:: How @code{m4} copies input to output
1036 * Regular expression syntax:: How @code{m4} interprets regular expressions
1040 @section Macro names
1043 A name is any sequence of letters, digits, and the character @samp{_}
1044 (underscore), where the first character is not a digit. @code{m4} will
1045 use the longest such sequence found in the input. If a name has a
1046 macro definition, it will be subject to macro expansion
1047 (@pxref{Macros}). Names are case-sensitive.
1049 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
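
As a small illustrative sketch (the macro name @code{foo} is chosen only
for this example), the following shows that names are matched
case-sensitively and as the longest possible sequence, and that a
leading digit is not part of a name:

@example
define(`foo', `expanded')
@result{}
foo Foo foo1 _foo 1foo
@result{}expanded Foo foo1 _foo 1expanded
@end example

Only the first and last words contain the name @samp{foo} as a complete
token; in @samp{1foo}, the digit is a token by itself, so the name that
follows it is still recognized.
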
1051 The definitions of letters, digits and other input characters can be
1052 changed at any time, using the builtin macro @code{changesyntax}.
1053 @xref{Changesyntax}, for more information.
1055 @node Quoted strings
1056 @section Quoting input to @code{m4}
1058 @cindex quoted string
1059 A quoted string is a sequence of characters surrounded by quote
1060 strings, defaulting to
1061 @samp{`} and @samp{'}, where the nested begin and end quotes within the
1062 string are balanced. The value of a string token is the text, with one
1063 level of quotes stripped off. Thus
1072 is the empty string, and double-quoting turns into single-quoting.
1080 The quote characters can be changed at any time, using the builtin macros
1081 @code{changequote} (@pxref{Changequote}) or @code{changesyntax}
1082 (@pxref{Changesyntax}).
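
The effect of each level of quoting can be sketched as follows, assuming
the default quote strings and a hypothetical macro @code{foo} defined
only for this illustration:

@example
define(`foo', `expanded')
@result{}
foo `foo' ``foo''
@result{}expanded foo `foo'
@end example

The unquoted name is expanded, the singly quoted string loses its one
level of quotes, and the doubly quoted string keeps its inner level.
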
1085 @section Comments in @code{m4} input
1088 Comments in @code{m4} are normally delimited by the characters @samp{#}
1089 and newline. All characters between the comment delimiters are ignored,
1090 but the entire comment (including the delimiters) is passed through to
1091 the output, unless you supply the @option{--discard-comments} or
1092 @option{-c} option at the command line (@pxref{Operation modes, ,
1093 Invoking m4}). When discarding comments, the comment delimiters are
1094 discarded, even if the close-comment string is a newline.
1096 Comments cannot be nested, so the first newline after a @samp{#} ends
1097 the comment. The commenting effect of the begin-comment string
1098 can be inhibited by quoting it.
1102 `quoted text' # `commented text'
1103 @result{}quoted text # `commented text'
1104 `quoting inhibits' `#' `comments'
1105 @result{}quoting inhibits # comments
1108 @comment options: -c
1111 `quoted text' # `commented text'
1112 `quoting inhibits' `#' `comments'
1113 @result{}quoted text quoting inhibits # comments
1116 The comment delimiters can be changed to any string at any time, using
1117 the builtin macros @code{changecom} (@pxref{Changecom}) or
1118 @code{changesyntax} (@pxref{Changesyntax}).
1121 @section Other kinds of input tokens
Any character that is part of neither a name, a quoted string,
nor a comment is a token by itself.  When not in the context of macro
1125 expansion, all of these tokens are just copied to output. However,
1126 during macro expansion, whitespace characters (space, tab, newline,
1127 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1128 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1129 roles, explained later. Which characters actually perform these roles
1130 can be adjusted with @code{changesyntax} (@pxref{Changesyntax}).
1132 @node Input processing
1133 @section How @code{m4} copies input to output
As @code{m4} reads the input token by token, it copies each token
to the output immediately.
1138 The exception is when it finds a word with a macro definition. In that
1139 case @code{m4} will calculate the macro's expansion, possibly reading
1140 more input to get the arguments. It then inserts the expansion in front
1141 of the remaining input. In other words, the resulting text from a macro
1142 call will be read and parsed into tokens again.
1144 @code{m4} expands a macro as soon as possible. If it finds a macro call
1145 when collecting the arguments to another, it will expand the second
1146 call first. For a running example, examine how @code{m4} handles this
1151 format(`Result is %d', eval(`2**15'))
1155 First, @code{m4} sees that the token @samp{format} is a macro name, so
1156 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1157 and @samp{@w{ }}, before encountering another potential macro. Sure
1158 enough, @samp{eval} is a macro name, so the nested argument collection
1159 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1160 with the lone argument of @samp{2**15}. The expansion of
1161 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1162 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1163 combined with the next @samp{)}, the format macro now has all its
1164 arguments, as if the user had typed:
1168 format(`Result is %d', 32768)
1172 The format macro expands to @samp{Result is 32768}, and we have another
1173 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1174 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1175 @samp{8}. None of these are macros, so the final output is
1179 @result{}Result is 32768
1182 The order in which @code{m4} expands the macros can be explored using
1183 the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
1185 This process continues until there are no more macro calls to expand and
1186 all the input has been consumed.
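
As a further sketch (the macro @code{twice} is hypothetical, defined
only for this illustration), compare an unquoted nested call, which is
expanded while the argument is being collected, with a quoted one, which
is expanded only when the result is rescanned:

@example
define(`twice', `$1$1')
@result{}
twice(eval(`1 + 2'))
@result{}33
twice(`eval(`1 + 2')')
@result{}33
@end example

Both calls print the same text, but in the first case @code{eval} is
invoked once and its result is duplicated, while in the second case the
call itself is duplicated and @code{eval} is invoked twice during
rescanning.
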
1188 @node Regular expression syntax
1189 @section How @code{m4} interprets regular expressions
1191 There are several contexts where @code{m4} parses an argument as a
1192 regular expression. This section describes the various flavors of
1193 regular expressions. @xref{Changeresyntax}.
1195 @include regexprops-generic.texi
1198 @chapter How to invoke macros
1200 This chapter covers macro invocation, macro arguments and how macro
1201 expansion is treated.
1204 * Invocation:: Macro invocation
1205 * Inhibiting Invocation:: Preventing macro invocation
1206 * Macro Arguments:: Macro arguments
1207 * Quoting Arguments:: On Quoting Arguments to macros
1208 * Macro expansion:: Expanding macros
1212 @section Macro invocation
1214 @cindex macro invocation
A macro invocation has one of the forms
1223 which is a macro invocation without any arguments, or
1227 name(arg1, arg2, @dots{}, arg@var{n})
1231 which is a macro invocation with @var{n} arguments. Macros can have any
1232 number of arguments. All arguments are strings, but different macros
1233 might interpret the arguments in different ways.
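
The difference between these forms can be observed with a small sketch.
It assumes a hypothetical macro @code{count} that expands to the number
of arguments it received, using the @samp{$#} notation (@pxref{Pseudo
Arguments}):

@example
define(`count', `$#')
@result{}
count
@result{}0
count()
@result{}1
count(`a', `b')
@result{}2
count (`a', `b')
@result{}0 (a, b)
@end example

In the last line, the space before the parenthesis means that
@code{count} is called without arguments, and the parenthesized text is
then copied to the output like any other input.
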
1235 The opening parenthesis @emph{must} follow the @var{name} directly, with
1236 no spaces in between. If it does not, the macro is called with no
1239 For a macro call to have no arguments, the parentheses @emph{must} be
1240 left out. The macro call
1248 is a macro call with one argument, which is the empty string, not a call
1251 @node Inhibiting Invocation
1252 @section Preventing macro invocation
1254 An innovation of the @code{m4} language, compared to some of its
predecessors (like Strachey's @code{GPM}, for example), is the ability
1256 to recognize macro calls without resorting to any special, prefixed
1257 invocation character. While generally useful, this feature might
1258 sometimes be the source of spurious, unwanted macro calls. So, @acronym{GNU}
1259 @code{m4} offers several mechanisms or techniques for inhibiting the
1260 recognition of names as macro calls.
1262 @cindex @acronym{GNU} extensions
1264 @cindex macro, blind
1265 First of all, many builtin macros cannot meaningfully be called without
1266 arguments. As a @acronym{GNU} extension, for any of these macros,
1267 whenever an opening parenthesis does not immediately follow their name,
1268 the builtin macro call is not triggered. This solves the most usual
1269 cases, like for @samp{include} or @samp{eval}. Later in this document,
1270 the sentence ``This macro is recognized only with parameters'' refers to
1271 this specific provision of @acronym{GNU} M4, also known as a blind
1272 builtin macro. For the builtins defined by @acronym{POSIX} that bear
1273 this disclaimer, @acronym{POSIX} specifically states that invoking those
1274 builtins without arguments is unspecified, because many other
1275 implementations simply invoke the builtin as though it were given one
1276 empty argument instead.
1286 There is also a command line option (@option{--prefix-builtins}, or
1287 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1288 builtin macros with a prefix of @samp{m4_} at startup. The option has
1289 no effect whatsoever on user defined macros. For example, with this option,
1290 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1291 no effect on whether a macro requires parameters.
1293 @comment options: -P
Another alternative is to redefine problematic macros to a name less
likely to cause conflicts (@pxref{Definitions}).  Or the parsing engine
can be changed to redefine what constitutes a valid macro name
(@pxref{Changesyntax}).
1311 Of course, the simplest way to prevent a name from being interpreted
1312 as a call to an existing macro is to quote it. The remainder of
1313 this section studies a little more deeply how quoting affects macro
1314 invocation, and how quoting can be used to inhibit macro invocation.
1316 Even if quoting is usually done over the whole macro name, it can also
1317 be done over only a few characters of this name (provided, of course,
1318 that the unquoted portions are not also a macro). It is also possible
1319 to quote the empty string, but this works only @emph{inside} the name.
1334 all yield the string @samp{divert}. While in both:
1344 the @code{divert} builtin macro will be called, which expands to the
1347 The output of macro evaluations is always rescanned. The following
1348 example would yield the string @samp{de}, exactly as if @code{m4}
had been given @w{@samp{substr(`abcde', `3', `2')}} as input:
1352 define(`x', `substr(ab')
1354 define(`y', `cde, `3', `2')')
1360 Unquoted strings on either side of a quoted string are subject to
1361 being recognized as macro names. In the following example, quoting the
1362 empty string allows for the second @code{macro} to be recognized as such:
1365 define(`macro', `m')
1373 Quoting may prevent recognizing as a macro name the concatenation of a
1374 macro expansion with the surrounding characters. In this example:
1377 define(`macro', `di$1')
the input will produce the string @samp{divert}.  When the quotes are
removed, the @code{divert} builtin is called instead.
1389 @node Macro Arguments
1390 @section Macro arguments
1392 @cindex macros, arguments to
1393 @cindex arguments to macros
1394 When a name is seen, and it has a macro definition, it will be expanded
1397 If the name is followed by an opening parenthesis, the arguments will be
1398 collected before the macro is called. If too few arguments are
1399 supplied, the missing arguments are taken to be the empty string.
1400 However, some builtins are documented to behave differently for a
1401 missing optional argument than for an explicit empty string. If there
1402 are too many arguments, the excess arguments are ignored. Unquoted
1403 leading whitespace is stripped off all arguments, but whitespace
1404 generated by a macro expansion or occurring after a macro that expanded
1405 to an empty string remains intact. Whitespace includes space, tab,
1406 newline, carriage return, vertical tab, and formfeed.
1409 define(`macro', `$1')
1411 macro( unquoted leading space lost)
1412 @result{}unquoted leading space lost
1413 macro(` quoted leading space kept')
1414 @result{} quoted leading space kept
1416 divert `unquoted space kept after expansion')
1417 @result{} unquoted space kept after expansion
1419 ')`whitespace from expansion kept')
1421 @result{}whitespace from expansion kept
1422 macro(`unquoted trailing whitespace kept'
1424 @result{}unquoted trailing whitespace kept
1428 Normally @code{m4} will issue warnings if a builtin macro is called
1429 with an inappropriate number of arguments, but it can be suppressed with
1430 the @option{--quiet} command line option (or @option{--silent}, or
1431 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1432 defined macros, there is no check of the number of arguments given.
1437 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
1441 index(`abc', `b', `ignored')
1442 @error{}m4:stdin:3: Warning: index: extra arguments ignored: 3 > 2
1446 @comment options: -Q
1453 index(`abc', `b', `ignored')
Macros are expanded normally during argument collection, and any
commas, quotes and parentheses that show up in the resulting
expanded text will serve to define the arguments as well.  Thus, if
1460 @var{foo} expands to @samp{, b, c}, the macro call
1468 is a macro call with four arguments, which are @samp{a }, @samp{b},
1469 @samp{c} and @samp{d}. To understand why the first argument contains
1470 whitespace, remember that unquoted leading whitespace is never part
1471 of an argument, but trailing whitespace always is.
1473 It is possible for a macro's definition to change during argument
1474 collection, in which case the expansion uses the definition that was in
1475 effect at the time the opening @samp{(} was seen.
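
A minimal sketch of this behavior, using a hypothetical macro @code{f}
that is redefined while its own argument is being collected:

@example
define(`f', `1')
@result{}
f(define(`f', `2'))
@result{}1
f
@result{}2
@end example

The first call still uses the old definition, because the opening
parenthesis was seen before the redefinition took place; the second call
picks up the new one.
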
1486 It is an error if the end of file occurs while collecting arguments.
1491 @result{}hello world
1494 @error{}m4:stdin:2: end of file in argument list
1497 @node Quoting Arguments
1498 @section On Quoting Arguments to macros
1500 @cindex quoted macro arguments
1501 @cindex macros, quoted arguments to
1502 @cindex arguments, quoted macro
1503 Each argument has unquoted leading whitespace removed. Within each
1504 argument, all unquoted parentheses must match. For example, if
1505 @var{foo} is a macro,
1513 is a macro call, with one argument, whose value is @samp{() (() (}.
1514 Commas separate arguments, except when they occur inside quotes,
1515 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1518 It is common practice to quote all arguments to macros, unless you are
1519 sure you want the arguments expanded. Thus, in the above
1520 example with the parentheses, the `right' way to do it is like this:
It is, however, in certain cases necessary or convenient to leave out
quotes for some arguments, and there is nothing wrong with doing so.  It
just makes life a bit harder if you are not careful.  For consistency,
1530 this manual follows the rule of thumb that each layer of parentheses
1531 introduces another layer of single quoting, except when showing the
1532 consequences of quoting rules. This is done even when the quoted string
1533 cannot be a macro, such as with integers when you have not changed the
1534 syntax via @code{changesyntax} (@pxref{Changesyntax}).
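
As a rough illustration of why quoting matters (both @code{nargs} and
@code{list} are hypothetical macros defined only for this sketch), an
unquoted argument is expanded during collection, so any commas in its
expansion separate arguments:

@example
define(`nargs', `$#')
@result{}
define(`list', `a, b')
@result{}
nargs(list)
@result{}2
nargs(`list')
@result{}1
@end example

With the quotes in place, @code{nargs} receives the single argument
@samp{list}, regardless of how @code{list} happens to be defined.
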
1536 @node Macro expansion
1537 @section Macro expansion
1539 @cindex macros, expansion of
1540 @cindex expansion of macros
1541 When the arguments, if any, to a macro call have been collected, the
1542 macro is expanded, and the expansion text is pushed back onto the input
1543 (unquoted), and reread. The expansion text from one macro call might
1544 therefore result in more macros being called, if the calls are included,
completely or partially, in the first macro call's expansion.
1547 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1548 @var{bar} expands to @samp{Hello world}, the input
1550 @comment options: -Dbar='Hello world' -Dfoo=bar
1552 $ @kbd{m4 -Dbar="Hello world" -Dfoo=bar}
1554 @result{}Hello world
1558 will expand first to @samp{bar}, and when this is reread and
1559 expanded, into @samp{Hello world}.
1562 @chapter How to define new macros
1564 @cindex macros, how to define new
1565 @cindex defining new macros
1566 Macros can be defined, redefined and deleted in several different ways.
1567 Also, it is possible to redefine a macro without losing a previous
1568 value, and bring back the original value at a later time.
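
For instance, the following sketch, which assumes the @code{pushdef} and
@code{popdef} builtins described later in this chapter
(@pxref{Pushdef}), temporarily replaces a definition of the hypothetical
macro @code{foo} and then restores it:

@example
define(`foo', `original')
@result{}
pushdef(`foo', `temporary')
@result{}
foo
@result{}temporary
popdef(`foo')
@result{}
foo
@result{}original
@end example
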
1571 * Define:: Defining a new macro
1572 * Arguments:: Arguments to macros
1573 * Pseudo Arguments:: Special arguments to macros
1574 * Undefine:: Deleting a macro
1575 * Defn:: Renaming macros
1576 * Pushdef:: Temporarily redefining macros
1577 * Renamesyms:: Renaming macros with regular expressions
1579 * Indir:: Indirect call of macros
1580 * Builtin:: Indirect call of builtins
1581 * M4symbols:: Getting the defined macro names
1585 @section Defining a macro
1587 The normal way to define or redefine macros is to use the builtin
1590 @deffn {Builtin (m4)} define (@var{name}, @ovar{expansion})
1591 Defines @var{name} to expand to @var{expansion}. If
1592 @var{expansion} is not given, it is taken to be empty.
1594 The expansion of @code{define} is void.
1595 The macro @code{define} is recognized only with parameters.
1598 The following example defines the macro @var{foo} to expand to the text
@samp{Hello world.}.
1602 define(`foo', `Hello world.')
1605 @result{}Hello world.
1608 The empty line in the output is there because the newline is not
1609 a part of the macro definition, and it is consequently copied to
1610 the output. This can be avoided by use of the macro @code{dnl}.
1611 @xref{Dnl}, for details.
1613 The first argument to @code{define} should be quoted; otherwise, if the
1614 macro is already defined, you will be defining a different macro. This
1615 example shows the problems with underquoting, since we did not want to
1616 redefine @code{one}:
1627 @cindex @acronym{GNU} extensions
1628 @acronym{GNU} @code{m4} normally replaces only the @emph{topmost}
1629 definition of a macro if it has several definitions from @code{pushdef}
1630 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1631 definitions of a macro with @code{define}. @xref{Incompatibilities},
1633 @comment FIXME - See Austin group XCU ERN 118; this is considered
1634 @comment ambiguous in the current version of POSIX. The best thing to
1635 @comment do here would probably be keep GNU semantics of popdef/pushdef
1636 @comment in the m4 module unconditionally, then have a shadow builtin in
1637 @comment the traditional module that does the undefine/pushdef
1638 @comment semantics, rather than our current keying off of
1639 @comment POSIXLY_CORRECT within the m4 module.
1641 As a @acronym{GNU} extension, the first argument to @code{define} does
1642 not have to be a simple word.
1643 It can be any text string, even the empty string. A macro with a
1644 non-standard name cannot be invoked in the normal way, as the name is
not recognized.  It can only be referenced by the builtins @code{indir}
(@pxref{Indir}) and @code{defn} (@pxref{Defn}).
1649 Arrays and associative arrays can be simulated by using this trick.
1652 define(`array', `defn(format(``array[%d]'', `$1'))')
1654 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
1656 array_set(`4', `array element no. 4')
1658 array_set(`17', `array element no. 17')
1661 @result{}array element no. 4
1662 array(eval(`10 + 7'))
1663 @result{}array element no. 17
1666 Change the @code{%d} to @code{%s} and it is an associative array.
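
A sketch of that variation follows; the macro names @code{dict} and
@code{dict_set} are hypothetical, chosen only for this illustration:

@example
define(`dict', `defn(format(``dict[%s]'', `$1'))')
@result{}
define(`dict_set', `define(format(``dict[%s]'', `$1'), `$2')')
@result{}
dict_set(`color', `blue')
@result{}
dict(`color')
@result{}blue
@end example

Each key is stored as a separate macro whose name, such as
@samp{dict[color]}, cannot be invoked directly, which is why the lookup
goes through @code{defn}.
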
1669 @section Arguments to macros
1671 @cindex macros, arguments to
@cindex arguments to macros
1673 Macros can have arguments. The @var{n}th argument is denoted by
@code{$@var{n}} in the expansion text, and is replaced by the @var{n}th actual
1675 argument, when the macro is expanded. Replacement of arguments happens
1676 before rescanning, regardless of how many nesting levels of quoting
1677 appear in the expansion. Here is an example of a macro with
1678 two arguments. It simply exchanges the order of the two arguments.
1681 define(`exch', `$2, $1')
1683 exch(`arg1', `arg2')
1687 This can be used, for example, if you like the arguments to
1688 @code{define} to be reversed.
1691 define(`exch', `$2, $1')
1693 define(exch(``expansion text'', ``macro''))
1696 @result{}expansion text
1699 @xref{Quoting Arguments}, for an explanation of the double quotes.
(You should try to improve this example so that clients of @code{exch}
1701 do not have to double quote; or @pxref{Improved exch, , Answers}).
1703 @cindex @acronym{GNU} extensions
1704 @acronym{GNU} @code{m4} allows the number following the @samp{$} to
1706 or more digits, allowing macros to have any number of arguments. This
1707 is not so in UNIX implementations of @code{m4}, which only recognize
1709 @comment FIXME - See Austin group XCU ERN 111. POSIX says that $11 must
1710 @comment be the first argument concatenated with 1, and instead reserves
1711 @comment ${11} for implementation use. Once this is implemented, the
1712 @comment documentation needs to reflect how these extended arguments
1713 @comment are handled, as well as backwards compatibility issues with
1714 @comment 1.4.x. Also, consider adding further extensions such as
1715 @comment ${1-default}, which expands to `default' if $1 is empty.
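
As a hedged sketch, assuming @acronym{GNU} extensions are enabled (that
is, neither @option{--traditional} nor @env{POSIXLY_CORRECT} is in
effect) and that the current interpretation of multi-digit argument
references applies, a hypothetical macro @code{tenth} can refer to its
tenth argument directly:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example

An implementation that recognizes only one digit would instead treat
@samp{$10} as the first argument followed by a literal @samp{0}, so
portable input should not rely on this.
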
1717 As a special case, the zeroth argument, @code{$0}, is always the name
1718 of the macro being expanded.
1721 define(`test', ``Macro name: $0'')
1724 @result{}Macro name: test
1727 If you want quoted text to appear as part of the expansion text,
1728 remember that quotes can be nested in quoted strings. Thus, in
1731 define(`foo', `This is macro `foo'.')
1734 @result{}This is macro foo.
1738 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
1739 a quoted string, and not a name.
1741 @node Pseudo Arguments
1742 @section Special arguments to macros
1744 @cindex special arguments to macros
1745 @cindex macros, special arguments to
1746 @cindex arguments to macros, special
1747 There is a special notation for the number of actual arguments supplied,
1748 and for all the actual arguments.
The number of actual arguments in a macro call is denoted by @code{$#}
in the expansion text. Thus, a macro to display the number of arguments
given can be:
@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`arg1', `arg2', `arg3')
@result{}3
nargs(`commas can be quoted, like this')
@result{}1
nargs(arg1#inside comments, commas do not separate arguments
still arg1)
@result{}1
nargs((unquoted parentheses, like this, group arguments))
@result{}1
@end example

1772 The notation @code{$*} can be used in the expansion text to denote all
1773 the actual arguments, unquoted, with commas in between. For example
@example
define(`echo', `$*')
@result{}
echo(arg1,    arg2, arg3 , arg4)
@result{}arg1,arg2,arg3 ,arg4
@end example

1782 Often each argument should be quoted, and the notation @code{$@@} handles
1783 that. It is just like @code{$*}, except that it quotes each argument.
1784 A simple example of that is:
@example
define(`echo', `$@@')
@result{}
echo(arg1,    arg2, arg3 , arg4)
@result{}arg1,arg2,arg3 ,arg4
@end example

Where did the quotes go? They were stripped when the expanded text was
reread by @code{m4}. To show the difference, try:
@example
define(`echo1', `$*')
@result{}
define(`echo2', `$@@')
@result{}
define(`foo', `This is macro `foo'.')
@result{}
echo1(foo)
@result{}This is macro This is macro foo..
echo1(`foo')
@result{}This is macro foo.
echo2(foo)
@result{}This is macro foo.
echo2(`foo')
@result{}foo
@end example

1814 @xref{Trace}, if you do not understand this. As another example of the
1815 difference, remember that comments encountered in arguments are passed
1816 untouched to the macro, and that quoting disables comments.
1819 define(`echo1', `$*')
1821 define(`echo2', `$@@')
1823 define(`foo', `bar')
A @samp{$} sign in the expansion text that is not followed by anything
@code{m4} understands is simply copied to the macro expansion, like any
other text.
@example
define(`foo', `$$$ hello $$$')
@result{}
foo
@result{}$$$ hello $$$
@end example

1846 If you want a macro to expand to something like @samp{$12}, the
1847 judicious use of nested quoting can put a safe character between the
1848 @code{$} and the next character, relying on the rescanning to remove the
1849 nested quote. This will prevent @code{m4} from interpreting the
1850 @code{$} sign as a reference to an argument.
@example
define(`foo', `no nested quote: $1')
@result{}
foo(`arg')
@result{}no nested quote: arg
define(`foo', `nested quote around $: `$'1')
@result{}
foo(`arg')
@result{}nested quote around $: $1
define(`foo', `nested empty quote after $: $`'1')
@result{}
foo(`arg')
@result{}nested empty quote after $: $1
define(`foo', `nested quote around next character: $`1'')
@result{}
foo(`arg')
@result{}nested quote around next character: $1
define(`foo', `nested quote around both: `$1'')
@result{}
foo(`arg')
@result{}nested quote around both: arg
@end example

1876 @section Deleting a macro
1878 @cindex macros, how to delete
1879 @cindex deleting macros
1880 @cindex undefining macros
1881 A macro definition can be removed with @code{undefine}:
1883 @deffn {Builtin (m4)} undefine (@var{name}@dots{})
1884 For each argument, remove the macro @var{name}. The macro names must
1885 necessarily be quoted, since they will be expanded otherwise.
The expansion of @code{undefine} is void.
The macro @code{undefine} is recognized only with parameters.
@end deffn

@example
foo bar blah
@result{}foo bar blah
define(`foo', `some')define(`bar', `other')define(`blah', `text')
@result{}
foo bar blah
@result{}some other text
undefine(`foo')
@result{}
foo bar blah
@result{}foo other text
undefine(`bar', `blah')
@result{}
foo bar blah
@result{}foo bar blah
@end example

1908 Undefining a macro inside that macro's expansion is safe; the macro
1909 still expands to the definition that was in effect at the @samp{(}.
@example
define(`f', ``$0':$1')
@result{}
f(f(f(undefine(`f')`hello world')))
@result{}f:f:f:hello world
f(`bye')
@result{}f(bye)
@end example

1920 It is not an error for @var{name} to have no macro definition. In that
1921 case, @code{undefine} does nothing.
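
As a small sketch of this behavior (the name @code{no_such_macro} is
purely illustrative and is assumed to be undefined), the call produces
no output, and the name is still undefined afterwards:

@example
undefine(`no_such_macro')
@result{}
ifdef(`no_such_macro', `defined', `not defined')
@result{}not defined
@end example
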
1924 @section Renaming macros
1926 @cindex macros, how to rename
1927 @cindex renaming macros
1928 It is possible to rename an already defined macro. To do this, you need
1929 the builtin @code{defn}:
1931 @deffn {Builtin (m4)} defn (@var{name}@dots{})
1932 Expands to the @emph{quoted definition} of each @var{name}. If an
1933 argument is not a defined macro, the expansion for that argument is
1934 empty and triggers a warning.
1936 If @var{name} is a user-defined macro, the quoted definition is simply
1937 the quoted expansion text. If, instead, @var{name} is a builtin, the
1938 expansion is a special token, which points to the builtin's internal
1939 definition. This token is only meaningful as the second argument to
1940 @code{define} (and @code{pushdef}), and is silently converted to an
1941 empty string in most other contexts.
1942 @comment FIXME - Other implementations, such as Solaris, can pass a
1943 @comment builtin token around to other macros, flattening it only on output:
1944 @comment define(foo, a`'defn(`divnum')b)
1945 @comment len(foo) => 3
1946 @comment index(foo, defn(`divnum') => 1
1948 @comment It may be worth making some changes to support this behavior.
1950 The macro @code{defn} is recognized only with parameters.
1953 Its normal use is best understood through an example, which shows how to
1954 rename @code{undefine} to @code{zap}:
@example
define(`zap', defn(`undefine'))
@result{}
zap(`undefine')
@result{}
undefine(`zap')
@result{}undefine(zap)
@end example

1965 In this way, @code{defn} can be used to copy macro definitions, and also
1966 definitions of builtin macros. Even if the original macro is removed,
1967 the other name can still be used to access the definition.
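
As a minimal sketch of that last point (the names @code{greet} and
@code{salute} are hypothetical, chosen only for illustration), a copy
made with @code{defn} keeps working after the original name has been
undefined:

@example
define(`greet', `Hello, $1!')
@result{}
define(`salute', defn(`greet'))
@result{}
undefine(`greet')
@result{}
salute(`world')
@result{}Hello, world!
greet(`world')
@result{}greet(world)
@end example
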
The fact that macro definitions can be transferred also explains why you
should use @code{$0}, rather than retyping a macro's name in its
definition:
@example
define(`foo', `This is `$0'')
@result{}
define(`bar', defn(`foo'))
@result{}
bar
@result{}This is bar
@end example

1982 Macros used as string variables should be referred through @code{defn},
1983 to avoid unwanted expansion of the text:
@example
define(`string', `The macro dnl is very useful
')
@result{}
string
@result{}The macro@w{ }
defn(`string')
@result{}The macro dnl is very useful
@result{}
@end example

1996 However, it is important to remember that @code{m4} rescanning is purely
1997 textual. If an unbalanced end-quote string occurs in a macro
1998 definition, the rescan will see that embedded quote as the termination
1999 of the quoted string, and the remainder of the macro's definition will
2000 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2001 end-quotes in macro definitions or arguments to macros.
2008 define(`echo', `$@@')
2018 Using @code{defn} to generate special tokens for builtin macros outside
2019 of expected contexts can sometimes trigger warnings. But most of the
2020 time, such tokens are silently converted to the empty string.
2025 define(defn(`divnum'), `cannot redefine a builtin token')
2026 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2032 Since @code{defn} can take more than one argument, it can be used to
2033 concatenate multiple macros into one.
2034 @comment FIXME - we don't yet handle mixing text and builtins. This
2035 @comment example passes under Solaris (minus the warning).
2040 @error{}m4:stdin:1: Warning: defn: undefined macro `foo'
2044 define(`bar', defn(`foo', `divnum'))
2046 define(`blah', defn(`divnum', `foo'))
2055 @section Temporarily redefining macros
2057 @cindex macros, temporary redefinition of
2058 @cindex temporary redefinition of macros
2059 @cindex redefinition of macros, temporary
2060 It is possible to redefine a macro temporarily, reverting to the
2061 previous definition at a later time. This is done with the builtins
2062 @code{pushdef} and @code{popdef}:
2064 @deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion})
2065 @deffnx {Builtin (m4)} popdef (@var{name}@dots{})
2066 Analogous to @code{define} and @code{undefine}.
2068 These macros work in a stack-like fashion. A macro is temporarily
2069 redefined with @code{pushdef}, which replaces an existing definition of
2070 @var{name}, while saving the previous definition, before the new one is
2071 installed. If there is no previous definition, @code{pushdef} behaves
2072 exactly like @code{define}.
2074 If a macro has several definitions (of which only one is accessible),
2075 the topmost definition can be removed with @code{popdef}. If there is
2076 no previous definition, @code{popdef} behaves like @code{undefine}.
2078 The expansion of both @code{pushdef} and @code{popdef} is void.
The macros @code{pushdef} and @code{popdef} are recognized only with
parameters.
@end deffn

@example
define(`foo', `Expansion one.')
@result{}
foo
@result{}Expansion one.
pushdef(`foo', `Expansion two.')
@result{}
foo
@result{}Expansion two.
pushdef(`foo', `Expansion three.')
@result{}
pushdef(`foo', `Expansion four.')
@result{}
popdef(`foo')
@result{}
foo
@result{}Expansion three.
popdef(`foo', `foo')
@result{}
foo
@result{}Expansion one.
popdef(`foo')
@result{}
foo
@result{}foo
@end example

2110 If a macro with several definitions is redefined with @code{define}, the
2111 topmost definition is @emph{replaced} with the new definition. If it is
2112 removed with @code{undefine}, @emph{all} the definitions are removed,
2113 and not only the topmost one.
@example
define(`foo', `Expansion one.')
@result{}
foo
@result{}Expansion one.
pushdef(`foo', `Expansion two.')
@result{}
foo
@result{}Expansion two.
define(`foo', `Second expansion two.')
@result{}
foo
@result{}Second expansion two.
undefine(`foo')
@result{}
foo
@result{}foo
@end example

2134 @cindex local variables
2135 @cindex variables, local
2136 Local variables within macros are made with @code{pushdef} and
2137 @code{popdef}. At the start of the macro a new definition is pushed,
2138 within the macro it is manipulated and at the end it is popped,
2139 revealing the former definition.
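
The pattern might look like the following sketch, in which @code{tmp}
and @code{show} are hypothetical names used only for illustration. The
macro @code{show} pushes a local definition of @code{tmp}, uses it, and
pops it again, so the outer definition is untouched:

@example
define(`tmp', `global value')
@result{}
define(`show', `pushdef(`tmp', `local: $1')tmp`'popdef(`tmp')')
@result{}
show(`x')
@result{}local: x
tmp
@result{}global value
@end example
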
It is possible to temporarily redefine a builtin with @code{pushdef}
and @code{defn}.
2145 @section Renaming macros with regular expressions
2147 @cindex regular expressions
2148 @cindex macros, how to rename
2149 @cindex renaming macros
2150 @cindex @acronym{GNU} extensions
2151 Sometimes it is desirable to rename multiple symbols without having to
2152 use a long sequence of calls to @code{define}. The @code{renamesyms}
2153 builtin allows this:
@deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @
  @ovar{resyntax})
2157 Global renaming of macros is done by @code{renamesyms}, which selects
2158 all macros with names that match @var{regexp}, and renames each match
2159 according to @var{replacement}. It is unspecified what happens if the
2160 rename causes multiple macros to map to the same name.
2161 @comment FIXME - right now, collisions cause a core dump on some platforms:
2162 @comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa')
2164 If @var{resyntax} is given, the particular flavor of regular
2165 expression understood with respect to @var{regexp} can be changed from
2166 the current default. @xref{Changeresyntax}, for details of the values
2167 that can be given for this argument.
2169 A macro that does not have a name that matches @var{regexp} is left
2170 with its original name. If only part of the name matches, any part of
2171 the name that is not covered by @var{regexp} is copied to the
2172 replacement name. Whenever a match is found in the name, the search
2173 proceeds from the end of the match, so no character in the original
2174 name can be substituted twice. If @var{regexp} matches a string of
2175 zero length, the start position for the continued search is
2176 incremented to avoid infinite loops.
2178 Where a replacement is to be made, @var{replacement} replaces the
2179 matched text in the original name, with @samp{\@var{n}} substituted by
2180 the text matched by the @var{n}th parenthesized sub-expression of
@var{regexp}, and @samp{\&} being the text matched by the entire
regular expression.
2184 The expansion of @code{renamesyms} is void.
2185 The macro @code{renamesyms} is recognized only with parameters.
2186 This macro was added in M4 2.0.
2189 Here is an example that performs the same renaming as the
2190 @option{--prefix-builtins} option (or @option{-P}). Where
2191 @option{--prefix-builtins} only renames M4 builtin macros,
2192 @code{renamesyms} will rename any macros that match when it runs,
2193 including text macros.
2196 renamesyms(`^.*$', `m4_\&')
2200 If @var{resyntax} is given, @var{regexp} must be given according to
2201 the syntax chosen, though the default regular expression syntax
2202 remains unchanged for other invocations. Here is a more realistic
2203 example that performs a similar renaming on macros, except that it
2204 ignores macros with names that begin with @samp{_}, and avoids creating
2205 macros with names that begin with @samp{m4_m4}.
2208 renamesyms(`^[^_]\w*$', `m4_\&')
2210 m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED')
2214 When a symbol has multiple definitions, thanks to @code{pushdef}, the
2215 entire stack is renamed.
2218 pushdef(`foo', `1')pushdef(`foo', `2')
2220 renamesyms(`^foo$', `bar')
2231 @section Indirect call of macros
2233 @cindex indirect call of macros
2234 @cindex call of macros, indirect
2235 @cindex macros, indirect call of
2236 @cindex @acronym{GNU} extensions
2237 Any macro can be called indirectly with @code{indir}:
2239 @deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}})
2240 Results in a call to the macro @var{name}, which is passed the
2241 rest of the arguments @var{args}. If @var{name} is not defined, an
2242 error message is printed, and the expansion is void.
2244 The macro @code{indir} is recognized only with parameters.
2247 This can be used to call macros with computed or ``invalid''
2248 names (@code{define} allows such names to be defined):
@example
define(`$$internal$macro', `Internal macro (name `$0')')
@result{}
$$internal$macro
@result{}$$internal$macro
indir(`$$internal$macro')
@result{}Internal macro (name $$internal$macro)
@end example

The point here is that larger macro packages can define private macros
that will not be called by accident. They can @emph{only} be
2261 called through the builtin @code{indir}.
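
Computed names can be sketched as follows. The macros
@code{handle_add}, @code{handle_del} and @code{dispatch} are
hypothetical names invented for this illustration; @code{dispatch}
builds the name of the macro to call from its first argument and hands
it to @code{indir}:

@example
define(`handle_add', `adding $1')
@result{}
define(`handle_del', `deleting $1')
@result{}
define(`dispatch', `indir(`handle_$1', `$2')')
@result{}
dispatch(`add', `x')
@result{}adding x
dispatch(`del', `y')
@result{}deleting y
@end example
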
2263 One other point to observe is that argument collection occurs before
2264 @code{indir} invokes @var{name}, so if argument collection changes the
2265 value of @var{name}, that will be reflected in the final expansion.
This is different from the behavior when invoking macros directly,
where the definition that was in effect before argument collection is
used.
2276 indir(`f', define(`f', `3'))
2278 indir(`f', undefine(`f'))
2279 @error{}m4:stdin:4: Warning: indir: undefined macro `f'
2283 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2284 arguments, @code{indir} defers to the invoked @var{name} for whether a
token representing a builtin is recognized or flattened to the empty
string.
2290 indir(defn(`defn'), `divnum')
2291 @error{}m4:stdin:1: Warning: indir: invalid macro name ignored
2293 indir(`define', defn(`defn'), `divnum')
2294 @error{}m4:stdin:2: Warning: define: invalid macro name ignored
2296 indir(`define', `foo', defn(`divnum'))
2300 indir(`divert', defn(`foo'))
2301 @error{}m4:stdin:5: Warning: divert: empty string treated as 0
2306 @section Indirect call of builtins
2308 @cindex indirect call of builtins
2309 @cindex call of builtins, indirect
2310 @cindex builtins, indirect call of
2311 @cindex @acronym{GNU} extensions
2312 Builtin macros can be called indirectly with @code{builtin}:
2314 @deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}})
2315 @deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1})
2316 Results in a call to the builtin @var{name}, which is passed the
2317 rest of the arguments @var{args}. If @var{name} does not name a
2318 builtin, a warning message is printed, and the expansion is void.
2320 As a special case, if @var{name} is exactly the special token
2321 representing the @code{builtin} macro, as obtained by @code{defn}
2322 (@pxref{Defn}), then @var{args} must consist of a single @var{name1},
2323 and the expansion is the special token representing the builtin macro
2324 named by @var{name1}.
2326 The macro @code{builtin} is recognized only with parameters.
2329 This can be used even if @var{name} has been given another definition
2330 that has covered the original, or been undefined so that no macro
2331 maps to the builtin.
2334 pushdef(`define', `hidden')
2336 undefine(`undefine')
2338 define(`foo', `bar')
2342 builtin(`define', `foo', defn(`divnum'))
2346 builtin(`define', `foo', `BAR')
2351 @result{}undefine(foo)
2354 builtin(`undefine', `foo')
2360 The @var{name} argument only matches the original name of the builtin,
2361 even when the @option{--prefix-builtins} option (or @option{-P},
2362 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2363 from @code{indir}, which only tracks current macro names.
2365 @comment options: -P
2368 m4_builtin(`divnum')
2370 m4_builtin(`m4_divnum')
2371 @error{}m4:stdin:2: Warning: m4_builtin: undefined builtin `m4_divnum'
2374 @error{}m4:stdin:3: Warning: m4_indir: undefined macro `divnum'
2376 m4_indir(`m4_divnum')
2380 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2381 without arguments, even when they normally require parameters to be
2382 recognized; but it will provoke a warning, and the expansion will behave
2383 as though empty strings had been passed as the required arguments.
2389 @error{}m4:stdin:2: Warning: builtin: undefined builtin `'
2392 @error{}m4:stdin:3: Warning: builtin: too few arguments: 0 < 1
2395 @error{}m4:stdin:4: Warning: builtin: undefined builtin `'
2398 @error{}m4:stdin:5: Warning: index: too few arguments: 0 < 2
2402 Normally, once a builtin macro is undefined, the only way to retrieve
2403 its functionality is by defining a new macro that expands to
2404 @code{builtin} under the hood. But this extra layer of expansion is
2405 slightly inefficient, not to mention the fact that it is not robust to
2406 changes in the current quoting scheme due to @code{changequote}
2407 (@pxref{Changequote}). On the other hand, defining a macro to the
2408 special token produced by @code{defn} (@pxref{Defn}) is very efficient,
2409 and avoids the need for quoting within the macro definition; but
2410 @code{defn} only works if the desired macro is already defined by some
2411 other name. So @code{builtin} provides a special case where it is
2412 possible to retrieve the same special token representing a builtin as
2413 what @code{defn} would provide, were the desired macro still defined.
2414 This feature is activated by passing @code{defn(`builtin')} as the first
2415 argument to builtin. Normally, passing a special token representing a
2416 macro as @var{name} results in a warning and an empty expansion, but in
2417 this case, if the second argument @var{name1} names a valid builtin,
2418 there is no warning and the expansion is the appropriate special
2419 token. In fact, with just the @code{builtin} macro accessible, it is
2420 possible to reconstitute the entire startup state of @code{m4}.
2422 In the example below, compare the number of macro invocations performed
by @code{defn1} and @code{defn2}, and the differences once quoting is
changed.
2430 define(`foo', `bar')
2432 define(`defn1', `builtin(`defn', $@@)')
2434 define(`defn2', builtin(builtin(`defn', `builtin'), `defn'))
2436 dumpdef(`defn1', `defn2')
2437 @error{}defn1:@tabchar{}`builtin(`defn', $@@)'
2438 @error{}defn2:@tabchar{}<defn>
2443 @error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')'
2444 @error{}m4trace: -1- builtin(`defn', `foo') -> ``bar''
2447 @error{}m4trace: -1- defn2(`foo') -> ``bar''
2450 @error{}m4trace: -1- traceoff -> `'
2452 changequote(`[', `]')
2455 @error{}m4:stdin:11: Warning: builtin: undefined builtin ``defn''
2462 @section Getting the defined macro names
2465 @cindex @acronym{GNU} extensions
The names of the currently defined macros can be accessed by
@code{m4symbols}:
2469 @deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}})
2470 Without arguments, @code{m4symbols} expands to a sorted list of quoted
2471 strings, separated by commas. This contrasts with @code{dumpdef}
(@pxref{Dumpdef}), whose output cannot be accessed by @code{m4}
programs.
2475 When given arguments, @code{m4symbols} returns the sorted subset of the
2476 @var{names} currently defined, and silently ignores the rest.
This macro was added in M4 2.0.
@end deffn

@example
m4symbols(`ifndef', `ifdef', `define', `undef')
@result{}define,ifdef
@end example

2486 @chapter Conditionals, loops, and recursion
2488 Macros, expanding to plain text, perhaps with arguments, are not quite
2489 enough. We would like to have macros expand to different things, based
2490 on decisions taken at run-time. For that, we need some kind of conditionals.
2491 Also, we would like to have some kind of loop construct, so we could do
2492 something a number of times, or while some condition is true.
2495 * Ifdef:: Testing if a macro is defined
2496 * Ifelse:: If-else construct, or multibranch
2497 * Shift:: Recursion in @code{m4}
2498 * Forloop:: Iteration by counting
2499 * Foreach:: Iteration by list contents
2503 @section Testing if a macro is defined
2505 @cindex conditionals
There are two different builtin conditionals in @code{m4}. The first is
@code{ifdef}:
2509 @deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2})
2510 If @var{name} is defined as a macro, @code{ifdef} expands to
2511 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
omitted, it is taken to be the empty string (according to the normal
rules).
The macro @code{ifdef} is recognized only with parameters.
@end deffn

@example
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
@result{}foo is not defined
define(`foo', `')
@result{}
ifdef(`foo', ``foo' is defined', ``foo' is not defined')
@result{}foo is defined
ifdef(`no_such_macro', `yes', `no', `extra argument')
@error{}m4:stdin:4: Warning: ifdef: extra arguments ignored: 4 > 3
@result{}no
@end example

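
One use of @code{ifdef} is to supply a default only when a macro has not
already been defined. The following is a sketch under the assumption
that @code{level} is a hypothetical configuration macro; the first call
installs the default, while the second leaves the user's value alone:

@example
ifdef(`level', `', `define(`level', `1')')
@result{}
level
@result{}1
define(`level', `5')
@result{}
ifdef(`level', `', `define(`level', `1')')
@result{}
level
@result{}5
@end example
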
2531 @section If-else construct, or multibranch
2533 @cindex comparing strings
2534 The other conditional, @code{ifelse}, is much more powerful. It can be
2535 used as a way to introduce a long comment, as an if-else construct, or
2536 as a multibranch, depending on the number of arguments supplied:
2538 @deffn {Builtin (m4)} ifelse (@var{comment})
@deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
  @ovar{not-equal})
2541 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
2542 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{})
Used with only one argument, @code{ifelse} simply discards it and
produces no output.
2546 If called with three or four arguments, @code{ifelse} expands into
2547 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
2548 for character), otherwise it expands to @var{not-equal}. A final fifth
2549 argument is ignored, after triggering a warning.
If called with six or more arguments, and @var{string-1} and
@var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
otherwise the first three arguments are discarded and the processing
starts again.

The macro @code{ifelse} is recognized only with parameters.
@end deffn

2559 Using only one argument is a common @code{m4} idiom for introducing a
2560 block comment, as an alternative to repeatedly using @code{dnl}. This
2561 special usage is recognized by @acronym{GNU} @code{m4}, so that in this
2562 case, the warning about missing arguments is never triggered.
@example
ifelse(`some comments')
@result{}
ifelse(`foo', `bar')
@error{}m4:stdin:2: Warning: ifelse: too few arguments: 2 < 3
@result{}
@end example

2572 Using three or four arguments provides decision points.
@example
ifelse(`foo', `bar', `true')
@result{}
ifelse(`foo', `foo', `true')
@result{}true
define(`foo', `bar')
@result{}
ifelse(foo, `bar', `true', `false')
@result{}true
ifelse(foo, `foo', `true', `false')
@result{}false
@end example

2587 @cindex macro, blind
2589 Notice how the first argument was used unquoted; it is common to compare
2590 the expansion of a macro with a string. With this macro, you can now
2591 reproduce the behavior of blind builtins, where the macro is recognized
2592 only with arguments.
@example
define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
@result{}
foo
@result{}foo
foo()
@result{}arguments:1
foo(`a', `b', `c')
@result{}arguments:3
@end example

2605 @cindex multibranches
2606 However, @code{ifelse} can take more than four arguments. If given more
2607 than four arguments, @code{ifelse} works like a @code{case} or @code{switch}
2608 statement in traditional programming languages. If @var{string-1} and
2609 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
2610 the procedure is repeated with the first three arguments discarded. This
2611 calls for an example:
@example
ifelse(`foo', `bar', `third', `gnu', `gnats')
@error{}m4:stdin:1: Warning: ifelse: extra arguments ignored: 5 > 4
@result{}gnu
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
@result{}
ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
@result{}seventh
ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
@error{}m4:stdin:4: Warning: ifelse: extra arguments ignored: 8 > 7
@result{}7
@end example

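
A multibranch closer to everyday use might look like the sketch below.
The macro @code{kind} is a hypothetical name; it compares its argument
against several strings in turn and falls back to a final default when
none of them match:

@example
define(`kind', `ifelse(`$1', `c', `C source',
                       `$1', `h', `C header',
                       `$1', `m4', `m4 input',
                       `unknown')')
@result{}
kind(`h')
@result{}C header
kind(`txt')
@result{}unknown
@end example

Each failed comparison discards three arguments, so the final call is
left with four arguments and expands to the last one.
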
Naturally, the normal case will be slightly more advanced than these
examples. A common use of @code{ifelse} is in macros implementing loops
of various kinds.
2631 @section Recursion in @code{m4}
2633 @cindex recursive macros
2634 @cindex macros, recursive
2635 There is no direct support for loops in @code{m4}, but macros can be
2636 recursive. There is no limit on the number of recursion levels, other
2637 than those enforced by your hardware and operating system.
Loops can be programmed using recursion and the conditionals described
previously.
2643 There is a builtin macro, @code{shift}, which can, among other things,
2644 be used for iterating through the actual arguments to a macro:
2646 @deffn {Builtin (m4)} shift (@var{arg1}, @dots{})
2647 Takes any number of arguments, and expands to all its arguments except
2648 @var{arg1}, separated by commas, with each argument quoted.
The macro @code{shift} is recognized only with parameters.
@end deffn

@example
shift
@result{}shift
shift(`bar')
@result{}
shift(`foo', `bar', `baz')
@result{}bar,baz
@end example

2662 An example of the use of @code{shift} is this macro:
2664 @deffn Composite reverse (@dots{})
2665 Takes any number of arguments, and reverses their order.
2668 It is implemented as:
@example
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
                          `reverse(shift($@@)), `$1'')')
@result{}
reverse
@result{}
reverse(`foo')
@result{}foo
reverse(`foo', `bar', `gnats', `and gnus')
@result{}and gnus, gnats, bar, foo
@end example

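
The same combination of @code{shift}, @code{ifelse} and recursion can
pick out a single argument. As a further sketch, the hypothetical macro
@code{last} below drops one argument per recursion step until only the
final one remains:

@example
define(`last', `ifelse(`$#', `1', ``$1'', `$0(shift($@@))')')
@result{}
last(`foo', `bar', `baz')
@result{}baz
last(`only')
@result{}only
@end example
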
2682 While not a very interesting macro, it does show how simple loops can be
2683 made with @code{shift}, @code{ifelse} and recursion. It also shows
2684 that @code{shift} is usually used with @samp{$@@}. Sometimes, a
2685 recursive algorithm requires adding quotes to each element:
2687 @deffn Composite quote (@dots{})
2688 @deffnx Composite dquote (@dots{})
2689 @deffnx Composite dquote_elt (@dots{})
2690 Takes any number of arguments, and adds quoting. With @code{quote},
2691 only one level of quoting is added, effectively removing whitespace
2692 after commas and turning multiple arguments into a single string. With
2693 @code{dquote}, two levels of quoting are added, one around each element,
2694 and one around the list. And with @code{dquote_elt}, two levels of
2695 quoting are added around each element.
2698 An actual implementation of these three macros is distributed as
2699 @file{m4-@value{VERSION}/@/examples/@/quote.m4} in this package. First,
2700 let's examine their usage:
2704 $ @kbd{m4 -I examples}
2707 -quote-dquote-dquote_elt-
2709 -quote()-dquote()-dquote_elt()-
2711 -quote(`1')-dquote(`1')-dquote_elt(`1')-
2712 @result{}-1-`1'-`1'-
2713 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
2714 @result{}-1,2-`1',`2'-`1',`2'-
2715 define(`n', `$#')dnl
2716 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
2718 dquote(dquote_elt(`1', `2'))
2719 @result{}``1'',``2''
2720 dquote_elt(dquote(`1', `2'))
2724 The last two lines show that when given two arguments, @code{dquote}
2725 results in one string, while @code{dquote_elt} results in two. Now,
2726 examine the implementation. Note that @code{quote} and
2727 @code{dquote_elt} make decisions based on their number of arguments, so
2728 that when called without arguments, they result in nothing instead of a
2729 quoted empty string; this is so that it is possible to distinguish
between no arguments and an empty first argument. @code{dquote}, on the
other hand, results in a string no matter what, since it is still
possible to tell whether it was invoked without arguments based on the
resulting string.
2737 $ @kbd{m4 -I examples}
2738 undivert(`quote.m4')dnl
2739 @result{}divert(`-1')
2740 @result{}# quote(args) - convert args to single-quoted string
2741 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
2742 @result{}# dquote(args) - convert args to quoted list of quoted strings
2743 @result{}define(`dquote', ``$@@'')
2744 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
2745 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
2746 @result{} ```$1'',$0(shift($@@))')')
2747 @result{}divert`'dnl
2751 @section Iteration by counting
2754 @cindex loops, counting
2755 @cindex counting loops
2756 Here is an example of a loop macro that implements a simple for loop.
2758 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each integer value from @var{start} to @var{end},
inclusive. For each assignment to @var{iterator}, appends @var{text} to
the expansion of the @code{forloop}. @var{text} may refer to
2763 @var{iterator}. Any definition of @var{iterator} prior to this
2764 invocation is restored.
2767 It can, for example, be used for simple counting:
2771 $ @kbd{m4 -I examples}
2772 include(`forloop.m4')
2774 forloop(`i', `1', `8', `i ')
2775 @result{}1 2 3 4 5 6 7 8@w{ }
2778 For-loops can be nested, like:
@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
')
@result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
@result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
@result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
@result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
@result{}
@end example

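
The restoration of any earlier definition of @var{iterator} can be
observed with a short sketch. It assumes the @file{forloop.m4} file from
the examples directory is on the include path, and uses @samp{outer} as
an arbitrary placeholder value:

@example
$ @kbd{m4 -I examples}
include(`forloop.m4')
@result{}
define(`i', `outer')
@result{}
forloop(`i', `1', `3', `i,')
@result{}1,2,3,
i
@result{}outer
@end example
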
2794 The implementation of the @code{forloop} macro is fairly
2795 straightforward. The @code{forloop} macro itself is simply a wrapper,
2796 which saves the previous definition of the first argument, calls the
2797 internal macro @code{@w{_forloop}}, and re-establishes the saved
2798 definition of the first argument.
2800 The macro @code{@w{_forloop}} expands the fourth argument once, and
2801 tests to see if the iterator has reached the final value. If it has
2802 not finished, it increments the iterator (using the predefined macro
2803 @code{incr}, @pxref{Incr}), and recurses.
2805 Here is an actual implementation of @code{forloop}, distributed as
2806 @file{m4-@value{VERSION}/@/examples/@/forloop.m4} in this package:
2810 $ @kbd{m4 -I examples}
2811 undivert(`forloop.m4')dnl
2812 @result{}divert(`-1')
2813 @result{}# forloop(var, from, to, stmt) - simple version
2814 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
2815 @result{}define(`_forloop',
2816 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
2817 @result{}divert`'dnl
2820 Notice the careful use of quotes. Certain macro arguments are left
2821 unquoted, each for its own reason. Try to find out @emph{why} these
2822 arguments are left unquoted, and see what happens if they are quoted.
(As presented, these two macros are useful but not very robust for
general use. They lack even basic error handling for cases like
@var{start} greater than @var{end}, @var{end} not numeric, or
2826 @var{iterator} not being a macro name. See if you can improve these
2827 macros; or @pxref{Improved forloop, , Answers}).
2830 @section Iteration by list contents
2832 @cindex for each loops
2833 @cindex loops, list iteration
2834 @cindex iterating over lists
2835 Here is an example of a loop macro that implements list iteration.
2837 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
2838 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
Takes the name in @var{iterator}, which must be a valid macro name, and
successively assigns it each value from @var{paren-list} or
@var{quote-list}. In @code{foreach}, @var{paren-list} is a
comma-separated list of elements contained in parentheses. In
@code{foreachq}, @var{quote-list} is a comma-separated list of elements
contained in a quoted string. For each assignment to @var{iterator},
appends @var{text} to the overall expansion. @var{text} may refer to
2846 @var{iterator}. Any definition of @var{iterator} prior to this
2847 invocation is restored.
2850 As an example, this displays each word in a list inside of a sentence,
2851 using an implementation of @code{foreach} distributed as
2852 @file{m4-@value{VERSION}/@/examples/@/foreach.m4}, and @code{foreachq}
2853 in @file{m4-@value{VERSION}/@/examples/@/foreachq.m4}.
2857 $ @kbd{m4 -I examples}
2858 include(`foreach.m4')
2860 foreach(`x', (foo, bar, foobar), `Word was: x
2862 @result{}Word was: foo
2863 @result{}Word was: bar
2864 @result{}Word was: foobar
2865 include(`foreachq.m4')
2867 foreachq(`x', `foo, bar, foobar', `Word was: x
2869 @result{}Word was: foo
2870 @result{}Word was: bar
2871 @result{}Word was: foobar
2874 It is possible to be more complex; each element of the @var{paren-list}
2875 or @var{quote-list} can itself be a list, to pass as further arguments
2876 to a helper macro. This example generates a shell case statement:
2880 $ @kbd{m4 -I examples}
2881 include(`foreach.m4')
2883 define(`_case', ` $1)
2886 define(`_cat', `$1$2')dnl
2889 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
2890 `_cat(`_case', x)')dnl
2892 @result{} vara=" a";;
2894 @result{} varb=" b";;
2896 @result{} varc=" c";;
2901 The implementation of the @code{foreach} macro is a bit more involved;
2902 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
2903 needed to grab the first element of a list. Second,
2904 @code{@w{_foreach}} implements the recursion, successively walking
through the original list.  Here is a simple implementation of
@code{foreach}:
2910 $ @kbd{m4 -I examples}
2911 undivert(`foreach.m4')dnl
2912 @result{}divert(`-1')
2913 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
2914 @result{}# parenthesized list, simple version
2915 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
2916 @result{}define(`_arg1', `$1')
2917 @result{}define(`_foreach', `ifelse(`$2', `()', `',
2918 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
2919 @result{}divert`'dnl
2922 Unfortunately, that implementation is not robust to macro names as list
2923 elements. Each iteration of @code{@w{_foreach}} is stripping another
2924 layer of quotes, leading to erratic results if list elements are not
2925 already fully expanded. The first cut at implementing @code{foreachq}
2926 takes this into account. Also, when using quoted elements in a
2927 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
2928 has the nice property of requiring fewer characters to create a list
2929 containing the same quoted elements. To see the difference between the
2930 two macros, we attempt to pass double-quoted macro names in a list,
2931 expecting the macro name on output after one layer of quotes is removed
2932 during list iteration and the final layer removed during the final
2937 $ @kbd{m4 -I examples}
2938 define(`a', `1')define(`b', `2')define(`c', `3')
2940 include(`foreach.m4')
2942 include(`foreachq.m4')
2944 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
2951 foreachq(`x', ```a'', ``(b'', ``c)''', `x
2958 Obviously, @code{foreachq} did a better job; here is its implementation:
2962 $ @kbd{m4 -I examples}
2963 undivert(`foreachq.m4')dnl
2964 @result{}include(`quote.m4')dnl
2965 @result{}divert(`-1')
2966 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
2967 @result{}# quoted list, simple version
2968 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
2969 @result{}define(`_arg1', `$1')
2970 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
2971 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
2972 @result{}divert`'dnl
2975 Notice that @code{@w{_foreachq}} had to use the helper macro
2976 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
2977 embedded @code{ifelse} call does not go haywire if a list element
2978 contains a comma. Unfortunately, this implementation of @code{foreachq}
2979 has its own severe flaw. Whereas the @code{foreach} implementation was
2980 linear, this macro is quadratic in the number of list elements, and is
2981 much more likely to trip up the limit set by the command line option
2982 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
2983 Invoking m4}). (It is possible to have robust iteration with linear
2984 behavior for either list style. See if you can learn from the best
2985 elements of both of these implementations to create robust macros; or
2986 @pxref{Improved foreach, , Answers}).
2988 With a robust @code{foreach} implementation, it is possible to create a
2989 filter on a list of defined symbols. This next example will find all
2990 symbols that contain @samp{if}. Notice the use of @code{dquote} and
2991 @code{dquote_elt} to ensure that the list of macro names is properly
2992 quoted; without these, the iteration would be invoking various macros
2993 with catastrophic effects. This example also shows a trick for
2994 generating the correct number of commas in the resulting output.
2998 $ @kbd{m4 -I examples}
2999 include(`quote.m4')include(`foreachq.m4')
3001 pushdef(`sep', ``, '')
3003 pushdef(`cleanup', `popdef(`sep', `cleanup')')
3005 pushdef(`sep', `define(`cleanup',
3006 `popdef(`cleanup')')popdef(`sep')')
3008 foreachq(`macro', dquote(dquote_elt(m4symbols)),
3009 `regexp(macro, `.*if.*', `sep`\&'')')
3010 @result{}ifdef, ifelse, shift
3016 @chapter How to debug macros and input
Macros written for @code{m4} often do not work as intended on
the first try (as is the case with most programming languages).
3020 Fortunately, there is support for macro debugging in @code{m4}.
3023 * Dumpdef:: Displaying macro definitions
3024 * Trace:: Tracing macro calls
3025 * Debugmode:: Controlling debugging options
3026 * Debuglen:: Limiting debug output
3027 * Debugfile:: Saving debugging output
3031 @section Displaying macro definitions
3033 @cindex displaying macro definitions
3034 @cindex macros, displaying definitions
3035 @cindex definitions, displaying macro
If you want to see what a name expands into, you can use the builtin
@code{dumpdef}:
3039 @deffn {Builtin (m4)} dumpdef (@ovar{names@dots{}})
3040 Accepts any number of arguments. If called without any arguments,
3041 it displays the definitions of all known names, otherwise it displays
3042 the definitions of the @var{names} given. The output is printed
3043 directly to standard error, independently of the @option{--debugfile}
3044 option (@pxref{Debugging options, , Invoking m4}), or @code{debugfile} macro.
The output is sorted by name.  If an unknown name is encountered, a
warning is printed.
3048 The expansion of @code{dumpdef} is void.
3053 define(`foo', `Hello world.')
3056 @error{}foo:@tabchar{}`Hello world.'
3059 @error{}define:@tabchar{}<define>
The last example shows how builtin macro definitions are displayed.
3064 The definition that is dumped corresponds to what would occur if the
3065 macro were to be called at that point, even if other definitions are
3066 still live due to redefining a macro during argument collection.
3070 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
3072 f(popdef(`f')dumpdef(`f'))
3073 @error{}f:@tabchar{}``$0'1'
3075 f(popdef(`f')dumpdef(`f'))
3076 @error{}m4:stdin:3: Warning: dumpdef: undefined macro `f'
3080 @xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and
3081 @samp{s} flags affect the details of the display. Remember, the
3082 @samp{q} flag is implied when the @option{--debug} option (@option{-d},
3083 @pxref{Debugging options, , Invoking m4}) is used in the command line
without arguments.  Also, @option{--debuglen} (@pxref{Debuglen}) can affect
output by truncating longer strings.
3087 @comment options: -ds -l3
3090 pushdef(`foo', `1 long string')
3092 pushdef(`foo', defn(`divnum'))
3098 dumpdef(`foo', `dnl', `indir', `__gnu__')
3099 @error{}__gnu__:@tabchar{}@{gnu@}
3100 @error{}dnl:@tabchar{}<dnl>@{m4@}
3101 @error{}foo:@tabchar{}3, <div...>@{m4@}, 1 l...
3102 @error{}indir:@tabchar{}<ind...>@{gnu@}
3109 @section Tracing macro calls
3111 @cindex tracing macro expansion
3112 @cindex macro expansion, tracing
3113 @cindex expansion, tracing macro
3114 It is possible to trace macro calls and expansions through the builtins
3115 @code{traceon} and @code{traceoff}:
3117 @deffn {Builtin (m4)} traceon (@ovar{names@dots{}})
3118 @deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}})
3119 When called without any arguments, @code{traceon} and @code{traceoff}
3120 will turn tracing on and off, respectively, for all macros, identical to
3121 using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}).
3123 When called with arguments, only the macros listed in @var{names} are
3124 affected, whether or not they are currently defined. A macro's
3125 expansion will be traced if global tracing is on, or if the individual
3126 macro tracing flag is set; to avoid tracing a macro, both the global
3127 flag and the macro must have tracing off.
3129 The expansion of @code{traceon} and @code{traceoff} is void.
3132 Whenever a traced macro is called and the arguments have been collected,
3133 the call is displayed. If the expansion of the macro call is not void,
3134 the expansion can be displayed after the call. The output is printed
3135 to the current debug file (defaulting to standard error,
3140 define(`foo', `Hello World.')
3142 define(`echo', `$@@')
3144 traceon(`foo', `echo')
3147 @error{}m4trace: -1- foo -> `Hello World.'
3148 @result{}Hello World.
3149 echo(`gnus', `and gnats')
3150 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
3151 @result{}gnus,and gnats
3154 The number between dashes is the depth of the expansion. It is one most
3155 of the time, signifying an expansion at the outermost level, but it
3156 increases when macro arguments contain unquoted macro calls. The
3157 maximum number that will appear between dashes is controlled by the
3158 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
3159 , Invoking m4}). Additionally, the option @option{--trace} (or
3160 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
3163 @comment options: -d-V -L3 -tifelse
3166 $ @kbd{m4 -L 3 -t ifelse}
3168 @error{}m4trace: -1- ifelse
3170 ifelse(ifelse(ifelse(`three levels')))
3171 @error{}m4trace: -3- ifelse
3172 @error{}m4trace: -2- ifelse
3173 @error{}m4trace: -1- ifelse
3175 ifelse(ifelse(ifelse(ifelse(`four levels'))))
3176 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
3179 Tracing by name is an attribute that is preserved whether the macro is
3180 defined or not. This allows the selection of macros to trace before
3181 those macros are defined.
3191 define(`foo', `bar')
3194 @error{}m4trace: -1- foo -> `bar'
3198 ifdef(`foo', `yes', `no')
3201 @error{}m4:stdin:8: Warning: indir: undefined macro `foo'
3203 define(`foo', `blah')
3206 @error{}m4trace: -1- foo -> `blah'
Tracing even works on builtins.  However, @code{defn} (@pxref{Defn})
3211 does not transfer tracing status.
3215 traceon(`eval', `m4_divnum')
3217 define(`m4_eval', defn(`eval'))
3219 define(`m4_divnum', defn(`divnum'))
3222 @error{}m4trace: -1- eval(`0') -> `0'
3225 @error{}m4trace: -2- m4_divnum -> `0'
3229 As of @acronym{GNU} M4 2.0, named macro tracing is independent of global
3230 tracing status; calling @code{traceoff} without arguments turns off the
3231 global trace flag, but does not turn off tracing for macros where
3232 tracing was requested by name. Likewise, calling @code{traceon} without
3233 arguments will affect tracing of macros that are not defined yet. This
3234 behavior matches traditional implementations of @code{m4}.
3240 define(`foo', `bar')
3241 @error{}m4trace: -1- define(`foo', `bar') -> `'
3243 foo # traced, even though foo was not defined at traceon
3244 @error{}m4trace: -1- foo -> `bar'
3245 @result{}bar # traced, even though foo was not defined at traceon
3247 @error{}m4trace: -1- traceoff(`foo') -> `'
3249 foo # traced, since global tracing is still on
3250 @error{}m4trace: -1- foo -> `bar'
3251 @result{}bar # traced, since global tracing is still on
3253 @error{}m4trace: -1- traceon(`foo') -> `'
3256 @error{}m4trace: -1- traceoff -> `'
3258 foo # traced, since foo is now traced by name
3259 @error{}m4trace: -1- foo -> `bar'
3260 @result{}bar # traced, since foo is now traced by name
3264 @result{}bar # untraced
3267 However, @acronym{GNU} M4 1.4.7 and earlier had slightly different
3268 semantics, where @code{traceon} without arguments only affected symbols
3269 that were defined at that moment, and @code{traceoff} without arguments
3270 stopped all tracing, even when tracing was requested by macro name. The
3271 addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it
3272 possible to write a file that approximates the older semantics
3273 regardless of which version of @acronym{GNU} M4 is in use.
3275 @comment options: -d-V
3279 `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)',
3280 `builtin(`traceon', $@@)')')dnl
3281 define(`traceoff', `ifelse(`$#', `0',
3282 `builtin(`traceoff')builtin(`traceoff', m4symbols)',
3283 `builtin(`traceoff', $@@)')')')dnl
3286 traceon # called before b is defined, so b is not traced
3287 @result{} # called before b is defined, so b is not traced
3289 @error{}m4trace: -1- define
3292 @error{}m4trace: -1- a
3295 @error{}m4trace: -1- traceon
3296 @error{}m4trace: -1- ifelse
3297 @error{}m4trace: -1- builtin
3300 @error{}m4trace: -1- a
3301 @error{}m4trace: -1- b
3303 traceoff # stops tracing b, even though it was traced by name
3304 @error{}m4trace: -1- traceoff
3305 @error{}m4trace: -1- ifelse
3306 @error{}m4trace: -1- builtin
3307 @error{}m4trace: -2- m4symbols
3308 @error{}m4trace: -1- builtin
3309 @result{} # stops tracing b, even though it was traced by name
3314 @xref{Debugmode}, for information on controlling the details of the
3318 @section Controlling debugging options
3320 @cindex controlling debugging output
3321 @cindex debugging output, controlling
3322 The @option{--debug} option to @code{m4} (also spelled
3323 @option{--debugmode} or @option{-d}, @pxref{Debugging options, ,
Invoking m4}) controls the amount of detail presented in three
3325 categories of output. Trace output is requested by @code{traceon}
3326 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
3327 relation to a macro invocation. Debug output tracks useful events not
3328 associated with a macro invocation, and each line is prefixed by
3329 @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
3330 affected, with no prefix added to the output lines.
3332 The @var{flags} following the option can be one or more of the
3337 In trace output, show the actual arguments that were collected before
3338 invoking the macro. Arguments are subject to length truncation
3339 specified by @code{debuglen} (@pxref{Debuglen}).
3342 In trace output, show an additional line for each macro call, prior to
3343 the arguments being collected, that shows the definition of the macro
3344 that will be used for the expansion. The definition is subject to
3345 length truncation specified by @code{debuglen} (@pxref{Debuglen}).
3348 In trace output, show the expansion of each macro call. The expansion
3349 is subject to length truncation specified by @code{debuglen}
3353 In debug and trace output, include the name of the current input file in
3357 In debug output, print a message each time the current input file is
3361 In debug and trace output, include the current input line number in the
3365 In debug output, print a message each time a module is manipulated
3366 (@pxref{Modules}). In trace output when the @samp{c} flag is in effect,
3367 and in dumpdef output, follow builtin macros with their module name,
3368 surrounded by braces (@samp{@{@}}).
3371 In debug output, print a message when a named file is found through the
3372 path search mechanism (@pxref{Search Path}), giving the actual file name
3376 In trace and dumpdef output, quote actual arguments and macro expansions
3377 in the display with the current quotes. This is useful in connection
3378 with the @samp{a} and @samp{e} flags above.
3381 In dumpdef output, show the entire stack of definitions associated with
3382 a symbol via @code{pushdef}.
3385 In trace output, trace all macro calls made in this invocation of
3386 @code{m4}. This is equivalent to using @code{traceon} without
In trace output, add a unique ``macro call id'' to each line of the trace
3391 output. This is useful in connection with the @samp{c} flag above.
3394 A shorthand for all of the above flags.
3397 As special cases, if @var{flags} starts with a @samp{+}, the named flags
3398 are enabled without impacting other flags, and if it starts with a
3399 @samp{-}, the named flags are disabled without impacting other flags.
3400 Without either of these starting characters, @var{flags} simply replaces
3401 the previous setting.
3402 @comment FIXME - should we accept usage like debugmode(+fl-q)? Also,
3403 @comment should we add debugmode(?) which expands to the current
3404 @comment enabled flags, and debugmode(e?) which expands to e if e is
3405 @comment currently enabled?
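
As a sketch of these prefixes in use, the session below assumes that
@code{m4} was started with @option{-d} and no flags, so that the initial
setting is @samp{aeq}; the macro @code{foo} is only an illustration.
The first call to @code{debugmode} adds @samp{l} to the current flags,
the second removes @samp{e}, and the third replaces the whole setting
with @samp{l} alone.

@example
$ @kbd{m4 -d}
define(`foo', `bar')
@result{}
traceon(`foo')
@result{}
debugmode(`+l')
@result{}
foo
@error{}m4trace:4: -1- foo -> `bar'
@result{}bar
debugmode(`-e')
@result{}
foo
@error{}m4trace:6: -1- foo
@result{}bar
debugmode(`l')
@result{}
foo
@error{}m4trace:8: -1- foo
@result{}bar
@end example
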
3407 If no flags are specified with the @option{--debug} option, the default is
3408 @samp{aeq}. Many examples in this manual show their output using
3411 @cindex @acronym{GNU} extensions
3412 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
3413 the debugging output format:
3415 @deffn {Builtin (gnu)} debugmode (@ovar{flags})
3416 The argument @var{flags} should be a subset of the letters listed above.
3417 If no argument is present, all debugging flags are cleared
3418 (as if no @option{--debug} was given), and with an empty argument the flags
3419 are reset to the default of @samp{aeq}. If an unknown flag is
3420 encountered, an error is issued.
3422 The expansion of @code{debugmode} is void.
3425 @comment options: -d-V
3428 define(`foo', `FOO$1')
3430 traceon(`foo', `divnum')
3435 @error{}m4trace: -1- foo -> `FOO'
3440 @error{}m4trace: -1- foo
3445 @error{}m4trace:8: -1- id 8: foo ... = FOO$1
3446 @error{}m4trace:8: -2- id 9: divnum ... = <divnum>@{m4@}
3447 @error{}m4trace:8: -2- id 9: divnum
3448 @error{}m4trace:8: -1- id 8: foo
3455 @section Limiting debug output
3457 @cindex @acronym{GNU} extensions
3460 @cindex limiting trace output length
3461 @cindex trace output, limiting length
3462 @cindex dumpdef output, limiting length
3463 When debugging, sometimes it is desirable to reduce the clutter of
3464 arbitrary-length strings, because the prefix carries enough information
3465 to understand the issues. The builtin macro @code{debuglen}, along with
3466 the command line option counterpart @option{--debuglen} (or @option{-l},
@pxref{Debugging options, , Invoking m4}), allows on-the-fly control of
3468 debugging string lengths:
3470 @deffn {Builtin (gnu)} debuglen (@var{len})
3471 The argument @var{len} is an integer that controls how much of
3472 arbitrary-length strings should be output during trace and dumpdef
output.  If set to a nonzero value, then strings longer than that
3474 length are truncated, and @samp{...} included in the output to show that
3475 truncation took place. A warning is issued if @var{len} cannot be
3476 parsed as an integer.
3477 @comment FIXME - make this understand an optional suffix, similar to how
3478 @comment --debuglen does. Also, we need a section documenting scaling
3480 @comment FIXME - should we allow len to be `?', meaning expand to the
3481 @comment current value?
3483 The macro @code{debuglen} is recognized only with parameters.
3486 @comment options: -l4 -techo
3488 $ @kbd{m4 -d -l 4 -t echo}
3490 @error{}m4:stdin:1: Warning: debuglen: non-numeric argument `oops'
define(`echo', `$@@')
3495 @error{}m4trace: -1- echo(`long...') -> ``lon...'
3496 @result{}long string
3502 @error{}m4trace: -1- echo(`long string') -> ``long string''
3503 @result{}long string
3507 @error{}m4trace: -1- echo(`long string') -> ``long string...'
3508 @result{}long string
3512 @section Saving debugging output
3514 @cindex saving debugging output
3515 @cindex debugging output, saving
3516 @cindex output, saving debugging
3517 @cindex @acronym{GNU} extensions
3518 Debug and tracing output can be redirected to files using either the
3519 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
3520 Invoking m4}), or with the builtin macro @code{debugfile}:
3522 @deffn {Builtin (gnu)} debugfile (@ovar{file})
3523 Send all further debug and trace output to @var{file}, opened in append
3524 mode. If @var{file} is the empty string, debug and trace output are
3525 discarded. If @code{debugfile} is called without any arguments,
3526 debug and trace output are sent to standard error. This does not
3527 affect warnings, error messages, or @code{errprint} and @code{dumpdef}
3528 output, which are always sent to standard error. If @var{file} cannot
3529 be opened, the current debug file is unchanged, and an error is issued.
3531 When the @option{--safer} option (@pxref{Operation modes, , Invoking
3532 m4}) is in effect, @var{file} must be empty or omitted, since otherwise
3533 an input file could cause the modification of arbitrary files.
3535 The expansion of @code{debugfile} is void.
3543 @error{}m4:stdin:2: Warning: divnum: extra arguments ignored: 1 > 0
3544 @error{}m4trace: -1- divnum(`extra') -> `0'
3549 @error{}m4:stdin:4: Warning: divnum: extra arguments ignored: 1 > 0
3554 @error{}m4trace: -1- divnum -> `0'
3558 Although the @option{--safer} option cripples @code{debugfile} to a
3559 limited subset of capabilities, you may still use the @option{--debugfile}
3560 option from the command line with no restrictions.
3562 @comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l
3565 $ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq}
3566 foo # traced to `trace'
3567 @result{}bar # traced to `trace'
3569 @error{}m4:stdin:2: debugfile: disabled by --safer
3571 foo # traced to `trace'
3572 @result{}bar # traced to `trace'
3575 foo # trace discarded
3576 @result{}bar # trace discarded
3579 foo # traced to stderr
3580 @error{}m4trace:7: -1- foo -> `bar'
3581 @result{}bar # traced to stderr
3582 undivert(`trace')dnl
3583 @result{}m4trace:1: -1- foo -> `bar'
3584 @result{}m4trace:3: -1- foo -> `bar'
3588 @chapter Input control
3590 This chapter describes various builtin macros for controlling the input
3594 * Dnl:: Deleting whitespace in input
3595 * Changequote:: Changing the quote characters
3596 * Changecom:: Changing the comment delimiters
3597 * Changeresyntax:: Changing the regular expression syntax
3598 * Changesyntax:: Changing the lexical structure of the input
3599 * M4wrap:: Saving text until end of input
3603 @section Deleting whitespace in input
3605 @cindex deleting whitespace in input
3606 The builtin @code{dnl} stands for ``Discard to Next Line'':
3608 @deffn {Builtin (m4)} dnl
3609 All characters, up to and including the next newline, are discarded
3610 without performing any macro expansion. A warning is issued if the end
3611 of the file is encountered without a newline.
3613 The expansion of @code{dnl} is void.
3616 It is often used in connection with @code{define}, to remove the
3617 newline that follows the call to @code{define}. Thus
3620 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
3625 The input up to and including the next newline is discarded, as opposed
3626 to the way comments are treated (@pxref{Comments}), when the command
3627 line option @option{--discard-comments} is not in effect
3628 (@pxref{Operation modes, , Invoking m4}).
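
The difference can be seen in the following short sketch, which assumes
the default comment delimiters and uses an illustrative macro
@code{foo}.  The comment is copied to the output, newline included,
while the text after @code{dnl} disappears together with its newline, so
the output of the last two input lines is joined.

@example
define(`foo', `bar')dnl
foo # foo is left alone inside the comment
@result{}bar # foo is left alone inside the comment
foo dnl foo and the rest of this line vanish
foo
@result{}bar bar
@end example
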
3630 Usually, @code{dnl} is immediately followed by an end of line or some
3631 other whitespace. @acronym{GNU} @code{m4} will produce a warning diagnostic if
3632 @code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
3633 will collect and process all arguments, looking for a matching close
3634 parenthesis. All predictable side effects resulting from this
3635 collection will take place. @code{dnl} will return no output. The
3636 input following the matching close parenthesis up to and including the
next newline, on whatever line contains it, will still be discarded.
3640 dnl(`args are ignored, but side effects occur',
3641 define(`foo', `like this')) while this text is ignored: undefine(`foo')
3642 @error{}m4:stdin:1: Warning: dnl: extra arguments ignored: 2 > 0
3643 See how `foo' was defined, foo?
3644 @result{}See how foo was defined, like this?
3647 If the end of file is encountered without a newline character, a
warning is issued and @code{dnl} stops consuming input.
3651 m4wrap(`m4wrap(`2 hi
3657 @error{}m4:stdin:1: Warning: dnl: end of file treated as newline
3662 @section Changing the quote characters
3664 @cindex changing the quote delimiters
3665 @cindex quote delimiters, changing the
3666 The default quote delimiters can be changed with the builtin
3669 @deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '})
3670 This sets @var{start} as the new begin-quote delimiter and @var{end} as
3671 the new end-quote delimiter. If both arguments are missing, the default
3672 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
3673 quoting is disabled. Otherwise, if @var{end} is missing or void, the
3674 default end-quote delimiter (@code{'}) is used. The quote delimiters
3675 can be of any length.
3677 The expansion of @code{changequote} is void.
3681 changequote(`[', `]')
3683 define([foo], [Macro [foo].])
3689 The quotation strings can safely contain eight-bit characters.
3690 If no single character is appropriate, @var{start} and @var{end} can be
3691 of any length. Other implementations cap the delimiter length to five
3692 characters, but @acronym{GNU} has no inherent limit.
3695 changequote(`[[[', `]]]')
3697 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
3700 @result{}Macro [[foo]].
3703 Calling @code{changequote} with @var{start} as the empty string will
3704 effectively disable the quoting mechanism, leaving no way to quote text.
3705 However, using an empty string is not portable, as some other
3706 implementations of @code{m4} revert to the default quoting, while others
3707 preserve the prior non-empty delimiter. If @var{start} is not empty,
3708 then an empty @var{end} will use the default end-quote delimiter of
3709 @samp{'}, as otherwise, it would be impossible to end a quoted string.
3710 Again, this is not portable, as some other @code{m4} implementations
3711 reuse @var{start} as the end-quote delimiter, while others preserve the
3712 previous non-empty value. Omitting both arguments restores the default
3713 begin-quote and end-quote delimiters; fortunately this behavior is
3714 portable to all implementations of @code{m4}.
3717 define(`foo', `Macro `FOO'.')
3722 @result{}Macro `FOO'.
3724 @result{}`Macro `FOO'.'
3731 There is no way in @code{m4} to quote a string containing an unmatched
3732 begin-quote, except using @code{changequote} to change the current
3735 If the quotes should be changed from, say, @samp{[} to @samp{[[},
3736 temporary quote characters have to be defined. To achieve this, two
3737 calls of @code{changequote} must be made, one for the temporary quotes
3738 and one for the new quotes.
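
The two-step change can be sketched as follows.  The temporary
delimiters @samp{<<} and @samp{>>} and the macro name @code{greeting}
are arbitrary choices made for this illustration; any pair of strings
that does not clash with the old or the new quotes would serve.  While
@samp{[} and @samp{]} are the current quotes, the new delimiters cannot
be written directly as arguments, since each bracket would be read as
the start or end of a quoted string.

@example
changequote(`[', `]')
@result{}
changequote([<<], [>>])
@result{}
changequote(<<[[>>, <<]]>>)
@result{}
define([[greeting]], [[single [brackets] are plain text now]])
@result{}
greeting
@result{}single [brackets] are plain text now
@end example
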
3740 Macros are recognized in preference to the begin-quote string, so if a
3741 prefix of @var{start} can be recognized as part of a potential macro
3742 name, the quoting mechanism is effectively disabled. Unless you use
3743 @code{changesyntax} (@pxref{Changesyntax}), this means that @var{start}
3744 should not begin with a letter, digit, or @samp{_} (underscore).
3745 However, even though quoted strings are not recognized, the quote
3746 characters can still be discerned in macro expansion and in trace
3750 define(`echo', `$@@')
3754 changequote(`q', `Q')
3762 changequote(`-', `EOF')
3768 changequote(`1', `2')
3776 Quotes are recognized in preference to argument collection. In
3777 particular, if @var{start} is a single @samp{(}, then argument
3778 collection is effectively disabled. For portability with other
3779 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
3780 @samp{)} as the first character in @var{start}.
3783 define(`echo', `$#:$@@:')
3787 changequote(`(',`)')
3793 changequote(`((', `))')
3801 changequote(`,', `)')
3807 If @var{end} is a prefix of @var{start}, the end-quote will be
3808 recognized in preference to a nested begin-quote. In particular,
3809 changing the quotes to have the same string for @var{start} and
3810 @var{end} disables nesting of quotes. When quote nesting is disabled,
3811 it is impossible to double-quote strings across macro expansions, so
3812 using the same string is not done very often.
3817 changequote(`""', `"')
3829 changequote(`"', `"')
3835 It is an error if the end of file occurs within a quoted string.
3840 @result{}hello world
3843 @error{}m4:stdin:2: end of file in string
3847 @section Changing the comment delimiters
3849 @cindex changing comment delimiters
3850 @cindex comment delimiters, changing
3851 The default comment delimiters can be changed with the builtin
3852 macro @code{changecom}:
3854 @deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}})
3855 This sets @var{start} as the new begin-comment delimiter and @var{end}
3856 as the new end-comment delimiter. If both arguments are missing, or
3857 @var{start} is void, then comments are disabled. Otherwise, if
3858 @var{end} is missing or void, the default end-comment delimiter of
3859 newline is used. The comment delimiters can be of any length.
3861 The expansion of @code{changecom} is void.
3865 define(`comment', `COMMENT')
3868 @result{}# A normal comment
3869 changecom(`/*', `*/')
3871 # Not a comment anymore
3872 @result{}# Not a COMMENT anymore
3873 But: /* this is a comment now */ while this is not a comment
3874 @result{}But: /* this is a comment now */ while this is not a COMMENT
3877 @cindex comments, copied to output
3878 Note how comments are copied to the output, much as if they were quoted
strings.  If you want the text inside a comment expanded, quote the
begin-comment delimiter.
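
For instance, in a fresh session with the default delimiters, quoting
the @samp{#} keeps it from starting a comment, so the rest of the line
is scanned for macros as usual.  The macro @code{comment} here is only
an illustration.

@example
define(`comment', `COMMENT')
@result{}
# this comment is copied verbatim
@result{}# this comment is copied verbatim
`#' this comment is expanded
@result{}# this COMMENT is expanded
@end example
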
3882 Calling @code{changecom} without any arguments, or with @var{start} as
3883 the empty string, will effectively disable the commenting mechanism. To
3884 restore the original comment start of @samp{#}, you must explicitly ask
3885 for it. If @var{start} is not empty, then an empty @var{end} will use
3886 the default end-comment delimiter of newline, as otherwise, it would be
3887 impossible to end a comment. However, this is not portable, as some
3888 other @code{m4} implementations preserve the previous non-empty
3892 define(`comment', `COMMENT')
3896 # Not a comment anymore
3897 @result{}# Not a COMMENT anymore
3901 @result{}# comment again
3904 The comment strings can safely contain eight-bit characters.
3905 If no single character is appropriate, @var{start} and @var{end} can be
3906 of any length. Other implementations cap the delimiter length to five
3907 characters, but @acronym{GNU} has no inherent limit.
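
As a small sketch of multi-character delimiters, the following session
picks the strings @samp{<!--} and @samp{-->}; both they and the macro
@code{x} are arbitrary choices for this illustration.  Once the
delimiters are changed, @samp{#} no longer starts a comment.

@example
define(`x', `X')
@result{}
changecom(`<!--', `-->')
@result{}
x <!-- x is not expanded here --> x
@result{}X <!-- x is not expanded here --> X
# x
@result{}# X
@end example
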
3909 Macros and quotes are recognized in preference to comments, so if a
3910 prefix of @var{start} can be recognized as part of a potential macro
3911 name, or confused with a quoted string, the comment mechanism is
3912 effectively disabled. Unless you use @code{changesyntax}
3913 (@pxref{Changesyntax}), this means that @var{start} should not begin
3914 with a letter, digit, or @samp{_} (underscore), and that neither the
3915 start-quote nor the start-comment string should be a prefix of the
3921 define(`hi1hi2', `hello')
3933 changecom(`[[', `]]')
3935 changequote(`[[[', `]]]')
3945 changecom(`[[[', `]]]')
3947 changequote(`[[', `]]')
3955 Comments are recognized in preference to argument collection. In
3956 particular, if @var{start} is a single @samp{(}, then argument
3957 collection is effectively disabled. For portability with other
3958 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
3959 @samp{)} as the first character in @var{start}.
3962 define(`echo', `$#:$@@:')
3972 changecom(`((', `))')
3981 @result{}1:HI,hi)bye:
3984 It is an error if the end of file occurs within a comment.
3988 changecom(`/*', `*/')
3992 @error{}m4:stdin:2: end of file in comment
3995 @node Changeresyntax
3996 @section Changing the regular expression syntax
3998 @cindex regular expression syntax, changing
3999 @cindex @acronym{GNU} extensions
4000 The @acronym{GNU} extensions @code{patsubst}, @code{regexp}, and more
4001 recently, @code{renamesyms} each deal with regular expressions. There
4002 are multiple flavors of regular expressions, so the
4003 @code{changeresyntax} builtin exists to allow choosing the default flavor:
4006 @deffn {Builtin (gnu)} changeresyntax (@var{resyntax})
4007 Changes the default regular expression syntax used by M4 according to
4008 the value of @var{resyntax}, equivalent to passing @var{resyntax} as the
4009 argument to the command line option @option{--regexp-syntax}
4010 (@pxref{Operation modes, , Invoking m4}). If @var{resyntax} is empty,
4011 the default flavor is reverted to emacs style.
4013 @var{resyntax} can be any one of the values in the table below. Case is
4014 not important, and @kbd{-} or @kbd{ } can be substituted for @kbd{_} in
4015 the given names. If @var{resyntax} is unrecognized, a warning is
4016 issued and the default flavor is not changed.
4020 @xref{awk regular expression syntax}, for details.
4026 @xref{posix-basic regular expression syntax}, for details.
4030 @itemx POSIX_EXTENDED
4031 @xref{posix-extended regular expression syntax}, for details.
4035 @xref{gnu-awk regular expression syntax}, for details.
4039 @xref{egrep regular expression syntax}, for details.
4044 @xref{emacs regular expression syntax}, for details. This is the
4045 default regular expression flavor.
4048 @xref{grep regular expression syntax}, for details.
4051 @itemx POSIX_MINIMAL
4052 @itemx POSIX_MINIMAL_BASIC
4053 @xref{posix-minimal-basic regular expression syntax}, for details.
4056 @xref{posix-awk regular expression syntax}, for details.
4059 @xref{posix-egrep regular expression syntax}, for details.
4062 The expansion of @code{changeresyntax} is void.
4063 The macro @code{changeresyntax} is recognized only with parameters.
4064 This macro was added in M4 2.0.
4067 For an example of how @var{resyntax} is recognized, the first three
4068 usages select the @samp{GNU_M4} regular expression flavor:
4071 changeresyntax(`gnu m4')
4073 changeresyntax(`GNU-m4')
4075 changeresyntax(`Gnu_M4')
4077 changeresyntax(`unknown')
4078 @error{}m4:stdin:4: Warning: changeresyntax: bad syntax-spec: `unknown'
4082 Using @code{changeresyntax} makes it possible to omit the optional
4083 @var{resyntax} parameter to other macros, while still using a different
4084 regular expression flavor.
4087 patsubst(`ab', `a|b', `c')
4089 patsubst(`ab', `a\|b', `c')
4091 patsubst(`ab', `a|b', `c', `EXTENDED')
4093 changeresyntax(`EXTENDED')
4095 patsubst(`ab', `a|b', `c')
4097 patsubst(`ab', `a\|b', `c')
4102 @section Changing the lexical structure of the input
4104 @cindex lexical structure of the input
4105 @cindex input, lexical structure of the
4106 @cindex syntax table
4107 @cindex @acronym{GNU} extensions
4109 The macro @code{changesyntax} and all associated functionality are
4110 experimental (@pxref{Experiments}). The functionality might change in
4111 the future. Please direct your comments about it the same way you would do for bugs.
4115 The input to @code{m4} is read character by character, and these
4116 characters are grouped together to form input tokens (such as macro
4117 names, strings, comments, etc.).
4119 Each token is parsed according to certain rules. For example, a macro
4120 name starts with a letter or @samp{_} and consists of the longest
4121 possible string of letters, @samp{_} and digits. But who is to decide
4122 what characters are letters, digits, quotes, white space? Earlier, the
4123 operating system decided; now you do.
4125 Input characters belong to different categories:
4129 Characters that start a macro name. Defaults to the letters as defined
4130 by the locale, and the character @samp{_}.
4133 Characters that, together with the letters, form the remainder of a
4134 macro name. Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
4135 other digits defined by the locale.
4138 Characters that should be trimmed from the beginning of each argument to
4139 a macro call. The defaults are space, tab, newline, carriage return,
4140 form feed, and vertical tab, and any others as defined by the locale.
4142 @item Open parenthesis
4143 Characters that open the argument list of a macro call. The default is
4144 the single character @samp{(}.
4146 @item Close parenthesis
4147 Characters that close the argument list of a macro call. The default
4148 is the single character @samp{)}.
4150 @item Argument separator
4151 Characters that separate the arguments of a macro call. The default is
4152 the single character @samp{,}.
4155 Characters that can introduce an argument reference in the body of a
4156 macro. The default is the single character @samp{$}.
4159 Characters that introduce an extended argument reference in the body of
4160 a macro immediately after a character in the Dollar category. The
4161 default is the single character @samp{@{}.
4164 Characters that conclude an extended argument reference in the body of a
4165 macro. The default is the single character @samp{@}}.
4168 The set of characters that can start a single-character quoted string.
4169 The default is the single character @samp{`}. For multiple-character
4170 quote delimiters, use @code{changequote} (@pxref{Changequote}).
4173 The set of characters that can start a single-character comment. The
4174 default is the single character @samp{#}. For multiple-character
4175 comment delimiters, use @code{changecom} (@pxref{Changecom}).
4178 Characters that have no special syntactical meaning to @code{m4}.
4179 Defaults to all characters except those in the categories above.
4182 Characters that themselves, alone, form macro names. This is a
4183 @acronym{GNU} extension, and active characters have lower precedence
4184 than comments. By default, no characters are active.
4187 Characters that must precede macro names for them to be recognized.
4188 This is a @acronym{GNU} extension. When an escape character is defined,
4189 then macros are not recognized unless the escape character is present;
4190 however, the macro name, visible by @samp{$0} in macro definitions, does
4191 not include the escape character. By default, no characters are escapes.
4194 @comment FIXME - we should also consider supporting:
4195 @comment @item Ignore - characters that are ignored if they appear in
4196 @comment the input; perhaps defaulting to '\0', category 'I'.
4197 @comment @item Assign -character used in macro definitions for default
4198 @comment variables, category '='.
4202 Each character can, besides the basic syntax category, have some syntax
4203 attributes. One reason these are attributes rather than categories is
4204 that end delimiters are never recognized except when searching for the
4205 end of a token triggered by a start delimiter; the end delimiter can
4206 have syntax properties of its own when it appears in isolation. These are:
4211 The set of characters that can end a single-character quoted string.
4212 The default is the single character @samp{'}. For multiple-character
4213 quote delimiters, use @code{changequote} (@pxref{Changequote}). Note
4214 that @samp{'} also defaults to the syntax category `Other', when it
4215 appears in isolation.
4218 The set of characters that can end a single-character comment. The
4219 default is the single character @kbd{newline}. For multiple-character
4220 comment delimiters, use @code{changecom} (@pxref{Changecom}). Note that
4221 newline also defaults to the syntax category `White space', when it
4222 appears in isolation.
4225 The builtin macro @code{changesyntax} is used to change the way
4226 @code{m4} parses the input stream into tokens.
4228 @deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
4229 Each @var{syntax-spec} is a two-part string. The first part is a
4230 command, consisting of a single character describing a syntax category,
4231 and an optional one-character action. The action can be @samp{-} to
4232 remove the listed characters from that category and reassign them to the
4233 `Other' category, @samp{=} to set the category to the listed characters
4234 and reassign all other characters previously in that category to
4235 `Other', or @samp{+} to add the listed characters to the category
4236 without affecting other characters. If an action is not specified, but
4237 additional characters are present, then @samp{=} is assumed. The
4238 case-insensitive characters for the syntax categories are:
4275 The remaining characters of each @var{syntax-spec} form the set of
4276 characters to perform the action on for that syntax category. Character
4277 ranges are expanded as for @code{translit} (@pxref{Translit}). To start
4278 the character set with @samp{-}, @samp{+}, or @samp{=}, an action must be specified.
4281 If @var{syntax-spec} is just a category, and no action or characters
4282 were specified, then all characters in that category are reset to their
4283 default state. A warning is issued if the category character is not
4284 valid. If @var{syntax-spec} is the empty string, then all categories
4285 are reset to their default state.
4287 The expansion of @code{changesyntax} is void.
4288 The macro @code{changesyntax} is recognized only with parameters. Use
4289 this macro with caution, as it is possible to change the syntax in such
4290 a way that no further macros can be recognized by @code{m4}.
4291 This macro was added in M4 2.0.
4294 With @code{changesyntax} we can modify what characters form a word.
4297 define(`test.1', `TEST ONE')
4305 changesyntax(`W+.', `W-_')
4311 changesyntax(`W=a-zA-Z0-9_')
4325 Another possibility is to change the syntax of a macro call.
4328 define(`test', `$#')
4332 changesyntax(`(<', `,|', `)>')
4340 Leading spaces are always removed from macro arguments in @code{m4}, but
4341 by changing the syntax categories we can avoid it. The use of
4342 @code{format} is an alternative to using a literal tab character.
4345 define(`test', `$1$2$3')
4349 changesyntax(`O 'format(`%c', `9'))
4355 It is possible to redefine the @samp{$} used to indicate macro arguments
4356 in user-defined macros.
4359 define(`argref', `Dollar: $#, Question: ?#')
4362 @result{}Dollar: 3, Question: ?#
4363 changesyntax(`$?', `O$')
4366 @result{}Dollar: $#, Question: 3
4370 Dollar class syntax elements are copied to the output if there is no valid expansion.
4374 define(`escape', `$?`'1$?1?')
4382 Macro calls can be given a @TeX{}-like or Texinfo-like syntax using an
4383 escape. If one or more characters are defined as escapes, macro names
4384 are only recognized if preceded by an escape character.
4386 If the escape is not followed by what is normally a word (a letter
4387 optionally followed by letters and/or numerals), that single character
4388 is returned as a macro name.
4390 As always, words without a macro definition cause no error message.
4391 They and the escape character are simply output.
4394 define(`foo', `bar')
4396 changesyntax(`@@@@')
4402 @@changesyntax(`@@\', `O@@')
4410 define(`#', `No comment')
4411 @result{}define(#, No comment)
4412 \define(`#', `No comment')
4414 \# \foo # Comment \foo
4415 @result{}No comment bar # Comment \foo
4418 Active characters are known from @TeX{}. In @code{m4} an active
4419 character is always seen as a one-letter word, and so, if it has a macro
4420 definition, the macro will be called.
4423 define(`@@', `TEST')
4433 There is obviously an overlap with @code{changecom} and
4434 @code{changequote}. Comment delimiters and quotes can now be defined in
4435 two different ways. To avoid incompatibilities, if the quotes are set
4436 with @code{changequote}, all other characters marked in the syntax table
4437 as quotes will revert to their normal syntax categories, leaving only
4438 one set of defined quotes as before. If the quotes are set with
4439 @code{changesyntax}, it is possible to result in multiple sets of
4440 quotes. This applies to comment delimiters as well, @emph{mutatis mutandis}.
4444 define(`test', `TEST')
4446 changesyntax(`L+<', `R+>')
4454 changequote(<[>, `]')
4464 If several characters are assigned to a category that forms single-character
4465 tokens, all such characters are treated as equal. Any open
4466 parenthesis will match any close parenthesis, etc.
4469 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
4471 eval@{2**4-1; 2 : 8>
4475 On the other hand, a multi-character start-quote sequence, which can
4476 only be created by @code{changequote}, will only be matched by the
4477 corresponding end-quote sequence. The same goes for comment delimiters.
4480 define(`test', `==$1==')
4482 changequote(`<<', `>>')
4484 changesyntax(<<L[>>, <<R]>>)
4487 @result{}==testing]==
4489 @result{}==testing>>==
4491 @result{}==testing==
4495 Note how it is possible to have both long and short quotes, if
4496 @code{changequote} is used before @code{changesyntax}.
4498 The syntax table is initialized to be backwards compatible, so if you
4499 never call @code{changesyntax}, nothing will have changed.
4501 Debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)} to show macro calls.
4505 @section Saving text until end of input
4507 @cindex saving input
4508 @cindex input, saving
4509 It is possible to `save' some text until the end of the normal input has
4510 been seen; the saved text is then read again by @code{m4} once the
4511 normal input has been exhausted. This feature is normally used to
4512 initiate cleanup actions before normal exit, e.g., deleting temporary files.
4515 @deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
4516 The builtin @code{m4wrap} saves input text for later. It
4517 stores @var{string} and the rest of the arguments in a safe place,
4518 to be reread when end of input is reached.
4522 define(`cleanup', `This is the `cleanup' action.
4527 This is the first and last normal input line.
4528 @result{}This is the first and last normal input line.
4530 @result{}This is the cleanup action.
4533 The saved input is only reread when the end of normal input is seen, and
4534 not if @code{m4exit} is used to exit @code{m4}.
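
A minimal sketch of that distinction: the wrapped text below is stored,
but because the input ends through @code{m4exit} rather than through a
normal end of input, it is never reread.  The wording of the wrapped
string is only illustrative.

@example
dnl The wrapped string is an arbitrary illustrative message.
m4wrap(`This wrapped text is never reread.
')
@result{}
m4exit(`0')
@end example

The only output is the newline that follows the @code{m4wrap} call.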
4536 @comment FIXME: this contradicts POSIX, which requires that "If the
4537 @comment m4wrap macro is used multiple times, the arguments specified
4538 @comment shall be processed in the order in which the m4wrap macros were
4539 @comment processed."
4540 It is safe to call @code{m4wrap} from saved text, but then the order in
4541 which the saved text is reread is undefined. If @code{m4wrap} is not used
4542 recursively, the saved pieces of text are reread in the reverse of the order
4543 in which they were saved (LIFO: last in, first out).
4545 Here is an example of implementing a factorial function using @code{m4wrap}:
4549 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
4550 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
4551 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
4556 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
4559 Invocations of @code{m4wrap} at the same recursion level are
4560 concatenated and rescanned as usual:
4566 m4wrap(`a')m4wrap(`a')
4573 however, the transition between recursion levels behaves like an end of
4574 file condition between two input files.
4578 m4wrap(`m4wrap(`)')len(abc')
4581 @error{}m4:stdin:1: end of file in argument list
4584 @node File Inclusion
4585 @chapter File inclusion
4587 @cindex file inclusion
4588 @cindex inclusion, of files
4590 @code{m4} allows you to include named files at any point in the input.
4593 * Include:: Including named files
4594 * Search Path:: Searching for include files
4598 @section Including named files
4600 There are two builtin macros in @code{m4} for including files:
4602 @deffn {Builtin (m4)} include (@var{file})
4603 @deffnx {Builtin (m4)} sinclude (@var{file})
4604 Both macros cause the file named @var{file} to be read by
4605 @code{m4}. When the end of the file is reached, input is resumed from
4606 the previous input file.
4608 The expansion of @code{include} and @code{sinclude} is therefore the
4609 contents of @var{file}.
4611 If @var{file} does not exist (or cannot be read), the expansion is void,
4612 and @code{include} will fail with an error while @code{sinclude} is
4613 silent. The empty string counts as a file that does not exist.
4615 The macros @code{include} and @code{sinclude} are recognized only with parameters.
4622 @error{}m4:stdin:1: include: cannot open `n': No such file or directory
4625 @error{}m4:stdin:2: include: cannot open `': No such file or directory
4633 This section uses the @option{--include} command-line option (or
4634 @option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab
4635 files from the @file{m4-@value{VERSION}/@/examples}
4636 directory shipped as part of the @acronym{GNU} @code{m4} package. The
4637 file @file{m4-@value{VERSION}/@/examples/@/incl.m4} in the distribution contains the lines:
4642 $ @kbd{cat examples/incl.m4}
4643 @result{}Include file start
4645 @result{}Include file end
4648 Normally file inclusion is used to insert the contents of a file
4649 into the input stream. The contents of the file will be read by
4650 @code{m4} and macro calls in the file will be expanded:
4654 $ @kbd{m4 -I examples}
4655 define(`foo', `FOO')
4658 @result{}Include file start
4660 @result{}Include file end
4664 The fact that @code{include} and @code{sinclude} expand to the contents
4665 of the file can be used to define macros that operate on entire files.
4666 Here is an example, which defines @samp{bar} to expand to the contents of @file{incl.m4}:
4671 $ @kbd{m4 -I examples}
4672 define(`bar', include(`incl.m4'))
4674 This is `bar': >>bar<<
4675 @result{}This is bar: >>Include file start
4677 @result{}Include file end
4681 This use of @code{include} is not trivial, though, as files can contain
4682 quotes, commas, and parentheses, which can interfere with the way the
4683 @code{m4} parser works. @acronym{GNU} @code{m4} seamlessly concatenates
4684 the file contents with the next character, even if the included file
4685 ended in the middle of a comment, string, or macro call. These
4686 conditions are only treated as end of file errors if specified as input
4687 files on the command line.
4689 In @acronym{GNU} @code{m4}, an alternative method of reading files is
4690 using @code{undivert} (@pxref{Undivert}) on a named file.
4693 @section Searching for include files
4695 @cindex search path for included files
4696 @cindex included files, search path for
4697 @cindex @acronym{GNU} extensions
4698 @acronym{GNU} @code{m4} allows included files to be found in directories
4699 other than the current working directory.
4701 @cindex @env{M4PATH}
4702 If the @option{--prepend-include} or @option{-B} command-line option was
4703 provided (@pxref{Preprocessor features, , Invoking m4}), those
4704 directories are searched first, in the reverse of the order in which those options were
4705 listed on the command line. Then @code{m4} looks in the current working
4706 directory. Next come the directories specified with the
4707 @option{--include} or @option{-I} option, in the order found on the
4708 command line. Finally, if the @env{M4PATH} environment variable is set,
4709 it is expected to contain a colon-separated list of directories, which
4710 will be searched in order.
4712 If the automatic search for include-files causes trouble, the @samp{p}
4713 debug flag (@pxref{Debugmode}) can help isolate the problem.
4716 @chapter Diverting and undiverting output
4718 Diversions are a way of temporarily saving output. The output of
4719 @code{m4} can at any time be diverted to a temporary file, and be
4720 reinserted into the output stream, @dfn{undiverted}, again at a later time.
4723 @cindex @env{TMPDIR}
4724 Numbered diversions are counted from 0 upwards, diversion number 0
4725 being the normal output stream. The number of simultaneous diversions
4726 is limited mainly by the memory used to describe them, because @acronym{GNU}
4727 @code{m4} tries to keep diversions in memory. However, there is a
4728 limit to the overall memory usable by all diversions taken altogether
4729 (512K, currently). When this maximum is about to be exceeded,
4730 a temporary file is opened to receive the contents of the biggest
4731 diversion still in memory, freeing this memory for other diversions.
4732 When creating the temporary file, @code{m4} honors the value of the
4733 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
4734 So, it is theoretically possible that the number and aggregate size of
4735 diversions is limited only by available disk space.
4737 Diversions make it possible to generate output in a different order than
4738 the input was read. This makes it possible to implement a topological sort of
4739 dependencies. For example, @acronym{GNU} Autoconf makes use of
4740 diversions under the hood to ensure that the expansion of a prerequisite
4741 macro appears in the output prior to the expansion of a dependent macro,
4742 regardless of which order the two macros were invoked in the user's input file.
4746 * Divert:: Diverting output
4747 * Undivert:: Undiverting output
4748 * Divnum:: Diversion numbers
4749 * Cleardivert:: Discarding diverted text
4753 @section Diverting output
4755 @cindex diverting output to files
4756 @cindex output, diverting to files
4757 @cindex files, diverting output to
4758 Output is diverted using @code{divert}:
4760 @deffn {Builtin (m4)} divert (@dvar{number, 0})
4761 The current diversion is changed to @var{number}. If @var{number} is left
4762 out or empty, it is assumed to be zero. If @var{number} cannot be
4763 parsed, the diversion is unchanged.
4765 The expansion of @code{divert} is void.
4768 When all the @code{m4} input has been processed, all existing
4769 diversions are automatically undiverted, in numerical order.
4773 This text is diverted.
4776 This text is not diverted.
4777 @result{}This text is not diverted.
4780 @result{}This text is diverted.
4783 Several calls of @code{divert} with the same argument do not overwrite
4784 the previous diverted text, but append to it. Diversions are printed
4785 after any wrapped text is expanded.
4788 define(`text', `TEXT')
4790 divert(`1')`diverted text.'
4793 m4wrap(`Wrapped text precedes ')
4796 @result{}Wrapped TEXT precedes diverted text.
4799 If output is diverted to a negative diversion, it is simply discarded.
4800 This can be used to suppress unwanted output. A common example of
4801 unwanted output is the trailing newlines after macro definitions. Here
4802 is a common programming idiom in @code{m4} for avoiding them.
4806 define(`foo', `Macro `foo'.')
4807 define(`bar', `Macro `bar'.')
4812 @cindex @acronym{GNU} extensions
4813 Traditional implementations only supported ten diversions. But as a
4814 @acronym{GNU} extension, diversion numbers can be as large as positive
4815 integers will allow, rather than treating a multi-digit diversion number
4816 as a request to discard text.
4819 divert(eval(`1<<28'))world
4826 Note that @code{divert} is an English word, but also an active macro
4827 without arguments. When processing plain text, the word might appear in
4828 normal text and be unintentionally swallowed as a macro invocation. One
4829 way to avoid this is to use the @option{-P} option to rename all
4830 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
4831 a wrapper that requires a parameter to be recognized.
4834 We decided to divert the stream for irrigation.
4835 @result{}We decided to the stream for irrigation.
4836 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
4842 We decided to divert the stream for irrigation.
4843 @result{}We decided to divert the stream for irrigation.
4847 @section Undiverting output
4849 Diverted text can be undiverted explicitly using the builtin @code{undivert}:
4852 @deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}})
4853 Undiverts the numeric @var{diversions} given by the arguments, in the
4854 order given. If no arguments are supplied, all diversions are
4855 undiverted, in numerical order.
4857 @cindex @acronym{GNU} extensions
4858 As a @acronym{GNU} extension, @var{diversions} may contain non-numeric
4859 strings, which are treated as the names of files to copy into the output
4860 without expansion. A warning is issued if a file could not be opened.
4862 The expansion of @code{undivert} is void.
4867 This text is diverted.
4870 This text is not diverted.
4871 @result{}This text is not diverted.
4874 @result{}This text is diverted.
4878 Notice the last two blank lines. One of them comes from the newline
4879 following @code{undivert}, the other from the newline that followed the
4880 @code{divert}! A diversion often starts with a blank line like this.
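
Because @code{undivert} emits the listed diversions in the order given,
it can be used to reorder output.  The following sketch, with
illustrative diversion contents, brings back diversion 2 ahead of
diversion 1; @code{dnl} is used to suppress the extra newlines.

@example
dnl The words `one' and `two' are arbitrary illustrative contents.
divert(`1')one
divert(`2')two
divert(`0')dnl
undivert(`2', `1')dnl
@result{}two
@result{}one
@end example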
4882 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
4883 but rather copied directly to the current output, and it is therefore
4884 not an error to undivert into a diversion. Undiverting the empty string
4885 is the same as specifying diversion 0; in either case nothing happens
4886 since the output has already been flushed.
4889 divert(`1')diverted text
4897 @result{}diverted text
4901 When a diversion has been undiverted, the diverted text is discarded,
4902 and it is not possible to bring back diverted text more than once.
4906 This text is diverted first.
4907 divert(`0')undivert(`1')dnl
4909 @result{}This text is diverted first.
4913 This text is also diverted but not appended.
4914 divert(`0')undivert(`1')dnl
4916 @result{}This text is also diverted but not appended.
4919 Attempts to undivert the current diversion are silently ignored. Thus,
4920 when the current diversion is not 0, the current diversion does not get
4921 rearranged among the other diversions.
4929 divert(`2')undivert(`5', `2', `4')dnl
4930 undivert`'dnl effectively undivert(`1', `2', `3', `4', `5')
4931 divert`'undivert`'dnl
4939 @cindex @acronym{GNU} extensions
4940 @cindex file inclusion
4941 @cindex inclusion, of files
4942 @acronym{GNU} @code{m4} allows named files to be undiverted. Given a
4943 non-numeric argument, the contents of the file named will be copied,
4944 uninterpreted, to the current output. This complements the builtin
4945 @code{include} (@pxref{Include}). To illustrate the difference, assume
4946 the file @file{foo} contains:
4958 define(`bar', `BAR')
4968 If the file is not found (or cannot be read), an error message is
4969 issued, and the expansion is void. It is possible to intermix files
4970 and diversion numbers.
4973 divert(`1')diversion one
4974 divert(`2')undivert(`foo')dnl
4975 divert(`3')diversion three
4977 undivert(`1', `2', `foo', `3')dnl
4978 @result{}diversion one
4981 @result{}diversion three
4985 @section Diversion numbers
4987 @cindex diversion numbers
4988 The current diversion is tracked by the builtin @code{divnum}:
4990 @deffn {Builtin (m4)} divnum
4991 Expands to the number of the current diversion.
4998 Diversion one: divnum
5000 Diversion two: divnum
5003 @result{}Diversion one: 1
5005 @result{}Diversion two: 2
5009 @section Discarding diverted text
5011 @cindex discarding diverted text
5012 @cindex diverted text, discarding
5013 Often it is not known, when output is diverted, whether the diverted
5014 text is actually needed. Since all non-empty diversions are brought back
5015 on the main output stream when the end of input is seen, a method of
5016 discarding a diversion is needed. If all diversions should be
5017 discarded, the easiest is to end the input to @code{m4} with
5018 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
5022 Diversion one: divnum
5024 Diversion two: divnum
5031 No output is produced at all.
5033 Clearing selected diversions can be done with the following macro:
5035 @deffn Composite cleardivert (@ovar{diversions@dots{}})
5036 Discard the contents of each of the listed numeric @var{diversions}.
5040 define(`cleardivert',
5041 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
5045 It is called just like @code{undivert}, but the effect is to clear the
5046 diversions given by the arguments. (This macro has a nasty bug! You
5047 should try to see if you can find it and correct it; or @pxref{Improved
5048 cleardivert, , Answers}).
5051 @chapter Extending M4 with dynamic runtime modules
5054 @acronym{GNU} M4 1.4.x had a monolithic architecture. All of its
5055 functionality was contained in a single binary, and additional macros
5056 could be added only by writing more code in the M4 language, or at the
5057 extreme by hacking the sources and recompiling the whole thing to make
5058 a custom M4 installation.
5060 Starting with release 2.0, M4 uses Libtool's @code{libltdl} facilities
5061 (@pxref{Using libltdl, , libltdl, libtool, The GNU Libtool Manual})
5062 to move all of M4's builtins out to pluggable modules. Unless compile
5063 time options are set to change the default build, the installed M4 2.0
5064 binary is virtually identical to 1.4.x, supporting the same builtins.
5065 However, an optional module can be loaded into the running M4 interpreter
5066 to provide a new @code{load} builtin. This facilitates runtime
5067 extension of the M4 builtin macro list using compiled C code linked
5068 against a new shared library, typically named @file{libm4.so}.
5070 For example, you might want to add a @code{setenv} builtin to M4, to
5071 use before invoking @code{esyscmd}. We might write a @file{setenv.c}
5072 something like this:
5076 #include "m4module.h"
5080 m4_builtin m4_builtin_table[] =
5082 /* name handler flags minargs maxargs */
5083 @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2, 3 @},
5085 @{ NULL, NULL, 0, 0, 0 @}
5089 * setenv(NAME, VALUE, [OVERWRITE])
5091 M4BUILTIN_HANDLER (setenv)
5096 if (!m4_numeric_arg (context, argc, argv, 3, &overwrite))
5099 setenv (M4ARG (1), M4ARG (2), overwrite);
5103 Then, having compiled and linked the module, in (somewhat contrived) M4 code:
5108 $ @kbd{M4MODPATH=`pwd` m4 --load-module=setenv}
5109 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
5111 esyscmd(`ifconfig -a')dnl
5115 Or instead of loading the module from the M4 invocation, you can use
5116 the new @code{load} builtin:
5120 $ @kbd{M4MODPATH=`pwd` m4 --load-module=load}
5123 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
5127 Also, at build time, you can choose which modules to build into
5128 the core (so that they will be available without dynamic loading).
5129 SUSv3 M4 functionality is contained in the module @samp{m4}, @acronym{GNU}
5130 extensions in the module @samp{gnu}, the @code{load} builtin in the
5131 module @samp{load} and so on.
5133 We hinted earlier that the @code{m4} and @code{gnu} modules are
5134 preloaded into the installed M4 binary, but it is possible to install
5135 a @emph{thinner} binary; for example, omitting the @acronym{GNU}
5136 extensions by configuring the distribution with @kbd{./configure
5137 --with-modules=m4}. For a binary built with that option to understand
5138 code that uses @acronym{GNU} extensions, you must then run @kbd{m4
5139 --load-module=gnu}. It is also possible to build a @emph{fatter}
5140 binary with additional modules preloaded: adding, say, the @code{load}
5141 builtin using @kbd{./configure --with-modules="m4 gnu load"}.
5143 @acronym{GNU} M4 now has a facility for defining additional builtins without
5144 recompiling the sources. In fact, all of the builtins provided
5145 by @acronym{GNU} M4 are loaded from such modules. All of the builtin
5146 descriptions in this manual are annotated with the module from which
5147 they are loaded -- mostly from the module @samp{m4}.
5149 When you start @acronym{GNU} M4, the modules @samp{m4} and @samp{gnu} are
5150 loaded by default. If you supply the @option{-G} option at startup, the
5151 module @samp{traditional} is loaded instead of @samp{gnu}.
5152 @xref{Compatibility}, for more details on the differences between these
5153 two modes of startup.
5156 * M4modules:: Listing loaded modules
5157 * Load:: Loading additional modules
5158 * Unload:: Removing loaded modules
5159 * Standard Modules:: Standard bundled modules
5163 @section Listing loaded modules
5165 @deffn {Builtin (load)} m4modules
Expands to a quoted ordered list of currently loaded modules,
with the most recently loaded module at the front of the list. Loading
a module multiple times does not affect the order of this list; the
position depends on when the module was @emph{first} loaded.
5172 For example, if @acronym{GNU} @code{m4} is started with the
5173 @option{-m load} option to load the module @samp{load} and make this
5174 builtin available, @code{m4modules} will yield the following:
@comment options: -m load
@example
$ @kbd{m4 -m load}
m4modules
@result{}load,gnu,m4
@end example
5184 @section Loading additional modules
5186 @deffn {Builtin (load)} load (@var{module-name})
5187 @var{module-name} will be searched for along the module search path
5188 (@pxref{Standard Modules}) and loaded if found. Loading a module
5189 consists of running its initialization function (if any) and then adding
5190 any macros it provides to the internal table.
5192 The macro @code{load} is recognized only with parameters.
Once the @code{load} module has successfully loaded, use of the
@samp{load} macro is entirely equivalent to the @option{-m} command line
option.
5199 @c The -mmpeval/--unload=mpeval pair allows the testsuite to skip this
5200 @c test if mpeval was not configured for usage.
5201 @comment options: -m load -m mpeval --unload-module=mpeval
@example
m4modules
@result{}load,gnu,m4
load(`mpeval')
@result{}
m4modules
@result{}mpeval,load,gnu,m4
@end example
5213 @section Removing loaded modules
5215 @deffn {Builtin (load)} unload (@var{module-name})
Any loaded modules that can be listed by the @code{m4modules} macro can be
5217 removed by naming them as the @var{module-name} parameter of the
5218 @code{unload} macro. Unloading a module consists of removing all of the
5219 macros it provides from the internal table of visible macros, and
5220 running the module's finalization method (if any).
5222 The macro @code{unload} is recognized only with parameters.
5225 @comment options: -m mpeval -m load
@example
$ @kbd{m4 -m mpeval -m load}
m4modules
@result{}load,mpeval,gnu,m4
unload(`mpeval')
@result{}
m4modules
@result{}load,gnu,m4
@end example
5236 @node Standard Modules
5237 @section Standard bundled modules
5239 @acronym{GNU} @code{m4} ships with several bundled modules as standard.
5240 By convention, these modules define a text macro that can be tested
5241 with @code{ifdef} when they are loaded; only the @code{m4} module lacks
5242 this feature test macro.
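
As a minimal sketch of how such a feature test might be used, the
following guard stops processing early when the @samp{gnu} module is not
loaded. The wording of the message and the exit status of 1 are
illustrative choices, not something mandated by @acronym{GNU} @code{m4}.

@example
ifdef(`__gnu__', `',
  `errprint(`this input requires the gnu module
')m4exit(`1')')dnl
@end example

When @code{__gnu__} is defined, the guard expands to nothing and
processing continues with the rest of the input.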
5246 Provides all of the builtins defined by @acronym{POSIX}. This module
5247 is always loaded --- @acronym{GNU} @code{m4} would only be a very slow
5248 version of @command{cat} without the builtins supplied by this module.
5251 Provides all of the @acronym{GNU} extensions, as defined by
5252 @acronym{GNU} M4 through the 1.4.x release series. It also provides a
5253 couple of feature test macros:
5255 @deffn {Macro (gnu)} __gnu__
Expands to the empty string, as an indication that the @samp{gnu}
module is loaded.
5260 @deffn {Macro (gnu)} __m4_version__
5261 Expands to a quoted string containing the release version number of the
5262 running @acronym{GNU} @code{m4} executable.
5265 This module is always loaded, unless the @option{-G} command line
5266 option is supplied at startup (@pxref{Limits control, , Invoking m4}).
5269 This module provides compatibility with System V @code{m4}, for anything
5270 not specified by @acronym{POSIX}, and is loaded instead of the
5271 @samp{gnu} module if the @option{-G} command line option is specified.
5273 @deffn {Macro (traditional)} __traditional__
5274 Expands to the empty string, as an indication that the
5275 @samp{traditional} module is loaded.
5279 This module supplies the builtins required to use modules from within a
5280 @acronym{GNU} @code{m4} program. @xref{Modules}, for more details. The
5281 module also defines the following macro:
5283 @deffn {Macro (load)} __load__
Expands to the empty string, as an indication that the @samp{load}
module is loaded.
5289 This module provides the implementation for the experimental
5290 @code{mpeval} feature. If the host machine does not have the
5291 @acronym{GNU} gmp library, the builtin will generate an error if called.
@xref{Mpeval}, for more details. The module also defines the following
macro:
5295 @deffn {Macro (mpeval)} __mpeval__
Expands to the empty string, as an indication that the @samp{mpeval}
module is loaded.
5301 Here is an example of using the feature test macros.
5305 __gnu__-__traditional__
5306 @result{}-__traditional__
5307 ifdef(`__gnu__', `Extensions are active', `Minimal features')
5308 @result{}Extensions are active
5311 @comment options: -G
5313 $ @kbd{m4 --traditional}
5314 __gnu__-__traditional__
5316 ifdef(`__gnu__', `Extensions are active', `Minimal features')
5317 @result{}Minimal features
5321 @chapter Macros for text handling
5323 There are a number of builtins in @code{m4} for manipulating text in
5324 various ways, extracting substrings, searching, substituting, and so on.
5327 * Len:: Calculating length of strings
5328 * Index macro:: Searching for substrings
5329 * Regexp:: Searching for regular expressions
5330 * Substr:: Extracting substrings
5331 * Translit:: Translating characters
5332 * Patsubst:: Substituting text by regular expression
5333 * Format:: Formatting strings (printf-like)
5337 @section Calculating length of strings
5339 @cindex length of strings
5340 @cindex strings, length of
5341 The length of a string can be calculated by @code{len}:
5343 @deffn {Builtin (m4)} len (@var{string})
5344 Expands to the length of @var{string}, as a decimal number.
5346 The macro @code{len} is recognized only with parameters.
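
For illustration, here are a few calls whose results can be checked by
hand. The last line shows that the expansion of the inner @code{len} is
itself a string, here the two-character string @samp{10}.

@example
len(`')
@result{}0
len(`GNUs not Unix')
@result{}13
len(len(`abcdefghij'))
@result{}2
@end example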
5357 @section Searching for substrings
5359 Searching for substrings is done with @code{index}:
5361 @deffn {Builtin (m4)} index (@var{string}, @var{substring})
5362 Expands to the index of the first occurrence of @var{substring} in
5363 @var{string}. The first character in @var{string} has index 0. If
@var{substring} does not occur in @var{string}, @code{index} expands to
@samp{-1}.
5367 The macro @code{index} is recognized only with parameters.
@example
index(`gnus, gnats, and armadillos', `nat')
@result{}7
index(`gnus, gnats, and armadillos', `dag')
@result{}-1
@end example
5377 Omitting @var{substring} evokes a warning, but still produces output.
5381 @error{}m4:stdin:1: Warning: index: too few arguments: 1 < 2
5386 @section Searching for regular expressions
5388 @cindex regular expressions
5389 @cindex @acronym{GNU} extensions
Searching for regular expressions is done with the builtin
@code{regexp}:
5393 @deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax})
5394 @deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @
5395 @ovar{replacement}, @ovar{resyntax})
5396 Searches for @var{regexp} in @var{string}.
5398 If @var{resyntax} is given, the particular flavor of regular expression
5399 understood with respect to @var{regexp} can be changed from the current
5400 default. @xref{Changeresyntax}, for details of the values that can be
given for this argument. If exactly three arguments are given, then the
5402 third argument is treated as @var{resyntax} only if it matches a known
5403 syntax name, otherwise it is treated as @var{replacement}.
5405 If @var{replacement} is omitted, @code{regexp} expands to the index of
5406 the first match of @var{regexp} in @var{string}. If @var{regexp} does
5407 not match anywhere in @var{string}, it expands to -1.
5409 If @var{replacement} is supplied, and there was a match, @code{regexp}
5410 changes the expansion to this argument, with @samp{\@var{n}} substituted
5411 by the text matched by the @var{n}th parenthesized sub-expression of
5412 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
5413 replaced by the text of the entire regular expression matched. For
5414 all other characters, @samp{\} treats the next character literally. A
5415 warning is issued if there were fewer sub-expressions than the
5416 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
5417 was no match, @code{regexp} expands to the empty string.
5419 The macro @code{regexp} is recognized only with parameters.
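
The two modes of operation can be sketched as follows, assuming the
default regular expression syntax is in effect; the input string and the
replacement text are made up for this illustration. With two arguments
the expansion is the index of the match, whereas with a replacement the
expansion consists of the replacement alone, not of the text surrounding
the match.

@example
regexp(`version 1.4.9', `[0-9]')
@result{}8
regexp(`version 1.4.9', `\([0-9]+\)\.\([0-9]+\)', `major=\1 minor=\2')
@result{}major=1 minor=4
@end example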
5423 regexp(`GNUs not Unix', `\<[a-z]\w+')
5425 regexp(`GNUs not Unix', `\<Q\w*')
5427 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
5428 @result{}*** Unix *** nix ***
5429 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
5433 Here are some more examples on the handling of backslash:
5436 regexp(`abc', `\(b\)', `\\\10\a')
5438 regexp(`abc', `b', `\1\')
5439 @error{}m4:stdin:2: Warning: regexp: sub-expression 1 not present
5440 @error{}m4:stdin:2: Warning: regexp: trailing \ ignored in replacement
5442 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
5443 @error{}m4:stdin:3: Warning: regexp: sub-expression 4 not present
5444 @error{}m4:stdin:3: Warning: regexp: sub-expression 5 not present
5445 @error{}m4:stdin:3: Warning: regexp: sub-expression 6 not present
5449 Omitting @var{regexp} evokes a warning, but still produces output.
5453 @error{}m4:stdin:1: Warning: regexp: too few arguments: 1 < 2
5457 If @var{resyntax} is given, @var{regexp} must be given according to
5458 the syntax chosen, though the default regular expression syntax
5459 remains unchanged for other invocations:
5462 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***',
5464 @result{}*** Unix *** nix ***
5465 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***')
Occasionally, you might want to pass a @var{resyntax} argument without
5470 wishing to give @var{replacement}. If there are exactly three
5471 arguments, and the last argument is a valid @var{resyntax}, it is used
5472 as such, rather than as a replacement.
5475 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED')
5477 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED')
5478 @result{}POSIX_EXTENDED
5479 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `')
5481 regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `')
5482 @result{}POSIX_EXTENDED
5486 @section Extracting substrings
5488 @cindex extracting substrings
5489 @cindex substrings, extracting
5490 Substrings are extracted with @code{substr}:
5492 @deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length})
5493 Expands to the substring of @var{string}, which starts at index
5494 @var{from}, and extends for @var{length} characters, or to the end of
@var{string}, if @var{length} is omitted. The starting index of a string
is always 0. The expansion is empty if there is an error parsing
5497 @var{from} or @var{length}, if @var{from} is beyond the end of
5498 @var{string}, or if @var{length} is negative.
5500 The macro @code{substr} is recognized only with parameters.
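
@code{substr} combines naturally with @code{index} (@pxref{Index macro}).
As a sketch, the hypothetical macro @code{after_colon} below, which is
not part of @code{m4}, expands to whatever follows the first colon in
its argument. When there is no colon, @code{index} yields @samp{-1},
@code{incr} turns that into 0, and the whole string is returned.

@example
define(`after_colon', `substr(`$1', incr(index(`$1', `:')))')dnl
after_colon(`key:value')
@result{}value
after_colon(`no delimiter')
@result{}no delimiter
@end example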
5504 substr(`gnus, gnats, and armadillos', `6')
5505 @result{}gnats, and armadillos
5506 substr(`gnus, gnats, and armadillos', `6', `5')
5510 Omitting @var{from} evokes a warning, but still produces output.
5514 @error{}m4:stdin:1: Warning: substr: too few arguments: 1 < 2
5517 @error{}m4:stdin:2: Warning: substr: empty string treated as 0
5522 @section Translating characters
5524 @cindex translating characters
5525 @cindex characters, translating
5526 Character translation is done with @code{translit}:
5528 @deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement})
5529 Expands to @var{string}, with each character that occurs in
@var{chars} translated into the character from @var{replacement} with
the same index.
5533 If @var{replacement} is shorter than @var{chars}, the excess characters
5534 of @var{chars} are deleted from the expansion; if @var{chars} is
5535 shorter, the excess characters in @var{replacement} are silently
5536 ignored. If @var{replacement} is omitted, all characters in
5537 @var{string} that are present in @var{chars} are deleted from the
5538 expansion. If a character appears more than once in @var{chars}, only
5539 the first instance is used in making the translation. Only a single
5540 translation pass is made, even if characters in @var{replacement} also
5541 appear in @var{chars}.
5543 As a @acronym{GNU} extension, both @var{chars} and @var{replacement} can
5544 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
5545 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
5546 in @var{chars} or @var{replacement}, place it first or last in the
5547 entire string, or as the last character of a range. Back-to-back ranges
5548 can share a common endpoint. It is not an error for the last character
5549 in the range to be `larger' than the first. In that case, the range
5550 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
5551 The expansion of a range is dependent on the underlying encoding of
5552 characters, so using ranges is not always portable between machines.
5554 The macro @code{translit} is recognized only with parameters.
5558 translit(`GNUs not Unix', `A-Z')
5560 translit(`GNUs not Unix', `a-z', `A-Z')
5561 @result{}GNUS NOT UNIX
5562 translit(`GNUs not Unix', `A-Z', `z-a')
5563 @result{}tmfs not fnix
5564 translit(`+,-12345', `+--1-5', `<;>a-c-a')
5566 translit(`abcdef', `aabdef', `bcged')
5570 In the @sc{ascii} encoding, the first example deletes all uppercase
5571 letters, the second converts lowercase to uppercase, and the third
5572 `mirrors' all uppercase letters, while converting them to lowercase.
The first two cases are by far the most common, even though they are not
5574 portable to @sc{ebcdic} or other encodings. The fourth example shows a
5575 range ending in @samp{-}, as well as back-to-back ranges. The final
5576 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
5577 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
5578 @samp{e} are swapped, and the @samp{f} is discarded.
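
A few further uses can be sketched as follows, assuming an @sc{ascii}
encoding. The first call deletes a literal dash by listing it as the
only character of @var{chars}, the second replaces one character with
another, and the hypothetical macro @code{rot13}, defined here purely
for illustration, uses back-to-back ranges to rotate letters by thirteen
places.

@example
translit(`2007-04-25', `-')
@result{}20070425
translit(`hello world', ` ', `_')
@result{}hello_world
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')dnl
rot13(`Hello')
@result{}Uryyb
rot13(rot13(`Hello'))
@result{}Hello
@end example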
5580 Omitting @var{chars} evokes a warning, but still produces output.
5584 @error{}m4:stdin:1: Warning: translit: too few arguments: 1 < 2
5589 @section Substituting text by regular expression
5591 @cindex regular expressions
5592 @cindex pattern substitution
5593 @cindex substitution by regular expression
5594 @cindex @acronym{GNU} extensions
5595 Global substitution in a string is done by @code{patsubst}:
5597 @deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @
5598 @ovar{replacement}, @ovar{resyntax})
5599 Searches @var{string} for matches of @var{regexp}, and substitutes
5600 @var{replacement} for each match.
5602 If @var{resyntax} is given, the particular flavor of regular expression
5603 understood with respect to @var{regexp} can be changed from the current
5604 default. @xref{Changeresyntax}, for details of the values that can be
given for this argument. Unlike @code{regexp}, if exactly three
arguments are given, the third argument is always treated as
5607 @var{replacement}, even if it matches a known syntax name.
5609 The parts of @var{string} that are not covered by any match of
5610 @var{regexp} are copied to the expansion. Whenever a match is found, the
5611 search proceeds from the end of the match, so a character from
5612 @var{string} will never be substituted twice. If @var{regexp} matches a
5613 string of zero length, the start position for the search is incremented,
5614 to avoid infinite loops.
5616 When a replacement is to be made, @var{replacement} is inserted into
5617 the expansion, with @samp{\@var{n}} substituted by the text matched by
the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
5619 nine sub-expressions. The escape @samp{\&} is replaced by the text of
5620 the entire regular expression matched. For all other characters,
5621 @samp{\} treats the next character literally. A warning is issued if
5622 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
5623 if there is a trailing @samp{\}.
5625 The @var{replacement} argument can be omitted, in which case the text
5626 matched by @var{regexp} is deleted.
5628 The macro @code{patsubst} is recognized only with parameters.
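
As a short sketch under the default regular expression syntax, the first
call below collapses each run of spaces into a single space, and the
second reorders the fields of a date by means of sub-expression
references. The input strings are invented for this illustration.

@example
patsubst(`a  b   c', ` +', ` ')
@result{}a b c
patsubst(`2007-04-25', `\([0-9]+\)-\([0-9]+\)-\([0-9]+\)', `\3/\2/\1')
@result{}25/04/2007
@end example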
5631 When used with two arguments, @code{regexp} returns the position of the
5632 match, but @code{patsubst} deletes the match:
5635 patsubst(`GNUs not Unix', `^', `OBS: ')
5636 @result{}OBS: GNUs not Unix
5637 patsubst(`GNUs not Unix', `\<', `OBS: ')
5638 @result{}OBS: GNUs OBS: not OBS: Unix
5639 patsubst(`GNUs not Unix', `\w*', `(\&)')
5640 @result{}(GNUs)() (not)() (Unix)()
5641 patsubst(`GNUs not Unix', `\w+', `(\&)')
5642 @result{}(GNUs) (not) (Unix)
5643 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
5644 @result{}GN not@w{ }
5645 patsubst(`GNUs not Unix', `not', `NOT\')
5646 @error{}m4:stdin:6: Warning: patsubst: trailing \ ignored in replacement
5647 @result{}GNUs NOT Unix
5650 Here is a slightly more realistic example, which capitalizes individual
5651 words or whole sentences, by substituting calls of the macros
5652 @code{upcase} and @code{downcase} into the strings.
5654 @deffn Composite upcase (@var{text})
5655 @deffnx Composite downcase (@var{text})
5656 @deffnx Composite capitalize (@var{text})
5657 Expand to @var{text}, but with capitalization changed: @code{upcase}
5658 changes all letters to upper case, @code{downcase} changes all letters
5659 to lower case, and @code{capitalize} changes the first character of each
5660 word to upper case and the remaining characters to lower case.
5664 define(`upcase', `translit(`$*', `a-z', `A-Z')')dnl
5665 define(`downcase', `translit(`$*', `A-Z', `a-z')')dnl
5666 define(`capitalize1',
5667 `regexp(`$1', `^\(\w\)\(\w*\)',
5668 `upcase(`\1')`'downcase(`\2')')')dnl
5669 define(`capitalize',
5670 `patsubst(`$1', `\w+', `capitalize1(`\&')')')dnl
5671 capitalize(`GNUs not Unix')
5672 @result{}Gnus Not Unix
5675 If @var{resyntax} is given, @var{regexp} must be given according to
5676 the syntax chosen, though the default regular expression syntax
5677 remains unchanged for other invocations:
5681 `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl
5682 epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_')
5683 @result{}_bar_ _baz_
5684 patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_')
5685 @result{}_bar_ _baz_
5688 While @code{regexp} replaces the whole input with the replacement as
5689 soon as there is a match, @code{patsubst} replaces each
5690 @emph{occurrence} of a match and preserves non-matching pieces:
5696 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
5697 @result{}bar FOO baz FOO
5699 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
5700 @result{}bab abb 212
5704 Omitting @var{regexp} evokes a warning, but still produces output.
5708 @error{}m4:stdin:1: Warning: patsubst: too few arguments: 1 < 2
5713 @section Formatting strings (printf-like)
5715 @cindex formatted output
5716 @cindex output, formatted
5717 @cindex @acronym{GNU} extensions
5718 Formatted output can be made with @code{format}:
5720 @deffn {Builtin (gnu)} format (@var{format-string}, @dots{})
5721 Works much like the C function @code{printf}. The first argument
5722 @var{format-string} can contain @samp{%} specifications which are
5723 satisfied by additional arguments, and the expansion of @code{format} is
5724 the formatted string.
5726 The macro @code{format} is recognized only with parameters.
5729 Its use is best described by a few examples:
5732 define(`foo', `The brown fox jumped over the lazy dog')
5734 format(`The string "%s" uses %d characters', foo, len(foo))
5735 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
5736 format(`%.0f', `56789.9876')
5738 len(format(`%-*X', `5000', `1'))
5742 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
5743 example shows how @code{format} can be used to produce tabular output.
5747 $ @kbd{m4 -I examples}
5748 include(`forloop.m4')
5750 forloop(`i', `1', `10', `format(`%6d squared is %10d
5752 @result{} 1 squared is 1
5753 @result{} 2 squared is 4
5754 @result{} 3 squared is 9
5755 @result{} 4 squared is 16
5756 @result{} 5 squared is 25
5757 @result{} 6 squared is 36
5758 @result{} 7 squared is 49
5759 @result{} 8 squared is 64
5760 @result{} 9 squared is 81
5761 @result{} 10 squared is 100
5765 The builtin @code{format} is modeled after the ANSI C @samp{printf}
5766 function, and supports these @samp{%} specifiers: @samp{c},
5767 @samp{s}, @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{e},
5768 @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and @samp{%}; it
5769 supports field widths and precisions, and the
5770 modifiers @samp{+}, @samp{-}, @samp{@w{ }}, @samp{0}, @samp{#}, @samp{h} and
5771 @samp{l}. For more details on the functioning of @code{printf}, see the
5774 @c FIXME - format still needs some improvements.
5775 For now, unrecognized specifiers are silently ignored, but it is
5776 anticipated that a future release of @acronym{GNU} @code{m4} will support more
5777 specifiers, and give warnings when problems are encountered. Likewise,
5778 escape sequences are not yet recognized.
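
As a brief sketch of the flags, field widths and precisions listed
above, the following calls pad a number with zeros, left-justify a
string in a field of six characters, print a number in hexadecimal, and
emit a literal percent sign. The argument values are arbitrary.

@example
format(`%05d|%-6s|%x', `42', `ab', `255')
@result{}00042|ab    |ff
format(`%.2f%%', `99.5')
@result{}99.50%
@end example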
5781 @chapter Macros for doing arithmetic
5784 @cindex integer arithmetic
5785 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
5786 convenient shorthands, there are builtins for simple increment and
5787 decrement operations.
5790 * Incr:: Decrement and increment operators
5791 * Eval:: Evaluating integer expressions
5792 * Mpeval:: Multiple precision arithmetic
5796 @section Decrement and increment operators
5798 @cindex decrement operator
5799 @cindex increment operator
5800 Increment and decrement of integers are supported using the builtins
5801 @code{incr} and @code{decr}:
5803 @deffn {Builtin (m4)} incr (@var{number})
5804 @deffnx {Builtin (m4)} decr (@var{number})
5805 Expand to the numerical value of @var{number}, incremented
5806 or decremented, respectively, by one. Except for the empty string, the
5807 expansion is empty if @var{number} could not be parsed.
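
A common idiom built on @code{incr} is a counter that is redefined each
time it is used. The macros @code{counter} and @code{next} in this
sketch are hypothetical names chosen for illustration; they are not
provided by @code{m4}.

@example
define(`counter', `0')dnl
define(`next', `define(`counter', incr(counter))counter')dnl
next next next
@result{}1 2 3
@end example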
The macros @code{incr} and @code{decr} are recognized only with
parameters.
5819 @error{}m4:stdin:3: Warning: incr: empty string treated as 0
5822 @error{}m4:stdin:4: Warning: decr: empty string treated as 0
5830 @section Evaluating integer expressions
5832 @cindex integer expression evaluation
5833 @cindex evaluation, of integer expressions
5834 @cindex expressions, evaluation of integer
5835 Integer expressions are evaluated with @code{eval}:
5837 @deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
5838 Expands to the value of @var{expression}. The expansion is empty
5839 if a problem is encountered while parsing the arguments. If specified,
5840 @var{radix} and @var{width} control the format of the output.
5842 Calculations are done with signed numbers, using at least 31-bit
5843 precision, but as a @acronym{GNU} extension, @code{m4} will use wider
5844 integers if available. Precision is finite, based on the platform's
5845 notion of @code{intmax_t}, and overflow silently results in wraparound.
5846 A warning is issued if division by zero is attempted, or if
5847 @var{expression} could not be parsed.
5849 Expressions can contain the following operators, listed in order of
5850 decreasing precedence.
5856 Unary plus and minus, and bitwise and logical negation
5860 Multiplication, division, modulo, and ratio
5862 Addition and subtraction
5864 Shift left, shift right, unsigned shift right
5866 Relational operators
5872 Bitwise exclusive-or
5882 Sequential evaluation
5885 The macro @code{eval} is recognized only with parameters.
5888 All binary operators, except exponentiation, are left associative. C
5889 operators that perform variable assignment, such as @samp{=} or
@samp{--}, are forbidden by @acronym{POSIX}, since @code{eval} only
operates on constants, not variables. Attempting to use them results
in an error.
5893 @comment fixme If XCU ERN 137 is approved, then we could provide an
5894 @comment extension that supported assignment operators.
Note that some older @code{m4} implementations use @samp{^} as an
alternate operator for exponentiation, although @acronym{POSIX}
5898 requires the C behavior of bitwise exclusive-or. The precedence of the
5899 negation operators, @samp{~} and @samp{!}, was traditionally lower than
5900 equality. The unary operators could not be used reliably more than once
5901 on the same term without intervening parentheses. The traditional
5902 precedence of the equality operators @samp{==} and @samp{!=} was
5903 identical instead of lower than the relational operators such as
5904 @samp{<}, even through @acronym{GNU} M4 1.4.8. Starting with version
5905 1.4.9, @acronym{GNU} M4 correctly follows @acronym{POSIX} precedence
5906 rules. M4 scripts designed to be portable between releases must be
5907 aware that parentheses may be required to enforce C precedence rules.
5908 Likewise, division by zero, even in the unused branch of a
5909 short-circuiting operator, is not always well-defined in other
5912 Following are some examples where the current version of M4 follows C
5913 precedence rules, but where older versions and some other
implementations of @code{m4} require explicit parentheses to get the
correct result:
5921 eval(`(1 == 2) > 0')
5931 eval(`+ + - ~ ! ~ 0')
5934 @error{}m4:stdin:8: eval: invalid operator: ++0
5937 @error{}m4:stdin:9: eval: invalid operator: 1 = 1
5940 @error{}m4:stdin:10: eval: invalid operator: 0 |= 1
5945 @error{}m4:stdin:12: Warning: eval: divide by zero: 0 || 1 / 0
5950 @error{}m4:stdin:14: Warning: eval: modulo by zero: 2 && 1 % 0
5954 @cindex @acronym{GNU} extensions
5955 As a @acronym{GNU} extension, @code{eval} supports several operators
5956 that do not appear in C. A right-associative exponentiation operator
5957 @samp{**} computes the value of the left argument raised to the right,
5958 modulo the numeric precision width. If evaluated, the exponent must be
5959 non-negative, and at least one of the arguments must be non-zero, or a
5960 warning is issued. An unsigned shift operator @samp{>>>} allows
5961 shifting a negative number as though it were an unsigned bit pattern,
5962 which shifts in 0 bits rather than twos-complement sign-extension. A
5963 ratio operator @samp{\} behaves like normal division @samp{/} on
5964 integers, but is provided for symmetry with @code{mpeval}.
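
The behavior described above can be sketched with a few small
expressions. The second line relies on @samp{**} being
right-associative, so it computes 2 raised to the power 9, and the third
shows the sign extension performed by the ordinary right shift.

@example
eval(`2 ** 10')
@result{}1024
eval(`2 ** 3 ** 2')
@result{}512
eval(`-8 >> 1')
@result{}-4
eval(`7 \ 2')
@result{}3
@end example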
5969 eval(`(2 ** 3) ** 2')
5977 @error{}m4:stdin:5: Warning: eval: divide by zero: 0 ** 0
5979 @error{}m4:stdin:6: Warning: eval: negative exponent: 4 ** -2
5981 eval(`2 || 4 ** -2')
5983 eval(`(-1 >> 1) == -1')
5985 eval(`(-1 >>> 1) > (1 << 30)')
5991 Furthermore, when you do not use the @option{--traditional} command line
5992 option (or @option{-G}, @pxref{Limits control, , Invoking m4}), the C
5993 operators @samp{,} and @samp{?:} are supported. But in traditional
mode, @acronym{POSIX} requires that the use of these two operators cause
an error.
6011 @comment options: -G
6016 @error{}m4:stdin:1: eval: invalid operator: 1?2:3
6019 @error{}m4:stdin:2: eval: invalid operator: 4,5
6023 Within @var{expression}, (but not @var{radix} or @var{width}), numbers
6024 without a special prefix are decimal. A simple @samp{0} prefix
6025 introduces an octal number. @samp{0x} introduces a hexadecimal number.
As @acronym{GNU} extensions, @samp{0b} introduces a binary number, and
@samp{0r} introduces a number expressed in any radix between 1 and 36:
6028 the prefix should be immediately followed by the decimal expression of
6029 the radix, a colon, then the digits making the number. For radix 1,
6030 leading zeros are ignored, and all remaining digits must be @samp{1};
6031 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
6032 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
to @samp{z}. Lower and upper case letters can be used interchangeably
in number prefixes and as number digits.
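
As a quick sketch of these prefixes, the first expression adds a
hexadecimal, an octal and a binary number (31, 8 and 5), and the second
adds the base-36 digit @samp{z} (35) to the binary number @samp{11} (3).

@example
eval(`0x1f + 010 + 0b101')
@result{}44
eval(`0r36:z + 0r2:11')
@result{}38
@end example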
6036 Parentheses may be used to group subexpressions whenever needed. For the
relational operators, a true relation returns @code{1}, and a false
relation returns @code{0}.
6040 Here are a few examples of use of @code{eval}.
6045 eval(index(`Hello world', `llo') >= 0)
6047 eval(`0r1:0111 + 0b100 + 0r3:12')
6049 define(`square', `eval(`($1) ** 2')')
6053 square(square(`5')` + 1')
6055 define(`foo', `666')
6058 @error{}m4:stdin:8: Warning: eval: bad expression: foo / 6
6064 As the last two lines show, @code{eval} does not handle macro
6065 names, even if they expand to a valid expression (or part of a valid
6066 expression). Therefore all macros must be expanded before they are
6067 passed to @code{eval}.
6068 @comment update this if we add support for variables.
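
One way to arrange for that expansion, sketched here with the
hypothetical macros @code{width} and @code{height}, is to leave the
macro names unquoted and to quote only the operator text between them.
The argument is then collected as @samp{8 * 5} before @code{eval} sees
it.

@example
define(`width', `8')dnl
define(`height', `5')dnl
eval(width` * 'height)
@result{}40
@end example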
6070 Some calculations are not portable to other implementations, since they
6071 have undefined semantics in C, but @acronym{GNU} @code{m4} has
6072 well-defined behavior on overflow. When shifting, an out-of-range shift
6073 amount is implicitly brought into the range of the precision using
6074 modulo arithmetic (for example, on 32-bit integers, this would be an
6075 implicit bit-wise and with 0x1f). This example should work whether your
6076 platform uses 32-bit integers, 64-bit integers, or even some other
6080 define(`max_int', eval(`-1 >>> 1'))
6082 define(`min_int', eval(max_int` + 1'))
6088 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
6089 @result{}overflow occurred
6090 eval(`0x80000000 % -1')
6094 eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2'))
6098 If @var{radix} is specified, it specifies the radix to be used in the
6099 expansion. The default radix is 10; this is also the case if
6100 @var{radix} is the empty string. A warning results if the radix is
6101 outside the range of 1 through 36, inclusive. The result of @code{eval}
6102 is always taken to be signed. No radix prefix is output, and for
6103 radices greater than 10, the digits are lower case. The @var{width}
6104 argument specifies the minimum output width, excluding any negative
6105 sign. The result is zero-padded to extend the expansion to the
6106 requested width. A warning results if the width is negative. If
6107 @var{radix} or @var{width} is out of bounds, the expansion of
6108 @code{eval} is empty.
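
As a small sketch of these two arguments, the calls below print a value
in hexadecimal and then in binary, zero-padded to eight digits. Note
that the minus sign in the last line is not counted as part of the
width.

@example
eval(`255', `16')
@result{}ff
eval(`5', `2', `8')
@result{}00000101
eval(`-5', `2', `8')
@result{}-00000101
@end example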
6117 eval(`666', `6', `10')
6119 eval(`-666', `6', `10')
6120 @result{}-0000003030
6123 `0r1:'eval(`10', `1', `11')
6124 @result{}0r1:01111111111
6128 @error{}m4:stdin:9: Warning: eval: radix out of range: 37
6131 @error{}m4:stdin:10: Warning: eval: negative width: -1
6134 @error{}m4:stdin:11: Warning: eval: empty string treated as zero
6139 @section Multiple precision arithmetic
6141 When @code{m4} is compiled with a multiple precision arithmetic library
6142 (@pxref{Experiments}), a builtin @code{mpeval} is defined.
6144 @deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @
6146 Behaves similarly to @code{eval}, except the calculations are done with
6147 infinite precision, and rational numbers are supported. Numbers may be
6150 The macro @code{mpeval} is recognized only with parameters.
6153 For the most part, using @code{mpeval} is similar to using @code{eval}:
6155 @comment options: -m mpeval
6157 $ @kbd{m4 -m mpeval}
6158 mpeval(`(1 << 70) + 2 ** 68 * 3', `16')
6159 @result{}700000000000000000
6160 `0r24:'mpeval(`0r36:zYx', `24', `5')
6164 The ratio operator, @samp{\}, is provided with the same precedence as
6165 division, and rationally divides two numbers and canonicalizes the
6166 result, whereas the division operator @samp{/} always returns the
6167 integer quotient of the division. To convert a rational value to
6168 integral, divide (@samp{/}) by 1. Some operators, such as @samp{%},
6169 @samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate
6170 only on integers and will truncate any rational remainder. The unsigned
6171 shift operator, @samp{>>>}, behaves identically to regular right
6172 shifts, @samp{>>}, since with infinite precision, it is not possible to
6173 convert a negative number to a positive using shifts. The
6174 exponentiation operator, @samp{**}, assumes that the exponent is
6175 integral, but allows negative exponents. With the short-circuit logical
6176 operators, @samp{||} and @samp{&&}, a non-zero result preserves the
6177 value of the argument that ended evaluation, rather than collapsing to
6178 @samp{1}. The operators @samp{?:} and @samp{,} are always available,
6179 even in @acronym{POSIX} mode, since @code{mpeval} does not have to
6180 conform to the @acronym{POSIX} rules for @code{eval}.
6182 @comment options: -m mpeval
6184 $ @kbd{m4 -m mpeval}
6199 @node Shell commands
6200 @chapter Macros for running shell commands
6202 @cindex executing shell commands
6203 @cindex running shell commands
6204 @cindex shell commands, running
6205 @cindex UNIX commands, running
6206 @cindex commands, running shell
6207 There are a few builtin macros in @code{m4} that allow you to run shell
6208 commands from within @code{m4}.
6210 Note that the definition of a valid shell command is system dependent.
6211 On UNIX systems, the shell is typically @code{/bin/sh}. But on other
6212 systems, such as native Windows, the shell understands a different
6213 command syntax. Some examples in this chapter assume
6214 @code{/bin/sh}, and also demonstrate how to quit early with a known
6215 exit value if this is not the case.
6218 * Platform macros:: Determining the platform
6219 * Syscmd:: Executing simple commands
6220 * Esyscmd:: Reading the output of commands
6221 * Sysval:: Exit status
6222 * Mkstemp:: Making temporary files
6223 * Mkdtemp:: Making temporary directories
6226 @node Platform macros
6227 @section Determining the platform
6228 Sometimes it is desirable for an input file to know which platform
6229 @code{m4} is running on. @acronym{GNU} @code{m4} provides several
6230 macros that are predefined to expand to the empty string; checking for
6231 their existence will confirm platform details.
6233 @deffn {Optional builtin (gnu)} __os2__
6234 @deffnx {Optional builtin (traditional)} os2
6235 @deffnx {Optional builtin (gnu)} __unix__
6236 @deffnx {Optional builtin (traditional)} unix
6237 @deffnx {Optional builtin (gnu)} __windows__
6238 @deffnx {Optional builtin (traditional)} windows
6239 Each of these macros is conditionally defined as needed to describe the
6240 environment of @code{m4}. If defined, each macro expands to the empty
6244 @cindex platform macro
6245 On UNIX systems, @acronym{GNU} @code{m4} will define @code{@w{__unix__}}
6246 in the @samp{gnu} module, and @code{unix} in the @samp{traditional}
6249 On native Windows systems, @acronym{GNU} @code{m4} will define
6250 @code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in
6251 the @samp{traditional} module.
6253 On OS/2 systems, @acronym{GNU} @code{m4} will define @code{@w{__os2__}}
6254 in the @samp{gnu} module, and @code{os2} in the @samp{traditional}
6257 If @acronym{GNU} M4 does not provide a platform macro for your system,
6258 please report that as a bug.
6261 define(`provided', `0')
6263 ifdef(`__unix__', `define(`provided', incr(provided))')
6265 ifdef(`__windows__', `define(`provided', incr(provided))')
6267 ifdef(`__os2__', `define(`provided', incr(provided))')
6274 @section Executing simple commands
6276 @deffn {Builtin (m4)} syscmd (@var{shell-command})
6277 Any shell command can be executed, using @code{syscmd}, which executes
6278 @var{shell-command} as a shell command.
6280 The expansion of @code{syscmd} is void, @emph{not} the output from
6281 @var{shell-command}! Output or error messages from @var{shell-command}
6282 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
6285 Prior to executing the command, @code{m4} flushes its output buffers.
6286 The default standard input, output and error of @var{shell-command} are
6287 the same as those of @code{m4}.
6289 When the @option{--safer} option (@pxref{Operation modes, , Invoking
6290 m4}) is in effect, @code{syscmd} results in an error, since otherwise an
6291 input file could execute arbitrary code.
6293 The builtin macro @code{syscmd} is recognized only when given arguments.
6297 define(`foo', `FOO')
6304 Note how the output of the command keeps its trailing newline, and is
6305 followed by the newline that appeared after the macro call.
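
Because the expansion is void, wrapping a call to @code{syscmd} in a
macro definition stores nothing. The following sketch assumes a
@code{/bin/sh}-style shell; the macro name @code{greeting} is purely
illustrative:

@example
define(`greeting', syscmd(`echo hello'))
@result{}hello
@result{}
len(defn(`greeting'))
@result{}0
@end example

The text @samp{hello} goes straight to standard output while the
arguments of @code{define} are being collected; it never becomes part
of the definition.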
6307 @comment options: --safer
6312 @error{}m4:stdin:1: syscmd: disabled by --safer
6317 @section Reading the output of commands
6319 @cindex @acronym{GNU} extensions
6320 @deffn {Builtin (gnu)} esyscmd (@var{shell-command})
6321 If you want @code{m4} to read the output of a shell command, use
6322 @code{esyscmd}, which expands to the standard output of the shell
6323 command @var{shell-command}.
6325 Prior to executing the command, @code{m4} flushes its output buffers.
6326 The default standard input and standard error of @var{shell-command} are
6327 the same as those of @code{m4}. The error output of @var{shell-command}
6328 is not a part of the expansion: it will appear along with the error
6329 output of @code{m4}.
6332 define(`foo', `FOO')
6339 Note how the expansion of @code{esyscmd} keeps the trailing newline of
6340 the command, as well as using the newline that appeared after the macro.
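
When the trailing newline is unwanted, one possible approach is to
delete it with @code{translit}. This is only a sketch: it assumes a
@code{/bin/sh}-style shell, the macro name @code{host} and the string
@samp{myhost} are hypothetical, and it removes @emph{every} newline in
the command output, not just the last one.

@example
define(`host', translit(esyscmd(`echo myhost'), `
'))
@result{}
`<'host`>'
@result{}<myhost>
@end example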
6342 When the @option{--safer} option (@pxref{Operation modes, , Invoking
6343 m4}) is in effect, @code{esyscmd} results in an error, since otherwise
6344 an input file could execute arbitrary code.
6346 The builtin macro @code{esyscmd} is recognized only when given
6350 @comment options: --safer
6355 @error{}m4:stdin:1: esyscmd: disabled by --safer
6360 @section Exit status
6362 @cindex exit code from shell commands
6363 @cindex shell commands, exit code from
6364 @cindex UNIX commands, exit code from
6365 @cindex commands, exit code from shell
6366 @deffn {Builtin (m4)} sysval
6367 To see whether a shell command succeeded, use @code{sysval}, which
6368 expands to the exit status of the last shell command run with
6369 @code{syscmd} or @code{esyscmd}.
6375 ifelse(sysval, 0, zero, nonzero)
6383 When the @option{--safer} option (@pxref{Operation modes, , Invoking
6384 m4}) is in effect, @code{sysval} will always remain at its default value
6387 @comment options: --safer
6394 @error{}m4:stdin:2: syscmd: disabled by --safer
6401 @section Making temporary files
6403 @cindex temporary file names
6404 @cindex files, names of temporary
6405 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6406 temporary file, for output or for some other purpose. There is a
6407 builtin macro, @code{mkstemp}, for making a temporary file:
6409 @deffn {Builtin (m4)} mkstemp (@var{template})
6410 @deffnx {Builtin (m4)} maketemp (@var{template})
6411 Expands to a name of a new, empty file, made from the string
6412 @var{template}, which should end with the string @samp{XXXXXX}. The six
6413 @samp{X} characters are then replaced with random characters matching
6414 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
6415 name unique. If fewer than six @samp{X} characters are found at the end
6416 of @var{template}, the result will be longer than the template. The
6417 created file will have access permissions as if by @kbd{chmod =rw,go=},
6418 meaning that the current umask of the @code{m4} process is taken into
6419 account, and at most only the current user can read and write the file.
6421 The traditional behavior, standardized by @acronym{POSIX}, is that
6422 @code{maketemp} merely replaces the trailing @samp{X} with the process
6423 id, without creating a file, and without ensuring that the resulting
6424 string is a unique file name. In part, this means that using the same
6425 @var{template} twice in the same input file will result in the same
6426 expansion. This behavior is a security hole, as it is very easy for
6427 another process to guess the name that will be generated, and thus
6428 interfere with a subsequent use of @code{syscmd} trying to manipulate
6429 that file name. Hence, @acronym{POSIX} has recommended that all new
6430 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
6431 and that users of @code{m4} check for its existence.
6433 The expansion is void and an error issued if a temporary file could
6436 When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
6437 is in effect, @code{mkstemp} and @acronym{GNU}-mode @code{maketemp}
6438 result in an error, since otherwise an input file could perform a mild
6439 denial-of-service attack by filling up a disk with multiple empty files.
6441 The macros @code{mkstemp} and @code{maketemp} are recognized only with
6445 If you try this next example, you will most likely get different output
6446 for the two file names, since the replacement characters are randomly
6452 maketemp(`/tmp/fooXXXXXX')
6453 @error{}m4:stdin:1: Warning: maketemp: recommend using mkstemp instead
6454 @result{}/tmp/fooa07346
6455 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
6456 `define(`mkstemp', defn(`maketemp'))dnl
6457 errprint(`warning: potentially insecure maketemp implementation
6464 @comment options: --safer
6468 maketemp(`/tmp/fooXXXXXX')
6469 @error{}m4:stdin:1: Warning: maketemp: recommend using mkstemp instead
6470 @error{}m4:stdin:1: maketemp: disabled by --safer
6472 mkstemp(`/tmp/fooXXXXXX')
6473 @error{}m4:stdin:2: mkstemp: disabled by --safer
6477 @cindex @acronym{GNU} extensions
6478 Unless you use the @option{--traditional} command line option (or
6479 @option{-G}, @pxref{Limits control, , Invoking m4}), the @acronym{GNU}
6480 version of @code{maketemp} is secure. This means that passing the same
6481 template to multiple calls will generate multiple files. However, we
6482 recommend that you use the new @code{mkstemp} macro, introduced in
6483 @acronym{GNU} M4 1.4.8, which is secure even in traditional mode.
6487 syscmd(`echo foo??????')dnl
6489 define(`file1', maketemp(`fooXXXXXX'))dnl
6490 @error{}m4:stdin:2: Warning: maketemp: recommend using mkstemp instead
6491 ifelse(esyscmd(`echo foo??????'), `foo??????', `no file', `created')
6493 define(`file2', maketemp(`fooXX'))dnl
6494 @error{}m4:stdin:4: Warning: maketemp: recommend using mkstemp instead
6495 define(`file3', mkstemp(`fooXXXXXX'))dnl
6496 ifelse(len(file1), len(file2), `same length', `different')
6497 @result{}same length
6498 ifelse(file1, file2, `same', `different file')
6499 @result{}different file
6500 ifelse(file2, file3, `same', `different file')
6501 @result{}different file
6502 ifelse(file1, file3, `same', `different file')
6503 @result{}different file
6504 syscmd(`rm 'file1 file2 file3)
6510 @comment options: -G
6513 define(`file1', maketemp(`fooXXXXXX'))dnl
6514 @error{}m4:stdin:1: Warning: maketemp: recommend using mkstemp instead
6515 define(`file2', maketemp(`fooXXXXXX'))dnl
6516 @error{}m4:stdin:2: Warning: maketemp: recommend using mkstemp instead
6517 ifelse(file1, file2, `same', `different file')
6519 syscmd(`echo foo??????')dnl
6524 @section Making temporary directories
6526 @cindex temporary directory
6527 @cindex directories, temporary
6528 @cindex @acronym{GNU} extensions
6529 Commands specified to @code{syscmd} or @code{esyscmd} might need a
6530 temporary directory, for holding multiple temporary files; such a
6531 directory can be created with @code{mkdtemp}:
6533 @deffn {Builtin (gnu)} mkdtemp (@var{template})
6534 Expands to a name of a new, empty directory, made from the string
6535 @var{template}, which should end with the string @samp{XXXXXX}. The six
6536 @samp{X} characters are then replaced with random characters matching
6537 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name
6538 unique. If fewer than six @samp{X} characters are found at the end of
6539 @var{template}, the result will be longer than the template. The
6540 created directory will have access permissions as if by @kbd{chmod
6541 =rwx,go=}, meaning that the current umask of the @code{m4} process is
6542 taken into account, and at most only the current user can read, write,
6543 and search the directory.
6545 The expansion is void and an error issued if a temporary directory could
6548 When the @option{--safer} option (@pxref{Operation modes, , Invoking m4})
6549 is in effect, @code{mkdtemp} results in an error, since otherwise an
6550 input file could perform a mild denial-of-service attack by filling up a
6551 disk with multiple directories.
6553 The macro @code{mkdtemp} is recognized only with parameters.
6554 This macro was added in M4 2.0.
6557 If you try this next example, you will most likely get different output
6558 for the directory names, since the replacement characters are randomly
6564 mkdtemp(`/tmp/fooXXXXXX')
6565 @result{}/tmp/foo2h89Vo
6570 @comment options: --safer
6574 mkdtemp(`/tmp/fooXXXXXX')
6575 @error{}m4:stdin:1: mkdtemp: disabled by --safer
6579 Multiple calls with the same template will generate multiple
6584 syscmd(`echo foo??????')dnl
6586 define(`dir1', mkdtemp(`fooXXXXXX'))dnl
6587 ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created')
6589 define(`dir2', mkdtemp(`fooXXXXXX'))dnl
6590 ifelse(dir1, dir2, `same', `different directories')
6591 @result{}different directories
6592 syscmd(`rmdir 'dir1 dir2)
6599 @chapter Miscellaneous builtin macros
6601 This chapter describes various builtins that do not really belong in
6602 any of the previous chapters.
6605 * Errprint:: Printing error messages
6606 * Location:: Printing current location
6607 * M4exit:: Exiting from @code{m4}
6608 * Syncoutput:: Turning on and off sync lines
6612 @section Printing error messages
6614 @cindex printing error messages
6615 @cindex error messages, printing
6616 @cindex messages, printing error
6617 You can print error messages using @code{errprint}:
6619 @deffn {Builtin (m4)} errprint (@var{message}, @dots{})
6620 Prints @var{message} and the rest of the arguments to standard error,
6621 separated by spaces. Standard error is used, regardless of the
6622 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
6624 The expansion of @code{errprint} is void.
6625 The macro @code{errprint} is recognized only with parameters.
6629 errprint(`Invalid arguments to forloop
6631 @error{}Invalid arguments to forloop
6633 errprint(`1')errprint(`2',`3
6639 A trailing newline is @emph{not} printed automatically, so it should be
6640 supplied as part of the argument, as in the example. Unfortunately, the
6641 exact output of @code{errprint} is not very portable to other @code{m4}
6642 implementations: @acronym{POSIX} requires that all arguments be printed,
6643 but some implementations of @code{m4} only print the first.
6644 Furthermore, some BSD implementations always append a newline for each
6645 @code{errprint} call, regardless of whether the last argument already
6646 had one, and @acronym{POSIX} is silent on whether this is acceptable.
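
Given these differences, a conservative sketch is to hand
@code{errprint} exactly one argument that already ends in a newline.
The macro name @code{warn} below is hypothetical and not part of
@code{m4}:

@example
define(`warn', `errprint(`warning: $*
')')
@result{}
warn(`disk', `almost full')
@error{}warning: disk,almost full
@result{}
@end example

Here @samp{$*} joins the arguments of @code{warn} with commas inside a
single quoted string, so the message does not depend on how many
arguments an implementation of @code{errprint} prints.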
6649 @section Printing current location
6651 To make it possible to specify the location of an error, three
6652 utility builtins exist:
6654 @deffn {Builtin (gnu)} __file__
6655 @deffnx {Builtin (gnu)} __line__
6656 @deffnx {Builtin (gnu)} __program__
6657 Expand to the quoted name of the current input file, the
6658 current input line number in that file, and the quoted name of the
6659 current invocation of @code{m4}.
6663 errprint(__program__:__file__:__line__: `input error
6665 @error{}m4:stdin:1: input error
6669 Line numbers start at 1 for each file. If the file was found due to the
6670 @option{-I} option or @env{M4PATH} environment variable, that is
6671 reflected in the file name. Synclines, via @code{syncoutput}
6672 (@pxref{Syncoutput}) or the command line option @option{--synclines}
6673 (or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the
6674 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}),
6675 also use this notion of current file and line. Redefining the three
6676 location macros has no effect on syncline, debug, warning, or error
6679 This example reuses the file @file{incl.m4} mentioned earlier
6684 $ @kbd{m4 -I examples}
6685 define(`foo', ``$0' called at __file__:__line__')
6688 @result{}foo called at stdin:2
6690 @result{}Include file start
6691 @result{}foo called at examples/incl.m4:2
6692 @result{}Include file end
6696 The location of macros invoked during the rescanning of macro expansion
6697 text corresponds to the location in the file where the expansion was
6698 triggered, regardless of how many newline characters the expansion text
6699 contains. As of @acronym{GNU} M4 1.4.8, the location of text wrapped
6700 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
6701 @code{m4wrap} was invoked. Previous versions, however, behaved as
6702 though wrapped text came from line 0 of the file ``''.
6705 define(`echo', `$@@')
6707 define(`foo', `echo(__line__
6717 foo(errprint(__line__
6731 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
6732 terminology. If you invoke @code{m4} through an absolute path or a link
6733 with a different spelling, rather than by relying on a @env{PATH} search
6734 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
6735 The intent is that you can use it to produce error messages with the
6736 same formatting that @code{m4} produces internally. It can also be used
6737 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
6738 @code{m4} that is currently running, rather than whatever version of
6739 @code{m4} happens to be first in @env{PATH}. It was first introduced in
6740 @acronym{GNU} M4 1.4.6.
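
As a minimal sketch of the @code{syscmd} use mentioned above, the
following runs the same @code{m4} binary again and checks that it
exited successfully. It assumes a @code{/bin/sh}-style shell with
@file{/dev/null} available, and that the name in
@code{@w{__program__}} can be resolved by that shell.

@example
syscmd(__program__` --version > /dev/null')
@result{}
sysval
@result{}0
@end example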
6743 @section Exiting from @code{m4}
6745 @cindex exiting from @code{m4}
6746 @deffn {Builtin (m4)} m4exit (@ovar{code})
6747 If you need to exit from @code{m4} before the entire input has been
6748 read, you can use @code{m4exit}, which causes @code{m4} to exit, with
6749 exit code @var{code}. If @var{code} is left out, the exit code is
6755 define(`fatal_error',
6756 `errprint(`m4:'__file__:__line__`: fatal error: $*
6759 fatal_error(`This is a BAD one, buster')
6760 @error{}m4:stdin:4: fatal error: This is a BAD one, buster
6763 After this macro call, @code{m4} will exit with exit code 1. This macro
6764 is only intended for error exits, since the normal exit procedures are
6765 not followed, e.g., diverted text is not undiverted, and saved text
6766 (@pxref{M4wrap}) is not reread.
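
The same builtin can stop processing early when an assumption about
the platform does not hold. The sketch below is illustrative only: the
exit status @samp{77} and the message text are arbitrary choices, and
the output shown is what a UNIX system would produce, where
@code{@w{__unix__}} is defined and the error branch is not taken.

@example
ifdef(`__unix__', ,
`errprint(`unsupported platform
')m4exit(`77')')
@result{}
@end example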
6769 @section Turning on and off sync lines
6771 @cindex Toggling sync lines within @code{m4}
6772 @deffn {Builtin (gnu)} syncoutput (@var{truth})
6773 If you need to toggle sync lines on and off while processing macros, or
6774 to ensure that they are off or on, you may do so using
6777 If @var{truth} matches the extended regular expression
6778 @samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the
6779 form: @code{#line <number> ["<file>"]}.
6781 If @var{truth} is empty, or matches the extended regular expression
6782 @samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off.
6784 Any other argument is ignored, and a warning is issued. The macro
6785 @code{syncoutput} is only recognized with arguments.
6789 @chapter Fast loading of frozen state
6791 Some bigger @code{m4} applications may be built over a common base
6792 containing hundreds of definitions and other costly initializations.
6793 Usually, the common base is kept in one or more declarative files,
6794 which are listed on each @code{m4} invocation prior to the
6795 user's input file, or else each input file uses @code{include}.
6797 Reading the common base of a big application, over and over again, may
6798 be time consuming. @acronym{GNU} @code{m4} offers some machinery to
6799 speed up the start of an application using lengthy common bases.
6802 * Using frozen files:: Using frozen files
6803 * Frozen file format 1:: Frozen file format 1
6804 * Frozen file format 2:: Frozen file format 2
6807 @node Using frozen files
6808 @section Using frozen files
6810 @cindex fast loading of frozen files
6811 @cindex frozen files for fast loading
6812 @cindex initialization, frozen state
6813 @cindex dumping into frozen file
6814 @cindex reloading a frozen file
6815 @cindex @acronym{GNU} extensions
6816 Suppose a user has a library of @code{m4} initializations in
6817 @file{base.m4}, which is then used with multiple input files:
6821 $ @kbd{m4 base.m4 input1.m4}
6822 $ @kbd{m4 base.m4 input2.m4}
6823 $ @kbd{m4 base.m4 input3.m4}
6826 Rather than spending time parsing the fixed contents of @file{base.m4}
6827 every time, the user might instead execute:
6831 $ @kbd{m4 -F base.m4f base.m4}
6835 once, and further execute, as often as needed:
6839 $ @kbd{m4 -R base.m4f input1.m4}
6840 $ @kbd{m4 -R base.m4f input2.m4}
6841 $ @kbd{m4 -R base.m4f input3.m4}
6845 with the varying input. The first call, containing the @code{-F}
6846 option, only reads and executes file @file{base.m4}, defining
6847 various application macros and computing other initializations.
6848 Once the input file @file{base.m4} has been completely processed, @acronym{GNU}
6849 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
6850 file which contains a kind of snapshot of the @code{m4} internal state.
6852 Later calls, containing the @code{-R} option, are able to reload
6853 the internal state of @code{m4}, from @file{base.m4f},
6854 @emph{prior} to reading any other input files. This means
6855 instead of starting with a virgin copy of @code{m4}, input will be
6856 read after having effectively recovered the effect of a prior run.
6857 In our example, the effect is the same as if file @file{base.m4} had
6858 been read anew. However, this effect is achieved a lot faster.
6860 Only one frozen file may be created or read in any one @code{m4}
6861 invocation. It is not possible to recover two frozen files at once.
6862 However, frozen files may be updated incrementally, by using
6863 @code{-R} and @code{-F} options simultaneously. For example, if
6864 some care is taken, the command:
6868 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
6872 could be broken down in the following sequence, accumulating the same
6877 $ @kbd{m4 -F file1.m4f file1.m4}
6878 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
6879 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
6880 $ @kbd{m4 -R file3.m4f file4.m4}
6883 @comment FIXME - merge the rest of this section.
6884 Some care is necessary because not every effort has been made for
6885 this to work in all cases. In particular, the trace attribute of
6886 macros is not handled.
6887 Also, the interactions of @code{m4} options that are used in one call
6888 but not in the next have not been fully analyzed yet. On the other
6889 hand, you may be confident that stacks of @code{pushdef}'ed definitions
6890 are handled correctly, as are @code{undefine}'d or renamed builtins, and
6891 changed strings for quotes or comments.
6893 When an @code{m4} run is to be frozen, the automatic undiversion
6894 which takes place at end of execution is inhibited. Instead, all
6895 positively numbered diversions are saved into the frozen file.
6896 The active diversion number is also transmitted.
6898 A frozen file to be reloaded need not reside in the current directory.
6899 It is looked up the same way as an @code{include} file (@pxref{Search
6902 @node Frozen file format 1
6903 @section Frozen file format 1
6909 When loading format 1, the syntax categories @samp{@{} and @samp{@}} are
6910 disabled (reverting braces to be treated like plain characters). This
6911 is because frozen files created with M4 1.4.x did not understand
6912 @samp{$@{@dots{}@}} extended argument notation, and a frozen macro that
6913 contained this character sequence should not behave differently just
6914 because a newer version of M4 reloaded the file.
6916 @node Frozen file format 2
6917 @section Frozen file format 2
6919 Frozen files are sharable across architectures. It is safe to write
6920 a frozen file on one machine and read it on another, provided that the
6921 second machine uses the same or a newer version of @acronym{GNU} @code{m4}.
6922 These are simple (editable) text files, made up of directives,
6923 each starting with a capital letter and ending with a newline
6924 (@key{NL}). Wherever a directive is expected, the character
6925 @kbd{#} introduces a comment line; empty lines are also ignored.
6926 In the following descriptions, @var{length}s always refer to
6927 corresponding @var{string}s. Numbers are always expressed in decimal.
6931 @item V @var{number} @key{NL}
6932 Confirms the format of the file. For the version documented here,
6933 @var{number} should be 2. It is backwards compatible with the previous
6934 version though, so version 1 frozen files can be loaded too if necessary.
6936 @item C @var{length1} , @var{length2} @key{NL} @var{string1} @var{string2} @key{NL}
6937 Uses @var{string1} and @var{string2} as the beginning comment and
6938 end comment strings.
6940 @item Q @var{length1} , @var{length2} @key{NL} @var{string1} @var{string2} @key{NL}
6941 Uses @var{string1} and @var{string2} as the beginning quote and end quote
6944 @item R @var{length} @key{NL} @var{string} @key{NL}
6945 Sets the default regexp syntax, where @var{string} encodes one of the
6946 regular expression syntaxes supported by @acronym{GNU} M4.
6947 @xref{Changeresyntax}, for more details.
6949 @item M @var{length} @key{NL} @var{string} @key{NL}
6950 Names a module which will be searched for according to the module search path
6951 and loaded. Modules loaded from a frozen file don't add their builtin entries
6952 to the symbol table.
6954 @item F @var{length} @key{NL} @var{string} @key{NL}
6955 Defines, through @code{pushdef}, a definition for @var{string}
6956 expanding to the function whose builtin name is also @var{string}. The builtin
6957 name is searched for among the intrinsic builtin functions only.
6959 @item F @var{length1} , @var{length2} @key{NL} @var{string1} @var{string2} @key{NL}
6960 Defines, through @code{pushdef}, a definition for @var{string1}
6961 expanding to the function whose builtin name is @var{string2}. With two
6962 arguments, the builtin name is searched for among the intrinsic builtin
6965 @item F @var{length1} , @var{length2} , @var{length3} @key{NL} @var{string1} @var{string2} @var{string3} @key{NL}
6966 Defines, through @code{pushdef}, a definition for @var{string1}
6967 expanding to the function whose builtin name is @var{string2}. With three
6968 arguments, the builtin name is searched for amongst the builtin functions
6969 defined by the module named by @var{string3}.
6971 @item S @var{syntax-code} @var{length} @key{NL} @var{string} @key{NL}
6972 Defines, through @code{changesyntax}, a syntax category for each of the
6973 characters in @var{string}. The @var{syntax-code} must be one of the
6974 characters described in @ref{Changesyntax}.
6976 @item T @var{length1} , @var{length2} @key{NL} @var{string1} @var{string2} @key{NL}
6977 Defines, through @code{pushdef}, a definition for @var{string1}
6978 expanding to the text given by @var{string2}.
6980 @item D @var{number}, @var{length} @key{NL} @var{string} @key{NL}
6981 Selects diversion @var{number}, making it current, then copies
6982 @var{string} into the current diversion. @var{number} may be a negative
6983 number for a non-existing diversion. To merely specify an active
6984 selection, use this command with an empty @var{string}. With 0 as the
6985 diversion @var{number}, @var{string} will be issued on standard output
6986 at reload time; however, such a directive cannot be produced from within @code{m4}.
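
As an illustration of the directive syntax only, a hand-written
fragment might look like the following. It is a sketch derived from
the descriptions above, not output captured from @samp{m4 -F}; the
macro name @samp{foo} and its text @samp{FOO} are hypothetical, and a
real frozen file contains many more directives.

@example
# Hypothetical fragment of a format 2 frozen file
V2
Q1,1
`'
C1,1
#

T3,3
fooFOO
D0,0

@end example

Here @samp{T3,3} announces a three-character macro name followed by
three characters of expansion text, so @samp{foo} is defined as
@samp{FOO}. The @samp{C1,1} directive is followed by @samp{#} and a
newline as the two comment delimiters, and @samp{D0,0} selects
diversion 0 without adding any text.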
6991 @chapter Compatibility with other versions of @code{m4}
6993 @cindex compatibility
6994 This chapter describes the differences between this implementation of
6995 @code{m4}, and the implementation found under UNIX, notably System V,
6998 There are also differences in BSD flavors of @code{m4}. No attempt
6999 is made to summarize these here.
7002 * Extensions:: Extensions in @acronym{GNU} M4
7003 * Incompatibilities:: Other incompatibilities
7004 * Experiments:: Experimental features in @acronym{GNU} M4
7008 @section Extensions in @acronym{GNU} M4
7010 @cindex @acronym{GNU} extensions
7011 @cindex @acronym{POSIX}
7012 @cindex @env{POSIXLY_CORRECT}
7013 This version of @code{m4} contains a few facilities that do not exist
7014 in System V @code{m4}. These extra facilities are all suppressed by
7015 using the @samp{-G} command line option, unless overridden by other
7016 command line options.
7017 Most of these extensions are compatible with
7018 @uref{http://www.unix.org/single_unix_specification/,
7019 @acronym{POSIX}}; the few exceptions are suppressed if the
7020 @env{POSIXLY_CORRECT} environment variable is set.
7024 In the @code{$}@var{n} notation for macro arguments, @var{n} can contain
7025 several digits, while the System V @code{m4} only accepts one digit.
7026 This allows macros in GNU @code{m4} to take any number of arguments, and
7027 not only nine (@pxref{Arguments}).
7028 @acronym{POSIX} does not allow this extension, so it is disabled if
7029 @env{POSIXLY_CORRECT} is set.
7032 Files included with @code{include} and @code{sinclude} are sought in a
7033 user specified search path, if they are not found in the working
7034 directory. The search path is specified by the @samp{-I} option and the
7035 @samp{M4PATH} environment variable (@pxref{Search Path}).
7038 Arguments to @code{undivert} can be non-numeric, in which case the named
7039 file will be included uninterpreted in the output (@pxref{Undivert}).
7040 @acronym{POSIX} does not allow this extension, so it is disabled if
7041 @env{POSIXLY_CORRECT} is set.
7044 Formatted output is supported through the @code{format} builtin, which
7045 is modeled after the C library function @code{printf} (@pxref{Format}).
7048 Searches and text substitution through regular expressions are supported
7049 by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
7050 (@pxref{Patsubst}) builtins.
7053 The syntax of regular expressions in M4 has never been clearly formalized.
7054 While OpenBSD M4 uses extended regular expressions for @code{regexp}
7055 and @code{patsubst}, @acronym{GNU} M4 uses basic regular expressions. Use
7056 @code{changeresyntax} (@pxref{Changeresyntax}) to change the regular
7057 expression syntax used by @acronym{GNU} M4.
7060 The output of shell commands can be read into @code{m4} with
7061 @code{esyscmd} (@pxref{Esyscmd}).
7064 There is indirect access to any builtin macro with @code{builtin}
7068 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
7071 The name of the current input file and the current input line number are
7072 accessible through the builtins @code{@w{__file__}} and
7073 @code{@w{__line__}} (@pxref{Location}).
7076 The generation of sync lines can be controlled through @code{syncoutput}
7077 (@pxref{Syncoutput}).
7080 The format of the output from @code{dumpdef} and macro tracing can be
7081 controlled with @code{debugmode} (@pxref{Debugmode}).
7084 The destination of trace and debug output can be controlled with
7085 @code{debugfile} (@pxref{Debugfile}).
7088 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
7089 creating a new file with a unique name on every invocation, rather than
7090 following the insecure behavior of replacing the trailing @samp{X}
7091 characters with the @code{m4} process id. @acronym{POSIX} does not
7092 allow this extension, so @code{maketemp} is insecure if
7093 @env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in
7097 Additionally, @acronym{POSIX} only requires support for the command line
7098 options @samp{-s}, @samp{-D}, and @samp{-U}, so all other options
7099 accepted by @acronym{GNU} M4 are extensions. @xref{Invoking m4},
7100 for a description of these options.
7102 Also, the debugging and tracing facilities in GNU @code{m4} are much
7103 more extensive than in most other versions of @code{m4}.
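
As a brief illustration of a few of the extensions listed in this
section, the following session is a sketch that assumes an interactive
@code{m4} reading standard input with the default regular expression
syntax in effect. The macro name @samp{$$internal$macro} is chosen only
for illustration.

@example
format(`%05d', `42')
@result{}00042
regexp(`GNUs not Unix', `\<[a-z]\w+')
@result{}5
patsubst(`GNUs not Unix', `^', `OBS: ')
@result{}OBS: GNUs not Unix
dnl a name that cannot be invoked directly, reached through indir
define(`$$internal$macro', `Internal macro (name `$0')')
@result{}
indir(`$$internal$macro')
@result{}Internal macro (name $$internal$macro)
@end example
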
7105 @node Incompatibilities
7106 @section Other incompatibilities
7108 There are a few other incompatibilities between this implementation of
7109 @code{m4} and the System V version.
7113 GNU @code{m4} implements sync lines differently from System V @code{m4},
7114 when text is being diverted. GNU @code{m4} outputs the sync lines when
7115 the text is being diverted, whereas System V @code{m4} outputs them when
7116 the diverted text is being brought back.
7118 The problem is which lines and file names should be attached to text that
7119 is being, or has been, diverted. System V @code{m4} regards all the
7120 diverted text as being generated by the source line containing the
7121 @code{undivert} call, whereas GNU @code{m4} regards the diverted text as
7122 being generated at the time it is diverted.
7124 I expect the sync line option to be used mostly when using @code{m4} as
7125 a front end to a compiler. If a diverted line causes a compiler error,
7126 the error messages should most probably refer to the place where the
7127 diversion was made, and not where it was inserted again.
7130 GNU @code{m4} makes no attempt at prohibiting self-referential definitions
7141 There is nothing inherently wrong with defining @samp{x} to
7142 return @samp{x}. The wrong thing is to expand @samp{x} unquoted.
7143 In @code{m4}, one might use macros to hold strings, as we do for
7144 variables in other programming languages, further checking them with:
7148 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
7152 In cases like this one, an interdiction for a macro to hold its own
7153 name would be a useless limitation. Of course, this leaves more rope
7154 for the GNU @code{m4} user to hang himself! Rescanning hangs may be
7155 avoided through careful programming, a little like for endless loops
7156 in traditional programming languages.
7159 Some implementations of @code{m4} (Solaris for example) conform to
7160 @acronym{POSIX}, which requires @code{define(@var{macro})} to behave
7161 like @code{undefine(@var{macro})pushdef(@var{macro})}. Other
7162 implementations, including GNU @code{m4} without the @samp{-G} option
7163 and without @env{POSIXLY_CORRECT} set, treat
7164 @code{define(@var{macro})} as
7165 @code{popdef(@var{macro})pushdef(@var{macro})}.
7168 @acronym{POSIX} states that only unquoted leading newlines and blanks
7169 (that is, space and tab) are ignored when collecting macro arguments.
7170 However, this appears to be a bug in @acronym{POSIX}, since most
7171 traditional implementations also ignore all whitespace (formfeed,
7172 carriage return, and vertical tab). @acronym{GNU} @code{m4} follows
7173 tradition and ignores all leading unquoted whitespace.
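
The difference in how @code{define} treats a stack of definitions,
described earlier in this list, can be observed with a short session.
This is a sketch of the behavior stated above for @acronym{GNU}
@code{m4} run without @samp{-G} and without @env{POSIXLY_CORRECT} set;
the macro name @samp{f} is arbitrary.

@example
pushdef(`f', `one')
@result{}
pushdef(`f', `two')
@result{}
dnl define replaces only the topmost definition here
define(`f', `three')
@result{}
f
@result{}three
popdef(`f')
@result{}
f
@result{}one
@end example

An implementation that follows the @acronym{POSIX} reading would
discard the whole stack at the @code{define}, so the final @samp{f}
would be left undefined and echoed literally.
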
7177 @section Experimental features in @acronym{GNU} M4
7179 Certain features of GNU @code{m4} are experimental.
7181 Some are only available if activated by an option given to
7182 @file{m4-@value{VERSION}/@/configure} at GNU @code{m4} installation
7183 time. The functionality
7184 might change or even go away in the future. @emph{Do not rely on it}.
7185 Please report comments about these features the same way you would report bugs.
7187 @section Changesyntax
7189 An experimental feature, which would improve the usefulness of @code{m4},
7190 allows for changing the way the input is parsed (@pxref{Changesyntax}).
7192 No compile time option is needed for @code{changesyntax}.
7194 The implementation does not seem to slow down @code{m4}, more likely the
7197 @section Multiple precision arithmetic
7199 Another experimental feature, which would improve the usefulness of @code{m4},
7200 allows for multiple precision rational arithmetic in @code{eval}.
7205 ./configure --with-gmp
7209 if you want this feature compiled in. The current implementation is
7210 unproven and might go away. Do not count on it yet.
7213 @chapter Correct version of some examples
7215 Some of the examples in this manual are buggy or not very robust, for
7216 demonstration purposes. Improved versions of these composite macros are
7220 * Improved exch:: Solution for @code{exch}
7221 * Improved forloop:: Solution for @code{forloop}
7222 * Improved foreach:: Solution for @code{foreach}
7223 * Improved cleardivert:: Solution for @code{cleardivert}
7224 * Improved fatal_error:: Solution for @code{fatal_error}
7228 @section Solution for @code{exch}
7230 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
7231 to double quote their arguments. A nicer definition, which lets
7232 clients follow the rule of thumb of one level of quoting per level of
7233 parentheses, involves adding quotes in the definition of @code{exch}, as
7237 define(`exch', ``$2', `$1'')
7239 define(exch(`expansion text', `macro'))
7242 @result{}expansion text
7245 @node Improved forloop
7246 @section Solution for @code{forloop}
7248 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
7249 into an infinite loop if given an iterator that is not parsed as a macro
7250 name. It does not do any sanity checking on its numeric bounds, and
7251 only permits decimal numbers for bounds. Here is an improved version,
7252 shipped as @file{m4-@value{VERSION}/@/examples/@/forloop2.m4}; this
7253 version also optimizes based on the fact that the starting bound does
7254 not need to be passed to the helper @code{@w{_forloop}}.
7258 $ @kbd{m4 -I examples}
7259 undivert(`forloop2.m4')dnl
7260 @result{}divert(`-1')
7261 @result{}# forloop(var, from, to, stmt) - improved version:
7262 @result{}# works even if VAR is not a strict macro name
7263 @result{}# performs sanity check that FROM is larger than TO
7264 @result{}# allows complex numerical expressions in TO and FROM
7265 @result{}define(`forloop', `ifelse(eval(`($3) >= ($2)'), `1',
7266 @result{} `pushdef(`$1', eval(`$2'))_forloop(`$1',
7267 @result{} eval(`$3'), `$4')popdef(`$1')')')
7268 @result{}define(`_forloop',
7269 @result{} `$3`'ifelse(indir(`$1'), `$2', `',
7270 @result{} `define(`$1', incr(indir(`$1')))$0($@@)')')
7271 @result{}divert`'dnl
7272 include(`forloop2.m4')
7274 forloop(`i', `2', `1', `no iteration occurs')
7276 forloop(`', `1', `2', ` odd iterator name')
7277 @result{} odd iterator name odd iterator name
7278 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
7279 @result{} 0xa 0xb 0xc
7280 forloop(`i', `a', `b', `non-numeric bounds')
7281 @error{}m4:stdin:6: Warning: eval: bad input: (b) >= (a)
7285 Of course, it is possible to make even more improvements, such as
7286 adding an optional step argument, or allowing iteration through
7287 descending sequences. @acronym{GNU} Autoconf provides some of these
7288 additional bells and whistles in its @code{m4_for} macro.
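
As one illustration of the first suggestion, a step argument might be
added along the following lines. This is only a sketch: the names
@code{forloop_step} and @code{@w{_forloop_step}} are hypothetical and
are not shipped with @acronym{GNU} M4 or Autoconf, the step is assumed
to be a positive integer expression, and descending sequences are not
handled.

@example
dnl forloop_step(var, from, to, step, stmt) - hypothetical sketch
define(`forloop_step', `ifelse(eval(`($4) > 0 && ($3) >= ($2)'), `1',
  `pushdef(`$1', eval(`$2'))_forloop_step(`$1',
    eval(`$3'), eval(`$4'), `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `$4`'ifelse(eval(indir(`$1') + $3 > $2), `1', `',
    `define(`$1', eval(indir(`$1') + $3))$0($@@)')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
forloop_step(`i', `1', `10', `0', `never reached')
@result{}
@end example

The helper stops as soon as the next value would exceed the upper
bound, so the bound itself is visited only when the step lands on it
exactly. A step that is not positive fails the initial sanity check
and produces no iterations.
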
7290 @node Improved foreach
7291 @section Solution for @code{foreach}
7293 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
7294 presented earlier each have flaws. First, we will examine and fix the
7295 quadratic behavior of @code{foreachq}:
7299 $ @kbd{m4 -I examples}
7300 include(`foreachq.m4')
7302 traceon(`shift')debugmode(`aq')
7304 foreachq(`x', ``1', `2', `3', `4'', `x
7307 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7308 @error{}m4trace: -2- shift(`1', `2', `3', `4')
7310 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7311 @error{}m4trace: -3- shift(`2', `3', `4')
7312 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7313 @error{}m4trace: -2- shift(`2', `3', `4')
7315 @error{}m4trace: -5- shift(`1', `2', `3', `4')
7316 @error{}m4trace: -4- shift(`2', `3', `4')
7317 @error{}m4trace: -3- shift(`3', `4')
7318 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7319 @error{}m4trace: -3- shift(`2', `3', `4')
7320 @error{}m4trace: -2- shift(`3', `4')
7322 @error{}m4trace: -6- shift(`1', `2', `3', `4')
7323 @error{}m4trace: -5- shift(`2', `3', `4')
7324 @error{}m4trace: -4- shift(`3', `4')
7325 @error{}m4trace: -3- shift(`4')
7328 Each successive iteration was adding more quoted @code{shift}
7329 invocations, and the entire list contents were passing through every
7330 iteration. In general, when recursing, it is a good idea to make the
7331 recursion use fewer arguments, rather than adding additional quoted
7332 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
7333 fewer macros, is less likely to run into machine limits, and most
7334 importantly, performs faster. The fixed version of @code{foreachq} can
7335 be found in @file{m4-@value{VERSION}/@/examples/@/foreachq2.m4}:
7339 $ @kbd{m4 -I examples}
7340 include(`foreachq2.m4')
7342 undivert(`foreachq2.m4')dnl
7343 @result{}include(`quote.m4')dnl
7344 @result{}divert(`-1')
7345 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
7346 @result{}# quoted list, improved version
7347 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
7348 @result{}define(`_arg1q', ``$1'')
7349 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
7350 @result{}define(`_foreachq', `ifelse(`$2', `', `',
7351 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
7352 @result{}divert`'dnl
7353 traceon(`shift')debugmode(`aq')
7355 foreachq(`x', ``1', `2', `3', `4'', `x
7358 @error{}m4trace: -3- shift(`1', `2', `3', `4')
7360 @error{}m4trace: -3- shift(`2', `3', `4')
7362 @error{}m4trace: -3- shift(`3', `4')
7366 Note that the fixed version calls unquoted helper macros in
7367 @code{@w{_foreachq}} to trim elements immediately; those helper macros
7368 in turn must re-supply the layer of quotes lost in the macro invocation.
7369 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
7370 element, with @code{@w{_arg1}} of the earlier implementation that
7371 returned the first list element directly.
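
The effect of that extra layer of quotes can be seen in isolation. The
following sketch repeats the two one-line helper definitions and applies
them to a list whose first element is a macro name; @samp{active} and
its expansion are chosen only for illustration.

@example
define(`active', `ACTIVE')
@result{}
define(`_arg1', `$1')
@result{}
define(`_arg1q', ``$1'')
@result{}
dnl _arg1 hands back the element unquoted, so it is expanded on rescan
_arg1(`active', `b')
@result{}ACTIVE
dnl _arg1q hands back the element with one layer of quotes intact
_arg1q(`active', `b')
@result{}active
@end example
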
7373 For a different approach, the improved version of @code{foreach},
7374 available in @file{m4-@value{VERSION}/@/examples/@/foreach2.m4}, simply
7375 overquotes the arguments to @code{@w{_foreach}} to begin with, using
7376 @code{dquote_elt}. Then @code{@w{_foreach}} can just use
7377 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
7382 $ @kbd{m4 -I examples}
7383 include(`foreach2.m4')
7385 undivert(`foreach2.m4')dnl
7386 @result{}include(`quote.m4')dnl
7387 @result{}divert(`-1')
7388 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
7389 @result{}# parenthesized list, improved version
7390 @result{}define(`foreach', `pushdef(`$1')_foreach(`$1',
7391 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
7392 @result{}define(`_arg1', `$1')
7393 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
7394 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
7395 @result{}divert`'dnl
7396 traceon(`shift')debugmode(`aq')
7398 foreach(`x', `(`1', `2', `3', `4')', `x
7400 @error{}m4trace: -4- shift(`1', `2', `3', `4')
7401 @error{}m4trace: -4- shift(`2', `3', `4')
7402 @error{}m4trace: -4- shift(`3', `4')
7404 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
7406 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
7408 @error{}m4trace: -3- shift(``3'', ``4'')
7410 @error{}m4trace: -3- shift(``4'')
7413 In summary, recursion over list elements is trickier than it appeared at
7414 first glance, but provides a powerful idiom within @code{m4} processing.
7415 As a final demonstration, both list styles are now able to handle
7416 several scenarios that would wreak havoc on the original
7417 implementations. This points out one other difference between the two
7418 list styles. @code{foreach} evaluates unquoted list elements only once,
7419 in preparation for calling @code{@w{_foreach}}. But @code{foreachq}
7420 evaluates unquoted list elements twice while visiting the first list
7421 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
7422 deciding which list style to use, one must take into account whether
7423 repeating the side effects of unquoted list elements will have any
7424 detrimental effects.
7428 $ @kbd{m4 -I examples}
7429 include(`foreach2.m4')
7431 include(`foreachq2.m4')
7434 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
7436 dnl 1-element list of empty element
7437 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
7439 dnl 2-element list of empty elements
7440 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
7441 @result{}<><> / <><>
7442 dnl 1-element list of a comma
7443 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
7445 dnl 2-element list of unbalanced parentheses
7446 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
7447 @result{}<(><)> / <(><)>
7448 define(`active', `ACT, IVE')
7452 dnl list of unquoted macros; expansion occurs before recursion
7453 foreach(`x', `(active, active)', `<x>
7455 @error{}m4trace: -4- active -> `ACT, IVE'
7456 @error{}m4trace: -4- active -> `ACT, IVE'
7461 foreachq(`x', `active, active', `<x>
7463 @error{}m4trace: -3- active -> `ACT, IVE'
7464 @error{}m4trace: -3- active -> `ACT, IVE'
7466 @error{}m4trace: -3- active -> `ACT, IVE'
7467 @error{}m4trace: -3- active -> `ACT, IVE'
7471 dnl list of quoted macros; expansion occurs during recursion
7472 foreach(`x', `(`active', `active')', `<x>
7474 @error{}m4trace: -1- active -> `ACT, IVE'
7476 @error{}m4trace: -1- active -> `ACT, IVE'
7478 foreachq(`x', ``active', `active'', `<x>
7480 @error{}m4trace: -1- active -> `ACT, IVE'
7482 @error{}m4trace: -1- active -> `ACT, IVE'
7484 dnl list of double-quoted macro names; no expansion
7485 foreach(`x', `(``active'', ``active'')', `<x>
7489 foreachq(`x', ```active'', ``active''', `<x>
7495 @node Improved cleardivert
7496 @section Solution for @code{cleardivert}
7498 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
7499 called without arguments to clear all pending diversions. That is
7500 because using @code{undivert} with an empty string for an argument is different
7501 from using it with no arguments at all. Compare the earlier definition
7502 with one that takes the number of arguments into account:
7505 define(`cleardivert',
7506 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
7516 define(`cleardivert',
7517 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
7518 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
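
The distinction that the second definition relies on is visible through
@samp{$#}. As a small sketch, using a throwaway macro named
@code{nargs} (an illustrative name, not something the definitions above
depend on):

@example
define(`nargs', `$#')
@result{}
nargs
@result{}0
nargs()
@result{}1
nargs(`')
@result{}1
@end example

A call with empty parentheses therefore passes one empty argument, which
is why the improved @code{cleardivert} tests for @samp{0} arguments
explicitly before choosing the bare @code{undivert}.
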
7529 @node Improved fatal_error
7530 @section Solution for @code{fatal_error}
7532 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
7533 of @acronym{GNU} M4 earlier than 1.4.8, where invoking
7534 @code{@w{__file__}} (@pxref{Location}) inside @code{m4wrap} would result
7535 in an empty string, and @code{@w{__line__}} resulted in @samp{0} even
7536 though all files start at line 1. Furthermore, versions earlier than
7537 1.4.6 did not support the @code{@w{__program__}} macro. If you want
7538 @code{fatal_error} to work across the entire 1.4.x release series, a
7539 better implementation would be:
7543 define(`fatal_error',
7544 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
7545 `:ifelse(__line__, `0', `',
7546 `__file__:__line__:')` fatal error: $*
7549 m4wrap(`divnum(`demo of internal message')
7550 fatal_error(`inside wrapped text')')
7553 @error{}m4:stdin:6: Warning: divnum: extra arguments ignored: 1 > 0
7555 @error{}m4:stdin:6: fatal error: inside wrapped text
7558 @c ========================================================== Appendices
7560 @node Copying This Manual
7561 @appendix How to make copies of this manual
7565 * GNU Free Documentation License:: License for copying this manual
7571 @appendix Indices of concepts and macros
7574 * Concept index:: Index for many concepts
7575 * Macro index:: Index for all @code{m4} macros
7579 @appendixsec Index for many concepts
7584 @appendixsec Index for all @code{m4} macros
7586 References are exclusively to the places where a builtin is introduced
7599 @c ispell-local-dictionary: "american"
7600 @c indent-tabs-mode: nil
7601 @c whitespace-check-buffer-indent: nil