1 \input texinfo @c -*- texinfo -*-
2 @comment ========================================================
3 @comment %**start of header
6 @settitle GNU M4 @value{VERSION} macro processor
14 @c The testsuite expects literal tab output in some examples, but
15 @c literal tabs in texinfo leads to formatting issues.
21 @c -------------------
22 @c The ARG is an optional argument. To be used for macro arguments in
23 @c their documentation (@defmac).
25 @r{[}@var{\varname\}@r{]}@c
28 @c @dvar{ARG, DEFAULT}
29 @c -------------------
30 @c The ARG is an optional argument, defaulting to DEFAULT. To be used
31 @c for macro arguments in their documentation (@defmac).
32 @macro dvar{varname, default}
33 @r{[}@var{\varname\} = @samp{\default\}@r{]}@c
36 @comment %**end of header
37 @comment ========================================================
41 This manual (@value{UPDATED}) is for GNU M4 (version
42 @value{VERSION}), a package containing an implementation of the m4 macro
45 Copyright @copyright{} 1989-1994, 2004-2011, 2013-2014, 2017 Free
46 Software Foundation, Inc.
49 Permission is granted to copy, distribute and/or modify this document
50 under the terms of the GNU Free Documentation License,
51 Version 1.3 or any later version published by the Free Software
52 Foundation; with no Invariant Sections, no Front-Cover Texts, and no
53 Back-Cover Texts. A copy of the license is included in the section
54 entitled ``GNU Free Documentation License.''
58 @dircategory Text creation and manipulation
60 * M4: (m4). A powerful macro processor.
64 @title GNU M4, version @value{VERSION}
65 @subtitle A powerful macro processor
66 @subtitle Edition @value{EDITION}, @value{UPDATED}
67 @author by Ren@'e Seindal, Fran@,{c}ois Pinard,
68 @author Gary V. Vaughan, and Eric Blake
69 @author (@email{bug-m4@@gnu.org})
72 @vskip 0pt plus 1filll
84 GNU @code{m4} is an implementation of the traditional UNIX macro
85 processor. It is mostly SVR4 compatible, although it has some
86 extensions (for example, handling more than 9 positional parameters
87 to macros). @code{m4} also has builtin functions for including
88 files, running shell commands, doing arithmetic, etc. Autoconf needs
89 GNU @code{m4} for generating @file{configure} scripts, but not for
92 GNU @code{m4} was originally written by Ren@'e Seindal, with
93 subsequent changes by Fran@,{c}ois Pinard and other volunteers
94 on the Internet. All names and email addresses can be found in the
95 files @file{m4-@value{VERSION}/@/AUTHORS} and
96 @file{m4-@value{VERSION}/@/THANKS} from the GNU M4
100 This is release @value{VERSION}. It is now considered stable: future
101 releases on this branch are only meant to fix bugs, increase speed, or
102 improve documentation.
106 This is BETA release @value{VERSION}. This is a development release,
107 and as such, is prone to bugs, crashes, unforeseen features, incomplete
108 documentation@dots{}, therefore, use at your own peril. In case of
109 problems, please do not hesitate to report them (see the
110 @file{m4-@value{VERSION}/@/README} file in the distribution).
115 * Preliminaries:: Introduction and preliminaries
116 * Invoking m4:: Invoking @code{m4}
117 * Syntax:: Lexical and syntactic conventions
119 * Macros:: How to invoke macros
120 * Definitions:: How to define new macros
121 * Conditionals:: Conditionals, loops, and recursion
123 * Debugging:: How to debug macros and input
125 * Input Control:: Input control
126 * File Inclusion:: File inclusion
127 * Diversions:: Diverting and undiverting output
129 * Modules:: Extending M4 with dynamic runtime modules
131 * Text handling:: Macros for text handling
132 * Arithmetic:: Macros for doing arithmetic
133 * Shell commands:: Macros for running shell commands
134 * Miscellaneous:: Miscellaneous builtin macros
135 * Frozen files:: Fast loading of frozen state
137 * Compatibility:: Compatibility with other versions of @code{m4}
138 * Answers:: Correct version of some examples
140 * Copying This Package:: How to make copies of the overall M4 package
141 * Copying This Manual:: How to make copies of this manual
142 * Indices:: Indices of concepts and macros
145 --- The Detailed Node Listing ---
147 Introduction and preliminaries
149 * Intro:: Introduction to @code{m4}
150 * History:: Historical references
151 * Bugs:: Problems and bugs
152 * Manual:: Using this manual
156 * Operation modes:: Command line options for operation modes
157 * Preprocessor features:: Command line options for preprocessor features
158 * Limits control:: Command line options for limits control
159 * Frozen state:: Command line options for frozen state
160 * Debugging options:: Command line options for debugging
161 * Command line files:: Specifying input files on the command line
163 Lexical and syntactic conventions
165 * Names:: Macro names
166 * Quoted strings:: Quoting input to @code{m4}
167 * Comments:: Comments in @code{m4} input
168 * Other tokens:: Other kinds of input tokens
169 * Input processing:: How @code{m4} copies input to output
170 * Regular expression syntax:: How @code{m4} interprets regular expressions
174 * Invocation:: Macro invocation
175 * Inhibiting Invocation:: Preventing macro invocation
176 * Macro Arguments:: Macro arguments
177 * Quoting Arguments:: On Quoting Arguments to macros
178 * Macro expansion:: Expanding macros
180 How to define new macros
182 * Define:: Defining a new macro
183 * Arguments:: Arguments to macros
184 * Pseudo Arguments:: Special arguments to macros
185 * Undefine:: Deleting a macro
186 * Defn:: Renaming macros
187 * Pushdef:: Temporarily redefining macros
188 * Renamesyms:: Renaming macros with regular expressions
190 * Indir:: Indirect call of macros
191 * Builtin:: Indirect call of builtins
192 * M4symbols:: Getting the defined macro names
194 Conditionals, loops, and recursion
196 * Ifdef:: Testing if a macro is defined
197 * Ifelse:: If-else construct, or multibranch
198 * Shift:: Recursion in @code{m4}
199 * Forloop:: Iteration by counting
200 * Foreach:: Iteration by list contents
201 * Stacks:: Working with definition stacks
202 * Composition:: Building macros with macros
204 How to debug macros and input
206 * Dumpdef:: Displaying macro definitions
207 * Trace:: Tracing macro calls
208 * Debugmode:: Controlling debugging options
209 * Debuglen:: Limiting debug output
210 * Debugfile:: Saving debugging output
214 * Dnl:: Deleting whitespace in input
215 * Changequote:: Changing the quote characters
216 * Changecom:: Changing the comment delimiters
217 * Changeresyntax:: Changing the regular expression syntax
218 * Changesyntax:: Changing the lexical structure of the input
219 * M4wrap:: Saving text until end of input
223 * Include:: Including named files
224 * Search Path:: Searching for include files
226 Diverting and undiverting output
228 * Divert:: Diverting output
229 * Undivert:: Undiverting output
230 * Divnum:: Diversion numbers
231 * Cleardivert:: Discarding diverted text
233 Extending M4 with dynamic runtime modules
235 * M4modules:: Listing loaded modules
236 * Standard Modules:: Standard bundled modules
238 Macros for text handling
240 * Len:: Calculating length of strings
241 * Index macro:: Searching for substrings
242 * Regexp:: Searching for regular expressions
243 * Substr:: Extracting substrings
244 * Translit:: Translating characters
245 * Patsubst:: Substituting text by regular expression
246 * Format:: Formatting strings (printf-like)
248 Macros for doing arithmetic
250 * Incr:: Decrement and increment operators
251 * Eval:: Evaluating integer expressions
252 * Mpeval:: Multiple precision arithmetic
254 Macros for running shell commands
256 * Platform macros:: Determining the platform
257 * Syscmd:: Executing simple commands
258 * Esyscmd:: Reading the output of commands
259 * Sysval:: Exit status
260 * Mkstemp:: Making temporary files
261 * Mkdtemp:: Making temporary directories
263 Miscellaneous builtin macros
265 * Errprint:: Printing error messages
266 * Location:: Printing current location
267 * M4exit:: Exiting from @code{m4}
268 * Syncoutput:: Turning on and off sync lines
270 Fast loading of frozen state
272 * Using frozen files:: Using frozen files
273 * Frozen file format 1:: Frozen file format 1
274 * Frozen file format 2:: Frozen file format 2
276 Compatibility with other versions of @code{m4}
278 * Extensions:: Extensions in GNU M4
279 * Incompatibilities:: Other incompatibilities
280 * Experiments:: Experimental features in GNU M4
282 Correct version of some examples
284 * Improved exch:: Solution for @code{exch}
285 * Improved forloop:: Solution for @code{forloop}
286 * Improved foreach:: Solution for @code{foreach}
287 * Improved copy:: Solution for @code{copy}
288 * Improved m4wrap:: Solution for @code{m4wrap}
289 * Improved cleardivert:: Solution for @code{cleardivert}
290 * Improved capitalize:: Solution for @code{capitalize}
291 * Improved fatal_error:: Solution for @code{fatal_error}
293 How to make copies of the overall M4 package
295 * GNU General Public License:: License for copying the M4 package
297 How to make copies of this manual
299 * GNU Free Documentation License:: License for copying this manual
301 Indices of concepts and macros
303 * Macro index:: Index for all @code{m4} macros
304 * Concept index:: Index for many concepts
310 @chapter Introduction and preliminaries
312 This first chapter explains what GNU @code{m4} is, where @code{m4}
313 comes from, how to read and use this documentation, how to call the
314 @code{m4} program, and how to report bugs about it. It concludes by
315 giving tips for reading the remainder of the manual.
317 The following chapters then detail all the features of the @code{m4}
318 language, as shipped in the GNU M4 package.
321 * Intro:: Introduction to @code{m4}
322 * History:: Historical references
323 * Bugs:: Problems and bugs
324 * Manual:: Using this manual
328 @section Introduction to @code{m4}
330 @cindex overview of @code{m4}
331 @code{m4} is a macro processor, in the sense that it copies its
332 input to the output, expanding macros as it goes. Macros are either
333 builtin or user-defined, and can take any number of arguments.
334 Besides just doing macro expansion, @code{m4} has builtin functions
335 for including named files, running shell commands, doing integer
336 arithmetic, manipulating text in various ways, performing recursion,
etc. @code{m4} can be used either as a front end to a compiler,
338 or as a macro processor in its own right.
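
As a minimal illustration of this copy-and-expand behavior, the sketch
below defines a macro and then calls it.  The macro name @code{greet}
is hypothetical, chosen only for this example; lines prefixed with
@samp{@result{}} show what @code{m4} writes to its output.

@example
define(`greet', `Hello, $1!')dnl
greet(`world')
@result{}Hello, world!
Plain text is copied unchanged.
@result{}Plain text is copied unchanged.
@end example
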
The @code{m4} macro processor is available on nearly all UNIX systems, and has
341 been standardized by POSIX.
342 Usually, only a small percentage of users are aware of its existence.
343 However, those who find it often become committed users. The
344 popularity of GNU Autoconf, which requires GNU
345 @code{m4} for @emph{generating} @file{configure} scripts, is an incentive
for many to install it, even though these people will not themselves
347 program in @code{m4}. GNU @code{m4} is mostly compatible with the
348 System V, Release 4 version, except for some minor differences.
349 @xref{Compatibility}, for more details.
351 Some people find @code{m4} to be fairly addictive. They first use
352 @code{m4} for simple problems, then take bigger and bigger challenges,
353 learning how to write complex sets of @code{m4} macros along the way.
354 Once really addicted, users pursue writing of sophisticated @code{m4}
355 applications even to solve simple problems, devoting more time
356 debugging their @code{m4} scripts than doing real work. Beware that
357 @code{m4} may be dangerous for the health of compulsive programmers.
360 @section Historical references
362 @cindex history of @code{m4}
363 @cindex GNU M4, history of
364 Macro languages were invented early in the history of computing. In the
365 1950s Alan Perlis suggested that the macro language be independent of the
366 language being processed. Techniques such as conditional and recursive
367 macros, and using macros to define other macros, were described by Doug
368 McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler
369 Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20,
370 @url{http://dx.doi.org/10.1145/367177.367223}.
372 An important precursor of @code{m4} was GPM; see C. Strachey,
373 @c The title uses lower case and has no space between "macro" and "generator".
374 ``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3
375 (1965), 225--41, @url{http://dx.doi.org/10.1093/comjnl/8.3.225}. GPM is
376 also succinctly described in David Gries's book @emph{Compiler
377 Construction for Digital Computers}, Wiley (1971). Strachey was a
378 brilliant programmer: GPM fit into 250 machine instructions!
380 Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a
model preprocessor that fit into a page of Snobol 3 code, and McIlroy
382 and Robert Morris developed a series of further models at Bell Labs.
383 Andrew D. Hall followed up with M6, a general purpose macro processor
384 used to port the Fortran source code of the Altran computer algebra
385 system; see Hall's ``The M6 Macro Processor'', Computing Science
386 Technical Report #2, Bell Labs (1972),
387 @url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}. M6's source code
388 consisted of about 600 Fortran statements. Its name was the first of
391 The Brian Kernighan and P.J. Plauger book @emph{Software Tools},
392 Addison-Wesley (1976), describes and implements a Unix
393 macro-processor language, which inspired Dennis Ritchie to write
394 @code{m3}, a macro processor for the AP-3 minicomputer.
396 Kernighan and Ritchie then joined forces to develop the original
397 @code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories
398 (1977), @url{http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}.
399 It had only 21 builtin macros.
401 While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with
402 the true intricacies of real life: macros can be recognized without
403 being pre-announced, skipping whitespace or end-of-lines is easier,
404 more constructs are builtin instead of derived, etc.
406 Originally, the Kernighan and Plauger macro-processor, and then
407 @code{m3}, formed the engine for the Rational FORTRAN preprocessor,
408 that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4}
409 was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}.
411 Ren@'e Seindal released his implementation of @code{m4}, GNU
413 in 1990, with the aim of removing the artificial limitations in many
414 of the traditional @code{m4} implementations, such as maximum line
415 length, macro size, or number of macros.
417 The late Professor A. Dain Samples described and implemented a further
418 evolution in the form of @code{M5}: ``User's Guide to the M5 Macro
419 Language: 2nd edition'', Electronic Announcement on comp.compilers
422 Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in
423 1992, until 1994 when he released GNU @code{m4} 1.4, which was
424 the stable release for 10 years. It was at this time that GNU
425 Autoconf decided to require GNU @code{m4} as its underlying
426 engine, since all other implementations of @code{m4} had too many
More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2, which
addressed some long-standing bugs in the venerable 1.4 release. Then in
431 2005, Gary V. Vaughan collected together the many patches to
432 GNU @code{m4} 1.4 that were floating around the net and
433 released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
434 prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8.
435 More bug fixes were incorporated in 2007, with releases 1.4.9 and
436 1.4.10. Eric continued with some portability fixes for 1.4.11 and
437 1.4.12 in 2008, 1.4.13 in 2009, 1.4.14 and 1.4.15 in 2010, and 1.4.16
438 in 2011. Following a long hiatus, Gary released 1.4.17 after upgrading
439 to the latest autotools (and gnulib) along with all the small fixes they
442 Additionally, in 2008, Eric rewrote the scanning engine to reduce
443 recursive evaluation from quadratic to linear complexity. This was
444 released as M4 1.6 in 2009. The 1.x branch series remains open for bug
447 Meanwhile, development was underway for new features for @code{m4},
448 such as dynamic module loading and additional builtins, practically
449 rewriting the entire code base. This development has spurred
450 improvements to other GNU software, such as GNU
451 Libtool. GNU M4 2.0 is the result of this effort.
454 @section Problems and bugs
456 @cindex reporting bugs
458 @cindex suggestions, reporting
459 If you have problems with GNU M4 or think you've found a bug,
460 please report it. Before reporting a bug, make sure you've actually
461 found a real bug. Carefully reread the documentation and see if it
462 really says you can do what you're trying to do. If it's not clear
463 whether you should be able to do something or not, report that too; it's
464 a bug in the documentation!
466 Before reporting a bug or trying to fix it yourself, try to isolate it
467 to the smallest possible input file that reproduces the problem. Then
468 send us the input file and the exact results @code{m4} gave you. Also
469 say what you expected to occur; this will help us decide whether the
470 problem was really in the documentation.
472 Once you've got a precise problem, send e-mail to
473 @email{bug-m4@@gnu.org}. Please include the version number of @code{m4}
474 you are using. You can get this information with the command
475 @kbd{m4 --version}. You can also run @kbd{make check} to generate the
476 file @file{tests/@/testsuite.log}, useful for including in your report.
478 Non-bug suggestions are always welcome as well. If you have questions
479 about things that are unclear in the documentation or are just obscure
480 features, please report them too.
483 @section Using this manual
485 @cindex examples, understanding
486 This manual contains a number of examples of @code{m4} input and output,
487 and a simple notation is used to distinguish input, output and error
488 messages from @code{m4}. Examples are set out from the normal text, and
shown in a fixed-width font, like this:
493 This is an example of an example!
496 To distinguish input from output, all output from @code{m4} is prefixed
497 by the string @samp{@result{}}, and all error messages by the string
498 @samp{@error{}}. When showing how command line options affect matters,
the command line is shown with a prompt @samp{$ @kbd{like this}};
otherwise, you can assume that a simple @kbd{m4} invocation will work.
505 $ @kbd{command line to invoke m4}
506 Example of input line
507 @result{}Output line from m4
508 @error{}and an error message
511 The sequence @samp{^D} in an example indicates the end of the input
512 file. The sequence @samp{@key{NL}} refers to the newline character.
513 The majority of these examples are self-contained, and you can run them
514 with similar results. In fact, the testsuite that is bundled in the
515 GNU M4 package consists in part of the examples
516 in this document! Some of the examples assume that your current
517 directory is located where you unpacked the installation, so if you plan
518 on following along, you may find it helpful to do this now:
522 $ @kbd{cd m4-@value{VERSION}}
525 As each of the predefined macros in @code{m4} is described, a prototype
526 call of the macro will be shown, giving descriptive names to the
529 @deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @
530 @ovar{argument}@dots{})
531 This is a sample prototype. There is not really a macro named
532 @code{example}, but this documents that if there were, it would be a
533 Composite macro, rather than a Builtin, and would be provided by the
536 It requires at least one argument, @var{string}. Remember that in
537 @code{m4}, there must not be a space between the macro name and the
538 opening parenthesis, unless it was intended to call the macro without
539 any arguments. The brackets around @var{count} and @var{argument} show
540 that these arguments are optional. If @var{count} is omitted, the macro
behaves as if @var{count} were @samp{1}, whereas if @var{argument} is omitted,
542 the macro behaves as if it were the empty string. A blank argument is
543 not the same as an omitted argument. For example, @samp{example(`a')},
544 @samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave
545 identically with @var{count} set to @samp{1}; while @samp{example(`a',)}
546 and @samp{example(`a',`')} would explicitly pass the empty string for
547 @var{count}. The ellipses (@samp{@dots{}}) show that the macro
548 processes additional arguments after @var{argument}, rather than
552 Each builtin definition will list, in parentheses, the module that must
553 be loaded to use that macro. The standard modules include
554 @samp{m4} (which is always available), @samp{gnu} (for GNU specific
555 m4 extensions), and @samp{traditional} (for compatibility with System V
559 All macro arguments in @code{m4} are strings, but some are given
560 special interpretation, e.g., as numbers, file names, regular
561 expressions, etc. The documentation for each macro will state how the
562 parameters are interpreted, and what happens if the argument cannot be
563 parsed according to the desired interpretation. Unless specified
564 otherwise, a parameter specified to be a number is parsed as a decimal,
565 even if the argument has leading zeros; and parsing the empty string as
566 a number results in 0 rather than an error, although a warning will be
569 This document consistently writes and uses @dfn{builtin}, without a
570 hyphen, as if it were an English word. This is how the @code{builtin}
571 primitive is spelled within @code{m4}.
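
To illustrate the default numeric interpretation described above, the
sketch below passes an argument with a leading zero to @code{incr}
(@pxref{Incr}), which parses it as a decimal.  It is contrasted with
@code{eval} (@pxref{Eval}), one of the macros whose documentation
specifies otherwise, since it treats a leading zero as an octal prefix.

@example
incr(`010')
@result{}11
eval(`010')
@result{}8
@end example
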
574 @chapter Invoking @code{m4}
577 @cindex invoking @code{m4}
578 The format of the @code{m4} command is:
582 @code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]}
585 @cindex command line, options
586 @cindex options, command line
587 @cindex @env{POSIXLY_CORRECT}
588 All options begin with @samp{-}, or if long option names are used, with
@samp{--}. A long option name need not be written completely; any
590 unambiguous prefix is sufficient. POSIX requires @code{m4} to
591 recognize arguments intermixed with files, even when
592 @env{POSIXLY_CORRECT} is set in the environment. Most options take
593 effect at startup regardless of their position, but some are documented
594 below as taking effect after any files that occurred earlier in the
595 command line. The argument @option{--} is a marker to denote the end of
With short options, options that do not take arguments may be combined
into a single command line argument with subsequent options; options
with mandatory arguments may be provided either as a single command line
argument or as two arguments; and options with optional arguments must
be provided as a single argument. In other words,
603 @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
604 @kbd{m4 -Q -P -D foo -d ./a -d+f}, although the latter form is
605 considered canonical.
607 With long options, options with mandatory arguments may be provided with
608 an equal sign (@samp{=}) in a single argument, or as two arguments, and
609 options with optional arguments must be provided as a single argument.
610 In other words, @kbd{m4 --def foo --debug a} is equivalent to
611 @kbd{m4 --define=foo --debug= -- ./a}, although the latter form is
612 considered canonical (not to mention more robust, in case a future
613 version of @code{m4} introduces an option named @option{--default}).
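
As a further sketch of these equivalences, the two invocations below
define the same macro, once with a short option and once with a long
option.  The macro name @code{foo} and its value @samp{bar} are
arbitrary choices for illustration.

@example
$ @kbd{m4 -Dfoo=bar}
foo
@result{}bar
@end example

@example
$ @kbd{m4 --define=foo=bar}
foo
@result{}bar
@end example
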
615 @code{m4} understands the following options, grouped by functionality.
618 * Operation modes:: Command line options for operation modes
619 * Preprocessor features:: Command line options for preprocessor features
620 * Limits control:: Command line options for limits control
621 * Frozen state:: Command line options for frozen state
622 * Debugging options:: Command line options for debugging
623 * Command line files:: Specifying input files on the command line
626 @node Operation modes
627 @section Command line options for operation modes
629 Several options control the overall operation of @code{m4}:
633 Print a help summary on standard output, then immediately exit
634 @code{m4} without reading any input files or performing any other
638 Print the version number of the program on standard output, then
639 immediately exit @code{m4} without reading any input files or
640 performing any other actions.
644 Makes this invocation of @code{m4} non-interactive. This means that
645 output will be buffered, and an interrupt or pipe write error will halt
646 execution. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
648 when any input files are specified, or when either standard input or
649 standard error is not a terminal. Note that this means that @kbd{m4}
650 alone might be interactive, but @kbd{m4 -} is not, even though both
651 commands process only standard input. If both @option{-b} and
652 @option{-i} are specified, only the last one takes effect.
655 @itemx --discard-comments
656 Discard all comments instead of copying them to the output.
659 @itemx --fatal-warnings
660 @cindex errors, fatal
662 Controls the effect of warnings. If unspecified, then execution
663 continues and exit status is unaffected when a warning is printed. If
664 specified exactly once, warnings become fatal; when one is issued,
665 execution continues, but the exit status will be non-zero. If specified
666 multiple times, then execution halts with non-zero status the first time
667 a warning is issued. The introduction of behavior levels is new to M4
668 1.4.9; for behavior consistent with earlier versions, you should specify
672 For backwards compatibility reasons, using @option{-E} behaves as if an
673 implicit @option{--debug=-d} option is also present. This is so that
674 scripts written for older M4 versions will not fail if they used
675 constructs that were previously silently allowed, but would now trigger
681 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
706 @comment options: -E -d
711 @error{}m4:stdin:1: warning: defn: undefined macro 'oops'
725 Makes this invocation of @code{m4} interactive. This means that all
726 output will be unbuffered, interrupts will be ignored, and behavior on
727 pipe write errors is inherited from the parent process. If neither
@option{-b} nor @option{-i} is specified, this is activated by default
729 when no input files are specified, and when both standard input and
730 standard error are terminals (similar to the way that /bin/sh determines
731 when to be interactive). If both @option{-b} and @option{-i} are
732 specified, only the last one takes effect. The spelling @option{-e}
733 exists for compatibility with other @code{m4} implementations, and
734 issues a warning because it may be withdrawn in a future version of
738 @itemx --prefix-builtins
739 Internally modify @emph{all} builtin macro names so they all start with
740 the prefix @samp{m4_}. For example, using this option, one should write
741 @samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}}
742 instead of @samp{@w{__file__}}. This option has no effect if @option{-R}
748 Suppress warnings, such as missing or superfluous arguments in macro
749 calls, or treating the empty string as zero. Error messages are still
750 printed. The distinction between error and warning is fuzzy, and if
751 you encounter a situation where the message output did not match your
752 expectations, please report that as a bug. This option is implied if
753 @env{POSIXLY_CORRECT} is set in the environment.
755 @item -r@r{[}@var{resyntax-spec}@r{]}
756 @itemx --regexp-syntax@r{[}=@var{resyntax-spec}@r{]}
757 Set the regular expression syntax according to @var{resyntax-spec}.
758 When this option is not given, or @var{resyntax-spec} is omitted,
759 GNU M4 uses the flavor @code{GNU_M4}, which provides
760 emacs-compatible regular expressions. @xref{Changeresyntax}, for more
761 details on the format and meaning of @var{resyntax-spec}. This option
762 may be given more than once, and order with respect to file names is
766 Cripple the following builtins, since each can perform potentially
767 unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}),
768 @code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}),
769 @code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}).
770 An attempt to use any of these macros will result in an error. This
771 option is intended to make it safer to preprocess an input file of
776 Enable warnings. Warnings are on by default unless
777 @env{POSIXLY_CORRECT} was set in the environment; this option exists to
778 allow overriding @option{--silent}.
779 @comment FIXME should we accept -Wall, -Wnone, -Wcategory,
780 @comment -Wno-category...?
783 @node Preprocessor features
784 @section Command line options for preprocessor features
786 @cindex macro definitions, on the command line
787 @cindex command line, macro definitions on the
788 @cindex preprocessor features
789 Several options allow @code{m4} to behave more like a preprocessor.
790 Macro definitions and deletions can be made on the command line, the
791 search path can be altered, and the output file can track where the
792 input came from. These features occur with the following options:
795 @item -B @var{directory}
796 @itemx --prepend-include=@var{directory}
797 Make @code{m4} search @var{directory} for included files, prior to
798 searching the current working directory. @xref{Search Path}, for more
799 details. This option may be given more than once. Some other
800 implementations of @code{m4} use @option{-B @var{number}} to change their
801 hard-coded limits, but that is unnecessary in GNU where the
802 only limit is your hardware capability. So although it is unlikely that
803 you will want to include a relative directory whose name is purely
804 numeric, GNU @code{m4} will warn you about this potential
805 compatibility issue; you can avoid the warning by using the long
806 spelling, or by using @samp{./@var{number}} if you really meant it.
808 @item -D @var{name}@r{[}=@var{value}@r{]}
809 @itemx --define=@var{name}@r{[}=@var{value}@r{]}
810 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
811 missing, the value is taken to be the empty string. The @var{value} can
812 be any string, and the macro can be defined to take arguments, just as
if it were defined from within the input. This option may be given more
814 than once; order with respect to file names is significant, and
815 redefining the same @var{name} loses the previous value.
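
For instance, the following sketch defines a macro that takes an
argument, and shows a later @option{-D} replacing the value given by
an earlier one.  The names @code{greet} and @code{x} are hypothetical,
and the single quotes are shell quoting, assuming a POSIX shell.

@example
$ @kbd{m4 -D 'greet=Hello, $1!' -D x=1 -D x=2}
greet(`world')
@result{}Hello, world!
x
@result{}2
@end example
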
817 @item --import-environment
818 Imports every variable in the environment as a macro. This is done
819 before @option{-D} and @option{-U}, so they can override the
822 @item -I @var{directory}
823 @itemx --include=@var{directory}
824 Make @code{m4} search @var{directory} for included files that are not
825 found in the current working directory. @xref{Search Path}, for more
826 details. This option may be given more than once.
828 @item --popdef=@var{name}
829 This deletes the top-most meaning @var{name} might have. Obviously,
830 only predefined macros can be deleted in this way. This option may be
831 given more than once; popping a @var{name} that does not have a
832 definition is silently ignored. Order is significant with respect to
835 @item -p @var{name}@r{[}=@var{value}@r{]}
836 @itemx --pushdef=@var{name}@r{[}=@var{value}@r{]}
837 This enters @var{name} into the symbol table. If @samp{=@var{value}} is
838 missing, the value is taken to be the empty string. The @var{value} can
839 be any string, and the macro can be defined to take arguments, just as
if it were defined from within the input. This option may be given more
841 than once; order with respect to file names is significant, and
842 redefining the same @var{name} adds another definition to its stack.
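
A brief sketch of the stacking behavior, using an invented macro name
@code{foo}: the second definition hides the first until @code{popdef}
removes it.

@example
$ @kbd{m4 --pushdef=foo=1 --pushdef=foo=2}
foo popdef(`foo')foo
@result{}2 1
@end example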
846 Short for @option{--syncoutput=1}, turning on synchronization lines
847 (sometimes called @dfn{synclines}).
849 @item --syncoutput@r{[}=@var{state}@r{]}
850 @cindex synchronization lines
851 @cindex location, input
852 @cindex input location
853 Control the generation of synchronization lines from the command line.
854 Synchronization lines are for use by the C preprocessor or other
855 similar tools. Order is significant with respect to file names. This
856 option is useful, for example, when @code{m4} is used as a
857 front end to a compiler. Source file name and line number information
858 is conveyed by directives of the form @samp{#line @var{linenum}
859 "@var{file}"}, which are inserted as needed into the middle of the
860 output. Such directives mean that the following line originated or was
861 expanded from the contents of input file @var{file} at line
862 @var{linenum}. The @samp{"@var{file}"} part is often omitted when
863 the file name did not change from the previous directive.
865 Synchronization directives are always given on complete lines by
866 themselves. When a synchronization discrepancy occurs in the middle of
867 an output line, the associated synchronization directive is delayed
868 until the next newline that does not occur in the middle of a quoted
869 string or comment. @xref{Syncoutput}, for runtime control. @var{state}
870 is interpreted the same as the argument to @code{syncoutput}; if
871 @var{state} is omitted, or @option{--syncoutput} is not used,
872 synchronization lines are disabled.
875 @itemx --undefine=@var{name}
876 This deletes any predefined meaning @var{name} might have. Obviously,
877 only predefined macros can be deleted in this way. This option may be
878 given more than once; undefining a @var{name} that does not have a
879 definition is silently ignored. Order is significant with respect to
884 @section Command line options for limits control
886 There are some limits within @code{m4} that can be tuned. For
887 compatibility, @code{m4} also accepts some options that control limits
888 in other implementations, but which are automatically unbounded (limited
889 only by your hardware and operating system constraints) in GNU
895 Enable all the extensions in this implementation. This is on by
896 default unless @env{POSIXLY_CORRECT} is set in the environment; it
897 exists to allow overriding @option{--traditional}.
902 Suppress all the extensions made in this implementation, compared to the
903 System V version. @xref{Compatibility}, for a list of these. This
904 loads the @samp{traditional} module in place of the @samp{gnu} module.
905 It is implied if @env{POSIXLY_CORRECT} is set in the environment.
908 @itemx --nesting-limit=@var{num}
909 @cindex nesting limit
910 @cindex limit, nesting
911 Artificially limit the nesting of macro calls to @var{num} levels,
912 stopping program execution if this limit is ever exceeded. When not
913 specified, nesting is limited to 1024 levels. A value of zero means
914 unlimited; but then heavily nested code could potentially cause a stack
915 overflow. @var{num} can have an optional scaling suffix.
916 @comment FIXME - need a node on what scaling suffixes are supported (see
917 @comment [info coreutils 'block size'] for ideas), and need to consider
918 @comment whether builtins should also understand scaling suffixes:
919 @comment eval, mpeval, perhaps format
921 The precise effect of this option might be more correctly associated
922 with textual nesting than dynamic recursion. It has been useful
923 when some complex @code{m4} input was generated by mechanical means.
924 Most users would never need this option. If shown to be obtrusive,
925 this option (which is still experimental) might well disappear.
928 This option does @emph{not} have the ability to break endless
929 rescanning loops, since these do not necessarily consume much memory
930 or stack space. Through clever usage of rescanning loops, one can
931 request complex, time-consuming computations from @code{m4} with useful
932 results. Imposing limits in this area would undermine the power of @code{m4}.
933 There are many pathological cases: @w{@samp{define(`a', `a')a}} is
934 only the simplest example (but @pxref{Compatibility}). Expecting GNU
935 @code{m4} to detect these would be a little like expecting a compiler
936 system to detect and diagnose endless loops: it is quite a @emph{hard}
937 problem in general, if not undecidable!
940 @itemx --hashsize=@var{num}
941 @itemx --word-regexp=@var{regexp}
942 These options are present only for compatibility with previous versions
943 of GNU @code{m4}. They do nothing except issue a warning, because the
944 symbol table size is not fixed anymore, and because the new
945 @code{changesyntax} feature is more efficient than the withdrawn
946 experimental @code{changeword}. These options will eventually disappear
951 These options are present for compatibility with System V @code{m4}, but
952 do nothing in this implementation. They may disappear in future
953 releases, and issue a warning to that effect.
957 @section Command line options for frozen state
959 GNU @code{m4} can freeze its internal state
960 (@pxref{Frozen files}). This can be used to speed up @code{m4}
961 execution when reusing a common initialization script.
965 @itemx --freeze-state=@var{file}
966 Once execution is finished, write out the frozen state on the specified
967 @var{file}. It is conventional, but not required, for @var{file} to end
971 @itemx --reload-state=@var{file}
972 Before execution starts, recover the internal state from the specified
973 frozen @var{file}. The options @option{-D}, @option{-U}, @option{-t},
974 @option{-m}, @option{-r}, and @option{--import-environment} take effect
975 after state is reloaded, but before the input files are read.
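
A typical workflow might look like the following sketch, where
@file{common.m4}, @file{common.m4f}, and @file{input.m4} are placeholder
file names. The first command processes the shared definitions once and
saves the resulting state; later runs reload that state instead of
parsing @file{common.m4} again.

@example
$ @kbd{m4 --freeze-state=common.m4f common.m4}
$ @kbd{m4 --reload-state=common.m4f input.m4}
@end example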
978 @node Debugging options
979 @section Command line options for debugging
981 Finally, there are several options for aiding in debugging @code{m4}
985 @item -d@r{[}@r{[}-@r{|}+@r{]}@var{flags}@r{]}
986 @itemx --debug@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
987 @itemx --debugmode@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]}
988 Set the debug-level according to the flags @var{flags}. The debug-level
989 controls the format and amount of information presented by the debugging
990 functions. @xref{Debugmode}, for more details on the format and
991 meaning of @var{flags}. If omitted, @var{flags} defaults to
992 @samp{+adeq}. If the option occurs multiple times, @var{flags} starting
993 with @samp{-} or @samp{+} are cumulative, while @var{flags} starting
994 with a letter override all earlier settings. The debug-level starts
995 with @samp{d} enabled and all other flags disabled. To disable all
996 previously set flags, specify an explicit @var{flags} of @samp{-V}. For
997 backward compatibility reasons, the option @option{--fatal-warnings}
998 implies @samp{--debug=-d} as part of its effects. The spelling
999 @option{--debug} is recognized as an unambiguous option for
1000 compatibility with earlier versions of GNU M4, but for
1001 consistency with the builtin name, you can also use the spelling
1002 @option{--debugmode}. Order is significant with respect to file names.
1004 The cumulative effect of the various options in this example is
1005 equivalent to a single invocation of @code{debugmode(`adlqx')}:
1007 @comment options: -d-V -d+lx --debug --debugmode=-e
1009 $ @kbd{m4 -d+lx --debug --debugmode=-e}
1013 @error{}m4trace:2: -1- id 2: len(`123')
1017 @item --debugfile@r{[}=@var{file}@r{]}
1018 @itemx -o @var{file}
1019 @itemx --error-output=@var{file}
1020 Redirect debug messages and trace output to the
1021 named @var{file}. Warnings, error messages, and @code{errprint} output
1022 are still printed to standard error. Output from @code{dumpdef} goes to
1023 this file when the debug level @code{o} is not set (@pxref{Debugmode}).
1024 If these options are not used, or
1025 if @var{file} is unspecified (only possible for @option{--debugfile}),
1026 debug output goes to standard error; if @var{file} is the empty string,
1027 debug output is discarded. @xref{Debugfile}, for more details. The
1028 option @option{--debugfile} may be given more than once, and order is
1029 significant with respect to file names. The spellings @option{-o} and
1030 @option{--error-output} are misleading and
1031 inconsistent with other GNU tools; using those spellings will
1032 evoke a warning, and they may be withdrawn or change semantics in a
1036 @itemx --debuglen=@var{num}
1037 @itemx --arglength=@var{num}
1038 Restrict the size of the output generated by macro tracing or by
1039 @code{dumpdef} to @var{num} characters per string. If unspecified or
1040 zero, output is unlimited. @xref{Debuglen}, for more details.
1041 @var{num} can have an optional scaling suffix. The spelling
1042 @option{--arglength} is deprecated, since it does not match the
1043 @code{debuglen} macro; using it will evoke a warning, and it may be
1044 withdrawn in a future release.
1045 @comment FIXME - Should we add an option that controls whether output
1046 @comment strings are sanitized with escape sequences, so that dumpdef is
1047 @comment truly one line per macro?
1048 @comment FIXME - see comment on --nesting-limit about NUM.
1051 @itemx --trace=@var{name}
1052 @itemx --traceon=@var{name}
1053 This enables tracing for the macro @var{name}, at any point where it is
1054 defined. @var{name} need not be defined when this option is given.
1055 This option may be given more than once, and order is significant with
1056 respect to file names. @xref{Trace}, for more details.
1058 @item --traceoff=@var{name}
1059 This disables tracing for the macro @var{name}, at any point where it is
1060 defined. @var{name} need not be defined when this option is given.
1061 This option may be given more than once, and order is significant with
1062 respect to file names. @xref{Trace}, for more details.
1065 @node Command line files
1066 @section Specifying input files on the command line
1068 @cindex command line, file names on the
1069 @cindex file names, on the command line
1070 The remaining arguments on the command line are taken to be input file
1071 names or module names (@pxref{Modules}). Whether or not any modules
1072 are loaded from command line arguments, when no actual input file names
1073 are given, then standard input is read. A file name of @file{-} can be
1074 used to denote standard input. It is conventional, but not required,
1075 for input file names to end in @samp{.m4} and for module names to end
1076 in @samp{.la}. The input files and modules are attended to in the
1079 Standard input can be read more than once, so the file name @file{-}
1080 may appear multiple times on the command line; this makes a difference
1081 when input is from a terminal or other special file type. It is an
1082 error if an input file ends in the middle of argument collection, a
1083 comment, or a quoted string.
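
As an illustration with placeholder file names, the following command
processes @file{header.m4}, then standard input (redirected here from
@file{body.txt}), and finally @file{footer.m4}:

@example
$ @kbd{m4 header.m4 - footer.m4 < body.txt}
@end example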
1084 @comment FIXME - it would be nicer if we let these three things
1085 @comment continue across file boundaries, provided that we warn in
1086 @comment interactive use when switching to stdin in a non-default parse
1089 Various options, such as @option{--define} (@option{-D}), @option{--undefine}
1090 (@option{-U}), @option{--synclines} (@option{-s}), @option{--trace}
1091 (@option{-t}), and @option{--regexp-syntax} (@option{-r}), only take
1092 effect after processing input from any file names that occur earlier
1093 on the command line. For example, assume the file @file{foo} contains:
1101 The text @samp{bar} can then be redefined over multiple uses of
1104 @comment options: -Dbar=hello foo -Dbar=world foo
1106 $ @kbd{m4 -Dbar=hello foo -Dbar=world foo}
1111 @cindex command line, module names on the
1112 @cindex module names, on the command line
1113 The use of loadable runtime modules in any sense is a GNU M4
1114 extension, so if @option{-G} is also passed or if the @env{POSIXLY_CORRECT}
1115 environment variable is set, even otherwise valid module names will be
1116 treated as though they were input file names (and no doubt cause havoc as
1117 M4 tries to scan and expand the contents as if it were written in @code{m4}).
1119 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
1120 exit status of @code{m4} will be 0 for success, 1 for general failure
1121 (such as problems with reading an input file), and 63 for version
1122 mismatch (@pxref{Using frozen files}).
1124 If you need to read a file whose name starts with a @file{-}, you can
1125 specify it as @samp{./-file}, or use @option{--} to mark the end of
1129 @comment Test that 'm4 file/' detects that file is not a directory; we
1130 @comment can assume that the current directory contains a Makefile.
1131 @comment mingw fails with EINVAL rather than ENOTDIR.
1134 @comment xerr: ignore
1135 @comment options: Makefile/
1137 @error{}m4: cannot open file 'Makefile/': No such file or directory
1140 @comment Test that closed stderr does not cause a crash. Not all
1141 @comment systems have the same message for EBADF.
1143 @comment xerr: ignore
1146 `errprint(` skipping: syscmd does not have unix semantics
1148 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1149 `errprint(` skipping: system does not allow closing stdout
1151 changequote(`[', `]')dnl
1152 syscmd([echo | ']__program__[' >&-])dnl
1153 @error{}m4: write error: Bad file descriptor
1160 `errprint(` skipping: syscmd does not have unix semantics
1162 syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0',
1163 `errprint(` skipping: system does not allow closing stdout
1165 changequote(`[', `]')dnl
1166 syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye
1167 )d"nl)dnl' > tmp.m4 \
1168 && ']__program__[' tmp.m4 <&- >&- \
1169 && rm tmp.m4])sysval
1175 @comment Test that we obey POSIX semantics with -D interspersed with
1176 @comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong).
1181 `errprint(` skipping: syscmd does not have unix semantics
1183 changequote(`[', `]')dnl
1184 syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl
1193 @chapter Lexical and syntactic conventions
1195 @cindex input tokens
1197 As @code{m4} reads its input, it separates it into @dfn{tokens}. A
1198 token is either a name, a quoted string, or any single character that
1199 is not a part of either a name or a string. Input to @code{m4} can also
1200 contain comments. GNU @code{m4} does not yet understand
1201 multibyte locales; all operations are byte-oriented rather than
1202 character-oriented (although if your locale uses a single byte
1203 encoding, such as @sc{ISO-8859-1}, you will not notice a difference).
1204 However, @code{m4} is eight-bit clean, so you can
1205 use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}),
1206 comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the
1207 exception of the @sc{nul} character (the zero byte @samp{'\0'}).
1209 @comment FIXME - each builtin needs to document how it handles NUL, then
1210 @comment update the above paragraph to mention that NUL is now handled
1211 @comment transparently.
1214 * Names:: Macro names
1215 * Quoted strings:: Quoting input to @code{m4}
1216 * Comments:: Comments in @code{m4} input
1217 * Other tokens:: Other kinds of input tokens
1218 * Input processing:: How @code{m4} copies input to output
1219 * Regular expression syntax:: How @code{m4} interprets regular expressions
1223 @section Macro names
1227 A name is any sequence of letters, digits, and the character @samp{_}
1228 (underscore), where the first character is not a digit. @code{m4} will
1229 use the longest such sequence found in the input. If a name has a
1230 macro definition, it will be subject to macro expansion
1231 (@pxref{Macros}). Names are case-sensitive.
1233 Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}.
1235 The definitions of letters, digits and other input characters can be
1236 changed at any time, using the builtin macro @code{changesyntax}.
1237 @xref{Changesyntax}, for more information.
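
The following sketch (with an invented macro @code{foo}) shows both the
longest-sequence rule and case sensitivity: only the exact name
@samp{foo} is expanded, while @samp{foo1}, @samp{Foo}, and @samp{_foo}
are different names without definitions.

@example
define(`foo', `bar')
@result{}
foo foo1 Foo _foo
@result{}bar foo1 Foo _foo
@end example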
1239 @node Quoted strings
1240 @section Quoting input to @code{m4}
1242 @cindex quoted string
1243 @cindex string, quoted
1244 A quoted string is a sequence of characters surrounded by quote
1245 strings, defaulting to
1246 @samp{`} (grave-accent, also known as back-tick, with UCS value U+0060)
1247 and @samp{'} (apostrophe, also known as single-quote, with UCS value
1248 U+0027), where the nested begin and end quotes within the
1249 string are balanced. The value of a string token is the text, with one
1250 level of quotes stripped off. Thus
1259 is the empty string, and double-quoting turns into single-quoting.
1267 The quote characters can be changed at any time, using the builtin macros
1268 @code{changequote} (@pxref{Changequote}) or @code{changesyntax}
1269 (@pxref{Changesyntax}).
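
A few more cases, shown as a sketch with the default quote characters,
illustrate that exactly one level of quotes is removed and that nested
quotes pass through as long as they balance:

@example
`one level'
@result{}one level
``two levels''
@result{}`two levels'
`outer `inner' outer'
@result{}outer `inner' outer
@end example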
1272 @section Comments in @code{m4} input
1275 Comments in @code{m4} are normally delimited by the characters @samp{#}
1276 and newline. All characters between the comment delimiters are ignored,
1277 but the entire comment (including the delimiters) is passed through to
1278 the output, unless you supply the @option{--discard-comments} or
1279 @option{-c} option at the command line (@pxref{Operation modes, ,
1280 Invoking m4}). When discarding comments, the comment delimiters are
1281 discarded, even if the close-comment string is a newline.
1283 Comments cannot be nested, so the first newline after a @samp{#} ends
1284 the comment. The commenting effect of the begin-comment string
1285 can be inhibited by quoting it.
1289 `quoted text' # `commented text'
1290 @result{}quoted text # `commented text'
1291 `quoting inhibits' `#' `comments'
1292 @result{}quoting inhibits # comments
1295 @comment options: -c
1298 `quoted text' # `commented text'
1299 `quoting inhibits' `#' `comments'
1300 @result{}quoted text quoting inhibits # comments
1303 The comment delimiters can be changed to any string at any time, using
1304 the builtin macros @code{changecom} (@pxref{Changecom}) or
1305 @code{changesyntax} (@pxref{Changesyntax}).
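
Because the text of a comment is not scanned for macro names, a macro
mentioned inside a comment is left alone. A short sketch, using an
invented macro @code{foo}:

@example
define(`foo', `bar')
@result{}
foo # foo is not expanded here
@result{}bar # foo is not expanded here
@end example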
1308 @section Other kinds of input tokens
1310 @cindex tokens, special
1311 Any character that is neither part of a name, nor of a quoted string,
1312 nor of a comment, is a token by itself. When not in the context of macro
1313 expansion, all of these tokens are just copied to output. However,
1314 during macro expansion, whitespace characters (space, tab, newline,
1315 formfeed, carriage return, vertical tab), parentheses (@samp{(} and
1316 @samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional
1317 roles, explained later. Which characters actually perform these roles
1318 can be adjusted with @code{changesyntax} (@pxref{Changesyntax}).
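
The contrast can be sketched with an invented two-argument macro
@code{show}. Inside the call, the parentheses and comma delimit the
arguments and unquoted leading whitespace is dropped (trailing
whitespace is kept); outside of any call, the same characters are
copied through unchanged.

@example
define(`show', `[$1][$2]')
@result{}
show(  a , b )
@result{}[a ][b ]
( a , b ) $1
@result{}( a , b ) $1
@end example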
1320 @node Input processing
1321 @section How @code{m4} copies input to output
1323 As @code{m4} reads the input token by token, it copies each token
1324 directly to the output.
1326 The exception is when it finds a word with a macro definition. In that
1327 case @code{m4} will calculate the macro's expansion, possibly reading
1328 more input to get the arguments. It then inserts the expansion in front
1329 of the remaining input. In other words, the resulting text from a macro
1330 call will be read and parsed into tokens again.
1332 @code{m4} expands a macro as soon as possible. If it finds a macro call
1333 when collecting the arguments to another, it will expand the second call
1334 first. This process continues until there are no more macro calls to
1335 expand and all the input has been consumed.
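
A minimal sketch of rescanning, using two invented macros: the
expansion of @code{greet} contains the name @code{hello}, which is
itself expanded when the text is read again.

@example
define(`hello', `Hello, world')
@result{}
define(`greet', `hello!')
@result{}
greet
@result{}Hello, world!
@end example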
1337 For a running example, examine how @code{m4} handles this input:
1341 format(`Result is %d', eval(`2**15'))
1345 First, @code{m4} sees that the token @samp{format} is a macro name, so
1346 it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,},
1347 and @samp{@w{ }}, before encountering another potential macro. Sure
1348 enough, @samp{eval} is a macro name, so the nested argument collection
1349 picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro
1350 with the lone argument of @samp{2**15}. The expansion of
1351 @samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five
1352 tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and
1353 combined with the next @samp{)}, the format macro now has all its
1354 arguments, as if the user had typed:
1358 format(`Result is %d', 32768)
1362 The format macro expands to @samp{Result is 32768}, and we have another
1363 round of scanning for the tokens @samp{Result}, @samp{@w{ }},
1364 @samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and
1365 @samp{8}. None of these are macros, so the final output is
1369 @result{}Result is 32768
1372 As a more complicated example, we will contrast an actual code example
1373 from the Gnulib project@footnote{Derived from a patch in
1374 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
1375 and a followup patch in
1376 @uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
1377 showing both a buggy approach and the desired results. The user desires
1378 to output a shell assignment statement that takes its argument and turns
1379 it into a shell variable by converting it to uppercase and prepending a
1380 prefix. The original attempt looks like this:
1384 define([gl_STRING_MODULE_INDICATOR],
1387 GNULIB_]translit([$1],[a-z],[A-Z])[=1
1389 gl_STRING_MODULE_INDICATOR([strcase])
1391 @result{} GNULIB_strcase=1
1395 Oops -- the argument did not get capitalized. And although the manual
1396 is not able to easily show it, both lines that appear empty actually
1397 contain two trailing spaces. By stepping through the parse, it is easy
1398 to see what happened. First, @code{m4} sees the token
1399 @samp{changequote}, which it recognizes as a macro, followed by
1400 @samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
1401 argument list. The macro expands to the empty string, but changes the
1402 quoting characters to something more useful for generating shell code
1403 (unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
1404 but unbalanced @samp{[]} tend to be rare). Also in the first line,
1405 @code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
1406 macro that consumes the rest of the line, resulting in no output for
1409 The second line starts a macro definition. @code{m4} sees the token
1410 @samp{define}, which it recognizes as a macro, followed by a @samp{(},
1411 @samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
1412 comma was encountered, the first argument is known to be the expansion
1413 of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
1414 Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
1415 whitespace is discarded as part of argument collection. Then comes a
1416 rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
1417 comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
1418 @samp{translit}, which @code{m4} recognizes as a macro name, so a nested
1419 macro expansion has started.
1421 The arguments to @code{translit} are collected from the tokens @samp{(},
1422 @samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
1423 @samp{)}. All three string arguments are expanded (or in other words,
1424 the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
1425 capitalization, the result of the macro is @samp{$1}. This expansion is
1426 rescanned, resulting in the two literal characters @samp{$} and
1429 Scanning of the outer macro resumes, and picks up with
1430 @samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
1431 expanded text are concatenated, with the end result that the macro
1432 @samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
1433 @samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
1434 Once again, @samp{dnl} is recognized and avoids a newline in the output.
1436 The final line is then parsed, beginning with @samp{ } and @samp{ }
1437 that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
1438 recognized as a macro name, with an argument list of @samp{(},
1439 @samp{[strcase]}, and @samp{)}. Since the definition of the macro
1440 contains the sequence @samp{$1}, that sequence is replaced with the
1441 argument @samp{strcase} prior to starting the rescan. The rescan sees
1442 @samp{@key{NL}} and four spaces, which are output literally, then
1443 @samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
1444 come four more spaces, also output literally, and the token
1445 @samp{GNULIB_strcase}, which resulted from the earlier parameter
1446 substitution. Since that is not a macro name, it is output literally,
1447 followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
1448 two more spaces. Finally, the original @samp{@key{NL}} seen after the
1449 macro invocation is scanned and output literally.
1451 Now for a corrected approach. This rearranges the use of newlines and
1452 whitespace so that less whitespace is output (which, although harmless
1453 to shell scripts, can be visually unappealing), and fixes the quoting
1454 issues so that the capitalization occurs when the macro
1455 @samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
1456 defined. It also adds another layer of quoting to the first argument of
1457 @code{translit}, to ensure that the output will be rescanned as a string
1458 rather than a potential uppercase macro name needing further expansion.
1462 define([gl_STRING_MODULE_INDICATOR],
1464 GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl
1466 gl_STRING_MODULE_INDICATOR([strcase])
1467 @result{} GNULIB_STRCASE=1
1470 The parsing of the first line is unchanged. The second line sees the
1471 name of the macro to define, then sees the discarded @samp{@key{NL}}
1472 and two spaces, as before. But this time, the next token is
1473 @samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z],
1474 [A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
1475 @samp{)} to end the macro definition and @samp{dnl} to skip the
1476 newline. No early expansion of @code{translit} occurs, so the entire
1477 string becomes the definition of the macro.
1479 The final line is then parsed, beginning with two spaces that are
1480 output literally, and an invocation of
1481 @code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
1482 Again, the @samp{$1} in the macro definition is substituted prior to
1483 rescanning. Rescanning first encounters @samp{dnl}, and discards
1484 @samp{ comment@key{NL}}. Then two spaces are output literally. Next
1485 comes the token @samp{GNULIB_}, but that is not a macro, so it is
1486 output literally. The token @samp{[]} is an empty string, so it does
1487 not affect output. Then the token @samp{translit} is encountered.
1489 This time, the arguments to @code{translit} are parsed as @samp{(},
1490 @samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
1491 @samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the
1492 @code{translit} call yields the desired result @samp{[STRCASE]}. This is
1493 rescanned, but since it is a string, the quotes are stripped and the
1494 only output is a literal @samp{STRCASE}.
1495 Then the scanner sees @samp{=} and @samp{1}, which are output
1496 literally, followed by @samp{dnl} which discards the rest of the
1497 definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
1498 end of output is the literal @samp{@key{NL}} that appeared after the
1499 invocation of the macro.
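
The underlying lesson, that an unquoted macro call inside a definition
is expanded while the definition is being collected rather than when
the macro is used, can be reduced to a smaller sketch with the default
quote characters. The macro names @code{early} and @code{late} are
invented for illustration.

@example
define(`early', translit(`$1', `a-z', `A-Z'))
@result{}
define(`late', `translit(`$1', `a-z', `A-Z')')
@result{}
early(`abc') late(`abc')
@result{}abc ABC
@end example

In @code{early}, @code{translit} runs at definition time on the literal
text @samp{$1}, so the stored definition is just @samp{$1}. In
@code{late}, the extra layer of quotes defers the call until the
argument has been substituted.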
1501 The order in which @code{m4} expands the macros can be further explored
1502 using the trace facilities of GNU @code{m4} (@pxref{Trace}).
1504 @node Regular expression syntax
1505 @section How @code{m4} interprets regular expressions
1507 There are several contexts where @code{m4} parses an argument as a
1508 regular expression. This section describes the various flavors of
1509 regular expressions. @xref{Changeresyntax}.
1511 @include regexprops-generic.texi
1514 @chapter How to invoke macros
1516 This chapter covers macro invocation, macro arguments and how macro
1517 expansion is treated.
1520 * Invocation:: Macro invocation
1521 * Inhibiting Invocation:: Preventing macro invocation
1522 * Macro Arguments:: Macro arguments
1523 * Quoting Arguments:: On Quoting Arguments to macros
1524 * Macro expansion:: Expanding macros
1528 @section Macro invocation
1530 @cindex macro invocation
1531 @cindex invoking macros
1532 A macro invocation has one of the forms
1540 which is a macro invocation without any arguments, or
1544 name(arg1, arg2, @dots{}, arg@var{n})
1548 which is a macro invocation with @var{n} arguments. Macros can have any
1549 number of arguments. All arguments are strings, but different macros
1550 might interpret the arguments in different ways.
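
The different forms can be told apart with a small sketch. The macro
@code{count} is invented here; it expands to @samp{$#}, the number of
arguments it received.

@example
define(`count', `$#')
@result{}
count count() count(`a', `b') count (`a')
@result{}0 1 2 0 (a)
@end example

In the last call, the space before the parenthesis causes @code{count}
to be invoked without arguments, after which the parenthesized text is
copied as ordinary input.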
1552 The opening parenthesis @emph{must} follow the @var{name} directly, with
1553 no spaces in between. If it does not, the macro is called with no
1556 For a macro call to have no arguments, the parentheses @emph{must} be
1557 left out. The macro call
1565 is a macro call with one argument, which is the empty string, not a call
1568 @node Inhibiting Invocation
1569 @section Preventing macro invocation
1571 An innovation of the @code{m4} language, compared to some of its
1572 predecessors (like Strachey's @code{GPM}, for example), is the ability
1573 to recognize macro calls without resorting to any special, prefixed
1574 invocation character. While generally useful, this feature might
1575 sometimes be the source of spurious, unwanted macro calls. So, GNU
1576 @code{m4} offers several mechanisms or techniques for inhibiting the
1577 recognition of names as macro calls.
1579 @cindex GNU extensions
1581 @cindex macro, blind
1582 First of all, many builtin macros cannot meaningfully be called without
1583 arguments. As a GNU extension, for any of these macros,
1584 whenever an opening parenthesis does not immediately follow their name,
1585 the builtin macro call is not triggered. This solves the most usual
1586 cases, such as @samp{include} or @samp{eval}. Later in this document,
1587 the sentence ``This macro is recognized only with parameters'' refers to
1588 this specific provision of GNU M4, also known as a blind
1589 builtin macro. For the builtins defined by POSIX that bear
1590 this disclaimer, POSIX specifically states that invoking those
1591 builtins without arguments is unspecified, because many other
1592 implementations simply invoke the builtin as though it were given one
1593 empty argument instead.
1603 There is also a command line option (@option{--prefix-builtins}, or
1604 @option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all
1605 builtin macros with a prefix of @samp{m4_} at startup. The option has
1606 no effect whatsoever on user defined macros. For example, with this option,
1607 one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has
1608 no effect on whether a macro requires parameters.
1610 @comment options: -P
1623 Another alternative is to redefine problematic macros to a name less
1624 likely to cause conflicts, using @ref{Definitions}. Or the parsing
1625 engine can be changed to redefine what constitutes a valid macro name,
1626 using @ref{Changesyntax}.
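
One way to do the former, sketched here with an invented name
@code{my_len}, is to copy the definition of a builtin with @code{defn}
and then remove the original name, so that the bare word is no longer
recognized as a macro:

@example
define(`my_len', defn(`len'))undefine(`len')
@result{}
len(`abc') my_len(`abc')
@result{}len(abc) 3
@end example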
1628 Of course, the simplest way to prevent a name from being interpreted
1629 as a call to an existing macro is to quote it. The remainder of
1630 this section studies a little more deeply how quoting affects macro
1631 invocation, and how quoting can be used to inhibit macro invocation.
1633 Even if quoting is usually done over the whole macro name, it can also
1634 be done over only a few characters of this name (provided, of course,
1635 that the unquoted portions are not also a macro). It is also possible
1636 to quote the empty string, but this works only @emph{inside} the name.
1651 all yield the string @samp{divert}. While in both:
1661 the @code{divert} builtin macro will be called, which expands to the
1665 The output of macro evaluations is always rescanned. In the following
1666 example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as
1668 has been given @w{@samp{substr(ab`'cde, `1', `3')}} as input:
1671 define(`cde', `CDE')
1673 define(`x', `substr(ab')
1675 define(`y', `cde, `1', `3')')
1681 Unquoted strings on either side of a quoted string are subject to
1682 being recognized as macro names. In the following example, quoting the
1683 empty string allows for the second @code{macro} to be recognized as such:
1686 define(`macro', `m')
1694 Quoting may prevent recognizing as a macro name the concatenation of a
1695 macro expansion with the surrounding characters. In this example:
1698 define(`macro', `di$1')
1707 the input will produce the string @samp{divert}. If the quotes were
1708 removed, the @code{divert} builtin would be called instead.
1710 @node Macro Arguments
1711 @section Macro arguments
1713 @cindex macros, arguments to
1714 @cindex arguments to macros
1715 When a name is seen, and it has a macro definition, it will be expanded
1718 If the name is followed by an opening parenthesis, the arguments will be
1719 collected before the macro is called. If too few arguments are
1720 supplied, the missing arguments are taken to be the empty string.
1721 However, some builtins are documented to behave differently for a
1722 missing optional argument than for an explicit empty string. If there
1723 are too many arguments, the excess arguments are ignored. Unquoted
1724 leading whitespace is stripped off all arguments, but whitespace
1725 generated by a macro expansion or occurring after a macro that expanded
1726 to an empty string remains intact. Whitespace includes space, tab,
1727 newline, carriage return, vertical tab, and formfeed.
1730 define(`macro', `$1')
1732 macro( unquoted leading space lost)
1733 @result{}unquoted leading space lost
1734 macro(` quoted leading space kept')
1735 @result{} quoted leading space kept
1737 divert `unquoted space kept after expansion')
1738 @result{} unquoted space kept after expansion
1740 ')`whitespace from expansion kept')
1742 @result{}whitespace from expansion kept
1743 macro(`unquoted trailing whitespace kept'
1745 @result{}unquoted trailing whitespace kept
1749 @cindex warnings, suppressing
1750 @cindex suppressing warnings
1751 Normally @code{m4} will issue warnings if a builtin macro is called
1752 with an inappropriate number of arguments, but these can be suppressed with
1753 the @option{--quiet} command line option (or @option{--silent}, or
1754 @option{-Q}, @pxref{Operation modes, , Invoking m4}). For user
1755 defined macros, there is no check of the number of arguments given.
1760 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
1764 index(`abc', `b', `0', `ignored')
1765 @error{}m4:stdin:3: warning: index: extra arguments ignored: 4 > 3
1769 @comment options: -Q
1776 index(`abc', `b', `', `ignored')
1780 Macros are expanded normally during argument collection, and whatever
1781 commas, quotes and parentheses that might show up in the resulting
1782 expanded text will serve to define the arguments as well. Thus, if
1783 @var{foo} expands to @samp{, b, c}, the macro call
1791 is a macro call with four arguments, which are @samp{a }, @samp{b},
1792 @samp{c} and @samp{d}. To understand why the first argument contains
1793 whitespace, remember that unquoted leading whitespace is never part
1794 of an argument, but trailing whitespace always is.
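
One way to observe this is sketched below.  The helper macros
@code{count} and @code{first} are hypothetical names introduced only
for this illustration; @code{count} reports how many arguments were
collected, and @code{first} brackets the first argument so that its
trailing space is visible:

@example
define(`foo', `, b, c')
@result{}
define(`count', `$#')
@result{}
define(`first', `<$1>')
@result{}
count(a foo, d)
@result{}4
first(a foo, d)
@result{}<a >
@end example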
1796 It is possible for a macro's definition to change during argument
1797 collection, in which case the expansion uses the definition that was in
1798 effect at the time the opening @samp{(} was seen.
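
A minimal sketch of this behavior, with @code{f} as an arbitrary macro
name: the outer call to @code{f} still uses the old definition, even
though its argument redefines @code{f} before the expansion takes
place.

@example
define(`f', `1')
@result{}
f(define(`f', `2'))
@result{}1
f
@result{}2
@end example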
1809 It is an error if the end of file occurs while collecting arguments.
1814 @result{}hello world
1817 @error{}m4:stdin:2: define: end of file in argument list
1820 @node Quoting Arguments
1821 @section On Quoting Arguments to macros
1823 @cindex quoted macro arguments
1824 @cindex macros, quoted arguments to
1825 @cindex arguments, quoted macro
1826 Each argument has unquoted leading whitespace removed. Within each
1827 argument, all unquoted parentheses must match. For example, if
1828 @var{foo} is a macro,
1836 is a macro call, with one argument, whose value is @samp{() (() (}.
1837 Commas separate arguments, except when they occur inside quotes,
1838 comments, or unquoted parentheses. @xref{Pseudo Arguments}, for
1841 It is common practice to quote all arguments to macros, unless you are
1842 sure you want the arguments expanded. Thus, in the above
1843 example with the parentheses, the `right' way to do it is like this:
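
As a sketch, assuming a hypothetical one-argument definition of
@code{foo} that merely brackets its argument, quoting the whole
argument passes the parentheses through without requiring them to
balance:

@example
define(`foo', `<$1>')
@result{}
foo(`() (() (')
@result{}<() (() (>
@end example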
1850 @cindex quoting rule of thumb
1851 @cindex rule of thumb, quoting
1852 It is, however, in certain cases necessary (because nested expansion
1853 must occur to create the arguments for the outer macro) or convenient
1854 (because it uses fewer characters) to leave out quotes for some
1855 arguments, and there is nothing wrong with doing so. It just makes life a
1856 bit harder if you are not careful to follow a consistent quoting style.
1857 For consistency, this manual follows the rule of thumb that each layer
1858 of parentheses introduces another layer of single quoting, except when
1859 showing the consequences of quoting rules. This is done even when the
1860 quoted string cannot be a macro, such as with integers when you have not
1861 changed the syntax via @code{changesyntax} (@pxref{Changesyntax}).
1863 The quoting rule of thumb of one level of quoting per layer of parentheses has a
1864 nice property: when a macro name appears inside parentheses, you can
1865 determine when it will be expanded. If it is not quoted, it will be
1866 expanded prior to the outer macro, so that its expansion becomes the
1867 argument. If it is single-quoted, it will be expanded after the outer
1868 macro. And if it is double-quoted, it will be used as literal text
1869 instead of a macro name.
1872 define(`active', `ACT, IVE')
1874 define(`show', `$1 $1')
1879 @result{}ACT, IVE ACT, IVE
1881 @result{}active active
1884 @node Macro expansion
1885 @section Macro expansion
1887 @cindex macros, expansion of
1888 @cindex expansion of macros
1889 When the arguments, if any, to a macro call have been collected, the
1890 macro is expanded, and the expansion text is pushed back onto the input
1891 (unquoted), and reread. The expansion text from one macro call might
1892 therefore result in more macros being called, if the calls are included,
1893 completely or partially, in the first macro call's expansion.
1895 Taking a very simple example, if @var{foo} expands to @samp{bar}, and
1896 @var{bar} expands to @samp{Hello world}, the input
1898 @comment options: -Dbar='Hello world' -Dfoo=bar
1900 $ @kbd{m4 -Dbar="Hello world" -Dfoo=bar}
1902 @result{}Hello world
1906 will expand first to @samp{bar}, and when this is reread and
1907 expanded, into @samp{Hello world}.
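
The same rescanning can be sketched with definitions made in the input
rather than on the command line.  Here @code{defn} (@pxref{Defn}) is
used only to display the intermediate text @samp{bar} without letting
it be rescanned:

@example
define(`bar', `Hello world')
@result{}
define(`foo', `bar')
@result{}
foo
@result{}Hello world
defn(`foo')
@result{}bar
@end example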
1910 @chapter How to define new macros
1912 @cindex macros, how to define new
1913 @cindex defining new macros
1914 Macros can be defined, redefined and deleted in several different ways.
1915 Also, it is possible to redefine a macro without losing a previous
1916 value, and bring back the original value at a later time.
1919 * Define:: Defining a new macro
1920 * Arguments:: Arguments to macros
1921 * Pseudo Arguments:: Special arguments to macros
1922 * Undefine:: Deleting a macro
1923 * Defn:: Renaming macros
1924 * Pushdef:: Temporarily redefining macros
1925 * Renamesyms:: Renaming macros with regular expressions
1927 * Indir:: Indirect call of macros
1928 * Builtin:: Indirect call of builtins
1929 * M4symbols:: Getting the defined macro names
1933 @section Defining a macro
1935 The normal way to define or redefine macros is to use the builtin
1938 @deffn {Builtin (m4)} define (@var{name}, @ovar{expansion})
1939 Defines @var{name} to expand to @var{expansion}. If
1940 @var{expansion} is not given, it is taken to be empty.
1942 The expansion of @code{define} is void.
1943 The macro @code{define} is recognized only with parameters.
1945 @comment Other implementations, such as Solaris, can define a macro
1946 @comment with a builtin token attached to text:
1947 @comment define(foo, a`'defn(`divnum')b)
1948 @comment defn(`foo') => ab
1949 @comment dumpdef(`foo') => foo: a<divnum>b
1950 @comment len(defn(`foo')) => 3
1951 @comment index(defn(`foo'), defn(`divnum')) => 1
1953 @comment It may be worth making some changes to support this behavior,
1954 @comment or something similar to it.
1956 @comment But be sure it has sane semantics, with potentially deferred
1957 @comment expansion of builtins. For example, this should not warn
1958 @comment about trying to access the definition of an undefined macro:
1959 @comment define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops')
1960 @comment Also, think how to handle conflicting argument counts:
1961 @comment define(`bar', defn(`dnl', `len'))
1963 The following example defines the macro @var{foo} to expand to the text
1964 @samp{Hello world.}.
1967 define(`foo', `Hello world.')
1970 @result{}Hello world.
1973 The empty line in the output is there because the newline is not
1974 a part of the macro definition, and it is consequently copied to
1975 the output. This can be avoided by use of the macro @code{dnl}.
1976 @xref{Dnl}, for details.
1978 The first argument to @code{define} should be quoted; otherwise, if the
1979 macro is already defined, you will be defining a different macro. This
1980 example shows the problems with underquoting, since we did not want to
1981 redefine @code{one}:
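
A minimal sketch of the problem follows; the names @code{foo},
@code{one} and @code{two} are arbitrary.  Because @code{foo} is already
defined when the second @code{define} is collected, it expands to
@samp{one} first, and it is @code{one} that ends up being defined:

@example
define(foo, one)
@result{}
define(foo, two)
@result{}
one
@result{}two
@end example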
1992 @cindex GNU extensions
1993 GNU @code{m4} normally replaces only the @emph{topmost}
1994 definition of a macro if it has several definitions from @code{pushdef}
1995 (@pxref{Pushdef}). Some other implementations of @code{m4} replace all
1996 definitions of a macro with @code{define}. @xref{Incompatibilities},
1999 As a GNU extension, the first argument to @code{define} does
2000 not have to be a simple word.
2001 It can be any text string, even the empty string. A macro with a
2002 non-standard name cannot be invoked in the normal way, as the name is
2003 not recognized. It can only be referenced by the builtins @code{indir}
2004 (@pxref{Indir}) and @code{defn} (@pxref{Defn}).
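
As an illustrative sketch, the hypothetical name @samp{a-b} below is
not a valid word, so writing it in the input never triggers the macro;
only @code{indir} and @code{defn} can reach the definition:

@example
define(`a-b', `hyphenated')
@result{}
a-b
@result{}a-b
indir(`a-b')
@result{}hyphenated
defn(`a-b')
@result{}hyphenated
@end example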
2007 Arrays and associative arrays can be simulated by using non-standard
2010 @deffn Composite array (@var{index})
2011 @deffnx Composite array_set (@var{index}, @ovar{value})
2012 Provide access to entries within an array. @code{array} reads the entry
2013 at location @var{index}, and @code{array_set} assigns @var{value} to
2014 location @var{index}.
2018 define(`array', `defn(format(``array[%d]'', `$1'))')
2020 define(`array_set', `define(format(``array[%d]'', `$1'), `$2')')
2022 array_set(`4', `array element no. 4')
2024 array_set(`17', `array element no. 17')
2027 @result{}array element no. 4
2028 array(eval(`10 + 7'))
2029 @result{}array element no. 17
2032 Change the @samp{%d} to @samp{%s} and it is an associative array.
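
A sketch of that variation might look as follows.  The names
@code{dict} and @code{dict_set} are hypothetical, chosen here only to
keep the example separate from the @code{array} macros above:

@example
define(`dict', `defn(format(``dict[%s]'', `$1'))')
@result{}
define(`dict_set', `define(format(``dict[%s]'', `$1'), `$2')')
@result{}
dict_set(`color', `blue')
@result{}
dict(`color')
@result{}blue
@end example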
2035 @section Arguments to macros
2037 @cindex macros, arguments to
2038 @cindex arguments to macros
2039 Macros can have arguments. The @var{n}th argument is denoted by
2040 @code{$n} in the expansion text, and is replaced by the @var{n}th actual
2041 argument, when the macro is expanded. Replacement of arguments happens
2042 before rescanning, regardless of how many nesting levels of quoting
2043 appear in the expansion. Here is an example of a macro with
2046 @deffn Composite exch (@var{arg1}, @var{arg2})
2047 Expands to @var{arg2} followed by @var{arg1}, effectively exchanging
2052 define(`exch', `$2, $1')
2054 exch(`arg1', `arg2')
2058 This can be used, for example, if you like the arguments to
2059 @code{define} to be reversed.
2062 define(`exch', `$2, $1')
2064 define(exch(``expansion text'', ``macro''))
2067 @result{}expansion text
2070 @xref{Quoting Arguments}, for an explanation of the double quotes.
2071 (You should try to improve this example so that clients of @code{exch}
2072 do not have to double quote; or @pxref{Improved exch, , Answers}).
2074 @cindex GNU extensions
2075 GNU @code{m4} allows the number following the @samp{$} to consist of one
2077 or more digits, allowing macros to have any number of arguments. This
2078 is not so in UNIX implementations of @code{m4}, which only recognize one digit.
2080 @comment FIXME - See Austin group XCU ERN 111. POSIX says that $11 must
2081 @comment be the first argument concatenated with 1, and instead reserves
2082 @comment ${11} for implementation use. Once this is implemented, the
2083 @comment documentation needs to reflect how these extended arguments
2084 @comment are handled, as well as backwards compatibility issues with
2085 @comment 1.4.x. Also, consider adding further extensions such as
2086 @comment ${1-default}, which expands to `default' if $1 is empty.
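
A brief sketch of a multi-digit reference, assuming the current GNU
behavior described above; the macro name @code{tenth} is illustrative
only:

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
@end example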
2088 As a special case, the zeroth argument, @code{$0}, is always the name
2089 of the macro being expanded.
2092 define(`test', ``Macro name: $0'')
2095 @result{}Macro name: test
2098 If you want quoted text to appear as part of the expansion text,
2099 remember that quotes can be nested in quoted strings. Thus, in
2102 define(`foo', `This is macro `foo'.')
2105 @result{}This is macro foo.
2109 The @samp{foo} in the expansion text is @emph{not} expanded, since it is
2110 a quoted string, and not a name.
2112 @node Pseudo Arguments
2113 @section Special arguments to macros
2115 @cindex special arguments to macros
2116 @cindex macros, special arguments to
2117 @cindex arguments to macros, special
2118 There is a special notation for the number of actual arguments supplied,
2119 and for all the actual arguments.
2121 The number of actual arguments in a macro call is denoted by @code{$#}
2122 in the expansion text.
2124 @deffn Composite nargs (@dots{})
2125 Expands to a count of the number of arguments supplied.
2129 define(`nargs', `$#')
2135 nargs(`arg1', `arg2', `arg3')
2137 nargs(`commas can be quoted, like this')
2139 nargs(arg1#inside comments, commas do not separate arguments
2142 nargs((unquoted parentheses, like this, group arguments))
2146 Remember that @samp{#} defaults to the comment character; if you forget
2147 quotes to inhibit the comment behavior, your macro definition may not
2148 end where you expected.
2151 dnl Attempt to define a macro to just `$#'
2152 define(underquoted, $#)
2160 The notation @code{$*} can be used in the expansion text to denote all
2161 the actual arguments, unquoted, with commas in between. For example
2164 define(`echo', `$*')
2166 echo(arg1, arg2, arg3 , arg4)
2167 @result{}arg1,arg2,arg3 ,arg4
2170 Often each argument should be quoted, and the notation @code{$@@} handles
2171 that. It is just like @code{$*}, except that it quotes each argument.
2172 A simple example of that is:
2175 define(`echo', `$@@')
2177 echo(arg1, arg2, arg3 , arg4)
2178 @result{}arg1,arg2,arg3 ,arg4
2181 Where did the quotes go? Of course, they were eaten, when the expanded
2182 text was reread by @code{m4}. To show the difference, try
2185 define(`echo1', `$*')
2187 define(`echo2', `$@@')
2189 define(`foo', `This is macro `foo'.')
2192 @result{}This is macro This is macro foo..
2194 @result{}This is macro foo.
2196 @result{}This is macro foo.
2202 @xref{Trace}, if you do not understand this. As another example of the
2203 difference, remember that comments encountered in arguments are passed
2204 untouched to the macro, and that quoting disables comments.
2207 define(`echo1', `$*')
2209 define(`echo2', `$@@')
2211 define(`foo', `bar')
2223 A @samp{$} sign in the expansion text, that is not followed by anything
2224 @code{m4} understands, is simply copied to the macro expansion, as any
2228 define(`foo', `$$$ hello $$$')
2231 @result{}$$$ hello $$$
2235 @cindex literal output
2236 @cindex output, literal
2237 If you want a macro to expand to something like @samp{$12}, the
2238 judicious use of nested quoting can put a safe character between the
2239 @code{$} and the next character, relying on the rescanning to remove the
2240 nested quote. This will prevent @code{m4} from interpreting the
2241 @code{$} sign as a reference to an argument.
2244 define(`foo', `no nested quote: $1')
2247 @result{}no nested quote: arg
2248 define(`foo', `nested quote around $: `$'1')
2251 @result{}nested quote around $: $1
2252 define(`foo', `nested empty quote after $: $`'1')
2255 @result{}nested empty quote after $: $1
2256 define(`foo', `nested quote around next character: $`1'')
2259 @result{}nested quote around next character: $1
2260 define(`foo', `nested quote around both: `$1'')
2263 @result{}nested quote around both: arg
2267 @section Deleting a macro
2269 @cindex macros, how to delete
2270 @cindex deleting macros
2271 @cindex undefining macros
2272 A macro definition can be removed with @code{undefine}:
2274 @deffn {Builtin (m4)} undefine (@var{name}@dots{})
2275 For each argument, remove the macro @var{name}. The macro names must
2276 necessarily be quoted, since they will be expanded otherwise. If an
2277 argument is not a defined macro, then the @samp{d} debug level controls
2278 whether a warning is issued (@pxref{Debugmode}).
2280 The expansion of @code{undefine} is void.
2281 The macro @code{undefine} is recognized only with parameters.
2286 @result{}foo bar blah
2287 define(`foo', `some')define(`bar', `other')define(`blah', `text')
2290 @result{}some other text
2294 @result{}foo other text
2295 undefine(`bar', `blah')
2298 @result{}foo bar blah
2301 Undefining a macro inside that macro's expansion is safe; the macro
2302 still expands to the definition that was in effect at the @samp{(}.
2305 define(`f', ``$0':$1')
2307 f(f(f(undefine(`f')`hello world')))
2308 @result{}f:f:f:hello world
2313 As of M4 1.6, @code{undefine} can warn if @var{name} is not a macro, by
2314 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2315 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2321 @error{}m4:stdin:1: warning: undefine: undefined macro 'a'
2330 @section Renaming macros
2332 @cindex macros, how to rename
2333 @cindex renaming macros
2334 @cindex macros, displaying definitions
2335 @cindex definitions, displaying macro
2336 It is possible to rename an already defined macro. To do this, you need
2337 the builtin @code{defn}:
2339 @deffn {Builtin (m4)} defn (@var{name}@dots{})
2340 Expands to the @emph{quoted definition} of each @var{name}. If an
2341 argument is not a defined macro, the expansion for that argument is
2342 empty, and the @samp{d} debug level controls whether a warning is issued
2343 (@pxref{Debugmode}).
2345 If @var{name} is a user-defined macro, the quoted definition is simply
2346 the quoted expansion text. If, instead, @var{name} is a builtin, the
2347 expansion is a special token, which points to the builtin's internal
2348 definition. This token is meaningful primarily as the second argument to
2349 @code{define} (and @code{pushdef}), and is silently converted to an
2350 empty string in many other contexts.
2352 The macro @code{defn} is recognized only with parameters.
2355 Its normal use is best understood through an example, which shows how to
2356 rename @code{undefine} to @code{zap}:
2359 define(`zap', defn(`undefine'))
2364 @result{}undefine(zap)
2367 In this way, @code{defn} can be used to copy macro definitions, and also
2368 definitions of builtin macros. Even if the original macro is removed,
2369 the other name can still be used to access the definition.
2371 The fact that macro definitions can be transferred also explains why you
2372 should use @code{$0}, rather than retyping a macro's name in its
2376 define(`foo', `This is `$0'')
2378 define(`bar', defn(`foo'))
2381 @result{}This is bar
2384 Macros used as string variables should be referred to through @code{defn},
2385 to avoid unwanted expansion of the text:
2388 define(`string', `The macro dnl is very useful
2392 @result{}The macro@w{ }
2394 @result{}The macro dnl is very useful
2399 However, it is important to remember that @code{m4} rescanning is purely
2400 textual. If an unbalanced end-quote string occurs in a macro
2401 definition, the rescan will see that embedded quote as the termination
2402 of the quoted string, and the remainder of the macro's definition will
2403 be rescanned unquoted. Thus it is a good idea to avoid unbalanced
2404 end-quotes in macro definitions or arguments to macros.
2411 define(`echo', `$@@')
2421 On the other hand, it is possible to exploit the fact that @code{defn}
2422 can concatenate multiple macros prior to the rescanning phase, in order
2423 to join the definitions of macros that, in isolation, have unbalanced
2424 quotes. This is particularly useful when one has used several macros to
2425 accumulate text that M4 should rescan as a whole. In the example below,
2426 note how the use of @code{defn} on @code{l} in isolation opens a string,
2427 which is not closed until the next line; but used on @code{l} and
2428 @code{r} together results in nested quoting.
2431 define(`l', `<[>')define(`r', `<]>')
2433 changequote(`[', `]')
2437 @result{}<[>]defn([r])
2443 @cindex builtins, special tokens
2444 @cindex tokens, builtin macro
2445 Using @code{defn} to generate special tokens for builtin macros will
2446 generate a warning in contexts where a macro name is expected. But in
2447 contexts that operate on text, the builtin token is just silently
2448 converted to an empty string. As of M4 1.6, expansion of user macros
2449 will also preserve builtin tokens. However, any use of builtin tokens
2450 outside of the second argument to @code{define} and @code{pushdef} is
2451 generally not portable, since earlier GNU M4 versions, as well
2452 as other @code{m4} implementations, vary on how such tokens are treated.
2458 define(defn(`divnum'), `cannot redefine a builtin token')
2459 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2465 define(`echo', `$@@')
2467 define(`mydivnum', shift(echo(`', defn(`divnum'))))
2471 define(`', `empty-$1')
2473 defn(defn(`divnum'))
2474 @error{}m4:stdin:9: warning: defn: invalid macro name ignored
2476 pushdef(defn(`divnum'), `oops')
2477 @error{}m4:stdin:10: warning: pushdef: invalid macro name ignored
2479 traceon(defn(`divnum'))
2480 @error{}m4:stdin:11: warning: traceon: invalid macro name ignored
2482 indir(defn(`divnum'), `string')
2483 @error{}m4:stdin:12: warning: indir: invalid macro name ignored
2486 @result{}empty-string
2487 traceoff(defn(`divnum'))
2488 @error{}m4:stdin:14: warning: traceoff: invalid macro name ignored
2490 popdef(defn(`divnum'))
2491 @error{}m4:stdin:15: warning: popdef: invalid macro name ignored
2493 dumpdef(defn(`divnum'))
2494 @error{}m4:stdin:16: warning: dumpdef: invalid macro name ignored
2496 undefine(defn(`divnum'))
2497 @error{}m4:stdin:17: warning: undefine: invalid macro name ignored
2500 @error{}:@tabchar{}`empty-$1'
2502 m4symbols(defn(`divnum'))
2503 @error{}m4:stdin:19: warning: m4symbols: invalid macro name ignored
2505 define(`foo', `define(`$1', $2)')dnl
2506 foo(`bar', defn(`divnum'))
2512 As of M4 1.6, @code{defn} can warn if @var{name} is not a macro, by
2513 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2514 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2515 m4}). Also, @code{defn} with multiple arguments can join text with
2516 builtin tokens. However, when defining a macro via @code{define} or
2517 @code{pushdef}, a warning is issued and the builtin token ignored if the
2518 builtin token does not occur in isolation. A future version of
2519 GNU M4 may lift this restriction.
2524 @error{}m4:stdin:1: warning: defn: undefined macro 'foo'
2530 define(`a', `A')define(`AA', `b')
2532 traceon(`defn', `define')
2534 defn(`a', `divnum', `a')
2535 @error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A''
2537 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
2538 @error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
2539 @error{}m4:stdin:7: warning: define: cannot concatenate builtins
2540 @error{}m4trace: -1- define(`mydivnum', `<divnum><divnum>') -> `'
2542 traceoff(`defn', `define')dumpdef(`mydivnum')
2543 @error{}mydivnum:@tabchar{}`'
2545 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
2546 @error{}m4:stdin:9: warning: define: cannot concatenate builtins
2548 define(`mydivnum', defn(`divnum')`a')mydivnum
2549 @error{}m4:stdin:10: warning: define: cannot concatenate builtins
2551 define(`mydivnum', `a'defn(`divnum'))mydivnum
2552 @error{}m4:stdin:11: warning: define: cannot concatenate builtins
2554 define(`q', ``$@@'')
2556 define(`foo', q(`a', defn(`divnum')))foo
2557 @error{}m4:stdin:13: warning: define: cannot concatenate builtins
2559 ifdef(`foo', `yes', `no')
2564 @section Temporarily redefining macros
2566 @cindex macros, temporary redefinition of
2567 @cindex temporary redefinition of macros
2568 @cindex redefinition of macros, temporary
2569 @cindex definition stack
2570 @cindex pushdef stack
2571 @cindex stack, macro definition
2572 It is possible to redefine a macro temporarily, reverting to the
2573 previous definition at a later time. This is done with the builtins
2574 @code{pushdef} and @code{popdef}:
2576 @deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion})
2577 @deffnx {Builtin (m4)} popdef (@var{name}@dots{})
2578 Analogous to @code{define} and @code{undefine}.
2580 These macros work in a stack-like fashion. A macro is temporarily
2581 redefined with @code{pushdef}, which replaces an existing definition of
2582 @var{name}, while saving the previous definition, before the new one is
2583 installed. If there is no previous definition, @code{pushdef} behaves
2584 exactly like @code{define}.
2586 If a macro has several definitions (of which only one is accessible),
2587 the topmost definition can be removed with @code{popdef}. If there is
2588 no previous definition, @code{popdef} behaves like @code{undefine}, and
2589 if there is no definition at all, the @samp{d} debug level controls
2590 whether a warning is issued (@pxref{Debugmode}).
2592 The expansion of both @code{pushdef} and @code{popdef} is void.
2593 The macros @code{pushdef} and @code{popdef} are recognized only with
2598 define(`foo', `Expansion one.')
2601 @result{}Expansion one.
2602 pushdef(`foo', `Expansion two.')
2605 @result{}Expansion two.
2606 pushdef(`foo', `Expansion three.')
2608 pushdef(`foo', `Expansion four.')
2613 @result{}Expansion three.
2614 popdef(`foo', `foo')
2617 @result{}Expansion one.
2624 If a macro with several definitions is redefined with @code{define}, the
2625 topmost definition is @emph{replaced} with the new definition. If it is
2626 removed with @code{undefine}, @emph{all} the definitions are removed,
2627 and not only the topmost one. However, POSIX allows other
2628 implementations that treat @code{define} as replacing an entire stack
2629 of definitions with a single new definition, so to be portable to other
2630 implementations, it may be worth explicitly using @code{popdef} and
2631 @code{pushdef} rather than relying on the GNU behavior of
2635 define(`foo', `Expansion one.')
2638 @result{}Expansion one.
2639 pushdef(`foo', `Expansion two.')
2642 @result{}Expansion two.
2643 define(`foo', `Second expansion two.')
2646 @result{}Second expansion two.
2653 @cindex local variables
2654 @cindex variables, local
2655 Local variables within macros are made with @code{pushdef} and
2656 @code{popdef}. At the start of the macro a new definition is pushed;
2657 within the macro it is manipulated; and at the end it is popped,
2658 revealing the former definition.
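
The pattern can be sketched as follows.  The macros @code{greet} and
@code{who} are hypothetical names used only for this illustration;
@code{who} serves as the local variable, and its outer definition is
untouched once @code{greet} has finished:

@example
define(`greet', `pushdef(`who', `$1')Hello, who!popdef(`who')')
@result{}
define(`who', `nobody')
@result{}
greet(`world')
@result{}Hello, world!
who
@result{}nobody
@end example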
2660 It is possible to temporarily redefine a builtin with @code{pushdef}
2663 As of M4 1.6, @code{popdef} can warn if @var{name} is not a macro, by
2664 using @code{debugmode} (@pxref{Debugmode}) or the command line option
2665 @option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking
2674 @error{}m4:stdin:3: warning: popdef: undefined macro 'a'
2683 @section Renaming macros with regular expressions
2685 @cindex regular expressions
2686 @cindex macros, how to rename
2687 @cindex renaming macros
2688 @cindex GNU extensions
2689 Sometimes it is desirable to rename multiple symbols without having to
2690 use a long sequence of calls to @code{define}. The @code{renamesyms}
2691 builtin allows this:
2693 @deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @
2695 Global renaming of macros is done by @code{renamesyms}, which selects
2696 all macros with names that match @var{regexp}, and renames each match
2697 according to @var{replacement}. It is unspecified what happens if the
2698 rename causes multiple macros to map to the same name.
2699 @comment FIXME - right now, collisions cause a core dump on some platforms:
2700 @comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa')
2702 If @var{resyntax} is given, the particular flavor of regular
2703 expression understood with respect to @var{regexp} can be changed from
2704 the current default. @xref{Changeresyntax}, for details of the values
2705 that can be given for this argument.
2707 A macro that does not have a name that matches @var{regexp} is left
2708 with its original name. If only part of the name matches, any part of
2709 the name that is not covered by @var{regexp} is copied to the
2710 replacement name. Whenever a match is found in the name, the search
2711 proceeds from the end of the match, so no character in the original
2712 name can be substituted twice. If @var{regexp} matches a string of
2713 zero length, the start position for the continued search is
2714 incremented to avoid infinite loops.
2716 Where a replacement is to be made, @var{replacement} replaces the
2717 matched text in the original name, with @samp{\@var{n}} substituted by
2718 the text matched by the @var{n}th parenthesized sub-expression of
2719 @var{regexp}, and @samp{\&} being the text matched by the entire
2722 The expansion of @code{renamesyms} is void.
2723 The macro @code{renamesyms} is recognized only with parameters.
2724 This macro was added in M4 2.0.
2727 The following example starts with a rename similar to the
2728 @option{--prefix-builtins} option (or @option{-P}), prefixing every
2729 macro with @code{m4_}. However, note that @option{-P} only renames M4
2730 builtin macros, even if other macros were defined previously, while
2731 @code{renamesyms} will rename any macros that match when it runs,
2732 including text macros. The rest of the example demonstrates the
2733 behavior of unanchored regular expressions in symbol renaming.
2735 @comment options: -Dfoo=bar -P
2737 $ @kbd{m4 -Dfoo=bar -P}
2748 define(`foo', `bar')
2750 renamesyms(`^.*$', `m4_\&')
2758 m4_renamesyms(`f', `g')
2760 m4_igdeg(`m4_goo', `m4_goo')
2764 If @var{resyntax} is given, @var{regexp} must be given according to
2765 the syntax chosen, though the default regular expression syntax
2766 remains unchanged for other invocations. Here is a more realistic
2767 example that performs a similar renaming on macros, except that it
2768 ignores macros with names that begin with @samp{_}, and avoids creating
2769 macros with names that begin with @samp{m4_m4}.
2772 renamesyms(`^[^_]\w*$', `m4_\&')
2774 m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED')
2783 When a symbol has multiple definitions, thanks to @code{pushdef}, the
2784 entire stack is renamed.
2787 pushdef(`foo', `1')pushdef(`foo', `2')
2789 renamesyms(`^foo$', `bar')
2800 @section Indirect call of macros
2802 @cindex indirect call of macros
2803 @cindex call of macros, indirect
2804 @cindex macros, indirect call of
2805 @cindex GNU extensions
2806 Any macro can be called indirectly with @code{indir}:
2808 @deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}})
2809 Results in a call to the macro @var{name}, which is passed the rest of
2810 the arguments @var{args}. If @var{name} is not defined, the expansion
2811 is void, and the @samp{d} debug level controls whether a warning is
2812 issued (@pxref{Debugmode}).
2814 The macro @code{indir} is recognized only with parameters.
2817 This can be used to call macros with computed or ``invalid''
2818 names (@code{define} allows such names to be defined):
2821 define(`$$internal$macro', `Internal macro (name `$0')')
2824 @result{}$$internal$macro
2825 indir(`$$internal$macro')
2826 @result{}Internal macro (name $$internal$macro)
2829 The point here is that larger macro packages can define private macros
2830 that will not be called by accident. They can @emph{only} be
2831 called through the builtin @code{indir}.
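
As a minimal sketch of this idea, a package might keep a helper under a
name that cannot be written as an ordinary macro call, and reach it
through a public wrapper. The names @code{pkg:helper} and
@code{pkg_wrap} are hypothetical, chosen only for illustration.

@example
define(`pkg:helper', `<$1>')dnl
define(`pkg_wrap', `indir(`pkg:helper', `$1')')dnl
pkg_wrap(`text')
@result{}<text>
pkg:helper(`text')
@result{}pkg:helper(text)
@end example

Written directly, @samp{pkg:helper} is scanned as the separate words
@samp{pkg} and @samp{helper}, neither of which is a macro here, so the
text passes through unexpanded apart from the removal of quotes.
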
2833 One other point to observe is that argument collection occurs before
2834 @code{indir} invokes @var{name}, so if argument collection changes the
2835 value of @var{name}, that will be reflected in the final expansion.
2836 This differs from the behavior when invoking macros directly,
2837 where the definition that was in effect before argument collection is used.
2846 indir(`f', define(`f', `3'))
2848 indir(`f', undefine(`f'))
2849 @error{}m4:stdin:4: warning: indir: undefined macro 'f'
2857 When handed the result of @code{defn} (@pxref{Defn}) as one of its
2858 arguments, @code{indir} defers to the invoked @var{name} for whether a
2859 token representing a builtin is recognized or flattened to the empty string.
2864 indir(defn(`defn'), `divnum')
2865 @error{}m4:stdin:1: warning: indir: invalid macro name ignored
2867 indir(`define', defn(`defn'), `divnum')
2868 @error{}m4:stdin:2: warning: define: invalid macro name ignored
2870 indir(`define', `foo', defn(`divnum'))
2874 indir(`divert', defn(`foo'))
2875 @error{}m4:stdin:5: warning: divert: empty string treated as 0
2879 Warning messages issued on behalf of an indirect macro use an
2880 unambiguous representation of the macro name, using escape sequences
2881 similar to C strings, and with colons also quoted.
2885 odd', defn(`divnum'))
2889 @error{}m4:stdin:3: warning: %%\:\\\nodd: extra arguments ignored: 1 > 0
2894 @section Indirect call of builtins
2896 @cindex indirect call of builtins
2897 @cindex call of builtins, indirect
2898 @cindex builtins, indirect call of
2899 @cindex GNU extensions
2900 Builtin macros can be called indirectly with @code{builtin}:
2902 @deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}})
2903 @deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1})
2904 Results in a call to the builtin @var{name}, which is passed the
2905 rest of the arguments @var{args}. If @var{name} does not name a
2906 builtin, the expansion is void, and the @samp{d} debug level controls
2907 whether a warning is issued (@pxref{Debugmode}).
2909 As a special case, if @var{name} is exactly the special token
2910 representing the @code{builtin} macro, as obtained by @code{defn}
2911 (@pxref{Defn}), then @var{args} must consist of a single @var{name1},
2912 and the expansion is the special token representing the builtin macro
2913 named by @var{name1}.
2915 The macro @code{builtin} is recognized only with parameters.
2918 This can be used even if @var{name} has been given another definition
2919 that has covered the original, or been undefined so that no macro
2920 maps to the builtin.
2923 pushdef(`define', `hidden')
2925 undefine(`undefine')
2927 define(`foo', `bar')
2931 builtin(`define', `foo', defn(`divnum'))
2935 builtin(`define', `foo', `BAR')
2940 @result{}undefine(foo)
2943 builtin(`undefine', `foo')
2949 The @var{name} argument only matches the original name of the builtin,
2950 even when the @option{--prefix-builtins} option (or @option{-P},
2951 @pxref{Operation modes, , Invoking m4}) is in effect. This is different
2952 from @code{indir}, which only tracks current macro names.
2954 @comment options: -P
2957 m4_builtin(`divnum')
2959 m4_builtin(`m4_divnum')
2960 @error{}m4:stdin:2: warning: m4_builtin: undefined builtin 'm4_divnum'
2963 @error{}m4:stdin:3: warning: m4_indir: undefined macro 'divnum'
2965 m4_indir(`m4_divnum')
2969 m4_builtin(`m4_divnum')
2973 Note that @code{indir} and @code{builtin} can be used to invoke builtins
2974 without arguments, even when they normally require parameters to be
2975 recognized; doing so provokes a warning, and the expansion behaves
2976 as though empty strings had been passed as the required arguments.
2982 @error{}m4:stdin:2: warning: builtin: undefined builtin ''
2985 @error{}m4:stdin:3: warning: builtin: too few arguments: 0 < 1
2988 @error{}m4:stdin:4: warning: builtin: undefined builtin ''
2990 builtin(`builtin', ``'
2992 @error{}m4:stdin:5: warning: builtin: undefined builtin '`\'\n'
2995 @error{}m4:stdin:7: warning: index: too few arguments: 0 < 2
2999 Normally, once a builtin macro is undefined, the only way to retrieve
3000 its functionality is by defining a new macro that expands to
3001 @code{builtin} under the hood. But this extra layer of expansion is
3002 slightly inefficient, and it is not robust to
3003 changes in the current quoting scheme due to @code{changequote}
3004 (@pxref{Changequote}). On the other hand, defining a macro to the
3005 special token produced by @code{defn} (@pxref{Defn}) is very efficient,
3006 and avoids the need for quoting within the macro definition; but
3007 @code{defn} only works if the desired macro is already defined by some
3008 other name. So @code{builtin} provides a special case where it is
3009 possible to retrieve the same special token representing a builtin as
3010 what @code{defn} would provide, were the desired macro still defined.
3011 This feature is activated by passing @code{defn(`builtin')} as the first
3012 argument to @code{builtin}. Normally, passing a special token representing a
3013 macro as @var{name} results in a warning and an empty expansion, but in
3014 this case, if the second argument @var{name1} names a valid builtin,
3015 there is no warning and the expansion is the appropriate special
3016 token. In fact, with just the @code{builtin} macro accessible, it is
3017 possible to reconstitute the entire startup state of @code{m4}.
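
The following sketch assumes the special case behaves exactly as
described above. It restores @code{divnum} after it has been undefined,
without going through a text wrapper. The choice of @code{divnum} is
only illustrative, and no output is shown because it has not been
checked against a running @code{m4}.

@example
undefine(`divnum')dnl
define(`divnum', builtin(defn(`builtin'), `divnum'))dnl
divnum
@end example

If the special token is recovered as described, the final line should
once again expand to the current diversion number.
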
3019 In the example below, compare the number of macro invocations performed
3020 by @code{defn1} and @code{defn2}, and the differences once quoting is changed.
3027 define(`foo', `bar')
3029 define(`defn1', `builtin(`defn', $@@)')
3031 define(`defn2', builtin(builtin(`defn', `builtin'), `defn'))
3033 dumpdef(`defn1', `defn2')
3034 @error{}defn1:@tabchar{}`builtin(`defn', $@@)'
3035 @error{}defn2:@tabchar{}<defn>
3040 @error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')'
3041 @error{}m4trace: -1- builtin(`defn', `foo') -> ``bar''
3044 @error{}m4trace: -1- defn2(`foo') -> ``bar''
3047 @error{}m4trace: -1- traceoff -> `'
3049 changequote(`[', `]')
3052 @error{}m4:stdin:11: warning: builtin: undefined builtin '`defn\''
3056 define([defn1], [builtin([defn], $@@)])
3063 @error{}m4:stdin:16: warning: builtin: undefined builtin '[defn]'
3068 @section Getting the defined macro names
3070 @cindex macro names, listing
3071 @cindex listing macro names
3072 @cindex currently defined macros
3073 @cindex GNU extensions
3074 The names of the currently defined macros can be accessed by @code{m4symbols}:
3077 @deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}})
3078 Without arguments, @code{m4symbols} expands to a sorted list of quoted
3079 strings, separated by commas. This contrasts with @code{dumpdef}
3080 (@pxref{Dumpdef}), whose output cannot be accessed by @code{m4} programs.
3083 When given arguments, @code{m4symbols} returns the sorted subset of the
3084 @var{names} currently defined, and silently ignores the rest.
3085 This macro was added in M4 2.0.
3089 m4symbols(`ifndef', `ifdef', `define', `undef')
3090 @result{}define,ifdef
3094 @chapter Conditionals, loops, and recursion
3096 Macros, expanding to plain text, perhaps with arguments, are not quite
3097 enough. We would like to have macros expand to different things, based
3098 on decisions taken at run-time. For that, we need some kind of conditionals.
3099 Also, we would like to have some kind of loop construct, so we could do
3100 something a number of times, or while some condition is true.
3103 * Ifdef:: Testing if a macro is defined
3104 * Ifelse:: If-else construct, or multibranch
3105 * Shift:: Recursion in @code{m4}
3106 * Forloop:: Iteration by counting
3107 * Foreach:: Iteration by list contents
3108 * Stacks:: Working with definition stacks
3109 * Composition:: Building macros with macros
3113 @section Testing if a macro is defined
3115 @cindex conditionals
3116 There are two different builtin conditionals in @code{m4}. The first is @code{ifdef}:
3119 @deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2})
3120 If @var{name} is defined as a macro, @code{ifdef} expands to
3121 @var{string-1}, otherwise to @var{string-2}. If @var{string-2} is
3122 omitted, it is taken to be the empty string (according to the normal rules).
3125 The macro @code{ifdef} is recognized only with parameters.
3129 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3130 @result{}foo is not defined
3133 ifdef(`foo', ``foo' is defined', ``foo' is not defined')
3134 @result{}foo is defined
3135 ifdef(`no_such_macro', `yes', `no', `extra argument')
3136 @error{}m4:stdin:4: warning: ifdef: extra arguments ignored: 4 > 3
3140 As of M4 1.6, @code{ifdef} transparently handles builtin tokens
3141 generated by @code{defn} (@pxref{Defn}) that occur in either
3142 @var{string}, although a warning is issued for invalid macro names.
3147 ifdef(defn(`defn'), `yes', `no')
3148 @error{}m4:stdin:2: warning: ifdef: invalid macro name ignored
3150 define(`foo', ifdef(`divnum', defn(`divnum'), `undefined'))
3157 @section If-else construct, or multibranch
3159 @cindex comparing strings
3160 @cindex discarding input
3161 @cindex input, discarding
3162 The other conditional, @code{ifelse}, is much more powerful. It can be
3163 used as a way to introduce a long comment, as an if-else construct, or
3164 as a multibranch, depending on the number of arguments supplied:
3166 @deffn {Builtin (m4)} ifelse (@var{comment})
3167 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @
3169 @deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @
3170 @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal})
3171 Used with only one argument, @code{ifelse} simply discards it and produces no output.
3174 If called with three or four arguments, @code{ifelse} expands into
3175 @var{equal}, if @var{string-1} and @var{string-2} are equal (character
3176 for character), otherwise it expands to @var{not-equal}. A final fifth
3177 argument is ignored, after triggering a warning.
3179 If called with six or more arguments, and @var{string-1} and
3180 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1},
3181 otherwise the first three arguments are discarded and the processing starts again.
3184 The macro @code{ifelse} is recognized only with parameters.
3187 Using only one argument is a common @code{m4} idiom for introducing a
3188 block comment, as an alternative to repeatedly using @code{dnl}. This
3189 special usage is recognized by GNU @code{m4}, so that in this
3190 case, the warning about missing arguments is never triggered.
3193 ifelse(`some comments')
3195 ifelse(`foo', `bar')
3196 @error{}m4:stdin:2: warning: ifelse: too few arguments: 2 < 3
3200 Using three or four arguments provides decision points.
3203 ifelse(`foo', `bar', `true')
3205 ifelse(`foo', `foo', `true')
3207 define(`foo', `bar')
3209 ifelse(foo, `bar', `true', `false')
3211 ifelse(foo, `foo', `true', `false')
3215 @cindex macro, blind
3217 Notice how the first argument was used unquoted; it is common to compare
3218 the expansion of a macro with a string. With this macro, you can now
3219 reproduce the behavior of blind builtins, where the macro is recognized
3220 only with arguments.
3223 define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')')
3228 @result{}arguments:1
3230 @result{}arguments:3
3233 For an example of a way to make defining blind macros easier, see
3236 @cindex multibranches
3237 @cindex switch statement
3238 @cindex case statement
3239 Given more than four arguments, @code{ifelse} works like a
3240 @code{case} or @code{switch} statement in traditional programming
3241 languages. If @var{string-1} and
3242 @var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise
3243 the procedure is repeated with the first three arguments discarded. This
3244 calls for an example:
3247 ifelse(`foo', `bar', `third', `gnu', `gnats')
3248 @error{}m4:stdin:1: warning: ifelse: extra arguments ignored: 5 > 4
3250 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth')
3252 ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh')
3254 ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8')
3255 @error{}m4:stdin:4: warning: ifelse: extra arguments ignored: 8 > 7
3259 As of M4 1.6, @code{ifelse} transparently handles builtin tokens
3260 generated by @code{defn} (@pxref{Defn}). Because of this, it is always
3261 safe to compare two macro definitions, without worrying whether the
3262 macro might be a builtin.
3265 ifelse(defn(`defn'), `', `yes', `no')
3267 ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
3269 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
3271 define(`foo', ifelse(`', `', defn(`divnum')))
3277 Naturally, the normal case will be slightly more advanced than these
3278 examples. A common use of @code{ifelse} is in macros implementing loops.
3282 @section Recursion in @code{m4}
3284 @cindex recursive macros
3285 @cindex macros, recursive
3286 There is no direct support for loops in @code{m4}, but macros can be
3287 recursive. There is no limit on the number of recursion levels, other
3288 than those enforced by your hardware and operating system.
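
As a minimal sketch of recursion standing in for a loop, the
hypothetical macro @code{countdown} below outputs its argument and then
calls itself with the argument decremented, until it reaches zero.

@example
define(`countdown', `$1`'ifelse(`$1', `0', `', ` $0(decr(`$1'))')')dnl
countdown(`3')
@result{}3 2 1 0
@end example

Each level of recursion is started from the expansion of the previous
one, so a very large starting value is bounded only by the limits
mentioned above.
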
3291 Loops can be programmed using recursion and the conditionals described previously.
3294 There is a builtin macro, @code{shift}, which can, among other things,
3295 be used for iterating through the actual arguments to a macro:
3297 @deffn {Builtin (m4)} shift (@var{arg1}, @dots{})
3298 Takes any number of arguments, and expands to all its arguments except
3299 @var{arg1}, separated by commas, with each argument quoted.
3301 The macro @code{shift} is recognized only with parameters.
3309 shift(`foo', `bar', `baz')
3313 An example of the use of @code{shift} is this macro:
3315 @cindex reversing arguments
3316 @cindex arguments, reversing
3317 @deffn Composite reverse (@dots{})
3318 Takes any number of arguments, and reverses their order.
3321 It is implemented as:
3324 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
3325 `reverse(shift($@@)), `$1'')')
3331 reverse(`foo', `bar', `gnats', `and gnus')
3332 @result{}and gnus, gnats, bar, foo
3335 While not a very interesting macro, it does show how simple loops can be
3336 made with @code{shift}, @code{ifelse} and recursion. It also shows
3337 that @code{shift} is usually used with @samp{$@@}. Another example of
3338 this is an implementation of a short-circuiting conditional operator.
3340 @cindex short-circuiting conditional
3341 @cindex conditional, short-circuiting
3342 @deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @
3343 @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal})
3344 Similar to @code{ifelse}, where an equal comparison between the first
3345 two strings results in the third, otherwise the first three arguments
3346 are discarded and the process repeats. The difference is that each
3347 @var{test-<n>} is expanded only when it is encountered. This means that
3348 every third argument to @code{cond} is normally given one more level of
3349 quoting than the corresponding argument to @code{ifelse}.
3352 Here is the implementation of @code{cond}, along with a demonstration of
3353 how it can short-circuit the side effects in @code{side}. Notice how
3354 all the unquoted side effects happen regardless of how many comparisons
3355 are made with @code{ifelse}, compared with only the relevant effects with @code{cond}.
3360 `ifelse(`$#', `1', `$1',
3361 `ifelse($1, `$2', `$3',
3362 `$0(shift(shift(shift($@@))))')')')dnl
3363 define(`side', `define(`counter', incr(counter))$1')dnl
3365 `define(`counter', `0')dnl
3366 ifelse(side(`$1'), `yes', `one comparison: ',
3367 side(`$1'), `no', `two comparisons: ',
3368 side(`$1'), `maybe', `three comparisons: ',
3369 `side(`default answer: ')')counter')dnl
3371 `define(`counter', `0')dnl
3372 cond(`side(`$1')', `yes', `one comparison: ',
3373 `side(`$1')', `no', `two comparisons: ',
3374 `side(`$1')', `maybe', `three comparisons: ',
3375 `side(`default answer: ')')counter')dnl
3377 @result{}one comparison: 3
3379 @result{}two comparisons: 3
3381 @result{}three comparisons: 3
3382 example1(`feeling rather indecisive today')
3383 @result{}default answer: 4
3385 @result{}one comparison: 1
3387 @result{}two comparisons: 2
3389 @result{}three comparisons: 3
3390 example2(`feeling rather indecisive today')
3391 @result{}default answer: 4
3394 @cindex joining arguments
3395 @cindex arguments, joining
3396 @cindex concatenating arguments
3397 Another common task that requires iteration is joining a list of
3398 arguments into a single string.
3400 @deffn Composite join (@ovar{separator}, @ovar{args@dots{}})
3401 @deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}})
3402 Generate a single-quoted string, consisting of each @var{arg} separated
3403 by @var{separator}. While @code{joinall} always outputs a
3404 @var{separator} between arguments, @code{join} avoids the
3405 @var{separator} for an empty @var{arg}.
3408 Here are some examples of their usage, based on the implementation
3409 @file{m4-@value{VERSION}/@/doc/examples/@/join.m4} distributed in this package.
3414 $ @kbd{m4 -I doc/examples}
3417 join,join(`-'),join(`-', `'),join(`-', `', `')
3419 joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `')
3423 join(`-', `1', `2', `3')
3425 join(`', `1', `2', `3')
3427 join(`-', `', `1', `', `', `2', `')
3429 joinall(`-', `', `1', `', `', `2', `')
3431 join(`,', `1', `2', `3')
3433 define(`nargs', `$#')dnl
3434 nargs(join(`,', `1', `2', `3'))
3438 Examining the implementation reveals several useful
3439 @code{m4} programming idioms.
3443 $ @kbd{m4 -I doc/examples}
3444 undivert(`join.m4')dnl
3445 @result{}divert(`-1')
3446 @result{}# join(sep, args) - join each non-empty ARG into a single
3447 @result{}# string, with each element separated by SEP
3448 @result{}define(`join',
3449 @result{}`ifelse(`$#', `2', ``$2'',
3450 @result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')')
3451 @result{}define(`_join',
3452 @result{}`ifelse(`$#$2', `2', `',
3453 @result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')')
3454 @result{}# joinall(sep, args) - join each ARG, including empty ones,
3455 @result{}# into a single string, with each element separated by SEP
3456 @result{}define(`joinall', ``$2'_$0(`$1', shift($@@))')
3457 @result{}define(`_joinall',
3458 @result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')')
3459 @result{}divert`'dnl
3462 First, notice that this implementation creates helper macros
3463 @code{_join} and @code{_joinall}. This division of labor makes it
3464 easier to output the correct number of @var{separator} instances:
3465 @code{join} and @code{joinall} are responsible for the first argument,
3466 without a separator, while @code{_join} and @code{_joinall} are
3467 responsible for all remaining arguments, always outputting a separator
3468 when outputting an argument.
3470 Next, observe how @code{join} decides to iterate to itself, because the
3471 first @var{arg} was empty, or to output the argument and swap over to
3472 @code{_join}. If the argument is non-empty, then the nested
3473 @code{ifelse} results in an unquoted @samp{_}, which is concatenated
3474 with the @samp{$0} to form the next macro name to invoke. The
3475 @code{joinall} implementation is simpler since it does not have to
3476 suppress empty @var{arg}; it always executes once then defers to @code{_joinall}.
3479 Another important idiom is the idea that @var{separator} is reused for
3480 each iteration. Each iteration has one less argument, but rather than
3481 discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro
3482 discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}.
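
The same idiom can be sketched outside of @code{join}. The hypothetical
macro @code{prefix_each} below keeps its first argument fixed across
iterations and consumes one of the remaining arguments each time, using
@code{$0(`$1', shift(shift($@@)))} to recurse.

@example
define(`prefix_each', `ifelse(`$#', `1', `', `$#', `2', ``$1$2'',
  ``$1$2' $0(`$1', shift(shift($@@)))')')dnl
prefix_each(`-I', `a', `b', `c')
@result{}-Ia -Ib -Ic
@end example

Recursion stops when exactly two arguments remain, so that no trailing
prefix is produced for an argument that does not exist.
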
3484 Next, notice that it is possible to compare more than one condition in a
3485 single @code{ifelse} test. The test of @samp{$#$2} against @samp{2}
3486 allows @code{_join} to iterate for two separate reasons---either there
3487 are still more than two arguments, or there are exactly two arguments
3488 but the last argument is not empty.
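
The combined test can be tried in isolation. In this sketch, the
hypothetical macro @code{probe} reports whether the concatenation of
@samp{$#} and @samp{$2} is exactly @samp{2}, which is the condition
@code{_join} uses to stop.

@example
define(`probe', `ifelse(`$#$2', `2', `stop', `continue')')dnl
probe(`-', `'), probe(`-', `x'), probe(`-', `', `y')
@result{}stop, continue, continue
@end example

Only the first call has both exactly two arguments and an empty second
argument; the other two differ from @samp{2} either in the count or in
the appended text.
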
3490 Finally, notice that these macros require exactly two arguments to
3491 terminate recursion, but that they still correctly result in empty
3492 output when given no @var{args} (i.e., zero or one macro argument). On
3493 the first pass when there are too few arguments, the @code{shift}
3494 results in no output, but leaves an empty string to serve as the
3495 required second argument for the second pass. Put another way,
3496 @samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the
3497 former guarantees at least two arguments.
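
The difference in argument counts can be observed directly. This sketch
uses a counting macro @code{nargs} along with two hypothetical wrappers,
@code{pad} and @code{nopad}, that forward their arguments in the two
styles being compared.

@example
define(`nargs', `$#')dnl
define(`pad', `nargs(`$1', shift($@@))')dnl
define(`nopad', `nargs($@@)')dnl
pad, pad(`a'), pad(`a', `b', `c')
@result{}2, 2, 3
nopad, nopad(`a'), nopad(`a', `b', `c')
@result{}1, 1, 3
@end example

With zero or one argument, @code{pad} still forwards two arguments,
because the empty expansion of @code{shift} occupies the second
position.
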
3499 @cindex quote manipulation
3500 @cindex manipulating quotes
3501 Sometimes, a recursive algorithm requires adding quotes to each element,
3502 or treating multiple arguments as a single element:
3504 @deffn Composite quote (@dots{})
3505 @deffnx Composite dquote (@dots{})
3506 @deffnx Composite dquote_elt (@dots{})
3507 Takes any number of arguments, and adds quoting. With @code{quote},
3508 only one level of quoting is added, effectively removing whitespace
3509 after commas and turning multiple arguments into a single string. With
3510 @code{dquote}, two levels of quoting are added, one around each element,
3511 and one around the list. And with @code{dquote_elt}, two levels of
3512 quoting are added around each element.
3515 An actual implementation of these three macros is distributed as
3516 @file{m4-@value{VERSION}/@/doc/examples/@/quote.m4} in this package.
3517 First, let's examine their usage:
3521 $ @kbd{m4 -I doc/examples}
3524 -quote-dquote-dquote_elt-
3526 -quote()-dquote()-dquote_elt()-
3528 -quote(`1')-dquote(`1')-dquote_elt(`1')-
3529 @result{}-1-`1'-`1'-
3530 -quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')-
3531 @result{}-1,2-`1',`2'-`1',`2'-
3532 define(`n', `$#')dnl
3533 -n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))-
3535 dquote(dquote_elt(`1', `2'))
3536 @result{}``1'',``2''
3537 dquote_elt(dquote(`1', `2'))
3541 The last two lines show that when given two arguments, @code{dquote}
3542 results in one string, while @code{dquote_elt} results in two. Now,
3543 examine the implementation. Note that @code{quote} and
3544 @code{dquote_elt} make decisions based on their number of arguments, so
3545 that when called without arguments, they result in nothing instead of a
3546 quoted empty string; this is so that it is possible to distinguish
3547 between no arguments and an empty first argument. @code{dquote}, on the
3548 other hand, results in a string no matter what, since it is still
3549 possible to tell whether it was invoked without arguments based on the
3554 $ @kbd{m4 -I doc/examples}
3555 undivert(`quote.m4')dnl
3556 @result{}divert(`-1')
3557 @result{}# quote(args) - convert args to single-quoted string
3558 @result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')')
3559 @result{}# dquote(args) - convert args to quoted list of quoted strings
3560 @result{}define(`dquote', ``$@@'')
3561 @result{}# dquote_elt(args) - convert args to list of double-quoted strings
3562 @result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''',
3563 @result{} ```$1'',$0(shift($@@))')')
3564 @result{}divert`'dnl
3567 It is worth pointing out that @samp{quote(@var{args})} is more efficient
3568 than @samp{joinall(`,', @var{args})} for producing the same output.
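
As a quick check of that equivalence, the following session assumes
that both example files are found on the include path, as in the
sessions above.

@example
$ @kbd{m4 -I doc/examples}
include(`quote.m4')include(`join.m4')dnl
quote(`a', `b', `c')
@result{}a,b,c
joinall(`,', `a', `b', `c')
@result{}a,b,c
@end example

The output is the same, but @code{quote} produces it in a single
expansion, whereas @code{joinall} recurses once per argument.
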
3570 @cindex nine arguments, more than
3571 @cindex more than nine arguments
3572 @cindex arguments, more than nine
3573 One more useful macro based on @code{shift} allows portably selecting
3574 an arbitrary argument (usually greater than the ninth argument), without
3575 relying on the GNU extension of multi-digit arguments
3576 (@pxref{Arguments}).
3578 @deffn Composite argn (@var{n}, @dots{})
3579 Expands to argument @var{n} out of the remaining arguments. @var{n}
3580 must be a positive number. Usually invoked as
3581 @samp{argn(`@var{n}',$@@)}.
3584 It is implemented as:
3587 define(`argn', `ifelse(`$1', 1, ``$2'',
3588 `argn(decr(`$1'), shift(shift($@@)))')')
3592 define(`foo', `argn(`11', $@@)')
3594 foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
3599 @section Iteration by counting
3602 @cindex loops, counting
3603 @cindex counting loops
3604 Here is an example of a loop macro that implements a simple for loop.
3606 @deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text})
3607 Takes the name in @var{iterator}, which must be a valid macro name, and
3608 successively assigns it each integer value from @var{start} to @var{end},
3609 inclusive. For each assignment to @var{iterator}, appends @var{text} to
3610 the expansion of the @code{forloop}. @var{text} may refer to
3611 @var{iterator}. Any definition of @var{iterator} prior to this
3612 invocation is restored.
3615 It can, for example, be used for simple counting:
3619 $ @kbd{m4 -I doc/examples}
3620 include(`forloop.m4')
3622 forloop(`i', `1', `8', `i ')
3623 @result{}1 2 3 4 5 6 7 8@w{ }
3626 For-loops can be nested, like:
3630 $ @kbd{m4 -I doc/examples}
3631 include(`forloop.m4')
3633 forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)')
3635 @result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8)
3636 @result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8)
3637 @result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8)
3638 @result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8)
3642 The implementation of the @code{forloop} macro is fairly
3643 straightforward. The @code{forloop} macro itself is simply a wrapper,
3644 which saves the previous definition of the first argument, calls the
3645 internal macro @code{@w{_forloop}}, and re-establishes the saved
3646 definition of the first argument.
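
The effect of the wrapper can be seen in a short session, sketched here
under the assumption that @file{forloop.m4} is on the include path. The
iterator @code{i} is given a definition beforehand, and that definition
is visible again once the loop has finished.

@example
$ @kbd{m4 -I doc/examples}
include(`forloop.m4')dnl
define(`i', `outer')dnl
forloop(`i', `1', `3', `i ')i
@result{}1 2 3 outer
@end example
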
3648 The macro @code{@w{_forloop}} expands the fourth argument once, and
3649 tests to see if the iterator has reached the final value. If it has
3650 not finished, it increments the iterator (using the predefined macro
3651 @code{incr}, @pxref{Incr}), and recurses.
3653 Here is an actual implementation of @code{forloop}, distributed as
3654 @file{m4-@value{VERSION}/@/doc/examples/@/forloop.m4} in this package:
3658 $ @kbd{m4 -I doc/examples}
3659 undivert(`forloop.m4')dnl
3660 @result{}divert(`-1')
3661 @result{}# forloop(var, from, to, stmt) - simple version
3662 @result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')')
3663 @result{}define(`_forloop',
3664 @result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')')
3665 @result{}divert`'dnl
3668 Notice the careful use of quotes. Certain macro arguments are left
3669 unquoted, each for its own reason. Try to find out @emph{why} these
3670 arguments are left unquoted, and see what happens if they are quoted.
3671 (As presented, these two macros are useful but not very robust for
3672 general use. They lack even basic error handling for cases like
3673 @var{start} greater than @var{end}, @var{end} not numeric, or
3674 @var{iterator} not being a macro name. See if you can improve these
3675 macros; or @pxref{Improved forloop, , Answers}).
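
One possible guard, given here only as a sketch and not as the improved
version referred to above, wraps @code{forloop} in a hypothetical macro
@code{forloop_checked} that expands to nothing when @var{start} is
greater than @var{end}. It assumes both bounds are valid arguments to
@code{eval}, and it does nothing about the other cases just listed.

@example
$ @kbd{m4 -I doc/examples}
include(`forloop.m4')dnl
define(`forloop_checked',
  `ifelse(eval(`($2) <= ($3)'), `1', `forloop($@@)')')dnl
forloop_checked(`i', `1', `3', `<i>')
@result{}<1><2><3>
forloop_checked(`i', `3', `1', `<i>')
@result{}
@end example

Without the guard, the second call would never terminate, since the
iterator is only ever incremented and is compared with @var{end} for
equality.
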
3678 @section Iteration by list contents
3680 @cindex for each loops
3681 @cindex loops, list iteration
3682 @cindex iterating over lists
3683 Here is an example of a loop macro that implements list iteration.
3685 @deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text})
3686 @deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text})
3687 Takes the name in @var{iterator}, which must be a valid macro name, and
3688 successively assigns it each value from @var{paren-list} or
3689 @var{quote-list}. In @code{foreach}, @var{paren-list} is a
3690 comma-separated list of elements contained in parentheses. In
3691 @code{foreachq}, @var{quote-list} is a comma-separated list of elements
3692 contained in a quoted string. For each assignment to @var{iterator},
3693 appends @var{text} to the overall expansion. @var{text} may refer to
3694 @var{iterator}. Any definition of @var{iterator} prior to this
3695 invocation is restored.
3698 As an example, this displays each word in a list inside of a sentence,
3699 using an implementation of @code{foreach} distributed as
3700 @file{m4-@value{VERSION}/@/doc/examples/@/foreach.m4}, and @code{foreachq}
3701 in @file{m4-@value{VERSION}/@/doc/examples/@/foreachq.m4}.
3705 $ @kbd{m4 -I doc/examples}
3706 include(`foreach.m4')
3708 foreach(`x', (foo, bar, foobar), `Word was: x
3710 @result{}Word was: foo
3711 @result{}Word was: bar
3712 @result{}Word was: foobar
3713 include(`foreachq.m4')
3715 foreachq(`x', `foo, bar, foobar', `Word was: x
3717 @result{}Word was: foo
3718 @result{}Word was: bar
3719 @result{}Word was: foobar
3722 It is possible to be more complex; each element of the @var{paren-list}
3723 or @var{quote-list} can itself be a list, to pass as further arguments
3724 to a helper macro. This example generates a shell case statement:
3728 $ @kbd{m4 -I doc/examples}
3729 include(`foreach.m4')
3731 define(`_case', ` $1)
3734 define(`_cat', `$1$2')dnl
3737 foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')',
3738 `_cat(`_case', x)')dnl
3740 @result{} vara=" a";;
3742 @result{} varb=" b";;
3744 @result{} varc=" c";;
3749 The implementation of the @code{foreach} macro is a bit more involved;
3750 it is a wrapper around two helper macros. First, @code{@w{_arg1}} is
3751 needed to grab the first element of a list. Second,
3752 @code{@w{_foreach}} implements the recursion, successively walking
3753 through the original list. Here is a simple implementation of @code{foreach}:
3758 $ @kbd{m4 -I doc/examples}
3759 undivert(`foreach.m4')dnl
3760 @result{}divert(`-1')
3761 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
3762 @result{}# parenthesized list, simple version
3763 @result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')')
3764 @result{}define(`_arg1', `$1')
3765 @result{}define(`_foreach', `ifelse(`$2', `()', `',
3766 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')')
3767 @result{}divert`'dnl
3770 Unfortunately, that implementation is not robust to macro names as list
3771 elements. Each iteration of @code{@w{_foreach}} is stripping another
3772 layer of quotes, leading to erratic results if list elements are not
3773 already fully expanded. The first cut at implementing @code{foreachq}
3774 takes this into account. Also, when using quoted elements in a
3775 @var{paren-list}, the overall list must be quoted. A @var{quote-list}
3776 has the nice property of requiring fewer characters to create a list
3777 containing the same quoted elements. To see the difference between the
3778 two macros, we attempt to pass double-quoted macro names in a list,
3779 expecting the macro name on output after one layer of quotes is removed
3780 during list iteration and the final layer removed during the final
3785 $ @kbd{m4 -I doc/examples}
3786 define(`a', `1')define(`b', `2')define(`c', `3')
3788 include(`foreach.m4')
3790 include(`foreachq.m4')
3792 foreach(`x', `(``a'', ``(b'', ``c)'')', `x
3799 foreachq(`x', ```a'', ``(b'', ``c)''', `x
3806 Obviously, @code{foreachq} did a better job; here is its implementation:
3810 $ @kbd{m4 -I doc/examples}
3811 undivert(`foreachq.m4')dnl
3812 @result{}include(`quote.m4')dnl
3813 @result{}divert(`-1')
3814 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
3815 @result{}# quoted list, simple version
3816 @result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')')
3817 @result{}define(`_arg1', `$1')
3818 @result{}define(`_foreachq', `ifelse(quote($2), `', `',
3819 @result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')')
3820 @result{}divert`'dnl
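As a small check of the listing above, the following sketch passes a
list whose first element contains a comma. It assumes that the
distributed @file{foreachq.m4} and @file{quote.m4} are reachable through
@option{-I}, and that @code{a}, @code{b} and @code{c} are not defined as
macros.

@example
$ @kbd{m4 -I doc/examples}
include(`foreachq.m4')dnl
foreachq(`x', ``a,b', `c'', `[x]')
@result{}[a,b][c]
@end example

The comma inside the first element survives because @code{quote}
collapses the remaining list into a single string before @code{ifelse}
compares it against the empty string.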
3823 Notice that @code{@w{_foreachq}} had to use the helper macro
3824 @code{quote} defined earlier (@pxref{Shift}), to ensure that the
3825 embedded @code{ifelse} call does not go haywire if a list element
3826 contains a comma. Unfortunately, this implementation of @code{foreachq}
3827 has its own severe flaw. Whereas the @code{foreach} implementation was
3828 linear, this macro is quadratic in the number of list elements, and is
3829 much more likely to trip up the limit set by the command line option
3830 @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, ,
3831 Invoking m4}). Additionally, this implementation does not expand
3832 @samp{defn(`@var{iterator}')} very well, when compared with
3837 $ @kbd{m4 -I doc/examples}
3838 include(`foreach.m4')include(`foreachq.m4')
3840 foreach(`name', `(`a', `b')', ` defn(`name')')
3842 foreachq(`name', ``a', `b'', ` defn(`name')')
3843 @result{} _arg1(`a', `b') _arg1(shift(`a', `b'))
3846 It is possible to have robust iteration with linear behavior and sane
3847 @var{iterator} contents for either list style. See if you can learn
3848 from the best elements of both of these implementations to create robust
3849 macros (or @pxref{Improved foreach, , Answers}).
3852 @section Working with definition stacks
3854 @cindex definition stack
3855 @cindex pushdef stack
3856 @cindex stack, macro definition
3857 Thanks to @code{pushdef}, manipulation of a stack is an intrinsic
3858 operation in @code{m4}. Normally, only the topmost definition in a
3859 stack is important, but sometimes, it is desirable to manipulate the
3860 entire definition stack.
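As a reminder of why only the topmost definition normally matters, this
short sketch (the macro name @code{a} is arbitrary) pushes three
definitions and pops them again; each @code{popdef} uncovers the
definition beneath.

@example
define(`a', `1')pushdef(`a', `2')pushdef(`a', `3')dnl
a
@result{}3
popdef(`a')a
@result{}2
popdef(`a')a
@result{}1
@end example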
3862 @deffn Composite stack_foreach (@var{macro}, @var{action})
3863 @deffnx Composite stack_foreach_lifo (@var{macro}, @var{action})
3864 For each of the @code{pushdef} definitions associated with @var{macro},
3865 invoke the macro @var{action} with a single argument of that definition.
3866 @code{stack_foreach} visits the oldest definition first, while
3867 @code{stack_foreach_lifo} visits the current definition first.
3868 @var{action} should not modify or dereference @var{macro}. There are a
3869 few special macros, such as @code{defn}, which cannot be used as the
3870 @var{macro} parameter.
3873 A sample implementation of these macros is distributed in the file
3874 @file{m4-@value{VERSION}/@/doc/examples/@/stack.m4}.
3878 $ @kbd{m4 -I doc/examples}
3881 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
3883 define(`show', ``$1'
3886 stack_foreach(`a', `show')dnl
3890 stack_foreach_lifo(`a', `show')dnl
3896 Now for the implementation. Note the definition of a helper macro,
3897 @code{_stack_reverse}, which destructively moves the contents of one
3898 stack of definitions, in reverse order, into the temporary macro
3899 @samp{tmp-$1}. By calling the helper twice, the original order is
3900 restored back into the macro @samp{$1}; since the operation is
3901 destructive, this explains why @samp{$1} must not be modified or
3902 dereferenced during the traversal. The caller can then inject
3903 additional code to pass the definition currently being visited to
3904 @samp{$2}. The choice of helper names is intentional; since @samp{-} is
3905 not valid as part of a macro name, there is no risk of conflict with a
3906 valid macro name, and the code is guaranteed to use @code{defn} where
3907 necessary. Finally, note that any macro used in the traversal of a
3908 @code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be
3909 handled by @code{stack_foreach}, since the macro would temporarily be
3910 undefined during the algorithm.
3914 $ @kbd{m4 -I doc/examples}
3915 undivert(`stack.m4')dnl
3916 @result{}divert(`-1')
3917 @result{}# stack_foreach(macro, action)
3918 @result{}# Invoke ACTION with a single argument of each definition
3919 @result{}# from the definition stack of MACRO, starting with the oldest.
3920 @result{}define(`stack_foreach',
3921 @result{}`_stack_reverse(`$1', `tmp-$1')'dnl
3922 @result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')')
3923 @result{}# stack_foreach_lifo(macro, action)
3924 @result{}# Invoke ACTION with a single argument of each definition
3925 @result{}# from the definition stack of MACRO, starting with the newest.
3926 @result{}define(`stack_foreach_lifo',
3927 @result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl
3928 @result{}`_stack_reverse(`tmp-$1', `$1')')
3929 @result{}define(`_stack_reverse',
3930 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')')
3931 @result{}divert`'dnl
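Because @var{action} is invoked once per definition, it can also be used
purely for its side effects. The following sketch counts the depth of a
stack; the names @code{s}, @code{depth} and @code{bump} are illustrative
only and are not part of @file{stack.m4}.

@example
$ @kbd{m4 -I doc/examples}
include(`stack.m4')dnl
pushdef(`s', `x')pushdef(`s', `y')pushdef(`s', `z')dnl
define(`depth', `0')define(`bump', `define(`depth', incr(depth))')dnl
stack_foreach(`s', `bump')depth
@result{}3
s
@result{}z
@end example

The final line suggests that the two passes of @code{_stack_reverse}
leave the stack of @code{s} in its original order, with the newest
definition on top.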
3935 @section Building macros with macros
3937 @cindex macro composition
3938 @cindex composing macros
3939 Since m4 is a macro language, it is possible to write macros that
3940 can build other macros. First on the list is a way to automate the
3941 creation of blind macros.
3943 @cindex macro, blind
3945 @deffn Composite define_blind (@var{name}, @ovar{value})
3946 Defines @var{name} as a blind macro, such that @var{name} will expand to
3947 @var{value} only when given explicit arguments. @var{value} should not
3948 be the result of @code{defn} (@pxref{Defn}). This macro is only
3949 recognized with parameters, and results in an empty string.
3952 Defining a macro to define another macro can be a bit tricky. We want
3953 to use a literal @samp{$#} in the argument to the nested @code{define}.
3954 However, if @samp{$} and @samp{#} are adjacent in the definition of
3955 @code{define_blind}, then it would be expanded as the number of
3956 arguments to @code{define_blind} rather than the intended number of
3957 arguments to @var{name}. The solution is to pass the difficult
3958 characters through extra arguments to a helper macro
3959 @code{_define_blind}. When composing macros, it is a common idiom to
3960 need a helper macro to concatenate text that forms parameters in the
3961 composed macro, rather than interpreting the text as a parameter of the
3964 As for the limitation against using @code{defn}, there are two reasons.
3965 If a macro was previously defined with @code{define_blind}, then it can
3966 safely be renamed to a new blind macro using plain @code{define}; using
3967 @code{define_blind} to rename it just adds another layer of
3968 @code{ifelse}, occupying memory and slowing down execution. And if a
3969 macro is a builtin, then it would result in an attempt to define a macro
3970 consisting of both text and a builtin token; this is not supported, and
3971 the builtin token is flattened to an empty string.
3973 With that explanation, here's the definition, and some sample usage.
3974 Notice that @code{define_blind} is itself a blind macro.
3978 define(`define_blind', `ifelse(`$#', `0', ``$0'',
3979 `_$0(`$1', `$2', `$'`#', `$'`0')')')
3981 define(`_define_blind', `define(`$1',
3982 `ifelse(`$3', `0', ``$4'', `$2')')')
3985 @result{}define_blind
3986 define_blind(`foo', `arguments were $*')
3991 @result{}arguments were bar
3992 define(`blah', defn(`foo'))
3997 @result{}arguments were a,b
3999 @result{}ifelse(`$#', `0', ``$0'', `arguments were $*')
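The hazard described earlier, where an adjacent @samp{$#} is consumed by
the outer macro, can be seen in isolation with a smaller pair of macros.
This is only a sketch: @code{bad}, @code{good}, @code{n1} and @code{n2}
are made-up names, and splitting the two characters across adjacent
quoted strings works here only because the text sits at the top level of
the nested @code{define} argument, which is not the case inside
@code{define_blind}.

@example
define(`bad', `define(`$1', `$#')')dnl
define(`good', `define(`$1', `$'`#')')dnl
bad(`n1')good(`n2')dnl
n1(`a', `b', `c')
@result{}1
n2(`a', `b', `c')
@result{}3
@end example

In @code{bad}, the @samp{$#} is replaced by the argument count of
@code{bad} itself, so @code{n1} is defined as the constant @samp{1}. In
@code{good}, the two halves are only joined when the nested
@code{define} collects its argument, so @code{n2} counts its own
arguments.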
4002 @cindex currying arguments
4003 @cindex argument currying
4004 Another interesting composition tactic is argument @dfn{currying}, or
4005 factoring a macro that takes multiple arguments for use in a context
4006 that provides exactly one argument.
4008 @deffn Composite curry (@var{macro}, @dots{})
4009 Expand to a macro call that takes exactly one argument, then appends
4010 that argument to the original arguments and invokes @var{macro} with the
4011 resulting list of arguments.
4014 A demonstration of currying makes the intent of this macro a little more
4015 obvious. The macro @code{stack_foreach} mentioned earlier is an example
4016 of a context that provides exactly one argument to a macro name. But
4017 coupled with currying, we can invoke @code{reverse} with two arguments
4018 for each definition of a macro stack. This example uses the file
4019 @file{m4-@value{VERSION}/@/doc/examples/@/curry.m4} included in the
4024 $ @kbd{m4 -I doc/examples}
4025 include(`curry.m4')include(`stack.m4')
4027 define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'',
4028 `reverse(shift($@@)), `$1'')')
4030 pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3')
4032 stack_foreach(`a', `:curry(`reverse', `4')')
4033 @result{}:1, 4:2, 4:3, 4
4034 curry(`curry', `reverse', `1')(`2')(`3')
4038 Now for the implementation. Notice how @code{curry} leaves off with a
4039 macro name but no open parenthesis, while still in the middle of
4040 collecting arguments for @samp{$1}. The macro @code{_curry} is the
4041 helper macro that takes one argument, then adds it to the list and
4042 finally supplies the closing parenthesis. The use of a comma inside the
4043 @code{shift} call allows currying to also work for a macro that takes
4044 one argument, although it often makes more sense to invoke that macro
4045 directly rather than going through @code{curry}.
4049 $ @kbd{m4 -I doc/examples}
4050 undivert(`curry.m4')dnl
4051 @result{}divert(`-1')
4052 @result{}# curry(macro, args)
4053 @result{}# Expand to a macro call that takes one argument, then invoke
4054 @result{}# macro(args, extra).
4055 @result{}define(`curry', `$1(shift($@@,)_$0')
4056 @result{}define(`_curry', ``$1')')
4057 @result{}divert`'dnl
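To follow the hand-off between @code{curry} and @code{_curry} on a
smaller case, consider this sketch, in which @code{pair} is a
hypothetical two-argument macro. @code{curry} expands to
@samp{pair(shift(`pair',`left',)_curry}; the @samp{(`right')} that
follows in the input is then consumed by @code{_curry}, which supplies
the final argument and the closing parenthesis.

@example
$ @kbd{m4 -I doc/examples}
include(`curry.m4')dnl
define(`pair', `[$1|$2]')dnl
curry(`pair', `left')(`right')
@result{}[left|right]
@end example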
4060 Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin
4061 tokens, which are silently flattened to the empty string when passed
4062 through another text macro. The following example demonstrates a usage
4063 of @code{curry} that works in M4 1.6, but is not portable to earlier
4068 $ @kbd{m4 -I doc/examples}
4071 curry(`define', `mylen')(defn(`len'))
4077 @cindex renaming macros
4078 @cindex copying macros
4079 @cindex macros, copying
4080 Putting the last few concepts together, it is possible to copy or rename
4081 an entire stack of macro definitions.
4083 @deffn Composite copy (@var{source}, @var{dest})
4084 @deffnx Composite rename (@var{source}, @var{dest})
4085 Ensure that @var{dest} is undefined, then define it to the same stack of
4086 definitions currently in @var{source}. @code{copy} leaves @var{source}
4087 unchanged, while @code{rename} undefines @var{source}. There are only a
4088 few macros, such as @code{copy} or @code{defn}, which cannot be copied
4092 The implementation is relatively straightforward, although since it uses
4093 @code{curry}, it is unable to copy builtin macros when used with M4
4094 1.4.x. See if you can design a portable version that works across all
4095 M4 versions (or @pxref{Improved copy, , Answers}).
4099 $ @kbd{m4 -I doc/examples}
4100 include(`curry.m4')include(`stack.m4')
4102 define(`rename', `copy($@@)undefine(`$1')')dnl
4103 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
4105 `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl
4106 pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2')
4121 @chapter How to debug macros and input
4123 @cindex debugging macros
4124 @cindex macros, debugging
4125 Macros written for @code{m4} often do not work as intended on
4126 the first try (as is the case with most programming languages).
4127 Fortunately, there is support for macro debugging in @code{m4}.
4130 * Dumpdef:: Displaying macro definitions
4131 * Trace:: Tracing macro calls
4132 * Debugmode:: Controlling debugging options
4133 * Debuglen:: Limiting debug output
4134 * Debugfile:: Saving debugging output
4138 @section Displaying macro definitions
4140 @cindex displaying macro definitions
4141 @cindex macros, displaying definitions
4142 @cindex definitions, displaying macro
4143 @cindex standard error, output to
4144 If you want to see what a name expands into, you can use the builtin
4147 @deffn {Builtin (m4)} dumpdef (@ovar{name@dots{}})
4148 Accepts any number of arguments. If called without any arguments, it
4149 displays the definitions of all known names, otherwise it displays the
4150 definitions of each @var{name} given, sorted by name. If a @var{name}
4151 is undefined, the @samp{d} debug level controls whether a warning is
4152 issued (@pxref{Debugmode}). Likewise, the @samp{o} debug level controls
4153 whether the output is issued to standard error or the current debug
4154 file (@pxref{Debugfile}).
4156 The expansion of @code{dumpdef} is void.
4161 define(`foo', `Hello world.')
4164 @error{}foo:@tabchar{}`Hello world.'
4167 @error{}define:@tabchar{}<define>
4171 The last example shows how builtin macro definitions are displayed.
4172 The definition that is dumped corresponds to what would occur if the
4173 macro were to be called at that point, even if other definitions are
4174 still live due to redefining a macro during argument collection.
4178 pushdef(`f', ``$0'1')pushdef(`f', ``$0'2')
4180 f(popdef(`f')dumpdef(`f'))
4181 @error{}f:@tabchar{}``$0'1'
4183 f(popdef(`f')dumpdef(`f'))
4184 @error{}m4:stdin:3: warning: dumpdef: undefined macro 'f'
4192 @xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and
4193 @samp{s} flags affect the details of the display. Remember, the
4194 @samp{q} flag is implied when the @option{--debug} option (@option{-d},
4195 @pxref{Debugging options, , Invoking m4}) is used on the command line
4196 without arguments. Also, @option{--debuglen} (@pxref{Debuglen}) can affect
4197 output, by truncating longer strings (but not builtin and module names).
4199 @comment options: -ds -l3
4202 pushdef(`foo', `1 long string')
4204 pushdef(`foo', defn(`divnum'))
4210 dumpdef(`foo', `dnl', `indir', `__gnu__')
4211 @error{}__gnu__:@tabchar{}@{gnu@}
4212 @error{}dnl:@tabchar{}<dnl>@{m4@}
4213 @error{}foo:@tabchar{}3, <divnum>@{m4@}, 1 l...
4214 @error{}indir:@tabchar{}<indir>@{gnu@}
4216 debugmode(`-ms')debugmode(`+q')
4219 @error{}foo:@tabchar{}`3'
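The sorting rule in the description of @code{dumpdef} can be checked
with a short sketch. The macro names are arbitrary, and the display
assumes the default debug flags selected by @option{-d} (in particular
@samp{q}), so the quoting in the output may differ under other settings.

@example
$ @kbd{m4 -d}
define(`zeta', `last')define(`alpha', `first')dnl
dumpdef(`zeta', `alpha')
@error{}alpha:@tabchar{}`first'
@error{}zeta:@tabchar{}`last'
@result{}
@end example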
4224 @section Tracing macro calls
4226 @cindex tracing macro expansion
4227 @cindex macro expansion, tracing
4228 @cindex expansion, tracing macro
4229 @cindex standard error, output to
4230 It is possible to trace macro calls and expansions through the builtins
4231 @code{traceon} and @code{traceoff}:
4233 @deffn {Builtin (m4)} traceon (@ovar{names@dots{}})
4234 @deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}})
4235 When called without any arguments, @code{traceon} and @code{traceoff}
4236 will turn tracing on and off, respectively, for all macros, identical to
4237 using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}).
4239 When called with arguments, only the macros listed in @var{names} are
4240 affected, whether or not they are currently defined. A macro's
4241 expansion will be traced if global tracing is on, or if the individual
4242 macro tracing flag is set; to avoid tracing a macro, both the global
4243 flag and the macro must have tracing off.
4245 The expansion of @code{traceon} and @code{traceoff} is void.
4248 Whenever a traced macro is called and the arguments have been collected,
4249 the call is displayed. If the expansion of the macro call is not void,
4250 the expansion can be displayed after the call. The output is printed
4251 to the current debug file (defaulting to standard error,
4256 define(`foo', `Hello World.')
4258 define(`echo', `$@@')
4260 traceon(`foo', `echo')
4263 @error{}m4trace: -1- foo -> `Hello World.'
4264 @result{}Hello World.
4265 echo(`gnus', `and gnats')
4266 @error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats''
4267 @result{}gnus,and gnats
4270 The number between dashes is the depth of the expansion. It is one most
4271 of the time, signifying an expansion at the outermost level, but it
4272 increases when macro arguments contain unquoted macro calls. The
4273 maximum number that will appear between dashes is controlled by the
4274 option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control,
4275 , Invoking m4}). Additionally, the option @option{--trace} (or
4276 @option{-t}) can be used to invoke @code{traceon(@var{name})} before
4279 @comment options: -d-V -L3 -tifelse
4282 $ @kbd{m4 -L 3 -t ifelse}
4284 @error{}m4trace: -1- ifelse
4286 ifelse(ifelse(ifelse(`three levels')))
4287 @error{}m4trace: -3- ifelse
4288 @error{}m4trace: -2- ifelse
4289 @error{}m4trace: -1- ifelse
4291 ifelse(ifelse(ifelse(ifelse(`four levels'))))
4292 @error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it
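The depth counter also rises for ordinary user macros. In this sketch
(hypothetical macros @code{inner} and @code{outer}, default
@option{-d} flags assumed), @code{inner} is expanded while the argument
of @code{outer} is still being collected, so it is reported at depth 2
and before the enclosing call.

@example
$ @kbd{m4 -d}
define(`inner', `x')define(`outer', `[$1]')dnl
traceon(`inner', `outer')dnl
outer(inner)
@error{}m4trace: -2- inner -> `x'
@error{}m4trace: -1- outer(`x') -> `[x]'
@result{}[x]
@end example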
4295 Tracing by name is an attribute that is preserved whether the macro is
4296 defined or not. This allows the selection of macros to trace before
4297 those macros are defined.
4308 @error{}m4:stdin:4: warning: defn: undefined macro 'foo'
4311 @error{}m4:stdin:5: warning: undefine: undefined macro 'foo'
4318 @error{}m4:stdin:8: warning: popdef: undefined macro 'foo'
4320 define(`foo', `bar')
4323 @error{}m4trace: -1- foo -> `bar'
4327 ifdef(`foo', `yes', `no')
4330 @error{}m4:stdin:13: warning: indir: undefined macro 'foo'
4332 define(`foo', `blah')
4335 @error{}m4trace: -1- foo -> `blah'
4339 Tracing even works on builtins. However, @code{defn} (@pxref{Defn})
4340 does not transfer tracing status.
4347 @error{}m4trace: -1- traceon(`traceoff') -> `'
4349 traceoff(`traceoff')
4350 @error{}m4trace: -1- traceoff(`traceoff') -> `'
4354 traceon(`eval', `m4_divnum')
4356 define(`m4_eval', defn(`eval'))
4358 define(`m4_divnum', defn(`divnum'))
4361 @error{}m4trace: -1- eval(`0') -> `0'
4364 @error{}m4trace: -2- m4_divnum -> `0'
4368 As of GNU M4 2.0, named macro tracing is independent of global
4369 tracing status; calling @code{traceoff} without arguments turns off the
4370 global trace flag, but does not turn off tracing for macros where
4371 tracing was requested by name. Likewise, calling @code{traceon} without
4372 arguments will affect tracing of macros that are not defined yet. This
4373 behavior matches traditional implementations of @code{m4}.
4379 define(`foo', `bar')
4380 @error{}m4trace: -1- define(`foo', `bar') -> `'
4382 foo # traced, even though foo was not defined at traceon
4383 @error{}m4trace: -1- foo -> `bar'
4384 @result{}bar # traced, even though foo was not defined at traceon
4386 @error{}m4trace: -1- traceoff(`foo') -> `'
4388 foo # traced, since global tracing is still on
4389 @error{}m4trace: -1- foo -> `bar'
4390 @result{}bar # traced, since global tracing is still on
4392 @error{}m4trace: -1- traceon(`foo') -> `'
4395 @error{}m4trace: -1- traceoff -> `'
4397 foo # traced, since foo is now traced by name
4398 @error{}m4trace: -1- foo -> `bar'
4399 @result{}bar # traced, since foo is now traced by name
4403 @result{}bar # untraced
4406 However, GNU M4 prior to 2.0 had slightly different
4407 semantics, where @code{traceon} without arguments only affected symbols
4408 that were defined at that moment, and @code{traceoff} without arguments
4409 stopped all tracing, even when tracing was requested by macro name. The
4410 addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it
4411 possible to write a file that approximates the older semantics
4412 regardless of which version of GNU M4 is in use.
4414 @comment options: -d-V
4418 `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)',
4419 `builtin(`traceon', $@@)')')dnl
4420 define(`traceoff', `ifelse(`$#', `0',
4421 `builtin(`traceoff')builtin(`traceoff', m4symbols)',
4422 `builtin(`traceoff', $@@)')')')dnl
4425 traceon # called before b is defined, so b is not traced
4426 @result{} # called before b is defined, so b is not traced
4428 @error{}m4trace: -1- define
4431 @error{}m4trace: -1- a
4434 @error{}m4trace: -1- traceon
4435 @error{}m4trace: -1- ifelse
4436 @error{}m4trace: -1- builtin
4439 @error{}m4trace: -1- a
4440 @error{}m4trace: -1- b
4442 traceoff # stops tracing b, even though it was traced by name
4443 @error{}m4trace: -1- traceoff
4444 @error{}m4trace: -1- ifelse
4445 @error{}m4trace: -1- builtin
4446 @error{}m4trace: -2- m4symbols
4447 @error{}m4trace: -1- builtin
4448 @result{} # stops tracing b, even though it was traced by name
4453 @xref{Debugmode}, for information on controlling the details of the
4454 display. The format of the trace output is not specified by
4455 POSIX, and varies between implementations of @code{m4}.
4457 Starting with M4 1.6, tracing also works via @code{indir}
4458 (@pxref{Indir}). However, since tracing is an attribute tracked by
4459 macro names, and @code{builtin} bypasses macro names (@pxref{Builtin}),
4460 it is not possible for @code{builtin} to trace which subsidiary builtin
4461 it invokes. If you are worried about tracking all invocations of a
4462 given builtin, you should also trace @code{builtin}, or enable global
4463 tracing (the @samp{t} debug level, @pxref{Debugmode}).
4467 define(`my_defn', defn(`defn'))undefine(`defn')
4469 define(`foo', `bar')traceon(`foo', `defn', `my_defn')
4472 @error{}m4trace: -1- foo -> `bar'
4475 @error{}m4trace: -1- foo -> `bar'
4478 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4480 indir(`my_defn', `foo')
4481 @error{}m4trace: -1- my_defn(`foo') -> ``bar''
4483 builtin(`defn', `foo')
4487 builtin(`defn', builtin(`shift', `', `foo'))
4488 @error{}m4trace: -1- id 12: builtin ... = <builtin>
4489 @error{}m4trace: -2- id 13: builtin ... = <builtin>
4490 @error{}m4trace: -2- id 13: builtin(`shift', `', `foo') -> ``foo''
4491 @error{}m4trace: -1- id 12: builtin(`defn', `foo') -> ``bar''
4493 indir(`my_defn', indir(`shift', `', `foo'))
4494 @error{}m4trace: -1- id 14: indir ... = <indir>
4495 @error{}m4trace: -2- id 15: indir ... = <indir>
4496 @error{}m4trace: -2- id 15: shift ... = <shift>
4497 @error{}m4trace: -2- id 15: shift(`', `foo') -> ``foo''
4498 @error{}m4trace: -2- id 15: indir(`shift', `', `foo') -> ``foo''
4499 @error{}m4trace: -1- id 14: my_defn ... = <defn>
4500 @error{}m4trace: -1- id 14: my_defn(`foo') -> ``bar''
4501 @error{}m4trace: -1- id 14: indir(`my_defn', `foo') -> ``bar''
4506 @section Controlling debugging options
4508 @cindex controlling debugging output
4509 @cindex debugging output, controlling
4510 The @option{--debug} option to @code{m4} (also spelled
4511 @option{--debugmode} or @option{-d}, @pxref{Debugging options, ,
4512 Invoking m4}) controls the amount of detail presented in three
4513 categories of output. Trace output is requested by @code{traceon}
4514 (@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in
4515 relation to a macro invocation. Debug output tracks useful events not
4516 associated with a macro invocation, and each line is prefixed by
4517 @samp{m4debug:}. Finally, @code{dumpdef} (@pxref{Dumpdef}) output is
4518 affected, with no prefix added to the output lines.
4520 The @var{flags} following the option can be one or more of the
4525 In trace output, show the actual arguments that were collected before
4526 invoking the macro. Arguments are subject to length truncation
4527 specified by @code{debuglen} (@pxref{Debuglen}).
4530 In trace output, show an additional line for each macro call, when the
4531 macro is seen, but before the arguments are collected, and show the
4532 definition of the macro that will be used for the expansion. By
4533 default, only one line is printed, after all arguments are collected and
4534 the expansion determined. The definition is subject to length
4535 truncation specified by @code{debuglen} (@pxref{Debuglen}). This is
4536 often used with the @samp{x} flag.
4539 Output a warning on any attempt to dereference an undefined macro via
4540 @code{builtin}, @code{defn}, @code{dumpdef}, @code{indir},
4541 @code{popdef}, or @code{undefine}. Note that @code{ifdef},
4543 @code{traceon}, and @code{traceoff} do not dereference undefined macros.
4544 Like any other warning, the warnings enabled by this flag go to standard
4545 error regardless of the current @code{debugfile} setting, and will
4546 change exit status if the command line option @option{--fatal-warnings}
4547 was specified. This flag is useful in diagnosing spelling mistakes in
4548 macro names. It is enabled by default when neither @option{--debug} nor
4549 @option{--fatal-warnings} is specified on the command line.
4552 In trace output, show the expansion of each macro call. The expansion
4553 is subject to length truncation specified by @code{debuglen}
4557 In debug and trace output, include the name of the current input file in
4561 In debug output, print a message each time the current input file is
4565 In debug and trace output, include the current input line number in the
4569 In debug output, print a message each time a module is manipulated
4570 (@pxref{Modules}). In trace output when the @samp{c} flag is in effect,
4571 and in dumpdef output, follow builtin macros with their module name,
4572 surrounded by braces (@samp{@{@}}).
4575 Output @code{dumpdef} data to standard error instead of the current
4576 debug file. This can be useful when post-processing trace output, where
4577 interleaving dumpdef and trace output can cause ambiguities.
4580 In debug output, print a message when a named file is found through the
4581 path search mechanism (@pxref{Search Path}), giving the actual file name
4585 In trace and dumpdef output, quote actual arguments and macro expansions
4586 in the display with the current quotes. This is useful in connection
4587 with the @samp{a} and @samp{e} flags above.
4590 In dumpdef output, show the entire stack of definitions associated with
4591 a symbol via @code{pushdef}.
4594 In trace output, trace all macro calls made in this invocation of
4595 @code{m4}. This is equivalent to using @code{traceon} without
4599 In trace output, add a unique `macro call id' to each line of the trace
4600 output. This is useful in connection with the @samp{c} flag above, to
4601 match where a macro is first recognized with where it is finally
4602 expanded, in spite of intermediate expansions that occur while
4603 collecting arguments. It can also be used in isolation to determine how
4604 many macros have been expanded.
4607 A shorthand for all of the above flags.
4610 As special cases, if @var{flags} starts with a @samp{+}, the named flags
4611 are enabled without impacting other flags, and if it starts with a
4612 @samp{-}, the named flags are disabled without impacting other flags.
4613 Without either of these starting characters, @var{flags} simply replaces
4614 the previous setting.
4615 @comment FIXME - should we accept usage like debugmode(+fl-q)? Also,
4616 @comment should we add debugmode(?) which expands to the current
4617 @comment enabled flags, and debugmode(e?) which expands to e if e is
4618 @comment currently enabled?
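The three forms can be compared with the @code{debugmode} builtin, as in
the following sketch. The macro @code{foo} is illustrative, and the
trace layout shown is an assumption modeled on the other examples in
this chapter, so details may vary between versions of @code{m4}.

@example
$ @kbd{m4 -d}
define(`foo', `bar')traceon(`foo')dnl
foo
@error{}m4trace: -1- foo -> `bar'
@result{}bar
debugmode(`+l')dnl adds l, other flags untouched
foo
@error{}m4trace:4: -1- foo -> `bar'
@result{}bar
debugmode(`-e')dnl removes e, other flags untouched
foo
@error{}m4trace:6: -1- foo
@result{}bar
debugmode(`e')dnl replaces the whole set with e alone
foo
@error{}m4trace: -1- foo -> bar
@result{}bar
@end example

After the last call, neither the line number nor the quoting is shown,
since @samp{l} and @samp{q} were dropped along with every other flag
that was not named.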
4620 If no flags are specified with the @option{--debug} option, the default is
4621 @samp{+adeq}. Many examples in this manual show their output using
4624 @cindex GNU extensions
4625 There is a builtin macro @code{debugmode}, which allows on-the-fly control of
4626 the debugging output format:
4628 @deffn {Builtin (gnu)} debugmode (@ovar{flags})
4629 The argument @var{flags} should be a subset of the letters listed above.
4630 If no argument is present, all debugging flags are cleared (as if
4631 @var{flags} were an explicit @samp{-V}). With an empty argument, the
4632 most common flags are enabled (as if @var{flags} were an explicit
4633 @samp{+adeq}). If an unknown flag is encountered, an error is issued.
4635 The expansion of @code{debugmode} is void.
4638 @comment options: -d-V
4641 define(`foo', `FOO$1')
4643 traceon(`foo', `divnum')
4645 debugmode()dnl same as debugmode(`+adeq')
4647 @error{}m4trace: -1- foo -> `FOO'
4649 debugmode(`V')debugmode(`-q')
4650 @error{}m4trace:stdin:5: -1- id 7: debugmode ... = <debugmode>@{gnu@}
4651 @error{}m4trace:stdin:5: -1- id 7: debugmode(`-q') -> `'
4655 @error{}m4trace:stdin:6: -1- id 8: foo ... = FOO$1
4656 @error{}m4trace:stdin:6: -1- id 8: foo(BAR) -> FOOBAR
4658 debugmode`'dnl same as debugmode(`-V')
4659 @error{}m4trace:stdin:8: -1- id 9: debugmode ... = <debugmode>@{gnu@}
4660 @error{}m4trace:stdin:8: -1- id 9: debugmode ->@w{ }
4662 @error{}m4trace: -1- foo
4667 @error{}m4trace:11: -1- id 13: foo ... = FOO$1
4668 @error{}m4trace:11: -2- id 14: divnum ... = <divnum>@{m4@}
4669 @error{}m4trace:11: -2- id 14: divnum
4670 @error{}m4trace:11: -1- id 13: foo
4676 This example shows the effects of the debug flags that are not related
4680 @comment options: -dip
4682 $ @kbd{m4 -dip -I doc/examples}
4683 @error{}m4debug: input read from 'stdin'
4684 define(`foo', `m4wrap(`wrapped text
4687 include(`incl.m4')dnl
4688 @error{}m4debug: path search for 'incl.m4' found 'doc/examples/incl.m4'
4689 @error{}m4debug: input read from 'doc/examples/incl.m4'
4690 @result{}Include file start
4691 @result{}Include file end
4692 @error{}m4debug: input reverted to stdin, line 3
4694 @error{}m4debug: input exhausted
4695 @error{}m4debug: input from m4wrap recursion level 1
4696 @result{}wrapped text
4697 @error{}m4debug: input from m4wrap exhausted
4701 @section Limiting debug output
4703 @cindex GNU extensions
4706 @cindex limiting trace output length
4707 @cindex trace output, limiting length
4708 @cindex dumpdef output, limiting length
4709 When debugging, sometimes it is desirable to reduce the clutter of
4710 arbitrary-length strings, because the prefix carries enough information
4711 to understand the issues. The builtin macro @code{debuglen}, along with
4712 the command line option counterpart @option{--debuglen} (or @option{-l},
4713 @pxref{Debugging options, , Invoking m4}), allows on-the-fly control of
4714 debugging string lengths:
4716 @deffn {Builtin (gnu)} debuglen (@var{len})
4717 The argument @var{len} is an integer that controls how much of
4718 arbitrary-length strings should be output during trace and dumpdef
4719 output. If set to a nonzero value, then strings longer than that
4720 length are truncated, and @samp{...} included in the output to show that
4721 truncation took place. A warning is issued if @var{len} cannot be
4722 parsed as an integer.
4723 @comment FIXME - make this understand an optional suffix, similar to how
4724 @comment --debuglen does. Also, we need a section documenting scaling
4726 @comment FIXME - should we allow len to be `?', meaning expand to the
4727 @comment current value?
4729 The macro @code{debuglen} is recognized only with parameters.
4732 The following example demonstrates the behavior of length truncation.
4733 Note that each argument and the final result are individually truncated.
4734 Also, the special tokens for builtin functions are not truncated.
4736 @comment options: -l6 -techo -tdefn
4738 $ @kbd{m4 -d -l 6 -t echo -t defn}
4740 @error{}m4:stdin:1: warning: debuglen: non-numeric argument 'oops'
4742 define(`echo', `$@@')
4744 echo(`1', `long string')
4745 @error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
4746 @result{}1,long string
4747 indir(`echo', defn(`changequote'))
4748 @error{}m4trace: -2- defn(`change...') -> `<changequote>'
4749 @error{}m4trace: -1- echo(<changequote>) -> ``<changequote>''
4756 @error{}m4trace: -1- echo(`long string') -> ``long string''
4757 @result{}long string
4761 @error{}m4trace: -1- echo(`long string') -> ``long string...'
4762 @result{}long string
4766 @section Saving debugging output
4768 @cindex saving debugging output
4769 @cindex debugging output, saving
4770 @cindex output, saving debugging
4771 @cindex GNU extensions
4772 Debug and tracing output can be redirected to files using either the
4773 @option{--debugfile} option to @code{m4} (@pxref{Debugging options, ,
4774 Invoking m4}), or with the builtin macro @code{debugfile}:
4776 @deffn {Builtin (gnu)} debugfile (@ovar{file})
4777 Send all further debug and trace output to @var{file}, opened in append
4778 mode. If @var{file} is the empty string, debug and trace output are
4779 discarded. If @code{debugfile} is called without any arguments, debug
4780 and trace output are sent to standard error. Output from @code{dumpdef}
4781 is sent to this file if the debug level @code{o} is not set
4782 (@pxref{Debugmode}). This does not affect
4783 warnings, error messages, or @code{errprint} output, which are
4784 always sent to standard error. If @var{file} cannot be opened, the
4785 current debug file is unchanged, and an error is issued.
4787 When the @option{--safer} option (@pxref{Operation modes, , Invoking
4788 m4}) is in effect, @var{file} must be empty or omitted, since otherwise
4789 an input file could cause the modification of arbitrary files.
4791 The expansion of @code{debugfile} is void.
4799 @error{}m4:stdin:2: warning: divnum: extra arguments ignored: 1 > 0
4800 @error{}m4trace: -1- divnum(`extra') -> `0'
4805 @error{}m4:stdin:4: warning: divnum: extra arguments ignored: 1 > 0
4810 @error{}m4trace: -1- divnum -> `0'
4814 Although the @option{--safer} option cripples @code{debugfile} to a
4815 limited subset of capabilities, you may still use the @option{--debugfile}
4816 option from the command line with no restrictions.
4818 @comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l
4821 $ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq}
4822 foo # traced to `trace'
4823 @result{}bar # traced to `trace'
4825 @error{}m4:stdin:2: debugfile: disabled by --safer
4827 foo # traced to `trace'
4828 @result{}bar # traced to `trace'
4831 foo # trace discarded
4832 @result{}bar # trace discarded
4835 foo # traced to stderr
4836 @error{}m4trace:7: -1- foo -> `bar'
4837 @result{}bar # traced to stderr
4838 undivert(`trace')dnl
4839 @result{}m4trace:1: -1- foo -> `bar'
4840 @result{}m4trace:3: -1- foo -> `bar'
4843 Sometimes it is useful to post-process trace output, even though there
4844 is no standardized format for trace output. In this situation, forcing
4845 @code{dumpdef} to output to standard error instead of the default of the
4846 current debug file will avoid any ambiguities between the two types of
4847 output; it also allows debugging via @code{dumpdef} when debug output is
4855 @error{}m4trace: -1- divnum -> `0'
4858 @error{}divnum:@tabchar{}<divnum>
4871 @error{}divnum:@tabchar{}<divnum>
4876 @chapter Input control
4878 This chapter describes various builtin macros for controlling the input
4882 * Dnl:: Deleting whitespace in input
4883 * Changequote:: Changing the quote characters
4884 * Changecom:: Changing the comment delimiters
4885 * Changeresyntax:: Changing the regular expression syntax
4886 * Changesyntax:: Changing the lexical structure of the input
4887 * M4wrap:: Saving text until end of input
4891 @section Deleting whitespace in input
4893 @cindex deleting whitespace in input
4894 @cindex discarding input
4895 @cindex input, discarding
4896 The builtin @code{dnl} stands for ``Discard to Next Line'':
4898 @deffn {Builtin (m4)} dnl
4899 All characters, up to and including the next newline, are discarded
4900 without performing any macro expansion. A warning is issued if the end
4901 of the file is encountered without a newline.
4903 The expansion of @code{dnl} is void.
4906 It is often used in connection with @code{define}, to remove the
4907 newline that follows the call to @code{define}. Thus
4910 define(`foo', `Macro `foo'.')dnl A very simple macro, indeed.
4915 The input up to and including the next newline is discarded, as opposed
4916 to the way comments are treated (@pxref{Comments}), when the command
4917 line option @option{--discard-comments} is not in effect
4918 (@pxref{Operation modes, , Invoking m4}).
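
The contrast can be sketched with a short session, assuming that
@option{--discard-comments} is not in effect; the macro name @code{foo}
and the sample text are chosen purely for illustration. The text after
@code{dnl} vanishes together with its newline, while the comment on the
second line is copied to the output without expansion.

@example
define(`foo', `bar')dnl this text and the newline are discarded
foo # foo is not expanded inside a comment
@result{}bar # foo is not expanded inside a comment
@end example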
4920 Usually, @code{dnl} is immediately followed by an end of line or some
4921 other whitespace. GNU @code{m4} will produce a warning diagnostic if
4922 @code{dnl} is followed by an open parenthesis. In this case, @code{dnl}
4923 will collect and process all arguments, looking for a matching close
4924 parenthesis. All predictable side effects resulting from this
4925 collection will take place. @code{dnl} will return no output. The
4926 input following the matching close parenthesis up to and including the
4927 next newline, on whichever line contains it, will still be discarded.
4930 dnl(`args are ignored, but side effects occur',
4931 define(`foo', `like this')) while this text is ignored: undefine(`foo')
4932 @error{}m4:stdin:1: warning: dnl: extra arguments ignored: 2 > 0
4933 See how `foo' was defined, foo?
4934 @result{}See how foo was defined, like this?
4937 If the end of file is encountered without a newline character, a
4938 warning is issued and @code{dnl} stops consuming input.
4941 m4wrap(`m4wrap(`2 hi
4947 @error{}m4:stdin:1: warning: dnl: end of file treated as newline
4952 @section Changing the quote characters
4954 @cindex changing quote delimiters
4955 @cindex quote delimiters, changing
4956 @cindex delimiters, changing
4957 The default quote delimiters can be changed with the builtin
4960 @deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '})
4961 This sets @var{start} as the new begin-quote delimiter and @var{end} as
4962 the new end-quote delimiter. If both arguments are missing, the default
4963 quotes (@code{`} and @code{'}) are used. If @var{start} is void, then
4964 quoting is disabled. Otherwise, if @var{end} is missing or void, the
4965 default end-quote delimiter (@code{'}) is used. The quote delimiters
4966 can be of any length.
4968 The expansion of @code{changequote} is void.
4972 changequote(`[', `]')
4974 define([foo], [Macro [foo].])
4980 The quotation strings can safely contain eight-bit characters.
4981 If no single character is appropriate, @var{start} and @var{end} can be
4982 of any length. Other implementations cap the delimiter length to five
4983 characters, but GNU has no inherent limit.
4986 changequote(`[[[', `]]]')
4988 define([[[foo]]], [[[Macro [[[[[foo]]]]].]]])
4991 @result{}Macro [[foo]].
4994 Calling @code{changequote} with @var{start} as the empty string will
4995 effectively disable the quoting mechanism, leaving no way to quote text.
4996 However, using an empty string is not portable, as some other
4997 implementations of @code{m4} revert to the default quoting, while others
4998 preserve the prior non-empty delimiter. If @var{start} is not empty,
4999 then an empty @var{end} will use the default end-quote delimiter of
5000 @samp{'}, as otherwise, it would be impossible to end a quoted string.
5001 Again, this is not portable, as some other @code{m4} implementations
5002 reuse @var{start} as the end-quote delimiter, while others preserve the
5003 previous non-empty value. Omitting both arguments restores the default
5004 begin-quote and end-quote delimiters; fortunately this behavior is
5005 portable to all implementations of @code{m4}.
5008 define(`foo', `Macro `FOO'.')
5013 @result{}Macro `FOO'.
5015 @result{}`Macro `FOO'.'
5022 There is no way in @code{m4} to quote a string containing an unmatched
5023 begin-quote, except using @code{changequote} to change the current
5026 If the quotes should be changed from, say, @samp{[} to @samp{[[},
5027 temporary quote characters have to be defined. To achieve this, two
5028 calls of @code{changequote} must be made, one for the temporary quotes
5029 and one for the new quotes.
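
As a sketch of this two-step change, the session below uses @samp{<}
and @samp{>} as the temporary quotes; any pair of characters that does
not clash with the old or new delimiters should work equally well, and
the macro name @code{item} is only illustrative.

@example
changequote(`[', `]')dnl
changequote([<], [>])dnl temporary quotes
changequote(<[[>, <]]>)dnl the quotes that were wanted
define([[item]], [[a [b] c]])dnl
item
@result{}a [b] c
@end example

Once @samp{[[} and @samp{]]} are in effect, a single @samp{[} or
@samp{]} is ordinary text, which is why the brackets around @samp{b}
survive in the output.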
5031 Macros are recognized in preference to the begin-quote string, so if a
5032 prefix of @var{start} can be recognized as part of a potential macro
5033 name, the quoting mechanism is effectively disabled. Unless you use
5034 @code{changesyntax} (@pxref{Changesyntax}), this means that @var{start}
5035 should not begin with a letter, digit, or @samp{_} (underscore).
5036 However, even though quoted strings are not recognized, the quote
5037 characters can still be discerned in macro expansion and in trace
5041 define(`echo', `$@@')
5045 changequote(`q', `Q')
5053 changequote(`-', `EOF')
5059 changequote(`1', `2')
5067 Quotes are recognized in preference to argument collection. In
5068 particular, if @var{start} is a single @samp{(}, then argument
5069 collection is effectively disabled. For portability with other
5070 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5071 @samp{)} as the first character in @var{start}.
5074 define(`echo', `$#:$@@:')
5078 changequote(`(',`)')
5084 changequote(`((', `))')
5092 changequote(`,', `)')
5098 However, if you are not worried about portability, using @samp{(} and
5099 @samp{)} as quoting characters has an interesting property---you can use
5100 them to compute a quoted string containing the expansion of any quoted
5101 text, as long as the expansion results in both balanced quotes and
5102 balanced parentheses. The trick is realizing @code{expand} uses
5103 @samp{$1} unquoted, to trigger its expansion using the normal quoting
5104 characters, but uses extra parentheses to group unquoted commas that
5105 occur in the expansion without consuming whitespace following those
5106 commas. Then @code{_expand} uses @code{changequote} to convert the
5107 extra parentheses back into quoting characters. Note that it takes two
5108 more @code{changequote} invocations to restore the original quotes.
5109 Contrast the behavior on whitespace when using @samp{$*}, via
5110 @code{quote}, to attempt the same task.
5113 changequote(`[', `]')dnl
5114 define([a], [1, (b)])dnl
5116 define([quote], [[$*]])dnl
5117 define([expand], [_$0(($1))])dnl
5119 [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl
5120 expand([a, a, [a, a], [[a, a]]])
5121 @result{}1, (2), 1, (2), a, a, [a, a]
5122 quote(a, a, [a, a], [[a, a]])
5123 @result{}1,(2),1,(2),a, a,[a, a]
5126 If @var{end} is a prefix of @var{start}, the end-quote will be
5127 recognized in preference to a nested begin-quote. In particular,
5128 changing the quotes to have the same string for @var{start} and
5129 @var{end} disables nesting of quotes. When quote nesting is disabled,
5130 it is impossible to double-quote strings across macro expansions, so
5131 using the same string is not done very often.
5136 changequote(`""', `"')
5148 changequote(`"', `"')
5154 It is an error if the end of file occurs within a quoted string.
5159 @result{}hello world
5162 @error{}m4:stdin:2: end of file in string
5167 ifelse(`dangling quote
5169 @error{}m4:stdin:1: ifelse: end of file in string
5173 @section Changing the comment delimiters
5175 @cindex changing comment delimiters
5176 @cindex comment delimiters, changing
5177 @cindex delimiters, changing
5178 The default comment delimiters can be changed with the builtin
5179 macro @code{changecom}:
5181 @deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}})
5182 This sets @var{start} as the new begin-comment delimiter and @var{end}
5183 as the new end-comment delimiter. If both arguments are missing, or
5184 @var{start} is void, then comments are disabled. Otherwise, if
5185 @var{end} is missing or void, the default end-comment delimiter of
5186 newline is used. The comment delimiters can be of any length.
5188 The expansion of @code{changecom} is void.
5192 define(`comment', `COMMENT')
5195 @result{}# A normal comment
5196 changecom(`/*', `*/')
5198 # Not a comment anymore
5199 @result{}# Not a COMMENT anymore
5200 But: /* this is a comment now */ while this is not a comment
5201 @result{}But: /* this is a comment now */ while this is not a COMMENT
5204 @cindex comments, copied to output
5205 Note how comments are copied to the output, much as if they were quoted
5206 strings. If you want the text inside a comment expanded, quote the
5207 begin-comment delimiter.
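
For instance, in the following sketch of a fresh session (the macro
@code{comment} is defined only for illustration), quoting @samp{#} turns
it into ordinary text, so the rest of the line is scanned for macros as
usual.

@example
define(`comment', `COMMENT')
@result{}
# this comment is copied verbatim
@result{}# this comment is copied verbatim
`#' this comment is expanded
@result{}# this COMMENT is expanded
@end example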
5209 Calling @code{changecom} without any arguments, or with @var{start} as
5210 the empty string, will effectively disable the commenting mechanism. To
5211 restore the original comment start of @samp{#}, you must explicitly ask
5212 for it. If @var{start} is not empty, then an empty @var{end} will use
5213 the default end-comment delimiter of newline, as otherwise, it would be
5214 impossible to end a comment. However, this is not portable, as some
5215 other @code{m4} implementations preserve the previous non-empty
5219 define(`comment', `COMMENT')
5223 # Not a comment anymore
5224 @result{}# Not a COMMENT anymore
5228 @result{}# comment again
5231 The comment strings can safely contain eight-bit characters.
5232 If no single character is appropriate, @var{start} and @var{end} can be
5233 of any length. Other implementations cap the delimiter length to five
5234 characters, but GNU has no inherent limit.
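
A brief sketch with multi-character delimiters, picked only for
illustration, shows a four-character begin-comment string paired with a
three-character end-comment string:

@example
define(`comment', `COMMENT')
@result{}
changecom(`<!--', `-->')
@result{}
<!-- a comment --> not a comment
@result{}<!-- a comment --> not a COMMENT
@end example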
5236 As of M4 1.6, macros and quotes are recognized in preference to
5237 comments, so if a prefix of @var{start} can be recognized as part of a
5238 potential macro name, or confused with a quoted string, the comment
5239 mechanism is effectively disabled (earlier versions of GNU M4
5240 favored comments, but this was inconsistent with other implementations).
5241 Unless you use @code{changesyntax} (@pxref{Changesyntax}), this means
5242 that @var{start} should not begin with a letter, digit, or @samp{_}
5243 (underscore), and that neither the start-quote nor the start-comment
5244 string should be a prefix of the other.
5249 define(`hi1hi2', `hello')
5261 changecom(`[[', `]]')
5263 changequote(`[[[', `]]]')
5273 changecom(`[[[', `]]]')
5275 changequote(`[[', `]]')
5283 Comments are recognized in preference to argument collection. In
5284 particular, if @var{start} is a single @samp{(}, then argument
5285 collection is effectively disabled. For portability with other
5286 implementations, it is a good idea to avoid @samp{(}, @samp{,}, and
5287 @samp{)} as the first character in @var{start}.
5290 define(`echo', `$#:$*:$@@:')
5300 changecom(`((', `))')
5309 @result{}1:HI,hi)bye:HI,hi)bye:
5313 @result{}3:HI,,HI,HI:HI,,`'hi,HI:
5314 echo(hi,`,`'hi',hi`'changecom(`,,', `hi'))
5315 @result{}3:HI,,`'hi,HI:HI,,`'hi,HI:
5318 It is an error if the end of file occurs within a comment.
5322 changecom(`/*', `*/')
5326 @error{}m4:stdin:2: end of file in comment
5331 changecom(`/*', `*/')
5333 len(/*dangling comment
5335 @error{}m4:stdin:2: len: end of file in comment
5338 @node Changeresyntax
5339 @section Changing the regular expression syntax
5341 @cindex regular expression syntax, changing
5342 @cindex basic regular expressions
5343 @cindex extended regular expressions
5344 @cindex regular expressions
5345 @cindex expressions, regular
5346 @cindex syntax, changing regular expression
5347 @cindex flavors of regular expressions
5348 @cindex GNU extensions
5349 The GNU extensions @code{patsubst}, @code{regexp}, and, more
5350 recently, @code{renamesyms} each deal with regular expressions. There
5351 are multiple flavors of regular expressions, so the
5352 @code{changeresyntax} builtin exists to allow choosing the default
5355 @deffn {Builtin (gnu)} changeresyntax (@var{resyntax})
5356 Changes the default regular expression syntax used by M4 according to
5357 the value of @var{resyntax}, equivalent to passing @var{resyntax} as the
5358 argument to the command line option @option{--regexp-syntax}
5359 (@pxref{Operation modes, , Invoking m4}). If @var{resyntax} is empty,
5360 the default flavor is reverted to the @code{GNU_M4} style, compatible
5363 @var{resyntax} can be any one of the values in the table below. Case is
5364 not important, and @samp{-} or @samp{ } can be substituted for @samp{_} in
5365 the given names. If @var{resyntax} is unrecognized, a warning is
5366 issued and the default flavor is not changed.
5370 @xref{awk regular expression syntax}, for details.
5376 @xref{posix-basic regular expression syntax}, for details.
5380 @itemx POSIX_EXTENDED
5381 @xref{posix-extended regular expression syntax}, for details.
5385 @xref{gnu-awk regular expression syntax}, for details.
5389 @xref{egrep regular expression syntax}, for details.
5394 @xref{emacs regular expression syntax}, for details. This is the
5395 default regular expression flavor.
5398 @xref{grep regular expression syntax}, for details.
5401 @itemx POSIX_MINIMAL
5402 @itemx POSIX_MINIMAL_BASIC
5403 @xref{posix-minimal-basic regular expression syntax}, for details.
5406 @xref{posix-awk regular expression syntax}, for details.
5409 @xref{posix-egrep regular expression syntax}, for details.
5412 The expansion of @code{changeresyntax} is void.
5413 The macro @code{changeresyntax} is recognized only with parameters.
5414 This macro was added in M4 2.0.
5417 For an example of how @var{resyntax} is recognized, the first three
5418 usages select the @samp{GNU_M4} regular expression flavor:
5421 changeresyntax(`gnu m4')
5423 changeresyntax(`GNU-m4')
5425 changeresyntax(`Gnu_M4')
5427 changeresyntax(`unknown')
5428 @error{}m4:stdin:4: warning: changeresyntax: bad syntax-spec: 'unknown'
5432 Using @code{changeresyntax} makes it possible to omit the optional
5433 @var{resyntax} parameter to other macros, while still using a different
5434 regular expression flavor.
5437 patsubst(`ab', `a|b', `c')
5439 patsubst(`ab', `a\|b', `c')
5441 patsubst(`ab', `a|b', `c', `EXTENDED')
5443 changeresyntax(`EXTENDED')
5445 patsubst(`ab', `a|b', `c')
5447 patsubst(`ab', `a\|b', `c')
5452 @section Changing the lexical structure of the input
5454 @cindex lexical structure of the input
5455 @cindex input, lexical structure of the
5456 @cindex syntax table
5457 @cindex changing syntax
5458 @cindex GNU extensions
5460 The macro @code{changesyntax} and all associated functionality are
5461 experimental (@pxref{Experiments}). The functionality might change in
5462 the future. Please direct your comments about it the same way you would
5466 The input to @code{m4} is read character by character, and these
5467 characters are grouped together to form input tokens (such as macro
5468 names, strings, comments, etc.).
5470 Each token is parsed according to certain rules. For example, a macro
5471 name starts with a letter or @samp{_} and consists of the longest
5472 possible string of letters, @samp{_} and digits. But who is to decide
5473 what characters are letters, digits, quotes, white space? Earlier, the
5474 operating system decided; now you do. The builtin macro
5475 @code{changesyntax} is used to change the way @code{m4} parses the input
5478 @deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
5479 Each @var{syntax-spec} is a two-part string. The first part is a
5480 command, consisting of a single character describing a syntax category,
5481 and an optional one-character action. The action can be @samp{-} to
5482 remove the listed characters from that category, @samp{=} to set the
5483 category to the listed characters
5484 and reassign all other characters previously in that category to
5485 `Other', or @samp{+} to add the listed characters to the category
5486 without affecting other characters. If an action is not specified, but
5487 additional characters are present, then @samp{=} is assumed.
5489 The remaining characters of each @var{syntax-spec} form the set of
5490 characters to perform the action on for that syntax category. Character
5491 ranges are expanded as for @code{translit} (@pxref{Translit}). To start
5492 the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
5495 If @var{syntax-spec} is just a category, and no action or characters
5496 were specified, then all characters in that category are reset to their
5497 default state. A warning is issued if the category character is not
5498 valid. If @var{syntax-spec} is the empty string, then all categories
5499 are reset to their default state.
5501 Syntax categories are divided into basic and context. Every input
5502 byte belongs to exactly one basic syntax category. Additionally, any
5503 byte can be assigned to a context category regardless of its current
5504 basic category. Context categories exist because a character can
5505 behave differently when parsed in isolation than when it occurs in
5506 context to close out a token started by another basic category (for
5507 example, @kbd{newline} defaults to the basic category `White space' as
5508 well as the context category `End comment').
5510 The following table describes the case-insensitive designation for each
5511 syntax category (the first byte in @var{syntax-spec}), and a description
5512 of what each category controls.
5514 @multitable @columnfractions .06 .20 .13 .55
5515 @headitem Code @tab Category @tab Type @tab Description
5517 @item @kbd{W} @tab @dfn{Words} @tab Basic
5518 @tab Characters that can start a macro name. Defaults to the letters as
5519 defined by the locale, and the character @samp{_}.
5521 @item @kbd{D} @tab @dfn{Digits} @tab Basic
5522 @tab Characters that, together with the letters, form the remainder of a
5523 macro name. Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
5524 other digits defined by the locale.
5526 @item @kbd{S} @tab @dfn{White space} @tab Basic
5527 @tab Characters that should be trimmed from the beginning of each argument to
5528 a macro call. The defaults are space, tab, newline, carriage return,
5529 form feed, and vertical tab, and any others as defined by the locale.
5531 @item @kbd{(} @tab @dfn{Open parenthesis} @tab Basic
5532 @tab Characters that open the argument list of a macro call. The default is
5533 the single character @samp{(}.
5535 @item @kbd{)} @tab @dfn{Close parenthesis} @tab Basic
5536 @tab Characters that close the argument list of a macro call. The default
5537 is the single character @samp{)}.
5539 @item @kbd{,} @tab @dfn{Argument separator} @tab Basic
5540 @tab Characters that separate the arguments of a macro call. The default is
5541 the single character @samp{,}.
5543 @item @kbd{L} @tab @dfn{Left quote} @tab Basic
5544 @tab The set of characters that can start a single-character quoted string.
5545 The default is the single character @samp{`}. For multiple-character
5546 quote delimiters, use @code{changequote} (@pxref{Changequote}).
5548 @item @kbd{R} @tab @dfn{Right quote} @tab Context
5549 @tab The set of characters that can end a single-character quoted string.
5550 The default is the single character @samp{'}. For multiple-character
5551 quote delimiters, use @code{changequote} (@pxref{Changequote}). Note
5552 that @samp{'} also defaults to the syntax category `Other', when it
5553 appears in isolation.
5555 @item @kbd{B} @tab @dfn{Begin comment} @tab Basic
5556 @tab The set of characters that can start a single-character comment. The
5557 default is the single character @samp{#}. For multiple-character
5558 comment delimiters, use @code{changecom} (@pxref{Changecom}).
5560 @item @kbd{E} @tab @dfn{End comment} @tab Context
5561 @tab The set of characters that can end a single-character comment. The
5562 default is the single character @kbd{newline}. For multiple-character
5563 comment delimiters, use @code{changecom} (@pxref{Changecom}). Note that
5564 newline also defaults to the syntax category `White space', when it
5565 appears in isolation.
5567 @item @kbd{$} @tab @dfn{Dollar} @tab Context
5568 @tab Characters that can introduce an argument reference in the body of a
5569 macro. The default is the single character @samp{$}.
5571 @comment FIXME - implement ${10} argument parsing.
5572 @item @kbd{@{} @tab @dfn{Left brace} @tab Context
5573 @tab Characters that introduce an extended argument reference in the body of
5574 a macro immediately after a character in the Dollar category. The
5575 default is the single character @samp{@{}.
5577 @item @kbd{@}} @tab @dfn{Right brace} @tab Context
5578 @tab Characters that conclude an extended argument reference in the body of a
5579 macro. The default is the single character @samp{@}}.
5581 @item @kbd{O} @tab @dfn{Other} @tab Basic
5582 @tab Characters that have no special syntactical meaning to @code{m4}.
5583 Defaults to all characters except those in the categories above.
5585 @item @kbd{A} @tab @dfn{Active} @tab Basic
5586 @tab Characters that themselves, alone, form macro names. This is a
5587 GNU extension, and active characters have lower precedence
5588 than comments. By default, no characters are active.
5590 @item @kbd{@@} @tab @dfn{Escape} @tab Basic
5591 @tab Characters that must precede macro names for them to be recognized.
5592 This is a GNU extension. When an escape character is defined,
5593 then macros are not recognized unless the escape character is present;
5594 however, the macro name, visible by @samp{$0} in macro definitions, does
5595 not include the escape character. By default, no characters are
5598 @comment FIXME - we should also consider supporting:
5599 @comment @item @kbd{I} @tab @dfn{Ignore} @tab Basic
5600 @comment @tab Characters that are ignored if they appear in
5601 @comment the input; perhaps defaulting to '\0'.
5604 The expansion of @code{changesyntax} is void.
5605 The macro @code{changesyntax} is recognized only with parameters. Use
5606 this macro with caution, as it is possible to change the syntax in such
5607 a way that no further macros can be recognized by @code{m4}.
5608 This macro was added in M4 2.0.
5611 With @code{changesyntax} we can modify what characters form a word. For
5612 example, we can make @samp{.} a valid character in a macro name, or even
5613 start a macro name with a number.
5616 define(`test.1', `TEST ONE')
5624 dnl Add `.' and remove `_'.
5625 changesyntax(`W+.', `W-_')
5631 dnl Set words to include numbers.
5632 changesyntax(`W=a-zA-Z0-9_')
5638 dnl Reset words to default (a-zA-Z_).
5647 Another possibility is to change the syntax of a macro call.
5650 define(`test', `$#')
5654 dnl Change macro syntax.
5655 changesyntax(`(<', `,|', `)>')
5663 Leading spaces are always removed from macro arguments in @code{m4}, but
5664 by changing the syntax categories we can avoid it. The use of
5665 @code{format} is an alternative to using a literal tab character.
5668 define(`test', `$1$2$3')
5672 dnl Don't ignore whitespace.
5673 changesyntax(`O 'format(``%c'', `9')`
5682 It is possible to redefine the @samp{$} used to indicate macro arguments
5683 in user defined macros. Dollar class syntax elements are copied to the
5684 output if there is no valid expansion.
5687 define(`argref', `Dollar: $#, Question: ?#')
5690 @result{}Dollar: 3, Question: ?#
5691 dnl Change argument identifier.
5695 @result{}Dollar: $#, Question: 3
5696 define(`escape', `$?`'1$?1?')
5700 dnl Multiple argument identifiers.
5704 @result{}Dollar: 3, Question: 3
5707 Macro calls can be given a @TeX{} or Texinfo like syntax using an
5708 escape. If one or more characters are defined as escapes, macro names
5709 are only recognized if preceded by an escape character.
5711 If the escape is not followed by what is normally a word (a letter
5712 optionally followed by letters and/or numerals), that single character
5713 is returned as a macro name.
5715 As always, words without a macro definition cause no error message.
5716 They and the escape character are simply output.
5719 define(`foo', `bar')
5721 dnl Require @@ escape before any macro.
5722 changesyntax(`@@@@')
5730 @@dnl Change escape character.
5731 @@changesyntax(`@@\', `O@@')
5739 define(`#', `No comment')
5740 @result{}define(#, No comment)
5741 \define(`#', `No comment')
5743 \# \foo # Comment \foo
5744 @result{}No comment bar # Comment \foo
5747 Active characters are known from @TeX{}. In @code{m4} an active
5748 character is always seen as a one-letter word, and so, if it has a macro
5749 definition, the macro will be called.
5752 define(`@@', `TEST')
5754 define(`a@@a', `hello')
5771 There is obviously an overlap between @code{changesyntax} and
5772 @code{changequote}, since there are now two ways to modify quote
5773 delimiters. To avoid incompatibilities, if the quotes are modified by
5774 @code{changequote}, any characters previously set to either quote
5775 delimiter by @code{changesyntax} are first demoted to the other category
5776 (@samp{O}), so the result is only a single set of quotes. In the other
5777 direction, if quotes were already disabled, or if both the start and end
5778 delimiter set by @code{changequote} are single bytes, then
5779 @code{changesyntax} preserves those settings. But if either delimiter
5780 occupies multiple bytes, @code{changesyntax} first disables both
5781 delimiters. Quotes can be disabled via @code{changesyntax} by emptying
5782 the left quote basic category (@samp{L}). Meanwhile, the right quote
5783 context category (@samp{R}) will never be empty; if a
5784 @code{changesyntax} action would otherwise leave that category empty,
5785 then the default end delimiter from @code{changequote} (@samp{'}) is
5786 used; thus, it is never possible to get @code{m4} in a state where a
5787 quoted string cannot be terminated. These interactions apply to comment
5788 delimiters as well, @i{mutatis mutandis} with @code{changecom}.
5791 define(`test', `TEST')
5793 dnl Add additional single-byte delimiters.
5794 changesyntax(`L+<', `R+>')
5796 <test> `test' [test] <<test>>
5797 @result{}test test [TEST] <test>
5798 dnl Use standard interface, overriding changesyntax settings.
5799 changequote(<[>, `]')
5801 <test> `test' [test] <<test>>
5802 @result{}<TEST> `TEST' test <<TEST>>
5803 dnl Introduce multi-byte delimiters.
5804 changequote([<<], [>>])
5806 <test> `test' [test] <<test>>
5807 @result{}<TEST> `TEST' [TEST] test
5808 dnl Change end quote, effectively disabling quotes.
5809 changesyntax(<<R]>>)
5811 <test> `test' [test] <<test>>
5812 @result{}<TEST> `TEST' [TEST] <<TEST>>
5813 dnl Change beginning quote, make ] normal, thus making ' end quote.
5814 changesyntax(L`, R-])
5816 <test> `test' [test] <<test>>
5817 @result{}<TEST> test [TEST] <<TEST>>
5818 dnl Set multi-byte quote; unrelated changes don't impact it.
5819 changequote(`<<', `>>')changesyntax(<<@@\>>)
5821 <\test> `\test' [\test] <<\test>>
5822 @result{}<TEST> `TEST' [TEST] \test
5825 If several characters are assigned to a category that forms
5826 single-character tokens, all such characters are treated as equal. Any open
5827 parenthesis will match any close parenthesis, etc.
5830 dnl Go crazy with symbols.
5831 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
5837 The syntax table is initialized to be backwards compatible, so if you
5838 never call @code{changesyntax}, nothing will have changed.
5840 For now, debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)}
5841 to show macro calls; and macro expansions that result in a list of
5842 arguments (such as @samp{$@@} or @code{shift}) use @samp{,}, regardless
5843 of the current syntax settings. However, this is likely to change in a
5844 future release, so it should not be relied on, particularly since it is
5845 next to impossible to write recursive macros if the argument separator
5846 doesn't match between expansion and rescanning.
5848 @c FIXME - changing syntax of , should not break iterative macros.
5851 changesyntax(`,=|')traceon(`foo')define(`foo'|`$#:$@@')
5854 @error{}m4trace: -2- foo(`1', `2', `3') -> `3:`1',`2',`3''
5855 @error{}m4trace: -1- foo(`3:1,2,3') -> `1:`3:1,2,3''
5860 @section Saving text until end of input
5862 @cindex saving input
5863 @cindex input, saving
5864 @cindex deferring expansion
5865 @cindex expansion, deferring
5866 It is possible to `save' some text until the end of the normal input has
5867 been seen. Text can be saved, to be read again by @code{m4} when the
5868 normal input has been exhausted. This feature is normally used to
5869 initiate cleanup actions before normal exit, e.g., deleting temporary
5872 To save input text, use the builtin @code{m4wrap}:
5874 @deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
5875 Stores @var{string} in a safe place, to be reread when end of input is
5876 reached. As a GNU extension, additional arguments are
5877 concatenated with a space to the @var{string}.
5879 Successive invocations of @code{m4wrap} accumulate saved text in
5880 first-in, first-out order, as required by POSIX.
5882 The expansion of @code{m4wrap} is void.
5883 The macro @code{m4wrap} is recognized only with parameters.
5887 define(`cleanup', `This is the `cleanup' action.
5892 This is the first and last normal input line.
5893 @result{}This is the first and last normal input line.
5895 @result{}This is the cleanup action.
5898 The saved input is only reread when the end of normal input is seen, and
5899 not if @code{m4exit} is used to exit @code{m4}.
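
As a minimal sketch of this behavior, the wrapped text below is
discarded, because @code{m4exit} ends processing before the end of
normal input is seen. The wrapped string is only an illustration.

@comment ignore
@example
m4wrap(`This text is never reread.
')
@result{}
m4exit(`0')
@end example
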
5901 It is safe to call @code{m4wrap} from wrapped text, where all the
5902 recursively wrapped text is deferred until the current wrapped text is
5903 exhausted. As of M4 1.6, when @code{m4wrap} is not used recursively,
5904 the saved pieces of text are reread in the same order in which they were
5905 saved (FIFO---first in, first out), as required by POSIX.
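
For instance, assuming the FIFO ordering described above, the text from
two separate calls is reread in the order in which the calls were made.
The second call also illustrates extra arguments being joined with a
space.

@comment ignore
@example
m4wrap(`1
')
@result{}
m4wrap(`2', `3
')
@result{}
^D
@result{}1
@result{}2 3
@end example
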
5919 However, earlier versions had reverse ordering (LIFO---last in, first
5920 out), as this behavior is more like the semantics of the C function
5921 @code{atexit}. It is possible to emulate POSIX behavior even
5922 with older versions of GNU M4 by including the file
5923 @file{m4-@value{VERSION}/@/doc/examples/@/wrapfifo.m4} from the
5928 $ @kbd{m4 -I doc/examples}
5929 undivert(`wrapfifo.m4')dnl
5930 @result{}dnl Redefine m4wrap to have FIFO semantics.
5931 @result{}define(`_m4wrap_level', `0')dnl
5932 @result{}define(`m4wrap',
5933 @result{}`ifdef(`m4wrap'_m4wrap_level,
5934 @result{} `define(`m4wrap'_m4wrap_level,
5935 @result{} defn(`m4wrap'_m4wrap_level)`$1')',
5936 @result{} `builtin(`m4wrap', `define(`_m4wrap_level',
5937 @result{} incr(_m4wrap_level))dnl
5938 @result{}m4wrap'_m4wrap_level)dnl
5939 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5940 include(`wrapfifo.m4')
5942 m4wrap(`a`'m4wrap(`c
5943 ', `d')')m4wrap(`b')
5949 It is likewise possible to emulate LIFO behavior without resorting to
5950 the GNU M4 extension of @code{builtin}, by including the file
5951 @file{m4-@value{VERSION}/@/doc/examples/@/wraplifo.m4} from the
5952 distribution. (Unfortunately, both examples shown here share some
5953 subtle bugs. See if you can find and correct them; or @pxref{Improved
5954 m4wrap, , Answers}).
5958 $ @kbd{m4 -I doc/examples}
5959 undivert(`wraplifo.m4')dnl
5960 @result{}dnl Redefine m4wrap to have LIFO semantics.
5961 @result{}define(`_m4wrap_level', `0')dnl
5962 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
5963 @result{}define(`m4wrap',
5964 @result{}`ifdef(`m4wrap'_m4wrap_level,
5965 @result{} `define(`m4wrap'_m4wrap_level,
5966 @result{} `$1'defn(`m4wrap'_m4wrap_level))',
5967 @result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl
5968 @result{}m4wrap'_m4wrap_level)dnl
5969 @result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl
5970 include(`wraplifo.m4')
5972 m4wrap(`a`'m4wrap(`c
5973 ', `d')')m4wrap(`b')
5979 Here is an example of implementing a factorial function using
5983 define(`f', `ifelse(`$1', `0', `Answer: 0!=1
5984 ', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1')
5985 ', `m4wrap(`f(decr(`$1'), `$2$1*')')')')
5990 @result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800
5993 Invocations of @code{m4wrap} at the same recursion level are
5994 concatenated and rescanned as usual:
6000 m4wrap(`a')m4wrap(`b')
6007 however, the transition between recursion levels behaves like an end of
6008 file condition between two input files.
6012 m4wrap(`m4wrap(`)')len(abc')
6015 @error{}m4:stdin:1: len: end of file in argument list
6018 As of M4 1.6, @code{m4wrap} transparently handles builtin tokens
6019 generated by @code{defn} (@pxref{Defn}). However, for portability, it
6020 is better to defer the evaluation of @code{defn} along with the rest of
6021 the wrapped text, as is done for @code{foo} in the example below, rather
6022 than computing the builtin token up front, as is done for @code{bar}.
6025 m4wrap(`define(`foo', defn(`divnum'))foo
6028 m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar
6036 @node File Inclusion
6037 @chapter File inclusion
6039 @cindex file inclusion
6040 @cindex inclusion, of files
6041 @code{m4} allows you to include named files at any point in the input.
6044 * Include:: Including named files and modules
6045 * Search Path:: Searching for include files
6049 @section Including named files and modules
6051 There are two builtin macros in @code{m4} for including files:
6053 @deffn {Builtin (m4)} include (@var{file})
6054 @deffnx {Builtin (m4)} sinclude (@var{file})
6055 Both macros cause the file named @var{file} to be read by
6056 @code{m4}. When the end of the file is reached, input is resumed from
6057 the previous input file.
6059 The expansion of @code{include} and @code{sinclude} is therefore the
6060 contents of @var{file}.
6062 If @var{file} does not exist, is a directory, or cannot otherwise be
6063 read, the expansion is void,
6064 and @code{include} will fail with an error while @code{sinclude} is
6065 silent. The empty string counts as a file that does not exist.
6067 The macros @code{include} and @code{sinclude} are recognized only with
6074 @error{}m4:stdin:1: include: cannot open file 'n': No such file or directory
6077 @error{}m4:stdin:2: include: cannot open file '': No such file or directory
6085 This section uses the @option{--include} command-line option (or
6086 @option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab
6087 files from the @file{m4-@value{VERSION}/@/doc/examples}
6088 directory shipped as part of the GNU @code{m4} package. The
6089 file @file{m4-@value{VERSION}/@/doc/examples/@/incl.m4} in the distribution
6094 $ @kbd{cat doc/examples/incl.m4}
6095 @result{}Include file start
6097 @result{}Include file end
6100 Normally file inclusion is used to insert the contents of a file
6101 into the input stream. The contents of the file will be read by
6102 @code{m4} and macro calls in the file will be expanded:
6106 $ @kbd{m4 -I doc/examples}
6107 define(`foo', `FOO')
6110 @result{}Include file start
6112 @result{}Include file end
6116 The fact that @code{include} and @code{sinclude} expand to the contents
6117 of the file can be used to define macros that operate on entire files.
6118 Here is an example, which defines @samp{bar} to expand to the contents
6123 $ @kbd{m4 -I doc/examples}
6124 define(`bar', include(`incl.m4'))
6126 This is `bar': >>bar<<
6127 @result{}This is bar: >>Include file start
6129 @result{}Include file end
6133 This use of @code{include} is not trivial, though, as files can contain
6134 quotes, commas, and parentheses, which can interfere with the way the
6135 @code{m4} parser works. GNU M4 seamlessly concatenates
6136 the file contents with the next character, even if the included file
6137 ended in the middle of a comment, string, or macro call. These
6138 conditions are only treated as end of file errors if specified as input
6139 files on the command line.
6141 In GNU M4, an alternative method of reading files is
6142 using @code{undivert} (@pxref{Undivert}) on a named file.
6144 In addition, as a GNU M4 extension, if the included file cannot
6145 be found exactly as given, various standard suffixes are appended.
6146 If the included file name is absolute (a full path from the root directory
6147 is given) then additional search directories are not examined, although
6148 suffixes will be tried if the file is not found exactly as given.
6149 For each directory that is searched (according to the absolute directory
given in the file name, or else by directories listed in @env{M4PATH} and
6151 given with the @option{-I} and @option{-B} options), first the unchanged
6152 file name is tried, and then again with the suffixes @samp{.m4f} and
6155 Furthermore, if no matching file has yet been found, before moving on to
6156 the next directory, @samp{.la} and the usual binary module suffix for
6157 the host platform (usually @samp{.so}) are also tried. Matching with one
6158 of those suffixes will attempt to load the matched file as a dynamic
6159 module. @xref{Modules}, for more details.
6162 @section Searching for include files
6164 @cindex search path for included files
6165 @cindex included files, search path for
6166 @cindex GNU extensions
GNU @code{m4} allows included files to be found in directories other
than the current working directory.
6170 @cindex @env{M4PATH}
6171 If the @option{--prepend-include} or @option{-B} command-line option was
6172 provided (@pxref{Preprocessor features, , Invoking m4}), those
directories are searched first, in the reverse of the order in which those
options were listed on the command line. Then @code{m4} looks in the current
working directory. Next come the directories specified with the
6176 @option{--include} or @option{-I} option, in the order found on the
6177 command line. Finally, if the @env{M4PATH} environment variable is set,
6178 it is expected to contain a colon-separated list of directories, which
6179 will be searched in order.
6181 If the automatic search for include-files causes trouble, the @samp{p}
6182 debug flag (@pxref{Debugmode}) can help isolate the problem.
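
As an illustration, the invocation sketched below would search
@file{overrides} first, then the current working directory, then
@file{doc/examples}, and finally @file{/usr/local/share/m4}, with the
@samp{p} debug flag enabled from the command line. The directory names
@file{overrides} and @file{/usr/local/share/m4} are hypothetical, chosen
only to show one directory of each kind.

@comment ignore
@example
$ @kbd{M4PATH=/usr/local/share/m4 m4 -B overrides -I doc/examples -dp}
include(`incl.m4')
@end example
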
6185 @chapter Diverting and undiverting output
6187 @cindex deferring output
6188 Diversions are a way of temporarily saving output. The output of
6189 @code{m4} can at any time be diverted to a temporary file, and be
6190 reinserted into the output stream, @dfn{undiverted}, again at a later
6193 @cindex @env{TMPDIR}
6194 Numbered diversions are counted from 0 upwards, diversion number 0
6195 being the normal output stream. GNU
6196 @code{m4} tries to keep diversions in memory. However, there is a
6197 limit to the overall memory usable by all diversions taken together
6198 (512K, currently). When this maximum is about to be exceeded,
6199 a temporary file is opened to receive the contents of the biggest
6200 diversion still in memory, freeing this memory for other diversions.
6201 When creating the temporary file, @code{m4} honors the value of the
6202 environment variable @env{TMPDIR}, and falls back to @file{/tmp}.
6203 Thus, the amount of available disk space provides the only real limit on
6204 the number and aggregate size of diversions.
6206 Diversions make it possible to generate output in a different order than
the input was read. This makes it possible to implement topological sorting
of dependencies. For example, GNU Autoconf makes use of
6209 diversions under the hood to ensure that the expansion of a prerequisite
6210 macro appears in the output prior to the expansion of a dependent macro,
6211 regardless of which order the two macros were invoked in the user's
6215 * Divert:: Diverting output
6216 * Undivert:: Undiverting output
6217 * Divnum:: Diversion numbers
6218 * Cleardivert:: Discarding diverted text
6222 @section Diverting output
6224 @cindex diverting output to files
6225 @cindex output, diverting to files
6226 @cindex files, diverting output to
6227 Output is diverted using @code{divert}:
6229 @deffn {Builtin (m4)} divert (@dvar{number, 0}, @ovar{text})
6230 The current diversion is changed to @var{number}. If @var{number} is left
6231 out or empty, it is assumed to be zero. If @var{number} cannot be
6232 parsed, the diversion is unchanged.
6234 @cindex GNU extensions
6235 As a GNU extension, if optional @var{text} is supplied and
6236 @var{number} was valid, then @var{text} is immediately output to the
6237 new diversion, regardless of whether the expansion of @code{divert}
6238 occurred while collecting arguments for another macro.
6240 The expansion of @code{divert} is void.
When all the @code{m4} input has been processed, all existing
6244 diversions are automatically undiverted, in numerical order.
6248 This text is diverted.
6251 This text is not diverted.
6252 @result{}This text is not diverted.
6255 @result{}This text is diverted.
6258 Several calls of @code{divert} with the same argument do not overwrite
6259 the previous diverted text, but append to it. Diversions are printed
6260 after any wrapped text is expanded.
6263 define(`text', `TEXT')
6265 divert(`1')`diverted text.'
6268 m4wrap(`Wrapped text precedes ')
6271 @result{}Wrapped TEXT precedes diverted text.
6274 @cindex discarding input
6275 @cindex input, discarding
6276 If output is diverted to a negative diversion, it is simply discarded.
6277 This can be used to suppress unwanted output. A common example of
6278 unwanted output is the trailing newlines after macro definitions. Here
6279 is a common programming idiom in @code{m4} for avoiding them.
6283 define(`foo', `Macro `foo'.')
6284 define(`bar', `Macro `bar'.')
6289 @cindex GNU extensions
6290 Traditional implementations only supported ten diversions. But as a
6291 GNU extension, diversion numbers can be as large as positive
6292 integers will allow, rather than treating a multi-digit diversion number
6293 as a request to discard text.
6296 divert(eval(`1<<28'))world
6303 The ability to immediately output extra text is a GNU
6304 extension, but it can prove useful for ensuring that text goes to a
6305 particular diversion no matter how many pending macro expansions are in
6306 progress. For a demonstration of why this is useful, it is important to
6307 understand in the example below why @samp{one} is output in diversion 2,
6308 not diversion 1, while @samp{three} and @samp{five} both end up in the
6309 correctly numbered diversion. The key point is that when @code{divert}
6310 is executed unquoted as part of the argument collection of another
6311 macro, the side effect takes place immediately, but the text @samp{one}
6312 is not passed to any diversion until after the @samp{divert(`2')} and
6313 the enclosing @code{echo} have also taken place. The example with
6314 @samp{three} shows how following the quoting rule of thumb delays the
6315 invocation of @code{divert} until it is not nested in any argument
6316 collection context, while the example with @samp{five} shows the use of
6317 the optional argument to speed up the output process.
6320 define(`echo', `$1')
6322 echo(divert(`1')`one'divert(`2'))`'dnl
6323 echo(`divert(`3')three`'divert(`4')')`'dnl
6324 echo(divert(`5', `five')divert(`6'))`'dnl
6341 Note that @code{divert} is an English word, but also an active macro
6342 without arguments. When processing plain text, the word might appear in
6343 normal text and be unintentionally swallowed as a macro invocation. One
6344 way to avoid this is to use the @option{-P} option to rename all
6345 builtins (@pxref{Operation modes, , Invoking m4}). Another is to write
6346 a wrapper that requires a parameter to be recognized.
6349 We decided to divert the stream for irrigation.
6350 @result{}We decided to the stream for irrigation.
6351 define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')')
6357 We decided to divert the stream for irrigation.
6358 @result{}We decided to divert the stream for irrigation.
6362 @section Undiverting output
6364 Diverted text can be undiverted explicitly using the builtin
6367 @deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}})
6368 Undiverts the numeric @var{diversions} given by the arguments, in the
6369 order given. If no arguments are supplied, all diversions are
6370 undiverted, in numerical order.
6372 @cindex file inclusion
6373 @cindex inclusion, of files
6374 @cindex GNU extensions
6375 As a GNU extension, @var{diversions} may contain non-numeric
6376 strings, which are treated as the names of files to copy into the output
6377 without expansion. A warning is issued if a file could not be opened.
6379 The expansion of @code{undivert} is void.
6384 This text is diverted.
6387 This text is not diverted.
6388 @result{}This text is not diverted.
6391 @result{}This text is diverted.
6395 Notice the last two blank lines. One of them comes from the newline
6396 following @code{undivert}, the other from the newline that followed the
6397 @code{divert}! A diversion often starts with a blank line like this.
6399 When diverted text is undiverted, it is @emph{not} reread by @code{m4},
6400 but rather copied directly to the current output, and it is therefore
6401 not an error to undivert into a diversion. Undiverting the empty string
6402 is the same as specifying diversion 0; in either case nothing happens
6403 since the output has already been flushed.
6406 divert(`1')diverted text
6414 @result{}diverted text
6417 divert(`2')undivert(`1')diverted text`'divert
6423 @result{}diverted text
6426 When a diversion has been undiverted, the diverted text is discarded,
6427 and it is not possible to bring back diverted text more than once.
6431 This text is diverted first.
6432 divert(`0')undivert(`1')dnl
6434 @result{}This text is diverted first.
6438 This text is also diverted but not appended.
6439 divert(`0')undivert(`1')dnl
6441 @result{}This text is also diverted but not appended.
6444 Attempts to undivert the current diversion are silently ignored. Thus,
6445 when the current diversion is not 0, the current diversion does not get
6446 rearranged among the other diversions.
6454 divert(`2')undivert(`5', `2', `4')dnl
6455 undivert`'dnl effectively undivert(`1', `2', `3', `4', `5')
6456 divert`'undivert`'dnl
6464 @cindex GNU extensions
6465 @cindex file inclusion
6466 @cindex inclusion, of files
6467 GNU @code{m4} allows named files to be undiverted. Given a
6468 non-numeric argument, the contents of the file named will be copied,
6469 uninterpreted, to the current output. This complements the builtin
6470 @code{include} (@pxref{Include}). To illustrate the difference, assume
6471 the file @file{foo} contains:
6483 define(`bar', `BAR')
6493 If the file is not found (or cannot be read), an error message is
6494 issued, and the expansion is void. It is possible to intermix files
6495 and diversion numbers.
6498 divert(`1')diversion one
6499 divert(`2')undivert(`foo')dnl
6500 divert(`3')diversion three
6502 undivert(`1', `2', `foo', `3')dnl
6503 @result{}diversion one
6506 @result{}diversion three
6510 @section Diversion numbers
6512 @cindex diversion numbers
6513 The current diversion is tracked by the builtin @code{divnum}:
6515 @deffn {Builtin (m4)} divnum
6516 Expands to the number of the current diversion.
6523 Diversion one: divnum
6525 Diversion two: divnum
6528 @result{}Diversion one: 1
6530 @result{}Diversion two: 2
6534 @section Discarding diverted text
6536 @cindex discarding diverted text
6537 @cindex diverted text, discarding
6538 Often it is not known, when output is diverted, whether the diverted
text is actually needed. Since all non-empty diversions are brought back
6540 on the main output stream when the end of input is seen, a method of
6541 discarding a diversion is needed. If all diversions should be
discarded, the easiest way is to end the input to @code{m4} with
6543 @samp{divert(`-1')} followed by an explicit @samp{undivert}:
6547 Diversion one: divnum
6549 Diversion two: divnum
6556 No output is produced at all.
6558 Clearing selected diversions can be done with the following macro:
6560 @deffn Composite cleardivert (@ovar{diversions@dots{}})
6561 Discard the contents of each of the listed numeric @var{diversions}.
6565 define(`cleardivert',
6566 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
6570 It is called just like @code{undivert}, but the effect is to clear the
diversions given by the arguments. (This macro has a nasty bug! You
6572 should try to see if you can find it and correct it; or @pxref{Improved
6573 cleardivert, , Answers}).
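
A minimal usage sketch, repeating the definition above and passing an
explicit diversion number: diversion 1 is cleared, so only the contents
of diversion 2 appear when the end of input is reached. The diverted
words are placeholders.

@comment ignore
@example
define(`cleardivert',
`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
@result{}
divert(`1')one
divert(`2')two
divert`'cleardivert(`1')dnl
^D
@result{}two
@end example
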
6576 @chapter Extending M4 with dynamic runtime modules
6579 @cindex dynamic modules
6580 @cindex loadable modules
6581 GNU M4 1.4.x had a monolithic architecture. All of its
6582 functionality was contained in a single binary, and additional macros
6583 could be added only by writing more code in the M4 language, or at the
6584 extreme by hacking the sources and recompiling the whole thing to make
6585 a custom M4 installation.
6587 Starting with release 2.0, M4 supports and is composed of loadable modules.
6588 Additional modules can be loaded into the running M4 interpreter as it is
6589 started up at the command line, or during normal expansion of macros. This
6590 facilitates runtime extension of the M4 builtin macro list using compiled C
6591 code linked against a new shared library, typically named @file{libm4.so}.
6593 For example, you might want to add a @code{setenv} builtin to M4, to
6594 use before invoking @code{esyscmd}. We might write a @file{setenv.c}
6595 something like this:
6599 #include "m4module.h"
6603 m4_builtin m4_builtin_table[] =
6605 /* name handler flags minargs maxargs */
6606 @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2, 3 @},
6608 @{ NULL, NULL, 0, 0, 0 @}
6612 * setenv(NAME, VALUE, [OVERWRITE])
6614 M4BUILTIN_HANDLER (setenv)
6619 if (!m4_numeric_arg (context, argc, argv, 3, &overwrite))
6622 setenv (M4ARG (1), M4ARG (2), overwrite);
6626 Then, having compiled and linked the module, in (somewhat contrived)
6632 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6634 esyscmd(`ifconfig -a')dnl
6638 Or instead of loading the module from the M4 invocation, you can use
6639 the @code{include} builtin:
6646 setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin')
6650 Also, at run time, you can choose which core modules to load. SUSv3 M4
6651 functionality is contained in the module @samp{m4}, GNU extensions in the
6652 module @samp{gnu}, and so on. All of the builtin descriptions in this manual
6653 are annotated with the module from which they are loaded -- mostly from the
6656 When you start GNU M4, the modules @samp{m4} and @samp{gnu} are
6657 loaded by default. If you supply the @option{-G} option at startup, the
6658 module @samp{traditional} is loaded instead of @samp{gnu}.
6659 @xref{Compatibility}, for more details on the differences between these
6660 two modes of startup.
6663 * M4modules:: Listing loaded modules
6664 * Standard Modules:: Standard bundled modules
6668 @section Listing loaded modules
6670 @deffn {Builtin (gnu)} m4modules
6671 Expands to a quoted ordered list of currently loaded modules,
6672 with the most recently loaded module at the front of the list. Loading
6673 a module multiple times will not affect the order of this list, the
6674 position depends on when the module was @emph{first} loaded.
6677 For example, after GNU @code{m4} is started with no additional modules,
6678 @code{m4modules} will yield the following:
6686 @node Standard Modules
6687 @section Standard bundled modules
6689 GNU @code{m4} ships with several bundled modules as standard.
6690 By convention, these modules define a text macro that can be tested
6691 with @code{ifdef} when they are loaded; only the @code{m4} module lacks
6692 this feature test macro, since it is not permitted by POSIX.
Each of the feature test macros is intended to be used without
6698 Provides all of the builtins defined by POSIX. This module
6699 is always loaded --- GNU @code{m4} would only be a very slow
6700 version of @command{cat} without the builtins supplied by this module.
6703 Provides all of the GNU extensions, as defined by
6704 GNU M4 through the 1.4.x release series. It also provides a
6705 couple of feature test macros:
6707 @deffn {Macro (gnu)} __gnu__
6708 Expands to the empty string, as an indication that the @samp{gnu}
6712 @deffn {Macro (gnu)} __m4_version__
6713 Expands to an unquoted string containing the release version number of
6714 the running GNU @code{m4} executable.
6717 This module is always loaded, unless the @option{-G} command line
6718 option is supplied at startup (@pxref{Limits control, , Invoking m4}).
6721 This module provides compatibility with System V @code{m4}, for anything
6722 not specified by POSIX, and is loaded instead of the
6723 @samp{gnu} module if the @option{-G} command line option is specified.
6725 @deffn {Macro (traditional)} __traditional__
6726 Expands to the empty string, as an indication that the
6727 @samp{traditional} module is loaded.
6731 This module provides the implementation for the experimental
6732 @code{mpeval} feature. If the host machine does not have the
6733 GNU gmp library, the builtin will generate an error if called.
6734 @xref{Mpeval}, for more details. The module also defines the following
6737 @deffn {Macro (mpeval)} __mpeval__
6738 Expands to the empty string, as an indication that the @samp{mpeval}
6743 Here is an example of using the feature test macros.
6747 __gnu__-__traditional__
6748 @result{}-__traditional__
6749 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6750 @result{}Extensions are active
6752 @error{}m4:stdin:3: warning: __gnu__: extra arguments ignored: 1 > 0
6756 @comment options: -G
6758 $ @kbd{m4 --traditional}
6759 __gnu__-__traditional__
6761 ifdef(`__gnu__', `Extensions are active', `Minimal features')
6762 @result{}Minimal features
6765 Since the version string is unquoted and can potentially contain macro
6766 names (for example, a beta release could be numbered @samp{1.9b}), or be
6767 impacted by the use of @code{changesyntax}), the
6768 @code{__m4_version__} macro should generally be used via @code{defn}
6769 rather than directly invoked (@pxref{Defn}). In general, feature tests
6770 are more reliable than version number checks, so exercise caution when
6773 @comment This test is excluded from the testsuite since it depends on a
6774 @comment texinfo macro; but builtins.at covers the same thing.
6777 defn(`__m4_version__')
6778 @result{}@value{VERSION}
6782 @chapter Macros for text handling
6784 There are a number of builtins in @code{m4} for manipulating text in
6785 various ways, extracting substrings, searching, substituting, and so on.
6788 * Len:: Calculating length of strings
6789 * Index macro:: Searching for substrings
6790 * Regexp:: Searching for regular expressions
6791 * Substr:: Extracting substrings
6792 * Translit:: Translating characters
6793 * Patsubst:: Substituting text by regular expression
6794 * Format:: Formatting strings (printf-like)
6798 @section Calculating length of strings
6800 @cindex length of strings
6801 @cindex strings, length of
6802 The length of a string can be calculated by @code{len}:
6804 @deffn {Builtin (m4)} len (@var{string})
6805 Expands to the length of @var{string}, as a decimal number.
6807 The macro @code{len} is recognized only with parameters.
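
A short sketch of these rules: the bare word is left alone because no
parameter list follows it, while an empty argument has length zero.

@comment ignore
@example
len
@result{}len
len()
@result{}0
len(`GNU M4')
@result{}6
@end example
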
6818 @section Searching for substrings
6820 @cindex substrings, locating
6821 Searching for substrings is done with @code{index}:
6823 @deffn {Builtin (m4)} index (@var{string}, @var{substring}, @ovar{offset})
6824 Expands to the index of the first occurrence of @var{substring} in
6825 @var{string}. The first character in @var{string} has index 0. If
6826 @var{substring} does not occur in @var{string}, @code{index} expands to
6827 @samp{-1}. If @var{offset} is provided, it determines the index at
6828 which the search starts; a negative @var{offset} specifies the offset
6829 relative to the end of @var{string}.
6831 The macro @code{index} is recognized only with parameters.
6835 index(`gnus, gnats, and armadillos', `nat')
6837 index(`gnus, gnats, and armadillos', `dag')
6841 Omitting @var{substring} evokes a warning, but still produces output;
6842 contrast this with an empty @var{substring}.
6846 @error{}m4:stdin:1: warning: index: too few arguments: 1 < 2
6854 @cindex GNU extensions
6855 As an extension, an @var{offset} can be provided to limit the search to
6856 the tail of the @var{string}. A negative offset is interpreted relative
6857 to the end of @var{string}, and it is not an error if @var{offset}
6858 exceeds the bounds of @var{string}.
6861 index(`aba', `a', `1')
6863 index(`ababa', `ba', `-3')
6865 index(`abc', `ab', `4')
6867 index(`abc', `bc', `-4')
6872 @comment Expose a bug in the strstr() algorithm present in glibc
6873 @comment 2.9 through 2.12 and in gnulib up to Sep 2010.
6876 index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-:',
6877 `:12-:12-:12-:12-:12-:12-:12-:12-')
6881 @comment Expose a bug in the gnulib replacement strstr() algorithm
6882 @comment present from Jun 2010 to Feb 2011, including m4 1.4.15.
6885 index(`..wi.d.', `.d.')
6891 @section Searching for regular expressions
6893 @cindex regular expressions
6894 @cindex expressions, regular
6895 @cindex GNU extensions
6896 Searching for regular expressions is done with the builtin
6899 @deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax})
6900 @deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @
6901 @ovar{replacement}, @ovar{resyntax})
6902 Searches for @var{regexp} in @var{string}.
6904 If @var{resyntax} is given, the particular flavor of regular expression
6905 understood with respect to @var{regexp} can be changed from the current
6906 default. @xref{Changeresyntax}, for details of the values that can be
given for this argument. If exactly three arguments are given, then the
6908 third argument is treated as @var{resyntax} only if it matches a known
6909 syntax name, otherwise it is treated as @var{replacement}.
6911 If @var{replacement} is omitted, @code{regexp} expands to the index of
6912 the first match of @var{regexp} in @var{string}. If @var{regexp} does
6913 not match anywhere in @var{string}, it expands to -1.
6915 If @var{replacement} is supplied, and there was a match, @code{regexp}
6916 changes the expansion to this argument, with @samp{\@var{n}} substituted
6917 by the text matched by the @var{n}th parenthesized sub-expression of
6918 @var{regexp}, up to nine sub-expressions. The escape @samp{\&} is
6919 replaced by the text of the entire regular expression matched. For
6920 all other characters, @samp{\} treats the next character literally. A
6921 warning is issued if there were fewer sub-expressions than the
6922 @samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there
6923 was no match, @code{regexp} expands to the empty string.
6925 The macro @code{regexp} is recognized only with parameters.
6929 regexp(`GNUs not Unix', `\<[a-z]\w+')
6931 regexp(`GNUs not Unix', `\<Q\w*')
6933 regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***')
6934 @result{}*** Unix *** nix ***
6935 regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***')
Here are some more examples of the handling of backslashes:
6942 regexp(`abc', `\(b\)', `\\\10\a')
6944 regexp(`abc', `b', `\1\')
6945 @error{}m4:stdin:2: warning: regexp: sub-expression 1 not present
6946 @error{}m4:stdin:2: warning: regexp: trailing \ ignored in replacement
6948 regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
6949 @error{}m4:stdin:3: warning: regexp: sub-expression 4 not present
6950 @error{}m4:stdin:3: warning: regexp: sub-expression 5 not present
6951 @error{}m4:stdin:3: warning: regexp: sub-expression 6 not present
6955 Omitting @var{regexp} evokes a warning, but still produces output;
6956 contrast this with an empty @var{regexp} argument.
6960 @error{}m4:stdin:1: warning: regexp: too few arguments: 1 < 2
6964 regexp(`abc', `', `\\def')
6968 If @var{resyntax} is given, @var{regexp} must be given according to
6969 the syntax chosen, though the default regular expression syntax
6970 remains unchanged for other invocations:
6973 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***',
6975 @result{}*** Unix *** nix ***
6976 regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***')
Occasionally, you might want to pass a @var{resyntax} argument without
6981 wishing to give @var{replacement}. If there are exactly three
6982 arguments, and the last argument is a valid @var{resyntax}, it is used
6983 as such, rather than as a replacement.
6986 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED')
6988 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED')
6989 @result{}POSIX_EXTENDED
6990 regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `')
6992 regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `')
6993 @result{}POSIX_EXTENDED
6997 @section Extracting substrings
6999 @cindex extracting substrings
7000 @cindex substrings, extracting
7001 Substrings are extracted with @code{substr}:
7003 @deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length}, @
7005 Performs a substring operation on @var{string}. If @var{from} is
7006 positive, it represents the 0-based index where the substring begins.
7007 If @var{length} is omitted, the substring ends at the end of
7008 @var{string}; if it is positive, @var{length} is added to the starting
7009 index to determine the ending index.
7011 @cindex GNU extensions
7012 As a GNU extension, if @var{from} is negative, it is added to
7013 the length of @var{string} to determine the starting index; if it is
7014 empty, the start of the string is used. Likewise, if @var{length} is
7015 negative, it is added to the length of @var{string} to determine the
7016 ending index, and an empty @var{length} behaves like an omitted
7017 @var{length}. It is not an error if either of the resulting indices lies
7018 outside the string, but the selected substring only contains the bytes
7019 of @var{string} that overlap the selected indices. If the end point
7020 lies before the beginning point, the substring chosen is the empty
7021 string located at the starting index.
7023 If @var{replace} is omitted, then the expansion is only the selected
7024 substring, which may be empty. As a GNU extension, if
7025 @var{replace} is provided, then the expansion is the original
7026 @var{string} with the selected substring replaced by @var{replace}. The
7027 expansion is empty and a warning issued if @var{from} or @var{length}
7028 cannot be parsed, or if @var{replace} is provided but the selected
7029 indices do not overlap with @var{string}.
7031 The macro @code{substr} is recognized only with parameters.
7035 substr(`gnus, gnats, and armadillos', `6')
7036 @result{}gnats, and armadillos
7037 substr(`gnus, gnats, and armadillos', `6', `5')
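
A common use of @code{substr} is to pair it with @code{index} in order
to cut a string at a delimiter.  The following is a minimal sketch, not
a builtin: the macro name @code{before_comma} is hypothetical, and it
assumes that its argument actually contains a comma (otherwise
@code{index} yields @samp{-1}, which is then taken as a negative
@var{length}).

@example
dnl before_comma is a hypothetical helper, not part of GNU M4.
define(`before_comma', `substr(`$1', `0', index(`$1', `,'))')
@result{}
before_comma(`gnus, gnats, and armadillos')
@result{}gnus
@end example
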
7041 Omitting @var{from} evokes a warning, but still produces output. On the
7042 other hand, selecting a @var{from} or @var{length} that lies beyond
7043 @var{string} is not a problem.
7047 @error{}m4:stdin:1: warning: substr: too few arguments: 1 < 2
7053 substr(`abc', `1', `4')
7057 Negative values for @var{from} or @var{length} are GNU
7058 extensions, useful for accessing a fixed size tail of an
7059 arbitrary-length string. Prior to M4 1.6, using these values would
7060 silently result in the empty string. Some other implementations crash
7061 on negative values, and many treat an explicitly empty @var{length} as
7062 0, which is different from the omitted @var{length} implying the rest of
7063 the original @var{string}.
7066 substr(`abcde', `2', `')
7068 substr(`abcde', `-3')
7070 substr(`abcde', `', `-3')
7072 substr(`abcde', `-6')
7074 substr(`abcde', `-6', `5')
7076 substr(`abcde', `-7', `1')
7078 substr(`abcde', `1', `-2')
7080 substr(`abcde', `-4', `-1')
7082 substr(`abcde', `4', `-3')
7084 substr(`abcdefghij', `-09', `08')
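
The rule that a negative @var{from} is added to the length of
@var{string} can be spelled out with @code{len} and @code{eval}.  In
this sketch, the hypothetical macro @code{tail} selects the last
@var{n} bytes using only non-negative arguments, under the assumption
that @var{n} does not exceed the length of the string.

@example
dnl tail is a hypothetical helper, assuming 0 <= n <= len(string).
define(`tail', `substr(`$1', eval(len(`$1')` - ($2)'))')
@result{}
tail(`abcde', `3')
@result{}cde
@end example
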
7088 Another useful GNU extension, also added in M4 1.6, is the
7089 ability to replace a substring within the original @var{string}. An
7090 empty length substring at the beginning or end of @var{string} is valid,
7091 but selecting a substring that does not overlap @var{string} causes a
7095 substr(`abcde', `1', `3', `t')
7097 substr(`abcde', `5', `', `f')
7099 substr(`abcde', `-3', `-4', `f')
7101 substr(`abcde', `-6', `1', `f')
7103 substr(`abcde', `-7', `1', `f')
7104 @error{}m4:stdin:5: warning: substr: substring out of range
7106 substr(`abcde', `6', `', `f')
7107 @error{}m4:stdin:6: warning: substr: substring out of range
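
In implementations that lack the fourth argument, a similar effect can
be approximated by concatenating the pieces on either side of the
selected range.  This is only a sketch: the macro name @code{splice} is
hypothetical, and it assumes a non-negative, in-range @var{from} and
@var{length}, and a @var{string} that contains no active macro names.

@example
dnl splice is a hypothetical helper, not part of GNU M4.
define(`splice',
`substr(`$1', `0', `$2')`$4'substr(`$1', eval(`$2 + $3'))')
@result{}
splice(`abcde', `1', `3', `t')
@result{}ate
@end example
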
7111 If backwards compatibility with M4 1.4.x behavior is necessary, the
7112 following macro is sufficient to do the job (mimicking warnings about
7113 empty @var{from} or @var{length} or an ignored fourth argument is left
7114 as an exercise to the reader).
7117 define(`substr', `ifelse(`$#', `0', ``$0'',
7118 eval(`2 < $#')`$3', `1', `',
7119 index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')')
7121 substr(`abcde', `3')
7123 substr(`abcde', `3', `')
7125 substr(`abcde', `-1')
7127 substr(`abcde', `1', `-1')
7129 substr(`abcde', `2', `1', `C')
7133 On the other hand, it is possible to portably emulate the GNU
7134 extension of negative @var{from} and @var{length} arguments across all
7135 @code{m4} implementations, albeit with a lot more overhead. This
7136 example uses @code{incr} and @code{decr} to normalize @samp{-08} to
7137 something that a later @code{eval} will treat as a decimal value, rather
7138 than looking like an invalid octal number, while avoiding using these
7139 macros on an empty string. The helper macro @code{_substr_normalize} is
7140 recursive, since it is easier to fix @var{length} after @var{from} has
7141 been normalized, with the final iteration supplying two non-negative
7142 arguments to the original builtin, now named @code{_substr}.
7144 @comment options: -daq -t_substr
7146 $ @kbd{m4 -daq -t _substr}
7147 define(`_substr', defn(`substr'))dnl
7148 define(`substr', `ifelse(`$#', `0', ``$0'',
7149 `_$0(`$1', _$0_normalize(len(`$1'),
7150 ifelse(`$2', `', `0', `incr(decr(`$2'))'),
7151 ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl
7152 define(`_substr_normalize', `ifelse(
7153 eval(`$2 < 0 && $1 + $2 >= 0'), `1',
7154 `$0(`$1', eval(`$1 + $2'), `$3')',
7155 eval(`$2 < 0')`$3', `1', ``0', `$1'',
7156 eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1',
7157 `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))',
7158 eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'',
7159 eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')',
7160 `$3', `', ``$2', `$1'',
7161 eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1',
7162 ``$2', eval(`$1 - $2 + $3')',
7163 eval(`$3 - 0 < 0'), `1', ``$2', `0'',
7165 substr(`abcde', `2', `')
7166 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7168 substr(`abcde', `-3')
7169 @error{}m4trace: -1- _substr(`abcde', `2', `5')
7171 substr(`abcde', `', `-3')
7172 @error{}m4trace: -1- _substr(`abcde', `0', `2')
7174 substr(`abcde', `-6')
7175 @error{}m4trace: -1- _substr(`abcde', `0', `5')
7177 substr(`abcde', `-6', `5')
7178 @error{}m4trace: -1- _substr(`abcde', `0', `4')
7180 substr(`abcde', `-7', `1')
7181 @error{}m4trace: -1- _substr(`abcde', `0', `0')
7183 substr(`abcde', `1', `-2')
7184 @error{}m4trace: -1- _substr(`abcde', `1', `2')
7186 substr(`abcde', `-4', `-1')
7187 @error{}m4trace: -1- _substr(`abcde', `1', `3')
7189 substr(`abcde', `4', `-3')
7190 @error{}m4trace: -1- _substr(`abcde', `4', `0')
7192 substr(`abcdefghij', `-09', `08')
7193 @error{}m4trace: -1- _substr(`abcdefghij', `1', `8')
7198 @section Translating characters
7200 @cindex translating characters
7201 @cindex characters, translating
7202 Character translation is done with @code{translit}:
7204 @deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement})
7205 Expands to @var{string}, with each character that occurs in
7206 @var{chars} translated into the character from @var{replacement} with
7209 If @var{replacement} is shorter than @var{chars}, the excess characters
7210 of @var{chars} are deleted from the expansion; if @var{chars} is
7211 shorter, the excess characters in @var{replacement} are silently
7212 ignored. If @var{replacement} is omitted, all characters in
7213 @var{string} that are present in @var{chars} are deleted from the
7214 expansion. If a character appears more than once in @var{chars}, only
7215 the first instance is used in making the translation. Only a single
7216 translation pass is made, even if characters in @var{replacement} also
7217 appear in @var{chars}.
7219 As a GNU extension, both @var{chars} and @var{replacement} can
7220 contain character-ranges, e.g., @samp{a-z} (meaning all lowercase
7221 letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-}
7222 in @var{chars} or @var{replacement}, place it first or last in the
7223 entire string, or as the last character of a range. Back-to-back ranges
7224 can share a common endpoint. It is not an error for the last character
7225 in the range to be `larger' than the first. In that case, the range
7226 runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}.
7227 The expansion of a range is dependent on the underlying encoding of
7228 characters, so using ranges is not always portable between machines.
7230 The macro @code{translit} is recognized only with parameters.
7234 translit(`GNUs not Unix', `A-Z')
7236 translit(`GNUs not Unix', `a-z', `A-Z')
7237 @result{}GNUS NOT UNIX
7238 translit(`GNUs not Unix', `A-Z', `z-a')
7239 @result{}tmfs not fnix
7240 translit(`+,-12345', `+--1-5', `<;>a-c-a')
7242 translit(`abcdef', `aabdef', `bcged')
7246 In the @sc{ascii} encoding, the first example deletes all uppercase
7247 letters, the second converts lowercase to uppercase, and the third
7248 `mirrors' all uppercase letters, while converting them to lowercase.
7249 The first two cases are by far the most common, even though they are not
7250 portable to @sc{ebcdic} or other encodings. The fourth example shows a
7251 range ending in @samp{-}, as well as back-to-back ranges. The final
7252 example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the
7253 resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and
7254 @samp{e} are swapped, and the @samp{f} is discarded.
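
Because @var{chars} and @var{replacement} are matched position by
position once their ranges have been expanded, a rotation cipher fits in
a single definition.  The sketch below assumes an @sc{ascii}-compatible
encoding, and the macro name @code{rot13} is chosen only for
illustration.

@example
dnl rot13 is a hypothetical helper, assuming an ASCII-compatible encoding.
define(`rot13', `translit(`$1', `a-zA-Z', `n-za-mN-ZA-M')')
@result{}
rot13(`GNUs not Unix')
@result{}TAHf abg Havk
@end example
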
7256 Omitting @var{chars} evokes a warning, but still produces output.
7260 @error{}m4:stdin:1: warning: translit: too few arguments: 1 < 2
7265 @section Substituting text by regular expression
7267 @cindex regular expressions
7268 @cindex expressions, regular
7269 @cindex pattern substitution
7270 @cindex substitution by regular expression
7271 @cindex GNU extensions
7272 Global substitution in a string is done by @code{patsubst}:
7274 @deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @
7275 @ovar{replacement}, @ovar{resyntax})
7276 Searches @var{string} for matches of @var{regexp}, and substitutes
7277 @var{replacement} for each match.
7279 If @var{resyntax} is given, the particular flavor of regular expression
7280 understood with respect to @var{regexp} can be changed from the current
7281 default. @xref{Changeresyntax}, for details of the values that can be
7282 given for this argument. Unlike @code{regexp}, if exactly three
7283 arguments are given, the third argument is always treated as
7284 @var{replacement}, even if it matches a known syntax name.
7286 The parts of @var{string} that are not covered by any match of
7287 @var{regexp} are copied to the expansion. Whenever a match is found, the
7288 search proceeds from the end of the match, so a character from
7289 @var{string} will never be substituted twice. If @var{regexp} matches a
7290 string of zero length, the start position for the search is incremented,
7291 to avoid infinite loops.
7293 When a replacement is to be made, @var{replacement} is inserted into
7294 the expansion, with @samp{\@var{n}} substituted by the text matched by
7295 the @var{n}th parenthesized sub-expression of @var{regexp}, for up to
7296 nine sub-expressions. The escape @samp{\&} is replaced by the text of
7297 the entire regular expression matched. For all other characters,
7298 @samp{\} treats the next character literally. A warning is issued if
7299 there were fewer sub-expressions than the @samp{\@var{n}} requested, or
7300 if there is a trailing @samp{\}.
7302 The @var{replacement} argument can be omitted, in which case the text
7303 matched by @var{regexp} is deleted.
7305 The macro @code{patsubst} is recognized only with parameters.
7308 When used with two arguments, @code{regexp} returns the position of the
7309 match, but @code{patsubst} deletes the match:
7312 patsubst(`GNUs not Unix', `^', `OBS: ')
7313 @result{}OBS: GNUs not Unix
7314 patsubst(`GNUs not Unix', `\<', `OBS: ')
7315 @result{}OBS: GNUs OBS: not OBS: Unix
7316 patsubst(`GNUs not Unix', `\w*', `(\&)')
7317 @result{}(GNUs)() (not)() (Unix)()
7318 patsubst(`GNUs not Unix', `\w+', `(\&)')
7319 @result{}(GNUs) (not) (Unix)
7320 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
7321 @result{}GN not@w{ }
7322 patsubst(`GNUs not Unix', `not', `NOT\')
7323 @error{}m4:stdin:6: warning: patsubst: trailing \ ignored in replacement
7324 @result{}GNUs NOT Unix
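
The @samp{\@var{n}} escapes make it straightforward to rearrange matched
text.  As a small sketch on made-up input, the following swaps the two
sides of every @samp{key=value} pair, while the text between the matches
is copied through unchanged.

@example
patsubst(`a=1, b=2', `\(\w+\)=\(\w+\)', `\2=\1')
@result{}1=a, 2=b
@end example
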
7327 Here is a slightly more realistic example, which capitalizes individual
7328 words or whole sentences, by substituting calls of the macros
7329 @code{upcase} and @code{downcase} into the strings.
7331 @deffn Composite upcase (@var{text})
7332 @deffnx Composite downcase (@var{text})
7333 @deffnx Composite capitalize (@var{text})
7334 Expand to @var{text}, but with capitalization changed: @code{upcase}
7335 changes all letters to upper case, @code{downcase} changes all letters
7336 to lower case, and @code{capitalize} changes the first character of each
7337 word to upper case and the remaining characters to lower case.
7340 First, an example of their usage, using implementations distributed in
7341 @file{m4-@value{VERSION}/@/doc/examples/@/capitalize.m4}.
7345 $ @kbd{m4 -I doc/examples}
7346 include(`capitalize.m4')
7348 upcase(`GNUs not Unix')
7349 @result{}GNUS NOT UNIX
7350 downcase(`GNUs not Unix')
7351 @result{}gnus not unix
7352 capitalize(`GNUs not Unix')
7353 @result{}Gnus Not Unix
7356 Now for the implementation. There is a helper macro @code{_capitalize}
7357 which puts only its first word in mixed case. Then @code{capitalize}
7358 merely parses out the words, and replaces them with an invocation of
7359 @code{_capitalize}. (As presented here, the @code{capitalize} macro has
7360 some subtle flaws. You should try to see if you can find and correct
7361 them; or @pxref{Improved capitalize, , Answers}).
7365 $ @kbd{m4 -I doc/examples}
7366 undivert(`capitalize.m4')dnl
7367 @result{}divert(`-1')
7368 @result{}# upcase(text)
7369 @result{}# downcase(text)
7370 @result{}# capitalize(text)
7371 @result{}# change case of text, simple version
7372 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
7373 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
7374 @result{}define(`_capitalize',
7375 @result{} `regexp(`$1', `^\(\w\)\(\w*\)',
7376 @result{} `upcase(`\1')`'downcase(`\2')')')
7377 @result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
7378 @result{}divert`'dnl
7381 If @var{resyntax} is given, @var{regexp} must be given according to
7382 the syntax chosen, though the default regular expression syntax
7383 remains unchanged for other invocations:
7387 `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl
7388 epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_')
7389 @result{}_bar_ _baz_
7390 patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_')
7391 @result{}_bar_ _baz_
7394 While @code{regexp} replaces the whole input with the replacement as
7395 soon as there is a match, @code{patsubst} replaces each
7396 @emph{occurrence} of a match and preserves non-matching pieces:
7402 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
7403 @result{}bar FOO baz FOO
7405 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
7406 @result{}bab abb 212
7410 Omitting @var{regexp} evokes a warning, but still produces output;
7411 contrast this with an empty @var{regexp} argument.
7415 @error{}m4:stdin:1: warning: patsubst: too few arguments: 1 < 2
7419 patsubst(`abc', `', `\\-')
7420 @result{}\-a\-b\-c\-
7424 @section Formatting strings (printf-like)
7426 @cindex formatted output
7427 @cindex output, formatted
7428 @cindex GNU extensions
7429 Formatted output can be made with @code{format}:
7431 @deffn {Builtin (gnu)} format (@var{format-string}, @dots{})
7432 Works much like the C function @code{printf}. The first argument
7433 @var{format-string} can contain @samp{%} specifications which are
7434 satisfied by additional arguments, and the expansion of @code{format} is
7435 the formatted string.
7437 The macro @code{format} is recognized only with parameters.
7440 Its use is best described by a few examples:
7442 @comment This test is a bit fragile, if someone tries to port to a
7443 @comment platform without infinity.
7445 define(`foo', `The brown fox jumped over the lazy dog')
7447 format(`The string "%s" uses %d characters', foo, len(foo))
7448 @result{}The string "The brown fox jumped over the lazy dog" uses 38 characters
7449 format(`%*.*d', `-1', `-1', `1')
7451 format(`%.0f', `56789.9876')
7453 len(format(`%-*X', `5000', `1'))
7455 ifelse(format(`%010F', `infinity'), ` INF', `success',
7456 format(`%010F', `infinity'), ` INFINITY', `success',
7457 format(`%010F', `infinity'))
7459 ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success',
7460 format(`%.1A', `1.999'), `0X2.0P+0', `success',
7461 format(`%.1A', `1.999'))
7463 format(`%g', `0xa.P+1')
7467 Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this
7468 example shows how @code{format} can be used to produce tabular output.
7472 $ @kbd{m4 -I doc/examples}
7473 include(`forloop.m4')
7475 forloop(`i', `1', `10', `format(`%6d squared is %10d
7477 @result{} 1 squared is 1
7478 @result{} 2 squared is 4
7479 @result{} 3 squared is 9
7480 @result{} 4 squared is 16
7481 @result{} 5 squared is 25
7482 @result{} 6 squared is 36
7483 @result{} 7 squared is 49
7484 @result{} 8 squared is 64
7485 @result{} 9 squared is 81
7486 @result{} 10 squared is 100
7490 The builtin @code{format} is modeled after the ANSI C @samp{printf}
7491 function, and supports these @samp{%} specifiers: @samp{c}, @samp{s},
7492 @samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A},
7493 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and
7494 @samp{%}; it supports field widths and precisions, and the flags
7495 @samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For
7496 integer specifiers, the width modifiers @samp{hh}, @samp{h}, and
7497 @samp{l} are recognized, and for floating point specifiers, the width
7498 modifier @samp{l} is recognized. Items not yet supported include
7499 positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C}
7500 specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll}
7501 modifiers, and any platform extensions available in the native
7502 @code{printf}. For more details on the functioning of @code{printf},
7503 see the C Library Manual, or the POSIX specification (for
7504 example, @samp{%a} is supported even on platforms that haven't yet
7505 implemented C99 hexadecimal floating point output natively).
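
To make the flags, field width, and precision concrete, here is a short
sketch with arbitrary values: a zero-padded hexadecimal number, a string
left-justified in a five-column field, and a string truncated by its
precision.

@example
format(`%04x', `255')
@result{}00ff
format(`[%-5s]', `ab')
@result{}[ab   ]
format(`%.3s', `abcdef')
@result{}abc
@end example
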
7507 @c FIXME - format still needs some improvements.
7508 Warnings are issued for unrecognized specifiers, an improper number of
7509 arguments, or difficulty parsing an argument according to the format
7510 string (such as overflow or extra characters). It is anticipated that a
7511 future release of GNU @code{m4} will support more specifiers.
7512 Likewise, escape sequences are not yet recognized.
7516 @error{}m4:stdin:1: warning: format: unrecognized specifier in '%p'
7519 @error{}m4:stdin:2: warning: format: empty string treated as 0
7520 @error{}m4:stdin:2: warning: format: too few arguments: 2 < 3
7522 format(`%.1f', `2a')
7523 @error{}m4:stdin:3: warning: format: non-numeric argument '2a'
7528 @comment Expose a crash with a bad format string fixed in 1.4.15.
7529 @comment Unfortunately, 8-bit bytes are hard to check for; but the
7530 @comment exit status is enough to sniff the crash in broken versions.
7533 format(`%'format(`%c', `128'))
7540 @chapter Macros for doing arithmetic
7543 @cindex integer arithmetic
7544 Integer arithmetic is included in @code{m4}, with a C-like syntax. As
7545 convenient shorthands, there are builtins for simple increment and
7546 decrement operations.
7549 * Incr:: Decrement and increment operators
7550 * Eval:: Evaluating integer expressions
7551 * Mpeval:: Multiple precision arithmetic
7555 @section Decrement and increment operators
7557 @cindex decrement operator
7558 @cindex increment operator
7559 Increment and decrement of integers are supported using the builtins
7560 @code{incr} and @code{decr}:
7562 @deffn {Builtin (m4)} incr (@var{number})
7563 @deffnx {Builtin (m4)} decr (@var{number})
7564 Expand to the numerical value of @var{number}, incremented
7565 or decremented, respectively, by one. Except for the empty string, the
7566 expansion is empty if @var{number} could not be parsed.
7568 The macros @code{incr} and @code{decr} are recognized only with
7578 @error{}m4:stdin:3: warning: incr: empty string treated as 0
7581 @error{}m4:stdin:4: warning: decr: empty string treated as 0
7585 The builtin macros @code{incr} and @code{decr} are recognized only when
7589 @section Evaluating integer expressions
7591 @cindex integer expression evaluation
7592 @cindex evaluation, of integer expressions
7593 @cindex expressions, evaluation of integer
7594 Integer expressions are evaluated with @code{eval}:
7596 @deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width})
7597 Expands to the value of @var{expression}. The expansion is empty
7598 if a problem is encountered while parsing the arguments. If specified,
7599 @var{radix} and @var{width} control the format of the output.
7601 Calculations are done with signed numbers, using at least 31-bit
7602 precision, but as a GNU extension, @code{m4} will use wider
7603 integers if available. Precision is finite, based on the platform's
7604 notion of @code{intmax_t}, and overflow silently results in wraparound.
7605 A warning is issued if division by zero is attempted, or if
7606 @var{expression} could not be parsed.
7608 Expressions can contain the following operators, listed in order of
7609 decreasing precedence.
7615 Unary plus and minus, and bitwise and logical negation
7619 Multiplication, division, modulo, and ratio
7621 Addition and subtraction
7623 Shift left, shift right, unsigned shift right
7625 Relational operators
7631 Bitwise exclusive-or
7641 Sequential evaluation
7644 The macro @code{eval} is recognized only with parameters.
7647 All binary operators, except exponentiation, are left associative. C
7648 operators that perform variable assignment, such as @samp{+=} or
7649 @samp{--}, are not implemented, since @code{eval} only operates on
7650 constants, not variables. Attempting to use them results in an error.
7651 @comment FIXME - since XCU ERN 137 is approved, we could provide an
7652 @comment extension that supported assignment operators.
7654 Note that some older @code{m4} implementations use @samp{^} as an
7655 alternate operator for the exponentiation, although POSIX
7656 requires the C behavior of bitwise exclusive-or. The precedence of the
7657 negation operators, @samp{~} and @samp{!}, was traditionally lower than
7658 equality. The unary operators could not be used reliably more than once
7659 on the same term without intervening parentheses. The traditional
7660 precedence of the equality operators @samp{==} and @samp{!=} was
7661 identical instead of lower than the relational operators such as
7662 @samp{<}, even through GNU M4 1.4.8. Starting with version
7663 1.4.9, GNU M4 correctly follows POSIX precedence
7664 rules. M4 scripts designed to be portable between releases must be
7665 aware that parentheses may be required to enforce C precedence rules.
7666 Likewise, division by zero, even in the unused branch of a
7667 short-circuiting operator, is not always well-defined in other
7670 Following are some examples where the current version of M4 follows C
7671 precedence rules, but where older versions and some other
7672 implementations of @code{m4} require explicit parentheses to get the
7678 eval(`(1 == 2) > 0')
7688 eval(`+ + - ~ ! ~ 0')
7691 @error{}m4:stdin:8: warning: eval: invalid operator: '++0'
7694 @error{}m4:stdin:9: warning: eval: invalid operator: '1 = 1'
7697 @error{}m4:stdin:10: warning: eval: invalid operator: '0 |= 1'
7702 @error{}m4:stdin:12: warning: eval: divide by zero: '0 || 1 / 0'
7707 @error{}m4:stdin:14: warning: eval: modulo by zero: '2 && 1 % 0'
7711 @cindex GNU extensions
7712 As a GNU extension, @code{eval} supports several operators
7713 that do not appear in C@. A right-associative exponentiation operator
7714 @samp{**} computes the value of the left argument raised to the right,
7715 modulo the numeric precision width. If evaluated, the exponent must be
7716 non-negative, and at least one of the arguments must be non-zero, or a
7717 warning is issued. An unsigned shift operator @samp{>>>} allows
7718 shifting a negative number as though it were an unsigned bit pattern,
7719 which shifts in 0 bits rather than twos-complement sign-extension. A
7720 ratio operator @samp{\} behaves like normal division @samp{/} on
7721 integers, but is provided for symmetry with @code{mpeval}.
7722 Additionally, the C operators @samp{,} and @samp{?:} are supported.
7727 eval(`(2 ** 3) ** 2')
7735 @error{}m4:stdin:5: warning: eval: divide by zero: '0 ** 0'
7737 @error{}m4:stdin:6: warning: eval: negative exponent: '4 ** -2'
7739 eval(`2 || 4 ** -2')
7741 eval(`(-1 >> 1) == -1')
7743 eval(`(-1 >>> 1) > (1 << 30)')
7759 Within @var{expression} (but not @var{radix} or @var{width}), numbers
7760 without a special prefix are decimal. A simple @samp{0} prefix
7761 introduces an octal number. @samp{0x} introduces a hexadecimal number.
7762 As GNU extensions, @samp{0b} introduces a binary number, and
7763 @samp{0r} introduces a number expressed in any radix between 1 and 36:
7764 the prefix should be immediately followed by the decimal expression of
7765 the radix, a colon, then the digits making the number. For radix 1,
7766 leading zeros are ignored, and all remaining digits must be @samp{1};
7767 for all other radices, the digits are @samp{0}, @samp{1}, @samp{2},
7768 @dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up
7769 to @samp{z}. Lower and upper case letters can be used interchangeably
7770 in number prefixes and as number digits.
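
The prefixes can be mixed freely within one expression.  In the
following sketch, the literals are written in hexadecimal, octal,
binary, base 36, and base 2 (using the @samp{0r} form), and each result
is printed in the default radix of 10.  The values are arbitrary.

@example
eval(`0x1f')
@result{}31
eval(`010 + 0b101')
@result{}13
eval(`0r36:z + 0r2:11')
@result{}38
@end example
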
7772 Parentheses may be used to group subexpressions whenever needed. For the
7773 relational operators, a true relation returns @code{1}, and a false
7774 relation returns @code{0}.
7776 Here are a few examples of the use of @code{eval}.
7787 eval(index(`Hello world', `llo') >= 0)
7789 eval(`0r1:0111 + 0b100 + 0r3:12')
7791 define(`square', `eval(`($1) ** 2')')
7795 square(square(`5')` + 1')
7797 define(`foo', `666')
7800 @error{}m4:stdin:11: warning: eval: bad expression: 'foo / 6'
7806 As the last two lines show, @code{eval} does not handle macro
7807 names, even if they expand to a valid expression (or part of a valid
7808 expression). Therefore all macros must be expanded before they are
7809 passed to @code{eval}.
7810 @comment update this if we add support for variables.
7812 Some calculations are not portable to other implementations, since they
7813 have undefined semantics in C, but GNU @code{m4} has
7814 well-defined behavior on overflow. When shifting, an out-of-range shift
7815 amount is implicitly brought into the range of the precision using
7816 modulo arithmetic (for example, on 32-bit integers, this would be an
7817 implicit bit-wise and with 0x1f). This example should work whether your
7818 platform uses 32-bit integers, 64-bit integers, or even some other
7822 define(`max_int', eval(`-1 >>> 1'))
7824 define(`min_int', eval(max_int` + 1'))
7830 ifelse(eval(min_int` / -1'), min_int, `overflow occurred')
7831 @result{}overflow occurred
7832 eval(`0x80000000 % -1')
7836 eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2'))
7840 If @var{radix} is specified, it specifies the radix to be used in the
7841 expansion. The default radix is 10; this is also the case if
7842 @var{radix} is the empty string. A warning results if the radix is
7843 outside the range of 1 through 36, inclusive. The result of @code{eval}
7844 is always taken to be signed. No radix prefix is output, and for
7845 radices greater than 10, the digits are lower case (although some
7846 other implementations use upper case). The output is unquoted, and
7847 subject to further macro expansion. The @var{width}
7848 argument specifies the minimum output width, excluding any negative
7849 sign. The result is zero-padded to extend the expansion to the
7850 requested width. A warning results if the width is negative. If
7851 @var{radix} or @var{width} is out of bounds, the expansion of
7852 @code{eval} is empty.
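
As a brief sketch of how @var{radix} and @var{width} interact, the
following prints one value in hexadecimal, another in zero-padded
binary, and then the negation of the latter, showing that the sign is
not counted as part of the width.  The values are arbitrary.

@example
eval(`255', `16')
@result{}ff
eval(`5', `2', `8')
@result{}00000101
eval(`-5', `2', `8')
@result{}-00000101
@end example
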
7861 eval(`666', `6', `10')
7863 eval(`-666', `6', `10')
7864 @result{}-0000003030
7867 `0r1:'eval(`10', `1', `11')
7868 @result{}0r1:01111111111
7872 @error{}m4:stdin:9: warning: eval: radix out of range: 37
7875 @error{}m4:stdin:10: warning: eval: negative width: -1
7878 @error{}m4:stdin:11: warning: eval: empty string treated as 0
7881 @error{}m4:stdin:12: warning: eval: empty string treated as 0
7883 define(`a', `hi')eval(` 10 ', `16')
7888 @section Multiple precision arithmetic
7890 When @code{m4} is compiled with a multiple precision arithmetic library
7891 (@pxref{Experiments}), a builtin @code{mpeval} is defined.
7893 @deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @
7895 Behaves similarly to @code{eval}, except the calculations are done with
7896 infinite precision, and rational numbers are supported. Numbers may be
7899 The macro @code{mpeval} is recognized only with parameters.
7902 For the most part, using @code{mpeval} is similar to using @code{eval}:
7904 @comment options: mpeval -
7907 mpeval(`(1 << 70) + 2 ** 68 * 3', `16')
7908 @result{}700000000000000000
7909 `0r24:'mpeval(`0r36:zYx', `24', `5')
7913 The ratio operator, @samp{\}, is provided with the same precedence as
7914 division, and rationally divides two numbers and canonicalizes the
7915 result, whereas the division operator @samp{/} always returns the
7916 integer quotient of the division. To convert a rational value to
7917 integral, divide (@samp{/}) by 1. Some operators, such as @samp{%},
7918 @samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate
7919 only on integers and will truncate any rational remainder. The unsigned
7920 shift operator, @samp{>>>}, behaves identically to regular right
7921 shifts, @samp{>>}, since with infinite precision, it is not possible to
7922 convert a negative number to a positive using shifts. The
7923 exponentiation operator, @samp{**}, assumes that the exponent is
7924 integral, but allows negative exponents. With the short-circuit logical
7925 operators, @samp{||} and @samp{&&}, a non-zero result preserves the
7926 value of the argument that ended evaluation, rather than collapsing to
7927 @samp{1}. The operators @samp{?:} and @samp{,} are always available,
7928 even in POSIX mode, since @code{mpeval} does not have to
7929 conform to the POSIX rules for @code{eval}.
7931 @comment options: mpeval -
7948 @node Shell commands
7949 @chapter Macros for running shell commands
7951 @cindex UNIX commands, running
7952 @cindex executing shell commands
7953 @cindex running shell commands
7954 @cindex shell commands, running
7955 @cindex commands, running shell
7956 There are a few builtin macros in @code{m4} that allow you to run shell
7957 commands from within @code{m4}.
7959 Note that the definition of a valid shell command is system dependent.
7960 On UNIX systems, this is the typical @command{/bin/sh}. But on other
7961 systems, such as native Windows, the shell has a different syntax of
7962 commands that it understands. Some examples in this chapter assume
7963 @command{/bin/sh}, and also demonstrate how to quit early with a known
7964 exit value if this is not the case.
7967 * Platform macros:: Determining the platform
7968 * Syscmd:: Executing simple commands
7969 * Esyscmd:: Reading the output of commands
7970 * Sysval:: Exit status
7971 * Mkstemp:: Making temporary files
7972 * Mkdtemp:: Making temporary directories
7975 @node Platform macros
7976 @section Determining the platform
7978 @cindex platform macros
7979 Sometimes it is desirable for an input file to know which platform
7980 @code{m4} is running on. GNU @code{m4} provides several
7981 macros that are predefined to expand to the empty string; checking for
7982 their existence will confirm platform details.
7984 @deffn {Optional builtin (gnu)} __os2__
7985 @deffnx {Optional builtin (traditional)} os2
7986 @deffnx {Optional builtin (gnu)} __unix__
7987 @deffnx {Optional builtin (traditional)} unix
7988 @deffnx {Optional builtin (gnu)} __windows__
7989 @deffnx {Optional builtin (traditional)} windows
7990 Each of these macros is conditionally defined as needed to describe the
7991 environment of @code{m4}. If defined, each macro expands to the empty
7995 On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}}
7996 in the @samp{gnu} module, and @code{unix} in the @samp{traditional}
7999 On native Windows systems, GNU @code{m4} will define
8000 @code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in
8001 the @samp{traditional} module.
8003 On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}}
8004 in the @samp{gnu} module, and @code{os2} in the @samp{traditional}
8007 If GNU M4 does not provide a platform macro for your system,
8008 please report that as a bug.
8011 define(`provided', `0')
8013 ifdef(`__unix__', `define(`provided', incr(provided))')
8015 ifdef(`__windows__', `define(`provided', incr(provided))')
8017 ifdef(`__os2__', `define(`provided', incr(provided))')
8024 @section Executing simple commands
8026 Any shell command can be executed, using @code{syscmd}:
8028 @deffn {Builtin (m4)} syscmd (@var{shell-command})
8029 Executes @var{shell-command} as a shell command.
8031 The expansion of @code{syscmd} is void, @emph{not} the output from
8032 @var{shell-command}! Output or error messages from @var{shell-command}
8033 are not read by @code{m4}. @xref{Esyscmd}, if you need to process the
8036 Prior to executing the command, @code{m4} flushes its buffers.
8037 The default standard input, output and error of @var{shell-command} are
8038 the same as those of @code{m4}.
8040 By default, the @var{shell-command} will be used as the argument to the
8041 @option{-c} option of the @command{/bin/sh} shell (or the version of
8042 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8043 supports that). If you prefer a different shell, the
8044 @command{configure} script can be given the option
8045 @option{--with-syscmd-shell=@var{location}} to set the location of an
8046 alternative shell at GNU @code{m4} installation; the
8047 alternative shell must still support @option{-c}.
8049 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8050 m4}) is in effect, @code{syscmd} results in an error, since otherwise an
8051 input file could execute arbitrary code.
8053 The macro @code{syscmd} is recognized only with parameters.
8057 define(`foo', `FOO')
8064 Note how the expansion of @code{syscmd} keeps the trailing newline of
8065 the command, as well as using the newline that appeared after the macro.
8067 The following is an example of @var{shell-command} using the same
8068 standard input as @code{m4}:
8070 @comment The testsuite does not know how to parse pipes from the
8071 @comment texinfo. Fortunately, there are other tests in the testsuite
8072 @comment that test this same feature.
8075 $ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4}
8079 It tells @code{m4} to read all of its input before executing the wrapped
8080 text, then hands a valid (albeit emptied) pipe as standard input for the
8081 @code{cat} subcommand. Therefore, you should be careful when using
8082 standard input (either by specifying no files, or by passing @samp{-} as
8083 a file name on the command line, @pxref{Command line files, , Invoking
8084 m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd}
8085 that consume data from standard input. When standard input is a
8086 seekable file, the subprocess will pick up with the next character not
8087 yet processed by @code{m4}; when it is a pipe or other non-seekable
8088 file, there is no guarantee how much data will already be buffered by
8089 @code{m4} and thus unavailable to the child.
8091 Following is an example of how potentially unsafe actions can be
8094 @comment options: --safer
8099 @error{}m4:stdin:1: syscmd: disabled by --safer
8104 @section Reading the output of commands
8106 @cindex GNU extensions
8107 If you want @code{m4} to read the output of a shell command, use
8110 @deffn {Builtin (gnu)} esyscmd (@var{shell-command})
8111 Expands to the standard output of the shell command
8112 @var{shell-command}.
8114 Prior to executing the command, @code{m4} flushes its buffers.
8115 The default standard input and standard error of @var{shell-command} are
8116 the same as those of @code{m4}. The error output of @var{shell-command}
8117 is not a part of the expansion: it will appear along with the error
8118 output of @code{m4}.
8120 By default, the @var{shell-command} will be used as the argument to the
8121 @option{-c} option of the @command{/bin/sh} shell (or the version of
8122 @command{sh} specified by @samp{command -p getconf PATH}, if your system
8123 supports that). If you prefer a different shell, the
8124 @command{configure} script can be given the option
8125 @option{--with-syscmd-shell=@var{location}} to set the location of an
8126 alternative shell at GNU @code{m4} installation; the
8127 alternative shell must still support @option{-c}.
8129 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8130 m4}) is in effect, @code{esyscmd} results in an error, since otherwise
8131 an input file could execute arbitrary code.
8133 The macro @code{esyscmd} is recognized only with parameters.
8137 define(`foo', `FOO')
8144 Note how the expansion of @code{esyscmd} keeps the trailing newline of
8145 the command, as well as using the newline that appeared after the macro.
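The idea of capturing a command's standard output, trailing newline included, can be sketched outside of @code{m4}. The following Python fragment is only an analogy, assuming a POSIX system where @command{/bin/sh} exists; the helper name @code{esyscmd_like} is hypothetical and is not part of any @code{m4} interface.

```python
import subprocess


def esyscmd_like(shell_command):
    """Run shell_command via /bin/sh -c; return (stdout text, exit status).

    Standard input and standard error are inherited from the caller, so
    only standard output becomes part of the returned text.
    """
    proc = subprocess.run(
        ["/bin/sh", "-c", shell_command],
        stdout=subprocess.PIPE,   # capture standard output only
        text=True,
    )
    return proc.stdout, proc.returncode


output, status = esyscmd_like("echo expansion")
print(repr(output), status)   # 'expansion\n' 0
```

The captured text still ends with the newline written by @command{echo}, which mirrors the note above about the trailing newline being kept.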
8147 Just as with @code{syscmd}, care must be exercised when sharing standard
8148 input between @code{m4} and the child process of @code{esyscmd}.
8149 Likewise, potentially unsafe actions can be suppressed.
8151 @comment options: --safer
8156 @error{}m4:stdin:1: esyscmd: disabled by --safer
8161 @section Exit status
8163 @cindex UNIX commands, exit status from
8164 @cindex exit status from shell commands
8165 @cindex shell commands, exit status from
8166 @cindex commands, exit status from shell
8167 @cindex status of shell commands
8168 To see whether a shell command succeeded, use @code{sysval}:
8170 @deffn {Builtin (m4)} sysval
8171 Expands to the exit status of the last shell command run with
8172 @code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been
8181 ifelse(sysval, `0', `zero', `non-zero')
8193 ifelse(sysval, `0', `zero', `non-zero')
8195 esyscmd(`echo dnl && exit 127')
8205 @code{sysval} results in 127 if there was a problem executing the
8206 command, for example, if the system-imposed argument length is exceeded,
8207 or if there were not enough resources to fork. It is not possible to
8208 distinguish between failed execution and successful execution that had
8209 an exit status of 127, unless there was output from the child process.
8211 On UNIX platforms, where it is possible to detect when command execution
8212 is terminated by a signal, rather than a normal exit, the result is the
8213 signal number shifted left by eight bits.
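The encoding described above can be unpacked with a little arithmetic. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes that normal exit statuses lie in the range 0 to 255, so that any larger value denotes termination by a signal.

```python
def decode_sysval(value):
    """Split a sysval-style number into ('exit', n) or ('signal', n)."""
    if value > 255:
        # Termination by signal: the signal number is shifted left by 8 bits.
        return ("signal", value >> 8)
    return ("exit", value)


print(decode_sysval(2304))   # ('signal', 9)
print(decode_sysval(127))    # ('exit', 127)
```

As the text notes, a result of 127 remains ambiguous: the sketch cannot tell a failed execution from a command that itself exited with status 127.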
8215 @comment This test has difficulties being portable, even on platforms
8216 @comment where syscmd invokes /bin/sh. Kill is not portable with signal
8217 @comment names. According to autoconf, the only portable signal numbers
8218 @comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But
8219 @comment all shells handle SIGINT, and ksh handles HUP (as in, the shell
8220 @comment exits normally rather than letting the signal terminate it).
8221 @comment Also, TERM is flaky, as it can also kill the running m4 on
8222 @comment systems where /bin/sh does not create its own process group.
8223 @comment And PIPE is unreliable, since people tend to run with it
8224 @comment ignored, with m4 inheriting that choice. That leaves KILL as
8225 @comment the only signal we can reliably test.
8227 dnl This test assumes kill is a shell builtin, and that signals are
8230 `errprint(` skipping: syscmd does not have unix semantics
8232 syscmd(`kill -9 $$')
8240 esyscmd(`kill -9 $$')
8246 When the @option{--safer} option (@pxref{Operation modes, , Invoking
8247 m4}) is in effect, @code{sysval} will always remain at its default value
8250 @comment options: --safer
8257 @error{}m4:stdin:2: syscmd: disabled by --safer
8264 @section Making temporary files
8266 @cindex temporary file names
8267 @cindex files, names of temporary
8268 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8269 temporary file, for output or for some other purpose. There is a
8270 builtin macro, @code{mkstemp}, for making a temporary file:
8272 @deffn {Builtin (m4)} mkstemp (@var{template})
8273 @deffnx {Builtin (m4)} maketemp (@var{template})
8274 Expands to the quoted name of a new, empty file, made from the string
8275 @var{template}, which should end with the string @samp{XXXXXX}. The six
8276 @samp{X} characters are then replaced with random characters matching
8277 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file
8278 name unique. If fewer than six @samp{X} characters are found at the end
8279 of @code{template}, the result will be longer than the template. The
8280 created file will have access permissions as if by @kbd{chmod =rw,go=},
8281 meaning that the current umask of the @code{m4} process is taken into
8282 account, and at most the current user can read and write the file.
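The template rule and the exclusive creation step can be sketched as follows. This Python fragment is one plausible reading of the description above, not the implementation used by GNU @code{m4}; the names @code{expand_template} and @code{make_temp_file} are hypothetical, and the handling of templates with fewer than six trailing @samp{X} characters is an assumption based on the statement that the result is then longer than the template.

```python
import os
import secrets
import string

# Characters allowed in the replacement, matching [a-zA-Z0-9._-].
ALPHABET = string.ascii_letters + string.digits + "._-"


def expand_template(template):
    """Replace up to six trailing 'X' characters with six random ones."""
    stem = template
    removed = 0
    while removed < 6 and stem.endswith("X"):
        stem = stem[:-1]
        removed += 1
    return stem + "".join(secrets.choice(ALPHABET) for _ in range(6))


def make_temp_file(template, attempts=100):
    """Create a new, empty file exclusively and return its name."""
    for _ in range(attempts):
        name = expand_template(template)
        try:
            # O_EXCL makes creation fail if the name already exists;
            # mode 0o600 is still filtered through the process umask.
            fd = os.open(name, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
        except FileExistsError:
            continue          # name collision: try another random suffix
        os.close(fd)
        return name
    raise FileExistsError("no unused name found for " + template)
```

Creating the file in the same step that picks the name is what closes the race that a merely predictable name leaves open.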
8284 The traditional behavior, standardized by POSIX, is that
8285 @code{maketemp} merely replaces the trailing @samp{X} with the process
8286 id, without creating a file or quoting the expansion, and without
8287 ensuring that the resulting
8288 string is a unique file name. In part, this means that using the same
8289 @var{template} twice in the same input file will result in the same
8290 expansion. This behavior is a security hole, as it is very easy for
8291 another process to guess the name that will be generated, and thus
8292 interfere with a subsequent use of @code{syscmd} trying to manipulate
8293 that file name. Hence, POSIX has recommended that all new
8294 implementations of @code{m4} provide the secure @code{mkstemp} builtin,
8295 and that users of @code{m4} check for its existence.
8297 The expansion is void and an error issued if a temporary file could
8300 When the @option{--safer} option (@pxref{Operation modes, Invoking m4})
8301 is in effect, @code{mkstemp} and GNU-mode @code{maketemp}
8302 result in an error, since otherwise an input file could perform a mild
8303 denial-of-service attack by filling up a disk with multiple empty files.
8305 The macros @code{mkstemp} and @code{maketemp} are recognized only with
8309 If you try this next example, you will most likely get different output
8310 for the two file names, since the replacement characters are randomly
8316 define(`tmp', `oops')
8318 maketemp(`/tmp/fooXXXXXX')
8319 @error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead
8320 @result{}/tmp/fooa07346
8321 ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))',
8322 `define(`mkstemp', defn(`maketemp'))dnl
8323 errprint(`warning: potentially insecure maketemp implementation
8330 @comment options: --safer
8334 maketemp(`/tmp/fooXXXXXX')
8335 @error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead
8336 @error{}m4:stdin:1: maketemp: disabled by --safer
8338 mkstemp(`/tmp/fooXXXXXX')
8339 @error{}m4:stdin:2: mkstemp: disabled by --safer
8343 @cindex GNU extensions
8344 Unless you use the @option{--traditional} command line option (or
8345 @option{-G}, @pxref{Limits control, , Invoking m4}), the GNU
8346 version of @code{maketemp} is secure. This means that passing the same
8347 template to multiple calls will generate multiple files. However, we
8348 recommend that you use the new @code{mkstemp} macro, introduced in
8349 GNU M4 1.4.8, which is secure even in traditional mode. Also,
8350 as of M4 1.4.11, the secure implementation quotes the resulting file
8351 name, so that you are guaranteed to know what file was created even if
8352 the random file name happens to match an existing macro. Notice that
8353 this example is careful to use @code{defn} to avoid unintended expansion
8358 define(`foo', `errprint(`oops')')
8360 syscmd(`rm -f foo-??????')sysval
8362 define(`file1', maketemp(`foo-XXXXXX'))dnl
8363 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8364 ifelse(esyscmd(`echo \` foo-?????? \''), `foo-??????',
8365 `no file', `created')
8367 define(`file2', maketemp(`foo-XX'))dnl
8368 @error{}m4:stdin:6: warning: maketemp: recommend using mkstemp instead
8369 define(`file3', mkstemp(`foo-XXXXXX'))dnl
8370 ifelse(len(defn(`file1')), len(defn(`file2')),
8371 `same length', `different')
8372 @result{}same length
8373 ifelse(defn(`file1'), defn(`file2'), `same', `different file')
8374 @result{}different file
8375 ifelse(defn(`file2'), defn(`file3'), `same', `different file')
8376 @result{}different file
8377 ifelse(defn(`file1'), defn(`file3'), `same', `different file')
8378 @result{}different file
8379 syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3'))
8385 @comment options: -G
8388 syscmd(`rm -f foo-*')sysval
8390 define(`file1', maketemp(`foo-XXXXXX'))dnl
8391 @error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead
8392 define(`file2', maketemp(`foo-XXXXXX'))dnl
8393 @error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead
8394 ifelse(file1, file2, `same', `different file')
8396 len(maketemp(`foo-XXXXX'))
8397 @error{}m4:stdin:5: warning: maketemp: recommend using mkstemp instead
8399 define(`abc', `def')
8403 @error{}m4:stdin:7: warning: maketemp: recommend using mkstemp instead
8404 syscmd(`test -f foo-*')sysval
8409 @section Making temporary directories
8411 @cindex temporary directory
8412 @cindex directories, temporary
8413 @cindex GNU extensions
8414 Commands specified to @code{syscmd} or @code{esyscmd} might need a
8415 temporary directory, for holding multiple temporary files; such a
8416 directory can be created with @code{mkdtemp}:
8418 @deffn {Builtin (gnu)} mkdtemp (@var{template})
8419 Expands to the quoted name of a new, empty directory, made from the string
8420 @var{template}, which should end with the string @samp{XXXXXX}. The six
8421 @samp{X} characters are then replaced with random characters matching
8422 the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name
8423 unique. If fewer than six @samp{X} characters are found at the end of
8424 @code{template}, the result will be longer than the template. The
8425 created directory will have access permissions as if by @kbd{chmod
8426 =rwx,go=}, meaning that the current umask of the @code{m4} process is
8427 taken into account, and at most the current user can read, write,
8428 and search the directory.
8430 The expansion is void and an error issued if a temporary directory could
8433 When the @option{--safer} option (@pxref{Operation modes, Invoking m4})
8434 is in effect, @code{mkdtemp} results in an error, since otherwise an
8435 input file could perform a mild denial-of-service attack by filling up a
8436 disk with multiple directories.
8438 The macro @code{mkdtemp} is recognized only with parameters.
8439 This macro was added in M4 2.0.
8442 If you try this next example, you will most likely get different output
8443 for the directory names, since the replacement characters are randomly
8449 define(`tmp', `oops')
8451 mkdtemp(`/tmp/fooXXXXXX')
8452 @result{}/tmp/foo2h89Vo
8457 @comment options: --safer
8461 mkdtemp(`/tmp/fooXXXXXX')
8462 @error{}m4:stdin:1: mkdtemp: disabled by --safer
8466 Multiple calls with the same template will generate multiple
8471 syscmd(`echo foo??????')dnl
8473 define(`dir1', mkdtemp(`fooXXXXXX'))dnl
8474 ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created')
8476 define(`dir2', mkdtemp(`fooXXXXXX'))dnl
8477 ifelse(dir1, dir2, `same', `different directories')
8478 @result{}different directories
8479 syscmd(`rmdir 'dir1 dir2)
8486 @chapter Miscellaneous builtin macros
8488 This chapter describes various builtins that do not really belong in
8489 any of the previous chapters.
8492 * Errprint:: Printing error messages
8493 * Location:: Printing current location
8494 * M4exit:: Exiting from @code{m4}
8495 * Syncoutput:: Turning on and off sync lines
8499 @section Printing error messages
8501 @cindex printing error messages
8502 @cindex error messages, printing
8503 @cindex messages, printing error
8504 @cindex standard error, output to
8505 You can print error messages using @code{errprint}:
8507 @deffn {Builtin (m4)} errprint (@var{message}, @dots{})
8508 Prints @var{message} and the rest of the arguments to standard error,
8509 separated by spaces. Standard error is used, regardless of the
8510 @option{--debugfile} option (@pxref{Debugging options, , Invoking m4}).
8512 The expansion of @code{errprint} is void.
8513 The macro @code{errprint} is recognized only with parameters.
8517 errprint(`Invalid arguments to forloop
8519 @error{}Invalid arguments to forloop
8521 errprint(`1')errprint(`2',`3
8527 A trailing newline is @emph{not} printed automatically, so it should be
8528 supplied as part of the argument, as in the example. Unfortunately, the
8529 exact output of @code{errprint} is not very portable to other @code{m4}
8530 implementations: POSIX requires that all arguments be printed,
8531 but some implementations of @code{m4} only print the first.
8532 Furthermore, some BSD implementations always append a newline
8533 for each @code{errprint} call, regardless of whether the last argument
8534 already had one, and POSIX is silent on whether this is
8538 @section Printing current location
8540 @cindex location, input
8541 @cindex input location
8542 To make it possible to specify the location of an error, three
8543 utility builtins exist:
8545 @deffn {Builtin (gnu)} __file__
8546 @deffnx {Builtin (gnu)} __line__
8547 @deffnx {Builtin (gnu)} __program__
8548 Expand to the quoted name of the current input file, the
8549 current input line number in that file, and the quoted name of the
8550 current invocation of @code{m4}.
8554 errprint(__program__:__file__:__line__: `input error
8556 @error{}m4:stdin:1: input error
8560 Line numbers start at 1 for each file. If the file was found due to the
8561 @option{-I} option or @env{M4PATH} environment variable, that is
8562 reflected in the file name. Synclines, via @code{syncoutput}
8563 (@pxref{Syncoutput}) or the command line option @option{--synclines}
8564 (or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the
8565 @samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}),
8566 also use this notion of current file and line. Redefining the three
8567 location macros has no effect on syncline, debug, warning, or error
8570 This example reuses the file @file{incl.m4} mentioned earlier
8575 $ @kbd{m4 -I doc/examples}
8576 define(`foo', ``$0' called at __file__:__line__')
8579 @result{}foo called at stdin:2
8581 @result{}Include file start
8582 @result{}foo called at doc/examples/incl.m4:2
8583 @result{}Include file end
8587 The location of macros invoked during the rescanning of macro expansion
8588 text corresponds to the location in the file where the expansion was
8589 triggered, regardless of how many newline characters the expansion text
8590 contains. As of GNU M4 1.4.8, the location of text wrapped
8591 with @code{m4wrap} (@pxref{M4wrap}) is the point at which the
8592 @code{m4wrap} was invoked. Previous versions, however, behaved as
8593 though wrapped text came from line 0 of the file ``''.
8596 define(`echo', `$@@')
8598 define(`foo', `echo(__line__
8608 foo(errprint(__line__
8626 The @code{@w{__program__}} macro behaves like @samp{$0} in shell
8627 terminology. If you invoke @code{m4} through an absolute path or a link
8628 with a different spelling, rather than by relying on a @env{PATH} search
8629 for plain @samp{m4}, it will affect how @code{@w{__program__}} expands.
8630 The intent is that you can use it to produce error messages with the
8631 same formatting that @code{m4} produces internally. It can also be used
8632 within @code{syscmd} (@pxref{Syscmd}) to pick the same version of
8633 @code{m4} that is currently running, rather than whatever version of
8634 @code{m4} happens to be first in @env{PATH}. It was first introduced in
8638 @section Exiting from @code{m4}
8640 @cindex exiting from @code{m4}
8641 @cindex status, setting @code{m4} exit
8642 If you need to exit from @code{m4} before the entire input has been
8643 read, you can use @code{m4exit}:
8645 @deffn {Builtin (m4)} m4exit (@ovar{code})
8646 Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is
8647 left out, the exit status is zero. If @var{code} cannot be parsed, or
8648 is outside the range of 0 to 255, the exit status is one. No further
8649 input is read, and all wrapped and diverted text is discarded.
8653 m4wrap(`This text is lost due to `m4exit'.')
8655 divert(`1') So is this.
8658 m4exit And this is never read.
8661 A common use of this is to abort processing:
8663 @deffn Composite fatal_error (@var{message})
8664 Abort processing with an error message and non-zero status. Prefix
8665 @var{message} with details about where the error occurred, and print the
8666 resulting string to standard error.
8671 define(`fatal_error',
8672 `errprint(__program__:__file__:__line__`: fatal error: $*
8675 fatal_error(`this is a BAD one, buster')
8676 @error{}m4:stdin:4: fatal error: this is a BAD one, buster
8679 After this macro call, @code{m4} will exit with exit status 1. This macro
8680 is only intended for error exits, since the normal exit procedures are
8681 not followed, i.e., diverted text is not undiverted, and saved text
8682 (@pxref{M4wrap}) is not reread. (This macro could be made more robust
8683 to earlier versions of @code{m4}. You should try to see if you can find
8684 weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}).
8686 Note that it is still possible for the exit status to differ from
8687 what was requested by @code{m4exit}. If @code{m4} detects some other
8688 error, such as a write error on standard output, the exit status will be
8689 non-zero even if @code{m4exit} requested zero.
8691 If standard input is seekable, then the file will be positioned at the
8692 next unread character. If it is a pipe or other non-seekable file,
8693 then there are no guarantees how much data @code{m4} might have read
8694 into buffers, and thus discarded.
8697 @section Turning on and off sync lines
8699 @cindex toggling synchronization lines
8700 @cindex synchronization lines
8701 @cindex location, input
8702 @cindex input location
8703 It is possible to adjust whether synclines are printed to output:
8705 @deffn {Builtin (gnu)} syncoutput (@var{truth})
8706 If @var{truth} matches the extended regular expression
8707 @samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the
8708 form: @samp{#line <number> ["<file>"]}.
8710 If @var{truth} is empty, or matches the extended regular expression
8711 @samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off.
8713 Any other argument is ignored, and a warning is issued.
8715 The macro @code{syncoutput} is recognized only with parameters.
8716 This macro was added in M4 2.0.
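The two regular expressions above can be tried out directly. The following Python sketch is illustrative only; the function name @code{classify_truth} and its return values are hypothetical, and it relies on Python's @code{re} module treating these particular patterns the same way as an extended regular expression would.

```python
import re

# Patterns copied from the description of the truth argument.
ON = re.compile(r"^[1yY]|^([oO][nN])")
OFF = re.compile(r"^[0nN]|^([oO][fF])")


def classify_truth(truth):
    """Return 'on', 'off', or 'warn' for a syncoutput-style argument."""
    if truth == "" or OFF.search(truth):
        return "off"
    if ON.search(truth):
        return "on"
    return "warn"    # unrecognized: ignored, with a warning


for word in ("yes", "on", "1", "no", "off", "0", "", "blah"):
    print(repr(word), classify_truth(word))
```

Because both patterns are anchored only at the start, any argument beginning with a recognized prefix is accepted, so @samp{never} counts as off and @samp{yup} counts as on.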
8720 define(`twoline', `1
8723 changecom(`/*', `*/')
8725 define(`comment', `/*1
8733 @result{}#line 8 "stdin"
8761 @error{}m4:stdin:18: warning: syncoutput: unknown directive 'blah'
8765 Notice that a syncline is output any time a single source line expands
8766 to multiple output lines, or any time multiple source lines expand to a
8767 single output line. When there is a one-for-one correspondence, no
8768 additional synclines are needed.
8770 Synchronization lines can be used to track where input comes from; an
8771 optional file designation is printed when the syncline algorithm
8772 detects that consecutive output lines come from different files. You
8773 can also use the @option{--synclines} command-line option (or
8774 @option{-s}, @pxref{Preprocessor features, , Invoking m4}) to start
8775 with synchronization on. This example reuses the file @file{incl.m4}
8776 mentioned earlier (@pxref{Include}):
8779 @comment options: -s
8781 $ @kbd{m4 --synclines -I doc/examples}
8783 @result{}#line 1 "doc/examples/incl.m4"
8784 @result{}Include file start
8786 @result{}Include file end
8787 @result{}#line 1 "stdin"
8792 @chapter Fast loading of frozen state
8794 Some bigger @code{m4} applications may be built over a common base
8795 containing hundreds of definitions and other costly initializations.
8796 Usually, the common base is kept in one or more declarative files,
8797 which are either listed on each @code{m4} invocation before the
8798 user's input file, or pulled in by each input file with @code{include}.
8800 Reading the common base of a big application, over and over again, may
8801 be time-consuming. GNU @code{m4} offers some machinery to
8802 speed up the start of an application using lengthy common bases.
8805 * Using frozen files:: Using frozen files
8806 * Frozen file format 1:: Frozen file format 1
8807 * Frozen file format 2:: Frozen file format 2
8810 @node Using frozen files
8811 @section Using frozen files
8813 @cindex fast loading of frozen files
8814 @cindex frozen files for fast loading
8815 @cindex initialization, frozen state
8816 @cindex dumping into frozen file
8817 @cindex reloading a frozen file
8818 @cindex GNU extensions
8819 Suppose a user has a library of @code{m4} initializations in
8820 @file{base.m4}, which is then used with multiple input files:
8824 $ @kbd{m4 base.m4 input1.m4}
8825 $ @kbd{m4 base.m4 input2.m4}
8826 $ @kbd{m4 base.m4 input3.m4}
8829 Rather than spending time parsing the fixed contents of @file{base.m4}
8830 every time, the user might instead execute:
8834 $ @kbd{m4 -F base.m4f base.m4}
8838 once, and further execute, as often as needed:
8842 $ @kbd{m4 -R base.m4f input1.m4}
8843 $ @kbd{m4 -R base.m4f input2.m4}
8844 $ @kbd{m4 -R base.m4f input3.m4}
8848 with the varying input. The first call, containing the @option{-F}
8849 option, only reads and executes file @file{base.m4}, defining
8850 various application macros and computing other initializations.
8851 Once the input file @file{base.m4} has been completely processed, GNU
8852 @code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a
8853 file which contains a kind of snapshot of the @code{m4} internal state.
8855 Later calls, containing the @option{-R} option, are able to reload
8856 the internal state of @code{m4}, from @file{base.m4f},
8857 @emph{prior} to reading any other input files. This means that,
8858 instead of starting with a pristine copy of @code{m4}, input will be
8859 read after having effectively recovered the effect of a prior run.
8860 In our example, the effect is the same as if file @file{base.m4} had
8861 been read anew. However, this effect is achieved a lot faster.
8863 Only one frozen file may be created or read in any one @code{m4}
8864 invocation. It is not possible to recover two frozen files at once.
8865 However, frozen files may be updated incrementally, through using
8866 @option{-R} and @option{-F} options simultaneously. For example, if
8867 some care is taken, the command:
8871 $ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4}
8875 could be broken down in the following sequence, accumulating the same
8880 $ @kbd{m4 -F file1.m4f file1.m4}
8881 $ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4}
8882 $ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4}
8883 $ @kbd{m4 -R file3.m4f file4.m4}
8886 Some care is necessary because the frozen file does not save all state
8887 information. Stacks of macro definitions via @code{pushdef} are
8888 accurately stored, along with all renamed or undefined builtins, as are
8889 the current syntax rules, such as those set by @code{changequote}. However, the
8890 value of @code{sysval} and text saved in @code{m4wrap} are not currently
8891 preserved. Also, changing command line options between runs may cause
8892 unexpected behavior. A future release of GNU M4 may improve
8893 on the quality of frozen files.
8895 When an @code{m4} run is to be frozen, the automatic undiversion
8896 which takes place at end of execution is inhibited. Instead, all
8897 positively numbered diversions are saved into the frozen file.
8898 The active diversion number is also transmitted.
8900 A frozen file to be reloaded need not reside in the current directory.
8901 It is looked up the same way as an @code{include} file (@pxref{Search
8904 If the frozen file was generated with a newer version of @code{m4}, and
8905 contains directives that an older @code{m4} cannot parse, attempting to
8906 load the frozen file with option @option{-R} will cause @code{m4} to
8907 exit with status 63 to indicate version mismatch.
8909 @node Frozen file format 1
8910 @section Frozen file format 1
8912 @cindex frozen file format 1
8913 @cindex file format, frozen file version 1
8914 Frozen files are sharable across architectures. It is safe to write
8915 a frozen file on one machine and read it on another, given that the
8916 second machine uses the same or a newer version of GNU @code{m4}.
8917 It is conventional, but not required, to give a frozen file the suffix
8920 Older versions of GNU @code{m4} create frozen files with
8921 syntax version 1. These files can be read by the current version, but
8922 are no longer produced. Version 1 files are mostly text files, although
8923 any macros or diversions that contained nonprintable characters or long
8924 lines cause the resulting frozen file to do likewise, since there are no
8925 escape sequences. The file can be edited to change the state that
8926 @code{m4} will start with. It is composed of several directives, each
8927 starting with a single letter and ending with a newline (@key{NL}).
8928 Wherever a directive is expected, the character @samp{#} can be used
8929 instead to introduce a comment line; empty lines are also ignored if
8930 they are not part of an embedded string.
8932 In the following descriptions, each @var{len} refers to the length of a
8933 corresponding subsequent @var{str}. Numbers are always expressed in
8934 decimal, and an omitted number defaults to 0. The valid directives in
8938 @item V @var{number} @key{NL}
8939 Confirms the format of the file. Version 1 is recognized when
8940 @var{number} is 1. This directive must be the first non-comment in the
8941 file, and may not appear more than once.
8943 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8944 Uses @var{str1} and @var{str2} as the begin-comment and
8945 end-comment strings. If omitted, then @samp{#} and @key{NL} are the
8948 @item D @var{number}, @var{len} @key{NL} @var{str} @key{NL}
8949 Selects diversion @var{number}, making it current, then copies @var{str}
8950 into the current diversion. @var{number} may be a negative number for a
8951 diversion that discards text. To merely specify an active selection,
8952 use this command with an empty @var{str}. With 0 as the diversion
8953 @var{number}, @var{str} will be issued on standard output at reload
8954 time. GNU @code{m4} will not produce the @samp{D} directive
8955 with non-zero length for diversion 0, but this can be done with manual
8956 edits. This directive may appear more than once for the same diversion,
8957 in which case the diversion is the concatenation of the various uses.
8958 If omitted, then diversion 0 is current.
8960 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8961 Defines, through @code{pushdef}, a definition for @var{str1} expanding
8962 to the function whose builtin name is @var{str2}. If the builtin does
8963 not exist (for example, if the frozen file was produced by a copy of
8964 @code{m4} compiled with the now-abandoned @code{changeword} support),
8965 the reload is silent, but any subsequent use of the definition of
8966 @var{str1} will result in a warning. This directive may appear more
8967 than once for the same name, and its order, along with @samp{T}, is
8968 important. If omitted, you will have no access to any builtins.
8970 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8971 Uses @var{str1} and @var{str2} as the begin-quote and end-quote
8972 strings. If omitted, then @samp{`} and @samp{'} are the quote delimiters.
8975 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
8976 Defines, through @code{pushdef}, a definition for @var{str1}
8977 expanding to the text given by @var{str2}. This directive may appear
8978 more than once for the same name, and its order, along with @samp{F}, is important.
8982 When loading format 1, the syntax categories @samp{@{} and @samp{@}} are
8983 disabled (reverting braces to be treated like plain characters). This
8984 is because frozen files created with M4 1.4.x did not understand
8985 @samp{$@{@dots{}@}} extended argument notation, and a frozen macro that
8986 contained this character sequence should not behave differently just
8987 because a newer version of M4 reloaded the file.
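
As an illustration of the directives above, the following hand-written
fragment sketches what a small version 1 frozen file might look like.
The macro name @samp{foo} and its text @samp{hello} are hypothetical,
and a file produced by @code{m4} itself would contain many more
@samp{T} and @samp{F} directives.

@comment ignore
@example
# Hypothetical version 1 frozen file, written by hand
V1
Q1,1
`'
C1,1
#

T3,5
foohello
F6,6
definedefine
D-1,0

# End of frozen state file
@end example

Note how @samp{T3,5} is followed by the eight bytes @samp{foohello};
only the two lengths show where the name @samp{foo} ends and the text
@samp{hello} begins.
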
8989 @node Frozen file format 2
8990 @section Frozen file format 2
8992 @cindex frozen file format 2
8993 @cindex file format, frozen file version 2
8994 The syntax of version 1 has some drawbacks; if any macro or diversion
8995 contained non-printable characters or long lines, the resulting frozen
8996 file would not qualify as a text file, making it harder to edit with
8997 some vendor tools. The concatenation of multiple strings on a single
8998 line, such as for the @samp{T} directive, makes distinguishing the two
8999 strings a bit more difficult. Finally, the format lacks support for
9000 several items of @code{m4} state, such that a reloaded file did not
9001 always behave the same as the original file.
9003 These shortcomings have been addressed in version 2 of the frozen file
9004 syntax. New directives have been added, and existing directives have
9005 additional, and sometimes optional, parameters. All @var{str} instances
9006 in the grammar are now followed by @key{NL}, which makes the split
9007 between consecutive strings easier to recognize. Strings may now
9008 contain escape sequences modeled after C, such as @samp{\n} for newline
9009 or @samp{\0} for @sc{nul}, so that the frozen file can be pure
9010 @sc{ascii} (although when hand-editing a frozen file, it is still
9011 acceptable to use the original byte rather than an escape sequence for
9012 all bytes except @samp{\}). Also in the context of a @var{str}, the
9013 escape sequence @samp{\@key{NL}} is discarded, allowing a user to split
9014 lines that are too long for some platform tools.
9017 @item V @var{number} @key{NL}
9018 Confirms the format of the file. @code{m4} @value{VERSION} only creates
9019 frozen files where @var{number} is 2. This directive must be the first
9020 non-comment in the file, and may not appear more than once.
9022 @item C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9023 Uses @var{str1} and @var{str2} as the begin-comment and
9024 end-comment strings. If omitted, then @samp{#} and @key{NL} are the comment delimiters.
9027 @item d @var{len} @key{NL} @var{str} @key{NL}
9028 Sets the debug flags, using @var{str} as the argument to
9029 @code{debugmode}. If omitted, then the debug flags start in their
9030 default disabled state.
9032 @item D @var{number} , @var{len} @key{NL} @var{str} @key{NL}
9033 Selects diversion @var{number}, making it current, then copies @var{str}
9034 into the current diversion. @var{number} may be a negative number for a
9035 diversion that discards text. To merely specify an active selection,
9036 use this command with an empty @var{str}. With 0 as the diversion
9037 @var{number}, @var{str} will be issued on standard output at reload
9038 time. GNU @code{m4} will not produce the @samp{D} directive
9039 with non-zero length for diversion 0, but this can be done with manual
9040 edits. This directive may appear more than once for the same diversion,
9041 in which case the diversion is the concatenation of the various uses.
9042 If omitted, then diversion 0 is current.
9044 @comment FIXME - the first usage, with only one string, is not supported
9045 @comment in the current code
9046 @c @item F @var{len1} @key{NL} @var{str1} @key{NL}
9047 @item F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9048 @itemx F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9049 Defines, through @code{pushdef}, a definition for @var{str1} expanding
9050 to the function whose builtin name is given by @var{str2} (defaulting to
9051 @var{str1} if not present). With two arguments, the builtin name is
9052 searched for among the intrinsic builtin functions only; with three
9053 arguments, the builtin name is searched for amongst the builtin
9054 functions defined by the module named by @var{str3}.
9056 @item M @var{len} @key{NL} @var{str} @key{NL}
9057 Names a module which will be searched for according to the module search
9058 path and loaded. Modules loaded from a frozen file don't add their
9059 builtin entries to the symbol table. Modules must be loaded prior to
9060 specifying module-specific builtins via the three-argument @code{F} or @code{T}.
9063 @item Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9064 Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings.
9065 If omitted, then @samp{`} and @samp{'} are the quote delimiters.
9067 @item R @var{len} @key{NL} @var{str} @key{NL}
9068 Sets the default regexp syntax, where @var{str} encodes one of the
9069 regular expression syntaxes supported by GNU M4.
9070 @xref{Changeresyntax}, for more details.
9072 @item S @var{syntax-code} @var{len} @key{NL} @var{str} @key{NL}
9073 Defines, through @code{changesyntax}, a syntax category for each of the
9074 characters in @var{str}. The @var{syntax-code} must be one of the
9075 characters described in @ref{Changesyntax}.
9077 @item t @var{len} @key{NL} @var{str} @key{NL}
9078 Enables tracing for any macro named @var{str}, similar to using the
9079 @code{traceon} builtin. This directive may occur more than once for
9080 multiple macros; if omitted, no macro starts out as traced.
9082 @item T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL}
9083 @itemx T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
9084 Defines, through @code{pushdef}, a definition for @var{str1} expanding to
9085 the text given by @var{str2}. This directive may appear more than once
9086 for the same name, and its order, along with @samp{F}, is important. If
9087 present, the optional third argument associates the macro with a module
9088 named by @var{str3}.
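
For comparison, here is a hand-written sketch of how a text macro
definition might be expressed under the version 2 grammar described
above. The macro name @samp{foo} and the text @samp{hello} are
hypothetical. Each string now sits on its own line, where version 1
would have written the single line @samp{foohello}.

@comment ignore
@example
# Hypothetical version 2 fragment, written by hand
V2
T3,5
foo
hello
@end example
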
9092 @chapter Compatibility with other versions of @code{m4}
9094 @cindex compatibility
9095 This chapter describes many of the differences between this
9096 implementation of @code{m4} and other implementations found under
9097 UNIX, such as System V Release 4, Solaris, and BSD flavors.
9098 In particular, it lists the known differences and extensions to
9099 POSIX. However, the list is not necessarily comprehensive.
9101 At the time of this writing, POSIX 2001 (also known as IEEE
9102 Std 1003.1-2001) is the latest standard, although a new version of
9103 POSIX is under development and includes several proposals for
9104 modifying what @code{m4} is required to do. The requirements for
9105 @code{m4} are shared between SUSv3 and POSIX, and can be viewed at
9107 @uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}.
9110 * Extensions:: Extensions in GNU M4
9111 * Incompatibilities:: Other incompatibilities
9112 * Experiments:: Experimental features in GNU M4
9116 @section Extensions in GNU M4
9118 @cindex GNU extensions
9120 @cindex @env{POSIXLY_CORRECT}
9121 This version of @code{m4} contains a few facilities that do not exist
9122 in System V @code{m4}. These extra facilities are all suppressed by
9123 using the @option{-G} command line option, unless overridden by other
9124 command line options.
9125 Most of these extensions are compatible with
9126 @uref{http://www.unix.org/single_unix_specification/,
9127 POSIX}; the few exceptions are suppressed if the
9128 @env{POSIXLY_CORRECT} environment variable is set.
9132 In the @code{$@var{n}} notation for macro arguments, @var{n} can contain
9133 several digits, while the System V @code{m4} only accepts one digit.
9134 This allows macros in GNU @code{m4} to take any number of
9135 arguments, and not only nine (@pxref{Arguments}).
9136 POSIX does not allow this extension, so it is disabled if
9137 @env{POSIXLY_CORRECT} is set.
9138 @c FIXME - update this bullet when ${11} is implemented.
9141 The @code{divert} (@pxref{Divert}) macro can manage more than 9
9142 diversions. GNU @code{m4} treats all positive numbers as valid
9143 diversions, rather than discarding diversions greater than 9.
9146 Files included with @code{include} and @code{sinclude} are sought in a
9147 user-specified search path, if they are not found in the working
9148 directory. The search path is specified by the @option{-I} option and the
9149 @env{M4PATH} environment variable (@pxref{Search Path}).
9152 Arguments to @code{undivert} can be non-numeric, in which case the named
9153 file will be included uninterpreted in the output (@pxref{Undivert}).
9156 Formatted output is supported through the @code{format} builtin, which
9157 is modeled after the C library function @code{printf} (@pxref{Format}).
9160 Searches and text substitution through regular expressions are supported
9161 by the @code{regexp} (@pxref{Regexp}) and @code{patsubst}
9162 (@pxref{Patsubst}) builtins.
9164 The syntax of regular expressions in M4 has never been clearly
9165 formalized. While OpenBSD M4 uses extended regular
9166 expressions for @code{regexp} and @code{patsubst}, GNU M4
9167 defaults to basic regular expressions, but provides
9168 @code{changeresyntax} (@pxref{Changeresyntax}) to change the flavor of
9169 regular expression syntax in use.
9172 The output of shell commands can be read into @code{m4} with
9173 @code{esyscmd} (@pxref{Esyscmd}).
9176 There is indirect access to any builtin macro with @code{builtin}.
9180 Macros can be called indirectly through @code{indir} (@pxref{Indir}).
9183 The name of the program, the current input file, and the current input
9184 line number are accessible through the builtins @code{@w{__program__}},
9185 @code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}).
9188 The generation of sync lines can be controlled through @code{syncoutput}
9189 (@pxref{Syncoutput}).
9192 The format of the output from @code{dumpdef} and macro tracing can be
9193 controlled with @code{debugmode} (@pxref{Debugmode}).
9196 The destination of trace and debug output can be controlled with
9197 @code{debugfile} (@pxref{Debugfile}).
9200 The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp},
9201 creating a new file with a unique name on every invocation, rather than
9202 following the insecure behavior of replacing the trailing @samp{X}
9203 characters with the @code{m4} process id. POSIX does not
9204 allow this extension, so @code{maketemp} is insecure if
9205 @env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in the first place.
9209 POSIX only requires support for the command line options
9210 @option{-s}, @option{-D}, and @option{-U}, so all other options accepted
9211 by GNU M4 are extensions. @xref{Invoking m4}, for a
9212 description of these options.
9215 The debugging and tracing facilities in GNU @code{m4} are much
9216 more extensive than in most other versions of @code{m4}.
9219 Some traditional implementations only allow reading standard input
9220 once, but GNU @code{m4} correctly handles multiple instances
9221 of @samp{-} on the command line.
9224 POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
9225 (first-in, first-out) order, and most other implementations obey this.
9226 However, versions of GNU @code{m4} earlier than 1.6 used
9227 LIFO order. Furthermore, POSIX states that only the first
9228 argument to @code{m4wrap} is saved for later evaluation, but
9229 GNU @code{m4} saves and processes all arguments, with output
9230 separated by spaces.
9233 POSIX states that builtins that require arguments, but are
9234 called without arguments, have undefined behavior. Traditional
9235 implementations simply behave as though empty strings had been passed.
9236 For example, @code{a`'define`'b} would expand to @code{ab}. But
9237 GNU @code{m4} ignores certain builtins if they have missing
9238 arguments, giving @code{adefineb} for the above example.
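
A few of the extensions listed above can be tried directly. The
following is a minimal sketch, assuming GNU @code{m4} running without
@option{-G} and without @env{POSIXLY_CORRECT} in the environment; the
macro name @samp{tenth} is made up for the illustration.

@comment ignore
@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
format(`%05d', `42')
@result{}00042
patsubst(`hello world', `o', `0')
@result{}hell0 w0rld
indir(`len', `abc')
@result{}3
builtin(`len', `abcd')
@result{}4
@end example

An implementation that accepts only one digit after @samp{$} would
read @samp{$10} as the first argument followed by a literal @samp{0}.
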
9241 @node Incompatibilities
9242 @section Other incompatibilities
9244 There are a few other incompatibilities between this implementation of
9245 @code{m4}, and what POSIX requires, or what the System V
9246 version implemented.
9250 Traditional implementations handle @code{define(`f',`1')} (@pxref{Define})
9251 by undefining the entire stack of previous definitions, as if doing
9252 @code{undefine(`f')} first. GNU @code{m4} replaces just the top
9253 definition on the stack, as if doing @code{popdef(`f')} followed by
9254 @code{pushdef(`f',`1')}. POSIX allows either behavior.
9257 At one point, POSIX required @code{changequote(@var{arg})}
9258 (@pxref{Changequote}) to use newline as the close quote, but this was a
9259 bug, and the next version of POSIX is anticipated to state
9260 that using empty strings or just one argument is unspecified.
9261 Meanwhile, the GNU @code{m4} behavior of treating an empty
9262 end-quote delimiter as @samp{'} is not portable, as Solaris treats it as
9263 repeating the start-quote delimiter, and BSD treats it as leaving the
9264 previous end-quote delimiter unchanged. For predictable results, never
9265 call @code{changequote} with just one argument, or with empty strings for arguments.
9269 At one point, POSIX required @code{changecom(@var{arg},)}
9270 (@pxref{Changecom}) to make it impossible to end a comment, but this is
9271 a bug, and the next version of POSIX is anticipated to state
9272 that using empty strings is unspecified. Meanwhile, the GNU
9273 @code{m4} behavior of treating an empty end-comment delimiter as newline
9274 is not portable, as BSD treats it as leaving the previous end-comment
9275 delimiter unchanged. It is also impossible in BSD implementations to
9276 disable comments, even though that is required by POSIX. For
9277 predictable results, never call @code{changecom} with empty strings for arguments.
9281 Traditional implementations allow argument collection, but not string
9282 and comment processing, to span file boundaries. Thus, if @file{a.m4}
9283 contains @samp{len(}, and @file{b.m4} contains @samp{abc)},
9284 @kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but
9285 gives an error message that the end of file was encountered inside a
9286 macro with GNU @code{m4}. On the other hand, traditional
9287 implementations do end of file processing for files included with
9288 @code{include} or @code{sinclude} (@pxref{Include}), while GNU
9289 @code{m4} seamlessly integrates the content of those files. Thus
9290 @code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of giving an error.
9294 POSIX requires @code{eval} (@pxref{Eval}) to treat all
9295 operators with the same precedence as C@. However, earlier versions of
9296 GNU @code{m4} followed the traditional behavior of other
9297 @code{m4} implementations, where bitwise and logical negation (@samp{~}
9298 and @samp{!}) have lower precedence than equality operators; and where
9299 equality operators (@samp{==} and @samp{!=}) had the same precedence as
9300 relational operators (such as @samp{<}). Use explicit parentheses to
9301 ensure proper precedence. As extensions to POSIX,
9302 GNU @code{m4} gives well-defined semantics to operations that
9303 C leaves undefined, such as when overflow occurs, when shifting negative
9304 numbers, or when performing division by zero. POSIX also
9305 requires @samp{=} to cause an error, but many traditional
9306 implementations allowed it as an alias for @samp{==}.
9309 POSIX 2001 requires @code{translit} (@pxref{Translit}) to
9310 treat each character of the second and third arguments literally.
9311 However, it is anticipated that the next version of POSIX will
9312 allow the GNU @code{m4} behavior of treating @samp{-} as a range operator.
9316 POSIX requires @code{m4} to honor the locale environment
9317 variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
9318 @env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
9319 implemented in GNU @code{m4}.
9322 GNU @code{m4} implements sync lines differently from System V
9323 @code{m4}, when text is being diverted. GNU @code{m4} outputs
9324 the sync lines when the text is being diverted, whereas System V @code{m4}
9325 outputs them when the diverted text is being brought back.
9327 The problem is which lines and file names should be attached to text
9328 that is being, or has been, diverted. System V @code{m4} regards all
9329 the diverted text as being generated by the source line containing the
9330 @code{undivert} call, whereas GNU @code{m4} regards the
9331 diverted text as being generated at the time it is diverted.
9333 The sync line option is used mostly when using @code{m4} as
9334 a front end to a compiler. If a diverted line causes a compiler error,
9335 the error messages should most probably refer to the place where the
9336 diversion was made, and not where it was inserted again.
9338 @comment options: -s
9343 @result{}#line 3 "stdin"
9346 @result{}#line 2 "stdin"
9348 @result{}#line 1 "stdin"
9352 @comment FIXME - this needs to be fixed before 2.0.
9353 The current @code{m4} implementation has a limitation that the syncline
9354 output at the start of each diversion occurs no matter what, even if the
9355 previous diversion did not end with a newline. This goes contrary to
9356 the claim that synclines appear on a line by themselves, so this
9357 limitation may be corrected in a future version of @code{m4}. In the
9358 meantime, when using @option{-s}, it is wisest to make sure all
9359 diversions end with newline.
9362 GNU @code{m4} makes no attempt at prohibiting self-referential definitions such as @code{define(`x', `x')}.
9374 There is nothing inherently wrong with defining @samp{x} to
9375 return @samp{x}. The wrong thing is to expand @samp{x} unquoted,
9376 because that would cause an infinite rescan loop.
9377 In @code{m4}, one might use macros to hold strings, as we do for
9378 variables in other programming languages, further checking them with:
9382 ifelse(defn(`@var{holder}'), `@var{value}', @dots{})
9386 In cases like this one, an interdiction for a macro to hold its own name
9387 would be a useless limitation. Of course, this leaves more rope for the
9388 GNU @code{m4} user to hang himself! Rescanning hangs may be
9389 avoided through careful programming, a little like for endless loops in
9390 traditional programming languages.
9393 POSIX states that only unquoted leading newlines and blanks
9394 (that is, space and tab) are ignored when collecting macro arguments.
9395 However, this appears to be a bug in POSIX, since most
9396 traditional implementations also ignore all whitespace (formfeed,
9397 carriage return, and vertical tab). GNU @code{m4} follows
9398 tradition and ignores all leading unquoted whitespace.
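
Two of the points above lend themselves to a short demonstration. The
sketch below assumes GNU @code{m4}; the macro name @samp{f} and its
values are hypothetical. It shows @code{define} replacing only the top
of a @code{pushdef} stack, and explicit parentheses in @code{eval}
making the result independent of the precedence rules in use.

@comment ignore
@example
define(`f', `old')
@result{}
pushdef(`f', `new')
@result{}
define(`f', `newer')
@result{}
f
@result{}newer
popdef(`f')
@result{}
f
@result{}old
eval(`(!2) == 1')
@result{}0
eval(`!(2 == 1)')
@result{}1
@end example

An implementation that follows the traditional @code{define} behavior
would instead leave @samp{f} undefined after the @code{popdef}.
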
9402 @section Experimental features in GNU M4
9404 Certain features of GNU @code{m4} are experimental.
9406 Some are only available if activated by an option given to
9407 @file{m4-@value{VERSION}/@/configure} at GNU @code{m4} installation
9408 time. The functionality
9409 might change or even go away in the future. @emph{Do not rely on it}.
9410 Please direct your comments about it the same way you would do for bugs.
9412 @section Changesyntax
9414 An experimental feature, which improves the flexibility of @code{m4},
9415 allows for changing the way the input is parsed (@pxref{Changesyntax}).
9416 No compile time option is needed for @code{changesyntax}. The
9417 implementation is careful to not slow down @code{m4} parsing, unlike the
9418 withdrawn experiment of @code{changeword} that appeared in earlier versions of M4.
9421 @section Multiple precision arithmetic
9423 Another experimental feature, which would improve @code{m4} usefulness,
9424 allows for multiple precision rational arithmetic similar to
9425 @code{eval}. You must have the GNU multi-precision (gmp)
9426 library installed, and should use @kbd{./configure --with-gmp} if you
9427 want this feature compiled in. The current implementation is unproven
9428 and might go away. Do not count on it yet.
9431 @chapter Correct version of some examples
9433 Some of the examples in this manual are buggy or not very robust, for
9434 demonstration purposes. Improved versions of these composite macros are presented here.
9438 * Improved exch:: Solution for @code{exch}
9439 * Improved forloop:: Solution for @code{forloop}
9440 * Improved foreach:: Solution for @code{foreach}
9441 * Improved copy:: Solution for @code{copy}
9442 * Improved m4wrap:: Solution for @code{m4wrap}
9443 * Improved cleardivert:: Solution for @code{cleardivert}
9444 * Improved capitalize:: Solution for @code{capitalize}
9445 * Improved fatal_error:: Solution for @code{fatal_error}
9449 @section Solution for @code{exch}
9451 The @code{exch} macro (@pxref{Arguments}) as presented requires clients
9452 to double quote their arguments. A nicer definition, which lets
9453 clients follow the rule of thumb of one level of quoting per level of
9454 parentheses, involves adding quotes in the definition of @code{exch}, as follows:
9458 define(`exch', ``$2', `$1'')
9460 define(exch(`expansion text', `macro'))
9463 @result{}expansion text
9466 @node Improved forloop
9467 @section Solution for @code{forloop}
9469 The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go
9470 into an infinite loop if given an iterator that is not parsed as a macro
9471 name. It does not do any sanity checking on its numeric bounds, and
9472 only permits decimal numbers for bounds. Here is an improved version,
9473 shipped as @file{m4-@value{VERSION}/@/doc/examples/@/forloop2.m4}; this
9474 version also optimizes overhead by calling four macros instead of six
9475 per iteration (excluding those in @var{text}), by not dereferencing the
9476 @var{iterator} in the helper @code{@w{_forloop}}.
9480 $ @kbd{m4 -I doc/examples}
9481 undivert(`forloop2.m4')dnl
9482 @result{}divert(`-1')
9483 @result{}# forloop(var, from, to, stmt) - improved version:
9484 @result{}# works even if VAR is not a strict macro name
9485 @result{}# performs sanity check that FROM is larger than TO
9486 @result{}# allows complex numerical expressions in TO and FROM
9487 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9488 @result{} `pushdef(`$1')_$0(`$1', eval(`$2'),
9489 @result{} eval(`$3'), `$4')popdef(`$1')')')
9490 @result{}define(`_forloop',
9491 @result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `',
9492 @result{} `$0(`$1', incr(`$2'), `$3', `$4')')')
9493 @result{}divert`'dnl
9494 include(`forloop2.m4')
9496 forloop(`i', `2', `1', `no iteration occurs')
9498 forloop(`', `1', `2', ` odd iterator name')
9499 @result{} odd iterator name odd iterator name
9500 forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')')
9501 @result{} 0xa 0xb 0xc
9502 forloop(`i', `a', `b', `non-numeric bounds')
9503 @error{}m4:stdin:6: warning: eval: bad input: '(a) <= (b)'
9507 One other change to notice is that the improved version used @samp{_$0}
9508 rather than @samp{_forloop} to invoke the helper routine. In general,
9509 this is a good practice to follow, because then the set of macros can be
9510 uniformly transformed. The following example shows a transformation
9511 that doubles the current quoting and appends a suffix @samp{2} to each
9512 transformed macro. If @code{forloop} refers to the literal
9513 @samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of
9514 the intended @code{_forloop2}, and the mixing of quoting paradigms leads
9515 to an infinite recursion loop in this example.
9517 @comment options: -L9
9521 $ @kbd{m4 -d -L 9 -I doc/examples}
9522 define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4')
9524 define(`double', `define(`$1'`2',
9525 arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))')
9527 double(`forloop')double(`_forloop')defn(`forloop2')
9528 @result{}ifelse(eval(``($2) <= ($3)''), ``1'',
9529 @result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''),
9530 @result{} eval(``$3''), ``$4'')popdef(``$1'')'')
9531 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9533 changequote(`[', `]')changequote([``], [''])
9535 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9537 changequote`'include(`forloop.m4')
9539 double(`forloop')double(`_forloop')defn(`forloop2')
9540 @result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'')
9541 forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)')
9543 changequote(`[', `]')changequote([``], [''])
9545 forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'')
9546 @error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it
9549 One more optimization is still possible. Instead of repeatedly
9550 assigning a variable then invoking or dereferencing it, it is possible
9551 to pass the current iterator value as a single argument. Coupled with
9552 @code{curry} if other arguments are needed (@pxref{Composition}), or
9553 with helper macros if the argument is needed in more than one place in
9554 the expansion, the output can be generated with three, rather than four,
9555 macros of overhead per iteration. Notice how the file
9556 @file{m4-@value{VERSION}/@/doc/examples/@/forloop3.m4} rearranges the
9557 arguments of the helper @code{_forloop} to take two arguments that are
9558 placed around the current value. By splitting a balanced set of
9559 parentheses across multiple arguments, the helper macro can now be
9560 shared by @code{forloop} and the new @code{forloop_arg}.
9564 $ @kbd{m4 -I doc/examples}
9565 include(`forloop3.m4')
9567 undivert(`forloop3.m4')dnl
9568 @result{}divert(`-1')
9569 @result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for
9570 @result{}# each value between FROM and TO, without define overhead
9571 @result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1',
9572 @result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')')
9573 @result{}# forloop(var, from, to, stmt) - refactored to share code
9574 @result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1',
9575 @result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'),
9576 @result{} `define(`$1',', `)$4')popdef(`$1')')')
9577 @result{}define(`_forloop',
9578 @result{} `$3`$1'$4`'ifelse(`$1', `$2', `',
9579 @result{} `$0(incr(`$1'), `$2', `$3', `$4')')')
9580 @result{}divert`'dnl
9581 forloop(`i', `1', `3', ` i')
9583 define(`echo', `$@@')
9585 forloop_arg(`1', `3', ` echo')
9589 forloop_arg(`1', `3', `curry(`pushdef', `a')')
9601 Of course, it is possible to make even more improvements, such as
9602 adding an optional step argument, or allowing iteration through
9603 descending sequences. GNU Autoconf provides some of these
9604 additional bells and whistles in its @code{m4_for} macro.
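
As one illustration of the first suggestion, the following sketch adds a
positive step argument, following the structure of @file{forloop2.m4}.
The names @code{forloop_step} and @code{@w{_forloop_step}} are
hypothetical and are not shipped with @code{m4}; descending sequences
are deliberately not handled, so a step that is not positive produces no
iterations.

@comment ignore
@example
define(`forloop_step', `ifelse(eval(`($4) > 0 && ($2) <= ($3)'), `1',
  `pushdef(`$1')_$0(`$1', eval(`$2'), eval(`$3'), eval(`$4'),
    `$5')popdef(`$1')')')
@result{}
define(`_forloop_step',
  `define(`$1', `$2')$5`'ifelse(eval(`$2 + $4 <= $3'), `1',
    `$0(`$1', eval(`$2 + $4'), `$3', `$4', `$5')')')
@result{}
forloop_step(`i', `1', `10', `3', ` i')
@result{} 1 4 7 10
@end example

The helper recurses only while the next value stays within the upper
bound, so the loop also terminates when the step does not land exactly
on the bound.
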
9606 @node Improved foreach
9607 @section Solution for @code{foreach}
9609 The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as
9610 presented earlier each have flaws. First, we will examine and fix the
9611 quadratic behavior of @code{foreachq}:
9615 $ @kbd{m4 -I doc/examples}
9616 include(`foreachq.m4')
9618 traceon(`shift')debugmode(`aq')
9620 foreachq(`x', ``1', `2', `3', `4'', `x
9623 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9624 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9626 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9627 @error{}m4trace: -3- shift(`2', `3', `4')
9628 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9629 @error{}m4trace: -2- shift(`2', `3', `4')
9631 @error{}m4trace: -5- shift(`1', `2', `3', `4')
9632 @error{}m4trace: -4- shift(`2', `3', `4')
9633 @error{}m4trace: -3- shift(`3', `4')
9634 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9635 @error{}m4trace: -3- shift(`2', `3', `4')
9636 @error{}m4trace: -2- shift(`3', `4')
9638 @error{}m4trace: -6- shift(`1', `2', `3', `4')
9639 @error{}m4trace: -5- shift(`2', `3', `4')
9640 @error{}m4trace: -4- shift(`3', `4')
9641 @error{}m4trace: -3- shift(`4')
9644 @cindex quadratic behavior, avoiding
9645 @cindex avoiding quadratic behavior
9646 Each successive iteration was adding more quoted @code{shift}
9647 invocations, and the entire list contents were passing through every
9648 iteration. In general, when recursing, it is a good idea to make the
9649 recursion use fewer arguments, rather than adding additional quoted
9650 uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes
9651 fewer macros, is less likely to run into machine limits, and most
9652 importantly, performs faster. The fixed version of @code{foreachq} can
9653 be found in @file{m4-@value{VERSION}/@/doc/examples/@/foreachq2.m4}:
9657 $ @kbd{m4 -I doc/examples}
9658 include(`foreachq2.m4')
9660 undivert(`foreachq2.m4')dnl
9661 @result{}include(`quote.m4')dnl
9662 @result{}divert(`-1')
9663 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9664 @result{}# quoted list, improved version
9665 @result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')')
9666 @result{}define(`_arg1q', ``$1'')
9667 @result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')')
9668 @result{}define(`_foreachq', `ifelse(`$2', `', `',
9669 @result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
9670 @result{}divert`'dnl
9671 traceon(`shift')debugmode(`aq')
9673 foreachq(`x', ``1', `2', `3', `4'', `x
9676 @error{}m4trace: -3- shift(`1', `2', `3', `4')
9678 @error{}m4trace: -3- shift(`2', `3', `4')
9680 @error{}m4trace: -3- shift(`3', `4')
9684 Note that the fixed version calls unquoted helper macros in
9685 @code{@w{_foreachq}} to trim elements immediately; those helper macros
9686 in turn must re-supply the layer of quotes lost in the macro invocation.
9687 Contrast the use of @code{@w{_arg1q}}, which quotes the first list
9688 element, with @code{@w{_arg1}} of the earlier implementation that
9689 returned the first list element directly. Additionally, by calling the
9690 helper method immediately, the @samp{defn(`@var{iterator}')} no longer
9691 contains unexpanded macros.
9693 The astute m4 programmer might notice that the solution above still uses
9694 more macro invocations than strictly necessary. Note that @samp{$2},
9695 which contains an arbitrarily long quoted list, is expanded and
9696 rescanned three times per iteration of @code{_foreachq}. Furthermore,
9697 every iteration of the algorithm effectively unboxes then reboxes the
9698 list, which costs a couple of macro invocations. It is possible to
9699 rewrite the algorithm by swapping the order of the arguments to
9700 @code{_foreachq} in order to operate on an unboxed list in the first
9701 place, and by using the fixed-length @samp{$#} instead of an arbitrary
9702 length list as the key to end recursion. The result is an overhead of
9703 six macro invocations per loop (excluding any macros in @var{text}),
9704 instead of eight. This alternative approach is available as
9705 @file{m4-@value{VERSION}/@/doc/examples/@/foreachq3.m4}:
9709 $ @kbd{m4 -I doc/examples}
9710 include(`foreachq3.m4')
9712 undivert(`foreachq3.m4')dnl
9713 @result{}divert(`-1')
9714 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9715 @result{}# quoted list, alternate improved version
9716 @result{}define(`foreachq', `ifelse(`$2', `', `',
9717 @result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
9718 @result{}define(`_foreachq', `ifelse(`$#', `3', `',
9719 @result{} `define(`$1', `$4')$2`'$0(`$1', `$2',
9720 @result{} shift(shift(shift($@@))))')')
9721 @result{}divert`'dnl
9722 traceon(`shift')debugmode(`aq')
9724 foreachq(`x', ``1', `2', `3', `4'', `x
9727 @error{}m4trace: -4- shift(`x', `x
9728 @error{}', `', `1', `2', `3', `4')
9729 @error{}m4trace: -3- shift(`x
9730 @error{}', `', `1', `2', `3', `4')
9731 @error{}m4trace: -2- shift(`', `1', `2', `3', `4')
9733 @error{}m4trace: -4- shift(`x', `x
9734 @error{}', `1', `2', `3', `4')
9735 @error{}m4trace: -3- shift(`x
9736 @error{}', `1', `2', `3', `4')
9737 @error{}m4trace: -2- shift(`1', `2', `3', `4')
9739 @error{}m4trace: -4- shift(`x', `x
9740 @error{}', `2', `3', `4')
9741 @error{}m4trace: -3- shift(`x
9742 @error{}', `2', `3', `4')
9743 @error{}m4trace: -2- shift(`2', `3', `4')
9745 @error{}m4trace: -4- shift(`x', `x
9746 @error{}', `3', `4')
9747 @error{}m4trace: -3- shift(`x
9748 @error{}', `3', `4')
9749 @error{}m4trace: -2- shift(`3', `4')
9752 Prior to M4 1.6, every instance of @samp{$@@} was rescanned as it was
9753 encountered. Thus, the @file{foreachq3.m4} alternative used much less
9754 memory than @file{foreachq2.m4}, and executed as much as 10% faster,
9755 since each iteration encountered fewer @samp{$@@}. However, the
9756 implementation of rescanning every byte in @samp{$@@} was quadratic in
9757 the number of bytes scanned (for example, making the broken version in
9758 @file{foreachq.m4} cubic, rather than quadratic, in behavior). Once the
9759 underlying M4 implementation was improved in 1.6 to reuse results of
9760 previous scans, both styles of @code{foreachq} became linear in the
9761 number of bytes scanned, but the @file{foreachq3.m4} version remains
9762 noticeably faster because of fewer macro invocations. Notice how the
9763 implementation injects an empty argument prior to expanding @samp{$2}
9764 within @code{foreachq}; the helper macro @code{_foreachq} then ignores
9765 the third argument altogether, and ends recursion when there are three
9766 arguments left because there was nothing left to pass through
9767 @code{shift}. Thus, each iteration only needs one @code{ifelse}, rather
9768 than the two conditionals used in the version from @file{foreachq2.m4}.
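
The effect of the injected empty argument is easiest to see on the
boundary cases. The following is a minimal sketch, assuming the
@file{foreachq3.m4} definitions shown above are found on the include
path; it contrasts an empty list, a list holding one empty element, and
an ordinary list.

@example
$ @kbd{m4 -I doc/examples}
include(`foreachq3.m4')
@result{}
foreachq(`x', `', `<x>')
@result{}
foreachq(`x', ``'', `<x>')
@result{}<>
foreachq(`x', ``a', `b', `c'', `<x>')
@result{}<a><b><c>
@end example

In the second call, @code{_foreachq} starts with four arguments and so
performs one iteration; in the first call the outer @code{ifelse} never
reaches the helper at all.
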
9770 @cindex nine arguments, more than
9771 @cindex more than nine arguments
9772 @cindex arguments, more than nine
9773 So far, all of the implementations of @code{foreachq} presented have
9774 been quadratic with M4 1.4.x. But @code{forloop} is linear, because
9775 each iteration parses a constant number of arguments. So, it is
9776 possible to design a variant that uses @code{forloop} to do the
9777 iteration, then uses @samp{$@@} only once at the end, giving a linear
9778 result even with older M4 implementations. This implementation relies
9779 on the GNU extension that @samp{$10} expands to the tenth
9780 argument rather than the first argument concatenated with @samp{0}. The
9781 trick is to define an intermediate macro that repeats the text
9782 @code{m4_define(`$1', `$@var{n}')$2`'}, with @samp{n} set to successive
9783 integers corresponding to each argument. The helper macro
9784 @code{_foreachq_} is needed in order to generate the literal sequences
9785 such as @samp{$1} into the intermediate macro, rather than expanding
9786 them as the arguments of @code{_foreachq}. With this approach, no
9787 @code{shift} calls are even needed! However, when linear recursion is
9788 available in new enough M4, the time and memory costs of using
9789 @code{forloop} to build an intermediate macro outweigh the costs of any
9790 of the previous implementations (there are seven macros of overhead per
9791 iteration instead of six in @file{foreachq3.m4}, and the entire
9792 intermediate macro must be built in memory before any iteration is
9793 expanded). Additionally, this approach will need adjustment when a
9794 future version of M4 follows POSIX by no longer treating
9795 @samp{$10} as the tenth argument; the anticipation is that
9796 @samp{$@{10@}} can be used instead, although that alternative syntax is
9797 not yet supported.
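
Both ingredients of this approach can be tried on their own. The sketch
below assumes a GNU M4 in which @samp{$10} still names the tenth
argument; @code{tenth} and @code{gen} are hypothetical names used only
for illustration. The first macro shows the multi-digit parameter
reference, and the second shows how @samp{$$1} in a definition yields a
literal @samp{$} followed by the expansion of the first argument.

@example
define(`tenth', `$10')
@result{}
tenth(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j')
@result{}j
define(`gen', ``define(`$$1', `$$3')$$2`''')
@result{}
gen(`1', `2', `4')
@result{}define(`$1', `$4')$2`'
@end example

The output of @code{gen} is inert text here only because it is emitted
inside quotes; collected into a macro definition, the same text becomes
live parameter references.
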
9801 $ @kbd{m4 -I doc/examples}
9802 include(`foreachq4.m4')
9804 undivert(`foreachq4.m4')dnl
9805 @result{}include(`forloop2.m4')dnl
9806 @result{}divert(`-1')
9807 @result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt)
9808 @result{}# quoted list, version based on forloop
9809 @result{}define(`foreachq',
9810 @result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
9811 @result{}define(`_foreachq',
9812 @result{}`pushdef(`$1', forloop(`$1', `3', `$#',
9813 @result{} `$0_(`1', `2', indir(`$1'))')`popdef(
9814 @result{} `$1')')indir(`$1', $@@)')
9815 @result{}define(`_foreachq_',
9816 @result{}``define(`$$1', `$$3')$$2`''')
9817 @result{}divert`'dnl
9818 traceon(`shift')debugmode(`aq')
9820 foreachq(`x', ``1', `2', `3', `4'', `x
9828 For yet another approach, the improved version of @code{foreach},
9829 available in @file{m4-@value{VERSION}/@/doc/examples/@/foreach2.m4},
9830 simply overquotes the arguments to @code{@w{_foreach}} to begin with,
9831 using @code{dquote_elt}. Then @code{@w{_foreach}} can just use
9832 @code{@w{_arg1}} to remove the extra layer of quoting that was added up
9833 front:
9837 $ @kbd{m4 -I doc/examples}
9838 include(`foreach2.m4')
9840 undivert(`foreach2.m4')dnl
9841 @result{}include(`quote.m4')dnl
9842 @result{}divert(`-1')
9843 @result{}# foreach(x, (item_1, item_2, ..., item_n), stmt)
9844 @result{}# parenthesized list, improved version
9845 @result{}define(`foreach', `pushdef(`$1')_$0(`$1',
9846 @result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')')
9847 @result{}define(`_arg1', `$1')
9848 @result{}define(`_foreach', `ifelse(`$2', `(`')', `',
9849 @result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
9850 @result{}divert`'dnl
9851 traceon(`shift')debugmode(`aq')
9853 foreach(`x', `(`1', `2', `3', `4')', `x
9855 @error{}m4trace: -4- shift(`1', `2', `3', `4')
9856 @error{}m4trace: -4- shift(`2', `3', `4')
9857 @error{}m4trace: -4- shift(`3', `4')
9859 @error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
9861 @error{}m4trace: -3- shift(``2'', ``3'', ``4'')
9863 @error{}m4trace: -3- shift(``3'', ``4'')
9865 @error{}m4trace: -3- shift(``4'')
9868 It is likewise possible to write a variant of @code{foreach} that
9869 performs in linear time on M4 1.4.x; the easiest method is probably
9870 writing a version of @code{foreach} that unboxes its list, then invokes
9871 @code{_foreachq} as previously defined in @file{foreachq4.m4}.
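
One way such a wrapper might look is sketched below. It assumes the
definitions from @file{foreachq4.m4} are found on the include path; the
helper name @code{_unbox} is hypothetical and is not part of the shipped
examples.

@example
$ @kbd{m4 -I doc/examples}
include(`foreachq4.m4')
@result{}
define(`_unbox', `$@@')dnl
define(`foreach',
`ifelse(`$2', `', `', `_foreachq(`$1', `$3', _unbox$2)')')dnl
foreach(`x', `(`a', `b', `c')', `<x>')
@result{}<a><b><c>
@end example

Appending the parenthesized list directly to @code{_unbox} turns it into
a macro call whose @samp{$@@} expansion supplies the individual quoted
elements to @code{_foreachq}.
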
9873 @cindex filtering defined symbols
9874 @cindex subset of defined symbols
9875 @cindex defined symbols, filtering
9876 With a robust @code{foreachq} implementation, it is possible to create a
9877 filter on a list of defined symbols. This next example will find all
9878 symbols that contain @samp{if} or @samp{def}, via two different
9879 approaches. In the first approach, @code{dquote_elt} is used to
9880 overquote each list element, then @code{dquote} forms the list; that
9881 way, the iterator @code{macro} can be expanded in place because its
9882 contents are already quoted. This approach also uses a self-modifying
9883 macro @code{sep} to provide the correct number of commas. In the second
9884 approach, the iterator @code{macro} contains live text, so it must be
9885 used with @code{defn} to avoid unintentional expansion. The correct
9886 number of commas is achieved by using @code{shift} to ignore the first
9887 one, although a leading space still remains.
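
The self-modifying separator idiom is worth a look by itself. In this
minimal sketch, the first use of @code{sep} expands to nothing but
redefines @code{sep}, so that every later use produces a comma and a
space.

@example
pushdef(`sep', `define(`sep', ``, '')')
@result{}
sep`a'sep`b'sep`c'
@result{}a, b, c
popdef(`sep')
@result{}
@end example
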
9891 $ @kbd{m4 -I doc/examples}
9892 include(`quote.m4')include(`foreachq2.m4')
9894 pushdef(`sep', `define(`sep', ``, '')')
9896 foreachq(`macro', dquote(dquote_elt(m4symbols)),
9897 `regexp(macro, `.*if.*', `sep`\&'')')
9898 @result{}ifdef, ifelse, shift
9901 shift(foreachq(`macro', dquote(m4symbols),
9902 `regexp(defn(`macro'), `def', `,` ''dquote(defn(`macro')))'))
9903 @result{} define, defn, dumpdef, ifdef, popdef, pushdef, undefine
9906 In summary, recursion over list elements is trickier than it appeared at
9907 first glance, but provides a powerful idiom within @code{m4} processing.
9908 As a final demonstration, both list styles are now able to handle
9909 several scenarios that would wreak havoc on one or both of the original
9910 implementations. This points out one other difference between the
9911 list styles. @code{foreach} evaluates unquoted list elements only once,
9912 in preparation for calling @code{@w{_foreach}}; the same holds for
9913 @code{foreachq} as provided by @file{foreachq3.m4} or
9914 @file{foreachq4.m4}. But
9915 @code{foreachq}, as provided by @file{foreachq2.m4},
9916 evaluates unquoted list elements twice while visiting the first list
9917 element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When
9918 deciding which list style to use, one must take into account whether
9919 repeating the side effects of unquoted list elements will have any
9920 detrimental effects.
9924 $ @kbd{m4 -d -I doc/examples}
9925 include(`foreach2.m4')
9927 include(`foreachq2.m4')
9930 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
9932 dnl 1-element list of empty element
9933 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
9935 dnl 2-element list of empty elements
9936 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
9937 @result{}<><> / <><>
9938 dnl 1-element list of a comma
9939 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
9941 dnl 2-element list of unbalanced parentheses
9942 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
9943 @result{}<(><)> / <(><)>
9944 define(`ab', `oops')dnl using defn(`iterator')
9945 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
9946 foreachq(`x', ``a', `b'', `defn(`x')')
9948 define(`active', `ACT, IVE')
9952 dnl list of unquoted macros; expansion occurs before recursion
9953 foreach(`x', `(active, active)', `<x>
9955 @error{}m4trace: -4- active -> `ACT, IVE'
9956 @error{}m4trace: -4- active -> `ACT, IVE'
9961 foreachq(`x', `active, active', `<x>
9963 @error{}m4trace: -3- active -> `ACT, IVE'
9964 @error{}m4trace: -3- active -> `ACT, IVE'
9966 @error{}m4trace: -3- active -> `ACT, IVE'
9967 @error{}m4trace: -3- active -> `ACT, IVE'
9971 dnl list of quoted macros; expansion occurs during recursion
9972 foreach(`x', `(`active', `active')', `<x>
9974 @error{}m4trace: -1- active -> `ACT, IVE'
9976 @error{}m4trace: -1- active -> `ACT, IVE'
9978 foreachq(`x', ``active', `active'', `<x>
9980 @error{}m4trace: -1- active -> `ACT, IVE'
9982 @error{}m4trace: -1- active -> `ACT, IVE'
9984 dnl list of double-quoted macro names; no expansion
9985 foreach(`x', `(``active'', ``active'')', `<x>
9989 foreachq(`x', ```active'', ``active''', `<x>
9996 @section Solution for @code{copy}
9998 The macro @code{copy} presented above works with M4 1.6 and newer, but
9999 is unable to handle builtin tokens with M4 1.4.x, because it tries to
10000 pass the builtin token through the macro @code{curry}, where it is
10001 silently flattened to an empty string (@pxref{Composition}). Rather
10002 than using the problematic @code{curry} to work around the limitation
10003 that @code{stack_foreach} expects to invoke a macro that takes exactly
10004 one argument, we can write a new macro that lets us form the exact
10005 two-argument @code{pushdef} call sequence needed, so that we are no
10006 longer passing a builtin token through a text macro.
10008 @deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @
10009 @var{sep})
10010 @deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @
10011 @var{post}, @var{sep})
10012 For each of the @code{pushdef} definitions associated with @var{macro},
10013 expand the sequence @samp{@var{pre}`'definition`'@var{post}}.
10014 Additionally, expand @var{sep} between definitions.
10015 @code{stack_foreach_sep} visits the oldest definition first, while
10016 @code{stack_foreach_sep_lifo} visits the current definition first. The
10017 expansion may dereference @var{macro}, but should not modify it. There
10018 are a few special macros, such as @code{defn}, which cannot be used as
10019 the @var{macro} parameter.
10022 Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is
10023 equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(',
10024 `)')}. By supplying explicit parentheses, split among the @var{pre} and
10025 @var{post} arguments to @code{stack_foreach_sep}, it is now possible to
10026 construct macro calls with more than one argument, without passing
10027 builtin tokens through a macro call. It is likewise possible to
10028 directly reference the stack definitions without a macro call, by
10029 leaving @var{pre} and @var{post} empty. Thus, in addition to fixing
10030 @code{copy} on builtin tokens, it also executes with fewer macro
10031 invocations.
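
As an illustration of both points, the following sketch assumes
@file{stack_sep.m4} is found on the include path; the macros @code{s},
@code{show} and @code{pair} are hypothetical names chosen for the
demonstration. The first loop mimics a one-argument
@code{stack_foreach} action, and the second builds a two-argument call
by splitting the parentheses between @var{pre} and @var{post}.

@example
$ @kbd{m4 -I doc/examples}
include(`stack_sep.m4')
@result{}
pushdef(`s', `1')pushdef(`s', `2')pushdef(`s', `3')
@result{}
define(`show', `[$1]')define(`pair', `<$1=$2>')
@result{}
stack_foreach_sep(`s', `show(', `)')
@result{}[1][2][3]
stack_foreach_sep(`s', `pair(`k', ', `)', `; ')
@result{}<k=1>; <k=2>; <k=3>
@end example
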
10033 The new macro also adds a separator that is only output after the first
10034 iteration of the helper @code{_stack_reverse_sep}, implemented by
10035 prepending the original @var{sep} to @var{pre} and omitting a @var{sep}
10036 argument in subsequent iterations. Note that the empty string that
10037 separates @var{sep} from @var{pre} is provided as part of the fourth
10038 argument when originally calling @code{_stack_reverse_sep}, and not by
10039 writing @code{$4`'$3} as the third argument in the recursive call; while
10040 the other approach would give the same output, it does so at the expense
10041 of increasing the argument size on each iteration of
10042 @code{_stack_reverse_sep}, which results in quadratic instead of linear
10043 execution time. The improved stack walking macros are available in
10044 @file{m4-@value{VERSION}/@/doc/examples/@/stack_sep.m4}:
10048 $ @kbd{m4 -I doc/examples}
10049 include(`stack_sep.m4')
10051 define(`copy', `ifdef(`$2', `errprint(`$2 already defined
10053 `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl
10054 pushdef(`a', `1')pushdef(`a', defn(`divnum'))
10064 pushdef(`c', `1')pushdef(`c', `2')
10066 stack_foreach_sep_lifo(`c', `', `', `, ')
10068 undivert(`stack_sep.m4')dnl
10069 @result{}divert(`-1')
10070 @result{}# stack_foreach_sep(macro, pre, post, sep)
10071 @result{}# Invoke PRE`'defn`'POST with a single argument of each definition
10072 @result{}# from the definition stack of MACRO, starting with the oldest, and
10073 @result{}# separated by SEP between definitions.
10074 @result{}define(`stack_foreach_sep',
10075 @result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl
10076 @result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')')
10077 @result{}# stack_foreach_sep_lifo(macro, pre, post, sep)
10078 @result{}# Like stack_foreach_sep, but starting with the newest definition.
10079 @result{}define(`stack_foreach_sep_lifo',
10080 @result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl
10081 @result{}`_stack_reverse_sep(`tmp-$1', `$1')')
10082 @result{}define(`_stack_reverse_sep',
10083 @result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0(
10084 @result{} `$1', `$2', `$4$3')')')
10085 @result{}divert`'dnl
10088 @node Improved m4wrap
10089 @section Solution for @code{m4wrap}
10091 The replacement @code{m4wrap} versions presented above, designed to
10092 guarantee FIFO or LIFO order regardless of the underlying M4
10093 implementation, share a bug when dealing with wrapped text that looks
10094 like parameter expansion. Note how the invocation of
10095 @code{m4wrap@var{n}} interprets these parameters, while using the
10096 builtin preserves them for their intended use.
10100 $ @kbd{m4 -I doc/examples}
10101 include(`wraplifo.m4')
10103 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10106 builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
10110 @result{}m4wrap0:---0-
10111 @result{}bar:-a-a,b-2-
10114 Additionally, the computation of @code{_m4wrap_level} and creation of
10115 multiple @code{m4wrap@var{n}} placeholders in the original examples is
10116 more expensive in time and memory than strictly necessary. Notice how
10117 the improved version grabs the wrapped text via @code{defn} to avoid
10118 parameter expansion, then undefines @code{_m4wrap_text}, before
10119 stripping a level of quotes with @code{_arg1} to expand the text. That
10120 way, each level of wrapping reuses the single placeholder, which starts
10121 each nesting level in an undefined state.
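
The role of @code{defn} here can be checked with a small experiment.
This is a minimal sketch; @code{holder} is a hypothetical macro standing
in for the saved text, and @code{_arg1} is defined as in the improved
example.

@example
define(`holder', `-$1-$#-')define(`_arg1', `$1')
@result{}
holder
@result{}--0-
defn(`holder')
@result{}-$1-$#-
_arg1(defn(`holder'))
@result{}-$1-$#-
@end example

Invoking @code{holder} directly substitutes its parameters, while
fetching the text with @code{defn} and stripping the quotes through
@code{_arg1} leaves the parameter references untouched.
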
10123 Finally, it is worth emulating the GNU M4 extension of saving
10124 all arguments to @code{m4wrap}, separated by a space, rather than saving
10125 just the first argument. This is done with the @code{joinall} macro
10126 documented previously (@pxref{Shift}). The improved LIFO example is
10127 shipped as @file{m4-@value{VERSION}/@/doc/examples/@/wraplifo2.m4}, and
10128 can easily be converted to a FIFO solution by swapping the adjacent
10129 invocations of @code{joinall} and @code{defn}.
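
The choice of @code{joinall} over @code{join} matters when some
arguments are empty. The following sketch assumes the @file{join.m4}
example file described earlier is found on the include path.

@example
$ @kbd{m4 -I doc/examples}
include(`join.m4')
@result{}
join(`-', `a', `', `b')
@result{}a-b
joinall(`-', `a', `', `b')
@result{}a--b
@end example

Only @code{joinall} keeps a separator for the empty argument, which is
what is needed to emulate the builtin faithfully.
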
10133 $ @kbd{m4 -I doc/examples}
10134 include(`wraplifo2.m4')
10136 undivert(`wraplifo2.m4')dnl
10137 @result{}dnl Redefine m4wrap to have LIFO semantics, improved example.
10138 @result{}include(`join.m4')dnl
10139 @result{}define(`_m4wrap', defn(`m4wrap'))dnl
10140 @result{}define(`_arg1', `$1')dnl
10141 @result{}define(`m4wrap',
10142 @result{}`ifdef(`_$0_text',
10143 @result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))',
10144 @result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl
10145 @result{}define(`_$0_text', joinall(` ', $@@))')')dnl
10146 m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b')
10150 m4wrap(`nested', `', `$@@
10155 @result{}foo:-a-a,b-2-
10156 @result{}nested $@@
10159 @node Improved cleardivert
10160 @section Solution for @code{cleardivert}
10162 The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be
10163 called without arguments to clear all pending diversions. That is
10164 because using @code{undivert} with an empty string for an argument is different
10165 from using it with no arguments at all. Compare the earlier definition
10166 with one that takes the number of arguments into account:
10169 define(`cleardivert',
10170 `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')')
10180 define(`cleardivert',
10181 `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0',
10182 `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')')
10193 @node Improved capitalize
10194 @section Solution for @code{capitalize}
10196 The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does
10197 not allow clients to follow the quoting rule of thumb. Consider the
10198 three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the
10199 difference between calling @code{capitalize} with the expansion of a
10200 macro, expanding the result of a case change, and changing the case of a
10201 double-quoted string:
10205 $ @kbd{m4 -I doc/examples}
10206 include(`capitalize.m4')dnl
10207 define(`active', `act1, ive')dnl
10208 define(`Active', `Act2, Ive')dnl
10209 define(`ACTIVE', `ACT3, IVE')dnl
10220 downcase(``ACTIVE'')
10224 capitalize(`active')
10226 capitalize(``active'')
10227 @result{}_capitalize(`active')
10228 define(`A', `OOPS')
10232 capitalize(`active')
10236 First, when @code{capitalize} is called with more than one argument, it
10237 throws away the later arguments, whereas @code{upcase} and
10238 @code{downcase} use @samp{$*} to collect them all. The fix is simple:
10239 use @samp{$*} consistently.
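
The difference is easy to reproduce with two throwaway macros. In this
minimal sketch, @code{first} and @code{all} are hypothetical names that
are not part of the example files.

@example
define(`first', `$1')define(`all', `$*')
@result{}
first(`a', `b')
@result{}a
all(`a', `b')
@result{}a,b
@end example
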
10241 Next, with single-quoting, @code{capitalize} outputs a single character,
10242 a set of quotes, then the rest of the characters, making it impossible
10243 to invoke @code{Active} after the fact, and allowing the alternate macro
10244 @code{A} to interfere. Here, the solution is to use additional quoting
10245 in the helper macros, then pass the final over-quoted output string
10246 through @code{_arg1} to remove the extra quoting and finally invoke the
10247 concatenated portions as a single string.
10249 Finally, when passed a double-quoted string, the nested macro
10250 @code{_capitalize} is never invoked because it ended up nested inside
10251 quotes. This one is the toughest to fix. In short, we have no idea how
10252 many levels of quotes are in effect on the substring being altered by
10253 @code{patsubst}. If the replacement string cannot be expressed entirely
10254 in terms of literal text and backslash substitutions, then we need a
10255 mechanism to guarantee that the helper macros are invoked outside of
10256 quotes. In other words, this sounds like a job for @code{changequote}
10257 (@pxref{Changequote}). By changing the active quoting characters, we
10258 can guarantee that replacement text injected by @code{patsubst} always
10259 occurs in the middle of a string that has exactly one level of
10260 over-quoting using alternate quotes; so the replacement text closes the
10261 quoted string, invokes the helper macros, then reopens the quoted
10262 string. In turn, that means the replacement text has unbalanced quotes,
10263 necessitating another round of @code{changequote}.
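
The basic mechanics of switching to alternate quotes and back can be
sketched as follows, using the same delimiters that the fixed version
adopts. While the alternate quotes are active, the usual quote
characters are ordinary text.

@example
changequote(`<<[', `]>>')dnl
<<[`text' with ordinary quotes]>>
@result{}`text' with ordinary quotes
changequote(<<[`]>>, <<[']>>)dnl
`back to normal'
@result{}back to normal
@end example
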
10265 In the fixed version below (also shipped as
10266 @file{m4-@value{VERSION}/@/doc/examples/@/capitalize2.m4}),
10267 @code{capitalize} uses the alternate quotes of @samp{<<[} and @samp{]>>}
10268 (the longer strings are chosen so as to be less likely to appear in the
10269 text being converted). The helpers @code{_to_alt} and @code{_from_alt}
10270 merely reduce the number of characters required to perform a
10271 @code{changequote}, since the definition changes twice. The outermost
10272 pair means that @code{patsubst} and @code{_capitalize_alt} are invoked
10273 with alternate quoting; the innermost pair is used so that the third
10274 argument to @code{patsubst} can contain an unbalanced
10275 @samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase}
10276 must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since
10277 they contain nested quotes but are invoked with the alternate quoting
10278 scheme in effect.
10282 $ @kbd{m4 -I doc/examples}
10283 include(`capitalize2.m4')dnl
10284 define(`active', `act1, ive')dnl
10285 define(`Active', `Act2, Ive')dnl
10286 define(`ACTIVE', `ACT3, IVE')dnl
10287 define(`A', `OOPS')dnl
10288 capitalize(active; `active'; ``active''; ```actIVE''')
10289 @result{}Act1,Ive; Act2, Ive; Active; `Active'
10290 undivert(`capitalize2.m4')dnl
10291 @result{}divert(`-1')
10292 @result{}# upcase(text)
10293 @result{}# downcase(text)
10294 @result{}# capitalize(text)
10295 @result{}# change case of text, improved version
10296 @result{}define(`upcase', `translit(`$*', `a-z', `A-Z')')
10297 @result{}define(`downcase', `translit(`$*', `A-Z', `a-z')')
10298 @result{}define(`_arg1', `$1')
10299 @result{}define(`_to_alt', `changequote(`<<[', `]>>')')
10300 @result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
10301 @result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
10302 @result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
10303 @result{}define(`_capitalize_alt',
10304 @result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
10305 @result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
10306 @result{}define(`capitalize',
10307 @result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
10308 @result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
10309 @result{}divert`'dnl
10312 @node Improved fatal_error
10313 @section Solution for @code{fatal_error}
10315 The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions
10316 of GNU M4 earlier than 1.4.8, where invoking @code{@w{__file__}}
10317 (@pxref{Location}) inside @code{m4wrap} would result in an empty string,
10318 and @code{@w{__line__}} resulted in @samp{0} even though all files start
10319 at line 1. Furthermore, versions earlier than 1.4.6 did not support the
10320 @code{@w{__program__}} macro. If you want @code{fatal_error} to work
10321 across the entire 1.4.x release series, a better implementation would
10322 be:
10326 define(`fatal_error',
10327 `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl
10328 `:ifelse(__line__, `0', `',
10329 `__file__:__line__:')` fatal error: $*
10332 m4wrap(`divnum(`demo of internal message')
10333 fatal_error(`inside wrapped text')')
10336 @error{}m4:stdin:6: warning: divnum: extra arguments ignored: 1 > 0
10338 @error{}m4:stdin:6: fatal error: inside wrapped text
10341 @c ========================================================== Appendices
10343 @node Copying This Package
10344 @appendix How to make copies of the overall M4 package
10345 @cindex License, code
10347 This appendix covers the license for copying the source code of the
10348 overall M4 package. This manual is under a different set of
10349 restrictions, covered later (@pxref{Copying This Manual}).
10352 * GNU General Public License:: License for copying the M4 package
10355 @node GNU General Public License
10356 @appendixsec License for copying the M4 package
10357 @cindex GPL, GNU General Public License
10358 @cindex GNU General Public License
10359 @cindex General Public License (GPL), GNU
10360 @include gpl-3.0.texi
10362 @node Copying This Manual
10363 @appendix How to make copies of this manual
10364 @cindex License, manual
10366 This appendix covers the license for copying this manual. Note that
10367 some of the longer examples in this manual are also distributed in the
10368 directory @file{m4-@value{VERSION}/@/doc/examples/}, where a more
10369 permissive license is in effect when copying just the examples.
10372 * GNU Free Documentation License:: License for copying this manual
10375 @node GNU Free Documentation License
10376 @appendixsec License for copying this manual
10377 @cindex FDL, GNU Free Documentation License
10378 @cindex GNU Free Documentation License
10379 @cindex Free Documentation License (FDL), GNU
10380 @include fdl-1.3.texi
10383 @appendix Indices of concepts and macros
10386 * Macro index:: Index for all @code{m4} macros
10387 * Concept index:: Index for many concepts
10391 @appendixsec Index for all @code{m4} macros
10393 This index covers all @code{m4} builtins, as well as several useful
10394 composite macros. Each reference points to the place where the macro
10395 is first introduced.
10399 @node Concept index
10400 @appendixsec Index for many concepts
10406 @c Local Variables:
10408 @c ispell-local-dictionary: "american"
10409 @c indent-tabs-mode: nil
10410 @c whitespace-check-buffer-indent: nil