11 .IP "" "\w'\fB\\$1\ \ \fP'u"
15 .CT 1 files prog_other
17 awk \- pattern-directed scanning and processing language
41 for lines that match any of a set of patterns specified literally in
43 or in one or more files
48 there can be an associated action that will be performed
52 Each line is matched against the
53 pattern portion of every pattern-action statement;
54 the associated action is performed for each matched pattern.
57 means the standard input.
62 is treated as an assignment, not a filename,
63 and is executed at the time it would have been opened if it were a filename.
68 is an assignment to be done before
73 options may be present.
77 option defines the input field separator to be the regular expression
80 An input line is normally made up of fields separated by white space,
81 or by the regular expression
83 The fields are denoted
88 refers to the entire line.
91 is null, the input line is split into one field per character.
93 A pattern-action statement has the form:
95 .IB pattern " { " action " }
100 a missing pattern always matches.
101 Pattern-action statements are separated by newlines or semicolons.
103 An action is a sequence of statements.
104 A statement can be one of the following:
107 .ta \w'\f(CWdelete array[expression]\fR'u
111 if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
112 while(\fI expression \fP)\fI statement\fP
113 for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
114 for(\fI var \fPin\fI array \fP)\fI statement\fP
115 do\fI statement \fPwhile(\fI expression \fP)
118 {\fR [\fP\fI statement ... \fP\fR] \fP}
119 \fIexpression\fP #\fR commonly\fP\fI var = expression\fP
120 print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
121 printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
122 return\fR [ \fP\fIexpression \fP\fR]\fP
123 next #\fR skip remaining patterns on this input line\fP
124 nextfile #\fR skip rest of this file, open next, start at top\fP
125 delete\fI array\fP[\fI expression \fP] #\fR delete an array element\fP
126 delete\fI array\fP #\fR delete all elements of array\fP
127 exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; status is \fP\fIexpression\fP
133 Statements are terminated by
134 semicolons, newlines or right braces.
139 String constants are quoted \&\f(CW"\ "\fR,
140 with the usual C escapes recognized within.
141 Expressions take on string or numeric values as appropriate,
142 and are built using the operators
144 (exponentiation), and concatenation (indicated by white space).
147 ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
148 are also available in expressions.
149 Variables may be scalars, array elements
153 Variables are initialized to the null string.
154 Array subscripts may be any string,
155 not necessarily numeric;
156 this allows for a form of associative memory.
157 Multiple subscripts such as
159 are permitted; the constituents are concatenated,
160 separated by the value of
165 statement prints its arguments on the standard output
170 is present or on a pipe if
172 is present), separated by the current output field separator,
173 and terminated by the output record separator.
177 may be literal names or parenthesized expressions;
178 identical string values in different statements denote
182 statement formats its expression list according to the
186 The built-in function
188 closes the file or pipe
190 The built-in function
192 flushes any buffered output for the file or pipe
195 The mathematical functions
204 Other built-in functions:
208 the length of its argument
210 number of elements in an array for an array argument,
216 random number on [0,1).
221 and returns the previous seed.
224 truncates to an integer value.
226 \fBsubstr(\fIs\fB, \fIm\fR [\fB, \fIn\^\fR]\fB)\fR
231 that begins at position
236 use the rest of the string.
238 .BI index( s , " t" )
243 occurs, or 0 if it does not.
245 .BI match( s , " r" )
248 where the regular expression
250 occurs, or 0 if it does not.
255 are set to the position and length of the matched string.
257 \fBsplit(\fIs\fB, \fIa \fR[\fB, \fIfs\^\fR]\fB)\fR
267 The separation is done with the regular expression
269 or with the field separator
274 An empty string as field separator splits the string
275 into one array element per character.
277 \fBsub(\fIr\fB, \fIt \fR[, \fIs\^\fR]\fB)
280 for the first occurrence of the regular expression
290 \fBgsub(\fIr\fB, \fIt \fR[, \fIs\^\fR]\fB)
293 except that all occurrences of the regular expression
298 return the number of replacements.
300 .BI sprintf( fmt , " expr" , " ...\fB)
301 the string resulting from formatting
311 and returns its exit status. This will be \-1 upon error,
313 exit status upon a normal exit,
316 upon death-by-signal, where
318 is the number of the murdering signal,
321 if there was a core dump.
326 with all upper-case characters translated to their
327 corresponding lower-case equivalents.
332 with all lower-case characters translated to their
333 corresponding upper-case equivalents.
340 to the next input record from the current input file;
345 to the next record from
360 returns the next line of output from
364 returns 1 for a successful input,
365 0 for end of file, and \-1 for an error.
367 Patterns are arbitrary Boolean combinations
370 of regular expressions and
371 relational expressions.
372 Regular expressions are as in
376 Isolated regular expressions
377 in a pattern apply to the entire line.
378 Regular expressions may also occur in
379 relational expressions, using the operators
384 is a constant regular expression;
385 any string (constant or variable) may be used
386 as a regular expression, except in the position of an isolated regular expression
389 A pattern may consist of two patterns separated by a comma;
390 in this case, the action is performed for all lines
391 from an occurrence of the first pattern
392 through an occurrence of the second.
394 A relational expression is one of the following:
396 .I expression matchop regular-expression
398 .I expression relop expression
400 .IB expression " in " array-name
402 .BI ( expr , expr,... ") in " array-name
406 is any of the six relational operators in C,
415 A conditional is an arithmetic expression,
416 a relational expression,
417 or a Boolean combination
424 may be used to capture control before the first input line is read
429 do not combine with other patterns.
430 They may appear multiple times in a program and execute
431 in the order they are read by
434 Variable names with special meanings:
438 argument count, assignable.
441 argument array, assignable;
442 non-null members are taken as filenames.
445 conversion format used when converting numbers
450 array of environment variables; subscripts are names.
453 the name of the current input file.
456 ordinal number of the current record in the current file.
459 regular expression used to separate fields; also settable
464 number of fields in the current record.
467 ordinal number of the current record.
470 output format for numbers (default
474 output field separator (default space).
477 output record separator (default newline).
480 the length of a string matched by
484 input record separator (default newline).
485 If empty, blank lines separate records.
486 If more than one character long,
488 is treated as a regular expression, and records are
489 separated by text matching the expression.
492 the start position of a string matched by
496 separates multiple subscripts (default 034).
499 Functions may be defined (at the position of a pattern-action statement) thus:
502 function foo(a, b, c) { ...; return x }
504 Parameters are passed by value if scalar and by reference if array name;
505 functions may be called recursively.
506 Parameters are local to the function; all other variables are global.
507 Thus local variables may be created by providing excess parameters in
508 the function definition.
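.PP
As a minimal sketch of the excess-parameter idiom described above
(the function name sum_to and its variable names are hypothetical, chosen only for illustration),
the extra parameters i and total below act as local variables,
so the global i set in BEGIN is left unchanged:

```shell
# Run a small awk program and capture its output.
result=$(awk '
# sum_to(n, i, total): only n is meant to be passed by the caller;
# i and total are excess parameters that serve as locals.
function sum_to(n,    i, total) {
	for (i = 1; i <= n; i++)
		total += i        # total starts out null, which is 0 in arithmetic
	return total
}
BEGIN {
	i = 99                    # global i
	print sum_to(4)           # 1 + 2 + 3 + 4
	print i                   # the global i is untouched by the call
}')
echo "$result"
```

A common convention, used here, is to separate the real parameters from the
intended locals with extra white space in the parameter list.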
509 .SH ENVIRONMENT VARIABLES
512 is set in the environment, then
514 follows the POSIX rules for
518 with respect to consecutive backslashes and ampersands.
524 Print lines longer than 72 characters.
529 Print first two fields in opposite order.
532 BEGIN { FS = ",[ \et]*|[ \et]+" }
537 Same, with input fields separated by comma and/or spaces and tabs.
542 END { print "sum is", s, " average is", s/NR }
547 Add up first column, print sum and average.
552 Print all lines between start/stop pairs.
556 BEGIN { # Simulate echo(1)
557 for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
567 A. V. Aho, B. W. Kernighan, P. J. Weinberger,
568 .IR "The AWK Programming Language" ,
569 Addison-Wesley, 1988. ISBN 0-201-07981-X.
571 There are no explicit conversions between numbers and strings.
572 To force an expression to be treated as a number, add 0 to it;
573 to force it to be treated as a string, concatenate
576 The scope rules for variables in functions are a botch;
579 Only eight-bit character sets are handled correctly.
580 .SH UNUSUAL FLOATING-POINT VALUES
582 was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
583 and Infinity values, which are supported by all modern floating-point
592 to convert string values to double-precision floating-point values,
593 modern C libraries also convert strings starting with
597 into infinity and NaN values respectively. This led to strange results,
598 with something like this:
602 echo nancy | awk '{ print $1 + 0 }'
611 now follows GNU AWK, and prefilters string values before attempting
612 to convert them to numbers, as follows:
614 .I "Hexadecimal values"
615 Hexadecimal values (allowed since C99) convert to zero, as they did
623 (case independent) convert to NaN. No others do.
624 (NaNs can have signs.)
631 (case independent) convert to positive and negative infinity, respectively.