@c Copyright (C) 1994, 1996, 1998, 2000-2001, 2003-2007, 2009-2018 Free
@c Software Foundation, Inc.
@c
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3 or
@c any later version published by the Free Software Foundation; with no
@c Invariant Sections, no Front-Cover Texts, and no Back-Cover
@c Texts.  A copy of the license is included in the ``GNU Free
@c Documentation License'' file as part of this distribution.

@c this regular expression description is for: generic

@menu
* awk regular expression syntax::
* egrep regular expression syntax::
* ed regular expression syntax::
* emacs regular expression syntax::
* gnu-awk regular expression syntax::
* grep regular expression syntax::
* posix-awk regular expression syntax::
* posix-basic regular expression syntax::
* posix-egrep regular expression syntax::
* posix-extended regular expression syntax::
* posix-minimal-basic regular expression syntax::
* sed regular expression syntax::
@end menu

@node awk regular expression syntax
@subsection @samp{awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} can be used to quote the following character.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}.  An unmatched @samp{)} matches just itself.  A backslash followed by a digit matches that digit.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets.  Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
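
The longest-match rule differs from engines that accept the first alternative that succeeds.  As a rough illustration only, the Python sketch below contrasts the behaviour of Python's @code{re} module (which is not one of the syntaxes described here) with a hypothetical helper, @code{longest_match}, that picks the longest candidate by hand.  The helper handles a top-level alternation only and is an assumption made for this example, not how any tool implements the rule.

```python
import re


def longest_match(alternatives, text):
    """Return the longest prefix of text matched by any alternative.

    Hypothetical helper: emulates the "longest possible match" rule for a
    top-level alternation only.  Returns None when nothing matches.
    """
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


# Python tries alternatives left to right and stops at the first success.
first_found = re.match(r"a|ab", "ab").group(0)

# Under the longest-match rule the whole-regexp match covers both characters.
longest = longest_match(["a", "ab"], "ab")
```

Here @code{first_found} is @samp{a} while @code{longest} is @samp{ab}; the second value is what the rule above describes.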

@node egrep regular expression syntax
@subsection @samp{egrep} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} is taken literally.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate
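
To make the word-related operators concrete, here is a small Python sketch.  Python's @code{re} module is not one of the syntaxes described in this section: it provides @samp{\w}, @samp{\W}, @samp{\b} and @samp{\B}, but it has no @samp{\<} or @samp{\>}, so the two constants below are assumed stand-ins built from a boundary plus a lookaround.

```python
import re

# Assumed Python stand-ins for the GNU operators (illustration only):
WORD_START = r"\b(?=\w)"   # plays the role of \<
WORD_END = r"\b(?<=\w)"    # plays the role of \>

text = "cat concat cats"

# Offsets where "cat" begins a word: "cat" and "cats".
starts = [m.start() for m in re.finditer(WORD_START + "cat", text)]

# Offsets where "cat" ends a word: "cat" and the tail of "concat".
ends = [m.start() for m in re.finditer("cat" + WORD_END, text)]

# Offsets where "cat" is a whole word.
whole = [m.start() for m in re.finditer(WORD_START + "cat" + WORD_END, text)]

# Offsets where "cat" starts at a position that is not a word boundary.
inside = [m.start() for m in re.finditer(r"\Bcat", text)]
```

In Python the nearest counterparts of @samp{\`} and @samp{\'} are @samp{\A} and @samp{\Z}.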

Grouping is performed with parentheses @samp{()}.  An unmatched @samp{)} matches just itself.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
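
The numbering rule for nested groups can be checked with a short Python sketch.  Python's @code{re} module is used purely as a stand-in here; it happens to share the unescaped-parenthesis grouping and the numbering by opening parenthesis, but it is not the @samp{egrep} syntax itself.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the outer "((a)(b))", group 2 is "(a)", group 3 is "(b)".
pattern = re.compile(r"((a)(b))\3")

# \3 must repeat whatever group 3 matched, which is "b".
m = pattern.fullmatch("abb")
groups = m.groups() if m else None

# "aba" fails: the trailing "a" is not what group 3 matched.
no_match = pattern.fullmatch("aba")
```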

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets.  Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.

Intervals are specified by @samp{@{} and @samp{@}}.  Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1}.
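
As a loose analogy, Python's @code{re} module also reads a malformed interval as literal text.  The sketch below relies on that Python behaviour only; it does not exercise the @samp{egrep} syntax itself.

```python
import re

# A well-formed interval: exactly two occurrences of "a".
valid = re.fullmatch(r"a{2}", "aa")

# The same interval does not match a single "a".
too_few = re.fullmatch(r"a{2}", "a")

# An unterminated interval is read as the literal characters "a", "{", "1".
literal = re.fullmatch(r"a{1", "a{1")
```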

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node ed regular expression syntax
@subsection @samp{ed} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} is taken literally.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before the alternation operator @samp{\|}

@end enumerate
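
The positional rules for @samp{^} and @samp{$} can be written down as two small predicates.  The Python sketch below is only an illustration of the lists above: the function names are hypothetical, and it deliberately ignores bracket expressions and escaped backslashes.

```python
def caret_is_anchor(pattern, index):
    """Return True if the '^' at pattern[index] would act as an anchor."""
    if pattern[index] != "^":
        raise ValueError("no '^' at the given index")
    before = pattern[:index]
    # Start of the regexp, or directly after \( or \|.
    return before == "" or before.endswith("\\(") or before.endswith("\\|")


def dollar_is_anchor(pattern, index):
    """Return True if the '$' at pattern[index] would act as an anchor."""
    if pattern[index] != "$":
        raise ValueError("no '$' at the given index")
    after = pattern[index + 1:]
    # End of the regexp, or directly before \) or \|.
    return after == "" or after.startswith("\\)") or after.startswith("\\|")
```

For example, the @samp{^} in @samp{a^b} is an ordinary character under these rules, while the one in @samp{\(^a\)} is an anchor.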

@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}.  Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
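
The main visible difference between this syntax and the parenthesis-style syntaxes is which spelling of an operator carries the backslash.  The Python sketch below shows one way to flip the spellings so that a pattern written in the backslash style can be tried with Python's @code{re} module.  The function @code{bre_to_python} is hypothetical; it assumes the pattern has no bracket expressions (where @samp{\} is literal) and it does not model the positional rules for @samp{^}, @samp{$} or @samp{*}.

```python
import re

# Characters whose escaped and unescaped meanings are swapped.
_SWAP = set("(){}|+?")


def bre_to_python(pattern):
    """Rewrite a backslash-style pattern into Python's spelling (sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in _SWAP:
                out.append(nxt)          # \( becomes (, \| becomes |, ...
            else:
                out.append(ch + nxt)     # keep other escapes such as \1 or \.
            i += 2
        elif ch in _SWAP:
            out.append("\\" + ch)        # a bare ( is literal, so escape it
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


converted = bre_to_python(r"\(ab\)\1")
matched = re.fullmatch(converted, "abab")
```

The converted pattern is @samp{(ab)\1}, which Python matches against @samp{abab}.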

@node emacs regular expression syntax
@subsection @samp{emacs} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored.  Within square brackets, @samp{\} is taken literally.  Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
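
When a pattern written with character classes has to be used with a syntax that lacks them, the classes can be spelled out as explicit ranges.  The Python sketch below is one hypothetical way to do that; the table of ranges is an assumption that covers ASCII letters and digits only, which corresponds to the C locale and not to locale-aware matching.

```python
# ASCII-only expansions (an assumption; real classes depend on the locale).
POSIX_CLASS_RANGES = {
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "alpha": "A-Za-z",
    "alnum": "0-9A-Za-z",
    "xdigit": "0-9A-Fa-f",
}


def expand_classes(pattern):
    """Replace [:name:] inside bracket expressions with explicit ranges."""
    for name, ranges in POSIX_CLASS_RANGES.items():
        pattern = pattern.replace("[:" + name + ":]", ranges)
    return pattern


converted = expand_classes("[[:digit:]]")
identifier = expand_classes("[[:alpha:]_][[:alnum:]_]*")
```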

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before the alternation operator @samp{\|}

@end enumerate

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node gnu-awk regular expression syntax
@subsection @samp{gnu-awk} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} can be used to quote the following character.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with parentheses @samp{()}.  An unmatched @samp{)} matches just itself.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets.  Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}.  Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1}.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node grep regular expression syntax
@subsection @samp{grep} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} is taken literally.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After a newline

@item After the alternation operator @samp{\|}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before a newline

@item Before the alternation operator @samp{\|}

@end enumerate

@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After a newline

@item After the alternation operator @samp{\|}

@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}.  Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-awk regular expression syntax
@subsection @samp{posix-awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} can be used to quote the following character.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}.  An unmatched @samp{)} matches just itself.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets.  Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate
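
Because a repetition operator is not allowed in these three places, a pattern can be screened for them before it is used.  The Python sketch below is a hypothetical checker written for this illustration; it reports the offsets of misplaced operators, treats any backslash pair as an ordinary atom, and does not understand bracket expressions.

```python
def misplaced_repetitions(pattern):
    """Return offsets of '*', '+' or '?' that have nothing to repeat."""
    positions = []
    prev = None            # previous token: None means start of the regexp
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            prev = "escaped"          # an escaped character is an atom
            i += 2
            continue
        if ch in "*+?" and prev in (None, "(", "|"):
            positions.append(i)
        prev = ch
        i += 1
    return positions


leading = misplaced_repetitions("*a")
nested = misplaced_repetitions("(+a)|?b")
clean = misplaced_repetitions("a*b+")
```

Here @code{leading} is @code{[0]}, @code{nested} is @code{[1, 5]} and @code{clean} is empty.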

Intervals are specified by @samp{@{} and @samp{@}}.  Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1}.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-basic regular expression syntax
@subsection @samp{posix-basic} regular expression syntax
This is a synonym for ed.

@node posix-egrep regular expression syntax
@subsection @samp{posix-egrep} regular expression syntax
This is a synonym for egrep.

@node posix-extended regular expression syntax
@subsection @samp{posix-extended} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} is taken literally.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with parentheses @samp{()}.  An unmatched @samp{)} matches just itself.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets.  Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}.  Invalid intervals such as @samp{a@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-minimal-basic regular expression syntax
@subsection @samp{posix-minimal-basic} regular expression syntax

The character @samp{.} matches any single character except the null character.

Bracket expressions are used to match ranges of characters.  Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.  Within square brackets, @samp{\} is taken literally.  Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}.  A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number.  For example @samp{\2} matches the second group expression.  The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}.  Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node sed regular expression syntax
@subsection @samp{sed} regular expression syntax
This is a synonym for ed.