@c Copyright (C) 1994, 1996, 1998, 2000--2001, 2003--2007, 2009--2020 Free
@c Software Foundation, Inc.
@c
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3 or
@c any later version published by the Free Software Foundation; with no
@c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
@c copy of the license is at <https://www.gnu.org/licenses/fdl-1.3.en.html>.
@c
@c this regular expression description is for: generic

@menu
* awk regular expression syntax::
* egrep regular expression syntax::
* ed regular expression syntax::
* emacs regular expression syntax::
* gnu-awk regular expression syntax::
* grep regular expression syntax::
* posix-awk regular expression syntax::
* posix-basic regular expression syntax::
* posix-egrep regular expression syntax::
* posix-extended regular expression syntax::
* posix-minimal-basic regular expression syntax::
* sed regular expression syntax::
@end menu

@node awk regular expression syntax
@subsection @samp{awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{(}
@item After the alternation operator @samp{|}
@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

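The longest-match rule differs from engines in which the first alternative that succeeds wins. The following is a minimal sketch of that difference. It uses Python's @code{re} module purely as a stand-in, which is an assumption of this example: Python is not one of the syntaxes described in this section. The helper @code{longest_alternative} is hypothetical and handles only a top-level alternation given as a list of literal strings.

```python
import re

def longest_alternative(alternatives, text):
    """Return the longest alternative that matches at the start of text.

    Emulates the longest-match rule for a top-level alternation of
    literal strings; returns None if no alternative matches.
    """
    best = None
    for alt in alternatives:
        m = re.match(re.escape(alt), text)
        if m and (best is None or len(m.group()) > len(best)):
            best = m.group()
    return best

# Python itself picks the first alternative that succeeds.
first_wins = re.match("a|ab", "abc").group()

# The longest-match rule described above would select "ab" instead.
longest = longest_alternative(["a", "ab"], "abc")
```

Here @code{first_wins} holds @samp{a}, whereas @code{longest} holds @samp{ab}, which is the result the rule above calls for.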
@node egrep regular expression syntax
@subsection @samp{egrep} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

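As a rough illustration of what these operators mean, the sketch below rewrites the four that Python's @code{re} module lacks into Python constructs that behave similarly. Python's @code{re} is an assumption made for the example and is not one of the syntaxes described here. The names @code{GNU_TO_PYTHON} and @code{translate_gnu_operators} are hypothetical, and the plain text substitution does not recognise an escaped backslash that happens to precede one of these characters.

```python
import re

# Hypothetical mapping from the GNU operators that Python lacks to
# Python constructs with the same meaning.
GNU_TO_PYTHON = [
    (r"\<", r"\b(?=\w)"),   # beginning of a word
    (r"\>", r"\b(?<=\w)"),  # end of a word
    (r"\`", r"\A"),         # beginning of the whole input
    (r"\'", r"\Z"),         # end of the whole input
]

def translate_gnu_operators(pattern):
    """Rewrite the four GNU-only operators; other text is unchanged."""
    for gnu, python_form in GNU_TO_PYTHON:
        pattern = pattern.replace(gnu, python_form)
    return pattern

# Find "cat" only where it starts a word.
word_start = re.compile(translate_gnu_operators(r"\<cat"))
starts = [m.start() for m in word_start.finditer("cat concat cathode")]
```

The list @code{starts} contains the offsets of @samp{cat} and @samp{cathode} but not the @samp{cat} inside @samp{concat}.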
Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

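Group numbering and back-references can be tried out with a short sketch. Python's @code{re} module is used here as an assumption of the example. It happens to number groups by opening parenthesis and to accept @samp{\2} in the same way, but it is not one of the syntaxes this section documents.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the outer pair, group 2 is (b+), group 3 is (c).
nested = re.fullmatch(r"(a(b+)(c))", "abbc")
groups = nested.groups()

# \2 must repeat exactly what the second group matched.
doubled = re.compile(r"(x)(y+)-\2")
accepts = doubled.fullmatch("xyy-yy") is not None
rejects = doubled.fullmatch("xyy-y") is None
```

The second string is rejected because no choice of text for the second group can be followed by a hyphen and then repeated.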
The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.

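The treatment of intervals can be illustrated with a small sketch. It assumes Python's @code{re} module, which is not one of the syntaxes described here but which also takes an unterminated interval literally.

```python
import re

# A well-formed interval: between two and three occurrences of "a".
bounded = re.compile(r"a{2,3}")
ok = [s for s in ["a", "aa", "aaa", "aaaa"] if bounded.fullmatch(s)]

# An unterminated interval is taken literally, so this pattern matches
# the three characters a, {, 1 rather than raising an error.
literal = re.fullmatch(r"a{1", "a{1") is not None
```

Only the strings of two and three characters end up in @code{ok}.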
The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node ed regular expression syntax
@subsection @samp{ed} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

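The backslash-escaped spellings used by this syntax and the bare spellings used by the parenthesis-style syntaxes can be compared with a small converter. This is only a sketch in Python, chosen as an assumption of the example. The function @code{basic_to_extended} is hypothetical and ignores the context-dependent rules for @samp{^}, @samp{$} and @samp{*}.

```python
def basic_to_extended(pattern):
    """Swap the escaping of ( ) | + ? { } outside bracket expressions.

    Hypothetical helper: a backslash-escaped operator in the ed-style
    spelling becomes a bare operator, and a bare character that the
    parenthesis-style syntaxes treat as special becomes escaped.
    """
    operators = "()|+?{}"
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \( becomes (, while any other escape is kept as written.
            out.append(nxt if nxt in operators else ch + nxt)
            i += 2
        elif ch == "[":
            # Copy a bracket expression verbatim; a ] placed first
            # (optionally after ^) is a literal member.
            j = i + 1
            if j < len(pattern) and pattern[j] == "^":
                j += 1
            if j < len(pattern) and pattern[j] == "]":
                j += 1
            while j < len(pattern) and pattern[j] != "]":
                if pattern.startswith("[:", j):
                    # Skip over a character class such as [:digit:].
                    end = pattern.find(":]", j + 2)
                    j = end + 2 if end != -1 else j + 1
                else:
                    j += 1
            out.append(pattern[i:j + 1])
            i = j + 1
        else:
            # A bare ( is literal in the ed-style syntax, so escape it.
            out.append("\\" + ch if ch in operators else ch)
            i += 1
    return "".join(out)

converted = basic_to_extended(r"\(ab\|cd\)\+")
```

With this input, @code{converted} is the parenthesis-style pattern @samp{(ab|cd)+}.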
The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate

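The positions listed above can be checked mechanically. The sketch below is written in Python as an assumption of the example, and @code{caret_anchor_positions} is a hypothetical helper. It scans a pattern written in this syntax and reports where @samp{^} would act as an anchor. Bracket expressions are skipped with a simple scan that does not understand character classes nested inside them.

```python
def caret_anchor_positions(pattern):
    """Return the indexes at which ^ is an anchor under the rules above."""
    anchors = []
    prev = None  # previous token; None means the start of the pattern
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            token = pattern[i:i + 2]          # escaped pair such as \( or \|
        elif pattern[i] == "[":
            start = i + 1
            if pattern[start:start + 1] == "^":
                start += 1                    # negation marker, not an anchor
            # A ] placed first in the brackets is a literal member.
            end = pattern.find("]", start + 1)
            token = pattern[i:] if end == -1 else pattern[i:end + 1]
        else:
            token = pattern[i]
        if token == "^" and prev in (None, "\\(", "\\|"):
            anchors.append(i)
        prev = token
        i += len(token)
    return anchors

positions = caret_anchor_positions(r"\(^a\)\|^b")
```

In @code{positions}, both carets are reported, because one follows an open-group and the other follows the alternation operator.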
The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by
@samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate

@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node emacs regular expression syntax
@subsection @samp{emacs} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.

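Where character classes are unavailable, a pattern written with them can be rewritten with explicit ranges before use. The sketch below is in Python as an assumption of the example. The table @code{CLASS_RANGES} and the function @code{expand_classes} are hypothetical, cover only a few class names, and assume ASCII. The plain substitution would also rewrite a class name that appears outside brackets.

```python
import re

# Hypothetical table of explicit ranges for a few POSIX class names
# (ASCII only; a locale-aware class may cover more characters).
CLASS_RANGES = [
    ("[:digit:]", "0-9"),
    ("[:lower:]", "a-z"),
    ("[:upper:]", "A-Z"),
    ("[:alpha:]", "A-Za-z"),
    ("[:alnum:]", "A-Za-z0-9"),
]

def expand_classes(pattern):
    """Replace POSIX class names with explicit ranges."""
    for name, ranges in CLASS_RANGES:
        pattern = pattern.replace(name, ranges)
    return pattern

digits = expand_classes("[[:digit:]]+")
identifier = expand_classes("[[:alpha:]_][[:alnum:]_]*")
```

The result in @code{digits} is @samp{[0-9]+}, the form suggested above.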
GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate

The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by
@samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node gnu-awk regular expression syntax
@subsection @samp{gnu-awk} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{(}
@item After the alternation operator @samp{|}
@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node grep regular expression syntax
@subsection @samp{grep} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate

The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by
@samp{\)}
@item Before a newline
@item Before the alternation operator @samp{\|}
@end enumerate

@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-awk regular expression syntax
@subsection @samp{posix-awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{(}
@item After the alternation operator @samp{|}
@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-basic regular expression syntax
@subsection @samp{posix-basic} regular expression syntax
This is a synonym for ed.

@node posix-egrep regular expression syntax
@subsection @samp{posix-egrep} regular expression syntax
This is a synonym for egrep.

@node posix-extended regular expression syntax
@subsection @samp{posix-extended} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{(}
@item After the alternation operator @samp{|}
@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node posix-minimal-basic regular expression syntax
@subsection @samp{posix-minimal-basic} regular expression syntax

The character @samp{.} matches any single character except the null character.

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by
@samp{\(}
@end enumerate

The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by
@samp{\)}
@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.

@node sed regular expression syntax
@subsection @samp{sed} regular expression syntax
This is a synonym for ed.