1 @c This is part of the Emacs manual.
2 @c Copyright (C) 1985, 86, 87, 93, 94, 95, 97, 2000, 2001
3 @c Free Software Foundation, Inc.
4 @c See file emacs.texi for copying conditions.
5 @node Search, Fixit, Display, Top
6 @chapter Searching and Replacement
8 @cindex finding strings within text
10 Like other editors, Emacs has commands for searching for occurrences of
11 a string. The principal search command is unusual in that it is
12 @dfn{incremental}; it begins to search before you have finished typing the
13 search string. There are also nonincremental search commands more like
14 those of other editors.
16 Besides the usual @code{replace-string} command that finds all
17 occurrences of one string and replaces them with another, Emacs has a fancy
18 replacement command called @code{query-replace} which asks interactively
19 which occurrences to replace.
22 * Incremental Search:: Search happens as you type the string.
23 * Nonincremental Search:: Specify entire string and then search.
24 * Word Search:: Search for sequence of words.
25 * Regexp Search:: Search for match for a regexp.
26 * Regexps:: Syntax of regular expressions.
27 * Search Case:: To ignore case while searching, or not.
28 * Replace:: Search, and replace some or all matches.
29 * Other Repeating Search:: Operating on all matches for some regexp.
32 @node Incremental Search, Nonincremental Search, Search, Search
33 @section Incremental Search
35 @cindex incremental search
36 An incremental search begins searching as soon as you type the first
37 character of the search string. As you type in the search string, Emacs
38 shows you where the string (as you have typed it so far) would be
39 found. When you have typed enough characters to identify the place you
40 want, you can stop. Depending on what you plan to do next, you may or
41 may not need to terminate the search explicitly with @key{RET}.
46 Incremental search forward (@code{isearch-forward}).
48 Incremental search backward (@code{isearch-backward}).
52 @findex isearch-forward
53 @kbd{C-s} starts an incremental search. @kbd{C-s} reads characters from
54 the keyboard and positions the cursor at the first occurrence of the
55 characters that you have typed. If you type @kbd{C-s} and then @kbd{F},
56 the cursor moves right after the first @samp{F}. Type an @kbd{O}, and see
57 the cursor move to after the first @samp{FO}. After another @kbd{O}, the
58 cursor is after the first @samp{FOO} after the place where you started the
59 search. At each step, the buffer text that matches the search string is
60 highlighted, if the terminal can do that; at each step, the current search
61 string is updated in the echo area.
63 If you make a mistake in typing the search string, you can cancel
characters with @key{DEL}. Each @key{DEL} cancels the last character of
the search string. This does not happen until Emacs is ready to read another
66 input character; first it must either find, or fail to find, the character
67 you want to erase. If you do not want to wait for this to happen, use
68 @kbd{C-g} as described below.
70 When you are satisfied with the place you have reached, you can type
71 @key{RET}, which stops searching, leaving the cursor where the search
72 brought it. Also, any command not specially meaningful in searches
73 stops the searching and is then executed. Thus, typing @kbd{C-a} would
74 exit the search and then move to the beginning of the line. @key{RET}
75 is necessary only if the next command you want to type is a printing
76 character, @key{DEL}, @key{RET}, or another control character that is
77 special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
78 @kbd{C-y}, @kbd{M-y}, @kbd{M-r}, or @kbd{M-s}).
80 Sometimes you search for @samp{FOO} and find it, but not the one you
81 expected to find. There was a second @samp{FOO} that you forgot
82 about, before the one you were aiming for. In this event, type
83 another @kbd{C-s} to move to the next occurrence of the search string.
84 You can repeat this any number of times. If you overshoot, you can
85 cancel some @kbd{C-s} characters with @key{DEL}.
87 After you exit a search, you can search for the same string again by
88 typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
89 incremental search, and the second @kbd{C-s} means ``search again.''
91 To reuse earlier search strings, use the @dfn{search ring}. The
92 commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
93 string to reuse. These commands leave the selected search ring element
94 in the minibuffer, where you can edit it. Type @kbd{C-s} or @kbd{C-r}
95 to terminate editing the string and search for it.
97 If your string is not found at all, the echo area says @samp{Failing
98 I-Search}. The cursor is after the place where Emacs found as much of your
99 string as it could. Thus, if you search for @samp{FOOT}, and there is no
100 @samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
101 At this point there are several things you can do. If your string was
102 mistyped, you can rub some of it out and correct it. If you like the place
103 you have found, you can type @key{RET} or some other Emacs command to
104 ``accept what the search offered.'' Or you can type @kbd{C-g}, which
105 removes from the search string the characters that could not be found (the
106 @samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
107 @samp{FOOT}). A second @kbd{C-g} at that point cancels the search
108 entirely, returning point to where it was when the search started.
110 An upper-case letter in the search string makes the search
111 case-sensitive. If you delete the upper-case character from the search
112 string, it ceases to have this effect. @xref{Search Case}.
114 To search for a newline, type @kbd{C-j}. To search for another
115 control character, such as control-S or carriage return, you must quote
116 it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
117 to its use for insertion (@pxref{Inserting Text}): it causes the
118 following character to be treated the way any ``ordinary'' character is
119 treated in the same context. You can also specify a character by its
120 octal code: enter @kbd{C-q} followed by a sequence of octal digits.
122 @cindex searching for non-ASCII characters
123 @cindex input method, during incremental search
124 To search for non-ASCII characters, you must use an input method
125 (@pxref{Input Methods}). If an input method is turned on in the
126 current buffer when you start the search, you can use it while you
127 type the search string also. Emacs indicates that by including the
128 input method mnemonic in its prompt, like this:
135 @findex isearch-toggle-input-method
136 @findex isearch-toggle-specified-input-method
137 where @var{im} is the mnemonic of the active input method. You can
138 toggle (enable or disable) the input method while you type the search
139 string with @kbd{C-\} (@code{isearch-toggle-input-method}). You can
140 turn on a certain (non-default) input method with @kbd{C-^}
141 (@code{isearch-toggle-specified-input-method}), which prompts for the
142 name of the input method. Note that the input method you turn on
143 during incremental search is turned on in the current buffer as well.
145 If a search is failing and you ask to repeat it by typing another
146 @kbd{C-s}, it starts again from the beginning of the buffer.
147 Repeating a failing reverse search with @kbd{C-r} starts again from
148 the end. This is called @dfn{wrapping around}, and @samp{Wrapped}
149 appears in the search prompt once this has happened. If you keep on
150 going past the original starting point of the search, it changes to
151 @samp{Overwrapped}, which means that you are revisiting matches that
152 you have already seen.
154 @cindex quitting (in search)
155 The @kbd{C-g} ``quit'' character does special things during searches;
156 just what it does depends on the status of the search. If the search has
157 found what you specified and is waiting for input, @kbd{C-g} cancels the
158 entire search. The cursor moves back to where you started the search. If
159 @kbd{C-g} is typed when there are characters in the search string that have
160 not been found---because Emacs is still searching for them, or because it
161 has failed to find them---then the search string characters which have not
162 been found are discarded from the search string. With them gone, the
163 search is now successful and waiting for more input, so a second @kbd{C-g}
164 will cancel the entire search.
166 You can change to searching backwards with @kbd{C-r}. If a search fails
167 because the place you started was too late in the file, you should do this.
168 Repeated @kbd{C-r} keeps looking for more occurrences backwards. A
@kbd{C-s} starts going forwards again. @kbd{C-r} in a search can be canceled
with @key{DEL}.
173 @findex isearch-backward
174 If you know initially that you want to search backwards, you can use
175 @kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r} as
176 a key runs a command (@code{isearch-backward}) to search backward. A
177 backward search finds matches that are entirely before the starting
178 point, just as a forward search finds matches that begin after it.
180 The characters @kbd{C-y} and @kbd{C-w} can be used in incremental
181 search to grab text from the buffer into the search string. This makes
182 it convenient to search for another occurrence of text at point.
183 @kbd{C-w} copies the word after point as part of the search string,
184 advancing point over that word. Another @kbd{C-s} to repeat the search
185 will then search for a string including that word. @kbd{C-y} is similar
186 to @kbd{C-w} but copies all the rest of the current line into the search
187 string. Both @kbd{C-y} and @kbd{C-w} convert the text they copy to
188 lower case if the search is currently not case-sensitive; this is so the
189 search remains case-insensitive.
191 The character @kbd{M-y} copies text from the kill ring into the search
192 string. It uses the same text that @kbd{C-y} as a command would yank.
193 @kbd{Mouse-2} in the echo area does the same.
196 When you exit the incremental search, it sets the mark to where point
197 @emph{was}, before the search. That is convenient for moving back
198 there. In Transient Mark mode, incremental search sets the mark without
199 activating it, and does so only if the mark is not already active.
201 @cindex lazy search highlighting
202 @vindex isearch-lazy-highlight
203 When you pause for a little while during incremental search, it
204 highlights all other possible matches for the search string. This
205 makes it easier to anticipate where you can get to by typing @kbd{C-s}
206 or @kbd{C-r} to repeat the search. The short delay before highlighting
207 other matches helps indicate which match is the current one.
208 If you don't like this feature, you can turn it off by setting
209 @code{isearch-lazy-highlight} to @code{nil}.
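
For instance, a line like the following in your init file would disable
lazy highlighting.  This is a minimal sketch; it assumes you set the
variable directly rather than through the Customize interface.

@example
;; Turn off lazy highlighting of the other matches.
(setq isearch-lazy-highlight nil)
@end example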
211 @vindex isearch-lazy-highlight-face
212 @cindex faces for highlighting search matches
You can control the appearance of the highlighted matches by
customizing the faces @code{isearch} (used for the current match) and
@code{isearch-lazy-highlight-face} (used for the other matches).
217 @vindex isearch-mode-map
218 To customize the special characters that incremental search understands,
219 alter their bindings in the keymap @code{isearch-mode-map}. For a list
220 of bindings, look at the documentation of @code{isearch-mode} with
221 @kbd{C-h f isearch-mode @key{RET}}.
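
As an illustration, the following sketch adds one binding to
@code{isearch-mode-map}.  The key @kbd{C-o} is an arbitrary choice made
for this example, and @code{isearch-yank-line} is assumed to be the
command that grabs the rest of the current line into the search string.

@example
;; Hypothetical choice of key: make C-o grab the rest of the line
;; into the search string during an incremental search.
(define-key isearch-mode-map "\C-o" 'isearch-yank-line)
@end example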
223 @subsection Slow Terminal Incremental Search
225 Incremental search on a slow terminal uses a modified style of display
226 that is designed to take less time. Instead of redisplaying the buffer at
227 each place the search gets to, it creates a new single-line window and uses
228 that to display the line that the search has found. The single-line window
comes into play as soon as point gets outside of the text that is already
on the screen.
232 When you terminate the search, the single-line window is removed.
Then Emacs redisplays the window in which the search was done, to show
the new position of point.
236 @vindex search-slow-speed
237 The slow terminal style of display is used when the terminal baud rate is
less than or equal to the value of the variable @code{search-slow-speed}.
241 @vindex search-slow-window-lines
242 The number of lines to use in slow terminal search display is controlled
243 by the variable @code{search-slow-window-lines}. Its normal value is 1.
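
For example, this init-file line enlarges the slow-terminal search
window.  The value 2 is purely illustrative.

@example
;; Illustrative value only: use two lines instead of one.
(setq search-slow-window-lines 2)
@end example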
245 @node Nonincremental Search, Word Search, Incremental Search, Search
246 @section Nonincremental Search
247 @cindex nonincremental search
249 Emacs also has conventional nonincremental search commands, which require
250 you to type the entire search string before searching begins.
253 @item C-s @key{RET} @var{string} @key{RET}
254 Search for @var{string}.
255 @item C-r @key{RET} @var{string} @key{RET}
256 Search backward for @var{string}.
259 To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
260 enters the minibuffer to read the search string; terminate the string
261 with @key{RET}, and then the search takes place. If the string is not
262 found, the search command gets an error.
264 The way @kbd{C-s @key{RET}} works is that the @kbd{C-s} invokes
265 incremental search, which is specially programmed to invoke nonincremental
266 search if the argument you give it is empty. (Such an empty argument would
267 otherwise be useless.) @kbd{C-r @key{RET}} also works this way.
269 However, nonincremental searches performed using @kbd{C-s @key{RET}} do
270 not call @code{search-forward} right away. The first thing done is to see
271 if the next character is @kbd{C-w}, which requests a word search.
276 @findex search-forward
277 @findex search-backward
278 Forward and backward nonincremental searches are implemented by the
279 commands @code{search-forward} and @code{search-backward}. These
280 commands may be bound to keys in the usual manner. The feature that you
281 can get to them via the incremental search commands exists for
historical reasons, and to avoid the need to find suitable key sequences
for them.
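
If you do want direct bindings, a sketch along these lines would work.
The keys @kbd{C-c s} and @kbd{C-c r} are arbitrary choices made for
this example, not standard bindings.

@example
;; C-c s and C-c r are arbitrary, illustrative key choices.
(global-set-key "\C-cs" 'search-forward)
(global-set-key "\C-cr" 'search-backward)
@end example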
285 @node Word Search, Regexp Search, Nonincremental Search, Search
289 Word search searches for a sequence of words without regard to how the
290 words are separated. More precisely, you type a string of many words,
291 using single spaces to separate them, and the string can be found even if
292 there are multiple spaces, newlines or other punctuation between the words.
294 Word search is useful for editing a printed document made with a text
295 formatter. If you edit while looking at the printed, formatted version,
296 you can't tell where the line breaks are in the source file. With word
297 search, you can search without having to know them.
300 @item C-s @key{RET} C-w @var{words} @key{RET}
301 Search for @var{words}, ignoring details of punctuation.
302 @item C-r @key{RET} C-w @var{words} @key{RET}
303 Search backward for @var{words}, ignoring details of punctuation.
306 Word search is a special case of nonincremental search and is invoked
307 with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
308 which must always be terminated with @key{RET}. Being nonincremental,
309 this search does not start until the argument is terminated. It works
by constructing a regular expression and searching for that; see
@ref{Regexp Search}.
313 Use @kbd{C-r @key{RET} C-w} to do backward word search.
315 @findex word-search-forward
316 @findex word-search-backward
317 Forward and backward word searches are implemented by the commands
318 @code{word-search-forward} and @code{word-search-backward}. These
319 commands may be bound to keys in the usual manner. The feature that you
320 can get to them via the incremental search commands exists for historical
321 reasons, and to avoid the need to find suitable key sequences for them.
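
The same commands can be called from Lisp.  The following sketch uses a
temporary buffer whose contents are invented for the example; it shows
that a single space in the argument matches a line break followed by
indentation.

@example
(with-temp-buffer
  ;; The two words are separated by a newline and some spaces.
  (insert "search for these\n    words in the text")
  (goto-char (point-min))
  ;; The third argument t means return nil, rather than signal
  ;; an error, if the words are not found.
  (word-search-forward "these words" nil t))
@end example

The call returns a non-@code{nil} value because the words are found
despite the intervening newline.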
323 @node Regexp Search, Regexps, Word Search, Search
324 @section Regular Expression Search
325 @cindex regular expression
328 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
329 denotes a class of alternative strings to match, possibly infinitely
330 many. In GNU Emacs, you can search for the next match for a regexp
331 either incrementally or not.
334 @findex isearch-forward-regexp
336 @findex isearch-backward-regexp
337 Incremental search for a regexp is done by typing @kbd{C-M-s}
338 (@code{isearch-forward-regexp}). This command reads a search string
339 incrementally just like @kbd{C-s}, but it treats the search string as a
340 regexp rather than looking for an exact match against the text in the
341 buffer. Each time you add text to the search string, you make the
342 regexp longer, and the new regexp is searched for. Invoking @kbd{C-s}
343 with a prefix argument (its value does not matter) is another way to do
344 a forward incremental regexp search. To search backward for a regexp,
use @kbd{C-M-r} (@code{isearch-backward-regexp}), or @kbd{C-r} with a
prefix argument.
348 All of the control characters that do special things within an
349 ordinary incremental search have the same function in incremental regexp
350 search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
351 search retrieves the last incremental search regexp used; that is to
352 say, incremental regexp and non-regexp searches have independent
353 defaults. They also have separate search rings that you can access with
354 @kbd{M-p} and @kbd{M-n}.
356 If you type @key{SPC} in incremental regexp search, it matches any
357 sequence of whitespace characters, including newlines. If you want
358 to match just a space, type @kbd{C-q @key{SPC}}.
360 Note that adding characters to the regexp in an incremental regexp
361 search can make the cursor move back and start again. For example, if
362 you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
363 backs up in case the first @samp{bar} precedes the first @samp{foo}.
365 @findex re-search-forward
366 @findex re-search-backward
367 Nonincremental search for a regexp is done by the functions
368 @code{re-search-forward} and @code{re-search-backward}. You can invoke
369 these with @kbd{M-x}, or bind them to keys, or invoke them by way of
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
@key{RET}}.
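
Here is a small sketch of calling @code{re-search-forward} from Lisp.
The buffer text is invented for the example.

@example
(with-temp-buffer
  (insert "cr car caar")
  (goto-char (point-min))
  ;; Find the first c, one or more a's, then r.
  (when (re-search-forward "ca+r" nil t)
    (match-string 0)))
@end example

This returns @code{"car"}: the leading @samp{cr} is skipped because it
contains no @samp{a}.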
373 If you use the incremental regexp search commands with a prefix
374 argument, they perform ordinary string search, like
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
Search}.
378 @node Regexps, Search Case, Regexp Search, Search
379 @section Syntax of Regular Expressions
380 @cindex regexp syntax
382 Regular expressions have a syntax in which a few characters are
383 special constructs and the rest are @dfn{ordinary}. An ordinary
384 character is a simple regular expression which matches that same
385 character and nothing else. The special characters are @samp{$},
386 @samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, @samp{]} and
387 @samp{\}. Any other character appearing in a regular expression is
388 ordinary, unless a @samp{\} precedes it.
390 For example, @samp{f} is not a special character, so it is ordinary, and
391 therefore @samp{f} is a regular expression that matches the string
392 @samp{f} and no other string. (It does @emph{not} match the string
393 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches
394 only @samp{o}. (When case distinctions are being ignored, these regexps
395 also match @samp{F} and @samp{O}, but we consider this a generalization
396 of ``the same string,'' rather than an exception.)
398 Any two regular expressions @var{a} and @var{b} can be concatenated. The
399 result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
that string.
403 As a simple example, we can concatenate the regular expressions @samp{f}
404 and @samp{o} to get the regular expression @samp{fo}, which matches only
405 the string @samp{fo}. Still trivial. To do something nontrivial, you
406 need to use one of the special characters. Here is a list of them.
409 @item .@: @r{(Period)}
410 is a special character that matches any single character except a newline.
411 Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.
416 is not a construct by itself; it is a postfix operator that means to
417 match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).
421 @samp{*} always applies to the @emph{smallest} possible preceding
422 expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
423 @samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
425 The matcher processes a @samp{*} construct by matching, immediately,
426 as many repetitions as can be found. Then it continues with the rest
427 of the pattern. If that fails, backtracking occurs, discarding some
428 of the matches of the @samp{*}-modified construct in case that makes
429 it possible to match the rest of the pattern. For example, in matching
430 @samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
431 tries to match all three @samp{a}s; but the rest of the pattern is
432 @samp{ar} and there is only @samp{r} left to match, so this try fails.
433 The next alternative is for @samp{a*} to match only two @samp{a}s.
434 With this choice, the rest of the regexp matches successfully.@refill
437 is a postfix operator, similar to @samp{*} except that it must match
438 the preceding expression at least once. So, for example, @samp{ca+r}
439 matches the strings @samp{car} and @samp{caaaar} but not the string
440 @samp{cr}, whereas @samp{ca*r} matches all three strings.
443 is a postfix operator, similar to @samp{*} except that it can match the
444 preceding expression either once or not at all. For example,
445 @samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.
448 @cindex non-greedy regexp matching
449 are non-greedy variants of the operators above. The normal operators
450 @samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
451 much as they can, as long as the overall regexp can still match. With
a following @samp{?}, they are non-greedy: they will match as little
as possible.
455 Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
456 and the string @samp{abbbb}; but if you try to match them both against
457 the text @samp{abbb}, @samp{ab*} will match it all (the longest valid
match), while @samp{ab*?} will match just @samp{a} (the shortest
valid match).
462 is a postfix operator that specifies repetition @var{n} times---that
463 is, the preceding regular expression must match exactly @var{n} times
in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
and nothing else.
467 @item \@{@var{n},@var{m}\@}
468 is a postfix operator that specifies repetition between @var{n} and
469 @var{m} times---that is, the preceding regular expression must match
470 at least @var{n} times, but no more than @var{m} times. If @var{m} is
471 omitted, then there is no upper limit, but the preceding regular
472 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
473 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
474 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.
477 is a @dfn{character set}, which begins with @samp{[} and is terminated
478 by @samp{]}. In the simplest case, the characters between the two
479 brackets are what this set can match.
481 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
482 @samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
483 (including the empty string), from which it follows that @samp{c[ad]*r}
484 matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
486 You can also include character ranges in a character set, by writing the
487 starting and ending characters with a @samp{-} between them. Thus,
488 @samp{[a-z]} matches any lower-case ASCII letter. Ranges may be
489 intermixed freely with individual characters, as in @samp{[a-z$%.]},
which matches any lower-case ASCII letter or @samp{$}, @samp{%} or
period.
493 Note that the usual regexp special characters are not special inside a
494 character set. A completely different set of special characters exists
495 inside character sets: @samp{]}, @samp{-} and @samp{^}.
497 To include a @samp{]} in a character set, you must make it the first
498 character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
499 include a @samp{-}, write @samp{-} as the first or last character of the
set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
and @samp{-}.
To include @samp{^} in a set, put it anywhere but at the beginning of
the set.
506 When you use a range in case-insensitive search, you should write both
507 ends of the range in upper case, or both in lower case, or both should
508 be non-letters. The behavior of a mixed-case range such as @samp{A-z}
509 is somewhat ill-defined, and it may change in future Emacs versions.
512 @samp{[^} begins a @dfn{complemented character set}, which matches any
513 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
514 all characters @emph{except} letters and digits.
516 @samp{^} is not special in a character set unless it is the first
517 character. The character following the @samp{^} is treated as if it
518 were first (in other words, @samp{-} and @samp{]} are not special there).
520 A complemented character set can match a newline, unless newline is
521 mentioned as one of the characters not to match. This is in contrast to
522 the handling of regexps in programs such as @code{grep}.
525 is a special character that matches the empty string, but only at the
526 beginning of a line in the text being matched. Otherwise it fails to
527 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
528 the beginning of a line.
531 is similar to @samp{^} but matches only at the end of a line. Thus,
532 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
535 has two functions: it quotes the special characters (including
536 @samp{\}), and it introduces additional special constructs.
538 Because @samp{\} quotes special characters, @samp{\$} is a regular
539 expression that matches only @samp{$}, and @samp{\[} is a regular
540 expression that matches only @samp{[}, and so on.
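
The constructs described so far can be tried out from Lisp with
@code{string-match}.  The forms below are a sketch, not part of any
Emacs command; the test strings are invented for the example.  Note
that in a Lisp string each backslash of the regexp is written twice.

@example
;; Each form returns the index where the match starts, or nil.
(string-match "ca+r" "cr")            ; nil: + needs at least one a
(string-match "ca*r" "cr")            ; 0: * accepts zero a's
(string-match "ca?r" "caar")          ; nil: ? allows at most one a
(string-match "c[ad]*r" "caddaar")    ; 0
(string-match "x\\@{4\\@}" "xxxx")        ; 0
(string-match "[^a-z0-9A-Z]" "ab c")  ; 2: the space
(string-match "x+$" "axx")            ; 1
(string-match "\\$" "cost: $5")       ; 6: a quoted, literal $
@end example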
543 Note: for historical compatibility, special characters are treated as
544 ordinary ones if they are in contexts where their special meanings make no
545 sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is
546 no preceding expression on which the @samp{*} can act. It is poor practice
547 to depend on this behavior; it is better to quote the special character anyway,
548 regardless of where it appears.@refill
550 For the most part, @samp{\} followed by any character matches only that
551 character. However, there are several exceptions: two-character
552 sequences starting with @samp{\} that have special meanings. The second
553 character in the sequence is always an ordinary character when used on
554 its own. Here is a table of @samp{\} constructs.
558 specifies an alternative. Two regular expressions @var{a} and @var{b}
559 with @samp{\|} in between form an expression that matches some text if
560 either @var{a} matches it or @var{b} matches it. It works by trying to
561 match @var{a}, and if that fails, by trying to match @var{b}.
563 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
564 but no other string.@refill
566 @samp{\|} applies to the largest possible surrounding expressions. Only a
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
@samp{\|}.
570 Full backtracking capability exists to handle multiple uses of @samp{\|}.
573 is a grouping construct that serves three purposes:
577 To enclose a set of @samp{\|} alternatives for other operations.
578 Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
581 To enclose a complicated expression for the postfix operators @samp{*},
582 @samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
@samp{bananana}, etc., with any (zero or more) number of @samp{na}
strings.
587 To record a matched substring for future reference.
590 This last application is not a consequence of the idea of a
591 parenthetical grouping; it is a separate feature that is assigned as a
592 second meaning to the same @samp{\( @dots{} \)} construct. In practice
593 there is usually no conflict between the two meanings; when there is
594 a conflict, you can use a ``shy'' group.
596 @item \(?: @dots{} \)
597 @cindex shy group, in regexp
598 specifies a ``shy'' group that does not record the matched substring;
599 you can't refer back to it with @samp{\@var{d}}. This is useful
600 in mechanically combining regular expressions, so that you
601 can add groups for syntactic purposes without interfering with
602 the numbering of the groups that were written by the user.
605 matches the same text that matched the @var{d}th occurrence of a
606 @samp{\( @dots{} \)} construct.
608 After the end of a @samp{\( @dots{} \)} construct, the matcher remembers
609 the beginning and end of the text matched by that construct. Then,
610 later on in the regular expression, you can use @samp{\} followed by the
611 digit @var{d} to mean ``match the same text matched the @var{d}th time
612 by the @samp{\( @dots{} \)} construct.''
614 The strings matching the first nine @samp{\( @dots{} \)} constructs
615 appearing in a regular expression are assigned numbers 1 through 9 in
616 the order that the open-parentheses appear in the regular expression.
617 So you can use @samp{\1} through @samp{\9} to refer to the text matched
618 by the corresponding @samp{\( @dots{} \)} constructs.
620 For example, @samp{\(.*\)\1} matches any newline-free string that is
621 composed of two identical halves. The @samp{\(.*\)} matches the first
half, which may be anything, but the @samp{\1} that follows must match
the same exact text.
625 If a particular @samp{\( @dots{} \)} construct matches more than once
(which can easily happen if it is followed by @samp{*}), only the last
match is recorded.
630 matches the empty string, but only at the beginning
631 of the buffer or string being matched against.
634 matches the empty string, but only at the end of
635 the buffer or string being matched against.
638 matches the empty string, but only at point.
641 matches the empty string, but only at the beginning or
642 end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
643 @samp{foo} as a separate word. @samp{\bballs?\b} matches
644 @samp{ball} or @samp{balls} as a separate word.@refill
646 @samp{\b} matches at the beginning or end of the buffer
647 regardless of what text appears next to it.
matches the empty string, but @emph{not} at the beginning or
end of a word.
654 matches the empty string, but only at the beginning of a word.
655 @samp{\<} matches at the beginning of the buffer only if a
656 word-constituent character follows.
659 matches the empty string, but only at the end of a word. @samp{\>}
660 matches at the end of the buffer only if the contents end with a
661 word-constituent character.
664 matches any word-constituent character. The syntax table
665 determines which characters these are. @xref{Syntax}.
668 matches any character that is not a word-constituent.
671 matches any character whose syntax is @var{c}. Here @var{c} is a
672 character that represents a syntax code: thus, @samp{w} for word
673 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
674 etc. Represent a character of whitespace (which can be a newline) by
675 either @samp{-} or a space character.
678 matches any character whose syntax is not @var{c}.
680 @cindex categories of characters
681 @cindex characters which belong to a specific language
682 @findex describe-categories
684 matches any character that belongs to the category @var{c}. For
685 example, @samp{\cc} matches Chinese characters, @samp{\cg} matches
686 Greek characters, etc. For the description of the known categories,
687 type @kbd{M-x describe-categories @key{RET}}.
690 matches any character that does @emph{not} belong to category
694 The constructs that pertain to words and syntax are controlled by the
695 setting of the syntax table (@pxref{Syntax}).
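
As a rough illustration of a few of these constructs, the following
Lisp forms can be evaluated with @kbd{M-:} or in the @samp{*scratch*}
buffer. This is only a sketch: the test strings are made up,
backslashes are doubled because the regexps are written as Lisp
strings, and the results shown assume the standard syntax table, in
which letters are word constituents.

@example
;; \` and \' anchor the match to the whole string, so this succeeds
;; only when the string consists of two identical halves.
(string-match "\\`\\(.*\\)\\1\\'" "abcabc")    ; => 0
(string-match "\\`\\(.*\\)\\1\\'" "abcab")     ; => nil

;; \b matches the empty string at a word boundary.
(string-match "\\bfoo\\b" "a foo b")           ; => 2
(string-match "\\bfoo\\b" "food")              ; => nil
(string-match "\\bballs?\\b" "two balls")      ; => 4

;; \sw+ matches a run of word-constituent characters.
(string-match "\\sw+" "  hi")                  ; => 2
@end example
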
697 Here is a complicated regexp, used by Emacs to recognize the end of a
698 sentence together with any whitespace that follows. It is given in Lisp
699 syntax to enable you to distinguish the spaces from the tab characters. In
700 Lisp syntax, the string constant begins and ends with a double-quote.
701 @samp{\"} stands for a double-quote as part of the regexp, @samp{\\} for a
backslash as part of the regexp, @samp{\t} for a tab and @samp{\n} for a
newline.

@example
"[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
@end example

@noindent
710 This contains four parts in succession: a character set matching period,
711 @samp{?}, or @samp{!}; a character set matching close-brackets, quotes,
712 or parentheses, repeated any number of times; an alternative in
713 backslash-parentheses that matches end-of-line, a tab, or two spaces;
and a character set matching whitespace characters, repeated any number
of times.

717 To enter the same regexp interactively, you would type @key{TAB} to
718 enter a tab, and @kbd{C-j} to enter a newline. You would also type
719 single backslashes as themselves, instead of doubling them for Lisp syntax.
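
To see the four parts at work, one can try the regexp on a short
string from Lisp. This is a sketch only: the variable name
@code{sentence-end-re} and the sample text are made up for
illustration.

@example
(let ((sentence-end-re "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*")
      (text "He said \"Stop!\"  Then he left."))
  ;; The first match starts at the `!', takes in the closing quote,
  ;; and ends after the two spaces that follow it.
  (string-match sentence-end-re text)
  (list (match-beginning 0) (match-end 0)))
;; => (13 17)
@end example
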

@ignore
@c I commented this out because it is missing vital information
@c and therefore useless. For instance, what do you do to *use* the
@c regular expression when it is finished? What jobs is this good for?

@cindex authoring regular expressions
For convenient interactive development of regular expressions, you
can use the @kbd{M-x re-builder} command. It provides a convenient
interface for creating regular expressions, by giving immediate visual
feedback. The buffer from which @code{re-builder} was invoked becomes
the target for the regexp editor, which pops up in a separate window. At
all times, all the matches in the target buffer for the current
regular expression are highlighted. Each parenthesized sub-expression
of the regexp is shown in a distinct face, which makes it easier to
verify even very complex regexps. (On displays that don't support
colors, Emacs blinks the cursor around the matched text, as it does
for matching parens.)
@end ignore

742 @node Search Case, Replace, Regexps, Search
743 @section Searching and Case
745 @vindex case-fold-search
746 Incremental searches in Emacs normally ignore the case of the text
747 they are searching through, if you specify the text in lower case.
748 Thus, if you specify searching for @samp{foo}, then @samp{Foo} and
749 @samp{foo} are also considered a match. Regexps, and in particular
750 character sets, are included: @samp{[ab]} would match @samp{a} or
751 @samp{A} or @samp{b} or @samp{B}.@refill
753 An upper-case letter anywhere in the incremental search string makes
754 the search case-sensitive. Thus, searching for @samp{Foo} does not find
755 @samp{foo} or @samp{FOO}. This applies to regular expression search as
756 well as to string search. The effect ceases if you delete the
757 upper-case letter from the search string.
759 If you set the variable @code{case-fold-search} to @code{nil}, then
760 all letters must match exactly, including case. This is a per-buffer
761 variable; altering the variable affects only the current buffer, but
762 there is a default value which you can change as well. @xref{Locals}.
763 This variable applies to nonincremental searches also, including those
764 performed by the replace commands (@pxref{Replace}) and the minibuffer
765 history matching commands (@pxref{Minibuffer History}).
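
The effect of @code{case-fold-search} can also be observed from Lisp,
as in this small sketch with made-up test strings. Note that the
automatic switch to case-sensitive matching when the search string
contains an upper-case letter is a feature of the interactive search
commands; @code{string-match} looks only at the variable.

@example
(let ((case-fold-search t))
  (list (string-match "foo" "Foo")
        (string-match "[ab]" "B")))
;; => (0 0)

(let ((case-fold-search nil))
  (string-match "foo" "Foo"))
;; => nil
@end example
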
767 @node Replace, Other Repeating Search, Search Case, Search
768 @section Replacement Commands
770 @cindex search-and-replace commands
771 @cindex string substitution
772 @cindex global substitution
774 Global search-and-replace operations are not needed as often in Emacs
775 as they are in other editors@footnote{In some editors,
776 search-and-replace operations are the only convenient way to make a
777 single change in the text.}, but they are available. In addition to the
778 simple @kbd{M-x replace-string} command which is like that found in most
779 editors, there is a @kbd{M-x query-replace} command which asks you, for
780 each occurrence of the pattern, whether to replace it.
782 The replace commands normally operate on the text from point to the
783 end of the buffer; however, in Transient Mark mode, when the mark is
784 active, they operate on the region. The replace commands all replace
785 one string (or regexp) with one replacement string. It is possible to
786 perform several replacements in parallel using the command
787 @code{expand-region-abbrevs} (@pxref{Expanding Abbrevs}).

@menu
* Unconditional Replace:: Replacing all matches for a string.
* Regexp Replace:: Replacing all matches for a regexp.
* Replacement and Case:: How replacements preserve case of letters.
* Query Replace:: How to use querying.
@end menu

796 @node Unconditional Replace, Regexp Replace, Replace, Replace
797 @subsection Unconditional Replacement
798 @findex replace-string
799 @findex replace-regexp

@table @kbd
@item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
Replace every occurrence of @var{string} with @var{newstring}.
@item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
Replace every match for @var{regexp} with @var{newstring}.
@end table

808 To replace every instance of @samp{foo} after point with @samp{bar},
809 use the command @kbd{M-x replace-string} with the two arguments
810 @samp{foo} and @samp{bar}. Replacement happens only in the text after
811 point, so if you want to cover the whole buffer you must go to the
812 beginning first. All occurrences up to the end of the buffer are
813 replaced; to limit replacement to part of the buffer, narrow to that
814 part of the buffer before doing the replacement (@pxref{Narrowing}).
815 In Transient Mark mode, when the region is active, replacement is
816 limited to the region (@pxref{Transient Mark}).
818 When @code{replace-string} exits, it leaves point at the last
819 occurrence replaced. It sets the mark to the prior position of point
820 (where the @code{replace-string} command was issued); use @kbd{C-u
821 C-@key{SPC}} to move back there.
823 A numeric argument restricts replacement to matches that are surrounded
824 by word boundaries. The argument's value doesn't matter.
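
In a Lisp program, the same ``from point to the end of the buffer''
behavior is usually obtained with an explicit search loop rather than
by calling @code{replace-string}. The following is a minimal sketch;
the function name @code{my-replace-after-point} is hypothetical, and
unlike the interactive command it inserts the replacement literally,
with no case conversion.

@example
(defun my-replace-after-point (from to)
  "Replace each occurrence of FROM after point with TO.
Return the number of replacements made."
  (let ((count 0))
    (save-excursion
      ;; search-forward returns nil at the end of the buffer,
      ;; which ends the loop.
      (while (search-forward from nil t)
        (replace-match to t t)
        (setq count (1+ count))))
    count))

;; Only the text after point is affected.
(with-temp-buffer
  (insert "foo one foo two")
  (goto-char 5)
  (my-replace-after-point "foo" "bar")
  (buffer-string))
;; => "foo one bar two"
@end example

To cover the whole buffer, move to @code{(point-min)} before calling
the function.
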
826 @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
827 @subsection Regexp Replacement
829 The @kbd{M-x replace-string} command replaces exact matches for a
830 single string. The similar command @kbd{M-x replace-regexp} replaces
831 any match for a specified pattern.
833 In @code{replace-regexp}, the @var{newstring} need not be constant: it
834 can refer to all or part of what is matched by the @var{regexp}.
835 @samp{\&} in @var{newstring} stands for the entire match being replaced.
836 @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
837 whatever matched the @var{d}th parenthesized grouping in @var{regexp}.
838 To include a @samp{\} in the text to replace with, you must enter
839 @samp{\\}. For example,

@example
M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
@end example

@noindent
846 replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
847 with @samp{cddr-safe}.

@example
M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
@end example

@noindent
854 performs the inverse transformation.
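
The same two substitutions can be tried on a string from Lisp with
@code{replace-regexp-in-string}, which understands @samp{\&} and
@samp{\@var{d}} in the replacement in the same way. This is only an
illustration with a made-up input string; backslashes are doubled
because of Lisp string syntax, and the fourth argument @code{t}
suppresses case conversion.

@example
(replace-regexp-in-string "c[ad]+r" "\\&-safe" "cadr cddr" t)
;; => "cadr-safe cddr-safe"

(replace-regexp-in-string "\\(c[ad]+r\\)-safe" "\\1"
                          "cadr-safe cddr-safe" t)
;; => "cadr cddr"
@end example
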
856 @node Replacement and Case, Query Replace, Regexp Replace, Replace
857 @subsection Replace Commands and Case
If the first argument of a replace command is all lower case, the
command ignores case while searching for occurrences to
replace---provided @code{case-fold-search} is non-@code{nil}. If
@code{case-fold-search} is set to @code{nil}, case is always significant
in all searches.

866 In addition, when the @var{newstring} argument is all or partly lower
867 case, replacement commands try to preserve the case pattern of each
868 occurrence. Thus, the command

@example
M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
@end example

@noindent
875 replaces a lower case @samp{foo} with a lower case @samp{bar}, an
876 all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with
@samp{Bar}. (These three alternatives---lower case, all caps, and
capitalized---are the only ones that @code{replace-string} can
distinguish.)

881 If upper-case letters are used in the replacement string, they remain
882 upper case every time that text is inserted. If upper-case letters are
883 used in the first argument, the second argument is always substituted
884 exactly as given, with no case conversion. Likewise, if either
885 @code{case-replace} or @code{case-fold-search} is set to @code{nil},
886 replacement is done without case conversion.
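
A similar case conversion can be observed from Lisp. The sketch below
uses @code{replace-regexp-in-string} on a made-up string; it is an
analogy rather than the code the interactive commands run, and it does
not consult @code{case-replace}. Its optional fourth argument, when
non-@code{nil}, turns the conversion off.

@example
(let ((case-fold-search t))
  ;; Each replacement follows the case pattern of what it replaces.
  (replace-regexp-in-string "foo" "bar" "foo FOO Foo"))
;; => "bar BAR Bar"

(let ((case-fold-search t))
  ;; With the fourth argument t, the replacement is used as given.
  (replace-regexp-in-string "foo" "bar" "foo FOO Foo" t))
;; => "bar bar bar"
@end example
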
888 @node Query Replace,, Replacement and Case, Replace
889 @subsection Query Replace
890 @cindex query replace

@table @kbd
@item M-% @var{string} @key{RET} @var{newstring} @key{RET}
@itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
Replace some occurrences of @var{string} with @var{newstring}.
@item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET}
@itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
Replace some matches for @var{regexp} with @var{newstring}.
@end table

902 @findex query-replace
903 If you want to change only some of the occurrences of @samp{foo} to
904 @samp{bar}, not all of them, then you cannot use an ordinary
905 @code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}).
906 This command finds occurrences of @samp{foo} one by one, displays each
907 occurrence and asks you whether to replace it. A numeric argument to
908 @code{query-replace} tells it to consider only occurrences that are
909 bounded by word-delimiter characters. This preserves case, just like
@code{replace-string}, provided @code{case-replace} is non-@code{nil},
as it normally is.

914 @findex query-replace-regexp
Aside from querying, @code{query-replace} works just like
@code{replace-string}, and @code{query-replace-regexp} works just like
@code{replace-regexp}. The latter command is run by @kbd{C-M-%}.

919 The things you can type when you are shown an occurrence of @var{string}
920 or a match for @var{regexp} are:

@ignore @c Not worth it.
@kindex SPC @r{(query-replace)}
@kindex DEL @r{(query-replace)}
@kindex , @r{(query-replace)}
@kindex RET @r{(query-replace)}
@kindex . @r{(query-replace)}
@kindex ! @r{(query-replace)}
@kindex ^ @r{(query-replace)}
@kindex C-r @r{(query-replace)}
@kindex C-w @r{(query-replace)}
@kindex C-l @r{(query-replace)}
@end ignore

@table @kbd
@item @key{SPC}
to replace the occurrence with @var{newstring}.

@item @key{DEL}
to skip to the next occurrence without replacing this one.

@item , @r{(Comma)}
to replace this occurrence and display the result. You are then asked
for another input character to say what to do next. Since the
replacement has already been made, @key{DEL} and @key{SPC} are
equivalent in this situation; both move to the next occurrence.

You can type @kbd{C-r} at this point (see below) to alter the replaced
text. You can also type @kbd{C-x u} to undo the replacement; this exits
the @code{query-replace}, so if you want to do further replacement you
must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart
(@pxref{Repetition}).

@item @key{RET}
to exit without doing any more replacements.

@item .@: @r{(Period)}
to replace this occurrence and then exit without searching for more
occurrences.

@item !
to replace all remaining occurrences without asking again.

@item ^
to go back to the position of the previous occurrence (or what used to
be an occurrence), in case you changed it by mistake. This works by
popping the mark ring. Only one @kbd{^} in a row is meaningful, because
only one previous replacement position is kept during @code{query-replace}.

@item C-r
to enter a recursive editing level, in case the occurrence needs to be
edited rather than just replaced with @var{newstring}. When you are
done, exit the recursive editing level with @kbd{C-M-c} to proceed to
the next occurrence. @xref{Recursive Edit}.

@item C-w
to delete the occurrence, and then enter a recursive editing level as in
@kbd{C-r}. Use the recursive edit to insert text to replace the deleted
occurrence of @var{string}. When done, exit the recursive editing level
with @kbd{C-M-c} to proceed to the next occurrence.

@item e
to edit the replacement string in the minibuffer. When you exit the
minibuffer by typing @key{RET}, the minibuffer contents replace the
current occurrence of the pattern. They also become the new
replacement string for any further occurrences.

@item C-l
to redisplay the screen. Then you must type another character to
specify what to do with this occurrence.

@item C-h
to display a message summarizing these options. Then you must type
another character to specify what to do with this occurrence.
@end table

998 Some other characters are aliases for the ones listed above: @kbd{y},
@kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and
@key{RET}.

1002 Aside from this, any other character exits the @code{query-replace},
1003 and is then reread as part of a key sequence. Thus, if you type
@kbd{C-k}, it exits the @code{query-replace} and then kills to end of
line.

1007 To restart a @code{query-replace} once it is exited, use @kbd{C-x
1008 @key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
used the minibuffer to read its arguments. @xref{Repetition, C-x ESC
ESC}.

1012 See also @ref{Transforming File Names}, for Dired commands to rename,
1013 copy, or link files by replacing regexp matches in file names.
1015 @node Other Repeating Search,, Replace, Search
1016 @section Other Search-and-Loop Commands
1018 Here are some other commands that find matches for a regular
1019 expression. They all ignore case in matching, if the pattern contains
1020 no upper-case letters and @code{case-fold-search} is non-@code{nil}.
1021 Aside from @code{occur}, all operate on the text from point to the end
1022 of the buffer, or on the active region in Transient Mark mode.
1024 @findex list-matching-lines
1027 @findex delete-non-matching-lines
1028 @findex delete-matching-lines

@table @kbd
@item M-x occur @key{RET} @var{regexp} @key{RET}
Display a list showing each line in the buffer that contains a match
for @var{regexp}. To limit the search to part of the buffer, narrow
to that part (@pxref{Narrowing}). A numeric argument @var{n}
specifies to display @var{n} lines of context before and after each
occurrence.

@kindex RET @r{(Occur mode)}
The buffer @samp{*Occur*} containing the output serves as a menu for
finding the occurrences in their original context. Click @kbd{Mouse-2}
on an occurrence listed in @samp{*Occur*}, or position point there and
type @key{RET}; this switches to the buffer that was searched and
moves point to the original of the chosen occurrence.

@item M-x list-matching-lines
Synonym for @kbd{M-x occur}.

@item M-x how-many @key{RET} @var{regexp} @key{RET}
Print the number of matches for @var{regexp} that exist in the buffer
after point. In Transient Mark mode, if the region is active, the
command operates on the region instead.

@item M-x flush-lines @key{RET} @var{regexp} @key{RET}
Delete each line that contains a match for @var{regexp}, operating on
the text after point. In Transient Mark mode, if the region is
active, the command operates on the region instead.

@item M-x keep-lines @key{RET} @var{regexp} @key{RET}
Delete each line that @emph{does not} contain a match for
@var{regexp}, operating on the text after point. In Transient Mark
mode, if the region is active, the command operates on the region
instead.
@end table

1067 You can also search multiple files under control of a tags table
(@pxref{Tags Search}) or through the Dired @kbd{A} command
1069 (@pxref{Operating on Files}), or ask the @code{grep} program to do it
1070 (@pxref{Grep Searching}).
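
The line-oriented commands above can also be called from Lisp. The
following sketch, using a temporary buffer with made-up contents,
shows @code{flush-lines} and @code{keep-lines} acting on the text
after point; the results shown assume the mark is not active.

@example
;; Delete every line after point that matches the regexp.
(with-temp-buffer
  (insert "alpha\n# comment\nbeta\n# another\n")
  (goto-char (point-min))
  (flush-lines "^#")
  (buffer-string))
;; => "alpha\nbeta\n"

;; Delete every line after point that does NOT match the regexp.
(with-temp-buffer
  (insert "alpha\n# comment\nbeta\n")
  (goto-char (point-min))
  (keep-lines "^#")
  (buffer-string))
;; => "# comment\n"
@end example
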