1 @c This is part of the Emacs manual.
2 @c Copyright (C) 1985, 86, 87, 93, 94, 95, 97, 2000, 2001
3 @c Free Software Foundation, Inc.
4 @c See file emacs.texi for copying conditions.
5 @node Search, Fixit, Display, Top
6 @chapter Searching and Replacement
8 @cindex finding strings within text
10 Like other editors, Emacs has commands for searching for occurrences of
11 a string. The principal search command is unusual in that it is
12 @dfn{incremental}; it begins to search before you have finished typing the
13 search string. There are also nonincremental search commands more like
14 those of other editors.
16 Besides the usual @code{replace-string} command that finds all
17 occurrences of one string and replaces them with another, Emacs has a fancy
18 replacement command called @code{query-replace} which asks interactively
19 which occurrences to replace.
@menu
* Incremental Search::     Search happens as you type the string.
* Nonincremental Search::  Specify entire string and then search.
* Word Search::            Search for sequence of words.
* Regexp Search::          Search for match for a regexp.
* Regexps::                Syntax of regular expressions.
* Search Case::            To ignore case while searching, or not.
* Replace::                Search, and replace some or all matches.
* Other Repeating Search:: Operating on all matches for some regexp.
@end menu
32 @node Incremental Search, Nonincremental Search, Search, Search
33 @section Incremental Search
35 @cindex incremental search
36 An incremental search begins searching as soon as you type the first
37 character of the search string. As you type in the search string, Emacs
38 shows you where the string (as you have typed it so far) would be
39 found. When you have typed enough characters to identify the place you
40 want, you can stop. Depending on what you plan to do next, you may or
41 may not need to terminate the search explicitly with @key{RET}.
@table @kbd
@item C-s
Incremental search forward (@code{isearch-forward}).
@item C-r
Incremental search backward (@code{isearch-backward}).
@end table
52 @findex isearch-forward
53 @kbd{C-s} starts an incremental search. @kbd{C-s} reads characters from
54 the keyboard and positions the cursor at the first occurrence of the
55 characters that you have typed. If you type @kbd{C-s} and then @kbd{F},
56 the cursor moves right after the first @samp{F}. Type an @kbd{O}, and see
57 the cursor move to after the first @samp{FO}. After another @kbd{O}, the
58 cursor is after the first @samp{FOO} after the place where you started the
59 search. At each step, the buffer text that matches the search string is
60 highlighted, if the terminal can do that; at each step, the current search
61 string is updated in the echo area.
63 If you make a mistake in typing the search string, you can cancel
characters with @key{DEL}. Each @key{DEL} cancels the last character of the
search string. This does not happen until Emacs is ready to read another
66 input character; first it must either find, or fail to find, the character
67 you want to erase. If you do not want to wait for this to happen, use
68 @kbd{C-g} as described below.
70 When you are satisfied with the place you have reached, you can type
71 @key{RET}, which stops searching, leaving the cursor where the search
72 brought it. Also, any command not specially meaningful in searches
73 stops the searching and is then executed. Thus, typing @kbd{C-a} would
74 exit the search and then move to the beginning of the line. @key{RET}
75 is necessary only if the next command you want to type is a printing
76 character, @key{DEL}, @key{RET}, or another control character that is
77 special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
78 @kbd{C-y}, @kbd{M-y}, @kbd{M-r}, or @kbd{M-s}).
80 Sometimes you search for @samp{FOO} and find it, but not the one you
81 expected to find. There was a second @samp{FOO} that you forgot
82 about, before the one you were aiming for. In this event, type
83 another @kbd{C-s} to move to the next occurrence of the search string.
84 You can repeat this any number of times. If you overshoot, you can
85 cancel some @kbd{C-s} characters with @key{DEL}.
87 After you exit a search, you can search for the same string again by
88 typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
89 incremental search, and the second @kbd{C-s} means ``search again.''
91 To reuse earlier search strings, use the @dfn{search ring}. The
92 commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
93 string to reuse. These commands leave the selected search ring element
94 in the minibuffer, where you can edit it. Type @kbd{C-s} or @kbd{C-r}
95 to terminate editing the string and search for it.
97 If your string is not found at all, the echo area says @samp{Failing
98 I-Search}. The cursor is after the place where Emacs found as much of your
99 string as it could. Thus, if you search for @samp{FOOT}, and there is no
100 @samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
101 At this point there are several things you can do. If your string was
102 mistyped, you can rub some of it out and correct it. If you like the place
103 you have found, you can type @key{RET} or some other Emacs command to
104 ``accept what the search offered.'' Or you can type @kbd{C-g}, which
105 removes from the search string the characters that could not be found (the
106 @samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
107 @samp{FOOT}). A second @kbd{C-g} at that point cancels the search
108 entirely, returning point to where it was when the search started.
110 An upper-case letter in the search string makes the search
111 case-sensitive. If you delete the upper-case character from the search
112 string, it ceases to have this effect. @xref{Search Case}.
114 To search for a newline, type @kbd{C-j}. To search for another
115 control character, such as control-S or carriage return, you must quote
116 it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
117 to its use for insertion (@pxref{Inserting Text}): it causes the
118 following character to be treated the way any ``ordinary'' character is
119 treated in the same context. You can also specify a character by its
120 octal code: enter @kbd{C-q} followed by a sequence of octal digits.
122 @cindex searching for non-ASCII characters
123 @cindex input method, during incremental search
124 To search for non-ASCII characters, you must use an input method
125 (@pxref{Input Methods}). If an input method is turned on in the
126 current buffer when you start the search, you can use it while you
127 type the search string also. Emacs indicates that by including the
128 input method mnemonic in its prompt, like this:
@example
I-search [@var{im}]:
@end example
135 @findex isearch-toggle-input-method
136 @findex isearch-toggle-specified-input-method
137 where @var{im} is the mnemonic of the active input method. You can
138 toggle (enable or disable) the input method while you type the search
139 string with @kbd{C-\} (@code{isearch-toggle-input-method}). You can
140 turn on a certain (non-default) input method with @kbd{C-^}
141 (@code{isearch-toggle-specified-input-method}), which prompts for the
142 name of the input method. Note that the input method you turn on
143 during incremental search is turned on in the current buffer as well.
145 If a search is failing and you ask to repeat it by typing another
146 @kbd{C-s}, it starts again from the beginning of the buffer.
147 Repeating a failing reverse search with @kbd{C-r} starts again from
148 the end. This is called @dfn{wrapping around}, and @samp{Wrapped}
149 appears in the search prompt once this has happened. If you keep on
150 going past the original starting point of the search, it changes to
151 @samp{Overwrapped}, which means that you are revisiting matches that
152 you have already seen.
154 @cindex quitting (in search)
155 The @kbd{C-g} ``quit'' character does special things during searches;
156 just what it does depends on the status of the search. If the search has
157 found what you specified and is waiting for input, @kbd{C-g} cancels the
158 entire search. The cursor moves back to where you started the search. If
159 @kbd{C-g} is typed when there are characters in the search string that have
160 not been found---because Emacs is still searching for them, or because it
161 has failed to find them---then the search string characters which have not
162 been found are discarded from the search string. With them gone, the
163 search is now successful and waiting for more input, so a second @kbd{C-g}
164 will cancel the entire search.
166 You can change to searching backwards with @kbd{C-r}. If a search fails
167 because the place you started was too late in the file, you should do this.
168 Repeated @kbd{C-r} keeps looking for more occurrences backwards. A
@kbd{C-s} starts going forwards again. @kbd{C-r} in a search can be canceled
with @key{DEL}.
173 @findex isearch-backward
174 If you know initially that you want to search backwards, you can use
175 @kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r} as
176 a key runs a command (@code{isearch-backward}) to search backward. A
177 backward search finds matches that are entirely before the starting
178 point, just as a forward search finds matches that begin after it.
180 The characters @kbd{C-y} and @kbd{C-w} can be used in incremental
181 search to grab text from the buffer into the search string. This makes
182 it convenient to search for another occurrence of text at point.
183 @kbd{C-w} copies the word after point as part of the search string,
184 advancing point over that word. Another @kbd{C-s} to repeat the search
185 will then search for a string including that word. @kbd{C-y} is similar
186 to @kbd{C-w} but copies all the rest of the current line into the search
187 string. Both @kbd{C-y} and @kbd{C-w} convert the text they copy to
188 lower case if the search is currently not case-sensitive; this is so the
189 search remains case-insensitive.
191 The character @kbd{M-y} copies text from the kill ring into the search
192 string. It uses the same text that @kbd{C-y} as a command would yank.
193 @kbd{Mouse-2} in the echo area does the same.
196 When you exit the incremental search, it sets the mark to where point
197 @emph{was}, before the search. That is convenient for moving back
198 there. In Transient Mark mode, incremental search sets the mark without
199 activating it, and does so only if the mark is not already active.
201 @cindex lazy search highlighting
202 @vindex isearch-lazy-highlight
203 When you pause for a little while during incremental search, it
204 highlights all other possible matches for the search string. This
205 makes it easier to anticipate where you can get to by typing @kbd{C-s}
206 or @kbd{C-r} to repeat the search. The short delay before highlighting
207 other matches helps indicate which match is the current one.
208 If you don't like this feature, you can turn it off by setting
209 @code{isearch-lazy-highlight} to @code{nil}.
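As a minimal sketch, assuming you keep your customizations in an init
file such as @file{~/.emacs}, the following line turns the feature off:

@example
;; Do not highlight the other matches during incremental search.
(setq isearch-lazy-highlight nil)
@end example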
211 @vindex isearch-lazy-highlight-face
212 @cindex faces for highlighting search matches
You can control how the highlighting of matches looks by
214 customizing the faces @code{isearch} (used for the current match) and
215 @code{isearch-lazy-highlight-face} (used for the other matches).
217 @vindex isearch-mode-map
218 To customize the special characters that incremental search understands,
219 alter their bindings in the keymap @code{isearch-mode-map}. For a list
220 of bindings, look at the documentation of @code{isearch-mode} with
221 @kbd{C-h f isearch-mode @key{RET}}.
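For instance, here is a sketch of one such change. The choice of
@kbd{C-o} is purely illustrative (an assumption of this example, not a
standard binding); @code{isearch-yank-line} is the command that copies
the rest of the current line into the search string.

@example
;; Hypothetical binding: make C-o, typed during a search,
;; grab the rest of the current line into the search string.
(define-key isearch-mode-map "\C-o" 'isearch-yank-line)
@end example

Without such a binding, a key that has no special meaning in searches
exits the search and is then executed as an ordinary command.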
223 @subsection Slow Terminal Incremental Search
225 Incremental search on a slow terminal uses a modified style of display
226 that is designed to take less time. Instead of redisplaying the buffer at
227 each place the search gets to, it creates a new single-line window and uses
228 that to display the line that the search has found. The single-line window
comes into play as soon as point gets outside of the text that is already
on the screen.
232 When you terminate the search, the single-line window is removed.
Then Emacs redisplays the window in which the search was done, to show
the new position of point.
236 @vindex search-slow-speed
237 The slow terminal style of display is used when the terminal baud rate is
238 less than or equal to the value of the variable @code{search-slow-speed},
241 @vindex search-slow-window-lines
242 The number of lines to use in slow terminal search display is controlled
243 by the variable @code{search-slow-window-lines}. Its normal value is 1.
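A sketch of how these two variables might be set in an init file
follows. Both values are illustrative assumptions, chosen only to show
the form of the settings.

@example
;; Use the slow-terminal display at 2400 baud or less
;; (2400 is an example value).
(setq search-slow-speed 2400)

;; Show three lines of context instead of one (example value).
(setq search-slow-window-lines 3)
@end example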
245 @node Nonincremental Search, Word Search, Incremental Search, Search
246 @section Nonincremental Search
247 @cindex nonincremental search
249 Emacs also has conventional nonincremental search commands, which require
250 you to type the entire search string before searching begins.
@table @kbd
@item C-s @key{RET} @var{string} @key{RET}
Search for @var{string}.
@item C-r @key{RET} @var{string} @key{RET}
Search backward for @var{string}.
@end table
259 To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
260 enters the minibuffer to read the search string; terminate the string
with @key{RET}, and then the search takes place. If the string is not
found, the search command signals an error.
264 The way @kbd{C-s @key{RET}} works is that the @kbd{C-s} invokes
265 incremental search, which is specially programmed to invoke nonincremental
266 search if the argument you give it is empty. (Such an empty argument would
267 otherwise be useless.) @kbd{C-r @key{RET}} also works this way.
269 However, nonincremental searches performed using @kbd{C-s @key{RET}} do
270 not call @code{search-forward} right away. The first thing done is to see
271 if the next character is @kbd{C-w}, which requests a word search.
276 @findex search-forward
277 @findex search-backward
278 Forward and backward nonincremental searches are implemented by the
279 commands @code{search-forward} and @code{search-backward}. These
commands may be bound to keys in the usual manner. You can also reach them
through the incremental search commands; this arrangement exists for
historical reasons, and to avoid the need to find suitable key sequences
for them.
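As a rough sketch, these commands can also be called from Lisp or given
a key of their own. The search string @samp{FOO} and the key @kbd{C-c s}
are assumptions made for this example.

@example
;; Look for "FOO" from the beginning of the buffer.  With a non-nil
;; third argument, a failed search returns nil instead of signaling
;; an error.  save-excursion restores point afterward.
(save-excursion
  (goto-char (point-min))
  (search-forward "FOO" nil t))

;; Hypothetical key binding for the nonincremental command.
(global-set-key "\C-cs" 'search-forward)
@end example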
285 @node Word Search, Regexp Search, Nonincremental Search, Search
@section Word Search
@cindex word search
289 Word search searches for a sequence of words without regard to how the
290 words are separated. More precisely, you type a string of many words,
291 using single spaces to separate them, and the string can be found even if
292 there are multiple spaces, newlines or other punctuation between the words.
294 Word search is useful for editing a printed document made with a text
295 formatter. If you edit while looking at the printed, formatted version,
296 you can't tell where the line breaks are in the source file. With word
297 search, you can search without having to know them.
@table @kbd
@item C-s @key{RET} C-w @var{words} @key{RET}
Search for @var{words}, ignoring details of punctuation.
@item C-r @key{RET} C-w @var{words} @key{RET}
Search backward for @var{words}, ignoring details of punctuation.
@end table
306 Word search is a special case of nonincremental search and is invoked
307 with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
308 which must always be terminated with @key{RET}. Being nonincremental,
309 this search does not start until the argument is terminated. It works
by constructing a regular expression and searching for that; see
@ref{Regexp Search}.
313 Use @kbd{C-r @key{RET} C-w} to do backward word search.
315 @findex word-search-forward
316 @findex word-search-backward
317 Forward and backward word searches are implemented by the commands
318 @code{word-search-forward} and @code{word-search-backward}. These
commands may be bound to keys in the usual manner. You can also reach them
through the incremental search commands; this arrangement exists for
historical reasons, and to avoid the need to find suitable key sequences
for them.
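The following sketch illustrates the point about separators. It uses a
temporary buffer whose contents are invented for the example: the two
words are separated by a comma, a newline and several spaces, yet the
search string contains only a single space.

@example
(with-temp-buffer
  ;; Sample text, made up for this illustration.
  (insert "hello,\n   world")
  (goto-char (point-min))
  ;; Returns a buffer position (non-nil) if the words are found.
  (word-search-forward "hello world" nil t))
@end example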
323 @node Regexp Search, Regexps, Word Search, Search
324 @section Regular Expression Search
325 @cindex regular expression
328 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
329 denotes a class of alternative strings to match, possibly infinitely
330 many. In GNU Emacs, you can search for the next match for a regexp
331 either incrementally or not.
334 @findex isearch-forward-regexp
336 @findex isearch-backward-regexp
337 Incremental search for a regexp is done by typing @kbd{C-M-s}
338 (@code{isearch-forward-regexp}). This command reads a search string
339 incrementally just like @kbd{C-s}, but it treats the search string as a
340 regexp rather than looking for an exact match against the text in the
341 buffer. Each time you add text to the search string, you make the
342 regexp longer, and the new regexp is searched for. Invoking @kbd{C-s}
343 with a prefix argument (its value does not matter) is another way to do
344 a forward incremental regexp search. To search backward for a regexp,
use @kbd{C-M-r} (@code{isearch-backward-regexp}), or @kbd{C-r} with a
prefix argument.
348 All of the control characters that do special things within an
349 ordinary incremental search have the same function in incremental regexp
350 search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
351 search retrieves the last incremental search regexp used; that is to
352 say, incremental regexp and non-regexp searches have independent
353 defaults. They also have separate search rings that you can access with
354 @kbd{M-p} and @kbd{M-n}.
356 If you type @key{SPC} in incremental regexp search, it matches any
357 sequence of whitespace characters, including newlines. If you want
358 to match just a space, type @kbd{C-q @key{SPC}}.
360 Note that adding characters to the regexp in an incremental regexp
361 search can make the cursor move back and start again. For example, if
362 you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
363 backs up in case the first @samp{bar} precedes the first @samp{foo}.
365 @findex re-search-forward
366 @findex re-search-backward
367 Nonincremental search for a regexp is done by the functions
368 @code{re-search-forward} and @code{re-search-backward}. You can invoke
369 these with @kbd{M-x}, or bind them to keys, or invoke them by way of
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
@key{RET}}.
373 If you use the incremental regexp search commands with a prefix
374 argument, they perform ordinary string search, like
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
Search}.
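As a sketch of the nonincremental functions called from Lisp, the
following looks for either of two strings. The regexp is an arbitrary
example; note that each backslash of the regexp is written twice inside
a Lisp string.

@example
;; Find the first "foo" or "bar" in the buffer.  The regexp foo\|bar
;; is written "foo\\|bar" in Lisp syntax.  With a non-nil third
;; argument, failure returns nil rather than signaling an error.
(save-excursion
  (goto-char (point-min))
  (re-search-forward "foo\\|bar" nil t))
@end example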
378 @node Regexps, Search Case, Regexp Search, Search
379 @section Syntax of Regular Expressions
380 @cindex syntax of regexps
382 Regular expressions have a syntax in which a few characters are
383 special constructs and the rest are @dfn{ordinary}. An ordinary
384 character is a simple regular expression which matches that same
385 character and nothing else. The special characters are @samp{$},
386 @samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, @samp{]} and
387 @samp{\}. Any other character appearing in a regular expression is
388 ordinary, unless a @samp{\} precedes it. (When you use regular
expressions in a Lisp program, each @samp{\} must be doubled; see the
390 example near the end of this section.)
392 For example, @samp{f} is not a special character, so it is ordinary, and
393 therefore @samp{f} is a regular expression that matches the string
394 @samp{f} and no other string. (It does @emph{not} match the string
395 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches
396 only @samp{o}. (When case distinctions are being ignored, these regexps
397 also match @samp{F} and @samp{O}, but we consider this a generalization
398 of ``the same string,'' rather than an exception.)
400 Any two regular expressions @var{a} and @var{b} can be concatenated. The
401 result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
that string.
405 As a simple example, we can concatenate the regular expressions @samp{f}
406 and @samp{o} to get the regular expression @samp{fo}, which matches only
407 the string @samp{fo}. Still trivial. To do something nontrivial, you
408 need to use one of the special characters. Here is a list of them.
411 @item .@: @r{(Period)}
412 is a special character that matches any single character except a newline.
413 Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.

@item *
is not a construct by itself; it is a postfix operator that means to
419 match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).
423 @samp{*} always applies to the @emph{smallest} possible preceding
424 expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
425 @samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
427 The matcher processes a @samp{*} construct by matching, immediately,
428 as many repetitions as can be found. Then it continues with the rest
429 of the pattern. If that fails, backtracking occurs, discarding some
430 of the matches of the @samp{*}-modified construct in case that makes
431 it possible to match the rest of the pattern. For example, in matching
432 @samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
433 tries to match all three @samp{a}s; but the rest of the pattern is
434 @samp{ar} and there is only @samp{r} left to match, so this try fails.
435 The next alternative is for @samp{a*} to match only two @samp{a}s.
436 With this choice, the rest of the regexp matches successfully.@refill

@item +
is a postfix operator, similar to @samp{*} except that it must match
440 the preceding expression at least once. So, for example, @samp{ca+r}
441 matches the strings @samp{car} and @samp{caaaar} but not the string
442 @samp{cr}, whereas @samp{ca*r} matches all three strings.

@item ?
is a postfix operator, similar to @samp{*} except that it can match the
446 preceding expression either once or not at all. For example,
447 @samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.

@item *?, +?, ??
@cindex non-greedy regexp matching
451 are non-greedy variants of the operators above. The normal operators
452 @samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
453 much as they can, as long as the overall regexp can still match. With
a following @samp{?}, they are non-greedy: they will match as little
as possible.
457 Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
458 and the string @samp{abbbb}; but if you try to match them both against
459 the text @samp{abbb}, @samp{ab*} will match it all (the longest valid
match), while @samp{ab*?} will match just @samp{a} (the shortest
valid match).

@item \@{@var{n}\@}
is a postfix operator that specifies repetition @var{n} times---that
465 is, the preceding regular expression must match exactly @var{n} times
in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
and nothing else.
469 @item \@{@var{n},@var{m}\@}
470 is a postfix operator that specifies repetition between @var{n} and
471 @var{m} times---that is, the preceding regular expression must match
472 at least @var{n} times, but no more than @var{m} times. If @var{m} is
473 omitted, then there is no upper limit, but the preceding regular
474 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
475 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
476 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.

@item [ @dots{} ]
is a @dfn{character set}, which begins with @samp{[} and is terminated
480 by @samp{]}. In the simplest case, the characters between the two
481 brackets are what this set can match.
483 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
484 @samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
485 (including the empty string), from which it follows that @samp{c[ad]*r}
486 matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
488 You can also include character ranges in a character set, by writing the
489 starting and ending characters with a @samp{-} between them. Thus,
490 @samp{[a-z]} matches any lower-case ASCII letter. Ranges may be
491 intermixed freely with individual characters, as in @samp{[a-z$%.]},
which matches any lower-case ASCII letter or @samp{$}, @samp{%} or
period.
495 Note that the usual regexp special characters are not special inside a
496 character set. A completely different set of special characters exists
497 inside character sets: @samp{]}, @samp{-} and @samp{^}.
499 To include a @samp{]} in a character set, you must make it the first
500 character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
501 include a @samp{-}, write @samp{-} as the first or last character of the
set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
and @samp{-}.

To include @samp{^} in a set, put it anywhere but at the beginning of
the set.
508 When you use a range in case-insensitive search, you should write both
509 ends of the range in upper case, or both in lower case, or both should
510 be non-letters. The behavior of a mixed-case range such as @samp{A-z}
511 is somewhat ill-defined, and it may change in future Emacs versions.

@item [^ @dots{} ]
@samp{[^} begins a @dfn{complemented character set}, which matches any
515 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
516 all characters @emph{except} letters and digits.
518 @samp{^} is not special in a character set unless it is the first
519 character. The character following the @samp{^} is treated as if it
520 were first (in other words, @samp{-} and @samp{]} are not special there).
522 A complemented character set can match a newline, unless newline is
523 mentioned as one of the characters not to match. This is in contrast to
524 the handling of regexps in programs such as @code{grep}.

@item ^
is a special character that matches the empty string, but only at the
528 beginning of a line in the text being matched. Otherwise it fails to
529 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
530 the beginning of a line.

@item $
is similar to @samp{^} but matches only at the end of a line. Thus,
534 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.

@item \
has two functions: it quotes the special characters (including
538 @samp{\}), and it introduces additional special constructs.
540 Because @samp{\} quotes special characters, @samp{\$} is a regular
541 expression that matches only @samp{$}, and @samp{\[} is a regular
542 expression that matches only @samp{[}, and so on.
545 Note: for historical compatibility, special characters are treated as
546 ordinary ones if they are in contexts where their special meanings make no
547 sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is
548 no preceding expression on which the @samp{*} can act. It is poor practice
549 to depend on this behavior; it is better to quote the special character anyway,
550 regardless of where it appears.@refill
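
The postfix operators and character sets described above can be tried
out from Lisp with @code{string-match}, which returns the index where
the match starts, or @code{nil} if there is no match. This is only a
sketch: the test strings are invented for the example, and the
backslashes in the interval operator are doubled because the regexp is
written as a Lisp string.

@example
;; Backtracking: a* first takes three a's, then gives one back.
(string-match "ca*ar" "caaar")            @result{} 0
(match-end 0)                             @result{} 5

;; + needs at least one repetition; ? allows none.
(string-match "ca+r" "cr")                @result{} nil
(string-match "ca?r" "cr")                @result{} 0

;; Non-greedy: b*? matches as few b's as possible.
(string-match "ab*?" "abbb")              @result{} 0
(match-end 0)                             @result{} 1

;; Interval: exactly four x's.
(string-match "x\\@{4\\@}" "xxxx")          @result{} 0

;; Character set and complemented character set.
(string-match "c[ad]*r" "caddaar")        @result{} 0
(string-match "[^a-z0-9A-Z]" "abc def")   @result{} 3
@end example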
552 For the most part, @samp{\} followed by any character matches only that
553 character. However, there are several exceptions: two-character
554 sequences starting with @samp{\} that have special meanings. The second
555 character in the sequence is always an ordinary character when used on
556 its own. Here is a table of @samp{\} constructs.
@item \|
specifies an alternative. Two regular expressions @var{a} and @var{b}
561 with @samp{\|} in between form an expression that matches some text if
562 either @var{a} matches it or @var{b} matches it. It works by trying to
563 match @var{a}, and if that fails, by trying to match @var{b}.
565 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
566 but no other string.@refill
568 @samp{\|} applies to the largest possible surrounding expressions. Only a
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
@samp{\|}.
572 Full backtracking capability exists to handle multiple uses of @samp{\|}.

@item \( @dots{} \)
is a grouping construct that serves three purposes:

@enumerate
@item
To enclose a set of @samp{\|} alternatives for other operations.
Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.

@item
To enclose a complicated expression for the postfix operators @samp{*},
@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
@samp{bananana}, etc., with any (zero or more) number of @samp{na}
strings.

@item
To record a matched substring for future reference.
@end enumerate

592 This last application is not a consequence of the idea of a
593 parenthetical grouping; it is a separate feature that is assigned as a
594 second meaning to the same @samp{\( @dots{} \)} construct. In practice
595 there is usually no conflict between the two meanings; when there is
596 a conflict, you can use a ``shy'' group.
598 @item \(?: @dots{} \)
599 @cindex shy group, in regexp
600 specifies a ``shy'' group that does not record the matched substring;
601 you can't refer back to it with @samp{\@var{d}}. This is useful
602 in mechanically combining regular expressions, so that you
603 can add groups for syntactic purposes without interfering with
604 the numbering of the groups that were written by the user.

@item \@var{d}
matches the same text that matched the @var{d}th occurrence of a
608 @samp{\( @dots{} \)} construct.
610 After the end of a @samp{\( @dots{} \)} construct, the matcher remembers
611 the beginning and end of the text matched by that construct. Then,
612 later on in the regular expression, you can use @samp{\} followed by the
613 digit @var{d} to mean ``match the same text matched the @var{d}th time
614 by the @samp{\( @dots{} \)} construct.''
616 The strings matching the first nine @samp{\( @dots{} \)} constructs
617 appearing in a regular expression are assigned numbers 1 through 9 in
618 the order that the open-parentheses appear in the regular expression.
619 So you can use @samp{\1} through @samp{\9} to refer to the text matched
620 by the corresponding @samp{\( @dots{} \)} constructs.
622 For example, @samp{\(.*\)\1} matches any newline-free string that is
623 composed of two identical halves. The @samp{\(.*\)} matches the first
half, which may be anything, but the @samp{\1} that follows must match
the same exact text.
627 If a particular @samp{\( @dots{} \)} construct matches more than once
(which can easily happen if it is followed by @samp{*}), only the last
match is recorded.

@item \`
matches the empty string, but only at the beginning
of the buffer or string being matched against.

@item \'
matches the empty string, but only at the end of
the buffer or string being matched against.

@item \=
matches the empty string, but only at point.

@item \b
matches the empty string, but only at the beginning or
644 end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
645 @samp{foo} as a separate word. @samp{\bballs?\b} matches
646 @samp{ball} or @samp{balls} as a separate word.@refill
648 @samp{\b} matches at the beginning or end of the buffer
649 regardless of what text appears next to it.

@item \B
matches the empty string, but @emph{not} at the beginning or
end of a word.

@item \<
matches the empty string, but only at the beginning of a word.
@samp{\<} matches at the beginning of the buffer only if a
word-constituent character follows.

@item \>
matches the empty string, but only at the end of a word. @samp{\>}
662 matches at the end of the buffer only if the contents end with a
663 word-constituent character.

@item \w
matches any word-constituent character. The syntax table
determines which characters these are. @xref{Syntax}.

@item \W
matches any character that is not a word-constituent.

@item \s@var{c}
matches any character whose syntax is @var{c}. Here @var{c} is a
character that represents a syntax code: thus, @samp{w} for word
constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
etc. Represent a character of whitespace (which can be a newline) by
either @samp{-} or a space character.

@item \S@var{c}
matches any character whose syntax is not @var{c}.

682 @cindex categories of characters
683 @cindex characters which belong to a specific language
684 @findex describe-categories
686 matches any character that belongs to the category @var{c}. For
687 example, @samp{\cc} matches Chinese characters, @samp{\cg} matches
688 Greek characters, etc. For the description of the known categories,
689 type @kbd{M-x describe-categories @key{RET}}.
692 matches any character that does @emph{not} belong to category
696 The constructs that pertain to words and syntax are controlled by the
697 setting of the syntax table (@pxref{Syntax}).
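
  You can try the word and syntax constructs from Lisp with
@code{string-match}, which returns the index where the first match
starts, or @code{nil} if there is none.  The following is only a
sketch: it assumes the standard syntax table, in which letters are word
constituents and a space has whitespace syntax, and the test strings
are made up for illustration.  Backslashes are doubled because of Lisp
string syntax.

@example
(list (string-match "\\bfoo\\b" "a foo b")      ; => 2
      (string-match "\\bfoo\\b" "afoob")        ; => nil, no boundary
      (string-match "\\bballs?\\b" "two balls") ; => 4
      (string-match "\\<w" "now, why")          ; => 5, start of a word
      (string-match "\\w+\\>" "now, why")       ; => 0, matches "now"
      (string-match "\\s-" "a b"))              ; => 1, the space
@end example
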
699 Here is a complicated regexp, used by Emacs to recognize the end of a
700 sentence together with any whitespace that follows. It is given in Lisp
701 syntax to enable you to distinguish the spaces from the tab characters. In
702 Lisp syntax, the string constant begins and ends with a double-quote.
703 @samp{\"} stands for a double-quote as part of the regexp, @samp{\\} for a
704 backslash as part of the regexp, @samp{\t} for a tab and @samp{\n} for a
"[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
712 This contains four parts in succession: a character set matching period,
713 @samp{?}, or @samp{!}; a character set matching close-brackets, quotes,
714 or parentheses, repeated any number of times; an alternative in
715 backslash-parentheses that matches end-of-line, a tab, or two spaces;
716 and a character set matching whitespace characters, repeated any number
719 To enter the same regexp interactively, you would type @key{TAB} to
720 enter a tab, and @kbd{C-j} to enter a newline. You would also type
721 single backslashes as themselves, instead of doubling them for Lisp syntax.
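
  As a rough illustration of using such a regexp from Lisp, the sketch
below counts sentence ends in a temporary buffer with
@code{re-search-forward}.  The regexp is the one discussed above,
written with two spaces in the last alternative as the description
says; the sample text and the variable names are invented for this
example.

@example
(with-temp-buffer
  (insert "First sentence.  Second one?\tThird.")
  (goto-char (point-min))
  (let ((regexp "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*")
        (count 0))
    ;; Each successful search leaves point after the match,
    ;; so the loop advances through the buffer.
    (while (re-search-forward regexp nil t)
      (setq count (1+ count)))
    count))                             ; => 3
@end example

@noindent
The three matches end with two spaces, a tab, and the end of the line,
respectively.
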
724 @c I commented this out because it is missing vital information
725 @c and therefore useless. For instance, what do you do to *use* the
726 @c regular expression when it is finished? What jobs is this good for?
730 @cindex authoring regular expressions
  For interactive development of regular expressions, you can use the
@kbd{M-x re-builder} command.  It gives immediate visual feedback as
you edit a regexp.  The buffer from which @code{re-builder} was invoked
becomes the target for the regexp editor, which pops up in a separate
window.  At
736 all times, all the matches in the target buffer for the current
737 regular expression are highlighted. Each parenthesized sub-expression
738 of the regexp is shown in a distinct face, which makes it easier to
739 verify even very complex regexps. (On displays that don't support
740 colors, Emacs blinks the cursor around the matched text, as it does
741 for matching parens.)
744 @node Search Case, Replace, Regexps, Search
745 @section Searching and Case
747 @vindex case-fold-search
748 Incremental searches in Emacs normally ignore the case of the text
749 they are searching through, if you specify the text in lower case.
750 Thus, if you specify searching for @samp{foo}, then @samp{Foo} and
751 @samp{foo} are also considered a match. Regexps, and in particular
752 character sets, are included: @samp{[ab]} would match @samp{a} or
753 @samp{A} or @samp{b} or @samp{B}.@refill
755 An upper-case letter anywhere in the incremental search string makes
756 the search case-sensitive. Thus, searching for @samp{Foo} does not find
757 @samp{foo} or @samp{FOO}. This applies to regular expression search as
758 well as to string search. The effect ceases if you delete the
759 upper-case letter from the search string.
761 If you set the variable @code{case-fold-search} to @code{nil}, then
762 all letters must match exactly, including case. This is a per-buffer
763 variable; altering the variable affects only the current buffer, but
764 there is a default value which you can change as well. @xref{Locals}.
765 This variable applies to nonincremental searches also, including those
766 performed by the replace commands (@pxref{Replace}) and the minibuffer
767 history matching commands (@pxref{Minibuffer History}).
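
  The effect of @code{case-fold-search} can be seen from Lisp, as in
the sketch below (the test strings are arbitrary).  Note that Lisp
functions such as @code{string-match} obey only the variable; the rule
that an upper-case letter makes the search case-sensitive is applied by
the interactive search commands, not by these functions.

@example
(let ((case-fold-search t))
  (string-match "foo" "a FOO"))       ; => 2
(let ((case-fold-search nil))
  (string-match "foo" "a FOO"))       ; => nil
(let ((case-fold-search t))
  (string-match "[ab]" "B"))          ; => 0, sets fold case too
(let ((case-fold-search t))
  (string-match "Foo" "foo"))         ; => 0, no upper-case rule here

;; In an init file or a mode hook:
(setq case-fold-search nil)           ; current buffer only
(setq-default case-fold-search nil)   ; default for other buffers
@end example
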
769 @node Replace, Other Repeating Search, Search Case, Search
770 @section Replacement Commands
772 @cindex search-and-replace commands
773 @cindex string substitution
774 @cindex global substitution
776 Global search-and-replace operations are not needed as often in Emacs
777 as they are in other editors@footnote{In some editors,
778 search-and-replace operations are the only convenient way to make a
779 single change in the text.}, but they are available. In addition to the
780 simple @kbd{M-x replace-string} command which is like that found in most
editors, there is an @kbd{M-x query-replace} command which asks you, for
782 each occurrence of the pattern, whether to replace it.
784 The replace commands normally operate on the text from point to the
785 end of the buffer; however, in Transient Mark mode, when the mark is
786 active, they operate on the region. The replace commands all replace
787 one string (or regexp) with one replacement string. It is possible to
788 perform several replacements in parallel using the command
789 @code{expand-region-abbrevs} (@pxref{Expanding Abbrevs}).
792 * Unconditional Replace:: Replacing all matches for a string.
793 * Regexp Replace:: Replacing all matches for a regexp.
794 * Replacement and Case:: How replacements preserve case of letters.
795 * Query Replace:: How to use querying.
798 @node Unconditional Replace, Regexp Replace, Replace, Replace
799 @subsection Unconditional Replacement
800 @findex replace-string
801 @findex replace-regexp
804 @item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
805 Replace every occurrence of @var{string} with @var{newstring}.
806 @item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
807 Replace every match for @var{regexp} with @var{newstring}.
810 To replace every instance of @samp{foo} after point with @samp{bar},
811 use the command @kbd{M-x replace-string} with the two arguments
812 @samp{foo} and @samp{bar}. Replacement happens only in the text after
813 point, so if you want to cover the whole buffer you must go to the
814 beginning first. All occurrences up to the end of the buffer are
815 replaced; to limit replacement to part of the buffer, narrow to that
816 part of the buffer before doing the replacement (@pxref{Narrowing}).
817 In Transient Mark mode, when the region is active, replacement is
818 limited to the region (@pxref{Transient Mark}).
820 When @code{replace-string} exits, it leaves point at the last
821 occurrence replaced. It sets the mark to the prior position of point
822 (where the @code{replace-string} command was issued); use @kbd{C-u
823 C-@key{SPC}} to move back there.
825 A numeric argument restricts replacement to matches that are surrounded
826 by word boundaries. The argument's value doesn't matter.
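
  In a Lisp program, the usual way to get the same effect is an
explicit search-and-replace loop rather than a call to
@code{replace-string}.  The sketch below, with made-up buffer contents,
also shows that only text after point is affected.

@example
(with-temp-buffer
  (insert "foo one foo two foo")
  (goto-char 5)                       ; point is past the first "foo"
  (while (search-forward "foo" nil t)
    ;; Third argument t: insert "bar" literally.
    (replace-match "bar" nil t))
  (buffer-string))                    ; => "foo one bar two bar"
@end example
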
828 @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
829 @subsection Regexp Replacement
831 The @kbd{M-x replace-string} command replaces exact matches for a
832 single string. The similar command @kbd{M-x replace-regexp} replaces
833 any match for a specified pattern.
835 In @code{replace-regexp}, the @var{newstring} need not be constant: it
836 can refer to all or part of what is matched by the @var{regexp}.
837 @samp{\&} in @var{newstring} stands for the entire match being replaced.
838 @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
839 whatever matched the @var{d}th parenthesized grouping in @var{regexp}.
840 To include a @samp{\} in the text to replace with, you must enter
841 @samp{\\}. For example,
844 M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
848 replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
849 with @samp{cddr-safe}.
852 M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
856 performs the inverse transformation.
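
  The same two transformations can be tried on a string with the Lisp
function @code{replace-regexp-in-string}, which accepts @samp{\&} and
@samp{\@var{d}} in its replacement argument.  This is only an
illustration with invented input; as usual, each backslash is doubled
in Lisp string syntax.

@example
(replace-regexp-in-string "c[ad]+r" "\\&-safe"
                          "(cadr x) (cddr y)")
     ; => "(cadr-safe x) (cddr-safe y)"

(replace-regexp-in-string "\\(c[ad]+r\\)-safe" "\\1"
                          "(cadr-safe x)")
     ; => "(cadr x)"
@end example
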
858 @node Replacement and Case, Query Replace, Regexp Replace, Replace
859 @subsection Replace Commands and Case
  If the first argument of a replace command is all lower case, the
command ignores case while searching for occurrences to
863 replace---provided @code{case-fold-search} is non-@code{nil}. If
864 @code{case-fold-search} is set to @code{nil}, case is always significant
868 In addition, when the @var{newstring} argument is all or partly lower
869 case, replacement commands try to preserve the case pattern of each
870 occurrence. Thus, the command
873 M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
877 replaces a lower case @samp{foo} with a lower case @samp{bar}, an
878 all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with
@samp{Bar}.  (These three alternatives, namely lower case, all caps,
and capitalized, are the only ones that @code{replace-string} can
883 If upper-case letters are used in the replacement string, they remain
884 upper case every time that text is inserted. If upper-case letters are
885 used in the first argument, the second argument is always substituted
886 exactly as given, with no case conversion. Likewise, if either
887 @code{case-replace} or @code{case-fold-search} is set to @code{nil},
888 replacement is done without case conversion.
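
  A similar case conversion is available in Lisp through the optional
@var{fixedcase} argument of @code{replace-match} and
@code{replace-regexp-in-string}.  The sketch below is an analogy using
that argument, not a description of how the replace commands are
implemented; the input string is made up.

@example
(let ((case-fold-search t))
  (list
   ;; FIXEDCASE omitted: follow the case pattern of each match.
   (replace-regexp-in-string "foo" "bar" "foo Foo FOO")
   ;; FIXEDCASE non-nil: insert "bar" exactly as given.
   (replace-regexp-in-string "foo" "bar" "foo Foo FOO" t)))
     ; => ("bar Bar BAR" "bar bar bar")
@end example
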
890 @node Query Replace,, Replacement and Case, Replace
891 @subsection Query Replace
892 @cindex query replace
895 @item M-% @var{string} @key{RET} @var{newstring} @key{RET}
896 @itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
897 Replace some occurrences of @var{string} with @var{newstring}.
898 @item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET}
899 @itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
900 Replace some matches for @var{regexp} with @var{newstring}.
904 @findex query-replace
905 If you want to change only some of the occurrences of @samp{foo} to
906 @samp{bar}, not all of them, then you cannot use an ordinary
907 @code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}).
908 This command finds occurrences of @samp{foo} one by one, displays each
909 occurrence and asks you whether to replace it. A numeric argument to
910 @code{query-replace} tells it to consider only occurrences that are
bounded by word-delimiter characters.  The command preserves case, just
like @code{replace-string}, provided @code{case-replace} is non-@code{nil},
916 @findex query-replace-regexp
  Aside from querying, @code{query-replace} works just like
@code{replace-string}, and @code{query-replace-regexp} works just like
@code{replace-regexp}.  The latter command is run by @kbd{C-M-%}.
921 The things you can type when you are shown an occurrence of @var{string}
922 or a match for @var{regexp} are:
924 @ignore @c Not worth it.
925 @kindex SPC @r{(query-replace)}
926 @kindex DEL @r{(query-replace)}
927 @kindex , @r{(query-replace)}
928 @kindex RET @r{(query-replace)}
929 @kindex . @r{(query-replace)}
930 @kindex ! @r{(query-replace)}
931 @kindex ^ @r{(query-replace)}
932 @kindex C-r @r{(query-replace)}
933 @kindex C-w @r{(query-replace)}
934 @kindex C-l @r{(query-replace)}
940 to replace the occurrence with @var{newstring}.
943 to skip to the next occurrence without replacing this one.
946 to replace this occurrence and display the result. You are then asked
947 for another input character to say what to do next. Since the
948 replacement has already been made, @key{DEL} and @key{SPC} are
949 equivalent in this situation; both move to the next occurrence.
951 You can type @kbd{C-r} at this point (see below) to alter the replaced
952 text. You can also type @kbd{C-x u} to undo the replacement; this exits
953 the @code{query-replace}, so if you want to do further replacement you
954 must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart
955 (@pxref{Repetition}).
958 to exit without doing any more replacements.
960 @item .@: @r{(Period)}
961 to replace this occurrence and then exit without searching for more
965 to replace all remaining occurrences without asking again.
968 to go back to the position of the previous occurrence (or what used to
969 be an occurrence), in case you changed it by mistake. This works by
970 popping the mark ring. Only one @kbd{^} in a row is meaningful, because
971 only one previous replacement position is kept during @code{query-replace}.
974 to enter a recursive editing level, in case the occurrence needs to be
975 edited rather than just replaced with @var{newstring}. When you are
976 done, exit the recursive editing level with @kbd{C-M-c} to proceed to
977 the next occurrence. @xref{Recursive Edit}.
980 to delete the occurrence, and then enter a recursive editing level as in
981 @kbd{C-r}. Use the recursive edit to insert text to replace the deleted
982 occurrence of @var{string}. When done, exit the recursive editing level
983 with @kbd{C-M-c} to proceed to the next occurrence.
986 to edit the replacement string in the minibuffer. When you exit the
987 minibuffer by typing @key{RET}, the minibuffer contents replace the
988 current occurrence of the pattern. They also become the new
989 replacement string for any further occurrences.
992 to redisplay the screen. Then you must type another character to
993 specify what to do with this occurrence.
996 to display a message summarizing these options. Then you must type
997 another character to specify what to do with this occurrence.
1000 Some other characters are aliases for the ones listed above: @kbd{y},
1001 @kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and
1004 Aside from this, any other character exits the @code{query-replace},
1005 and is then reread as part of a key sequence. Thus, if you type
1006 @kbd{C-k}, it exits the @code{query-replace} and then kills to end of
1009 To restart a @code{query-replace} once it is exited, use @kbd{C-x
1010 @key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
1011 used the minibuffer to read its arguments. @xref{Repetition, C-x ESC
1014 See also @ref{Transforming File Names}, for Dired commands to rename,
1015 copy, or link files by replacing regexp matches in file names.
1017 @node Other Repeating Search,, Replace, Search
1018 @section Other Search-and-Loop Commands
1020 Here are some other commands that find matches for a regular
1021 expression. They all ignore case in matching, if the pattern contains
1022 no upper-case letters and @code{case-fold-search} is non-@code{nil}.
1023 Aside from @code{occur}, all operate on the text from point to the end
1024 of the buffer, or on the active region in Transient Mark mode.
1026 @findex list-matching-lines
1029 @findex delete-non-matching-lines
1030 @findex delete-matching-lines
1035 @item M-x occur @key{RET} @var{regexp} @key{RET}
1036 Display a list showing each line in the buffer that contains a match
1037 for @var{regexp}. To limit the search to part of the buffer, narrow
1038 to that part (@pxref{Narrowing}). A numeric argument @var{n}
1039 specifies that @var{n} lines of context are to be displayed before and
1040 after each matching line.
1042 @kindex RET @r{(Occur mode)}
1043 The buffer @samp{*Occur*} containing the output serves as a menu for
1044 finding the occurrences in their original context. Click @kbd{Mouse-2}
1045 on an occurrence listed in @samp{*Occur*}, or position point there and
1046 type @key{RET}; this switches to the buffer that was searched and
1047 moves point to the original of the chosen occurrence.
1049 @item M-x list-matching-lines
1050 Synonym for @kbd{M-x occur}.
1052 @item M-x how-many @key{RET} @var{regexp} @key{RET}
1053 Print the number of matches for @var{regexp} that exist in the buffer
1054 after point. In Transient Mark mode, if the region is active, the
1055 command operates on the region instead.
1057 @item M-x flush-lines @key{RET} @var{regexp} @key{RET}
1058 Delete each line that contains a match for @var{regexp}, operating on
1059 the text after point. In Transient Mark mode, if the region is
1060 active, the command operates on the region instead.
1062 @item M-x keep-lines @key{RET} @var{regexp} @key{RET}
1063 Delete each line that @emph{does not} contain a match for
1064 @var{regexp}, operating on the text after point. In Transient Mark
1065 mode, if the region is active, the command operates on the region
1069 You can also search multiple files under control of a tags table
(@pxref{Tags Search}) or through the Dired @kbd{A} command
1071 (@pxref{Operating on Files}), or ask the @code{grep} program to do it
1072 (@pxref{Grep Searching}).
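
  The line-oriented commands in this section can also be called from
Lisp with the regexp as their argument.  The sketch below runs
@code{flush-lines} and @code{keep-lines} on the same invented text,
starting from the beginning of a temporary buffer; the helper
@code{my-filter-demo} is hypothetical and exists only for this example.

@example
(defun my-filter-demo (command)
  "Apply COMMAND with regexp \"^beta\" to sample text; return result."
  (with-temp-buffer
    (insert "alpha\nbeta\ngamma\nbeta again\n")
    (goto-char (point-min))           ; operate on the whole buffer
    (funcall command "^beta")
    (buffer-string)))

(my-filter-demo 'flush-lines)         ; => "alpha\ngamma\n"
(my-filter-demo 'keep-lines)          ; => "beta\nbeta again\n"
@end example
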