lispref/searching.texi

   1 @c -*-texinfo-*-
   2 @c This is part of the GNU Emacs Lisp Reference Manual.
   3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
   4 @c See the file elisp.texi for copying conditions.
   5 @setfilename ../info/searching
   6 @node Searching and Matching, Syntax Tables, Text, Top
   7 @chapter Searching and Matching
   8 @cindex searching
   9
  10   GNU Emacs provides two ways to search through a buffer for specified
  11 text: exact string searches and regular expression searches.  After a
  12 regular expression search, you can examine the @dfn{match data} to
  13 determine which text matched the whole regular expression or various
  14 portions of it.
  15
  16 @menu
  17 * String Search::         Search for an exact match.
  18 * Regular Expressions::   Describing classes of strings.
  19 * Regexp Search::         Searching for a match for a regexp.
  20 * Search and Replace::    Internals of @code{query-replace}.
  21 * Match Data::            Finding out which part of the text matched
  22                             various parts of a regexp, after regexp search.
  23 * Searching and Case::    Case-independent or case-significant searching.
  24 * Standard Regexps::      Useful regexps for finding sentences, pages,...
  25 @end menu
  26
  27   The @samp{skip-chars@dots{}} functions also perform a kind of searching.
  28 @xref{Skipping Characters}.
  29
  30 @node String Search
  31 @section Searching for Strings
  32 @cindex string search
  33
  34   These are the primitive functions for searching through the text in a
  35 buffer.  They are meant for use in programs, but you may call them
  36 interactively.  If you do so, they prompt for the search string;
  37 @var{limit} and @var{noerror} are set to @code{nil}, and @var{repeat}
  38 is set to 1.
  39
  40 @deffn Command search-forward string &optional limit noerror repeat
  41   This function searches forward from point for an exact match for
  42 @var{string}.  If successful, it sets point to the end of the occurrence
  43 found, and returns the new value of point.  If no match is found, the
  44 value and side effects depend on @var{noerror} (see below).
  45 @c Emacs 19 feature
  46
  47   In the following example, point is initially at the beginning of the
  48 line.  Then @code{(search-forward "fox")} moves point after the last
  49 letter of @samp{fox}:
  50
  51 @example
  52 @group
  53 ---------- Buffer: foo ----------
  54 @point{}The quick brown fox jumped over the lazy dog.
  55 ---------- Buffer: foo ----------
  56 @end group
  57
  58 @group
  59 (search-forward "fox")
  60      @result{} 20
  61
  62 ---------- Buffer: foo ----------
  63 The quick brown fox@point{} jumped over the lazy dog.
  64 ---------- Buffer: foo ----------
  65 @end group
  66 @end example
  67
  68   The argument @var{limit} specifies the upper bound to the search.  (It
  69 must be a position in the current buffer.)  No match extending after
  70 that position is accepted.  If @var{limit} is omitted or @code{nil}, it
  71 defaults to the end of the accessible portion of the buffer.
  72
  73 @kindex search-failed
  74   What happens when the search fails depends on the value of
  75 @var{noerror}.  If @var{noerror} is @code{nil}, a @code{search-failed}
  76 error is signaled.  If @var{noerror} is @code{t}, @code{search-forward}
  77 returns @code{nil} and does nothing.  If @var{noerror} is neither
  78 @code{nil} nor @code{t}, then @code{search-forward} moves point to the
  79 upper bound and returns @code{nil}.  (It would be more consistent now
  80 to return the new position of point in that case, but some programs
  81 may depend on a value of @code{nil}.)
  82
  83 If @var{repeat} is supplied (it must be a positive number), then the
  84 search is repeated that many times (each time starting at the end of the
  85 previous time's match).  If these successive searches succeed, the
  86 function succeeds, moving point and returning its new value.  Otherwise
  87 the search fails.
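
  The following sketch shows @var{limit}, @var{noerror} and @var{repeat}
working together.  The buffer text is invented for illustration, and the
sketch assumes an Emacs that provides @code{with-temp-buffer} and the
default value of @code{case-fold-search}.

@example
@group
(with-temp-buffer
  (insert "The quick brown fox jumped over the lazy dog.")
  (goto-char (point-min))
  (list
   ;; NOERROR is t: a failed search returns nil and leaves point alone.
   (search-forward "cat" nil t)
   (point)
   ;; NOERROR is neither nil nor t: a failed search returns nil
   ;; and moves point to the bound, here position 10.
   (search-forward "cat" 10 'move)
   (point)
   ;; REPEAT is 2: find the second "o" after the start of the buffer.
   (progn (goto-char (point-min))
          (search-forward "o" nil t 2))))
     ;; @result{} (nil 1 nil 10 19)
@end group
@end example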
  88 @end deffn
  89
  90 @deffn Command search-backward string &optional limit noerror repeat
  91 This function searches backward from point for @var{string}.  It is
  92 just like @code{search-forward} except that it searches backwards and
  93 leaves point at the beginning of the match.
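
As a small sketch (the buffer text is made up, and
@code{with-temp-buffer} is assumed to be available), a backward search
from the end of the buffer leaves point just before the match:

@example
@group
(with-temp-buffer
  (insert "The quick brown fox")
  ;; After `insert', point is at the end of the buffer.
  (list (search-backward "fox")
        (point)
        (looking-at "fox")))
     ;; @result{} (17 17 t)
@end group
@end example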
  94 @end deffn
  95
  96 @deffn Command word-search-forward string &optional limit noerror repeat
  97 @cindex word search
  98 This function searches forward from point for a ``word'' match for
  99 @var{string}.  If it finds a match, it sets point to the end of the
 100 match found, and returns the new value of point.
 101 @c Emacs 19 feature
 102
 103 Word matching regards @var{string} as a sequence of words, disregarding
 104 punctuation that separates them.  It searches the buffer for the same
 105 sequence of words.  Each word must be distinct in the buffer (searching
 106 for the word @samp{ball} does not match the word @samp{balls}), but the
 107 details of punctuation and spacing are ignored (searching for @samp{ball
 108 boy} does match @samp{ball.  Boy!}).
 109
 110 In this example, point is initially at the beginning of the buffer; the
 111 search leaves it between the @samp{y} and the @samp{!}.
 112
 113 @example
 114 @group
 115 ---------- Buffer: foo ----------
 116 @point{}He said "Please!  Find
 117 the ball boy!"
 118 ---------- Buffer: foo ----------
 119 @end group
 120
 121 @group
 122 (word-search-forward "Please find the ball, boy.")
 123      @result{} 35
 124
 125 ---------- Buffer: foo ----------
 126 He said "Please!  Find
 127 the ball boy@point{}!"
 128 ---------- Buffer: foo ----------
 129 @end group
 130 @end example
 131
 132 If @var{limit} is non-@code{nil} (it must be a position in the current
 133 buffer), then it is the upper bound to the search.  The match found must
 134 not extend after that position.
 135
 136 If @var{noerror} is @code{nil}, then @code{word-search-forward} signals
 137 an error if the search fails.  If @var{noerror} is @code{t}, then it
 138 returns @code{nil} instead of signaling an error.  If @var{noerror} is
 139 neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the
 140 end of the buffer) and returns @code{nil}.
 141
 142 If @var{repeat} is non-@code{nil}, then the search is repeated that many
 143 times.  Point is positioned at the end of the last match.
 144 @end deffn
 145
 146 @deffn Command word-search-backward string &optional limit noerror repeat
 147 This function searches backward from point for a word match to
 148 @var{string}.  This function is just like @code{word-search-forward}
 149 except that it searches backward and normally leaves point at the
 150 beginning of the match.
 151 @end deffn
 152
 153 @node Regular Expressions
 154 @section Regular Expressions
 155 @cindex regular expression
 156 @cindex regexp
 157
 158   A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 159 denotes a (possibly infinite) set of strings.  Searching for matches for
 160 a regexp is a very powerful operation.  This section explains how to write
 161 regexps; the following section says how to search for them.
 162
 163 @menu
 164 * Syntax of Regexps::       Rules for writing regular expressions.
 165 * Regexp Example::          Illustrates regular expression syntax.
 166 @end menu
 167
 168 @node Syntax of Regexps
 169 @subsection Syntax of Regular Expressions
 170
 171   Regular expressions have a syntax in which a few characters are
 172 special constructs and the rest are @dfn{ordinary}.  An ordinary
 173 character is a simple regular expression that matches that character and
 174 nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
 175 @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
 176 special characters will be defined in the future.  Any other character
 177 appearing in a regular expression is ordinary, unless a @samp{\}
 178 precedes it.
 179
 180 For example, @samp{f} is not a special character, so it is ordinary, and
 181 therefore @samp{f} is a regular expression that matches the string
 182 @samp{f} and no other string.  (It does @emph{not} match the string
 183 @samp{ff}.)  Likewise, @samp{o} is a regular expression that matches
 184 only @samp{o}.@refill
 185
 186 Any two regular expressions @var{a} and @var{b} can be concatenated.  The
 187 result is a regular expression that matches a string if @var{a} matches
 188 some amount of the beginning of that string and @var{b} matches the rest of
 189 the string.@refill
 190
 191 As a simple example, we can concatenate the regular expressions @samp{f}
 192 and @samp{o} to get the regular expression @samp{fo}, which matches only
 193 the string @samp{fo}.  Still trivial.  To do something more powerful, you
 194 need to use one of the special characters.  Here is a list of them:
 195
 196 @need 1200
 197 @table @kbd
 198 @item .@: @r{(Period)}
 199 @cindex @samp{.} in regexp
 200 is a special character that matches any single character except a newline.
 201 Using concatenation, we can make regular expressions like @samp{a.b}, which
 202 matches any three-character string that begins with @samp{a} and ends with
 203 @samp{b}.@refill
 204
 205 @item *
 206 @cindex @samp{*} in regexp
 207 is not a construct by itself; it is a suffix operator that means to
 208 repeat the preceding regular expression as many times as possible.  In
 209 @samp{fo*}, the @samp{*} applies to the @samp{o}, so @samp{fo*} matches
 210 one @samp{f} followed by any number of @samp{o}s.  The case of zero
 211 @samp{o}s is allowed: @samp{fo*} does match @samp{f}.@refill
 212
 213 @samp{*} always applies to the @emph{smallest} possible preceding
 214 expression.  Thus, @samp{fo*} has a repeating @samp{o}, not a
 215 repeating @samp{fo}.@refill
 216
 217 The matcher processes a @samp{*} construct by matching, immediately,
 218 as many repetitions as can be found.  Then it continues with the rest
 219 of the pattern.  If that fails, backtracking occurs, discarding some
 220 of the matches of the @samp{*}-modified construct in case that makes
 221 it possible to match the rest of the pattern.  For example, in matching
 222 @samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
 223 tries to match all three @samp{a}s; but the rest of the pattern is
 224 @samp{ar} and there is only @samp{r} left to match, so this try fails.
 225 The next alternative is for @samp{a*} to match only two @samp{a}s.
 226 With this choice, the rest of the regexp matches successfully.@refill
 227
 228 Nested repetition operators can be extremely slow if they specify
 229 backtracking loops.  For example, @samp{\(x+y*\)*a} could take hours to
 230 match the sequence @samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz}.  The
 231 slowness is because Emacs must try each imaginable way of grouping the
 232 35 @samp{x}'s before concluding that none of them can work.  To make
 233 sure your regular expressions run fast, check nested repetitions
 234 carefully.
 235
 236 @item +
 237 @cindex @samp{+} in regexp
 238 is a suffix operator similar to @samp{*} except that the preceding
 239 expression must match at least once.  So, for example, @samp{ca+r}
 240 matches the strings @samp{car} and @samp{caaaar} but not the string
 241 @samp{cr}, whereas @samp{ca*r} matches all three strings.
 242
 243 @item ?
 244 @cindex @samp{?} in regexp
 245 is a suffix operator similar to @samp{*} except that the preceding
 246 expression can match either once or not at all.  For example,
 247 @samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
 248 else.
 249
 250 @item [ @dots{} ]
 251 @cindex character set (in regexp)
 252 @cindex @samp{[} in regexp
 253 @cindex @samp{]} in regexp
 254 @samp{[} begins a @dfn{character set}, which is terminated by a
 255 @samp{]}.  In the simplest case, the characters between the two brackets
 256 form the set.  Thus, @samp{[ad]} matches either one @samp{a} or one
 257 @samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s
 258 and @samp{d}s (including the empty string), from which it follows that
 259 @samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
 260 @samp{caddaar}, etc.@refill
 261
 262 The usual regular expression special characters are not special inside a
 263 character set.  A completely different set of special characters exists
 264 inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill
 265
 266 @samp{-} is used for ranges of characters.  To write a range, write two
 267 characters with a @samp{-} between them.  Thus, @samp{[a-z]} matches any
 268 lower case letter.  Ranges may be intermixed freely with individual
 269 characters, as in @samp{[a-z$%.]}, which matches any lower case letter
 270 or @samp{$}, @samp{%}, or a period.@refill
 271
 272 To include a @samp{]} in a character set, make it the first character.
 273 For example, @samp{[]a]} matches @samp{]} or @samp{a}.  To include a
 274 @samp{-}, write @samp{-} as the first character in the set, or put it
 275 immediately after a range.  (You can replace one individual character
 276 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the
 277 @samp{-}.)  There is no way to write a set containing just @samp{-} and
 278 @samp{]}.
 279
 280 To include @samp{^} in a set, put it anywhere but at the beginning of
 281 the set.
 282
 283 @item [^ @dots{} ]
 284 @cindex @samp{^} in regexp
 285 @samp{[^} begins a @dfn{complement character set}, which matches any
 286 character except the ones specified.  Thus, @samp{[^a-z0-9A-Z]}
 287 matches all characters @emph{except} letters and digits.@refill
 288
 289 @samp{^} is not special in a character set unless it is the first
 290 character.  The character following the @samp{^} is treated as if it
 291 were first (thus, @samp{-} and @samp{]} are not special there).
 292
 293 Note that a complement character set can match a newline, unless
 294 newline is mentioned as one of the characters not to match.
 295
 296 @item ^
 297 @cindex @samp{^} in regexp
 298 @cindex beginning of line in regexp
 299 is a special character that matches the empty string, but only at the
 300 beginning of a line in the text being matched.  Otherwise it fails to
 301 match anything.  Thus, @samp{^foo} matches a @samp{foo} that occurs at
 302 the beginning of a line.
 303
 304 When matching a string instead of a buffer, @samp{^} matches at the
 305 beginning of the string or after a newline character @samp{\n}.
 306
 307 @item $
 308 @cindex @samp{$} in regexp
 309 is similar to @samp{^} but matches only at the end of a line.  Thus,
 310 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
 311
 312 When matching a string instead of a buffer, @samp{$} matches at the end
 313 of the string or before a newline character @samp{\n}.
 314
 315 @item \
 316 @cindex @samp{\} in regexp
 317 has two functions: it quotes the special characters (including
 318 @samp{\}), and it introduces additional special constructs.
 319
 320 Because @samp{\} quotes special characters, @samp{\$} is a regular
 321 expression that matches only @samp{$}, and @samp{\[} is a regular
 322 expression that matches only @samp{[}, and so on.
 323
 324 Note that @samp{\} also has special meaning in the read syntax of Lisp
 325 strings (@pxref{String Type}), and must be quoted with @samp{\}.  For
 326 example, the regular expression that matches the @samp{\} character is
 327 @samp{\\}.  To write a Lisp string that contains the characters
 328 @samp{\\}, Lisp syntax requires you to quote each @samp{\} with another
 329 @samp{\}.  Therefore, the read syntax for a regular expression matching
 330 @samp{\} is @code{"\\\\"}.@refill
 331 @end table
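
  One way to experiment with the special characters listed above is to
evaluate calls to @code{string-match} (@pxref{Regexp Search}), which
returns the index at which the first match starts, or @code{nil} if
there is none.  The sample strings below are illustrative only; each
backslash of a regexp is doubled because of Lisp string syntax.

@example
@group
(string-match "a.b" "xxaXbxx")        ; @result{} 2
(string-match "ca*r" "cr")            ; @result{} 0
(string-match "ca+r" "cr")            ; @result{} nil
(string-match "ca?r" "caar")          ; @result{} nil
(string-match "c[ad]*r" "caddaar")    ; @result{} 0
(string-match "[]a]" "x]")            ; @result{} 1
(string-match "^b" "ab\nb")           ; @result{} 3
(string-match "x+$" "axx\nbx")        ; @result{} 1
(string-match "\\\\" "a\\b")          ; @result{} 1
@end group

@group
;; Backtracking: `a*' first takes all three a's, then gives one
;; back so that the trailing `ar' can match.
(list (string-match "ca*ar" "caaar") (match-end 0))
     ;; @result{} (0 5)
@end group
@end example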
 332
 333 @strong{Please note:} For historical compatibility, special characters
 334 are treated as ordinary ones if they are in contexts where their special
 335 meanings make no sense.  For example, @samp{*foo} treats @samp{*} as
 336 ordinary since there is no preceding expression on which the @samp{*}
 337 can act.  It is poor practice to depend on this behavior; quote the
 338 special character anyway, regardless of where it appears.@refill
 339
 340 For the most part, @samp{\} followed by any character matches only
 341 that character.  However, there are several exceptions: characters
 342 that, when preceded by @samp{\}, are special constructs.  Such
 343 characters are always ordinary when encountered on their own.  Here
 344 is a table of @samp{\} constructs:
 345
 346 @table @kbd
 347 @item \|
 348 @cindex @samp{|} in regexp
 349 @cindex regexp alternative
 350 specifies an alternative.
 351 Two regular expressions @var{a} and @var{b} with @samp{\|} in
 352 between form an expression that matches anything that either @var{a} or
 353 @var{b} matches.@refill
 354
 355 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
 356 but no other string.@refill
 357
 358 @samp{\|} applies to the largest possible surrounding expressions.  Only a
 359 surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
 360 @samp{\|}.@refill
 361
 362 Full backtracking capability exists to handle multiple uses of @samp{\|}.
 363
 364 @item \( @dots{} \)
 365 @cindex @samp{(} in regexp
 366 @cindex @samp{)} in regexp
 367 @cindex regexp grouping
 368 is a grouping construct that serves three purposes:
 369
 370 @enumerate
 371 @item
 372 To enclose a set of @samp{\|} alternatives for other operations.
 373 Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
 374
 375 @item
 376 To enclose an expression for a suffix operator such as @samp{*} to act
 377 on.  Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any
 378 (zero or more) number of @samp{na} strings.@refill
 379
 380 @item
 381 To record a matched substring for future reference.
 382 @end enumerate
 383
 384 This last application is not a consequence of the idea of a
 385 parenthetical grouping; it is a separate feature that happens to be
 386 assigned as a second meaning to the same @samp{\( @dots{} \)} construct
 387 because there is no conflict in practice between the two meanings.
 388 Here is an explanation of this feature:
 389
 390 @item \@var{digit}
 391 matches the same text that matched the @var{digit}th occurrence of a
 392 @samp{\( @dots{} \)} construct.
 393
 394 In other words, after the end of a @samp{\( @dots{} \)} construct, the
 395 matcher remembers the beginning and end of the text matched by that
 396 construct.  Then, later on in the regular expression, you can use
 397 @samp{\} followed by @var{digit} to match that same text, whatever it
 398 may have been.
 399
 400 The strings matching the first nine @samp{\( @dots{} \)} constructs
 401 appearing in a regular expression are assigned numbers 1 through 9 in
 402 the order that the open parentheses appear in the regular expression.
 403 So you can use @samp{\1} through @samp{\9} to refer to the text matched
 404 by the corresponding @samp{\( @dots{} \)} constructs.
 405
 406 For example, @samp{\(.*\)\1} matches any newline-free string that is
 407 composed of two identical halves.  The @samp{\(.*\)} matches the first
 408 half, which may be anything, but the @samp{\1} that follows must match
 409 the same exact text.
 410
 411 @item \w
 412 @cindex @samp{\w} in regexp
 413 matches any word-constituent character.  The editor syntax table
 414 determines which characters these are.  @xref{Syntax Tables}.
 415
 416 @item \W
 417 @cindex @samp{\W} in regexp
 418 matches any character that is not a word constituent.
 419
 420 @item \s@var{code}
 421 @cindex @samp{\s} in regexp
 422 matches any character whose syntax is @var{code}.  Here @var{code} is a
 423 character that represents a syntax code: thus, @samp{w} for word
 424 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
 425 etc.  @xref{Syntax Tables}, for a list of syntax codes and the
 426 characters that stand for them.
 427
 428 @item \S@var{code}
 429 @cindex @samp{\S} in regexp
 430 matches any character whose syntax is not @var{code}.
 431 @end table
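
  The backslash constructs in the table above can be tried in the same
way with @code{string-match} (@pxref{Regexp Search}).  This is only a
sketch: the sample strings are invented, and the results for @samp{\w},
@samp{\W} and @samp{\s-} assume the standard syntax table is in effect.

@example
@group
(string-match "\\(foo\\|bar\\)x" "a barx")    ; @result{} 2
(list (string-match "ba\\(na\\)*" "bananana")
      (match-end 0))                          ; @result{} (0 8)
@end group

@group
;; `\1' must repeat whatever the first group matched.
(list (string-match "\\(.*\\)\\1" "abcabc")
      (match-end 1)
      (match-end 0))                          ; @result{} (0 3 6)
@end group

@group
(string-match "\\w+" "  hello, world")        ; @result{} 2
(string-match "\\W" "ab,cd")                  ; @result{} 2
(string-match "\\s-+" "ab  cd")               ; @result{} 2
@end group
@end example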
 432
 433   The following regular expression constructs match the empty string---that is,
 434 they don't use up any characters---but whether they match depends on the
 435 context.
 436
 437 @table @kbd
 438 @item \`
 439 @cindex @samp{\`} in regexp
 440 matches the empty string, but only at the beginning
 441 of the buffer or string being matched against.
 442
 443 @item \'
 444 @cindex @samp{\'} in regexp
 445 matches the empty string, but only at the end of
 446 the buffer or string being matched against.
 447
 448 @item \=
 449 @cindex @samp{\=} in regexp
 450 matches the empty string, but only at point.
 451 (This construct is not defined when matching against a string.)
 452
 453 @item \b
 454 @cindex @samp{\b} in regexp
 455 matches the empty string, but only at the beginning or
 456 end of a word.  Thus, @samp{\bfoo\b} matches any occurrence of
 457 @samp{foo} as a separate word.  @samp{\bballs?\b} matches
 458 @samp{ball} or @samp{balls} as a separate word.@refill
 459
 460 @item \B
 461 @cindex @samp{\B} in regexp
 462 matches the empty string, but @emph{not} at the beginning or
 463 end of a word.
 464
 465 @item \<
 466 @cindex @samp{\<} in regexp
 467 matches the empty string, but only at the beginning of a word.
 468
 469 @item \>
 470 @cindex @samp{\>} in regexp
 471 matches the empty string, but only at the end of a word.
 472 @end table
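
  Because these constructs match the empty string, their effect is
easiest to see in where a match is allowed to start or end.  The
following sketch uses invented strings with @code{string-match}
(@pxref{Regexp Search}) and assumes the standard syntax table, in which
letters are word constituents and a space is not.

@example
@group
(string-match "\\`foo" "foo\nfoo")          ; @result{} 0
(string-match "foo\\'" "foo\nfoo")          ; @result{} 4
(string-match "\\bfoo\\b" "foobar foo")     ; @result{} 7
(string-match "\\bballs?\\b" "a ball game") ; @result{} 2
(string-match "\\<bar" "foobar bar")        ; @result{} 7
(string-match "foo\\>" "foobar foo")        ; @result{} 7
(string-match "\\Boo" "foo")                ; @result{} 1
@end group
@end example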
 473
 474 @kindex invalid-regexp
 475   Not every string is a valid regular expression.  For example, a string
 476 with unbalanced square brackets is invalid (with a few exceptions, such
 477 as @samp{[]]}), and so is a string that ends with a single @samp{\}.  If
 478 an invalid regular expression is passed to any of the search functions,
 479 an @code{invalid-regexp} error is signaled.
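
  A program that accepts regexps from the user may want to catch this
error.  Here is a minimal sketch using @code{condition-case}; the two
regexps are merely examples of the invalid forms just mentioned.

@example
@group
;; Unbalanced square bracket.
(condition-case err
    (string-match "[a-z" "abc")
  (invalid-regexp (car err)))
     ;; @result{} invalid-regexp
@end group

@group
;; The Lisp string "abc\\" ends with a single backslash.
(condition-case err
    (string-match "abc\\" "abc")
  (invalid-regexp (car err)))
     ;; @result{} invalid-regexp
@end group
@end example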
 480
 481 @defun regexp-quote string
 482 This function returns a regular expression string that matches exactly
 483 @var{string} and nothing else.  This allows you to request an exact
 484 string match when calling a function that wants a regular expression.
 485
 486 @example
 487 @group
 488 (regexp-quote "^The cat$")
 489      @result{} "\\^The cat\\$"
 490 @end group
 491 @end example
 492
 493 One use of @code{regexp-quote} is to combine an exact string match with
 494 context described as a regular expression.  For example, this searches
 495 for the string that is the value of @code{string}, surrounded by
 496 whitespace:
 497
 498 @example
 499 @group
 500 (re-search-forward
 501  (concat "\\s-" (regexp-quote string) "\\s-"))
 502 @end group
 503 @end example
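
The difference that quoting makes can be seen by comparing a raw string
with its quoted form; the sample strings here are illustrative only.

@example
@group
(regexp-quote "a.c")                        ; @result{} "a\\.c"
;; Unquoted, the period matches any character.
(string-match "a.c" "abc")                  ; @result{} 0
;; Quoted, it matches only a literal period.
(string-match (regexp-quote "a.c") "abc")   ; @result{} nil
(string-match (regexp-quote "a.c") "xa.c")  ; @result{} 1
@end group
@end example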
 504 @end defun
 505
 506 @node Regexp Example
 507 @comment  node-name,  next,  previous,  up
 508 @subsection Complex Regexp Example
 509
 510   Here is a complicated regexp, used by Emacs to recognize the end of a
 511 sentence together with any whitespace that follows.  It is the value of
 512 the variable @code{sentence-end}.
 513
 514   First, we show the regexp as a string in Lisp syntax to distinguish
 515 spaces from tab characters.  The string constant begins and ends with a
 516 double-quote.  @samp{\"} stands for a double-quote as part of the
 517 string, @samp{\\} for a backslash as part of the string, @samp{\t} for a
 518 tab and @samp{\n} for a newline.
 519
 520 @example
 521 "[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"
 522 @end example
 523
 524   In contrast, if you evaluate the variable @code{sentence-end}, you
 525 will see the following:
 526
 527 @example
 528 @group
 529 sentence-end
 530 @result{}
 531 "[.?!][]\"')@}]*\\($\\| $\\|  \\|  \\)[
 532 ]*"
 533 @end group
 534 @end example
 535
 536 @noindent
 537 In this output, tab and newline appear as themselves.
 538
 539   This regular expression contains four parts in succession and can be
 540 deciphered as follows:
 541
 542 @table @code
 543 @item [.?!]
 544 The first part of the pattern is a character set that matches any one of
 545 three characters: period, question mark, and exclamation mark.  The
 546 match must begin with one of these three characters.
 547
 548 @item []\"')@}]*
 549 The second part of the pattern matches any closing braces and quotation
 550 marks, zero or more of them, that may follow the period, question mark
 551 or exclamation mark.  The @code{\"} is Lisp syntax for a double-quote in
 552 a string.  The @samp{*} at the end indicates that the immediately
 553 preceding regular expression (a character set, in this case) may be
 554 repeated zero or more times.
 555
 556 @item \\($\\|@ $\\|\t\\|@ @ \\)
 557 The third part of the pattern matches the whitespace that follows the
 558 end of a sentence: the end of a line, or a tab, or two spaces.  The
 559 double backslashes mark the parentheses and vertical bars as regular
 560 expression syntax; the parentheses delimit a group and the vertical bars
 561 separate alternatives.  The dollar sign is used to match the end of a
 562 line.
 563
 564 @item [ \t\n]*
 565 Finally, the last part of the pattern matches any additional whitespace
 566 beyond the minimum needed to end a sentence.
 567 @end table
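
  To see the four parts cooperate, the regexp can be applied to a
string with @code{string-match} (@pxref{Regexp Search}).  In this
sketch the variable name @code{my-sentence-end-regexp} and the sample
strings are hypothetical; the regexp itself is copied from above.

@example
@group
(let ((my-sentence-end-regexp        ; hypothetical name
       "[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"))
  (list
   ;; A period followed by two spaces ends a sentence.
   (string-match my-sentence-end-regexp "It works.  Next one.")
   (match-end 0)
   ;; A period followed by a letter or a single space does not.
   (string-match my-sentence-end-regexp "e.g. this")))
     ;; @result{} (8 11 nil)
@end group
@end example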
 568
 569 @node Regexp Search
 570 @section Regular Expression Searching
 571 @cindex regular expression searching
 572 @cindex regexp searching
 573 @cindex searching for regexp
 574
 575   In GNU Emacs, you can search for the next match for a regexp either
 576 incrementally or not.  For incremental search commands, see @ref{Regexp
 577 Search, , Regular Expression Search, emacs, The GNU Emacs Manual}.  Here
 578 we describe only the search functions useful in programs.  The principal
 579 one is @code{re-search-forward}.
 580
 581 @deffn Command re-search-forward regexp &optional limit noerror repeat
 582 This function searches forward in the current buffer for a string of
 583 text that is matched by the regular expression @var{regexp}.  The
 584 function skips over any amount of text that is not matched by
 585 @var{regexp}, and leaves point at the end of the first match found.
 586 It returns the new value of point.
 587
 588 If @var{limit} is non-@code{nil} (it must be a position in the current
 589 buffer), then it is the upper bound to the search.  No match extending
 590 after that position is accepted.
 591
 592 What happens when the search fails depends on the value of
 593 @var{noerror}.  If @var{noerror} is @code{nil}, a @code{search-failed}
 594 error is signaled.  If @var{noerror} is @code{t},
 595 @code{re-search-forward} does nothing and returns @code{nil}.  If
 596 @var{noerror} is neither @code{nil} nor @code{t}, then
 597 @code{re-search-forward} moves point to @var{limit} (or the end of the
 598 buffer) and returns @code{nil}.
 599
 600 If @var{repeat} is supplied (it must be a positive number), then the
 601 search is repeated that many times (each time starting at the end of the
 602 previous time's match).  If these successive searches succeed, the
 603 function succeeds, moving point and returning its new value.  Otherwise
 604 the search fails.
 605
 606 In the following example, point is initially before the @samp{T}.
 607 Evaluating the search call moves point to the end of that line (between
 608 the @samp{t} of @samp{hat} and the newline).
 609
 610 @example
 611 @group
 612 ---------- Buffer: foo ----------
 613 I read "@point{}The cat in the hat
 614 comes back" twice.
 615 ---------- Buffer: foo ----------
 616 @end group
 617
 618 @group
 619 (re-search-forward "[a-z]+" nil t 5)
 620      @result{} 27
 621
 622 ---------- Buffer: foo ----------
 623 I read "The cat in the hat@point{}
 624 comes back" twice.
 625 ---------- Buffer: foo ----------
 626 @end group
 627 @end example
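
Programs often call @code{re-search-forward} in a loop, passing
@code{t} for @var{noerror} so that the loop ends quietly when no more
matches remain.  The following sketch collects every run of digits in a
made-up buffer; it assumes @code{with-temp-buffer} and
@code{match-string} are available.

@example
@group
(with-temp-buffer
  (insert "alpha 12 beta 345 gamma 6")
  (goto-char (point-min))
  (let (numbers)
    ;; Each successful search leaves point after the match,
    ;; so the next iteration resumes from there.
    (while (re-search-forward "[0-9]+" nil t)
      (push (match-string 0) numbers))
    (nreverse numbers)))
     ;; @result{} ("12" "345" "6")
@end group
@end example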
 628 @end deffn
 629
 630 @deffn Command re-search-backward regexp &optional limit noerror repeat
 631 This function searches backward in the current buffer for a string of
 632 text that is matched by the regular expression @var{regexp}, leaving
 633 point at the beginning of the first text found.
 634
 635 This function is analogous to @code{re-search-forward}, but they are not
 636 simple mirror images.  @code{re-search-forward} finds the match whose
 637 beginning is as close as possible to the starting point.  If
 638 @code{re-search-backward} were a perfect mirror image, it would find the
 639 match whose end is as close as possible.  However, in fact it finds the
 640 match whose beginning is as close as possible.  The reason is that
 641 matching a regular expression at a given spot always works from
 642 beginning to end, and starts at a specified beginning position.
 643
 644 A true mirror-image of @code{re-search-forward} would require a special
 645 feature for matching regexps from end to beginning.  It's not worth the
 646 trouble of implementing that.
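
The asymmetry shows up with a repeated pattern.  In this sketch (the
buffer text is invented and @code{with-temp-buffer} is assumed), the
forward search matches all four characters, while the backward search
stops at the first position, counting back from point, where a match
can begin, and so matches only the last character.

@example
@group
(with-temp-buffer
  (insert "aaaa")
  (list
   (progn (goto-char (point-min))
          (re-search-forward "a+")
          (list (match-beginning 0) (match-end 0)))
   (progn (goto-char (point-max))
          (re-search-backward "a+")
          (list (match-beginning 0) (match-end 0)))))
     ;; @result{} ((1 5) (4 5))
@end group
@end example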
 647 @end deffn
 648
 649 @defun string-match regexp string &optional start
 650 This function returns the index of the start of the first match for
 651 the regular expression @var{regexp} in @var{string}, or @code{nil} if
 652 there is no match.  If @var{start} is non-@code{nil}, the search starts
 653 at that index in @var{string}.
 654
 655 For example,
 656
 657 @example
 658 @group
 659 (string-match
 660  "quick" "The quick brown fox jumped quickly.")
 661      @result{} 4
 662 @end group
 663 @group
 664 (string-match
 665  "quick" "The quick brown fox jumped quickly." 8)
 666      @result{} 27
 667 @end group
 668 @end example
 669
 670 @noindent
 671 The index of the first character of the
 672 string is 0, the index of the second character is 1, and so on.
 673
 674 After this function returns, the index of the first character beyond
 675 the match is available as @code{(match-end 0)}.  @xref{Match Data}.
 676
 677 @example
 678 @group
 679 (string-match
 680  "quick" "The quick brown fox jumped quickly." 8)
 681      @result{} 27
 682 @end group
 683
 684 @group
 685 (match-end 0)
 686      @result{} 32
 687 @end group
 688 @end example
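
Combining @var{start} with @code{(match-end 0)} gives a simple way to
visit every match in a string.  This is only a sketch; the variable
names are arbitrary.

@example
@group
(let ((string "The quick brown fox jumped quickly.")
      (start 0)
      positions)
  ;; Resume each search where the previous match ended.
  (while (string-match "quick" string start)
    (push (match-beginning 0) positions)
    (setq start (match-end 0)))
  (nreverse positions))
     ;; @result{} (4 27)
@end group
@end example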
 689 @end defun
 690
 691 @defun looking-at regexp
 692 This function determines whether the text in the current buffer directly
 693 following point matches the regular expression @var{regexp}.  ``Directly
 694 following'' means precisely that: the search is ``anchored'' and it can
 695 succeed only starting with the first character following point.  The
 696 result is @code{t} if the text matches, and @code{nil} otherwise.
 697
 698 This function does not move point, but it updates the match data, which
 699 you can access using @code{match-beginning} and @code{match-end}.
 700 @xref{Match Data}.
 701
 702 In this example, point is located directly before the @samp{T}.  If it
 703 were anywhere else, the result would be @code{nil}.
 704
 705 @example
 706 @group
 707 ---------- Buffer: foo ----------
 708 I read "@point{}The cat in the hat
 709 comes back" twice.
 710 ---------- Buffer: foo ----------
 711
 712 (looking-at "The cat in the hat$")
 713      @result{} t
 714 @end group
 715 @end example
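
The same situation can be reproduced in a temporary buffer.  This sketch
assumes @code{with-temp-buffer} is available; it also shows that point
stays put while the match data is updated, and that moving point one
character forward makes the anchored match fail.

@example
@group
(with-temp-buffer
  (insert "I read \"The cat in the hat\ncomes back\" twice.")
  (goto-char 9)                     ; just before the `T'
  (list (looking-at "The cat in the hat$")
        (point)
        (match-end 0)
        (progn (goto-char 10)
               (looking-at "The cat"))))
     ;; @result{} (t 9 27 nil)
@end group
@end example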
 716 @end defun
 717
 718 @ignore
 719 @deffn Command delete-matching-lines regexp
 720 This function is identical to @code{delete-non-matching-lines}, save
 721 that it deletes what @code{delete-non-matching-lines} keeps.
 722
 723 In the example below, point is located on the first line of text.
 724
 725 @example
 726 @group
 727 ---------- Buffer: foo ----------
 728 We hold these truths
 729 to be self-evident,
 730 that all men are created
 731 equal, and that they are
 732 ---------- Buffer: foo ----------
 733 @end group
 734
 735 @group
 736 (delete-matching-lines "the")
 737      @result{} nil
 738
 739 ---------- Buffer: foo ----------
 740 to be self-evident,
 741 that all men are created
 742 ---------- Buffer: foo ----------
 743 @end group
 744 @end example
 745 @end deffn
 746
 747 @deffn Command flush-lines regexp
 748 This function is the same as @code{delete-matching-lines}.
 749 @end deffn
 750
 751 @defun delete-non-matching-lines regexp
 752 This function deletes all lines following point which don't
 753 contain a match for the regular expression @var{regexp}.
 754 @end defun
 755
 756 @deffn Command keep-lines regexp
 757 This function is the same as @code{delete-non-matching-lines}.
 758 @end deffn
 759
 760 @deffn Command how-many regexp
 761 This function counts the number of matches for @var{regexp} there are in
 762 the current buffer following point.  It prints this number in
 763 the echo area, returning the string printed.
 764 @end deffn
 765
 766 @deffn Command count-matches regexp
 767 This function is a synonym of @code{how-many}.
 768 @end deffn
 769
 770 @deffn Command list-matching-lines regexp nlines
 771 This function is a synonym of @code{occur}.
  772 It shows all lines following point containing a match for @var{regexp},
  773 displaying each line with @var{nlines} lines before and after,
  774 or @code{-}@var{nlines} before if @var{nlines} is negative.
  775 @var{nlines} defaults to @code{list-matching-lines-default-context-lines}.
  776 Interactively, it is the prefix argument.
 777
 778 The lines are shown in a buffer named @samp{*Occur*}.
 779 It serves as a menu to find any of the occurrences in this buffer.
  780 @kbd{C-h m} (@code{describe-mode}) in that buffer gives help.
 781 @end deffn
 782
 783 @defopt list-matching-lines-default-context-lines
 784 Default value is 0.
 785 Default number of context lines to include around a @code{list-matching-lines}
 786 match.  A negative number means to include that many lines before the match.
 787 A positive number means to include that many lines both before and after.
 788 @end defopt
 789 @end ignore
 790
 791 @node Search and Replace
 792 @section Search and Replace
 793 @cindex replacement
 794
 795 @defun perform-replace from-string replacements query-flag regexp-flag delimited-flag &optional repeat-count map
 796 This function is the guts of @code{query-replace} and related commands.
 797 It searches for occurrences of @var{from-string} and replaces some or
 798 all of them.  If @var{query-flag} is @code{nil}, it replaces all
 799 occurrences; otherwise, it asks the user what to do about each one.
 800
 801 If @var{regexp-flag} is non-@code{nil}, then @var{from-string} is
 802 considered a regular expression; otherwise, it must match literally.  If
  803 @var{delimited-flag} is non-@code{nil}, then only occurrences
 804 surrounded by word boundaries are considered.
 805
 806 The argument @var{replacements} specifies what to replace occurrences
 807 with.  If it is a string, that string is used.  It can also be a list of
 808 strings, to be used in cyclic order.
 809
 810 If @var{repeat-count} is non-@code{nil}, it should be an integer, the
 811 number of occurrences to consider.  In this case, @code{perform-replace}
 812 returns after considering that many occurrences.
 813
 814 Normally, the keymap @code{query-replace-map} defines the possible user
 815 responses for queries.  The argument @var{map}, if non-@code{nil}, is a
 816 keymap to use instead of @code{query-replace-map}.
 817 @end defun
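
        As a rough sketch of a noninteractive call, the following replaces
      whole-word occurrences without querying.  The buffer text and the
      strings are invented for this illustration, the sketch assumes an
      Emacs version that provides @code{with-temp-buffer}, and it assumes
      the default case settings.

      @example
      @group
      ;; @r{Illustration only: the buffer contents are an assumption.}
      (with-temp-buffer
        (insert "cat catalog cat")
        (goto-char (point-min))
        ;; @r{No querying, literal match, whole words only.}
        (perform-replace "cat" "dog" nil nil t)
        (buffer-string))
           @result{} "dog catalog dog"
      @end group
      @end example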
 818
 819 @defvar query-replace-map
 820 This variable holds a special keymap that defines the valid user
 821 responses for @code{query-replace} and related functions, as well as
 822 @code{y-or-n-p} and @code{map-y-or-n-p}.  It is unusual in two ways:
 823
 824 @itemize @bullet
 825 @item
 826 The ``key bindings'' are not commands, just symbols that are meaningful
 827 to the functions that use this map.
 828
 829 @item
  830 Prefix keys are not supported; each key binding must be for a single-event
  831 key sequence.  This is because the functions don't use @code{read-key-sequence} to
  832 get the input; instead, they read a single event and look it up ``by hand.''
 833 @end itemize
 834 @end defvar
 835
 836 Here are the meaningful ``bindings'' for @code{query-replace-map}.
 837 Several of them are meaningful only for @code{query-replace} and
 838 friends.
 839
 840 @table @code
 841 @item act
 842 Do take the action being considered---in other words, ``yes.''
 843
 844 @item skip
 845 Do not take action for this question---in other words, ``no.''
 846
 847 @item exit
 848 Answer this question ``no,'' and give up on the entire series of
  849 questions, assuming that subsequent answers will be ``no.''
 850
 851 @item act-and-exit
 852 Answer this question ``yes,'' and give up on the entire series of
 853 questions, assuming that subsequent answers will be ``no.''
 854
 855 @item act-and-show
 856 Answer this question ``yes,'' but show the results---don't advance yet
 857 to the next question.
 858
 859 @item automatic
 860 Answer this question and all subsequent questions in the series with
 861 ``yes,'' without further user interaction.
 862
 863 @item backup
  864 Move back to the previous place about which a question was asked.
 865
 866 @item edit
 867 Enter a recursive edit to deal with this question---instead of any
 868 other action that would normally be taken.
 869
 870 @item delete-and-edit
 871 Delete the text being considered, then enter a recursive edit to replace
 872 it.
 873
 874 @item recenter
 875 Redisplay and center the window, then ask the same question again.
 876
 877 @item quit
 878 Perform a quit right away.  Only @code{y-or-n-p} and related functions
 879 use this answer.
 880
 881 @item help
 882 Display some help, then ask again.
 883 @end table
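
        Because the ``bindings'' are plain symbols, you can inspect them with
      @code{lookup-key}.  The keys shown here are those of the standard map
      as far as we know; a customized map may give different results.

      @example
      @group
      (lookup-key query-replace-map "y")
           @result{} act
      (lookup-key query-replace-map "n")
           @result{} skip
      (lookup-key query-replace-map "!")
           @result{} automatic
      @end group
      @end example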
 884
 885 @node Match Data
 886 @section The Match Data
 887 @cindex match data
 888
 889   Emacs keeps track of the positions of the start and end of segments of
 890 text found during a regular expression search.  This means, for example,
 891 that you can search for a complex pattern, such as a date in an Rmail
 892 message, and then extract parts of the match under control of the
 893 pattern.
 894
 895   Because the match data normally describe the most recent search only,
 896 you must be careful not to do another search inadvertently between the
 897 search you wish to refer back to and the use of the match data.  If you
 898 can't avoid another intervening search, you must save and restore the
 899 match data around it, to prevent it from being overwritten.
 900
 901 @menu
 902 * Simple Match Data::     Accessing single items of match data,
 903                             such as where a particular subexpression started.
 904 * Replacing Match::       Replacing a substring that was matched.
 905 * Entire Match Data::     Accessing the entire match data at once, as a list.
 906 * Saving Match Data::     Saving and restoring the match data.
 907 @end menu
 908
 909 @node Simple Match Data
 910 @subsection Simple Match Data Access
 911
 912   This section explains how to use the match data to find the starting
 913 point or ending point of the text that was matched by a particular
 914 search, or by a particular parenthetical subexpression of a regular
 915 expression.
 916
 917 @defun match-beginning count
 918 This function returns the position of the start of text matched by the
 919 last regular expression searched for, or a subexpression of it.
 920
 921 If @var{count} is zero, then the value is the position of the start of
  922 the text matched by the whole regexp.  Otherwise, @var{count} specifies
  923 a subexpression in the regular expression.  The value of the function is
 924 the starting position of the match for that subexpression.
 925
 926 Subexpressions of a regular expression are those expressions grouped
 927 with escaped parentheses, @samp{\(@dots{}\)}.  The @var{count}th
 928 subexpression is found by counting occurrences of @samp{\(} from the
 929 beginning of the whole regular expression.  The first subexpression is
 930 numbered 1, the second 2, and so on.
 931
 932 The value is @code{nil} for a subexpression inside a
 933 @samp{\|} alternative that wasn't used in the match.
 934 @end defun
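
        A short sketch of the @code{nil} case: when the second alternative
      matches, the subexpression in the first alternative takes no part in
      the match.  The regexp and string are made up for this illustration.

      @example
      @group
      (string-match "\\(a\\)\\|\\(b\\)" "xb")
           @result{} 1
      (match-beginning 1)       ; @r{The first alternative was not used.}
           @result{} nil
      (match-beginning 2)
           @result{} 1
      @end group
      @end example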
 935
 936 @defun match-end count
 937 This function returns the position of the end of the text that matched
 938 the last regular expression searched for, or a subexpression of it.
 939 This function is otherwise similar to @code{match-beginning}.
 940 @end defun
 941
 942   Here is an example of using the match data, with a comment showing the
 943 positions within the text:
 944
 945 @example
 946 @group
 947 (string-match "\\(qu\\)\\(ick\\)"
 948               "The quick fox jumped quickly.")
 949               ;0123456789
 950      @result{} 4
 951 @end group
 952
 953 @group
 954 (match-beginning 1)       ; @r{The beginning of the match}
 955      @result{} 4                 ;   @r{with @samp{qu} is at index 4.}
 956 @end group
 957
 958 @group
 959 (match-beginning 2)       ; @r{The beginning of the match}
 960      @result{} 6                 ;   @r{with @samp{ick} is at index 6.}
 961 @end group
 962
 963 @group
 964 (match-end 1)             ; @r{The end of the match}
 965      @result{} 6                 ;   @r{with @samp{qu} is at index 6.}
 966
 967 (match-end 2)             ; @r{The end of the match}
 968      @result{} 9                 ;   @r{with @samp{ick} is at index 9.}
 969 @end group
 970 @end example
 971
 972   Here is another example.  Point is initially located at the beginning
 973 of the line.  Searching moves point to between the space and the word
 974 @samp{in}.  The beginning of the entire match is at the 9th character of
 975 the buffer (@samp{T}), and the beginning of the match for the first
 976 subexpression is at the 13th character (@samp{c}).
 977
 978 @example
 979 @group
 980 (list
 981   (re-search-forward "The \\(cat \\)")
 982   (match-beginning 0)
 983   (match-beginning 1))
  984     @result{} (17 9 13)
 985 @end group
 986
 987 @group
 988 ---------- Buffer: foo ----------
 989 I read "The cat @point{}in the hat comes back" twice.
 990         ^   ^
 991         9  13
 992 ---------- Buffer: foo ----------
 993 @end group
 994 @end example
 995
 996 @noindent
 997 (In this case, the index returned is a buffer position; the first
 998 character of the buffer counts as 1.)
 999
1000 @node Replacing Match
1001 @subsection Replacing the Text That Matched
1002
 1003   The function @code{replace-match} replaces the text matched by the
 1004 last search with new text.
1005
1006 @cindex case in replacements
1007 @defun replace-match replacement &optional fixedcase literal
 1008 This function replaces the buffer text matched by the last search with
1009 @var{replacement}.  It applies only to buffers; you can't use
1010 @code{replace-match} to replace a substring found with
1011 @code{string-match}.
1012
1013 If @var{fixedcase} is non-@code{nil}, then the case of the replacement
1014 text is not changed; otherwise, the replacement text is converted to a
1015 different case depending upon the capitalization of the text to be
1016 replaced.  If the original text is all upper case, the replacement text
1017 is converted to upper case.  If the first word of the original text is
1018 capitalized, then the first word of the replacement text is capitalized.
 1019 If the original text consists of just one word, and that word is a single
 1020 capital letter, @code{replace-match} considers this a capitalized first word
 1021 rather than all upper case.
1022
1023 If @code{case-replace} is @code{nil}, then case conversion is not done,
 1024 regardless of the value of @var{fixedcase}.  @xref{Searching and Case}.
1025
1026 If @var{literal} is non-@code{nil}, then @var{replacement} is inserted
1027 exactly as it is, the only alterations being case changes as needed.
1028 If it is @code{nil} (the default), then the character @samp{\} is treated
1029 specially.  If a @samp{\} appears in @var{replacement}, then it must be
1030 part of one of the following sequences:
1031
1032 @table @asis
1033 @item @samp{\&}
1034 @cindex @samp{&} in replacement
1035 @samp{\&} stands for the entire text being replaced.
1036
1037 @item @samp{\@var{n}}
1038 @cindex @samp{\@var{n}} in replacement
1039 @samp{\@var{n}}, where @var{n} is a digit, stands for the text that
1040 matched the @var{n}th subexpression in the original regexp.
1041 Subexpressions are those expressions grouped inside @samp{\(@dots{}\)}.
1042
1043 @item @samp{\\}
1044 @cindex @samp{\} in replacement
1045 @samp{\\} stands for a single @samp{\} in the replacement text.
1046 @end table
1047
1048 @code{replace-match} leaves point at the end of the replacement text,
1049 and returns @code{t}.
1050 @end defun
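
        The following sketches show the @samp{\@var{n}} sequences and the
      default case conversion.  The buffer contents are invented for this
      illustration, and the sketches assume an Emacs version that provides
      @code{with-temp-buffer}.

      @example
      @group
      ;; @r{Swap two words using subexpression references.}
      (with-temp-buffer
        (insert "hello world")
        (goto-char (point-min))
        (re-search-forward "\\(hello\\) \\(world\\)")
        (replace-match "\\2, \\1")
        (buffer-string))
           @result{} "world, hello"
      @end group

      @group
      ;; @r{With @var{fixedcase} omitted, an all upper case original}
      ;; @r{makes the replacement upper case as well.}
      (with-temp-buffer
        (insert "FOO")
        (goto-char (point-min))
        (let ((case-fold-search t))
          (re-search-forward "foo"))
        (replace-match "bar")
        (buffer-string))
           @result{} "BAR"
      @end group
      @end example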
1051
1052 @node Entire Match Data
1053 @subsection Accessing the Entire Match Data
1054
1055   The functions @code{match-data} and @code{set-match-data} read or
1056 write the entire match data, all at once.
1057
1058 @defun match-data
1059 This function returns a newly constructed list containing all the
1060 information on what text the last search matched.  Element zero is the
1061 position of the beginning of the match for the whole expression; element
1062 one is the position of the end of the match for the expression.  The
1063 next two elements are the positions of the beginning and end of the
1064 match for the first subexpression, and so on.  In general, element
1065 @ifinfo
1066 number 2@var{n}
1067 @end ifinfo
1068 @tex
1069 number {\mathsurround=0pt $2n$}
1070 @end tex
1071 corresponds to @code{(match-beginning @var{n})}; and
1072 element
1073 @ifinfo
1074 number 2@var{n} + 1
1075 @end ifinfo
1076 @tex
1077 number {\mathsurround=0pt $2n+1$}
1078 @end tex
1079 corresponds to @code{(match-end @var{n})}.
1080
1081 All the elements are markers or @code{nil} if matching was done on a
1082 buffer, and all are integers or @code{nil} if matching was done on a
1083 string with @code{string-match}.  (In Emacs 18 and earlier versions,
1084 markers were used even for matching on a string, except in the case
1085 of the integer 0.)
1086
1087 As always, there must be no possibility of intervening searches between
1088 the call to a search function and the call to @code{match-data} that is
1089 intended to access the match data for that search.
1090
1091 @example
1092 @group
1093 (match-data)
1094      @result{}  (#<marker at 9 in foo>
1095           #<marker at 17 in foo>
1096           #<marker at 13 in foo>
1097           #<marker at 17 in foo>)
1098 @end group
1099 @end example
1100 @end defun
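
        For comparison, here is a sketch of the string case, where the
      elements are integers rather than markers.  The regexp and string are
      chosen only for illustration.

      @example
      @group
      (string-match "\\(qu\\)\\(ick\\)" "The quick fox")
           @result{} 4
      (match-data)
           @result{} (4 9 4 6 6 9)
      @end group
      @end example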
1101
1102 @defun set-match-data match-list
1103 This function sets the match data from the elements of @var{match-list},
1104 which should be a list that was the value of a previous call to
1105 @code{match-data}.
1106
1107 If @var{match-list} refers to a buffer that doesn't exist, you don't get
1108 an error; that sets the match data in a meaningless but harmless way.
1109
1110 @findex store-match-data
1111 @code{store-match-data} is an alias for @code{set-match-data}.
1112 @end defun
1113
1114 @node Saving Match Data
1115 @subsection Saving and Restoring the Match Data
1116
1117   When you call a function that may do a search, you may need to save
1118 and restore the match data around that call, if you want to preserve the
1119 match data from an earlier search for later use.  Here is an example
1120 that shows the problem that arises if you fail to save the match data:
1121
1122 @example
1123 @group
1124 (re-search-forward "The \\(cat \\)")
1125      @result{} 48
1126 (foo)                   ; @r{Perhaps @code{foo} does}
1127                         ;   @r{more searching.}
1128 (match-end 0)
1129      @result{} 61              ; @r{Unexpected result---not 48!}
1130 @end group
1131 @end example
1132
1133   You can save and restore the match data with @code{save-match-data}:
1134
1135 @defspec save-match-data body@dots{}
1136 This special form executes @var{body}, saving and restoring the match
1137 data around it.
1138 @end defspec
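
        As a small sketch, the inner search below would otherwise overwrite
      the match data of the outer one.  The variable @code{text} and the
      strings are made up for this illustration.

      @example
      @group
      (let ((text "The quick fox"))
        (string-match "quick" text)
        (save-match-data
          (string-match "fox" text))    ; @r{Changes the match data only}
                                        ;   @r{inside the form.}
        (match-beginning 0))
           @result{} 4
      @end group
      @end example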
1139
1140   You can use @code{set-match-data} together with @code{match-data} to
1141 imitate the effect of the special form @code{save-match-data}.  This is
1142 useful for writing code that can run in Emacs 18.  Here is how:
1143
1144 @example
1145 @group
1146 (let ((data (match-data)))
1147   (unwind-protect
1148       @dots{}   ; @r{May change the original match data.}
1149     (set-match-data data)))
1150 @end group
1151 @end example
1152
1153   Emacs automatically saves and restores the match data when it runs
1154 process filter functions (@pxref{Filter Functions}) and process
1155 sentinels (@pxref{Sentinels}).
1156
1157 @ignore
1158   Here is a function which restores the match data provided the buffer
1159 associated with it still exists.
1160
1161 @smallexample
1162 @group
1163 (defun restore-match-data (data)
1164 @c It is incorrect to split the first line of a doc string.
1165 @c If there's a problem here, it should be solved in some other way.
1166   "Restore the match data DATA unless the buffer is missing."
1167   (catch 'foo
1168     (let ((d data))
1169 @end group
1170       (while d
1171         (and (car d)
1172              (null (marker-buffer (car d)))
1173 @group
1174              ;; @file{match-data} @r{buffer is deleted.}
1175              (throw 'foo nil))
1176         (setq d (cdr d)))
1177       (set-match-data data))))
1178 @end group
1179 @end smallexample
1180 @end ignore
1181
1182 @node Searching and Case
1183 @section Searching and Case
1184 @cindex searching and case
1185
1186   By default, searches in Emacs ignore the case of the text they are
1187 searching through; if you specify searching for @samp{FOO}, then
1188 @samp{Foo} or @samp{foo} is also considered a match.  Regexps, and in
1189 particular character sets, are included: thus, @samp{[aB]} would match
1190 @samp{a} or @samp{A} or @samp{b} or @samp{B}.
1191
1192   If you do not want this feature, set the variable
1193 @code{case-fold-search} to @code{nil}.  Then all letters must match
1194 exactly, including case.  This is a buffer-local variable; altering the
1195 variable affects only the current buffer.  (@xref{Intro to
1196 Buffer-Local}.)  Alternatively, you may change the value of
1197 @code{default-case-fold-search}, which is the default value of
1198 @code{case-fold-search} for buffers that do not override it.
1199
1200   Note that the user-level incremental search feature handles case
1201 distinctions differently.  When given a lower case letter, it looks for
1202 a match of either case, but when given an upper case letter, it looks
1203 for an upper case letter only.  But this has nothing to do with the
 1204 searching functions used in Lisp code.
1205
1206 @defopt case-replace
1207 This variable determines whether the replacement functions should
1208 preserve case.  If the variable is @code{nil}, that means to use the
1209 replacement text verbatim.  A non-@code{nil} value means to convert the
1210 case of the replacement text according to the text being replaced.
1211
1212 The function @code{replace-match} is where this variable actually has
1213 its effect.  @xref{Replacing Match}.
1214 @end defopt
1215
1216 @defopt case-fold-search
1217 This buffer-local variable determines whether searches should ignore
 1218 case.  If the variable is @code{nil}, they do not ignore case; otherwise
1219 they do ignore case.
1220 @end defopt
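
        A brief sketch of the effect, binding the variable temporarily with
      @code{let}; the strings are chosen only for illustration.

      @example
      @group
      (let ((case-fold-search t))
        (string-match "FOO" "a foo"))
           @result{} 2
      (let ((case-fold-search nil))
        (string-match "FOO" "a foo"))
           @result{} nil
      @end group
      @end example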
1221
1222 @defvar default-case-fold-search
1223 The value of this variable is the default value for
1224 @code{case-fold-search} in buffers that do not override it.  This is the
1225 same as @code{(default-value 'case-fold-search)}.
1226 @end defvar
1227
1228 @node Standard Regexps
1229 @section Standard Regular Expressions Used in Editing
1230 @cindex regexps used standardly in editing
1231 @cindex standard regexps used in editing
1232
1233   This section describes some variables that hold regular expressions
1234 used for certain purposes in editing:
1235
1236 @defvar page-delimiter
1237 This is the regexp describing line-beginnings that separate pages.  The
1238 default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"});
1239 this matches a line that starts with a formfeed character.
1240 @end defvar
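
        As a sketch of how a program might use this variable, the following
      searches for the next page boundary.  The buffer text is invented for
      this illustration, the variable is bound to its documented default in
      case it has been customized, and the sketch assumes an Emacs version
      that provides @code{with-temp-buffer}.

      @example
      @group
      ;; @r{Illustration only: the buffer contents are an assumption.}
      (with-temp-buffer
        (insert "page one\n\f\npage two\n")
        (goto-char (point-min))
        (let ((page-delimiter "^\014"))
          (re-search-forward page-delimiter))
        (point))                        ; @r{Just after the formfeed.}
           @result{} 11
      @end group
      @end example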
1241
1242 @defvar paragraph-separate
1243 This is the regular expression for recognizing the beginning of a line
1244 that separates paragraphs.  (If you change this, you may have to
1245 change @code{paragraph-start} also.)  The default value is
1246 @w{@code{"^[@ \t\f]*$"}}, which matches a line that consists entirely of
1247 spaces, tabs, and form feeds.
1248 @end defvar
1249
1250 @defvar paragraph-start
1251 This is the regular expression for recognizing the beginning of a line
1252 that starts @emph{or} separates paragraphs.  The default value is
1253 @w{@code{"^[@ \t\n\f]"}}, which matches a line starting with a space, tab,
1254 newline, or form feed.
1255 @end defvar
1256
1257 @defvar sentence-end
1258 This is the regular expression describing the end of a sentence.  (All
1259 paragraph boundaries also end sentences, regardless.)  The default value
1260 is:
1261
1262 @example
1263 "[.?!][]\"')@}]*\\($\\| $\\|\t\\| \\)[ \t\n]*"
1264 @end example
1265
 1266 This means a period, question mark or exclamation mark, followed
 1267 optionally by closing parenthetical characters (brackets or quotes),
 1268 followed by tabs, spaces or newlines.
1269
1270 For a detailed explanation of this regular expression, see @ref{Regexp
1271 Example}.
1272 @end defvar